Features Extraction
In MEDimage, all the subpackages and modules need a specific configuration to be used correctly, so they all rely on a single JSON configuration file. This file contains parameters for each step of the workflow (processing, extraction…).
For example, IBSI tests require specific radiomics extraction parameters for each test.
You can check a full example of the file here: notebooks/ibsi/settings/.
This section walks you through how to set up and use the configuration file. It is divided into the following parts: general analysis, pre-checks, processing, extraction and filtering parameters.
General analysis Parameters
n_batch
A numerical value that determines the number of batches to be used in parallel computations. Set it to 0 for serial computation.
type: int
e.g.
{
"n_batch" : 8
}
roi_type_labels
A list of labels for the regions of interest (ROI) to use in the analysis. The labels must match the names of the corresponding CSV files. For example, if you have a CSV file named …
type: List[str]
e.g.
{
"roi_type_labels" : ["GTV"]
}
roi_types
A list of labels that describe the regions of interest, used to save the analysis results. The labels must accurately reflect the regions analyzed. For instance, if you conduct an analysis of a single ROI in a …
type: List[str]
e.g.
{
"roi_types" : ["GTVMassOnly"]
}
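The three general parameters above can be sanity-checked before launching an analysis. The following is a minimal sketch, not part of MEDimage: the helper name check_general_params is hypothetical, and the rule that n_batch must be non-negative is an assumption derived from "0 means serial computation".

```python
import json

# Hypothetical settings text; in practice it would be read from your JSON file.
settings_text = """
{
    "n_batch": 8,
    "roi_type_labels": ["GTV"],
    "roi_types": ["GTVMassOnly"]
}
"""


def check_general_params(settings: dict) -> list:
    """Return a list of human-readable problems found in the general parameters."""
    problems = []
    n_batch = settings.get("n_batch")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(n_batch, int) or isinstance(n_batch, bool) or n_batch < 0:
        problems.append("n_batch must be a non-negative integer (0 means serial)")
    for key in ("roi_type_labels", "roi_types"):
        value = settings.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
    return problems


settings = json.loads(settings_text)
problems = check_general_params(settings)
print(problems or "general parameters look consistent")
```

Running such a check right after loading the file catches type mistakes (for example a single string instead of a list) before any computation starts.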
Pre-checks Parameters
The pre-radiomics checks configuration is a set of parameters used by the DataManager class. These parameters must be set in a nested dictionary as follows:
{
"pre_radiomics_checks": {"All parameters go inside this dict"}
}
wildcards_dimensions
List of wildcards for voxel dimension checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example …
type: List[str]
e.g.
{
"pre_radiomics_checks" : {
"wildcards_dimensions" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
}
}
wildcards_window
List of wildcards for intensity window checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example …
type: List[str]
e.g.
{
"pre_radiomics_checks" : {
"wildcards_window" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
}
}
path_data
Path to your data ( |
|
type |
str |
e.g.
{
"pre_radiomics_checks" : {
"path_data" : "home/user/medimage/data/npy/sts"
}
}
path_csv
Path to your dataset CSV file (Read more about the CSV File).
type: str
e.g.
{
"pre_radiomics_checks" : {
"path_csv" : "home/user/medimage/data/csv/roiNames_GTV.csv"
}
}
path_save_checks
Path where the pre-checks results will be saved.
type: str
e.g.
{
"pre_radiomics_checks" : {
"path_save_checks" : "home/user/medimage/checks"
}
}
Note
Initializing the pre-radiomics checks settings is optional and can be done in the DataManager instance initialization step.
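To get a feeling for which files a wildcard would select, the patterns can be tried against a list of file names. This is only a sketch under two assumptions: the wildcards follow shell-style matching (as implemented by Python's fnmatch module), and the file names below are hypothetical. It does not call MEDimage.

```python
from fnmatch import fnmatchcase

pre_radiomics_checks = {
    "wildcards_dimensions": ["Glioma*.MRscan.npy", "STS*.CTscan.npy"],
    "wildcards_window": ["Glioma*.MRscan.npy", "STS*.CTscan.npy"],
}

# Hypothetical file names, for illustration only.
file_names = [
    "Glioma-001__T1.MRscan.npy",
    "STS-002__CT.CTscan.npy",
    "STS-003__PET.PTscan.npy",
]


def select_files(names, wildcards):
    """Group file names by the wildcard pattern they match (case-sensitive)."""
    return {w: [n for n in names if fnmatchcase(n, w)] for w in wildcards}


matches = select_files(file_names, pre_radiomics_checks["wildcards_dimensions"])
for wildcard, selected in matches.items():
    print(wildcard, "->", selected)
```

A file that matches no wildcard, such as the hypothetical PET scan above, would simply not be covered by any check.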
Processing Parameters
Each imaging modality should have its own params dict inside the JSON file and should be organized as follows:
{
"imParamMR": {"Processing parameters for MR modality"},
"imParamCT": {"Processing parameters for CT modality"},
"imParamPET": {"Processing parameters for PET modality"}
}
box_string
Box of the ROI used in the workflow. |
||
type |
string |
|
options |
||
|
Use the full ROI |
|
type |
string |
|
|
Use the smallest box possible |
|
type |
string |
|
|
For example |
|
type |
string |
|
|
For example |
|
type |
string |
e.g.
{
"imParamCT" : {
"box_string" : "box7"
},
"imParamMR" : {
"box_string" : "box"
},
"imParamPET" : {
"box_string" : "2box"
}
}
interp
Interpolation parameters.
type: dict
options:
scale_non_text (List[float]): Size-3 list of the new voxel size.
scale_text (List[List[float]]): List of size-3 lists of the new voxel size for texture features (features will be computed for each list).
vol_interp (string): Volume interpolation method (“linear”, “spline” or “cubic”).
gl_round (float): Should be set only for CT scans. Set it to 1 to round values to the nearest integers (must be a power of 10).
roi_interp (string): ROI interpolation method (“nearest”, “linear” or “cubic”).
roi_pv (float): Rounding value for ROI intensities. Must be between 0 and 1.
e.g.
{
"imParamMR" : {
"interp" : {
"scale_non_text" : [1, 1, 1],
"scale_text" : [[1, 1, 1]],
"vol_interp" : "linear",
"gl_round" : [],
"roi_interp" : "linear",
"roi_pv" : 0.5
}
},
"imParamCT" : {
"interp" : {
"scale_non_text" : [2, 2, 3],
"scale_text" : [[2, 2, 3]],
"vol_interp" : "nearest",
"gl_round" : 1,
"roi_interp" : "nearest",
"roi_pv" : 0.5
}
},
"imParamPET" : {
"interp" : {
"scale_non_text" : [3, 3, 3],
"scale_text" : [[3, 3, 3]],
"vol_interp" : "spline",
"gl_round" : [],
"roi_interp" : "spline",
"roi_pv" : 0.5
}
}
}
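The constraints stated for the interp options (size-3 voxel sizes, roi_pv between 0 and 1, gl_round a power of 10 for CT) can be verified with a few lines of Python. The helper check_interp below is hypothetical and only encodes the rules written in this section; it is not how MEDimage validates its input.

```python
import math


def check_interp(interp: dict, modality: str) -> list:
    """Return the list of rule violations found in one 'interp' dictionary."""
    problems = []
    if len(interp.get("scale_non_text", [])) != 3:
        problems.append("scale_non_text must have exactly 3 values")
    for scale in interp.get("scale_text", []):
        if len(scale) != 3:
            problems.append("every entry of scale_text must have exactly 3 values")
    roi_pv = interp.get("roi_pv")
    if not isinstance(roi_pv, (int, float)) or not 0 <= roi_pv <= 1:
        problems.append("roi_pv must be between 0 and 1")
    gl_round = interp.get("gl_round")
    # gl_round is only meaningful for CT; an empty list means "not set".
    if modality == "CT" and isinstance(gl_round, (int, float)) and gl_round > 0:
        exponent = math.log10(gl_round)
        if not math.isclose(exponent, round(exponent)):
            problems.append("gl_round must be a power of 10")
    return problems


ct_interp = {
    "scale_non_text": [2, 2, 3],
    "scale_text": [[2, 2, 3]],
    "vol_interp": "linear",
    "gl_round": 1,
    "roi_interp": "nearest",
    "roi_pv": 0.5,
}
problems = check_interp(ct_interp, "CT")
print(problems or "interp parameters look consistent")
```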
reSeg
Resegmentation parameters. |
||
type |
dict |
|
options |
||
|
Resegmentation range, 2-elements list consists of minimum and maximum intensity value. Use |
|
type |
List |
|
|
Outlier resegmentation algorithm. For now |
|
type |
string |
e.g.
{
"imParamMR" : {
"reSeg" : {
"range" : [0, "inf"],
"outliers" : ""
}
},
"imParamCT" : {
"reSeg" : {
"range" : [-500, 500],
"outliers" : "Collewet"
}
},
"imParamPET" : {
"reSeg" : {
"range" : [0, "inf"],
"outliers" : "Collewet"
}
}
}
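In the example above, an open upper bound is written as the string "inf", since JSON has no literal for infinity. The sketch below shows one way such a range could be turned into numbers and applied to a list of intensities. The helper names are hypothetical, and how MEDimage itself interprets the range is not shown here.

```python
def parse_range(range_values):
    """Convert a 2-element [min, max] list into floats, accepting "inf" strings."""
    if len(range_values) != 2:
        raise ValueError("range must contain exactly two values")
    low, high = (float(v) for v in range_values)  # float("inf") is valid Python
    if low > high:
        raise ValueError("range minimum must not exceed range maximum")
    return low, high


def inside_range(intensities, range_values):
    """Flag, for each intensity, whether it lies inside the resegmentation range."""
    low, high = parse_range(range_values)
    return [low <= x <= high for x in intensities]


ct_mask = inside_range([-700, -100, 0, 499, 800], [-500, 500])
pet_mask = inside_range([-1.0, 0.0, 12.5], [0, "inf"])
print(ct_mask, pet_mask)
```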
discretisation
Discretization parameters.
type: dict
options:
IH (dict): Discretization parameters for intensity histogram features.
IVH (dict): Discretization parameters for intensity volume histogram features.
texture (dict): Discretization parameters for texture features.
IH
Discretization parameters for intensity histogram features.
type: dict
options:
type (string): Discretization algorithm, e.g. “FBS” or “FBN”.
val (int): Bin size or bin number, depending on the algorithm used.
IVH
Discretization parameters for intensity volume histogram features.
type: dict
options:
type (string): Discretization algorithm, e.g. “FBS” or “FBN”.
val (int): Bin size or bin number, depending on the algorithm used.
texture
Discretization parameters for texture features.
type: dict
options:
type (List[string]): List of discretization algorithms, e.g. “FBS” or “FBN”.
val (List[List[int]]): List of bin sizes or bin numbers, depending on the algorithm used. Texture features will be computed for each bin number or bin size in the list.
e.g. for CT only (the parameters are the same for MR and PET):
{
"imParamCT" : {
"discretisation" : {
"IH" : {
"type" : "FBS",
"val" : 25
},
"IVH" : {
"type" : "FBN",
"val" : 10
},
"texture" : {
"type" : ["FBS"],
"val" : [[25]]
}
}
}
}
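To illustrate the difference between a bin size and a bin number, here is a small pure-Python sketch. It assumes that "FBS" and "FBN" stand for the fixed bin size and fixed bin number schemes as defined by the IBSI; the function names are hypothetical and MEDimage's own implementation may differ, for instance in how the minimum intensity is chosen.

```python
import math


def discretise_fbn(values, n_bins):
    """Fixed bin number: spread the intensity range over n_bins equal bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]
    out = []
    for x in values:
        if x == hi:
            out.append(n_bins)  # the maximum falls in the last bin
        else:
            out.append(math.floor(n_bins * (x - lo) / (hi - lo)) + 1)
    return out


def discretise_fbs(values, bin_size, minimum=None):
    """Fixed bin size: bins of width bin_size starting at a minimum intensity."""
    lo = min(values) if minimum is None else minimum
    return [math.floor((x - lo) / bin_size) + 1 for x in values]


intensities = [0, 10, 24, 25, 49, 50, 100]
fbs_bins = discretise_fbs(intensities, bin_size=25)
fbn_bins = discretise_fbn(intensities, n_bins=4)
print(fbs_bins)
print(fbn_bins)
```

With a fixed bin size, the number of bins grows with the intensity range, while a fixed bin number always yields the same number of grey levels regardless of the range.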
compute_suv_map
Computation of the suv map for PET scans. Default |
||
type |
bool |
|
options |
||
|
Will compute suv map for PET scans. |
|
type |
bool |
|
|
Will not compute suv map and it must be computed before. |
|
type |
bool |
This parameter is only used for PET scans and is set as follows:
{
"imParamPET" : {
"compute_suv_map" : true
}
}
Note
This parameter concerns PET scans only. MEDimage only computes the SUV map for DICOM scans, since the computation relies on DICOM headers; for NIfTI scans, it assumes the SUV map has already been computed.
filter_type
Name of the filter to use on the scan. Empty string by default. |
||
type |
string |
|
options |
||
|
Filter images using |
|
type |
string |
|
|
Filter images using |
|
type |
string |
|
|
Filter images using |
|
type |
string |
|
|
Filter images using |
|
type |
string |
|
|
Filter images using |
|
type |
string |
e.g.
{
"imParamPET" : {
"filter_type" : "mean"
},
"imParamMR" : {
"filter_type" : "laws"
},
"imParamCT" : {
"filter_type" : "log"
}
}
Extraction Parameters
Extraction parameters are organized in the same way as the processing parameters: each imaging modality should have its own parameters, and the JSON file should be organized as follows:
{
"imParamMR": {"Extraction params for MR modality"},
"imParamCT": {"Extraction params for CT modality"},
"imParamPET": {"Extraction params for PET modality"}
}
glcm dist_correction
glcm features weighting norm. by default |
||
type |
Union[bool, str] |
|
options |
||
|
Will use |
|
type |
string |
|
|
Will use |
|
type |
string |
|
|
Will use |
|
type |
string |
|
|
Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine. |
|
type |
bool |
|
|
|
|
type |
bool |
e.g.
{
"imParamMR" : {
"glcm" : {
"dist_correction" : false
}
},
"imParamCT" : {
"glcm" : {
"dist_correction" : "chebyshev"
}
},
"imParamPET" : {
"glcm" : {
"dist_correction" : "euclidean"
}
}
}
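The norm names used in the example above ("chebyshev" and "euclidean") give different lengths to the same neighbour direction. The sketch below only illustrates that difference for a 3D offset. The function is hypothetical, and how MEDimage turns the resulting distance into a weight is not covered here.

```python
import math


def direction_distance(direction, norm):
    """Length of a neighbour offset (dx, dy, dz) under the chosen norm."""
    abs_offsets = [abs(d) for d in direction]
    if norm == "chebyshev":
        return float(max(abs_offsets))
    if norm == "euclidean":
        return math.sqrt(sum(d * d for d in abs_offsets))
    raise ValueError(f"unsupported norm: {norm}")


for offset in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]:
    print(offset,
          direction_distance(offset, "chebyshev"),
          round(direction_distance(offset, "euclidean"), 3))
```

Under the Chebyshev norm all 26 direct neighbours in 3D are at distance 1, whereas the Euclidean norm places diagonal neighbours further away than axis-aligned ones.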
glcm merge_method
glcm features aggregation method. by default |
||
type |
string |
|
options |
||
|
Features are extracted from a single matrix after merging all 3D directional matrices. |
|
type |
string |
|
|
Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices. |
|
type |
string |
|
|
Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over direction |
|
type |
string |
|
|
Features are extracted from each 3D directional matrix and averaged over the 3D directions |
|
type |
string |
e.g.
{
"imParamMR" : {
"glcm" : {
"merge_method" : "average"
}
},
"imParamCT" : {
"glcm" : {
"merge_method" : "vol_merge"
}
},
"imParamPET" : {
"glcm" : {
"merge_method" : "dir_merge"
}
}
}
glrlm dist_correction
glrlm features weighting norm. by default |
||
type |
Union[bool, str] |
|
options |
||
|
Will use |
|
type |
string |
|
|
Will use |
|
type |
string |
|
|
Will use |
|
type |
string |
|
|
Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine. |
|
type |
bool |
|
|
|
|
type |
bool |
e.g.
{
"imParamMR" : {
"glrlm" : {
"dist_correction" : false
}
},
"imParamCT" : {
"glrlm" : {
"dist_correction" : "chebyshev"
}
},
"imParamPET" : {
"glrlm" : {
"dist_correction" : "euclidean"
}
}
}
glrlm merge_method
glrlm features aggregation method. by default |
||
type |
string |
|
options |
||
|
Features are extracted from a single matrix after merging all 3D directional matrices. |
|
type |
string |
|
|
Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices. |
|
type |
string |
|
|
Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over direction |
|
type |
string |
|
|
Features are extracted from each 3D directional matrix and averaged over the 3D directions |
|
type |
string |
e.g.
{
"imParamMR" : {
"glrlm" : {
"merge_method" : "average"
}
},
"imParamCT" : {
"glrlm" : {
"merge_method" : "vol_merge"
}
},
"imParamPET" : {
"glrlm" : {
"merge_method" : "dir_merge"
}
}
}
ngtdm dist_correction
ngtdm features weighting norm. by default |
||
type |
bool |
|
options |
||
|
Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine. |
|
type |
bool |
|
|
|
|
type |
bool |
e.g.
{
"imParamMR" : {
"ngtdm" : {
"dist_correction" : false
}
},
"imParamCT" : {
"ngtdm" : {
"dist_correction" : true
}
},
"imParamPET" : {
"ngtdm" : {
"dist_correction" : true
}
}
}
Filtering parameters
Filtering parameters are organized in a separate dictionary that contains the parameters for every filter of MEDimage:
{
"imParamFilter": {
"mean": {"mean filter params"},
"log": {"log filter params"},
"laws": {"laws filter params"},
"gabor": {"gabor filter params"},
"wavelet": {"wavelet filter params"},
"textural": {"textural filter params"}
}
}
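Since filter_type is set per modality while the filter parameters live in imParamFilter, a lookup has to combine the two. The following sketch shows one possible way to do so on an already loaded settings dictionary. The helper get_filter_params is hypothetical and is not a MEDimage function.

```python
def get_filter_params(settings: dict, modality_key: str):
    """Return (filter_type, params) for a modality, or (None, None) if no filter is set."""
    filter_type = settings.get(modality_key, {}).get("filter_type", "")
    if not filter_type:
        return None, None  # an empty string means "no filter"
    filters = settings.get("imParamFilter", {})
    if filter_type not in filters:
        raise KeyError(f"no parameters found for filter '{filter_type}'")
    return filter_type, filters[filter_type]


settings = {
    "imParamCT": {"filter_type": "log"},
    "imParamMR": {"filter_type": ""},
    "imParamFilter": {
        "log": {
            "ndims": 3,
            "sigma": 1.5,
            "orthogonal_rot": False,
            "padding": "constant",
            "name_save": "log_1.5",
        },
    },
}

name, params = get_filter_params(settings, "imParamCT")
print(name, params["sigma"])
```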
mean
Parameters of the mean filter.
type: dict
options:
ndims (int): Dimension of the imaging data. Usually 3.
orthogonal_rot (bool): If …
size (int): Size of the filter kernel.
padding (string): Padding mode, default …
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
e.g.
{
"imParamFilter" : {
"mean" : {
"ndims" : 3,
"orthogonal_rot": false,
"size" : 5,
"padding" : "symmetric",
"name_save" : "mean5"
}
}
}
log
Parameters of the Laplacian of Gaussian filter.
type: dict
options:
ndims (int): Dimension of the imaging data. Usually 3.
sigma (float): Standard deviation of the Gaussian, controls the scale of the convolutional operator.
orthogonal_rot (bool): If …
padding (string): Padding mode, default …
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
e.g.
{
"imParamFilter" : {
"log" : {
"ndims" : 3,
"sigma" : 1.5,
"orthogonal_rot" : false,
"padding" : "constant",
"name_save" : "log_1.5"
}
}
}
laws
Parameters of the Laws filter.
type: dict
options:
config (List[str]): List of strings of every 1D filter to use for the Laws kernel creation. Possible 1D filters: …
energy_distance (float): The Chebyshev distance that will be used to create the Laws texture energy image.
rot_invariance (bool): If …
orthogonal_rot (bool): If …
energy_image (bool): If …
padding (string): Padding mode, default …
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
e.g.
{
"imParamFilter" : {
"laws" : {
"config" : ["L5", "E5", "E5"],
"energy_distance" : 7,
"rot_invariance" : true,
"orthogonal_rot" : false,
"energy_image" : true,
"padding" : "symmetric",
"name_save" : "laws_l5_e5_e5_7"
}
}
}
Note
The order of the 1D filters used in the Laws filter configuration matters, because the configuration list is used to compute the outer product, and the outer product is not commutative.
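The effect of the ordering can be checked on a 2D case with the classic L5 (level) and E5 (edge) Laws vectors. The sketch below uses the unnormalised integer form of these vectors and a hand-written outer product; it is an illustration only and does not reproduce MEDimage's kernel construction.

```python
L5 = [1, 4, 6, 4, 1]     # level vector (unnormalised)
E5 = [-1, -2, 0, 2, 1]   # edge vector (unnormalised)


def outer(u, v):
    """Outer product of two 1D vectors, returned as a list of rows."""
    return [[a * b for b in v] for a in u]


l5e5 = outer(L5, E5)
e5l5 = outer(E5, L5)

# Swapping the operands transposes the kernel instead of leaving it unchanged.
e5l5_transposed = [list(row) for row in zip(*e5l5)]

print(l5e5[0])   # first row of L5 x E5
print(e5l5[0])   # first row of E5 x L5
print(l5e5 == e5l5, l5e5 == e5l5_transposed)
```

The two kernels respond to edges along different axes, which is why ["L5", "E5", "E5"] and ["E5", "L5", "E5"] would describe different filters.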
gabor
Parameters of the Gabor filter.
type: dict
options:
sigma (float): Standard deviation of the Gaussian envelope, controls the scale of the filter.
lambda (float): Wavelength or inverse of the frequency.
gamma (float): Spatial aspect ratio.
theta (str): Angle of the rotation matrix.
rot_invariance (bool): If …
orthogonal_rot (bool): If …
padding (string): Padding mode, default …
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
e.g.
{
"imParamFilter" : {
"gabor" : {
"sigma" : 5,
"lambda" : 2,
"gamma" : 1.5,
"theta" : "Pi/8",
"rot_invariance" : true,
"orthogonal_rot" : true,
"padding" : "symmetric",
"name_save" : "gabor_5_2_1.5"
}
}
}
Note
The theta parameter should be in radians but must be specified as a string; for example, \(\frac{\pi}{2}\) should be specified as “Pi/2”.
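A string such as "Pi/8", as used for theta in the example above, has to be converted to a number of radians at some point. The sketch below shows one way this could be done. The parser is hypothetical, it only accepts the forms "Pi" and "Pi/N" (optionally negated), and it is not taken from MEDimage.

```python
import math
import re

# Accepts "Pi", "pi/8", "-Pi/2", with optional spaces around the tokens.
_ANGLE = re.compile(r"^\s*(-?)\s*pi\s*(?:/\s*(\d+(?:\.\d+)?))?\s*$", re.IGNORECASE)


def parse_angle(text: str) -> float:
    """Convert a string such as "Pi/8" into an angle in radians."""
    match = _ANGLE.match(text)
    if match is None:
        raise ValueError(f"cannot parse angle: {text!r}")
    sign = -1.0 if match.group(1) else 1.0
    denominator = float(match.group(2)) if match.group(2) else 1.0
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return sign * math.pi / denominator


theta = parse_angle("Pi/8")
print(theta)
```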
wavelet
Parameters of the wavelet filter.
type: dict
options:
ndims (int): Dimension of the imaging data. Usually 3.
basis_function (string): Wavelet name used to create the kernel. The Wavelet families and built-ins can be found here. Custom user wavelets are also supported.
subband (string): String of the 1D wavelet kernels …
level (int): The number of decomposition steps to perform.
rot_invariance (bool): If …
padding (string): Padding mode, default …
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
e.g.
{
"imParamFilter" : {
"wavelet" : {
"ndims" : 3,
"basis_function" : "db3",
"subband" : "LLH",
"level" : 1,
"rot_invariance" : true,
"padding" : "symmetric",
"name_save" : "Wavelet_db3_LLH"
}
}
}
textural
Parameters of the textural filter.
type: dict
options:
family (string): Texture features family. Only …
discretization (dict): Discretization parameters for the texture features (defined below).
local (bool): Whether to discretize the ROI locally or globally.
size (int): Filter size.
name_save (string): Saving name added to the end of every radiomics extraction results table (only if the filter was applied).
Discretization (Textural filters)
Discretization parameters for the texture features.
type: dict
options:
type (string): Discretization algorithm: …
bn (int): Bin number. Set if …
bw (int): Bin size. Set if …
adapted (bool): If …
e.g.
{
"imParamFilter" : {
"textural" : {
"family" : "glcm",
"discretization": {
"type" : "FBN",
"bn" : null,
"bw" : 25,
"adapted" : true
},
"size" : 3,
"local" : true,
"name_save" : "glcm_local_fbn_25hu_adapted"
}
}
}
Example of a full settings dictionary
Here is an example of a complete settings dictionary: