Features Extraction

In MEDimage, all subpackages and modules need specific configuration to work correctly, and they all rely on a single JSON configuration file. This file contains the parameters for each step of the workflow (processing, extraction…). For example, each IBSI test requires its own set of radiomics extraction parameters. A full example of the file is available here: notebooks/ibsi/settings/.

This section walks you through how to set up and use the configuration file. It is divided into four subsections:

Pre-checks
Processing
Radiomics
Filters

General analysis Parameters

n_batch

A numerical value that determines the number of batches to be used in parallel computations, set to 0 for serial computation.
type	int

e.g.

{
    "n_batch" : 8
}

roi_type_labels

A list of labels for the regions of interest (ROI) to use in the analysis. The labels must match the names of the corresponding CSV files. For example, if you have a CSV file named `roiNames_GTV.csv`, then `roi_type_labels` must be `["GTV"]`.
type	List[str]

e.g.

{
    "roi_type_labels" : ["GTV"]
}
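
The naming rule above can be checked with a small helper. This is only a sketch: the function name `roi_label_from_csv` is hypothetical, and the `roiNames_` prefix convention is inferred from the single example given in this section rather than taken from the MEDimage API.

```python
from pathlib import Path


def roi_label_from_csv(csv_path: str) -> str:
    """Return the ROI label encoded in a file name such as roiNames_GTV.csv.

    Assumption: the CSV name follows the pattern roiNames_<label>.csv.
    """
    stem = Path(csv_path).stem  # "roiNames_GTV"
    prefix = "roiNames_"
    if not stem.startswith(prefix) or len(stem) == len(prefix):
        raise ValueError(f"unexpected CSV file name: {csv_path}")
    return stem[len(prefix):]


# Build the value expected by "roi_type_labels" from the CSV file name.
labels = [roi_label_from_csv("roiNames_GTV.csv")]
print(labels)  # ['GTV']
```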

roi_types

A list of labels that describe the regions of interest, used to save the analysis results. The labels must accurately reflect the regions analyzed. For instance, if you analyze only one ROI in a `"GTV"` area that contains two different ROIs (`"Mass"` and `"Edema"`), the label can be `["GTVMassOnly"]`. This name will be displayed in the JSON results file.
type	List[str]

e.g.

{
    "roi_types" : ["GTVMassOnly"]
}

Pre-checks Parameters

The pre-radiomics checks configuration is a set of parameters used by the DataManager class. These parameters must be set in a nested dictionary as follows:

{
    "pre_radiomics_checks": {"All parameters go inside this dict"}
}

wildcards_dimensions

List of wildcards for voxel dimension checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example `["Glioma*.MRscan.npy", "STS*.CTscan.npy"]`
type	List[str]

e.g.

{
    "pre_radiomics_checks" : {
        "wildcards_dimensions" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
    }
}

wildcards_window

List of wildcards for intensity window checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example `["Glioma*.MRscan.npy", "STS*.CTscan.npy"]`
type	List[str]

e.g.

{
    "pre_radiomics_checks" : {
        "wildcards_window" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
    }
}
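
To get a feel for which files a wildcard selects, the sketch below matches the two patterns from the examples against a few file names. It assumes shell-style globbing (Python's `fnmatch` module); MEDimage's own matching may differ, and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

wildcards = ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]

# Hypothetical file names, for illustration only.
file_names = [
    "Glioma-TCGA-001__T1.MRscan.npy",
    "STS-McGill-001__CT.CTscan.npy",
    "STS-McGill-001__PET.PTscan.npy",
]


def match_wildcards(names, patterns):
    """Map each wildcard to the file names it selects (case-sensitive)."""
    return {p: [n for n in names if fnmatchcase(n, p)] for p in patterns}


matched = match_wildcards(file_names, wildcards)
for pattern, names in matched.items():
    print(pattern, "->", names)
```

With these names, the PET file is selected by neither pattern, so no check would run on it.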

path_data

Path to your data (`MEDscan` class pickle objects)
type	str

e.g.

{
    "pre_radiomics_checks" : {
        "path_data" : "home/user/medimage/data/npy/sts"
    }
}

path_csv

Path to your dataset csv file (Read more about the CSV File)
type	str

e.g.

{
    "pre_radiomics_checks" : {
        "path_csv" : "home/user/medimage/data/csv/roiNames_GTV.csv"
    }
}

path_save_checks

Path where the pre-checks results will be saved
type	str

e.g.

{
    "pre_radiomics_checks" : {
        "path_save_checks" : "home/user/medimage/checks"
    }
}

Note

Initializing the pre-radiomics checks settings is optional and can be done in the DataManager instance initialization step.

Processing Parameters

Each imaging modality should have its own params dict inside the JSON file and should be organized as follows:

{
    "imParamMR": {"Processing parameters for MR modality"},
    "imParamCT": {"Processing parameters for CT modality"},
    "imParamPET": {"Processing parameters for PET modality"}
}

box_string

Box of the ROI used in the workflow.
type	string
options
`full`	Use the full ROI
	type	string
`box`	Use the smallest box possible
	type	string
`box{n}`	For example, with `box10`, 10 voxels are added in all three dimensions to the smallest bounding box. The number after ‘box’ defines the number of voxels to add.
	type	string
`{n}box`	For example, `2box` uses double the size of the smallest box. The number before ‘box’ defines the size multiplier.
	type	string

e.g.

{
    "imParamCT" : {
        "box_string" : "box7"
    },
    "imParamMR" : {
        "box_string" : "box"
    },
    "imParamPET" : {
        "box_string" : "2box"
    }
}
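
The four `box_string` forms can be told apart with two regular expressions. The parser below is a minimal sketch of that reading; the function name and the returned mode names (`"pad"`, `"scale"`) are made up for illustration and are not part of MEDimage.

```python
import re


def parse_box_string(box_string: str):
    """Return a (mode, value) pair describing a box_string option.

    The modes "pad" and "scale" are hypothetical labels for this sketch.
    """
    if box_string in ("full", "box"):
        return (box_string, None)
    padded = re.fullmatch(r"box(\d+)", box_string)
    if padded:
        # box{n}: add n voxels around the smallest bounding box.
        return ("pad", int(padded.group(1)))
    scaled = re.fullmatch(r"(\d+)box", box_string)
    if scaled:
        # {n}box: multiply the size of the smallest bounding box by n.
        return ("scale", int(scaled.group(1)))
    raise ValueError(f"unrecognized box_string: {box_string!r}")


for value in ["full", "box", "box7", "2box"]:
    print(value, "->", parse_box_string(value))
```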

interp

Interpolation parameters.
type	dict
options
`scale_non_text`	size-3 list of the new voxel size
	type	List[float]
`scale_text`	Lists of size-3 of the new voxel size for texture features (features will be computed for each list)
	type	List[List[float]]
`vol_interp`	Volume interpolation method (“linear”, “spline” or “cubic”)
	type	string
`gl_round`	Set this option only for CT scans. Set it to 1 to round values to the nearest integer (must be a power of 10)
	type	float
`roi_interp`	ROI interpolation method (“nearest”, “linear” or “cubic”)
	type	string
`roi_pv`	Rounding value for ROI intensities. Must be between 0 and 1.
	type	float

e.g.

{
    "imParamMR" : {
        "interp" : {
            "scale_non_text" : [1, 1, 1],
            "scale_text" : [[1, 1, 1]],
            "vol_interp" : "linear",
            "gl_round" : [],
            "roi_interp" : "linear",
            "roi_pv" : 0.5
        }
    },
    "imParamCT" : {
        "interp" : {
            "scale_non_text" : [2, 2, 3],
            "scale_text" : [[2, 2, 3]],
            "vol_interp" : "nearest",
            "gl_round" : 1,
            "roi_interp" : "nearest",
            "roi_pv" : 0.5
        }
    },
    "imParamPET" : {
        "interp" : {
            "scale_non_text" : [3, 3, 3],
            "scale_text" : [[3, 3, 3]],
            "vol_interp" : "spline",
            "gl_round" : [],
            "roi_interp" : "spline",
            "roi_pv" : 0.5
        }
    }
}
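
The two rounding options can be illustrated on plain lists of numbers. This is a sketch under two assumptions that this section does not state explicitly: `gl_round` rounds intensities to the nearest multiple of its value, and `roi_pv` is a threshold at or above which an interpolated mask voxel is kept in the ROI. Both helper names are hypothetical.

```python
def round_grey_levels(values, gl_round):
    """Round each intensity to the nearest multiple of gl_round (assumed behaviour)."""
    return [round(v / gl_round) * gl_round for v in values]


def binarize_roi(mask_values, roi_pv):
    """Turn interpolated mask values in [0, 1] back into a binary mask.

    Assumption: voxels at or above roi_pv belong to the ROI.
    """
    if not 0 <= roi_pv <= 1:
        raise ValueError("roi_pv must be between 0 and 1")
    return [1 if v >= roi_pv else 0 for v in mask_values]


print(round_grey_levels([-10.4, 0.6, 37.2], 1))  # [-10, 1, 37]
print(binarize_roi([0.1, 0.5, 0.9], 0.5))        # [0, 1, 1]
```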

reSeg

Resegmentation parameters.
type	dict
options
`range`	Resegmentation range: a 2-element list consisting of the minimum and maximum intensity values. Use `"inf"` for infinity
	type	List
`outliers`	Outlier resegmentation algorithm. For now `MEDimage` only implements the `"Collewet"` algorithm. Leave empty for no outlier resegmentation
	type	string

e.g.

{
    "imParamMR" : {
        "reSeg" : {
            "range" : [0, "inf"],
            "outliers" : ""
        }
    },
    "imParamCT" : {
        "reSeg" : {
            "range" : [-500, 500],
            "outliers" : "Collewet"
        }
    },
    "imParamPET" : {
        "reSeg" : {
            "range" : [0, "inf"],
            "outliers" : "Collewet"
        }
    }
}
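
As an illustration of the `range` option, the sketch below converts the JSON bounds to floats (so that `"inf"` becomes infinity) and flags which intensities fall inside the range. Treating both bounds as inclusive is an assumption, and the function name is hypothetical.

```python
def range_mask(intensities, reseg_range):
    """Return True for each intensity that lies inside the resegmentation range.

    float() accepts both numbers and the string "inf".
    Assumption: both bounds are inclusive.
    """
    low, high = (float(bound) for bound in reseg_range)
    return [low <= x <= high for x in intensities]


# CT-like example: keep intensities between -500 and 500.
print(range_mask([-700, -500, 20, 500, 900], [-500, 500]))
# Open-ended example: keep every non-negative intensity.
print(range_mask([-1, 0, 1e6], [0, "inf"]))
```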

discretisation

Discretization parameters.
type	dict
options
`IH`	Discretization parameters for intensity histogram features
	type	dict
`IVH`	Discretization parameters for intensity volume histogram features
	type	dict
`texture`	Discretization parameters for texture features
	type	dict

IH

Discretization parameters for intensity histogram features.
type	dict
options
`type`	Discretization algorithm: `"FBS"` for fixed bin size and `"FBN"` for fixed bin number algorithm. Other possible options: `"FBSequal"` and `"FBNequal"`
	type	string
`val`	Bin size or bin number, depending on the algorithm used
	type	int

IVH

Discretization parameters for intensity volume histogram features.
type	dict
options
`type`	Discretization algorithm: `"FBS"` for fixed bin size and `"FBN"` for fixed bin number algorithm
	type	string
`val`	Bin size or bin number, depending on the algorithm used
	type	int

texture

Discretization parameters for texture features.
type	dict
options
`type`	List of discretisation algorithms: `"FBS"` for fixed bin size and `"FBN"` for fixed bin number. Texture features will be computed for each algorithm in the list
	type	List[string]
`val`	List of bin sizes or bin numbers, depending on the algorithm used. Texture features will be computed for each bin number or bin size in the list
	type	List[List[int]]

e.g. for CT only (the parameters are the same for MR and PET):

{
    "imParamCT" : {
        "discretisation" : {
            "IH" : {
                "type" : "FBS",
                "val" : 25
            },
            "IVH" : {
                "type" : "FBN",
                "val" : 10
            },
            "texture" : {
                "type" : ["FBS"],
                "val" : [[25]]
            }
        }
    }
}
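
For readers unfamiliar with the two algorithms, here is a minimal sketch of fixed bin number (FBN) and fixed bin size (FBS) discretisation following the IBSI definitions. It works on plain Python lists and is not MEDimage's implementation; the function names are hypothetical, and the `"FBSequal"` and `"FBNequal"` variants are not covered.

```python
import math


def discretise_fbn(values, n_bins):
    """Fixed bin number: spread the observed intensity range over n_bins bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("FBN needs at least two distinct intensities")
    # The maximum intensity is assigned to the last bin.
    return [
        n_bins if x == hi else math.floor(n_bins * (x - lo) / (hi - lo)) + 1
        for x in values
    ]


def discretise_fbs(values, bin_size, min_value):
    """Fixed bin size: bins of width bin_size starting at min_value."""
    return [math.floor((x - min_value) / bin_size) + 1 for x in values]


print(discretise_fbn([0, 2.5, 5, 10], 10))                    # [1, 3, 6, 10]
print(discretise_fbs([-500, -476, -475, 0, 499], 25, -500))   # [1, 1, 2, 21, 40]
```

With FBS, `min_value` would typically be the lower bound of the resegmentation range, which is why the second call uses -500.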

compute_suv_map

Computation of the SUV map for PET scans. Default `True`
type	bool
options
`True`	Will compute the SUV map for PET scans.
	type	bool
`False`	Will not compute the SUV map; it must have been computed beforehand.
	type	bool

This parameter is only used for PET scans and is set as follows:

{
    "imParamPET" : {
        "compute_suv_map" : true
        }
}

Note

This parameter concerns PET scans only. MEDimage computes the SUV map only for DICOM scans, since the computation relies on DICOM headers; for NIfTI scans, it assumes the SUV map has already been computed.

filter_type

Name of the filter to use on the scan. Empty string by default.
type	string
options
`mean`	Filter images using `mean` filter.
	type	string
`log`	Filter images using `log` filter.
	type	string
`gabor`	Filter images using `gabor` filter.
	type	string
`laws`	Filter images using `laws` filter.
	type	string
`wavelet`	Filter images using `wavelet` filter.
	type	string

e.g.

{
    "imParamPET" : {
        "filter_type" : "mean"
        },
    "imParamMR" : {
        "filter_type" : "laws"
        },
    "imParamCT" : {
        "filter_type" : "log"
        }
}

Extraction Parameters

Extraction parameters are organized in the same way as the processing parameters: each imaging modality should have its own parameters, and the JSON file should be organized as follows:

{
    "imParamMR": {"Extraction params for MR modality"},
    "imParamCT": {"Extraction params for CT modality"},
    "imParamPET": {"Extraction params for PET modality"}
}

glcm dist_correction

GLCM features weighting norm. Default `False`
type	Union[bool, str]
options
`manhattan`	Will use `"manhattan"` weighting norm.
	type	string
`euclidean`	Will use `"euclidean"` weighting norm.
	type	string
`chebyshev`	Will use `"chebyshev"` weighting norm.
	type	string
`True`	Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.
	type	bool
`False`	`False` to replicate IBSI results.
	type	bool

e.g.

{
    "imParamMR" : {
        "glcm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "glcm" : {
            "dist_correction" : "chebyshev"
        }
    },
    "imParamPET" : {
        "glcm" : {
            "dist_correction" : "euclidean"
        }
    }
}
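
The three named norms differ in how they measure the length of a direction offset between two voxels. The sketch below only computes those lengths; how MEDimage turns them into weights is not described in this section, and the function name is hypothetical.

```python
import math


def offset_norm(offset, norm):
    """Length of a voxel offset under the Manhattan, Euclidean or Chebyshev norm."""
    magnitudes = [abs(c) for c in offset]
    if norm == "manhattan":
        return sum(magnitudes)
    if norm == "euclidean":
        return math.sqrt(sum(c * c for c in offset))
    if norm == "chebyshev":
        return max(magnitudes)
    raise ValueError(f"unknown norm: {norm!r}")


diagonal = (1, 1, 0)  # in-plane diagonal neighbour
for name in ("manhattan", "euclidean", "chebyshev"):
    print(name, offset_norm(diagonal, name))
```

For the diagonal neighbour, the lengths are 2, about 1.414 and 1 respectively, so the choice of norm changes how strongly diagonal directions are weighted relative to axis-aligned ones.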

glcm merge_method

GLCM features aggregation method. Default `"vol_merge"`
type	string
options
`vol_merge`	Features are extracted from a single matrix after merging all 3D directional matrices.
	type	string
`slice_merge`	Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices.
	type	string
`dir_merge`	Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over directions.
	type	string
`average`	Features are extracted from each 3D directional matrix and averaged over the 3D directions
	type	string

e.g.

{
    "imParamMR" : {
        "glcm" : {
            "merge_method" : "average"
        }
    },
    "imParamCT" : {
        "glcm" : {
            "merge_method" : "vol_merge"
        }
    },
    "imParamPET" : {
        "glcm" : {
            "merge_method" : "dir_merge"
        }
    }
}
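
The difference between merging matrices first and averaging features afterwards shows up even on a toy example. The sketch below uses two made-up 2x2 directional matrices and the joint maximum feature (the largest entry of the normalised matrix); it illustrates the idea behind `average` and `vol_merge`, not MEDimage's actual code.

```python
def normalise(matrix):
    """Convert a matrix of counts to a matrix of probabilities."""
    total = sum(sum(row) for row in matrix)
    return [[v / total for v in row] for row in matrix]


def joint_maximum(matrix):
    """Joint maximum: the largest probability in the normalised matrix."""
    return max(max(row) for row in normalise(matrix))


def merge(a, b):
    """Merge two directional matrices by adding their counts."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Two hypothetical directional co-occurrence matrices (counts).
dir_a = [[4, 0], [0, 0]]
dir_b = [[0, 2], [2, 4]]

# "average": compute the feature per direction, then average the values.
averaged = (joint_maximum(dir_a) + joint_maximum(dir_b)) / 2
# "vol_merge": merge the matrices first, then compute the feature once.
merged = joint_maximum(merge(dir_a, dir_b))

print(averaged)  # 0.75
print(merged)
```

Here the merged value is 1/3, so the same data yields different feature values depending on the aggregation method, which is why the method must be reported with the results.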

glrlm dist_correction

GLRLM features weighting norm. Default `False`
type	Union[bool, str]
options
`manhattan`	Will use `"manhattan"` weighting norm.
	type	string
`euclidean`	Will use `"euclidean"` weighting norm.
	type	string
`chebyshev`	Will use `"chebyshev"` weighting norm.
	type	string
`True`	Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.
	type	bool
`False`	`False` to replicate IBSI results.
	type	bool

e.g.

{
    "imParamMR" : {
        "glrlm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "glrlm" : {
            "dist_correction" : "chebyshev"
        }
    },
    "imParamPET" : {
        "glrlm" : {
            "dist_correction" : "euclidean"
        }
    }
}

glrlm merge_method

GLRLM features aggregation method. Default `"vol_merge"`
type	string
options
`vol_merge`	Features are extracted from a single matrix after merging all 3D directional matrices.
	type	string
`slice_merge`	Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices.
	type	string
`dir_merge`	Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over directions.
	type	string
`average`	Features are extracted from each 3D directional matrix and averaged over the 3D directions
	type	string

e.g.

{
    "imParamMR" : {
        "glrlm" : {
            "merge_method" : "average"
        }
    },
    "imParamCT" : {
        "glrlm" : {
            "merge_method" : "vol_merge"
        }
    },
    "imParamPET" : {
        "glrlm" : {
            "merge_method" : "dir_merge"
        }
    }
}

ngtdm dist_correction

NGTDM features weighting norm. Default `False`
type	bool
options
`True`	Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.
	type	bool
`False`	`False` to replicate IBSI results.
	type	bool

e.g.

{
    "imParamMR" : {
        "ngtdm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "ngtdm" : {
            "dist_correction" : true
        }
    },
    "imParamPET" : {
        "ngtdm" : {
            "dist_correction" : true
        }
    }
}

Filtering parameters

Filtering parameters are organized in a separate dictionary that contains one sub-dictionary of parameters for each MEDimage filter:

{
    "imParamFilter": {
        "mean": {"mean filter params"},
        "log": {"log filter params"},
        "laws": {"laws filter params"},
        "gabor": {"gabor filter params"},
        "wavelet": {"wavelet filter params"},
        "textural": {"textural filter params"}
    }
}
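
One plausible way to read the two dictionaries together is that a modality's `filter_type` names the entry of `imParamFilter` to use. The sketch below follows that reading on a trimmed-down settings dictionary; it is an assumption about how the pieces connect, and `active_filter_params` is a hypothetical helper, not a MEDimage function.

```python
import json

# Trimmed-down settings, reusing values from the examples in this section.
settings = json.loads("""
{
    "imParamCT": {"filter_type": "mean"},
    "imParamMR": {"filter_type": ""},
    "imParamFilter": {
        "mean": {"ndims": 3, "size": 5, "padding": "symmetric", "name_save": "mean5"},
        "log": {"ndims": 3, "sigma": 1.5, "padding": "constant", "name_save": "log_1.5"}
    }
}
""")


def active_filter_params(settings, modality_key):
    """Return the filter parameters selected by a modality, or None if no filter is set."""
    filter_type = settings[modality_key].get("filter_type", "")
    if not filter_type:
        return None
    return settings["imParamFilter"][filter_type]


print(active_filter_params(settings, "imParamCT"))
print(active_filter_params(settings, "imParamMR"))  # None
```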

mean

Parameters of the mean filter
type	dict
options
`ndims`	Dimension of the imaging data. Usually 3.
	type	int
`orthogonal_rot`	If `True`, the images will be rotated over all the planes.
	type	bool
`size`	Size of the filter kernel.
	type	int
`padding`	Padding mode, default `"symmetric"`. All the padding modes possible can be found here
	type	string
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

e.g.

{
    "imParamFilter" : {
        "mean" : {
            "ndims" : 3,
            "orthogonal_rot": false,
            "size" : 5,
            "padding" : "symmetric",
            "name_save" : "mean5"
        }
    }
}

log

Parameters of the Laplacian of Gaussian filter
type	dict
options
`ndims`	Dimension of the imaging data. Usually 3.
	type	int
`sigma`	Standard deviation of the Gaussian, controls the scale of the convolutional operator.
	type	float
`orthogonal_rot`	If `True`, the images will be rotated over all the planes.
	type	bool
`padding`	Padding mode, default `"symmetric"`. All the padding modes possible can be found here
	type	string
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

e.g.

{
    "imParamFilter" : {
        "log" : {
            "ndims" : 3,
            "sigma" : 1.5,
            "orthogonal_rot" : false,
            "padding" : "constant",
            "name_save" : "log_1.5"
        }
    }
}

laws

Parameters of the Laws filter
type	dict
options
`config`	List of strings naming each 1D filter to use for the Laws kernel creation. Possible 1D filters: `"L3"`, `"L5"`, `"E3"`, `"E5"`, `"S3"`, `"S5"`, `"W5"` or `"R5"`
	type	List[str]
`energy_distance`	The Chebyshev distance that will be used to create the laws texture energy image.
	type	float
`rot_invariance`	If `True`, rotational invariance will be approximated.
	type	bool
`orthogonal_rot`	If `True`, the images will be rotated over all the planes.
	type	bool
`energy_image`	If `True`, Laws texture energy images are computed.
	type	bool
`padding`	Padding mode, default `"symmetric"`. All the padding modes possible can be found here
	type	string
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

e.g.

{
    "imParamFilter" : {
        "laws" : {
            "config" : ["L5", "E5", "E5"],
            "energy_distance" : 7,
            "rot_invariance" : true,
            "orthogonal_rot" : false,
            "energy_image" : true,
            "padding" : "symmetric",
            "name_save" : "laws_l5_e5_e5_7"
        }
    }
}

Note

The order of the 1D filters in the Laws filter configuration matters, because the configuration list is used to compute the outer product, and the outer product is not commutative.
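
The effect of the ordering is easy to see on two 1D kernels. The sketch below uses the common unnormalised forms of the L5 and E5 Laws kernels (any scaling factors are left out) and shows that swapping the order gives the transposed 2D kernel rather than the same one.

```python
# Unnormalised 1D Laws kernels (scaling factors omitted for simplicity).
L5 = [1, 4, 6, 4, 1]     # level
E5 = [-1, -2, 0, 2, 1]   # edge


def outer(u, v):
    """Outer product of two 1D kernels, giving a 2D kernel."""
    return [[a * b for b in v] for a in u]


l5e5 = outer(L5, E5)
e5l5 = outer(E5, L5)

print(l5e5[0])  # [-1, -2, 0, 2, 1]
print(e5l5[0])  # [-1, -4, -6, -4, -1]
print(l5e5 == e5l5)  # False
```

So `["L5", "E5"]` and `["E5", "L5"]` describe different filters: one responds to edges along one axis, the other to edges along the other axis.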

gabor

Parameters of the gabor filter
type	dict
options
`sigma`	Standard deviation of the Gaussian envelope, controls the scale of the filter.
	type	float
`lambda`	Wavelength or inverse of the frequency.
	type	float
`gamma`	Spatial aspect ratio.
	type	float
`theta`	Angle of the rotation matrix.
	type	str
`rot_invariance`	If `True`, rotational invariance will be approximated by combining the response maps of several elements of the Gabor filter bank.
	type	bool
`orthogonal_rot`	If `True`, the images will be rotated over all the planes.
	type	bool
`padding`	Padding mode, default `"symmetric"`. All the padding modes possible can be found here
	type	string
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

e.g.

{
    "imParamFilter" : {
        "gabor" : {
            "sigma" : 5,
            "lambda" : 2,
            "gamma" : 1.5,
            "theta" : "Pi/8",
            "rot_invariance" : true,
            "orthogonal_rot" : true,
            "padding" : "symmetric",
            "name_save" : "gabor_5_2_1.5"
        }
    }
}

Note

The theta parameter is an angle in radians but must be specified as a string; for example, \(\frac{\pi}{2}\) should be specified as “Pi/2”.
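
A string such as "Pi/8" from the example has to be turned into a number before it can be used as an angle. The parser below is a minimal sketch of one way to do that; it is not MEDimage's own parser, the function name is hypothetical, and it only accepts the forms "Pi", "Pi/N" and their negatives.

```python
import math
import re


def parse_theta(text: str) -> float:
    """Convert strings such as "Pi/8" or "-Pi/2" to an angle in radians."""
    match = re.fullmatch(
        r"(-?)pi(?:/(\d+(?:\.\d+)?))?",
        text.replace(" ", ""),
        flags=re.IGNORECASE,
    )
    if match is None:
        raise ValueError(f"cannot parse angle: {text!r}")
    sign = -1.0 if match.group(1) else 1.0
    divisor = float(match.group(2)) if match.group(2) else 1.0
    return sign * math.pi / divisor


print(parse_theta("Pi/8"))
print(parse_theta("Pi/2") == math.pi / 2)  # True
```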

wavelet

Parameters of the wavelet filter
type	dict
options
`ndims`	Dimension of the imaging data. Usually 3.
	type	int
`basis_function`	Wavelet name used to create the kernel. The Wavelet families and built-ins can be found here. Custom user wavelets are also supported.
	type	string
`subband`	String of the 1D wavelet kernels (`"H"` for high-pass filter or `"L"` for low-pass filter). Must have a size of `ndims`.
	type	string
`level`	The number of decomposition steps to perform.
	type	int
`rot_invariance`	If `True`, rotational invariance will be approximated.
	type	bool
`padding`	Padding mode, default `"symmetric"`. All the padding modes possible can be found here
	type	string
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

e.g.

{
    "imParamFilter" : {
        "wavelet" : {
            "ndims" : 3,
            "basis_function" : "db3",
            "subband" : "LLH",
            "level" : 1,
            "rot_invariance" : true,
            "padding" : "symmetric",
            "name_save" : "Wavelet_db3_LLH"
        }
    }
}
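
The constraint on `subband` (one `"L"` or `"H"` per dimension) can be checked before running an extraction. The helpers below are a sketch with hypothetical names; they validate a subband string and list every valid subband for a given `ndims`.

```python
from itertools import product


def check_subband(subband: str, ndims: int) -> str:
    """Validate that subband has one "L" or "H" character per dimension."""
    if len(subband) != ndims:
        raise ValueError(f"subband {subband!r} must have {ndims} characters")
    invalid = set(subband) - {"L", "H"}
    if invalid:
        raise ValueError(f"invalid characters in subband: {sorted(invalid)}")
    return subband


def all_subbands(ndims: int):
    """List every combination of low-pass and high-pass filters."""
    return ["".join(combo) for combo in product("LH", repeat=ndims)]


print(check_subband("LLH", 3))  # LLH
print(all_subbands(3))
```

For `ndims` equal to 3 there are eight possible subbands, from `"LLL"` to `"HHH"`.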

textural

Parameters of the textural filter
type	dict
options
`family`	Texture features family. Only `"glcm"` is supported for now.
	type	string
`discretization`	Discretization parameters for the texture features (Defined down below).
	type	dict
`local`	Whether to discretize the ROI locally or globally.
	type	bool
`size`	Filter size.
	type	int
`name_save`	Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).
	type	string

Discretization (Textural filters)

Discretization parameters for the texture features of the textural filter.
type	dict
options
`type`	Discretization algorithm: `"FBS"` for fixed bin size and `"FBN"` for fixed bin number algorithm.
	type	string
`bn`	Bin number. Set if `type` is `"FBN"`.
	type	int
`bw`	Bin size. Set if `type` is `"FBS"` or `type` is `"FBN"` and `adapted` is `True`.
	type	int
`adapted`	If `True`, the bin number will be computed using the bin width and the intensity range. Only valid if `type` is `"FBN"`.
	type	bool

e.g.

{
    "imParamFilter" : {
        "textural" : {
            "family" : "glcm",
            "discretization": {
                "type" : "FBN",
                "bn" : null,
                "bw" : 25,
                "adapted" : true
            },
            "size" : 3,
            "local" : true,
            "name_save" : "glcm_local_fbn_25hu_adapted"
        }
    }
}

Example of a full settings dictionary

Here is an example of a complete settings dictionary: