Features Extraction

In MEDimage, every subpackage and module needs specific parameters to work correctly, and they all read them from a single JSON configuration file. This file contains parameters for each step of the workflow (processing, extraction…). For example, each IBSI test requires its own radiomics extraction parameters. You can check a full example of the file here: notebooks/ibsi/settings/.

This section walks you through how to set up and use the configuration file. It is divided into several subsections, one per group of parameters.

General analysis Parameters

n_batch

A numerical value that determines the number of batches to be used in parallel computations, set to 0 for serial computation.

type

int

e.g.

{
    "n_batch" : 8
}

roi_type_labels

A list of labels for the regions of interest (ROI) to use in the analysis. The labels must match the names of the corresponding CSV files. For example, if you have a CSV file named roiNames_GTV.csv, then roi_type_labels must be ["GTV"].

type

List[str]

e.g.

{
    "roi_type_labels" : ["GTV"]
}

roi_types

A list of labels that describe the regions of interest, used to save the analysis results. The labels must accurately reflect the regions analyzed. For instance, if a "GTV" area contains two different ROIs ("Mass" and "Edema") and you analyze only one of them, the label can be ["GTVMassOnly"]. This name will be displayed in the JSON results file.

type

List[str]

e.g.

{
    "roi_types" : ["GTVMassOnly"]
}
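As a rough illustration of how the three general parameters fit together, the sketch below parses a settings string with Python's standard json module and checks the documented types. The check_general_params helper and the inline SETTINGS_TEXT are hypothetical and are not part of MEDimage; they only mirror the types listed above.

```python
import json

# Hypothetical settings text; the keys are the three general parameters above.
SETTINGS_TEXT = """
{
    "n_batch": 8,
    "roi_type_labels": ["GTV"],
    "roi_types": ["GTVMassOnly"]
}
"""


def check_general_params(settings: dict) -> list:
    """Return a list of problems found in the general analysis parameters."""
    problems = []
    n_batch = settings.get("n_batch")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(n_batch, int) or isinstance(n_batch, bool) or n_batch < 0:
        problems.append("n_batch must be a non-negative int")
    for key in ("roi_type_labels", "roi_types"):
        value = settings.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
    return problems


settings = json.loads(SETTINGS_TEXT)
problems = check_general_params(settings)
print(problems or "general parameters look consistent")
```

Running a check like this before launching a batch extraction can catch a mistyped value early, for example "n_batch" given as a string.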

Pre-checks Parameters

The pre-radiomics checks configuration is a set of parameters used by the DataManager class. These parameters must be set in a nested dictionary as follows:

{
    "pre_radiomics_checks": {"All parameters go inside this dict"}
}

wildcards_dimensions

List of wildcards for voxel dimension checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]

type

List[str]

e.g.

{
    "pre_radiomics_checks" : {
        "wildcards_dimensions" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
    }
}

wildcards_window

List of wildcards for intensity window checks (Read about wildcards here). Checks will be run for every wildcard in the list. For example ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]

type

List[str]

e.g.

{
    "pre_radiomics_checks" : {
        "wildcards_window" : ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]
    }
}
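To get a feel for which files a wildcard selects, the sketch below applies the two example patterns to a few file names using shell-style matching from Python's fnmatch module. This assumes the wildcards behave like shell globs; the file names are invented for illustration, and MEDimage's own matching rules may differ.

```python
from fnmatch import fnmatchcase

wildcards = ["Glioma*.MRscan.npy", "STS*.CTscan.npy"]

# Hypothetical file names, for illustration only.
files = [
    "Glioma-001__T1.MRscan.npy",
    "STS-012__CT.CTscan.npy",
    "STS-012__PET.PTscan.npy",
]


def files_for(wildcard, names):
    """Return the names that match a shell-style wildcard (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, wildcard)]


matches = {wildcard: files_for(wildcard, files) for wildcard in wildcards}
for wildcard, selected in matches.items():
    print(wildcard, "->", selected)
```

In this sketch the PET file is not selected by either pattern, so no check would be run on it.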

path_data

Path to your data (MEDscan class pickle objects)

type

str

e.g.

{
    "pre_radiomics_checks" : {
        "path_data" : "home/user/medimage/data/npy/sts"
    }
}

path_csv

Path to your dataset csv file (Read more about the CSV File)

type

str

e.g.

{
    "pre_radiomics_checks" : {
        "path_csv" : "home/user/medimage/data/csv/roiNames_GTV.csv"
    }
}

path_save_checks

Path where the pre-checks results will be saved

type

str

e.g.

{
    "pre_radiomics_checks" : {
        "path_save_checks" : "home/user/medimage/checks"
    }
}

Note

Initializing the pre-radiomics checks settings is optional and can be done in the DataManager instance initialization step.

Processing Parameters

Each imaging modality should have its own params dict inside the JSON file and should be organized as follows:

{
    "imParamMR": {"Processing parameters for MR modality"},
    "imParamCT": {"Processing parameters for CT modality"},
    "imParamPET": {"Processing parameters for PET modality"}
}

box_string

Box of the ROI used in the workflow.

type

string

options

full

Use the full ROI

type

string

box

Use the smallest box possible

type

string

box{n}

The number after ‘box’ defines the number of voxels to add. For example, with box10, 10 voxels are added in all three dimensions to the smallest bounding box.

type

string

{n}box

The number before ‘box’ defines the multiplication in size. For example, 2box will use double the size of the smallest box.

type

string

e.g.

{
    "imParamCT" : {
        "box_string" : "box7"
    },
    "imParamMR" : {
        "box_string" : "box"
    },
    "imParamPET" : {
        "box_string" : "2box"
    }
}
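The four box_string forms described above can be told apart with two small regular expressions. The following is a minimal sketch of such a parser; parse_box_string and the mode names it returns ("pad", "scale") are hypothetical and are not MEDimage's own implementation.

```python
import re


def parse_box_string(box_string: str):
    """Split a box_string into (mode, n); n is None when no number applies.

    Hypothetical helper: the mode names are illustrative only.
    """
    if box_string in ("full", "box"):
        return (box_string, None)
    match = re.fullmatch(r"box(\d+)", box_string)
    if match:
        # box{n}: add n voxels around the smallest bounding box.
        return ("pad", int(match.group(1)))
    match = re.fullmatch(r"(\d+)box", box_string)
    if match:
        # {n}box: multiply the size of the smallest bounding box by n.
        return ("scale", int(match.group(1)))
    raise ValueError(f"unrecognized box_string: {box_string!r}")


for value in ("full", "box", "box7", "2box"):
    print(value, "->", parse_box_string(value))
```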

interp

Interpolation parameters.

type

dict

options

scale_non_text

size-3 list of the new voxel size

type

List[float]

scale_text

Lists of size-3 of the new voxel size for texture features (features will be computed for each list)

type

List[List[float]]

vol_interp

Volume interpolation method (“linear”, “spline” or “cubic”)

type

string

gl_round

Set this option only for CT scans. It defines the rounding precision of the intensities and must be a power of 10; set it to 1 to round values to the nearest integer.

type

float

roi_interp

ROI interpolation method (“nearest”, “linear” or “cubic”)

type

string

roi_pv

Rounding value for ROI intensities. Must be between 0 and 1.

type

float

e.g.

{
    "imParamMR" : {
        "interp" : {
            "scale_non_text" : [1, 1, 1],
            "scale_text" : [[1, 1, 1]],
            "vol_interp" : "linear",
            "gl_round" : [],
            "roi_interp" : "linear",
            "roi_pv" : 0.5
        }
    },
    "imParamCT" : {
        "interp" : {
            "scale_non_text" : [2, 2, 3],
            "scale_text" : [[2, 2, 3]],
            "vol_interp" : "nearest",
            "gl_round" : 1,
            "roi_interp" : "nearest",
            "roi_pv" : 0.5
        }
    },
    "imParamPET" : {
        "interp" : {
            "scale_non_text" : [3, 3, 3],
            "scale_text" : [[3, 3, 3]],
            "vol_interp" : "spline",
            "gl_round" : [],
            "roi_interp" : "spline",
            "roi_pv" : 0.5
        }
    }
}
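The effect of gl_round and roi_pv can be pictured with two small helpers. This is only a sketch under stated assumptions: that gl_round rounds each intensity to the nearest multiple of the given power of 10, and that a voxel stays in the ROI when its interpolated mask value is greater than or equal to roi_pv. Both helper names are hypothetical, and MEDimage's own handling (for example of ties) may differ.

```python
import math


def round_grey_levels(values, gl_round):
    """Round each intensity to the nearest multiple of gl_round (halves round up)."""
    return [math.floor(v / gl_round + 0.5) * gl_round for v in values]


def binarize_roi(mask_values, roi_pv):
    """Keep voxels whose interpolated mask value reaches the roi_pv threshold."""
    if not 0 <= roi_pv <= 1:
        raise ValueError("roi_pv must be between 0 and 1")
    return [1 if v >= roi_pv else 0 for v in mask_values]


print(round_grey_levels([-12.4, 0.5, 37.6], 1))   # [-12, 1, 38]
print(round_grey_levels([-12.4, 0.5, 37.6], 10))  # [-10, 0, 40]
print(binarize_roi([0.0, 0.49, 0.5, 1.0], 0.5))   # [0, 0, 1, 1]
```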

reSeg

Resegmentation parameters.

type

dict

options

range

Resegmentation range: a 2-element list containing the minimum and maximum intensity values. Use "inf" for infinity.

type

List

outliers

Outlier resegmentation algorithm. For now, MEDimage only implements the "Collewet" algorithm. Leave empty for no outlier resegmentation.

type

string

e.g.

{
    "imParamMR" : {
        "reSeg" : {
            "range" : [0, "inf"],
            "outliers" : ""
        }
    },
    "imParamCT" : {
        "reSeg" : {
            "range" : [-500, 500],
            "outliers" : "Collewet"
        }
    },
    "imParamPET" : {
        "reSeg" : {
            "range" : [0, "inf"],
            "outliers" : "Collewet"
        }
    }
}
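Because JSON has no literal for infinity, the range uses the string "inf". The sketch below shows one way such a range could be turned into numeric bounds and applied as a mask. The helpers parse_range and reseg_mask are hypothetical, and treating both bounds as inclusive is an assumption rather than a documented MEDimage behaviour.

```python
def parse_range(range_setting):
    """Convert a [min, max] setting to floats; the string "inf" becomes infinity."""
    low, high = (float(value) for value in range_setting)
    if low > high:
        raise ValueError("range minimum must not exceed range maximum")
    return low, high


def reseg_mask(intensities, range_setting):
    """Flag the intensities that fall inside the range (bounds assumed inclusive)."""
    low, high = parse_range(range_setting)
    return [low <= x <= high for x in intensities]


print(parse_range([0, "inf"]))
print(reseg_mask([-700, -500, 20, 500, 900], [-500, 500]))
```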

discretisation

Discretization parameters.

type

dict

options

IH

Discretization parameters for intensity histogram features

type

dict

IVH

Discretization parameters for intensity volume histogram features

type

dict

texture

Discretization parameters for texture features

type

dict

  • IH

Discretization parameters for intensity histogram features.

type

dict

options

type

Discretization algorithm: "FBS" for fixed bin size and "FBN" for fixed bin number algorithm. Other possible options: "FBSequal" and "FBNequal"

type

string

val

Bin size or bin number, depending on the algorithm used

type

int

  • IVH

Discretization parameters for intensity volume histogram features.

type

dict

options

type

Discretization algorithm: "FBS" for fixed bin size and "FBN" for fixed bin number algorithm

type

string

val

Bin size or bin number, depending on the algorithm used

type

int

  • texture

Discretization parameters for texture features.

type

dict

options

type

List of discretisation algorithms: "FBS" for fixed bin size and "FBN" for fixed bin number. Texture features will be computed for each algorithm in the list

type

List[string]

val

List of bin sizes or bin numbers, depending on the algorithm used. Texture features will be computed for each bin number or bin size in the list

type

List[List[int]]

e.g. for CT only (the parameters are the same for MR and PET):

{
    "imParamCT" : {
        "discretisation" : {
            "IH" : {
                "type" : "FBS",
                "val" : 25
            },
            "IVH" : {
                "type" : "FBN",
                "val" : 10
            },
            "texture" : {
                "type" : ["FBS"],
                "val" : [[25]]
            }
        }
    }
}
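To make the difference between "FBN" and "FBS" concrete, here is a minimal sketch of both schemes following the usual IBSI definitions: fixed bin number splits the observed intensity range into a fixed count of bins, while fixed bin size uses bins of constant width starting from a lower bound. The function names and the lower_bound argument are assumptions for illustration, this is not MEDimage's implementation, and the "FBSequal" and "FBNequal" variants are not covered.

```python
import math


def discretise_fbn(values, n_bins):
    """Fixed bin number: map intensities to bins 1..n_bins over [min, max]."""
    low, high = min(values), max(values)
    if high == low:
        # Degenerate case: every voxel has the same intensity.
        return [1] * len(values)
    return [
        n_bins if x == high else math.floor(n_bins * (x - low) / (high - low)) + 1
        for x in values
    ]


def discretise_fbs(values, bin_size, lower_bound):
    """Fixed bin size: bins of width bin_size, counted from lower_bound."""
    return [math.floor((x - lower_bound) / bin_size) + 1 for x in values]


print(discretise_fbn([0, 25, 50, 75, 100], 4))                # [1, 2, 3, 4, 4]
print(discretise_fbs([-500, -480, -475, 0, 499], 25, -500))   # [1, 1, 2, 21, 40]
```

With "FBS", the number of bins depends on the intensity range, which is why a resegmentation range is typically set alongside it for CT.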

compute_suv_map

Whether to compute the SUV map for PET scans. Defaults to True.

type

bool

options

True

Will compute the SUV map for PET scans.

type

bool

False

Will not compute the SUV map; it must have been computed beforehand.

type

bool

This parameter is only used for PET scans and is set as follows:

{
    "imParamPET" : {
        "compute_suv_map" : true
        }
}

Note

This parameter concerns PET scans only. MEDimage computes the SUV map only for DICOM scans, because the computation relies on DICOM headers; for NIfTI scans, it assumes the SUV map has already been computed.

filter_type

Name of the filter to use on the scan. Empty string by default.

type

string

options

mean

Filter images using mean filter.

type

string

log

Filter images using log filter.

type

string

gabor

Filter images using gabor filter.

type

string

laws

Filter images using laws filter.

type

string

wavelet

Filter images using wavelet filter.

type

string

e.g.

{
    "imParamPET" : {
        "filter_type" : "mean"
        },
    "imParamMR" : {
        "filter_type" : "laws"
        },
    "imParamCT" : {
        "filter_type" : "log"
        }
}

Extraction Parameters

Extraction parameters are organized in the same way as the processing parameters: each imaging modality should have its own parameters, and the JSON file should be organized as follows:

{
    "imParamMR": {"Extraction params for MR modality"},
    "imParamCT": {"Extraction params for CT modality"},
    "imParamPET": {"Extraction params for PET modality"}
}

glcm dist_correction

Weighting norm for GLCM features. Defaults to False.

type

Union[bool, str]

options

manhattan

Will use "manhattan" weighting norm.

type

string

euclidean

Will use "euclidean" weighting norm.

type

string

chebyshev

Will use "chebyshev" weighting norm.

type

string

True

Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.

type

bool

False

False to replicate IBSI results.

type

bool

e.g.

{
    "imParamMR" : {
        "glcm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "glcm" : {
            "dist_correction" : "chebyshev"
        }
    },
    "imParamPET" : {
        "glcm" : {
            "dist_correction" : "euclidean"
        }
    }
}
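The three string options name standard distance norms, which give different lengths to diagonal neighbour directions. The sketch below only computes those lengths for a direction vector; direction_norm is a hypothetical helper, and how MEDimage converts the resulting distance into a weight is not shown here.

```python
import math


def direction_norm(direction, norm):
    """Length of a neighbour direction vector under the chosen norm."""
    components = [abs(c) for c in direction]
    if norm == "manhattan":
        return sum(components)
    if norm == "euclidean":
        return math.sqrt(sum(c * c for c in components))
    if norm == "chebyshev":
        return max(components)
    raise ValueError(f"unknown norm: {norm!r}")


# An in-plane diagonal neighbour: one step along x and one along y.
diagonal = (1, 1, 0)
for name in ("manhattan", "euclidean", "chebyshev"):
    print(name, direction_norm(diagonal, name))
```

For axis-aligned neighbours such as (1, 0, 0), all three norms give 1, so the choice only affects diagonal directions.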

glcm merge_method

Aggregation method for GLCM features. Defaults to "vol_merge".

type

string

options

vol_merge

Features are extracted from a single matrix after merging all 3D directional matrices.

type

string

slice_merge

Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices.

type

string

dir_merge

Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over directions.

type

string

average

Features are extracted from each 3D directional matrix and averaged over the 3D directions

type

string

e.g.

{
    "imParamMR" : {
        "glcm" : {
            "merge_method" : "average"
        }
    },
    "imParamCT" : {
        "glcm" : {
            "merge_method" : "vol_merge"
        }
    },
    "imParamPET" : {
        "glcm" : {
            "merge_method" : "dir_merge"
        }
    }
}

glrlm dist_correction

Weighting norm for GLRLM features. Defaults to False.

type

Union[bool, str]

options

manhattan

Will use "manhattan" weighting norm.

type

string

euclidean

Will use "euclidean" weighting norm.

type

string

chebyshev

Will use "chebyshev" weighting norm.

type

string

True

Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.

type

bool

False

False to replicate IBSI results.

type

bool

e.g.

{
    "imParamMR" : {
        "glrlm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "glrlm" : {
            "dist_correction" : "chebyshev"
        }
    },
    "imParamPET" : {
        "glrlm" : {
            "dist_correction" : "euclidean"
        }
    }
}

glrlm merge_method

Aggregation method for GLRLM features. Defaults to "vol_merge".

type

string

options

vol_merge

Features are extracted from a single matrix after merging all 3D directional matrices.

type

string

slice_merge

Features are extracted from a single matrix after merging 2D directional matrices per slice, and then averaged over slices.

type

string

dir_merge

Features are extracted from a single matrix after merging 2D directional matrices per direction, and then averaged over directions.

type

string

average

Features are extracted from each 3D directional matrix and averaged over the 3D directions

type

string

e.g.

{
    "imParamMR" : {
        "glrlm" : {
            "merge_method" : "average"
        }
    },
    "imParamCT" : {
        "glrlm" : {
            "merge_method" : "vol_merge"
        }
    },
    "imParamPET" : {
        "glrlm" : {
            "merge_method" : "dir_merge"
        }
    }
}

ngtdm dist_correction

Distance correction for NGTDM features. Defaults to False.

type

bool

options

True

Will use discretization length difference corrections as used by the Institute of Physics and Engineering in Medicine.

type

bool

False

False to replicate IBSI results.

type

bool

e.g.

{
    "imParamMR" : {
        "ngtdm" : {
            "dist_correction" : false
        }
    },
    "imParamCT" : {
        "ngtdm" : {
            "dist_correction" : true
        }
    },
    "imParamPET" : {
        "ngtdm" : {
            "dist_correction" : true
        }
    }
}

Filtering parameters

Filtering parameters are organized in a separate dictionary that contains one sub-dictionary of parameters for each filter available in MEDimage:

{
    "imParamFilter": {
        "mean": {"mean filter params"},
        "log": {"log filter params"},
        "laws": {"laws filter params"},
        "gabor": {"gabor filter params"},
        "wavelet": {"wavelet filter params"},
        "textural": {"textural filter params"}
    }
}
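One plausible way to read this layout is that the filter_type of a modality selects the matching entry of imParamFilter. The sketch below illustrates that lookup on a plain dictionary; filter_params_for is a hypothetical helper, and the assumption that filter_type is used as the key into imParamFilter is not confirmed by this page.

```python
# Trimmed, illustrative settings; values are taken from the examples on this page.
settings = {
    "imParamCT": {"filter_type": "log"},
    "imParamMR": {"filter_type": ""},
    "imParamFilter": {
        "mean": {"ndims": 3, "size": 5, "name_save": "mean5"},
        "log": {"ndims": 3, "sigma": 1.5, "name_save": "log_1.5"},
    },
}


def filter_params_for(settings, modality_key):
    """Return the filter parameters selected by a modality, or None if unfiltered."""
    filter_type = settings.get(modality_key, {}).get("filter_type", "")
    if not filter_type:
        return None  # empty string: no filter requested for this modality
    filters = settings.get("imParamFilter", {})
    if filter_type not in filters:
        raise KeyError(f"no parameters defined for filter {filter_type!r}")
    return filters[filter_type]


print(filter_params_for(settings, "imParamCT"))
print(filter_params_for(settings, "imParamMR"))
```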

mean

Parameters of the mean filter

type

dict

options

ndims

Dimension of the imaging data. Usually 3.

type

int

orthogonal_rot

If True, the images will be rotated over all the planes.

type

bool

size

Size of the filter kernel.

type

int

padding

Padding mode, default "symmetric". All the padding modes possible can be found here

type

string

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

e.g.

{
    "imParamFilter" : {
        "mean" : {
            "ndims" : 3,
            "orthogonal_rot" : false,
            "size" : 5,
            "padding" : "symmetric",
            "name_save" : "mean5"
        }
    }
}
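The roles of size and padding are easiest to see in one dimension. The sketch below is a pure-Python 1D mean filter, assuming that "symmetric" padding mirrors the signal at each end with the edge sample repeated (the behaviour of the "symmetric" mode in numpy.pad). It is an illustration only and not the 3D filter that MEDimage applies.

```python
def pad_symmetric(signal, width):
    """Mirror the signal at both ends, repeating the edge sample."""
    if width == 0:
        return list(signal)
    if width > len(signal):
        raise ValueError("pad width must not exceed the signal length")
    left = signal[:width][::-1]
    right = signal[-width:][::-1]
    return list(left) + list(signal) + list(right)


def mean_filter_1d(signal, size):
    """Average each sample with its neighbours inside a window of `size` samples."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number in this sketch")
    padded = pad_symmetric(list(signal), size // 2)
    return [sum(padded[i:i + size]) / size for i in range(len(signal))]


print(pad_symmetric([1, 2, 3], 2))        # [2, 1, 1, 2, 3, 3, 2]
print(mean_filter_1d([1, 2, 3, 4, 5], 3))
```

The output has the same length as the input because the padding supplies the neighbours that are missing at the borders.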

log

Parameters of the Laplacian of Gaussian filter

type

dict

options

ndims

Dimension of the imaging data. Usually 3.

type

int

sigma

Standard deviation of the Gaussian, controls the scale of the convolutional operator.

type

float

orthogonal_rot

If True, the images will be rotated over all the planes.

type

bool

padding

Padding mode, default "symmetric". All the padding modes possible can be found here

type

string

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

e.g.

{
    "imParamFilter" : {
        "log" : {
            "ndims" : 3,
            "sigma" : 1.5,
            "orthogonal_rot" : false,
            "padding" : "constant",
            "name_save" : "log_1.5"
        }
    }
}

laws

Parameters of the laws filter

type

dict

options

config

List of the names of the 1D filters used to create the Laws kernel. Possible 1D filters: "L3", "L5", "E3", "E5", "S3", "S5", "W5" or "R5"

type

List[str]

energy_distance

The Chebyshev distance that will be used to create the laws texture energy image.

type

float

rot_invariance

If True, rotational invariance will be approximated.

type

bool

orthogonal_rot

If True, the images will be rotated over all the planes.

type

bool

energy_image

If True, Laws texture energy images are computed.

type

bool

padding

Padding mode, default "symmetric". All the padding modes possible can be found here

type

string

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

e.g.

{
    "imParamFilter" : {
        "laws" : {
            "config" : ["L5", "E5", "E5"],
            "energy_distance" : 7,
            "rot_invariance" : true,
            "orthogonal_rot" : false,
            "energy_image" : true,
            "padding" : "symmetric",
            "name_save" : "laws_l5_e5_e5_7"
        }
    }
}

Note

The order of the 1D filters in the Laws filter configuration matters, because the configuration list is used to compute the outer product, and the outer product is not commutative.
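The effect of the ordering can be checked by hand on a 2D case. The sketch below builds the outer product of the standard L5 (level) and E5 (edge) Laws vectors in both orders, using plain Python lists; it is a two-filter illustration, whereas a three-entry config such as ["L5", "E5", "E5"] yields a 3D kernel.

```python
# Standard 1D Laws vectors of length 5.
L5 = [1, 4, 6, 4, 1]     # level (local average)
E5 = [-1, -2, 0, 2, 1]   # edge


def outer(u, v):
    """Outer product of two 1D vectors, returned as a list of rows."""
    return [[a * b for b in v] for a in u]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of rows."""
    return [list(row) for row in zip(*matrix)]


l5e5 = outer(L5, E5)
e5l5 = outer(E5, L5)

print(l5e5[0])  # [-1, -2, 0, 2, 1]
print(e5l5[0])  # [-1, -4, -6, -4, -1]
print(l5e5 == e5l5, l5e5 == transpose(e5l5))  # False True
```

Swapping the two names therefore produces the transposed kernel, which responds to edges along the other axis.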

gabor

Parameters of the gabor filter

type

dict

options

sigma

Standard deviation of the Gaussian envelope, controls the scale of the filter.

type

float

lambda

Wavelength or inverse of the frequency.

type

float

gamma

Spatial aspect ratio.

type

float

theta

Angle of the rotation matrix.

type

str

rot_invariance

If True, rotational invariance will be approximated by combining the response maps of several elements of the Gabor filter bank.

type

bool

orthogonal_rot

If True, the images will be rotated over all the planes.

type

bool

padding

Padding mode, default "symmetric". All the padding modes possible can be found here

type

string

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

e.g.

{
    "imParamFilter" : {
        "gabor" : {
            "sigma" : 5,
            "lambda" : 2,
            "gamma" : 1.5,
            "theta" : "Pi/8",
            "rot_invariance" : true,
            "orthogonal_rot" : true,
            "padding" : "symmetric",
            "name_save" : "gabor_5_2_1.5"
        }
    }
}

Note

The theta parameter is an angle in radians, but it must be specified as a string. For example, \(\frac{\pi}{2}\) should be specified as “Pi/2”.
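For reference, a string such as "Pi/8" can be turned into a numeric angle with a small parser like the one sketched below. theta_to_radians is a hypothetical helper and not MEDimage's own parsing code; only the "Pi/N" form appears on this page, so the optional sign and "k*Pi/N" multiplier accepted here are assumptions.

```python
import math
import re

# Accepts "Pi", "Pi/8", "-Pi/2" and "3*Pi/4" (case-insensitive).
_THETA_PATTERN = re.compile(
    r"^\s*(-?)\s*(?:(\d+(?:\.\d+)?)\s*\*\s*)?pi\s*(?:/\s*(\d+(?:\.\d+)?))?\s*$",
    re.IGNORECASE,
)


def theta_to_radians(text: str) -> float:
    """Convert a string such as "Pi/8" to an angle in radians."""
    match = _THETA_PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse angle: {text!r}")
    sign, multiplier, divisor = match.groups()
    value = math.pi * float(multiplier or 1) / float(divisor or 1)
    return -value if sign else value


for text in ("Pi/8", "Pi/2", "3*Pi/4"):
    print(text, "->", theta_to_radians(text))
```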

wavelet

Parameters of the wavelet filter

type

dict

options

ndims

Dimension of the imaging data. Usually 3.

type

int

basis_function

Wavelet name used to create the kernel. The Wavelet families and built-ins can be found here. Custom user wavelets are also supported.

type

string

subband

String of the 1D wavelet kernels ("H" for high-pass filter or "L" for low-pass filter). Must have a size of ndims.

type

string

level

The number of decomposition steps to perform.

type

int

rot_invariance

If True, rotational invariance will be approximated.

type

bool

padding

Padding mode, default "symmetric". All the padding modes possible can be found here

type

string

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

e.g.

{
    "imParamFilter" : {
        "wavelet" : {
            "ndims" : 3,
            "basis_function" : "db3",
            "subband" : "LLH",
            "level" : 1,
            "rot_invariance" : true,
            "padding" : "symmetric",
            "name_save" : "Wavelet_db3_LLH"
        }
    }
}
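The constraint on subband (one "L" or "H" per dimension) can be checked before running the filter. The sketch below validates a subband string against ndims and lists every possible subband for a given number of dimensions; both helper names are hypothetical and not part of MEDimage.

```python
from itertools import product


def check_subband(subband: str, ndims: int) -> bool:
    """True if subband has one 'L' or 'H' character per dimension."""
    return len(subband) == ndims and set(subband) <= {"L", "H"}


def all_subbands(ndims: int) -> list:
    """Every combination of low-pass and high-pass filters over ndims axes."""
    return ["".join(combo) for combo in product("LH", repeat=ndims)]


print(check_subband("LLH", 3))  # True
print(check_subband("LH", 3))   # False
print(all_subbands(3))
```

For 3D data there are eight subbands, from "LLL" to "HHH"; the example above selects "LLH".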

textural

Parameters of the textural filter

type

dict

options

family

Texture features family. Only "glcm" is supported for now.

type

string

discretization

Discretization parameters for the texture features (defined below).

type

dict

local

Whether to discretize the ROI locally or globally.

type

bool

size

Filter size.

type

int

name_save

Saving name added to the end of every radiomics extraction results table (Only if the filter was applied).

type

string

  • Discretization (Textural filters)

Discretization parameters for the textural filter.

type

dict

options

type

Discretization algorithm: "FBS" for fixed bin size and "FBN" for fixed bin number algorithm.

type

string

bn

Bin number. Set if type is "FBN".

type

int

bw

Bin size. Set if type is "FBS" or type is "FBN" and adapted is True.

type

int

adapted

If True, the bin number will be computed using the bin width and the intensity range. Only valid if type is "FBN".

type

bool

e.g.

{
    "imParamFilter" : {
        "textural" : {
            "family" : "glcm",
            "discretization" : {
                "type" : "FBN",
                "bn" : null,
                "bw" : 25,
                "adapted" : true
            },
            "size" : 3,
            "local" : true,
            "name_save" : "glcm_local_fbn_25hu_adapted"
        }
    }
}
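The rules relating type, bn, bw and adapted can be expressed as a small consistency check. The sketch below encodes only what is stated above (bn for "FBN", bw for "FBS" or for adapted "FBN", adapted valid only with "FBN"); check_textural_discretization is a hypothetical helper, not a MEDimage function.

```python
def check_textural_discretization(disc: dict) -> list:
    """Return a list of inconsistencies in a textural discretization dict."""
    kind = disc.get("type")
    adapted = bool(disc.get("adapted", False))
    if kind not in ("FBS", "FBN"):
        return ['type must be "FBS" or "FBN"']
    problems = []
    if kind == "FBS":
        if disc.get("bw") is None:
            problems.append("bw is required when type is FBS")
        if adapted:
            problems.append("adapted is only valid when type is FBN")
    elif adapted:
        if disc.get("bw") is None:
            problems.append("bw is required when type is FBN and adapted is true")
    elif disc.get("bn") is None:
        problems.append("bn is required when type is FBN and adapted is false")
    return problems


# The discretization block from the example above (JSON null becomes None).
example = {"type": "FBN", "bn": None, "bw": 25, "adapted": True}
print(check_textural_discretization(example))  # []
```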

Example of a full settings dictionary

Here is an example of a complete settings dictionary: