Wrangling

DataManager

class MEDimage.wrangling.DataManager.DataManager(path_to_dicoms: List = [], path_to_niftis: List = [], path_csv: Path | str | None = None, path_save: Path | str | None = None, path_save_checks: Path | str | None = None, path_pre_checks_settings: Path | str | None = None, save: bool = True, n_batch: int = 2)[source]

Bases: object

Reads all the raw data (DICOM, NIfTI) content and organizes it in instances of the MEDscan class.

class DICOM(stack_series_rs: List, stack_path_rs: List, stack_frame_rs: List, cell_series_id: List, cell_path_rs: List, cell_path_images: List, cell_frame_rs: List, cell_frame_id: List)[source]

Bases: object

DICOM data management class that organizes data during conversion to the MEDscan class.

cell_frame_id: List
cell_frame_rs: List
cell_path_images: List
cell_path_rs: List
cell_series_id: List
stack_frame_rs: List
stack_path_rs: List
stack_series_rs: List
class NIfTI(stack_path_images: List, stack_path_roi: List, stack_path_all: List)[source]

Bases: object

NIfTI data management class that organizes data during conversion to the MEDscan class.

stack_path_all: List
stack_path_images: List
stack_path_roi: List
class Paths(_path_to_dicoms: List, _path_to_niftis: List, _path_csv: Path | str, _path_save: Path | str, _path_save_checks: Path | str, _path_pre_checks_settings: Path | str)[source]

Bases: object

Paths management class that organizes the paths used during processing.

__init__(path_to_dicoms: List = [], path_to_niftis: List = [], path_csv: Path | str | None = None, path_save: Path | str | None = None, path_save_checks: Path | str | None = None, path_pre_checks_settings: Path | str | None = None, save: bool = True, n_batch: int = 2) None[source]

Constructor of the class DataManager.

Parameters:
  • path_to_dicoms (Union[Path, str], optional) – Full path to the starting directory where the DICOM data is located.

  • path_to_niftis (Union[Path, str], optional) – Full path to the starting directory where the NIfTI data is located.

  • path_csv (Union[Path, str], optional) – Full path to the CSV file containing the scans info list.

  • path_save (Union[Path, str], optional) – Full path to the directory where to save all the MEDscan classes.

  • path_save_checks (Union[Path, str], optional) – Full path to the directory where to save all the pre-radiomics checks analysis results.

  • path_pre_checks_settings (Union[Path, str], optional) – Full path to the JSON file of the pre-checks analysis parameters.

  • save (bool, optional) – True to save the MEDscan classes in path_save.

  • n_batch (int, optional) – Number of batches to use in the parallel computations (use 0 for serial computation).

Returns:

None
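A minimal sketch of how the constructor arguments might be assembled. The directory layout (`DICOM`, `NIfTI`, `MEDscans`, `checks`), the CSV and JSON file names, and both helper names are hypothetical; only the keyword names and default values come from the signature above. Note that the signature annotates `path_to_dicoms` and `path_to_niftis` as `List` while the parameter descriptions refer to a single starting directory; this sketch follows the descriptions. The MEDimage import is deferred so the argument-building helper can be used on its own.

```python
from pathlib import Path
from typing import Any, Dict


def build_datamanager_kwargs(root: Path) -> Dict[str, Any]:
    """Collect DataManager keyword arguments for a hypothetical project layout."""
    return {
        "path_to_dicoms": root / "DICOM",            # hypothetical folder name
        "path_to_niftis": root / "NIfTI",            # hypothetical folder name
        "path_csv": root / "scans_info.csv",         # hypothetical file name
        "path_save": root / "MEDscans",              # hypothetical folder name
        "path_save_checks": root / "checks",         # hypothetical folder name
        "path_pre_checks_settings": root / "pre_checks_settings.json",
        "save": True,    # documented default
        "n_batch": 2,    # documented default; 0 requests serial computation
    }


def make_datamanager(root: Path):
    """Instantiate DataManager; requires MEDimage to be installed."""
    from MEDimage.wrangling.DataManager import DataManager

    return DataManager(**build_datamanager_kwargs(root))
```

Keeping the arguments in a plain dictionary makes it easy to inspect or log them before any processing starts.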

perform_ct_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None[source]

Summarizes CT imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.

Parameters:
  • wildcards_scans (List[str]) – List of wildcards that determines the scans that will be analyzed (only CT scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.CTscan.npy"].

  • path_data (Path, optional) – Path to the MEDscan objects, if not specified will use path_save from the inner-class Paths in the current instance.

  • path_save_checks (Path, optional) – Path where to save the checks, if not specified will use the one in the current instance.

  • min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.

  • max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.

Returns:

None.
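To illustrate how a wildcard such as "STS*.CTscan.npy" selects scans, here is a small sketch using Python's standard `fnmatch` module. The file names and the `select_scans` helper are hypothetical, and whether MEDimage resolves wildcards with `fnmatch`, `glob`, or something else is an assumption not confirmed by this page.

```python
from fnmatch import fnmatchcase
from typing import List


def select_scans(file_names: List[str], wildcards: List[str]) -> List[str]:
    """Keep the file names that match at least one shell-style wildcard."""
    return [
        name
        for name in file_names
        if any(fnmatchcase(name, pattern) for pattern in wildcards)
    ]


# Hypothetical file names, for illustration only.
files = [
    "STS-001__CT.CTscan.npy",
    "STS-001__T1.MRscan.npy",
    "HN-002__CT.CTscan.npy",
]

# Only the CT scan of the "STS" study matches the documented example pattern.
ct_only = select_scans(files, ["STS*.CTscan.npy"])
```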

perform_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None[source]

Summarizes CT and MR imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.

Parameters:
  • wildcards_scans (List[str]) –

    List of wildcards that determines the scans that will be analyzed (CT and MRI scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.CTscan.npy", "STS*.MRscan.npy"].

  • path_data (Path, optional) – Path to the MEDscan objects, if not specified will use path_save from the inner-class Paths in the current instance.

  • path_save_checks (Path, optional) – Path where to save the checks, if not specified will use the one in the current instance.

  • min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.

  • max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.

Returns:

None.
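The `min_percentile` and `max_percentile` parameters can be read as fractions in [0, 1] that bound the values shown in a histogram. The sketch below computes such bounds with linear interpolation in pure Python. It is an illustration under that assumption, not MEDimage's implementation, and the helper names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of `values` for a fraction q in [0, 1]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def percentile_range(
    values: Sequence[float],
    min_percentile: float = 0.05,
    max_percentile: float = 0.95,
) -> Tuple[float, float]:
    """Return the (low, high) bounds that a histogram could be restricted to."""
    return percentile(values, min_percentile), percentile(values, max_percentile)
```

With the default settings, the lowest and highest 5% of the values fall outside the returned bounds, which keeps a few outlying acquisitions from stretching the histogram axis.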

perform_mr_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None[source]

Summarizes MR imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.

Parameters:
  • wildcards_scans (List[str]) –

    List of wildcards that determines the scans that will be analyzed (Only MRI scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.MRscan.npy"].

  • path_data (Path, optional) – Path to the MEDscan objects, if not specified will use path_save from the inner-class Paths in the current instance.

  • path_save_checks (Path, optional) – Path where to save the checks, if not specified will use the one in the current instance.

  • min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.

  • max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.

Returns:

None.

pre_radiomics_checks(path_data: Path | str | None = None, wildcards_dimensions: List = [], wildcards_window: List = [], path_csv: Path | str | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95, bin_width: int = 0, hist_range: list = [], nifti: bool = False, save: bool = False) None[source]

Finds proper dimension and re-segmentation ranges options for radiomics analyses.

The resulting files from this method can then be analyzed and used to set up radiomics parameters options in computation methods.

Parameters:
  • path_data (Path, optional) – Path to the MEDscan objects, if not specified will use path_save from the inner-class Paths in the current instance.

  • wildcards_dimensions (List[str], optional) –

    List of wildcards that determines the scans that will be analyzed. You can learn more about wildcards in this link.

  • wildcards_window (List[str], optional) –

    List of wildcards that determines the scans that will be analyzed. You can learn more about wildcards in this link.

  • path_csv (Union[str, Path], optional) – Path to a csv file containing a list of the scans that will be analyzed (a CSV file for a single ROI type).

  • min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.

  • max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.

  • bin_width (int, optional) – Width of the bins for the histograms. If not provided, will use the default number of bins in the method pandas.DataFrame.hist: 10 bins.

  • hist_range (list, optional) – Range of the histograms. If empty, will use the minimum and maximum values.

  • nifti (bool, optional) – Set to True if the scans are nifti files. Defaults to False.

  • save (bool, optional) – If True, will save the results in a json file. Defaults to False.

Returns:

None
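The interplay between `bin_width` and `hist_range` can be sketched as follows. This is an assumption about how the two options combine (a zero `bin_width` falls back to 10 bins, an empty `hist_range` falls back to the data minimum and maximum), written from the parameter descriptions above rather than from the library source. The `histogram_edges` helper is hypothetical.

```python
import math
from typing import List, Sequence


def histogram_edges(
    values: Sequence[float],
    bin_width: float = 0,
    hist_range: Sequence[float] = (),
) -> List[float]:
    """Compute histogram bin edges from an optional bin width and range."""
    # An empty range means "use the minimum and maximum values".
    if len(hist_range) == 2:
        low, high = hist_range
    else:
        low, high = min(values), max(values)

    if bin_width > 0:
        # Enough fixed-width bins to cover the whole range.
        n_bins = max(1, math.ceil((high - low) / bin_width))
        return [low + i * bin_width for i in range(n_bins + 1)]

    # No bin width given: fall back to 10 equal bins.
    n_bins = 10
    step = (high - low) / n_bins
    return [low + i * step for i in range(n_bins + 1)]
```

For example, `histogram_edges([0, 100], bin_width=25)` yields five edges (four bins), while omitting `bin_width` yields eleven edges (ten bins).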

process_all() None[source]

Processes both DICOM and NIfTI content to create MEDscan classes.

process_all_dicoms() List[MEDscan] | None[source]

This function reads the DICOM content of all the sub-folder tree of a starting directory defined by path_to_dicoms. It then organizes the data (files throughout the starting directory are associated by ‘SeriesInstanceUID’) in the MEDscan class including the region of interest (ROI) defined by an associated RTstruct. All MEDscan classes hereby created are saved in path_save with a name varying with every scan.

Returns:

List of MEDscan instances.

Return type:

List[MEDscan]
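The association of files by 'SeriesInstanceUID' can be pictured as a simple grouping step. The sketch below works on (path, UID) pairs so that it runs without any DICOM files; the `group_by_series` helper and the sample records are hypothetical, and this is not the library's own code.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def group_by_series(records: Iterable[Tuple[Path, str]]) -> Dict[str, List[Path]]:
    """Group file paths by their SeriesInstanceUID."""
    series: Dict[str, List[Path]] = defaultdict(list)
    for path, series_uid in records:
        series[series_uid].append(path)
    return dict(series)


# Hypothetical records. In practice the UID would be read from each file
# header, for instance with
# pydicom.dcmread(path, stop_before_pixels=True).SeriesInstanceUID.
records = [
    (Path("patient1/a/slice_001.dcm"), "1.2.3.1"),
    (Path("patient1/b/slice_002.dcm"), "1.2.3.1"),
    (Path("patient1/b/slice_001.dcm"), "1.2.3.2"),
]
grouped = group_by_series(records)
```

Grouping by UID rather than by folder is what allows slices of one series to be scattered throughout the directory tree.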

process_all_niftis() List[MEDscan][source]

This function reads the NIfTI content of all the sub-folder tree of a starting directory. It then organizes the data in the MEDscan class including the region of interest (ROI) defined by an associated mask file. All MEDscan classes hereby created are saved in a specific path with a name varying with every scan.

Parameters:

None.

Returns:

List of MEDscan instances.

Return type:

List[MEDscan]

summarize()[source]

Creates and shows a summary of processed scans organized by study, institution, scan type and ROI type.

Parameters:

None

Returns:

None

update_from_csv(path_csv: Path | str | None = None) None[source]

Updates the class from a given CSV and summarizes the processed scans again according to it.

Parameters:

path_csv (Union[str, Path], optional) – Path to a CSV file. If not given, the CSV info stored in the class attributes is used.

Returns:

None

ProcessDICOM

class MEDimage.wrangling.ProcessDICOM.ProcessDICOM(path_images: List[Path], path_rs: List[Path], path_save: str | Path, save: bool)[source]

Bases: object

Class to process DICOM files and extract the imaging volume and 3D masks from them in order to organize the data in a MEDscan class object.

__init__(path_images: List[Path], path_rs: List[Path], path_save: str | Path, save: bool) None[source]
Parameters:
  • path_images (List[Path]) – List of paths to the dicom files of a single scan.

  • path_rs (List[Path]) – List of paths to the RT struct dicom files for the same scan.

  • path_save (Union[str, Path]) – Path to the folder where the MEDscan object will be saved.

  • save (bool) – Whether to save the MEDscan object or not.

Returns:

None.

combine_slices(slice_datasets: List[FileDataset]) List[ndarray][source]

Given a list of pydicom datasets for an image series, stitch them together into a three-dimensional numpy array of imaging data. Also calculate a 4x4 affine transformation matrix that converts the ijk pixel indices into the xyz coordinates in the DICOM patient’s coordinate system, and a 4x4 rotation and scaling matrix. If any of the DICOM images contain either the Rescale Slope or the Rescale Intercept attributes, they are applied to each slice individually. This function requires that the datasets:

If any of these conditions are not met, a dicom_numpy.DicomImportException is raised.

Parameters:

slice_datasets (List[pydicom.dataset.FileDataset]) – List of dicom headers.

Returns:

List of numpy arrays containing the data extracted from the DICOM files (voxels, translation, rotation and scaling matrix).

Return type:

List[numpy.ndarray]
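The 4x4 affine that maps ijk indices to patient xyz coordinates can be sketched from standard DICOM attributes: Image Orientation (Patient), Pixel Spacing, and Image Position (Patient). The version below is written in pure Python under stated assumptions: index i runs along the row direction, j along the column direction, k along the slice normal, and the slice spacing is supplied by the caller. All function names are hypothetical, and this is not the code used by `combine_slices`.

```python
from typing import List, Sequence, Tuple


def cross(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Cross product of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def ijk_to_xyz_affine(
    image_orientation: Sequence[float],  # six direction cosines
    pixel_spacing: Sequence[float],      # [row spacing, column spacing]
    slice_spacing: float,                # assumed known by the caller
    image_position: Sequence[float],     # xyz of the first voxel
) -> List[List[float]]:
    """Build a 4x4 affine mapping (i, j, k) indices to patient (x, y, z)."""
    row_cosine = list(image_orientation[:3])
    column_cosine = list(image_orientation[3:])
    slice_cosine = cross(row_cosine, column_cosine)
    row_spacing, column_spacing = pixel_spacing

    # Stepping along a row moves by the column spacing, and vice versa.
    axes = (row_cosine, column_cosine, slice_cosine)
    scales = (column_spacing, row_spacing, slice_spacing)

    affine = [[0.0] * 4 for _ in range(4)]
    for col, (axis, scale) in enumerate(zip(axes, scales)):
        for row in range(3):
            affine[row][col] = axis[row] * scale
    for row in range(3):
        affine[row][3] = float(image_position[row])
    affine[3][3] = 1.0
    return affine


def apply_affine(
    affine: List[List[float]], ijk: Sequence[float]
) -> Tuple[float, float, float]:
    """Map a voxel index to patient coordinates."""
    i, j, k = ijk
    x, y, z = (
        affine[r][0] * i + affine[r][1] * j + affine[r][2] * k + affine[r][3]
        for r in range(3)
    )
    return (x, y, z)


def rescale(stored_value: float, slope: float = 1.0, intercept: float = 0.0) -> float:
    """Apply Rescale Slope and Rescale Intercept to a stored pixel value."""
    return stored_value * slope + intercept
```

The rotation and scaling information lives in the upper-left 3x3 block of the affine, and the translation in its last column.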

process_files()[source]

Reads DICOM files (imaging volume + ROIs) in the instance data path and then organizes it in the MEDscan class.

Parameters:

None.

Returns:

Instance of a MEDscan class.

Return type:

medscan (MEDscan)