Wrangling
DataManager
- class MEDimage.wrangling.DataManager.DataManager(path_to_dicoms: List = [], path_to_niftis: List = [], path_csv: Path | str | None = None, path_save: Path | str | None = None, path_save_checks: Path | str | None = None, path_pre_checks_settings: Path | str | None = None, save: bool = True, n_batch: int = 2)[source]
Bases:
object
Reads all the raw data (DICOM, NIfTI) content and organizes it in instances of the MEDscan class.
- class DICOM(stack_series_rs: List, stack_path_rs: List, stack_frame_rs: List, cell_series_id: List, cell_path_rs: List, cell_path_images: List, cell_frame_rs: List, cell_frame_id: List)[source]
Bases:
object
DICOM data management class that organizes the data during conversion to the MEDscan class.
- cell_frame_id: List
- cell_frame_rs: List
- cell_path_images: List
- cell_path_rs: List
- cell_series_id: List
- stack_frame_rs: List
- stack_path_rs: List
- stack_series_rs: List
- class NIfTI(stack_path_images: List, stack_path_roi: List, stack_path_all: List)[source]
Bases:
object
NIfTI data management class that organizes the data during conversion to the MEDscan class.
- stack_path_all: List
- stack_path_images: List
- stack_path_roi: List
- class Paths(_path_to_dicoms: List, _path_to_niftis: List, _path_csv: Path | str, _path_save: Path | str, _path_save_checks: Path | str, _path_pre_checks_settings: Path | str)[source]
Bases:
object
Path management class that organizes the paths used during processing.
- __init__(path_to_dicoms: List = [], path_to_niftis: List = [], path_csv: Path | str | None = None, path_save: Path | str | None = None, path_save_checks: Path | str | None = None, path_pre_checks_settings: Path | str | None = None, save: bool = True, n_batch: int = 2) None [source]
Constructor of the class DataManager.
- Parameters:
path_to_dicoms (Union[Path, str], optional) – Full path to the starting directory where the DICOM data is located.
path_to_niftis (Union[Path, str], optional) – Full path to the starting directory where the NIfTI data is located.
path_csv (Union[Path, str], optional) – Full path to the CSV file containing the scans info list.
path_save (Union[Path, str], optional) – Full path to the directory where to save all the MEDscan classes.
path_save_checks (Union[Path, str], optional) – Full path to the directory where to save all the pre-radiomics checks analysis results.
path_pre_checks_settings (Union[Path, str], optional) – Full path to the JSON file of the pre-checks analysis parameters.
save (bool, optional) – True to save the MEDscan classes in path_save.
n_batch (int, optional) – Number of batches to use in the parallel computations (use 0 for serial computation).
- Returns:
None
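A minimal sketch of how the constructor arguments might be assembled, assuming MEDimage is installed. The directory layout and the helper build_manager are hypothetical; only the keyword names come from the signature above. The signature annotates path_to_dicoms as List while the parameter description calls it a path, so a single path is assumed here.

```python
from pathlib import Path

# Hypothetical project layout; replace with real locations.
root = Path("/data/my_study")

# Keyword names mirror the documented constructor signature.
manager_kwargs = {
    "path_to_dicoms": root / "dicoms",  # assumed to accept a single path
    "path_csv": root / "scans_info.csv",
    "path_save": root / "medscans",
    "path_save_checks": root / "checks",
    "save": True,
    "n_batch": 0,  # 0 requests serial computation, per the parameter description
}


def build_manager(kwargs: dict):
    """Hypothetical helper: import lazily so this file loads without MEDimage."""
    from MEDimage.wrangling.DataManager import DataManager

    return DataManager(**kwargs)


# Typical call, not executed here: dm = build_manager(manager_kwargs)
```

Setting n_batch to 0 while exploring a new dataset keeps tracebacks easier to read than with parallel batches.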
- perform_ct_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None [source]
Summarizes CT imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.
- Parameters:
wildcards_scans (List[str]) – List of wildcards that determines the scans that will be analyzed (only CT scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.CTscan.npy"].
path_data (Path, optional) – Path to the MEDscan objects. If not specified, path_save from the inner class Paths of the current instance is used.
path_save_checks (Path, optional) – Path where to save the checks. If not specified, the one in the current instance is used.
min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.
max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.
- Returns:
None.
- perform_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None [source]
Summarizes CT and MR imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.
- Parameters:
wildcards_scans (List[str]) – List of wildcards that determines the scans that will be analyzed (CT and MRI scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.CTscan.npy", "STS*.MRscan.npy"].
path_data (Path, optional) – Path to the MEDscan objects. If not specified, path_save from the inner class Paths of the current instance is used.
path_save_checks (Path, optional) – Path where to save the checks. If not specified, the one in the current instance is used.
min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.
max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.
- Returns:
None.
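The wildcard patterns accepted by the summary methods can be illustrated with the standard library. This sketch assumes glob-style matching semantics, which the documentation does not confirm; the file names and the helper select_scans are invented for illustration and are not part of MEDimage.

```python
from fnmatch import fnmatchcase

# Hypothetical MEDscan file names, invented for illustration only.
file_names = [
    "STS-McGill-001__T1.MRscan.npy",
    "STS-McGill-001__CT.CTscan.npy",
    "STS-McGill-002__PET.PTscan.npy",
    "GBM-Other-001__T1.MRscan.npy",
]

# Patterns taken from the example in the parameter description.
wildcards_scans = ["STS*.CTscan.npy", "STS*.MRscan.npy"]


def select_scans(names, wildcards):
    """Return the names matching at least one wildcard, in input order."""
    return [n for n in names if any(fnmatchcase(n, w) for w in wildcards)]


selected = select_scans(file_names, wildcards_scans)
print(selected)
```

Under this assumption, the PET scan and the scan whose name does not start with STS are left out.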
- perform_mr_imaging_summary(wildcards_scans: List[str], path_data: Path | None = None, path_save_checks: Path | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95) None [source]
Summarizes MRI imaging acquisition parameters. Plots summary histograms for different dimensions and saves all acquisition parameters locally in JSON files.
- Parameters:
wildcards_scans (List[str]) – List of wildcards that determines the scans that will be analyzed (only MRI scans will be analyzed). You can learn more about wildcards in this link. For example: ["STS*.MRscan.npy"].
path_data (Path, optional) – Path to the MEDscan objects. If not specified, path_save from the inner class Paths of the current instance is used.
path_save_checks (Path, optional) – Path where to save the checks. If not specified, the one in the current instance is used.
min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.
max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.
- Returns:
None.
- pre_radiomics_checks(path_data: Path | str | None = None, wildcards_dimensions: List = [], wildcards_window: List = [], path_csv: Path | str | None = None, min_percentile: float = 0.05, max_percentile: float = 0.95, bin_width: int = 0, hist_range: list = [], nifti: bool = False, save: bool = False) None [source]
Finds proper dimension and re-segmentation ranges options for radiomics analyses.
The resulting files from this method can then be analyzed and used to set up radiomics parameters options in computation methods.
- Parameters:
path_data (Path, optional) – Path to the MEDscan objects. If not specified, path_save from the inner class Paths of the current instance is used.
wildcards_dimensions (List[str], optional) – List of wildcards that determines the scans that will be analyzed. You can learn more about wildcards in this link.
wildcards_window (List[str], optional) – List of wildcards that determines the scans that will be analyzed. You can learn more about wildcards in this link.
path_csv (Union[str, Path], optional) – Path to a csv file containing a list of the scans that will be analyzed (a CSV file for a single ROI type).
min_percentile (float, optional) – Minimum percentile to use for the histograms. Defaults to 0.05.
max_percentile (float, optional) – Maximum percentile to use for the histograms. Defaults to 0.95.
bin_width (int, optional) – Width of the bins for the histograms. If not provided, will use the default number of bins in the method pandas.DataFrame.hist: 10 bins.
hist_range (list, optional) – Range of the histograms. If empty, will use the minimum and maximum values.
nifti (bool, optional) – Set to True if the scans are nifti files. Defaults to False.
save (bool, optional) – If True, will save the results in a json file. Defaults to False.
- Returns:
None
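To give a feel for what min_percentile and max_percentile mean, the sketch below computes a 5th to 95th percentile range over a list of values. How the library applies these bounds internally is not stated here, so this is an assumption-laden illustration; the function percentile and the spacing values are hypothetical.

```python
def percentile(values, fraction):
    """Linearly interpolated percentile for a fraction in [0, 1]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical in-plane voxel spacings in mm, with one outlier.
spacings = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 5.0]

low = percentile(spacings, 0.05)   # same default as min_percentile
high = percentile(spacings, 0.95)  # same default as max_percentile
print(f"range covering the central 90%: {low:.2f} to {high:.2f} mm")
```

Trimming at percentiles rather than at the raw minimum and maximum keeps a single outlier from stretching the reported range.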
- process_all_dicoms() List[MEDscan] | None [source]
This function reads the DICOM content of all the sub-folder tree of a starting directory defined by path_to_dicoms. It then organizes the data (files throughout the starting directory are associated by ‘SeriesInstanceUID’) in the MEDscan class including the region of interest (ROI) defined by an associated RTstruct. All MEDscan classes hereby created are saved in path_save with a name varying with every scan.
- Returns:
List of MEDscan instances.
- Return type:
List[MEDscan]
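The association of files by SeriesInstanceUID can be sketched as a simple grouping step. This is an illustration of the idea only, not the library's implementation: the paths, the UIDs and the helper group_by_series are hypothetical, and in practice each UID would be read from the DICOM header (for example with pydicom).

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical (file path, SeriesInstanceUID) pairs gathered from a folder tree.
headers = [
    (Path("patient01/a/0001.dcm"), "1.2.840.1"),
    (Path("patient01/b/0002.dcm"), "1.2.840.1"),
    (Path("patient01/a/0003.dcm"), "1.2.840.2"),
]


def group_by_series(pairs):
    """Map each SeriesInstanceUID to the list of files that carry it."""
    series = defaultdict(list)
    for path, uid in pairs:
        series[uid].append(path)
    return dict(series)


grouped = group_by_series(headers)
for uid, paths in grouped.items():
    print(uid, [p.name for p in paths])
```

Note that files of the same series are grouped even when they sit in different sub-folders, which matches the description of scanning the whole sub-folder tree.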
- process_all_niftis() List[MEDscan] [source]
This function reads the NIfTI content of all the sub-folder tree of a starting directory. It then organizes the data in the MEDscan class including the region of interest (ROI) defined by an associated mask file. All MEDscan classes hereby created are saved in a specific path with a name varying with every scan.
- Parameters:
None.
- Returns:
List of MEDscan instances.
- Return type:
List[MEDscan]
- summarize()[source]
Creates and shows a summary of processed scans organized by study, institution, scan type and ROI type.
- Parameters:
None
- Returns:
None
- update_from_csv(path_csv: Path | str | None = None) None [source]
Updates the class from a given CSV and summarizes the processed scans again according to it.
- Parameters:
path_csv (Union[str, Path], optional) – Path to a CSV file. If not given, will check for CSV info in the class attributes.
- Returns:
None
ProcessDICOM
- class MEDimage.wrangling.ProcessDICOM.ProcessDICOM(path_images: List[Path], path_rs: List[Path], path_save: str | Path, save: bool)[source]
Bases:
object
Class to process DICOM files and extract the imaging volume and 3D masks from them in order to organize the data in a MEDscan class object.
- __init__(path_images: List[Path], path_rs: List[Path], path_save: str | Path, save: bool) None [source]
- Parameters:
path_images (List[Path]) – List of paths to the dicom files of a single scan.
path_rs (List[Path]) – List of paths to the RT struct dicom files for the same scan.
path_save (Union[str, Path]) – Path to the folder where the MEDscan object will be saved.
save (bool) – Whether to save the MEDscan object or not.
- Returns:
None.
- combine_slices(slice_datasets: List[FileDataset]) List[ndarray] [source]
Given a list of pydicom datasets for an image series, stitch them together into a three-dimensional numpy array of imaging data. Also calculate a 4x4 affine transformation matrix that converts the ijk-pixel-indices into the xyz-coordinates in the DICOM patient’s coordinate system, and a 4x4 rotation and scaling matrix. If any of the DICOM images contain either the Rescale Slope or the Rescale Intercept attributes, they will be applied to each slice individually. The datasets must meet the following requirements:
The datasets must be in the same series (have the same Series Instance UID, Modality, and SOP Class UID).
The binary storage of each slice must be the same (have the same Bits Allocated, Bits Stored, High Bit, and Pixel Representation).
The image slices must approximately form a grid. This means there cannot be any missing internal slices (missing slices on the ends of the dataset are not detected). It also means that each slice must have the same Rows, Columns, Pixel Spacing, and Image Orientation (Patient) attribute values.
The direction cosines derived from the Image Orientation (Patient) attribute must, within 1e-4, have a magnitude of 1. The cosines must also be approximately perpendicular (their dot-product must be within 1e-4 of 0). Warnings are displayed if any of these approximations are below 1e-8. However, since we have seen real datasets with values up to 1e-4, we let them pass.
The Image Position (Patient) values must approximately form a line.
If any of these conditions are not met, a dicom_numpy.DicomImportException is raised.
- Parameters:
slice_datasets (List[pydicom.dataset.FileDataset]) – List of dicom headers.
- Returns:
List of numpy arrays containing the data extracted from the DICOM files (voxels, translation, rotation and scaling matrix).
- Return type:
List[numpy.ndarray]
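The direction-cosine checks and the ijk-to-xyz affine described above can be sketched in pure Python. This is not the code used by combine_slices: the function names are hypothetical, a plain ValueError stands in for dicom_numpy.DicomImportException, and the axis convention (first index stepping along the row cosine with the column spacing) is an assumption that may differ from the library's.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def check_cosines(image_orientation, tol=1e-4):
    """Validate Image Orientation (Patient) and derive the slice cosine."""
    row_cosine = list(image_orientation[:3])
    column_cosine = list(image_orientation[3:])
    for cosine in (row_cosine, column_cosine):
        # Magnitude must be within tol of 1.
        if abs(math.sqrt(dot(cosine, cosine)) - 1.0) > tol:
            raise ValueError("direction cosine is not unit length")
    # Dot product must be within tol of 0.
    if abs(dot(row_cosine, column_cosine)) > tol:
        raise ValueError("direction cosines are not perpendicular")
    return row_cosine, column_cosine, cross(row_cosine, column_cosine)


def ijk_to_xyz_affine(image_orientation, pixel_spacing, slice_spacing, image_position):
    """Build a 4x4 affine mapping voxel indices to patient coordinates."""
    row_cosine, column_cosine, slice_cosine = check_cosines(image_orientation)
    row_spacing, column_spacing = pixel_spacing  # DICOM order: rows, then columns
    affine = [[0.0] * 4 for _ in range(4)]
    for r in range(3):
        affine[r][0] = row_cosine[r] * column_spacing
        affine[r][1] = column_cosine[r] * row_spacing
        affine[r][2] = slice_cosine[r] * slice_spacing
        affine[r][3] = image_position[r]  # origin of the first slice
    affine[3][3] = 1.0
    return affine


# Hypothetical axial acquisition: 0.5 mm row spacing, 0.8 mm column spacing, 2 mm slices.
affine = ijk_to_xyz_affine([1, 0, 0, 0, 1, 0], [0.5, 0.8], 2.0, [-100.0, -120.0, 30.0])
for row in affine:
    print(row)
```

For an axis-aligned acquisition like this one, the affine reduces to a diagonal of spacings plus the position of the first voxel in the last column.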