Getting Started
Dataset setup
PD-DWI model training and inference expect a specific folder and file structure:
.
├── train # Training dataset
│ ├── clinical.csv # Clinical information of subjects
│ ├── subject X # Subject imaging modalities
│ │ ├── T0 # Subject modalities acquired at T0
│ │ │ ├── ADC 0100.dcm # ADC calculated from b-values 0-100
│ │ │ ├── F.dcm # Diffusion Fraction volume
│ │ │ ├── MASK.dcm # Diffusion Weighted Imaging mask
│ │ │ └── ...
│ │ └── ...
│ └── ...
├── test # Testing dataset
│ ├── clinical.csv
│ ├── subject Y
└── ...
Imaging data
As shown in the structure above, imaging data is stored by subject ID and acquisition time.
The PD-DWI framework requires at least one DWI-based map, accompanied by a MASK that represents the tumor ROI. The tumor MASK is assumed to correspond to the DWI-based map and to have the same spacing. The DWI-based map can be ADC, F, or both.
To calculate the ADC and F maps from your DWI data, please use our pre-processing utility.
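The layout and imaging requirements above can be checked before training with a short script. The following is a minimal sketch, not part of PD-DWI: the function name `check_dataset` is hypothetical, and the file names are assumptions copied from the example tree, so adjust them to the maps you actually have. It only checks that files exist; it does not open the DICOM files or compare spacing.

```python
from pathlib import Path

# Assumed file names, copied from the example tree; adjust to your data.
MAP_FILES = ("ADC 0100.dcm", "F.dcm")
MASK_FILE = "MASK.dcm"


def check_dataset(root):
    """Return a list of human-readable problems found under `root`."""
    root = Path(root)
    problems = []

    if not (root / "clinical.csv").is_file():
        problems.append(f"{root.name}: missing clinical.csv")

    # Every sub-folder of the dataset folder is treated as a subject.
    for subject in sorted(p for p in root.iterdir() if p.is_dir()):
        timepoints = sorted(p for p in subject.iterdir() if p.is_dir())
        if not timepoints:
            problems.append(f"{subject.name}: no acquisition-time folders")
        for tp in timepoints:
            label = f"{subject.name}/{tp.name}"
            if not (tp / MASK_FILE).is_file():
                problems.append(f"{label}: missing {MASK_FILE}")
            # At least one DWI-based map (ADC or F) must be present.
            if not any((tp / name).is_file() for name in MAP_FILES):
                problems.append(f"{label}: no DWI-based map found")

    return problems
```

Calling `check_dataset("train")` returns an empty list when every acquisition-time folder holds a mask and at least one map; otherwise each entry names the folder and what is missing.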
Clinical data
All clinical data is stored in a file named clinical.csv. Each line contains the following values, in order of appearance:
- Patient ID DICOM - subject identifier, must be identical to the subject's folder name
- hrher4g - 4-level hormone receptor status
- SBRgrade - 3-level tumor grade
- race - subject's race
- Ltype - lesion type
- pcr - pCR label of the subject; if not available, leave it as an empty string
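The column order above can be verified when loading the file. The following is a minimal sketch using only the Python standard library, not the loader PD-DWI itself uses. It assumes the file starts with a header row carrying exactly the column names listed above; the function name `load_clinical` is hypothetical.

```python
import csv
import io

# Column names follow the list above; the header row is an assumption.
COLUMNS = ["Patient ID DICOM", "hrher4g", "SBRgrade", "race", "Ltype", "pcr"]


def load_clinical(text):
    """Parse clinical.csv content and return one dict per subject."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")

    rows = []
    for row in reader:
        # An empty pcr field means the label is not available.
        row["pcr"] = row["pcr"] or None
        rows.append(row)
    return rows
```

For example, `load_clinical(open("train/clinical.csv", newline="").read())` yields rows whose `pcr` entry is `None` for subjects without a label. Each `Patient ID DICOM` value can then be compared against the subject folder names.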