The DSST Defacing Pipeline was developed to make defacing anatomical scans, visually quality-controlling (QC) the results, and fixing scans that fail QC more efficient and straightforward. The pipeline requires the input dataset to be in BIDS format. A conceptual description of the pipeline can be found below.
This pipeline is designed and tested to work on the NIH HPC systems. While it is possible to run the pipeline on other platforms, doing so can be error-prone and is not recommended.
git clone https://github.com/nimh-dsst/dsst-defacing-pipeline.git
Apart from the AFNI and FSL packages, which are available as HPC modules, users will need the following packages in their working environment:
- VisualQC
- FSLeyes
- Python 3.7+
There are many ways to create a virtual environment with the required packages; however, we currently only provide instructions for creating a conda environment. If you don't already have conda installed, please find Miniconda install instructions here.
Run the following command to create a conda environment called dsstdeface using the environment.yml file from this repo.
conda env create -f environment.yml
Once conda finishes creating the virtual environment, activate dsstdeface.
conda activate dsstdeface
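Before running the pipeline, it can help to confirm that the required command-line tools are visible from the active environment. The sketch below uses Python's standard `shutil.which`. The function name is hypothetical, and the executable names checked (`afni`, `fsl`, `fsleyes`, `vqcdeface`) are assumptions about how the packages listed above appear on `PATH`, so adjust them for your system.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust for your installation.
    missing = find_missing_tools(["afni", "fsl", "fsleyes", "vqcdeface"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

On the NIH HPC systems, AFNI and FSL only appear on `PATH` after their modules have been loaded, so a tool reported as missing may simply need its module loaded first.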
To deface anatomical scans in the dataset, run the src/run.py script. From within the cloned dsst-defacing-pipeline directory, run the following command to see the help message.
% python src/run.py -h
usage: run.py [-h] [-n N_CPUS] [-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
[-s SESSION_ID [SESSION_ID ...]] [--no-clean]
bids_dir output_dir
Deface anatomical scans for a given BIDS dataset or a subject directory in
BIDS format.
positional arguments:
bids_dir The directory with the input dataset formatted
according to the BIDS standard.
output_dir The directory where the output files should be stored.
optional arguments:
-h, --help show this help message and exit
-n N_CPUS, --n-cpus N_CPUS
Number of parallel processes to run when there is more
than one folder. Defaults to 1, meaning "serial
processing".
-p PARTICIPANT_LABEL [PARTICIPANT_LABEL ...], --participant-label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]
The label(s) of the participant(s) that should be
defaced. The label corresponds to
sub-<participant_label> from the BIDS spec (so it does
not include "sub-"). If this parameter is not provided
all subjects should be analyzed. Multiple participants
can be specified with a space separated list.
-s SESSION_ID [SESSION_ID ...], --session-id SESSION_ID [SESSION_ID ...]
The ID(s) of the session(s) that should be defaced.
The label corresponds to ses-<session_id> from the
BIDS spec (so it does not include "ses-"). If this
parameter is not provided all subjects should be
analyzed. Multiple sessions can be specified with a
space separated list.
--no-clean If this argument is provided, then AFNI intermediate
files are preserved.
The script can be run serially on a BIDS dataset or in parallel at the subject/session level. Both methods are described below with example commands.
If you have a small dataset with fewer than 10 subjects, it might be easiest to run the defacing algorithm serially.
# activate your conda environment
conda activate dsstdeface
# once your conda environment is active, execute the following
python src/run.py ${INPUT_DIR} ${OUTPUT_DIR}
If you have a dataset with more than 10 subjects, it might be more practical to run the pipeline in parallel for every subject/session in the dataset using the -n/--n-cpus option, since each defacing job is independent. The following example command will run the pipeline using 10 processors at a time.
# activate your conda environment
conda activate dsstdeface
# once your conda environment is active, execute the following
python src/run.py ${INPUT_DIR} ${OUTPUT_DIR} -n 10
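When the pipeline runs inside a Slurm allocation, the value passed to -n/--n-cpus can be derived from the allocation instead of being hard-coded. The following is a minimal sketch under stated assumptions: the helper name `pick_n_cpus` is hypothetical, and it assumes the job was submitted with `--cpus-per-task`, which is what makes Slurm set `SLURM_CPUS_PER_TASK`. Outside such a job it falls back to the CPU count of the machine.

```python
import os


def pick_n_cpus(n_jobs, environ=os.environ):
    """Suggest a value for -n/--n-cpus (hypothetical helper).

    n_jobs is the number of independent subject/session folders to deface.
    """
    allocated = environ.get("SLURM_CPUS_PER_TASK")
    # Fall back to the local CPU count when not running under Slurm.
    available = int(allocated) if allocated else (os.cpu_count() or 1)
    # Never request more processes than there are jobs or CPUs.
    return max(1, min(n_jobs, available))


if __name__ == "__main__":
    print(pick_n_cpus(n_jobs=25))
```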
Additionally, the pipeline can be run on a single subject or session using the -p/--participant-label and -s/--session-id options, respectively.
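The options above can be combined when the pipeline is driven from another Python script. The sketch below only assembles the argument list from the flags shown in the help message. The function name `build_deface_command`, the example paths, and the labels `01` and `02` are hypothetical, and it assumes the command is launched from the cloned repository root so that `src/run.py` resolves.

```python
import sys


def build_deface_command(bids_dir, output_dir, participants=None,
                         sessions=None, n_cpus=1, no_clean=False):
    """Assemble the argument list for src/run.py (hypothetical helper)."""
    # Positional arguments go first so the multi-value -p/-s options
    # cannot swallow them.
    cmd = [sys.executable, "src/run.py", str(bids_dir), str(output_dir)]
    if n_cpus > 1:
        cmd += ["-n", str(n_cpus)]
    if participants:
        # Labels are given without the "sub-" prefix.
        cmd += ["-p", *participants]
    if sessions:
        # Session IDs are given without the "ses-" prefix.
        cmd += ["-s", *sessions]
    if no_clean:
        # Keep AFNI intermediate files.
        cmd.append("--no-clean")
    return cmd


if __name__ == "__main__":
    cmd = build_deface_command("/path/to/bids", "/path/to/output",
                               participants=["01", "02"], sessions=["01"])
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run.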
To visually inspect the quality of defacing with VisualQC, we'll need to:
- Open TurboVNC through an spersist session. More info on the NIH HPC docs.
- Run the vqcdeface command from a command-line terminal within a TurboVNC instance:
sh ${OUTPUT_DIR}/QC_prep/defacing_qc_cmd
While describing this process, we frequently use the following terms:
- Primary Scan: The best quality T1w scan within a session. For programmatic selection, we assume that the most recently acquired T1w scan is of the best quality.
- Other/Secondary Scans: All scans except the primary scan are grouped together and referred to as "other" or "secondary" scans for a given session.
- Mapping File: A JSON file that assigns/maps a primary scan (or primary_t1) to all other scans within a session. Please find an example file here.
- VisualQC: A suite of QC tools developed by Pradeep Raamana, PhD (Assistant Professor at University of Pittsburgh).
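The programmatic primary-scan rule described in the terms above can be illustrated as follows. This is a minimal sketch, not the pipeline's own code. It assumes acquisition times are already available as ISO 8601 strings (for example, from a BIDS `*_scans.tsv` file). The function name, the example filenames, and the output layout (apart from the `primary_t1` key named above) are hypothetical.

```python
import json
from datetime import datetime


def build_session_mapping(scans):
    """Pick the most recent T1w scan as primary and map the rest to it.

    `scans` is a list of (filename, acq_time) pairs for one session, where
    acq_time is an ISO 8601 string. The returned layout is illustrative
    only; the pipeline's actual mapping file may differ.
    """
    t1w = [(name, datetime.fromisoformat(acq_time))
           for name, acq_time in scans if "_T1w" in name]
    if not t1w:
        raise ValueError("session has no T1w scan to use as primary")
    # Most recently acquired T1w scan is assumed to be the best quality.
    primary = max(t1w, key=lambda item: item[1])[0]
    others = sorted(name for name, _ in scans if name != primary)
    return {"primary_t1": primary, "others": others}


if __name__ == "__main__":
    # Hypothetical filenames and acquisition times for one session.
    session = [
        ("sub-01_ses-01_run-01_T1w.nii.gz", "2021-03-04T10:15:00"),
        ("sub-01_ses-01_run-02_T1w.nii.gz", "2021-03-04T10:42:00"),
        ("sub-01_ses-01_T2w.nii.gz", "2021-03-04T10:30:00"),
    ]
    print(json.dumps(build_session_mapping(session), indent=2))
```

In this example the second T1w run is selected as the primary scan because it was acquired last, and the earlier T1w run and the T2w scan are grouped as the other scans.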
We'd like to thank Pradeep Raamana, PhD, Assistant Professor at the Department of Radiology at University of Pittsburgh, and Paul Taylor, Acting Director of Scientific and Statistical Computing Core (SSCC) at NIMH, for their timely help in resolving issues and adapting VisualQC and AFNI Refacer, respectively, to the specific needs of this project.