Description
I'm curious what would be considered an "idealized" reproducible processing stream versus a "good enough" reproducible processing stream, and I want to identify the tools/skills needed to complete a "good enough" reproducible analysis. I have listed some hypothesized steps and some tools to complete those steps.
Sparse Learner's Profile
Starting from the top: a PI (or someone) hands you a bunch of dicoms and asks you to get subcortical volumes from the structural scans (but there are other, currently irrelevant, dicoms as well). The PI also wants to be able to run your analysis and wants the data to be publicly available (assuming all IRB/data sharing agreements are satisfied).
An Idealized Processing Pipeline
I imagine we would be using datalad to record all our data/code/processing steps, and always be using/developing containers from the beginning. I'm not exactly sure where/how to place NIDM annotations of data/results or what tool I should use (PyNIDM?).
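As a rough sketch of what I mean by recording everything with datalad (the dataset name, paths, and the convert script are placeholders, and `datalad run` is just one way to capture provenance):

```bash
# create a dataset that will track data, code, and provenance
datalad create -c text2git my-study
cd my-study

# save the raw dicoms into the dataset (path is a placeholder)
datalad save -m "add raw dicoms" sourcedata/

# record a processing step together with the outputs it produced,
# so the exact command can be re-executed later with `datalad rerun`
datalad run -m "convert dicoms to BIDS" \
    "bash code/convert.sh sourcedata/ bids/"
```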
- search through and find the relevant dicoms
  - nibabel
  - afni
- version control the relevant dicoms
  - datalad
  - git-annex
- convert the dicoms to nifti file format named to the BIDS standard (see the heudiconv sketch after this list)
  - heudiconv (via docker/singularity)
  - datalad
- deface and rename the files (see the pydeface sketch after this list)
  - pydeface (via docker/singularity)
  - shell
  - datalad
- write a script that calculates subcortical volumes (see the FSL sketch after this list)
  - niflows (via pip/conda env)
  - fsl
  - datalad
- place the script in a container with all the requisite software installed (see the neurodocker sketch after this list)
  - neurodocker
- upload the container to a hub (docker and/or singularity)
  - docker
  - singularity
- run the script on the data and output data in a derivatives directory
  - docker
  - singularity
- upload the BIDS organized nifti files to some online database
  - openneuro
- upload the code/outputs to an online repository
  - git
  - github
- test your code against that uploaded data
  - testkraken
  - circleci
  - travisci
  - shell
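For the dicom-to-BIDS step, something like this heudiconv call through docker is what I'm picturing (the dicom layout, subject ID, heuristic file, and image tag are all placeholders):

```bash
# convert one subject's dicoms to BIDS-named nifti files with heudiconv
docker run --rm -v "$PWD":/base nipy/heudiconv:latest \
    -d "/base/sourcedata/{subject}/*/*.dcm" \
    -s 01 \
    -f /base/code/heuristic.py \
    -c dcm2niix -b \
    -o /base/bids
```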
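Defacing could then be one pydeface call per anatomical image (file names are placeholders; I believe pydeface writes a `*_defaced` copy by default, with `--outfile`/`--force` available to control where the output goes):

```bash
# deface the T1w image, overwriting the original in place
pydeface bids/sub-01/anat/sub-01_T1w.nii.gz \
    --outfile bids/sub-01/anat/sub-01_T1w.nii.gz \
    --force
```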
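The subcortical-volume script itself could be as small as FSL FIRST plus fslstats (this assumes FIRST's default FreeSurfer-style labels, e.g. 17 for the left hippocampus, and placeholder paths):

```bash
#!/bin/bash
# segment subcortical structures and report the volume of one of them
t1=bids/sub-01/anat/sub-01_T1w.nii.gz
out=bids/derivatives/first/sub-01
mkdir -p "$(dirname "$out")"

# run FSL FIRST on the T1w image
run_first_all -i "$t1" -o "$out"

# voxel count and volume in mm^3 for label 17 (left hippocampus)
fslstats "${out}_all_fast_firstseg" -l 16.5 -u 17.5 -V
```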
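And for the container, neurodocker can generate the Dockerfile (flags vary a bit between neurodocker versions, and the image name/tag is a placeholder):

```bash
# generate a Dockerfile with FSL and the analysis script baked in
neurodocker generate docker \
    --pkg-manager apt \
    --base-image debian:bullseye \
    --fsl version=6.0.5.1 \
    --copy code/subcortical_volumes.sh /opt/ \
    > Dockerfile

# build and push the image so others can pull the exact same environment
docker build -t myuser/subcortical-volumes:0.1.0 .
docker push myuser/subcortical-volumes:0.1.0
```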
Good Enough Processing Pipeline
Compared to the idealized version, this removes datalad from the processing stream, removes testing, and removes niflows, but I still want to run the desired software from within a container.
- search through and find the relevant dicoms
  - nibabel
  - afni
- convert the dicoms to nifti file format named to the BIDS standard
  - heudiconv (via docker/singularity)
- deface and rename the files
  - pydeface (via docker/singularity)
  - shell
- write a script that calculates subcortical volumes
  - shell
  - fsl
  - datalad
- place the script in a container with all the requisite software installed
  - neurodocker
- upload the container to a hub (docker and/or singularity)
  - docker
  - singularity
- run the script on the data and output data in a derivatives directory (see the docker run sketch after this list)
  - docker
  - singularity
- upload the BIDS organized nifti files to some online database
  - openneuro
- upload the code/outputs to an online repository and link to the containers you used
  - git
  - github
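For running the containerized script and linking the exact container in the repository, something like this is what I'm imagining (the image name, the script's command-line interface, and the mount points are placeholders):

```bash
# run the container on the BIDS dataset, writing into derivatives/
docker run --rm \
    -v "$PWD"/bids:/data:ro \
    -v "$PWD"/bids/derivatives:/out \
    myuser/subcortical-volumes:0.1.0 /data /out

# note the exact image digest in the README so others can pull the same image
docker inspect --format '{{index .RepoDigests 0}}' \
    myuser/subcortical-volumes:0.1.0 >> README.md
```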
I would like feedback on both the "Idealized" and "Good Enough" analyses, since I am not as knowledgeable as I would like to be about designing processing pipelines. I may not be up to date on which tools are the hot/new ones versus which will simply get the job done.
Once we pin down what we would like workshop attendees to be able to do (and hopefully this matches what they wish to do as well), I think we will have an easier time elucidating the necessary skills and modifying episodes to make sure they help build those skills.