dicomsort's Introduction

dicomsort

A project to provide custom sorting and renaming of dicom files

Description

Given DICOM files in an arbitrary folder structure, this program copies them all into a user-defined folder hierarchy, creating folders as necessary and giving the DICOM files more meaningful names.

The user can define the target folder structure and file naming with a string of concatenated tag names (like 'PatientName'), underscores and slashes. The last slash-separated part of the string defines the naming convention for the files; the parts before it define the folders.

An example: a target string of 'Modality/PatientName_PatientID' means that all DICOM images are arranged in one folder per Modality and named PatientName_PatientID, followed by an underscore and a unique number for every file that falls into the same category.

dicomsort returns a count of both the DICOM files organized and the non-DICOM (or invalid DICOM) files skipped. It aborts with an error if it would overwrite an existing file.

Installation

pip install thedicomsort
% dicomsort --help
dicomsort [options...] sourceDir targetDir/<patterns>

 where [options...] can be:
    [-z,--compressTargets] - create a .zip file in the target directory
    [-d,--deleteSource] - remove source files/directories after sorting
    [-f,--forceDelete] - remove source without confirmation
    [-k,--keepGoing] - report but ignore duplicate target files
    [-v,--verbose] - print diagnostics while processing
    [-s,--symlink] - create a symlink to dicom files in sourceDir instead of copying them
    [-t,--test] - run the built in self test (requires internet)
    [-u,--unsafe] - do not replace unsafe characters with '_' in the path
    [--help] - print this message

 where sourceDir is directory to be scanned or "" (null string) to read file list from stdin

 where targetDir/<patterns...> is a string defining the output file and directory
 names based on the dicom tags in the file.

If patterns are not specified, the following default is used:

  %PatientName-%Modality%StudyID-%StudyDescription-%StudyDate/%SeriesNumber_%SeriesDescription-%InstanceNumber.dcm

Example 1:

  dicomsort data sorted/%PatientName/%StudyDate/%SeriesDescription-%InstanceNumber.dcm

  could create a folder structure like:

  sorted/JohnDoe/2013-04-18/FLAIR-2.dcm

Example 2:

  find DicomSourceDir/ | grep "IMA$" | dicomsort -s "" DicomTargetDir

  would scan DicomSourceDir for file path names ending in IMA and create an
  output directory DicomTargetDir. The folder structure will be created using
  the default pattern with symbolic links to the source dicom data files.
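
To make the pattern syntax concrete, here is a minimal Python sketch of how a %Tag pattern could be expanded from a dictionary of tag values. It is not dicomsort's actual implementation: the function name expand_pattern and the set of characters treated as unsafe are assumptions made for illustration, and the Unknown<Tag> fallback only mirrors the folder names reported in the issues below.

```python
import re

# Assumption: which characters count as "unsafe" is chosen here for illustration.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")

def expand_pattern(pattern, tags, safe=True):
    """Replace each %TagName in pattern with its value from tags."""
    def substitute(match):
        name = match.group(1)
        value = str(tags.get(name, "Unknown" + name))
        # Replace unsafe characters with '_' unless disabled (compare the -u option).
        return UNSAFE.sub("_", value) if safe else value
    return re.sub(r"%([A-Za-z0-9]+)", substitute, pattern)

tags = {"PatientName": "Doe^John", "SeriesDescription": "FLAIR", "InstanceNumber": 2}
print(expand_pattern("%PatientName/%SeriesDescription-%InstanceNumber.dcm", tags))
# Doe_John/FLAIR-2.dcm
```

Substituting per tag, rather than sanitizing the whole expanded string, keeps the slashes of the pattern itself intact as folder separators.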

dicomsort's People

Contributors

bebbi, dependabot[bot], fedorov, michaelonken, pieper, piiq

dicomsort's Issues

Zero pad numerical values

Hello
The script works
python dicomsort.py /Users/me/unsorted /Users/me/sorted/%SeriesNumber/%InstanceNumber.dcm
but instance numbers are saved as 1.dcm, 2.dcm, ... How does one zero-pad the output to get 0001.dcm, 0002.dcm, etc.?
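
One way to get zero-padded names, sketched in Python, is to convert the tag value to an integer and format it with a fixed width. This illustrates the idea only and is not an existing dicomsort option; the helper name zero_pad and the default width of 4 are assumptions.

```python
def zero_pad(value, width=4):
    """Return value as a zero-padded decimal string; leave non-numeric values unchanged."""
    try:
        return "{:0{}d}".format(int(value), width)
    except (TypeError, ValueError):
        return str(value)

print(zero_pad(1))        # 0001
print(zero_pad("12"))     # 0012
print(zero_pad("FLAIR"))  # FLAIR
```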

pydicom pip install workaround?

Hi there, I'm not very familiar with Python. There were some issues installing pydicom with conda on our system, so I installed pydicom with pip inside a venv. After activating the environment containing pydicom, I tried to run dicomsort.py, but it is unable to locate the pydicom module.

How can I set up the path, or whatever else is necessary, so that dicomsort.py knows where pydicom can be imported from? Thank you
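
A quick way to check which interpreter is running, and whether it can see pydicom, is sketched below in Python using only the standard library. Running it with the same python command used to launch dicomsort.py shows whether the activated environment is actually the one in use. The helper name module_available is an assumption.

```python
import importlib.util
import sys

def module_available(name):
    """Return True if the current interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None

print("Interpreter:", sys.executable)
print("pydicom importable:", module_available("pydicom"))
```

If the interpreter path printed is outside the venv, the script is probably being started by a different Python than the one pip installed into.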

What DICOM tags are available for use?

Hi there - I was trying to use the DICOM tag %ProtocolName (0018,1030) but it wasn't working to sort the files...

What DICOM tags are available for dicomsort to use? Is there a way to use the actual tags rather than the name of the tag?

confirm source deletion?

Can the tool ask the user whether to delete the source files after a successful copy operation?

The (process) idea is:

  • user sends data from scanner to tmp location
  • user uses dicomsort to cp dicoms to workspace
  • user does quick manual QC check on images in workspace (while dicomsort waits for input)
  • user confirms (or rejects) removal of source_folder (with a [y/N] )
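
The confirmation step in this workflow could look roughly like the Python sketch below, where anything other than an explicit yes keeps the source. The function name confirm_delete and the injectable ask parameter are assumptions for illustration, not dicomsort's code.

```python
def confirm_delete(source_dir, ask=input):
    """Ask before deleting; return True only on an explicit 'y' or 'yes'."""
    answer = ask("Delete source directory {}? [y/N] ".format(source_dir))
    return answer.strip().lower() in ("y", "yes")

# Example with canned answers instead of interactive input:
print(confirm_delete("/tmp/incoming", ask=lambda prompt: "y"))  # True
print(confirm_delete("/tmp/incoming", ask=lambda prompt: ""))   # False
```

Defaulting to "no" on an empty answer matches the [y/N] convention, so pressing Enter by accident never removes data.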

Assigning unique file names within series

It would be helpful if it was possible to automatically assign unique names to the sorted files in the bottom-level directory. InstanceNumber unfortunately is not always present. SOPInstanceUID is too long (which is related to a separate concern, since Windows users will quickly hit maximum path limit if using UIDs as directory components).

Would it make sense to have a mode where consecutive numbers are assigned to the files within the bottom level sorted directory?
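
Such a mode could be sketched in Python as a per-directory counter, as below. This only illustrates the proposal: the function name number_within_directories, the four-digit width, and the .dcm suffix are assumptions.

```python
from collections import defaultdict
import os

def number_within_directories(target_dirs, width=4, suffix=".dcm"):
    """Give each file a consecutive number within its target directory.

    target_dirs lists, in input order, the directory each file sorts into.
    """
    counters = defaultdict(int)
    names = []
    for directory in target_dirs:
        counters[directory] += 1
        filename = str(counters[directory]).zfill(width) + suffix
        names.append(os.path.join(directory, filename))
    return names

print(number_within_directories(["p1/s1", "p1/s1", "p1/s2"]))
```

Short numeric names also keep the final path component small, which helps with the Windows path length concern mentioned above.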

command line argument option

How about the following syntax:

dicomsort src_folderroot tgt_folderroot/%tag1_%tag2/%tag3/%tag4_%tag5

So, for example:

dicomsort data sorted/%PatientName/%StudyDate/%SeriesDescription-%InstanceUID.dcm

would create folders like:

sorted/JohnDoe/2013-40-18/FLAIR-2.dcm

Skip patient/Series processing if corrupted DICOM cannot be read by pydicom's dcmread()

Steve,

As we discussed, I think it could be beneficial to have the possibility to skip the processing of a patient/specific Series if one of the DICOM files in the Series/folder is corrupted or, in general, if it cannot be read by pydicom's dcmread()†.

† As I reported at the end of the issue, this is already done in some cases - so I'm sorry if the title is a bit misleading but I didn't know how to name this properly!

For the sake of completeness, I'm pasting here the example I already shared privately:

Considering file XXXXXXXXXXXXXXXXXXXXXXXXX.dcm
ERROR, UNEXPECTED EXCEPTION
'bytes' object has no attribute 'name'
Traceback (most recent call last):
  File "dicomsort.py", line 493, in <module>
    if not sorter.renameFiles():
  File "dicomsort.py", line 214, in renameFiles
    if self.renameFile(file):
  File "dicomsort.py", line 232, in renameFile
    ds = dicom.read_file(file,stop_before_pixels=True)
  File "/home/dennis/.local/lib/python3.6/site-packages/pydicom/filereader.py", line 1034, in dcmread
    specific_tags=specific_tags,
  File "/home/dennis/.local/lib/python3.6/site-packages/pydicom/filereader.py", line 874, in read_partial
    if class_uid and class_uid.name == "Media Storage Directory Storage":
AttributeError: 'bytes' object has no attribute 'name'

If it can help in any way, apparently another .dcm file in the dataset was not readable - and was put under the folder UnknownPatientID/UnknownStudyInstanceUID/UnknownSeriesInstanceUID.

When I tried to open that single .dcm file, pydicom wasn't happy again and raised an exception (unfortunately, I removed the file without thinking this through, but it was something almost identical to this problem here - except the tags were different).

That exception was somehow handled correctly by DICOMSort (even if the file was not readable - I guess the exception raised was probably different and correctly caught).
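
The requested behaviour could be sketched in Python as follows: group files by folder, try to read each one, and skip the whole group if any read fails. The read callable stands in for a reader such as pydicom's dcmread() and is passed in so the sketch stays library-independent. The name partition_series and the grouping by parent folder are assumptions, not dicomsort's logic.

```python
import os
from collections import defaultdict

def partition_series(paths, read):
    """Split folders into readable and skipped, based on whether every file reads."""
    by_folder = defaultdict(list)
    for path in paths:
        by_folder[os.path.dirname(path)].append(path)

    readable, skipped = {}, {}
    for folder, files in by_folder.items():
        try:
            for path in files:
                read(path)
        except Exception as error:  # broad on purpose: the reported failure was an AttributeError
            skipped[folder] = str(error)
        else:
            readable[folder] = files
    return readable, skipped

def fake_read(path):
    # Hypothetical reader that fails on one file, for demonstration only.
    if path.endswith("bad.dcm"):
        raise AttributeError("'bytes' object has no attribute 'name'")

ok, bad = partition_series(["s1/a.dcm", "s1/bad.dcm", "s2/b.dcm"], fake_read)
print(sorted(ok), sorted(bad))  # ['s2'] ['s1']
```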

python3 compatibility

Hey Steve,

Thanks again for the work you put in dicomsort.

While running the script in python3, one python2 to 3 compatibility error hit me:

python3 dicomsort.py -u $SRC_DIR $DST_DIR/%PatientID/%StudyInstanceUID/%SeriesInstanceUID/%SOPInstanceUID.dcm --deleteSource --keepGoing --verbose

ERROR, UNEXPECTED EXCEPTION
name 'raw_input' is not defined
Traceback (most recent call last):
  File "dicomsort.py", line 500, in <module>
    if confirmDelete(sorter):
  File "dicomsort.py", line 480, in confirmDelete
    response = raw_input ('Delete source directory? [y/N] ')
NameError: name 'raw_input' is not defined

In this case, the quick fix would be substituting python2's raw_input() with python3's input().
Supposing many users will still be using dicomsort with python2 (and this would break their script), I guess the simplest thing to do would be some binding like:

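try:
    raw_input  # Python 2: the name already exists
except NameError:
    raw_input = input  # Python 3: input() returns a string, like Python 2's raw_input()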

or the other way around. If the user has future installed, then the following should also work:

# import from future
from builtins import input

Pylint3

Wondering if there were other potential compatibility issues - and being as lazy as a toad - I ran pylint3. Here's the output of pylint3 dicomsort.py | grep "W:\|E:" (I separated the errors from the warnings for the sake of clarity):

W:138, 0: Bad indentation. Found 14 spaces, expected 16 (bad-indentation)
W:139, 0: Bad indentation. Found 16 spaces, expected 20 (bad-indentation)
W:141, 0: Bad indentation. Found 14 spaces, expected 16 (bad-indentation)
W:142, 0: Bad indentation. Found 16 spaces, expected 20 (bad-indentation)
W:143, 0: Bad indentation. Found 14 spaces, expected 16 (bad-indentation)
W:144, 0: Bad indentation. Found 16 spaces, expected 20 (bad-indentation)
W:145, 0: Bad indentation. Found 16 spaces, expected 20 (bad-indentation)
W:146, 0: Bad indentation. Found 16 spaces, expected 20 (bad-indentation)
W:148, 0: Bad indentation. Found 14 spaces, expected 16 (bad-indentation)
W:198,22: Unused variable 'subFolders' (unused-variable)
W:111, 8: Attribute 'options' defined outside __init__ (attribute-defined-outside-init)
W:185, 8: Attribute 'filesRenamed' defined outside __init__ (attribute-defined-outside-init)
W:186, 8: Attribute 'filesSkipped' defined outside __init__ (attribute-defined-outside-init)
W:398,13: Redefining name 'sorter' from outer scope (line 487) (redefined-outer-name)
W:420,12: Unused variable 'root' (unused-variable)
W:420,18: Unused variable 'subFolders' (unused-variable)
W:441,14: Redefining name 'sorter' from outer scope (line 487) (redefined-outer-name)
W:476,18: Redefining name 'sorter' from outer scope (line 487) (redefined-outer-name)
W:510,11: Catching too general exception Exception (broad-except)
W:514, 8: Access to a protected member _exit of a client class (protected-access)
W: 24, 0: Unused import time (unused-import)
E:344,12: Module 'urllib' has no 'urlretrieve' member (no-member)
E:480,15: Undefined variable 'raw_input' (undefined-variable)

Hoping this helps,
Dennis.
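
For the other error in the list (E:344, urllib has no urlretrieve member), a common way to support both Python versions is a guarded import, sketched below. This is a suggestion in the same spirit as the raw_input binding above, not a description of what dicomsort.py currently does.

```python
try:
    from urllib.request import urlretrieve  # Python 3 location
except ImportError:
    from urllib import urlretrieve  # Python 2 location

print(callable(urlretrieve))  # True
```

Calls written as urllib.urlretrieve(...) would then become plain urlretrieve(...).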

meaning of the -z flag

Is the goal of -z to run gzip compression on the files when putting them in the target directory?

Create symbolic link instead of copying file

This is a feature request.

Could there be an option to create a symbolic link instead of a copy of the file?

The newly created folder would provide an organized view of the data, which would remain in place in the old file structure. Multiple views could be created without duplicating data.
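
The core of such an option could be sketched in Python as below: the same placement routine either copies the file or links to it. The function name place_file and its symlink parameter are assumptions for illustration. Creating symbolic links may require extra privileges on Windows.

```python
import os
import shutil

def place_file(source, target, symlink=False):
    """Copy source to target, or link to it; refuse to overwrite an existing target."""
    if os.path.lexists(target):
        raise FileExistsError(target)
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    if symlink:
        # Link to the absolute source path so the link resolves from any directory.
        os.symlink(os.path.abspath(source), target)
    else:
        shutil.copyfile(source, target)
```

Calling place_file(src, dst, symlink=True) with different target patterns would build several views over the same source files.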

need sample data and self-test

Since we expect the program to work on 'normal' dicom files (meaning single frame, without missing tags) we can use arbitrary dicom data for testing.

I'll set up a small set of test data on a web site that can be downloaded for use in a self-test mode for the program. We can associate this data with a desired directory hierarchy output to confirm things are working correctly.
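
The comparison against a desired hierarchy could be sketched in Python as below: walk the sorted output, collect relative file paths, and report what is missing or unexpected. The function names relative_files and check_hierarchy are assumptions, and downloading the sample data is left out.

```python
import os

def relative_files(root):
    """Return the set of file paths under root, relative to root, using '/' separators."""
    found = set()
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            relative = os.path.relpath(os.path.join(folder, name), root)
            found.add(relative.replace(os.sep, "/"))
    return found

def check_hierarchy(root, expected):
    """Return (missing, unexpected) file sets compared with the expected hierarchy."""
    actual = relative_files(root)
    expected = set(expected)
    return expected - actual, actual - expected
```

A self-test would pass when both returned sets are empty.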
