Sorter base
sorter_base
Base module for sorting files based on customizable patterns.
This module provides a foundation for implementing file sorting logic, particularly for handling DICOM files or other structured data.
The SorterBase class serves as an abstract base class for:
- Parsing and validating patterns used for organizing files.
- Visualizing the target directory structure through a tree representation.
- Allowing subclasses to implement specific validation and resolution logic.
Important:
While this module helps define the target directory structure for
files based on customizable metadata-driven patterns, it does not
alter the filename (basename) of the source files. The original
filename is preserved during the sorting process. This ensures that
files with the same metadata fields but different filenames are not
overwritten, which is critical when dealing with fields like
InstanceNumber that may have common values across different files.
Examples:
Given a source file:
/source_dir/HN-CHUS-082/1-1.dcm
And a target pattern:
./data/dicoms/%PatientID/Study-%StudyInstanceUID/Series-%SeriesInstanceUID/%Modality/
The resolved path will be:
./data/dicoms/HN-CHUS-082/Study-06980/Series-67882/RTSTRUCT/1-1.dcm
The SorterBase class ensures that only the directory structure is
adjusted based on metadata, leaving the original filename intact.
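The filename-preservation rule can be illustrated with a small standalone sketch. It does not use imgtools: the `resolve` helper, the `%Key` regex, and the metadata dictionary are assumptions made for illustration only, with values taken from the example above.

```python
import re
from pathlib import PurePosixPath

# Hypothetical placeholder regex: matches %Key tokens such as %PatientID.
PLACEHOLDER = re.compile(r"%([A-Za-z]+)")


def resolve(source: PurePosixPath, pattern: str, metadata: dict) -> PurePosixPath:
    """Fill %Key placeholders from metadata, then append the original basename."""
    directory = PLACEHOLDER.sub(lambda m: metadata[m.group(1)], pattern)
    return PurePosixPath(directory) / source.name


pattern = (
    "./data/dicoms/%PatientID/Study-%StudyInstanceUID/"
    "Series-%SeriesInstanceUID/%Modality/"
)
metadata = {
    "PatientID": "HN-CHUS-082",
    "StudyInstanceUID": "06980",
    "SeriesInstanceUID": "67882",
    "Modality": "RTSTRUCT",
}

# Two source files that share identical metadata but differ in filename.
first = resolve(PurePosixPath("/source_dir/HN-CHUS-082/1-1.dcm"), pattern, metadata)
second = resolve(PurePosixPath("/source_dir/HN-CHUS-082/1-2.dcm"), pattern, metadata)

print(first)
print(second)
```

Both files land in the same target directory, but because each keeps its own basename they do not overwrite each other.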
Functions:
Name | Description |
---|---|
resolve_path | Worker function to resolve a single path. |
SorterBase
SorterBase(
source_directory: pathlib.Path,
target_pattern: str,
pattern_parser: typing.Pattern = imgtools.dicom.sort.sorter_base.DEFAULT_PATTERN_PARSER,
)
Bases: abc.ABC
Abstract base class for sorting files based on customizable patterns.
This class provides functionality for:
- Pattern parsing and validation
- Tree visualization of file structures
- Extensibility for subclass-specific implementations
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_directory | pathlib.Path | The directory containing the files to be sorted. | required |
target_pattern | str | The pattern string for sorting files. | required |
pattern_parser | typing.Pattern | Custom regex for parsing placeholders in the target pattern. If not given, the default parser is used. | imgtools.dicom.sort.sorter_base.DEFAULT_PATTERN_PARSER |
Attributes:
Name | Type | Description |
---|---|---|
source_directory | pathlib.Path | The directory containing the files to be sorted. |
format | str | The parsed format string with placeholders for keys. |
dicom_files | list of Path | The list of DICOM files to be sorted. |
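How a target pattern might be turned into a format string plus a list of keys can be sketched as follows. The regex and the `parse_pattern` helper are hypothetical stand-ins, not the library's `DEFAULT_PATTERN_PARSER` or its parsing code, and the `%(Key)s` output style is likewise an assumption.

```python
import re
from typing import List, Pattern, Tuple

# Hypothetical parser: matches %Key placeholders such as %PatientID.
HYPOTHETICAL_PARSER = re.compile(r"%([A-Za-z]+)")


def parse_pattern(
    target_pattern: str, parser: Pattern = HYPOTHETICAL_PARSER
) -> Tuple[str, List[str]]:
    """Return a printf-style format string and the keys found, in order."""
    keys: List[str] = []

    def _replace(match: "re.Match") -> str:
        keys.append(match.group(1))
        return f"%({match.group(1)})s"

    fmt = parser.sub(_replace, target_pattern)
    return fmt, keys


fmt, keys = parse_pattern("%PatientID/Study-%StudyInstanceUID/%Modality/")
print(fmt)
print(keys)

# The format string can then be filled from a metadata mapping.
filled = fmt % {
    "PatientID": "HN-CHUS-082",
    "StudyInstanceUID": "06980",
    "Modality": "RTSTRUCT",
}
print(filled)
```

Separating parsing from filling means the keys can be validated once, before any file is read.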
Methods:
Name | Description |
---|---|
print_tree | Display the pattern structure as a tree visualization. |
validate_keys | Validate extracted keys. Subclasses should implement this method to perform specific validations based on their context. |
Source code in src/imgtools/dicom/sort/sorter_base.py
pattern_preview
property
print_tree
Display the pattern structure as a tree visualization.
Notes
This only prints the target pattern, parsed and formatted. Performing a dry-run execute will display more information.
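A plain-text approximation of such a tree view is sketched below. `pattern_tree_lines` is a hypothetical helper written for illustration; it is not how `print_tree` is implemented and it only nests the path segments of the pattern one level per segment.

```python
from typing import List


def pattern_tree_lines(target_pattern: str) -> List[str]:
    """Render each path segment of the pattern one level deeper than the last."""
    # Drop empty segments (trailing slash) and the leading "." segment.
    parts = [p for p in target_pattern.split("/") if p and p != "."]
    lines = []
    for depth, part in enumerate(parts):
        prefix = "" if depth == 0 else "    " * (depth - 1) + "└── "
        lines.append(prefix + part)
    return lines


lines = pattern_tree_lines(
    "./data/dicoms/%PatientID/Study-%StudyInstanceUID/"
    "Series-%SeriesInstanceUID/%Modality/"
)
print("\n".join(lines))
```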
validate_keys
abstractmethod
Validate extracted keys. Subclasses should implement this method to perform specific validations based on their context.
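The abstract-method contract can be sketched with a self-contained stand-in. `PatternSorter`, `AllowListSorter`, and the allow-list of keys are hypothetical names chosen for illustration; they are not part of imgtools, and calling `validate_keys` from the constructor is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class PatternSorter(ABC):
    """Hypothetical stand-in for an abstract sorter; not imgtools' SorterBase."""

    def __init__(self, keys: Iterable[str]) -> None:
        self.keys = set(keys)
        # Assumption: validation runs as soon as the keys are known.
        self.validate_keys()

    @abstractmethod
    def validate_keys(self) -> None:
        """Raise if any extracted key is not acceptable in this context."""


class AllowListSorter(PatternSorter):
    """Example subclass that accepts only a fixed set of keys."""

    ALLOWED = {"PatientID", "StudyInstanceUID", "SeriesInstanceUID", "Modality"}

    def validate_keys(self) -> None:
        unknown = self.keys - self.ALLOWED
        if unknown:
            raise ValueError(f"Unsupported keys: {sorted(unknown)}")


sorter = AllowListSorter({"PatientID", "Modality"})
print(sorted(sorter.keys))
```

Because `validate_keys` is abstract, the base class cannot be instantiated directly; each subclass decides which keys are valid for its own file type.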
resolve_path
resolve_path(
path: pathlib.Path,
keys: typing.Set[str],
format_str: str,
truncate: int = 5,
check_existing: bool = True,
force: bool = True,
) -> typing.Tuple[pathlib.Path, pathlib.Path]
Worker function to resolve a single path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | pathlib.Path | The source file path. | required |
keys | typing.Set[str] | The DICOM keys required for resolving the path. | required |
format_str | str | The format string for the resolved path. | required |
check_existing | bool | If True, check if the resolved path already exists (default is True). | True |
truncate | int | The number of characters to keep when truncating UID values (default is 5). | 5 |
force | bool | Passed to pydicom.dcmread() to force reading the file (default is True). | True |
Returns:
Type | Description |
---|---|
typing.Tuple[pathlib.Path, pathlib.Path] | The source path and resolved path. |
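The overall shape of such a worker can be sketched without reading any DICOM file. The function below takes an already-loaded metadata mapping instead of calling pydicom, so it is not the library's `resolve_path`. Several details are assumptions: that truncation keeps the trailing characters of keys ending in `UID`, that the format string uses `%(Key)s` placeholders, and that an existing target raises `FileExistsError`. The UID value is made up.

```python
from pathlib import Path
from typing import Dict, Set, Tuple


def resolve_from_metadata(
    path: Path,
    metadata: Dict[str, str],
    keys: Set[str],
    format_str: str,
    truncate: int = 5,
    check_existing: bool = True,
) -> Tuple[Path, Path]:
    """Return (source path, resolved path), keeping the source basename."""
    values = {}
    for key in keys:
        value = str(metadata[key])
        # Assumption: only UID fields are shortened, keeping the last characters.
        if key.endswith("UID") and truncate > 0:
            value = value[-truncate:]
        values[key] = value

    resolved = Path(format_str % values) / path.name

    # Assumption: a pre-existing target is treated as an error.
    if check_existing and resolved.exists():
        raise FileExistsError(f"{resolved} already exists")
    return path, resolved


source, resolved = resolve_from_metadata(
    Path("/source_dir/HN-CHUS-082/1-1.dcm"),
    metadata={"PatientID": "HN-CHUS-082", "StudyInstanceUID": "1.2.3.4.06980"},
    keys={"PatientID", "StudyInstanceUID"},
    format_str="sorted/%(PatientID)s/Study-%(StudyInstanceUID)s",
    check_existing=False,
)
print(source, "->", resolved)
```

Returning the source path alongside the resolved one lets a caller collect results from many workers and still know which file each target belongs to.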