Abstract base writer
abstract_base_writer
AbstractBaseWriter
dataclass
AbstractBaseWriter(
root_directory: pathlib.Path = dataclasses.field(),
filename_format: str = dataclasses.field(),
create_dirs: bool = True,
existing_file_mode: imgtools.io.writers.abstract_base_writer.ExistingFileMode = imgtools.io.writers.abstract_base_writer.ExistingFileMode.FAIL,
sanitize_filenames: bool = True,
context: typing.Dict[str, typing.Any] = dict(),
overwrite_index: bool = False,
absolute_paths_in_index: bool = False,
index_filename: typing.Optional[str] = None,
)
Bases: abc.ABC, typing.Generic[imgtools.io.writers.abstract_base_writer.ContentType]
Abstract base class for managing file writing with customizable paths and filenames.
This class provides a template for writing files with a flexible directory structure and consistent file naming patterns. It handles common operations such as directory creation, file path resolution, and maintaining an index of saved files.
The class supports various file existence handling modes, filename sanitization, and easy context management for generating dynamic paths with placeholder variables.
Attributes:
Name | Type | Description |
---|---|---|
root_directory | pathlib.Path | Root directory where files will be saved. This directory will be created if it doesn't exist and |
filename_format | str | Format string defining the directory and filename structure. Supports placeholders for context variables enclosed in curly braces. Example: '{subject_id}_{date}/{disease}.txt' |
create_dirs | bool, default=True | Creates necessary directories if they don't exist. |
existing_file_mode | ExistingFileMode, default=ExistingFileMode.FAIL | Behavior when a file already exists. Options: OVERWRITE, SKIP, FAIL |
sanitize_filenames | bool, default=True | Replaces illegal characters in filenames with underscores. |
context | Dict[str, Any], default={} | Internal context storage for pre-checking. |
index_filename | Optional[str], default=None | Name of the index file to track saved files. If an absolute path is provided, it will be used as is. If not provided, it will be saved in the root directory with the format of {root_directory.name}_index.csv. |
overwrite_index | bool, default=False | Overwrites the index file if it already exists. |
absolute_paths_in_index | bool, default=False | If True, saves absolute paths in the index file. If False, saves paths relative to the root directory. |
pattern_resolver | imgtools.pattern_parser.PatternResolver | Instance used to handle filename formatting with placeholders. |
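As a rough illustration of the filename_format and sanitize_filenames attributes, the sketch below fills the placeholders of a format string from a context dictionary and joins the result onto a root directory. It is not the library's PatternResolver: the helper names resolve and sanitize, and the set of characters treated as illegal, are assumptions made for this example, and plain str.format stands in for the real placeholder handling.

```python
import re
from pathlib import Path

# Which characters count as illegal is an assumption made for this sketch.
_ILLEGAL = re.compile(r'[<>:"\\|?*]')


def sanitize(value: str) -> str:
    """Replace each illegal character with an underscore."""
    return _ILLEGAL.sub("_", value)


def resolve(
    root: Path,
    filename_format: str,
    context: dict,
    sanitize_filenames: bool = True,
) -> Path:
    """Fill the placeholders in filename_format and join the result onto root."""
    values = {
        key: sanitize(str(val)) if sanitize_filenames else str(val)
        for key, val in context.items()
    }
    # str.format raises KeyError if a placeholder has no matching context key.
    return root / filename_format.format(**values)


path = resolve(
    Path("out"),
    "{subject_id}_{date}/{disease}.txt",
    {"subject_id": "S1", "date": "2024:01:05", "disease": "flu"},
)
print(path.as_posix())  # out/S1_2024_01_05/flu.txt
```

Only the context values are sanitized here, so the slash in the format string still separates directories.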
Properties
index_file : Path Returns the path to the index CSV file.
Notes
When using this class, consider the following best practices:
1. Implement the abstract save method in subclasses to handle the actual file writing.
2. Use the preview_path method to check if a file exists before performing expensive operations.
3. Use the class as a context manager when appropriate to ensure proper resource cleanup.
4. Set appropriate file existence handling mode based on your application's needs.
Methods:
Name | Description |
---|---|
add_to_index | Add or update an entry in the shared CSV index file using IndexWriter. |
clear_context | Clear the context for the writer. |
preview_path | Pre-checking file existence and setting up the writer context. |
resolve_path | Generate a file path based on the filename format, subject ID, and additional parameters. |
save | Abstract method for writing data. Must be implemented by subclasses. |
set_context | Set the context for the writer. |
add_to_index
add_to_index(
path: pathlib.Path,
include_all_context: bool = True,
filepath_column: str = "path",
replace_existing: bool = False,
merge_columns: bool = True,
) -> None
Add or update an entry in the shared CSV index file using IndexWriter.
What It Does:
- Logs the file's path and associated context variables to a shared CSV index file.
- Uses IndexWriter to safely handle concurrent writes and schema evolution.
When to Use It:
- Use this method to maintain a centralized record of saved files for auditing or debugging.
Relevant Writer Parameters
- The index_filename parameter allows you to specify a custom filename for the index file. By default, it will be named after the root_directory with _index.csv appended.
- If the index file already exists in the root directory, it is overwritten only when the overwrite_index parameter is set to True; the default is False.
- The absolute_paths_in_index parameter controls whether the paths in the index file are absolute or relative to the root_directory, with False (relative paths) being the default.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | pathlib.Path | The file path being saved. | required |
include_all_context | bool | If True, write existing context variables passed into writer and the additional context to the CSV. If False, determines only the context keys parsed from the | True |
filepath_column | str | The name of the column to store the file path. Defaults to "path". | "path" |
replace_existing | bool | If True, checks if the file path already exists in the index and replaces it. | False |
merge_columns | bool | If True, allows schema evolution by merging new columns with existing ones. Set to False for strict schema enforcement (will raise an error if schemas don't match). | True |
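The column-merging behaviour described for merge_columns can be sketched with the standard csv module. This is only an illustration under stated assumptions: add_row is a hypothetical helper, not the IndexWriter API, it rewrites the whole file on every call, and it makes no attempt at the concurrent-write safety that IndexWriter is said to provide.

```python
import csv
import tempfile
from pathlib import Path


def add_row(index_file: Path, row: dict, merge_columns: bool = True) -> None:
    """Append one row to a CSV index, widening the header when new keys appear."""
    rows, columns = [], []
    if index_file.exists():
        with index_file.open(newline="") as f:
            reader = csv.DictReader(f)
            columns = list(reader.fieldnames or [])
            rows = list(reader)
    new_columns = [key for key in row if key not in columns]
    if new_columns and columns and not merge_columns:
        # Strict mode: refuse rows that do not fit the existing header.
        raise ValueError(f"unexpected columns: {new_columns}")
    columns += new_columns
    rows.append(row)
    # Rewrite the whole file so every row shares the merged header.
    with index_file.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "results"
    root.mkdir()
    index_file = root / f"{root.name}_index.csv"  # default naming described above
    saved = root / "S1" / "scan.txt"
    # A path relative to root corresponds to absolute_paths_in_index=False.
    add_row(index_file, {"path": saved.relative_to(root).as_posix(), "subject": "S1"})
    add_row(index_file, {"path": "S2/scan.txt", "subject": "S2", "modality": "CT"})
    with index_file.open(newline="") as f:
        index_rows = list(csv.DictReader(f))

print(index_rows[0])  # the first row gains an empty "modality" column
```

Passing merge_columns=False to the second call would raise ValueError in this sketch, which mirrors the strict schema enforcement described in the table.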
Source code in src/imgtools/io/writers/abstract_base_writer.py
clear_context
Clear the context for the writer.
Useful for resetting the context after using preview_path or save, when you want to make sure that the context is empty for new operations.
Source code in src/imgtools/io/writers/abstract_base_writer.py
preview_path
preview_path(
**kwargs: object,
) -> typing.Optional[pathlib.Path]
Pre-checks file existence and sets up the writer context.
Meant to let users skip expensive computations if a file already exists and they don't want to overwrite it.
The only difference between this method and resolve_path is that this method does not return the path if the file exists and the mode is set to SKIP.
This is because resolve_path is used by the .save() method, which should be able to return the path even if the file exists.
What It Does:
- Pre-checks the file path based on context without writing the file.
- Returns None if the file exists and the mode is set to SKIP.
- Raises a FileExistsError if the file exists and the mode is set to FAIL.
- An added benefit of using preview_path is that it automatically caches the context variables for future use, so save() can be called without passing in the context variables again.
Examples:
The main idea is to let users save computation if they choose to skip existing files, i.e. if the file exists and the mode is SKIP, we return None, so the user can skip the computation.
>>> if nifti_writer.preview_path(subject="math", name="test") is None:
...     logger.info("File already exists. Skipping computation.")
...     continue  # could be `break` or `return` depending on the use case
If the mode is FAIL, we raise an error if the file exists, so the user doesn't have to perform an expensive computation only to fail when saving.
Useful Feature
The context is saved in the instance, so running .save() after this will use the same context, and the user can optionally update the context with new values passed to .save().
>>> if path := writer.preview_path(subject="math", name="test"):
...     ...  # do some expensive computation to generate the data
...     writer.save(data)
.save() automatically uses the context for subject and name that we passed to preview_path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | typing.Any | Parameters for resolving the filename and validating existence. | {} |
Returns:
Type | Description |
---|---|
pathlib.Path | None
|
If the file exists and the mode is |
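The decision logic described above can be sketched without the library. In this sketch Mode is a hypothetical stand-in for ExistingFileMode (its member values are assumed) and preview is a free function rather than the real method, so it shows only the three outcomes, not the context caching.

```python
import tempfile
from enum import Enum
from pathlib import Path
from typing import Optional


class Mode(str, Enum):  # hypothetical stand-in for ExistingFileMode
    OVERWRITE = "overwrite"
    SKIP = "skip"
    FAIL = "fail"


def preview(path: Path, mode: Mode) -> Optional[Path]:
    """Return the path to write to, or None when an existing file should be skipped."""
    if path.exists():
        if mode is Mode.SKIP:
            return None
        if mode is Mode.FAIL:
            raise FileExistsError(f"File already exists: {path}")
    return path


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "done.txt"
    existing.write_text("cached result")
    missing = Path(tmp) / "todo.txt"

    skipped = preview(existing, Mode.SKIP)        # None: skip the computation
    fresh = preview(missing, Mode.SKIP)           # a path: go ahead and compute
    replaced = preview(existing, Mode.OVERWRITE)  # a path: the file will be replaced
    try:
        preview(existing, Mode.FAIL)
        failed = False
    except FileExistsError:
        failed = True
```

Checking before the expensive step is what makes FAIL useful here: the error surfaces before any work is done rather than at save time.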
Source code in src/imgtools/io/writers/abstract_base_writer.py
resolve_path
resolve_path(**kwargs: object) -> pathlib.Path
Generate a file path based on the filename format, subject ID, and additional parameters.
Meant to be used by developers when creating a new writer class and used internally by the save method.
What It Does:
- Dynamically generates a file path based on the provided context and filename format.
When to Use It:
- This method is meant to be used in the save method to determine the file’s target location, but can also be used by external code to generate paths.
- It ensures you’re working with a valid path and can handle file existence scenarios.
- Only raises FileExistsError if the file already exists and the mode is set to FAIL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | typing.Any | Parameters for resolving the filename and validating existence. | {} |
Returns:
Name | Type | Description |
---|---|---|
resolved_path | pathlib.Path | The resolved path for the file. |
Source code in src/imgtools/io/writers/abstract_base_writer.py
save
abstractmethod
save(
data: imgtools.io.writers.abstract_base_writer.ContentType,
**kwargs: typing.Any
) -> pathlib.Path
Abstract method for writing data. Must be implemented by subclasses.
Can use resolve_path() to get the output path and write the data to it.
For efficiency, use self.context to access the context variables, updating them with the kwargs passed from the save method.
This will help simplify repeated saves with similar context variables.
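The pattern described above, where save calls resolve_path and reuses stored context, might look roughly like the following. MiniWriter and TextWriter are simplified stand-ins written for this example, not the imgtools classes: they omit existence checks, sanitization, and indexing, and plain str.format is assumed for placeholder filling.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Generic, TypeVar

ContentType = TypeVar("ContentType")


class MiniWriter(ABC, Generic[ContentType]):
    """Simplified stand-in for the base writer; not the imgtools class."""

    def __init__(self, root_directory: Path, filename_format: str) -> None:
        self.root_directory = root_directory
        self.filename_format = filename_format
        self.context: Dict[str, Any] = {}

    def resolve_path(self, **kwargs: Any) -> Path:
        # Merge new values into the stored context, then fill the placeholders.
        self.context.update(kwargs)
        path = self.root_directory / self.filename_format.format(**self.context)
        path.parent.mkdir(parents=True, exist_ok=True)
        return path

    @abstractmethod
    def save(self, data: ContentType, **kwargs: Any) -> Path: ...


class TextWriter(MiniWriter[str]):
    def save(self, data: str, **kwargs: Any) -> Path:
        path = self.resolve_path(**kwargs)
        path.write_text(data)
        return path


with tempfile.TemporaryDirectory() as tmp:
    writer = TextWriter(Path(tmp), "{subject}/{name}.txt")
    first = writer.save("a", subject="S1", name="baseline")
    # Only `name` changes; `subject` is reused from the stored context.
    second = writer.save("b", name="followup")
    first_rel = first.relative_to(tmp).as_posix()
    second_rel = second.relative_to(tmp).as_posix()
    second_text = second.read_text()
```

Keeping the context on the instance is what lets the second call pass only the value that changed.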
Source code in src/imgtools/io/writers/abstract_base_writer.py
ExampleWriter
dataclass
ExampleWriter(
root_directory: pathlib.Path = dataclasses.field(),
filename_format: str = dataclasses.field(),
create_dirs: bool = True,
existing_file_mode: imgtools.io.writers.abstract_base_writer.ExistingFileMode = imgtools.io.writers.abstract_base_writer.ExistingFileMode.FAIL,
sanitize_filenames: bool = True,
context: typing.Dict[str, typing.Any] = dict(),
overwrite_index: bool = False,
absolute_paths_in_index: bool = False,
index_filename: typing.Optional[str] = None,
)
Bases: imgtools.io.writers.abstract_base_writer.AbstractBaseWriter[str]
A concrete implementation of AbstractBaseWriter for demonstration.
Methods:
Name | Description |
---|---|
add_to_index | Add or update an entry in the shared CSV index file using IndexWriter. |
clear_context | Clear the context for the writer. |
preview_path | Pre-checking file existence and setting up the writer context. |
resolve_path | Generate a file path based on the filename format, subject ID, and additional parameters. |
save | Save content to a file with the resolved path. |
set_context | Set the context for the writer. |
save
save(data: str, **kwargs: object) -> pathlib.Path
Save content to a file with the resolved path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | str | The content to write to the file. | required |
**kwargs | typing.Any | Additional context for filename generation. | {} |
Returns:
Type | Description |
---|---|
pathlib.Path | The path to the saved file. |
Source code in src/imgtools/io/writers/abstract_base_writer.py
ExistingFileMode
Bases: str, enum.Enum
Enum to specify handling behavior for existing files.
Attributes:
Name | Type | Description |
---|---|---|
OVERWRITE | str | Overwrite the existing file. Logs as debug and continues with the operation. |
FAIL | str | Fail the operation if the file exists. Logs as error and raises a FileExistsError. |
SKIP | str | Skip the operation if the file exists. Meant to be used for previewing the path before any expensive computation. |
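Because the enum derives from both str and enum.Enum, its members behave like strings. The sketch below shows what that mixin gives you in general; Mode is a hypothetical stand-in, and the lowercase member values are an assumption, since the page does not list the actual values.

```python
from enum import Enum


class Mode(str, Enum):  # hypothetical stand-in; the member values are assumed
    OVERWRITE = "overwrite"
    FAIL = "fail"
    SKIP = "skip"


from_config = Mode("skip")              # look a member up by its string value
same_member = from_config is Mode.SKIP  # lookups return the singleton member
equals_plain_str = Mode.FAIL == "fail"  # equal to a plain str via the str mixin
```

This makes it convenient to accept the mode from a configuration file or command-line option as plain text.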
WriterIndexError
WriterIndexError(
message: str,
writer: imgtools.io.writers.abstract_base_writer.AbstractBaseWriter,
)
Bases: Exception
Exception raised when a writer encounters an error while interacting with its index.
This exception wraps the underlying IndexWriter exceptions to provide a clearer context about the writer that encountered the error.
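Wrapping a lower-level error while keeping a reference to the writer can be sketched with Python's exception chaining. IndexFailure and DemoWriter are hypothetical names used only for this example; the constructor mirrors the (message, writer) signature shown above, but how the real class stores or formats them is not stated on this page.

```python
class IndexFailure(Exception):  # hypothetical stand-in for WriterIndexError
    def __init__(self, message: str, writer: object) -> None:
        super().__init__(message)
        self.writer = writer


class DemoWriter:
    def add_to_index(self) -> None:
        try:
            raise OSError("index file is locked")  # simulated low-level failure
        except OSError as exc:
            # `from exc` keeps the original error available as __cause__.
            raise IndexFailure(f"Could not update index: {exc}", self) from exc


writer = DemoWriter()
try:
    writer.add_to_index()
except IndexFailure as err:
    caught = err  # keep a reference; `err` is unbound after the except block
```

A caller can then inspect caught.writer to see which writer failed and caught.__cause__ to reach the underlying error.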