Pipelines
- class celltk.core.pipeline.Pipeline(parent_folder=None, output_folder=None, image_folder=None, mask_folder=None, array_folder=None, name=None, frame_rng=None, skip_frames=None, file_extension='tif', overwrite=True, log_file=True, yaml_path=None, verbose=False, _split_key='&')
Pipeline organizes running Operations on saved images.
- Parameters
parent_folder (Optional[str], default: None) – Location of the raw images.
output_folder (Optional[str], default: None) – Location to store outputs. Defaults to ‘parent_folder/outputs’.
image_folder (Optional[str], default: None) – Location of images if different from parent_folder or output_folder.
mask_folder (Optional[str], default: None) – Location of masks if different from parent_folder or output_folder.
array_folder (Optional[str], default: None) – Location of arrays if different from parent_folder or output_folder.
name (Optional[str], default: None) – Used to identify the output array. Defaults to the name of parent_folder.
frame_rng (Optional[Tuple[int]], default: None) – Frames to be loaded and used. If a single int, that many images are loaded. Otherwise it designates the bounds of the range.
skip_frames (Optional[Tuple[int]], default: None) – Frames to be skipped, for example frames that are out of focus.
file_extension (str, default: 'tif') – File extension of the image files to be loaded.
overwrite (bool, default: True) – If True, outputs in output_folder are overwritten.
log_file (bool, default: True) – If True, log outputs are saved to a text file in the output folder.
yaml_path (Optional[str], default: None) – Path to the YAML file defining the Pipeline to be run.
verbose (bool, default: False) – If True, increases logging verbosity.
_split_key (str, default: '&') – Used to name outputs from functions that return multiple outputs. For example, if you align two channels, the outputs are saved as ‘align&channel000’ and ‘align&channel001’.
- Returns
None
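The descriptions of frame_rng, skip_frames, and _split_key above leave the exact indexing and naming rules open. The sketch below shows one plausible reading and is not CellTK's implementation: a single int keeps the first N frames, a pair is treated as half-open (start, stop) bounds, and skip_frames lists frame indices to drop. The helper names `select_frames` and `split_output_name` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple, Union


def select_frames(n_frames: int,
                  frame_rng: Optional[Union[int, Tuple[int, int]]] = None,
                  skip_frames: Optional[Sequence[int]] = None) -> List[int]:
    """Return the frame indices that would be kept (illustrative only)."""
    if frame_rng is None:
        start, stop = 0, n_frames
    elif isinstance(frame_rng, int):
        # Assumption: a single int means "load the first N frames".
        start, stop = 0, frame_rng
    else:
        # Assumption: the bounds are half-open, like Python's range().
        start, stop = frame_rng
    stop = min(stop, n_frames)
    skipped = set(skip_frames or ())
    return [i for i in range(start, stop) if i not in skipped]


def split_output_name(name: str,
                      split_key: str = '&') -> Tuple[str, Optional[str]]:
    """Separate an output name from its per-output suffix, if it has one."""
    base, sep, suffix = name.partition(split_key)
    return (base, suffix) if sep else (base, None)


print(select_frames(10, frame_rng=5, skip_frames=(2,)))
print(select_frames(10, frame_rng=(3, 8)))
print(split_output_name('align&channel000'))
```

Under these assumptions, the first call keeps frames 0, 1, 3, and 4, the second keeps frames 3 through 7, and the third separates 'align' from 'channel000'.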
- add_operations(operation)
Adds Operations to the Pipeline.
- Parameters
operation (Collection[Operation]) – A single operation or collection of operations to be run in the order passed.
- Return type
None
- Returns
None
- classmethod load_from_yaml(path)
Builds Pipeline class from specifications in YAML file
- Parameters
path (str) – Path to the YAML file.
- Returns
Pipeline with specified configuration
- Return type
Pipeline class
- run()
Load images and run the operations
- Returns
Generator returning outputs of the last operation
- Return type
Generator
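A minimal usage sketch based only on the signatures documented above. This section does not cover how Operations are constructed, so the helper takes them as an argument. The helper name `run_pipeline` and the chosen argument values are hypothetical. Because run() is documented as returning a generator, the sketch iterates over it to collect the outputs.

```python
from typing import Any, Collection, List


def run_pipeline(parent_folder: str, operations: Collection[Any]) -> List[Any]:
    """Build a Pipeline, attach operations, and collect the outputs of run()."""
    # Imported lazily so this module can be loaded without CellTK installed.
    from celltk.core.pipeline import Pipeline

    pipe = Pipeline(
        parent_folder=parent_folder,  # location of the raw images
        frame_rng=(0, 10),            # bounds of the frames to load
        overwrite=False,              # do not overwrite existing outputs
        verbose=True,                 # more detailed logging
    )
    pipe.add_operations(operations)

    # run() is documented as a generator over the last operation's outputs.
    return list(pipe.run())
```

The `operations` argument stands in for already-configured Operation objects, passed in the order they should run.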
- save_as_yaml(path=None, fname=None)
Saves Pipeline configuration as a YAML file
- Parameters
path (Optional[str], default: None) – Path to save the YAML file in. Defaults to output_folder.
fname (Optional[str], default: None) – Name of the YAML file. Defaults to parent_folder.
- Return type
None
- Returns
None
- save_operations_as_yaml(path=None, fname='operations.yaml')
Save configuration of Operations in Pipeline as a YAML file
- Parameters
path (Optional[str], default: None) – Path to save the YAML file in. Defaults to output_folder.
fname (str, default: 'operations.yaml') – Name of the YAML file. Defaults to operations.yaml.
- Return type
None
- Returns
None
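The save and load methods above suggest a round trip: write a Pipeline's configuration to YAML, then rebuild a Pipeline from that file. The sketch below is an illustration under one labeled assumption, namely that save_as_yaml writes to `<path>/<fname>` with the file name unchanged. The helper name `save_and_reload` and the file name 'pipeline.yaml' are hypothetical.

```python
import os


def save_and_reload(pipe, folder: str, fname: str = 'pipeline.yaml'):
    """Save a Pipeline's configuration, then rebuild a Pipeline from the file."""
    # Imported lazily so this module can be loaded without CellTK installed.
    from celltk.core.pipeline import Pipeline

    pipe.save_as_yaml(path=folder, fname=fname)
    # The Operations alone can be saved too; fname defaults to 'operations.yaml'.
    pipe.save_operations_as_yaml(path=folder)

    # Assumption: the configuration was written to <folder>/<fname> as given.
    return Pipeline.load_from_yaml(os.path.join(folder, fname))
```

If the library changes the file name, for example by appending an extension, the path passed to load_from_yaml would need to change accordingly.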
- class celltk.core.orchestrator.Orchestrator(yaml_folder=None, parent_folder=None, output_folder=None, match_str=None, image_folder=None, mask_folder=None, array_folder=None, condition_map={}, position_map=None, cond_map_only=False, name='experiment', frame_rng=None, skip_frames=None, file_extension='tif', overwrite=True, log_file=True, save_master_df=True, job_controller=None, verbose=True)
Orchestrator organizes running multiple Pipelines on a set of folders
- Parameters
yaml_folder (Optional[str], default: None) – Absolute path to the location of the Pipeline YAML files.
parent_folder (Optional[str], default: None) – Absolute path of the folders with raw images.
output_folder (Optional[str], default: None) – Absolute path to the folder to store outputs. Defaults to ‘parent_folder/outputs’.
match_str (Optional[str], default: None) – If provided, folders that do not contain match_str are excluded from analysis.
image_folder (Optional[str], default: None) – Absolute path to the folder with images if different from parent_folder or output_folder.
mask_folder (Optional[str], default: None) – Absolute path to the folder with masks if different from parent_folder or output_folder.
array_folder (Optional[str], default: None) – Absolute path to the folder with arrays if different from parent_folder or output_folder.
condition_map (dict, default: {}) – Dictionary that maps folder names to the condition in the experiment, e.g. {A1-Site_0: control}.
position_map (Union[dict, Callable, None], default: None) – If multiple positions have the same condition, position_map is used to uniquely identify them.
cond_map_only (bool, default: False) – If True, folders not in condition_map are not run. Only used if condition_map is provided.
name (str, default: 'experiment') – Used to identify the output array. Defaults to 'experiment'.
frame_rng (Optional[Tuple[int]], default: None) – Frames to be loaded and used. If a single int, that many images are loaded. Otherwise it designates the bounds of the range.
skip_frames (Optional[Tuple[int]], default: None) – Frames to be skipped, for example frames that are out of focus.
file_extension (str, default: 'tif') – File extension of the image files to be loaded.
overwrite (bool, default: True) – If True, outputs in output_folder are overwritten.
log_file (bool, default: True) – If True, log outputs are saved to a text file in the output folder.
save_master_df (bool, default: True) – If True, saves a single hdf5 file containing the data from all of the pipelines.
job_controller (Optional[JobController], default: None) – If given, controls how pipelines are run.
verbose (bool, default: True) – If True, increases logging verbosity.
- Returns
None
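The interaction of match_str, condition_map, and cond_map_only described above can be sketched as a folder filter. This is an illustration of the documented rules, not CellTK's code. The helper name `select_folders` is hypothetical, and mapping unlabeled folders to None is an assumption made for the example.

```python
from typing import Dict, List, Optional


def select_folders(folders: List[str],
                   match_str: Optional[str] = None,
                   condition_map: Optional[Dict[str, str]] = None,
                   cond_map_only: bool = False) -> Dict[str, Optional[str]]:
    """Map each folder that would be analyzed to its condition (or None)."""
    condition_map = condition_map or {}
    selected: Dict[str, Optional[str]] = {}
    for folder in folders:
        # Folders that do not contain match_str are excluded.
        if match_str is not None and match_str not in folder:
            continue
        # cond_map_only only applies when a condition_map was provided.
        if cond_map_only and condition_map and folder not in condition_map:
            continue
        # Assumption: folders without a condition are kept and labeled None.
        selected[folder] = condition_map.get(folder)
    return selected


folders = ['A1-Site_0', 'A1-Site_1', 'B2-Site_0']
conditions = {'A1-Site_0': 'control'}
print(select_folders(folders, match_str='A1', condition_map=conditions))
print(select_folders(folders, condition_map=conditions, cond_map_only=True))
```

With cond_map_only=True, only 'A1-Site_0' survives, because it is the only folder named in the condition map.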
- add_operations(operation, index=-1)
Adds Operations to the Orchestrator and all pipelines.
- Parameters
operation (Collection[Operation]) – A single operation or collection of operations to be run in the order passed.
index (int, default: -1) – If given, dictates where to insert the new operations.
- Return type
None
- Returns
None
- build_experiment_file(match_str=None)
Builds a single ExperimentArray from all of the ConditionArrays found in the Pipeline folders.
- Parameters
match_str (Optional[str], default: None) – If given, only files containing match_str are included.
- Return type
None
- load_operations_from_yaml(path)
Loads Operations from a YAML file
- Parameters
path (str) – Path to the YAML file.
- Return type
None
- Returns
None
- run(n_cores=1)
Load images and run all of the pipelines
- Parameters
n_cores (int, default: 1) – Not currently implemented.
- Returns
None
- Return type
None
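A usage sketch for the Orchestrator, using only the constructor arguments and methods documented above. The helper name `run_experiment` and the argument values are hypothetical. Whether further arguments such as yaml_folder are needed in practice is not stated in this section. Since n_cores is documented as not implemented, run() is called with its default.

```python
from typing import Dict


def run_experiment(parent_folder: str,
                   operations_yaml: str,
                   condition_map: Dict[str, str]) -> None:
    """Run the same Operations on every labeled folder under parent_folder."""
    # Imported lazily so this module can be loaded without CellTK installed.
    from celltk.core.orchestrator import Orchestrator

    orch = Orchestrator(
        parent_folder=parent_folder,  # absolute path of the raw image folders
        condition_map=condition_map,  # folder name -> experimental condition
        cond_map_only=True,           # skip folders missing from condition_map
        save_master_df=True,          # one hdf5 file with data from all pipelines
    )
    # Operations are read from a YAML file rather than built in code.
    orch.load_operations_from_yaml(operations_yaml)
    orch.run()
```

The operations file could be one written earlier by save_operations_as_yaml.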
- save_condition_map_as_yaml(path=None, fname='conditions.yaml')
Saves the conditions in Orchestrator as a YAML file
- Parameters
path (Optional[str], default: None) – Path to save the file at.
fname (str, default: 'conditions.yaml') – Name of the YAML file.
- Return type
None
- Returns
None
- save_operations_as_yaml(path=None, fname='operations.yaml')
Save configuration of Operations in Pipeline as a YAML file
- Parameters
path (Optional[str], default: None) – Path to save the YAML file in. Defaults to output_folder.
fname (str, default: 'operations.yaml') – Name of the YAML file. Defaults to operations.yaml.
- Return type
None
- Returns
None
- save_pipelines_as_yamls(path=None)
Saves Orchestrator configuration as a YAML file
- Parameters
path (Optional[str], default: None) – Path to save the YAML file in. Defaults to output_folder.
- Return type
None
- Returns
None
- update_condition_map(condition_map={}, path=None)
Adds conditions to each of the Pipelines in Orchestrator
- Parameters
condition_map (dict, default: {}) – Dictionary that maps folder names to the condition in the experiment, e.g. {A1-Site_0: control}.
path (Optional[str], default: None) – If given, loads a condition map from the file found at path. Overwrites condition_map.
- Return type
None
- Returns
None
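The condition-map methods above can be combined to relabel an existing Orchestrator and keep a record of the labels. The sketch below relies only on the documented signatures. The helper name `relabel_conditions` is hypothetical, and treating path as the destination folder for conditions.yaml is an assumption.

```python
from typing import Any, Dict


def relabel_conditions(orch: Any,
                       new_conditions: Dict[str, str],
                       folder: str) -> None:
    """Add conditions to an Orchestrator and save the resulting map as YAML."""
    # Passing a dict updates the conditions directly; passing path instead
    # would load the map from a file and overwrite condition_map.
    orch.update_condition_map(condition_map=new_conditions)

    # Assumption: path names the folder that will contain conditions.yaml.
    orch.save_condition_map_as_yaml(path=folder, fname='conditions.yaml')
```

A call such as `relabel_conditions(orch, {'A1-Site_0': 'control'}, folder)` reuses the example mapping from the parameter description.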