Pipelines

class celltk.core.pipeline.Pipeline(parent_folder=None, output_folder=None, image_folder=None, mask_folder=None, array_folder=None, name=None, frame_rng=None, skip_frames=None, file_extension='tif', overwrite=True, log_file=True, yaml_path=None, verbose=False, _split_key='&')

Pipeline organizes running Operations on saved images.

Parameters
  • parent_folder (Optional[str], default: None) – Location of the raw images

  • output_folder (Optional[str], default: None) – Location to store outputs. Defaults to ‘parent_folder/outputs’

  • image_folder (Optional[str], default: None) – Location of images if different from parent_folder or output_folder

  • mask_folder (Optional[str], default: None) – Location of masks if different from parent_folder or output_folder

  • array_folder (Optional[str], default: None) – Location of arrays if different from parent_folder or output_folder

  • name (Optional[str], default: None) – Used to identify the output array. Defaults to the name of parent_folder.

  • frame_rng (Optional[Tuple[int]], default: None) – Specify the frames to be loaded and used. If a single int, will load that many images. Otherwise designates the bounds of the range.

  • skip_frames (Optional[Tuple[int]], default: None) – Use to specify frames to be skipped. For example, frames that are out of focus.

  • file_extension (str, default: 'tif') – File extension of image files to be loaded

  • overwrite (bool, default: True) – If True, outputs in output_folder are overwritten

  • log_file (bool, default: True) – If True, save log outputs to a text file in output folder

  • yaml_path (Optional[str], default: None) – Path to yaml file defining the Pipeline to be run

  • verbose (bool, default: False) – If True, increases logging verbosity

  • _split_key (str, default: '&') – Used to specify outputs from functions that return multiple outputs. For example, if you align two channels the outputs will be saved as ‘align&channel000’ and ‘align&channel001’.
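
The naming convention described for _split_key can be illustrated with plain string handling. This is a minimal sketch, not celltk code: the helper names split_output_names and parse_output_name are hypothetical, and the zero-padded channel suffix simply mirrors the 'align&channel000' example above.

```python
from typing import List, Tuple

SPLIT_KEY = "&"  # mirrors the documented default of _split_key


def split_output_names(base: str, n_outputs: int, key: str = SPLIT_KEY) -> List[str]:
    """Hypothetical helper: build one name per output of a multi-output function."""
    return [f"{base}{key}channel{i:03d}" for i in range(n_outputs)]


def parse_output_name(name: str, key: str = SPLIT_KEY) -> Tuple[str, str]:
    """Hypothetical helper: split a saved name into (function, output) parts."""
    base, _, suffix = name.partition(key)
    return base, suffix


names = split_output_names("align", 2)
```

Choosing a key that does not otherwise occur in function or channel names keeps the split unambiguous.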

Returns

None
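
One way to read the frame_rng and skip_frames descriptions above is sketched below. The function select_frames is hypothetical and is not part of celltk. It assumes that a pair of bounds is half-open (start included, stop excluded) and that skip_frames refers to the original frame numbering; the source does not confirm either point.

```python
from typing import List, Optional, Sequence, Tuple, Union


def select_frames(
    n_available: int,
    frame_rng: Optional[Union[int, Tuple[int, int]]] = None,
    skip_frames: Optional[Sequence[int]] = None,
) -> List[int]:
    """Return the frame indices that would be kept (illustrative only)."""
    if frame_rng is None:
        start, stop = 0, n_available
    elif isinstance(frame_rng, int):
        # A single int: keep that many frames, counted from the first one
        start, stop = 0, frame_rng
    else:
        # A pair: treated here as half-open bounds [start, stop), an assumption
        start, stop = frame_rng
    stop = min(stop, n_available)
    skipped = set(skip_frames or ())
    return [i for i in range(start, stop) if i not in skipped]


first_three = select_frames(10, 3)
bounded = select_frames(10, (2, 6), skip_frames=(3,))
```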

add_operations(operation)

Adds Operations to the Pipeline.

Parameters

operation (Collection[Operation]) – A single operation or collection of operations to be run in the order passed

Return type

None

Returns

None

classmethod load_from_yaml(path)

Builds a Pipeline from the specifications in a YAML file

Parameters

path (str) – Path to yaml file

Returns

Pipeline with specified configuration

Return type

Pipeline class

run()

Load images and run the operations

Returns

Generator returning outputs of the last operation

Return type

Generator
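
Because run() is documented as returning a generator, a caller would presumably need to iterate over the result to collect the outputs. The sketch below uses a stand-in class, StubPipeline, that only borrows the method names add_operations and run; it is not celltk code, and whether the real run() defers its work until iteration is an assumption.

```python
from typing import Any, Callable, Iterable, Iterator, List


class StubPipeline:
    """Stand-in with the same method names as Pipeline; not celltk code."""

    def __init__(self) -> None:
        self.operations: List[Callable[[Any], Any]] = []

    def add_operations(self, operation: Iterable[Callable[[Any], Any]]) -> None:
        # Operations are stored in the order they are passed
        self.operations.extend(operation)

    def run(self) -> Iterator[Any]:
        # Generator function: nothing below executes until the caller iterates
        result = None
        for op in self.operations:
            result = op(result)
        # Only the output of the last operation is yielded
        yield result


pipe = StubPipeline()
pipe.add_operations([lambda _: 1, lambda x: x + 1])
outputs = list(pipe.run())  # iterating is what drives the work
```

Calling run() without consuming the generator would leave this stub idle, which is why the sketch wraps the call in list().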

save_as_yaml(path=None, fname=None)

Saves Pipeline configuration as a YAML file

Parameters
  • path (Optional[str], default: None) – Path to save yaml file in. Defaults to output_folder

  • fname (Optional[str], default: None) – Name of the YAML file. Defaults to parent_folder

Return type

None

Returns

None

save_operations_as_yaml(path=None, fname='operations.yaml')

Save configuration of Operations in Pipeline as a YAML file

Parameters
  • path (Optional[str], default: None) – Path to save yaml file in. Defaults to output_folder

  • fname (str, default: 'operations.yaml') – Name of the YAML file. Defaults to operations.yaml

Return type

None

Returns

None

class celltk.core.orchestrator.Orchestrator(yaml_folder=None, parent_folder=None, output_folder=None, match_str=None, image_folder=None, mask_folder=None, array_folder=None, condition_map={}, position_map=None, cond_map_only=False, name='experiment', frame_rng=None, skip_frames=None, file_extension='tif', overwrite=True, log_file=True, save_master_df=True, job_controller=None, verbose=True)

Orchestrator organizes running multiple Pipelines on a set of folders

Parameters
  • yaml_folder (Optional[str], default: None) – Absolute path to location of Pipeline yamls

  • parent_folder (Optional[str], default: None) – Absolute path of the folders with raw images

  • output_folder (Optional[str], default: None) – Absolute path to folder to store outputs. Defaults to ‘parent_folder/outputs’

  • match_str (Optional[str], default: None) – If provided, folders that do not contain match_str are excluded from analysis

  • image_folder (Optional[str], default: None) – Absolute path to folder with images if different from parent_folder or output_folder

  • mask_folder (Optional[str], default: None) – Absolute path to folder with masks if different from parent_folder or output_folder

  • array_folder (Optional[str], default: None) – Absolute path to folder with arrays if different from parent_folder or output_folder

  • condition_map (dict, default: {}) – Dictionary that maps folder names to the condition in the experiment, e.g. {A1-Site_0: control}

  • position_map (Union[dict, Callable, None], default: None) – If multiple positions have the same condition, position_map is used to uniquely identify them

  • cond_map_only (bool, default: False) – If True, folders not in condition_map are not run. Only used if condition_map is provided.

  • name (str, default: 'experiment') – Used to identify the output array. Defaults to 'experiment'.

  • frame_rng (Optional[Tuple[int]], default: None) – Specify the frames to be loaded and used. If a single int, will load that many images. Otherwise designates the bounds of the range.

  • skip_frames (Optional[Tuple[int]], default: None) – Use to specify frames to be skipped. For example, frames that are out of focus.

  • file_extension (str, default: 'tif') – File extension of image files to be loaded

  • overwrite (bool, default: True) – If True, outputs in output_folder are overwritten

  • log_file (bool, default: True) – If True, save log outputs to a text file in output folder

  • save_master_df (bool, default: True) – If True, saves a single hdf5 file containing the data from all of the pipelines

  • job_controller (Optional[JobController], default: None) – If given, controls how pipelines are run

  • verbose (bool, default: True) – If True, increases logging verbosity

Returns

None
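
The interaction of match_str, condition_map and cond_map_only described above can be sketched as a simple filter over folder names. The function select_folders is hypothetical and not part of celltk. It assumes matching is a plain substring test and that an empty condition_map counts as "not provided".

```python
from typing import Dict, Iterable, List, Optional


def select_folders(
    folders: Iterable[str],
    match_str: Optional[str] = None,
    condition_map: Optional[Dict[str, str]] = None,
    cond_map_only: bool = False,
) -> List[str]:
    """Return the folder names that would be analyzed (illustrative only)."""
    kept = []
    for folder in folders:
        # Folders that do not contain match_str are excluded
        if match_str is not None and match_str not in folder:
            continue
        # cond_map_only only has an effect when a condition map is provided
        if cond_map_only and condition_map and folder not in condition_map:
            continue
        kept.append(folder)
    return kept


folders = ["A1-Site_0", "A2-Site_0", "B1-Site_0", "notes"]
sites = select_folders(folders, match_str="Site")
controls = select_folders(
    folders,
    match_str="Site",
    condition_map={"A1-Site_0": "control"},
    cond_map_only=True,
)
```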

add_operations(operation, index=-1)

Adds Operations to the Orchestrator and all pipelines.

Parameters
  • operation (Collection[Operation]) – A single operation or collection of operations to be run in the order passed

  • index (int, default: -1) – If given, dictates where to insert the new operations

Return type

None

Returns

None
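
A possible reading of the index parameter is sketched below with plain lists. The function insert_operations is hypothetical and not part of celltk. It assumes that the default of -1 means "append after the existing operations"; note that Python's own list.insert(-1, x) would instead place x before the last element, so the default is special-cased here.

```python
from typing import Any, List, Sequence


def insert_operations(current: List[Any], new: Sequence[Any], index: int = -1) -> List[Any]:
    """Return a new list with `new` placed into `current` (illustrative only)."""
    updated = list(current)
    if index == -1:
        # Assumption: the default means "append after the existing operations"
        updated.extend(new)
    else:
        # Slice assignment inserts the whole collection at `index`, keeping its order
        updated[index:index] = new
    return updated


ops = insert_operations(["segment", "track"], ["extract"])
ops_front = insert_operations(["segment", "track"], ["align"], index=0)
```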

build_experiment_file(match_str=None)

Builds a single ExperimentArray from all of the ConditionArrays found in the Pipeline folders.

Parameters

match_str (Optional[str], default: None) – If given, only files containing match_str are included

Return type

None

load_operations_from_yaml(path)

Loads Operations from a YAML file

Parameters

path (str) – path to YAML file

Return type

None

Returns

None

run(n_cores=1)

Load images and run all of the pipelines

Parameters

n_cores (int, default: 1) – Not currently implemented

Returns

None

Return type

None

save_condition_map_as_yaml(path=None, fname='conditions.yaml')

Saves the conditions in Orchestrator as a YAML file

Parameters
  • path (Optional[str], default: None) – Path to save the file at

  • fname (str, default: 'conditions.yaml') – Name of the YAML file. Defaults to conditions.yaml

Return type

None

Returns

None

save_operations_as_yaml(path=None, fname='operations.yaml')

Save configuration of Operations in Pipeline as a YAML file

Parameters
  • path (Optional[str], default: None) – Path to save yaml file in. Defaults to output_folder

  • fname (str, default: 'operations.yaml') – Name of the YAML file. Defaults to operations.yaml

Return type

None

Returns

None

save_pipelines_as_yamls(path=None)

Saves Orchestrator configuration as a YAML file

Parameters

path (Optional[str], default: None) – Path to save yaml file in. Defaults to output_folder

Return type

None

Returns

None

update_condition_map(condition_map={}, path=None)

Adds conditions to each of the Pipelines in Orchestrator

Parameters
  • condition_map (dict, default: {}) – Dictionary that maps folder names to the condition in the experiment, e.g. {A1-Site_0: control}

  • path (Optional[str], default: None) – If given, loads a condition map from the file found at path. Overwrites condition_map.

Return type

None

Returns

None