Arrays

class celltk.core.arrays.ConditionArray(regions=['nuc'], channels=['TRITC'], metrics=['label'], cells=[0], frames=[0], name='default', time=None, pos_id=0)

Stores the results from a single condition in an experiment. For now, this class is only built by Extract; it is not yet meant to be built directly by the user.

Parameters
  • regions (List[str], default: ['nuc']) –

  • channels (List[str], default: ['TRITC']) –

  • metrics (List[str], default: ['label']) –

  • cells (List[int], default: [0]) –

  • frames (List[int], default: [0]) –

  • name (str, default: 'default') –

  • time (Union[float, ndarray, None], default: None) –

  • pos_id (int, default: 0) –

add_metric_slots(name)

Expands the ConditionArray to make room for more metrics.

Note

  • This method must be used before attempting to add new metrics to the ConditionArray.

Parameters

name (List[str]) – List of names of the metrics to add

Return type

None

Returns

None
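A minimal sketch of what "making room" for metrics could look like on a plain NumPy array. This is not the celltk implementation: the axis order (region, channel, metric, cell, frame) is assumed from the constructor arguments, and filling the new slots with NaN is also an assumption.

```python
import numpy as np

metrics = ['label', 'area']
# Stand-in data, assumed layout: (region, channel, metric, cell, frame).
arr = np.zeros((1, 1, len(metrics), 4, 6))

new_metrics = ['slope_prob', 'plateau_prob']

# Allocate empty (NaN) slots and append them along the metric axis.
slots = np.full(arr.shape[:2] + (len(new_metrics),) + arr.shape[3:], np.nan)
expanded = np.concatenate([arr, slots], axis=2)
metrics = metrics + new_metrics

# Only after the expansion is there a slot to write the new metric into.
expanded[0, 0, metrics.index('slope_prob')] = 0.5
print(expanded.shape)
```

Writing to a metric index that does not exist yet would raise an IndexError on the original array, which is why the slots have to be added first.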

property channels

Names of the channels in the data array.

property condition

Name of the data array

property coordinate_dimensions

Dictionary with coordinate names as keys and coordinate lengths as values
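As a rough illustration of how coordinate names relate to the array shape, the sketch below builds such a dictionary from a plain NumPy array. The axis order is an assumption taken from the order of the constructor arguments, and `coords` and `arr` are stand-ins rather than celltk objects.

```python
import numpy as np

# Assumed axis order, mirroring the constructor arguments.
coords = ['regions', 'channels', 'metrics', 'cells', 'frames']

# Stand-in data: 1 region, 2 channels, 3 metrics, 10 cells, 25 frames.
arr = np.zeros((1, 2, 3, 10, 25))

# Map each coordinate name to the length of its axis.
coordinate_dimensions = dict(zip(coords, arr.shape))
print(coordinate_dimensions)
```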

property coordinates

Names of the coordinates (axes) in the data array

property dtype

Data type of the data array. Usually double.

filter_cells(mask=None, key=None, delete=True, *args, **kwargs)

Uses an arbitrary mask or a saved mask (key) to filter cells from the data. If delete is True, the underlying structure is changed; otherwise, the filtered data are only returned.

Parameters
  • mask (Optional[ndarray], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.

  • key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overrides mask if provided.

  • delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.

Returns

Array with cells designated by mask or key removed.

Return type

np.ndarray
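The effect of a 1D mask can be sketched with plain NumPy indexing. Two assumptions are made here: cells sit on the fourth axis, and True marks a cell to keep. The sketch only mirrors the delete=False behaviour, where the original data are left untouched.

```python
import numpy as np

# Stand-in data with 5 cells on the (assumed) fourth axis.
arr = np.arange(30, dtype=float).reshape(1, 1, 2, 5, 3)

# 1D boolean mask, one entry per cell; True marks a cell to keep (assumed).
keep = np.array([True, False, True, True, False])

# Boolean indexing returns a new array; arr itself still holds 5 cells.
filtered = arr[:, :, :, keep, :]
print(arr.shape, filtered.shape)
```

With delete=True the stored array would be replaced by the filtered one, so later calls see only the remaining cells.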

filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)

Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use

Parameters
  • value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.

  • metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.

  • thresholds (Collection[str]) – Lower threshold for the metrics given. If a peak metric is less than the threshold, it will be removed. Must be the same length as metrics.

  • kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.

  • peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key

  • propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray

Return type

None

Returns

None
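The lower-threshold rule can be sketched on plain arrays. The `amplitude` function below is a hypothetical metric written for this example; it is not one of the metrics in celltk.utils.peak_utils. The sketch also assumes that peak labels are positive integers and that 0 means "no peak".

```python
import numpy as np

# Two traces (cells x frames) and their peak labels (0 = no peak, assumed).
traces = np.array([[0.0, 1.0, 3.0, 1.0, 0.0, 0.5, 0.8, 0.5, 0.0],
                   [0.0, 0.2, 0.4, 0.2, 0.0, 2.0, 4.0, 2.0, 0.0]])
labels = np.array([[0, 1, 1, 1, 0, 2, 2, 2, 0],
                   [0, 1, 1, 1, 0, 2, 2, 2, 0]])


def amplitude(trace, label_row, peak):
    """Hypothetical peak metric: maximum of the trace within one peak."""
    return trace[label_row == peak].max()


threshold = 1.0
filtered = labels.copy()
for trace, label_row, out_row in zip(traces, labels, filtered):
    for peak in np.unique(label_row):
        if peak == 0:
            continue  # background, not a peak
        # A peak whose metric is below the threshold is removed.
        if amplitude(trace, label_row, peak) < threshold:
            out_row[label_row == peak] = 0

print(filtered)
```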

generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, *args, **kwargs)

Generate a mask to remove cells using an arbitrary filter.

Parameters
  • function – If str, name of function in filter_utils. Otherwise, should be a Callable that inputs a 2D array and returns a 2D boolean array.

  • metric – Name of metric to use. Can be any key in the array.

  • region – Name of region to calculate the filter in.

  • channel – Name of channel to calculate filter in.

  • frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.

  • key – If given, saves the mask in ConditionArray as key.

  • args – passed to function

  • kwargs – passed to function

Returns

2D boolean array that masks cells outside filter
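A sketch of a Callable with the shape contract described above (2D array in, 2D boolean array out), plus one possible reading of frame_rng. Both `above_threshold` and `select_frames` are hypothetical helpers, not functions from filter_utils. Treating a tuple as a (start, stop) slice and treating True as "keep" are assumptions.

```python
import numpy as np


def above_threshold(values, thres=1.0):
    """Hypothetical filter: True for cells whose mean exceeds `thres`.

    Takes a 2D (cells x frames) array and returns a 2D boolean array.
    """
    keep = np.nanmean(values, axis=1) > thres
    return np.broadcast_to(keep[:, None], values.shape).copy()


def select_frames(values, frame_rng=None):
    """Restrict the frames used in the calculation (assumed semantics)."""
    if frame_rng is None:
        return values
    if isinstance(frame_rng, int):
        return values[:, :frame_rng]   # first frame_rng frames
    start, stop = frame_rng
    return values[:, start:stop]       # tuple read as (start, stop)


values = np.array([[0.5, 0.5, 5.0, 5.0],
                   [2.0, 2.0, 2.0, 2.0],
                   [np.nan, 3.0, 3.0, 0.0]])

# Only the first two frames decide which cells pass the filter.
mask = above_threshold(select_frames(values, 2), thres=1.0)
print(mask[:, 0])
```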

get_mask(key)

Returns a saved mask. Masks are generated and saved using ConditionArray().generate_mask()

Parameters

key (str) – Name of mask to retrieve

Returns

The saved boolean mask

Return type

np.ndarray

interpolate_nans(keys=None)

Linear interpolation of nans for each cell. Modification is done in-place.

Parameters

keys (Optional[Collection[tuple]], default: None) – keys that will have nans removed. Each key should be a tuple of strings with length=3

Return type

None

Returns

None
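Per-cell linear interpolation can be sketched with numpy.interp. This is an illustration, not the celltk code: in particular, numpy.interp holds leading and trailing NaNs at the nearest valid value, and the library may treat the ends of a trace differently.

```python
import numpy as np


def interpolate_row(trace):
    """Linearly interpolate NaNs in a 1D trace and return a copy."""
    trace = trace.copy()
    nans = np.isnan(trace)
    if nans.all() or not nans.any():
        return trace  # nothing to anchor on, or nothing to fill
    idx = np.arange(trace.size)
    trace[nans] = np.interp(idx[nans], idx[~nans], trace[~nans])
    return trace


# One row per cell, one column per frame.
values = np.array([[1.0, np.nan, 3.0, np.nan, np.nan, 6.0],
                   [np.nan, 2.0, np.nan, 4.0, 5.0, np.nan]])

# Write back row by row, so the array is modified in place.
for i in range(values.shape[0]):
    values[i] = interpolate_row(values[i])

print(values)
```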

property keys

All the possible keys that can be used to index the data array.

classmethod load(path)

Load an hdf5 file and convert to a ConditionArray.

Parameters

path (str) – Path to the hdf5 file to be loaded.

Returns

Loaded ConditionArray

Return type

ConditionArray

mark_active_cells(key, thres=1, propagate=True)

Uses peak labels to mark in what frames cells are active

Parameters
  • key (Tuple[int, str]) – Key defining peak labels

  • thres (float, default: 1) – Leave as 1, not currently used

  • propagate (bool, default: True) – if True, propagate active marks to other keys in ConditionArray

Return type

None

Returns

None

property metrics

Names of the metrics in the data array.

property ncells

Number of cells present in array

property ndim

Number of dimensions in the data array. Should equal 5.

predict_peaks(key, model=None, weight_path=None, propagate=True, segment=True, **kwargs)

Uses a UNet-based neural net to predict peaks in the traces defined by key. Adds two keys to ConditionArray, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.

Parameters
  • key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.

  • model (Optional[UPeakModel], default: None) – An instantiated UPeakModel to use

  • weight_path (Optional[str], default: None) – Path to the model weights to use for a new UPeakModel instance

  • propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray

  • segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.

  • kwargs – Passed to segmentation function. See utils.peak_utils.segment_peaks_agglomeration.

Return type

None

Returns

None

Raises

ValueError – If neither model nor weight_path is provided.

propagate_values(key, prop_to='both')

Propagates metric value to other keys in ConditionArray.

Parameters
  • key (Tuple[str]) – Key to metric containing the values to propagate

  • prop_to (str, default: 'both') – Define the keys to propagate values to. Options are ‘channel’, ‘region’, or ‘both’.

Return type

None

Returns

None
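One way to picture propagation is as copying the 2D values stored under one (region, channel, metric) key into the same metric slot of other regions and channels. The sketch below uses integer indices on a plain array; the axis order and the exact meaning of each prop_to option are assumptions, and `propagate` is a hypothetical helper.

```python
import numpy as np

# Assumed layout: (region, channel, metric, cell, frame).
arr = np.zeros((2, 2, 2, 3, 4))

# Values exist only under region 0, channel 0, metric 1.
src = (0, 0, 1)
arr[src] = np.arange(12).reshape(3, 4)


def propagate(arr, src, prop_to='both'):
    """Copy the values at `src` to the same metric in other keys."""
    r, c, m = src
    values = arr[r, c, m].copy()
    if prop_to == 'channel':
        arr[r, :, m] = values      # all channels of the same region
    elif prop_to == 'region':
        arr[:, c, m] = values      # all regions of the same channel
    elif prop_to == 'both':
        arr[:, :, m] = values      # every region and channel
    else:
        raise ValueError(f'unknown prop_to: {prop_to!r}')


propagate(arr, src, prop_to='both')
print(arr[1, 1, 1])
```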

property regions

Names of the regions in the data array.

remove_parents(parent_daughter, cell_index)

Returns 1D boolean mask to remove traces of cells that divided. Daughter cell traces are kept. Use with reshape_mask to remove parent cells. Typically called by Extract.

Parameters
  • parent_daughter (Dict) – Dictionary of parent_label : daughter_label

  • cell_index (Dict) – Dictionary of cell_label : cell_index_in_array

Returns

Boolean mask with False in the rows of parent cells

Return type

np.ndarray
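The mask described here can be sketched directly from the two dictionaries. The labels and indices below are made up for the example.

```python
import numpy as np

parent_daughter = {3: 7, 5: 9}                 # parent_label : daughter_label
cell_index = {1: 0, 3: 1, 5: 2, 7: 3, 9: 4}    # cell_label : row in the array

# Start with every cell kept, then switch off the rows of parent cells.
mask = np.ones(len(cell_index), dtype=bool)
for parent in parent_daughter:
    if parent in cell_index:
        mask[cell_index[parent]] = False

print(mask)
```

Daughter cells 7 and 9 keep their rows; only the traces of parents 3 and 5 are marked False.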

remove_short_traces(min_trace_length)

Removes cells with fewer than min_trace_length non-nan values. Uses label as the metric to determine non-nan values. Typically called by Extract.

Parameters

min_trace_length (int) – Shortest trace that should not be deleted

Returns

array with short traces removed

Return type

np.ndarray
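A minimal sketch of the length criterion, using a stand-in for the 2D 'label' metric (cells x frames):

```python
import numpy as np

# Stand-in 'label' values; NaN marks frames where the cell is absent.
label = np.array([[1.0, 1.0, 1.0, 1.0],
                  [np.nan, 2.0, 2.0, np.nan],
                  [np.nan, np.nan, 3.0, np.nan]])

min_trace_length = 2

# Count non-NaN frames per cell and keep cells that reach the minimum.
n_valid = (~np.isnan(label)).sum(axis=1)
keep = n_valid >= min_trace_length
print(n_valid, keep)
```

A trace with exactly min_trace_length valid frames is kept, matching "shortest trace that should not be deleted".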

reshape_mask(mask)

Takes in a 1D, 2D, or 5D boolean mask and casts to tuple of 5-dimensional indices. Use this to apply a 1D or 2D mask to self._arr.

Note

  • Always assumes that filtering is to happen in cell axis.

Parameters

mask (ndarray) – Boolean mask to be used as filter.

Returns

Indices that can be used to index ConditionArray

Return type

Tuple
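The idea of turning a low-dimensional mask into a 5-dimensional index can be sketched as follows. `to_cell_index` is a hypothetical helper, the cell axis is assumed to be the fourth axis, and collapsing a 2D mask with any() is an assumption; the library may reduce it differently.

```python
import numpy as np


def to_cell_index(mask):
    """Build a 5-tuple that applies `mask` along the assumed cell axis."""
    mask = np.asarray(mask, dtype=bool)
    if mask.ndim == 2:
        # Assumption: a cell is selected if any of its frames is True.
        mask = mask.any(axis=1)
    elif mask.ndim != 1:
        raise ValueError('this sketch handles only 1D and 2D masks')
    return (slice(None), slice(None), slice(None), mask, slice(None))


arr = np.zeros((1, 2, 3, 4, 5))
idx = to_cell_index([True, False, True, False])
print(arr[idx].shape)
```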

save(path)

Saves ConditionArray to an hdf5 file.

Parameters

path (str) – Absolute path to save the file.

Return type

None

Returns

None

set_condition(condition)

Updates name of the ConditionArray.

Parameters

condition (str) – New name of the ConditionArray

Return type

None

Returns

None

set_position_id(pos=None)

Adds unique identifiers to the cells in ConditionArray. Typically called by Pipeline or ExperimentArray.

Parameters

pos (Optional[int], default: None) – Integer or string identifying the position

Return type

None

Returns

None

set_time(time)

Define the time axis. Time can be given as a frame interval or an array specifying the time for each frame.

Parameters

time (Union[float, ndarray]) – If int or float, designates time between frames. If array, marks the frame time points.

Return type

None

Returns

None
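The two accepted forms of time can be sketched as below. `make_time_axis` is a hypothetical helper, and starting the axis at 0 for the interval form is an assumption.

```python
import numpy as np


def make_time_axis(time, n_frames):
    """Return one time point per frame from an interval or an array."""
    if np.isscalar(time):
        # Scalar: time between consecutive frames.
        return np.arange(n_frames) * float(time)
    # Array: explicit time point for every frame.
    time = np.asarray(time, dtype=float)
    if time.shape != (n_frames,):
        raise ValueError('time must have one entry per frame')
    return time


print(make_time_axis(0.5, 4))
print(make_time_axis([0, 1, 2, 4], 4))
```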

property shape

Shape of the data array

class celltk.core.arrays.ExperimentArray(arrays=None, name=None, time=None)

Base class to create arrays that can store an almost arbitrary number of ConditionArrays. Typically made by Orchestrator.build_experiment_file()

Parameters
  • arrays (Optional[List[ConditionArray]], default: None) –

  • name (Optional[str], default: None) –

  • time (Optional[float], default: None) –

add_metric_slots(name)

Expands each ConditionArray to make room for more metrics.

Note

  • This method must be used before attempting to add new metrics to the ConditionArray.

Parameters

name (List[str]) – List of names of the metrics to add

Return type

None

Returns

None

property channels

Returns list of the names of the channels in each ConditionArray.

property conditions: List

Returns list of the name of each ConditionArray.

Return type

List

property coordinates

Returns the names of the coordinates of each ConditionArray.

property dtype

Returns list of the data type of each ConditionArray.

filter_cells(mask=None, key=None, delete=True, *args, **kwargs)

Uses an arbitrary mask or a saved mask (key) to filter cells from each ConditionArray. If delete is True, the underlying data are changed. Otherwise, the filtered data are only returned.

Parameters
  • mask (Optional[List[ndarray]], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.

  • key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overrides mask if provided.

  • delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.

  • args – Passed to filtering function.

  • kwargs – Passed to filtering function.

Returns

Array with cells designated by mask or key removed.

Return type

np.ndarray

filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)

Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use

Parameters
  • value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.

  • metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.

  • thresholds (Collection[str]) – Lower threshold for the metrics given. If a peak metric is less than the threshold, it will be removed. Must be the same length as metrics.

  • kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.

  • peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key

  • propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray

Return type

None

Returns

None

generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, individual=True, *args, **kwargs)

Generates a boolean mask for each ConditionArray based on an arbitrary filter.

Parameters
  • function – If str, name of function in filter_utils. Otherwise, should be a Callable that inputs a 2D array and returns a 2D boolean array.

  • metric – Name of metric to use. Can be any key in the array.

  • region – Name of region to calculate the filter in.

  • channel – Name of channel to calculate filter in.

  • frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.

  • key – If given, saves the mask in ConditionArray as key.

  • individual – If true, the filter is calculated for each ConditionArray independently. Otherwise, calculated on the whole data set, then applied to ConditionArrays.

  • args – passed to function

  • kwargs – passed to function

Returns

List of 2D boolean arrays that mask cells outside the filter
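The difference made by individual can be sketched with a data-dependent filter. `above_median` is a hypothetical filter written for this example, and the two small arrays stand in for the 2D data of two conditions.

```python
import numpy as np

# Stand-in 2D data (cells x frames) for two conditions.
cond_a = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
cond_b = np.array([[10.0, 10.0], [20.0, 20.0]])
arrays = [cond_a, cond_b]


def above_median(values, reference):
    """True for cells whose mean exceeds the median cell mean of `reference`."""
    keep = values.mean(axis=1) > np.median(reference.mean(axis=1))
    return np.broadcast_to(keep[:, None], values.shape).copy()


# individual=True analogue: each condition supplies its own reference.
individual = [above_median(a, a) for a in arrays]

# individual=False analogue: the reference is the pooled data set.
pooled_ref = np.concatenate(arrays, axis=0)
pooled = [above_median(a, pooled_ref) for a in arrays]

print([m[:, 0].tolist() for m in individual])
print([m[:, 0].tolist() for m in pooled])
```

With a pooled reference, a condition with uniformly low values can lose all of its cells, which is the practical reason to choose one mode over the other.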

interpolate_nans(keys=None)

Linear interpolation of nans for each cell in each ConditionArray. Modification is done in-place.

Parameters

keys (Optional[Collection[tuple]], default: None) – keys that will have nans removed. Each key should be a tuple of strings with length=3

Return type

None

Returns

None

items()

Use to iterate through the key and array for each ConditionArray.

keys()

Use to iterate through all the keys in ExperimentArray.

classmethod load(path)

Load an ExperimentArray from an hdf5 file.

Parameters

path (str) – Path to the hdf5 file

Returns

ExperimentArray

Return type

ExperimentArray

load_condition(array, name=None, pos_id=None)

Adds a ConditionArray to the ExperimentArray from an hdf5 file. The new ConditionArray gets saved as name + pos_id if provided, otherwise uses the name saved in the hdf5 file.

Parameters
  • array (Union[str, ConditionArray]) – ConditionArray or path to the hdf5 file with ConditionArray

  • name (Optional[str], default: None) – Name of the ConditionArray to be loaded.

  • pos_id (Optional[int], default: None) – Unique identifier for the ConditionArray.

Return type

None

Returns

None

mark_active_cells(key, thres=1, propagate=True)

Uses peak labels to mark in what frames cells are active in each ConditionArray.

Parameters
  • key (Tuple[int, str]) – Key defining peak labels

  • thres (float, default: 1) – Leave as 1, not currently used

  • propagate (bool, default: True) – if True, propagate active marks to other keys in ConditionArray

Return type

None

Returns

None

merge_conditions()

Concatenate ConditionArrays with matching conditions. If no arrays have matching conditions, nothing is done. If matching conditions are found, looks for position map to label each uniquely, or will just number them in the order that they were saved in the ExperimentArray. Arrays are concatenated along the cell axis.

Return type

None

Returns

None

Note

  • Any masks that have been saved in the individual ConditionArrays will be lost.
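Concatenation of matching conditions along the cell axis can be sketched with plain arrays. The condition names are made up, the cell axis is assumed to be the fourth axis, and the position labelling step described above is left out.

```python
from collections import defaultdict

import numpy as np

# (condition name, stand-in 5D data) in the order they were added.
arrays = [
    ('control', np.zeros((1, 1, 1, 3, 5))),
    ('drug', np.ones((1, 1, 1, 2, 5))),
    ('control', np.zeros((1, 1, 1, 4, 5))),
]

# Group arrays that share a condition name.
groups = defaultdict(list)
for name, arr in arrays:
    groups[name].append(arr)

# Concatenate each group along the (assumed) cell axis.
merged = {name: np.concatenate(group, axis=3) for name, group in groups.items()}

for name, arr in merged.items():
    print(name, arr.shape)
```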

property metrics

Returns list of the names of the metrics in each ConditionArray.

property ncells

Returns list of the number of cells in each ConditionArray.

property ndim

Returns list of the number of dimensions of each ConditionArray.

predict_peaks(key, weight_path, propagate=True, segment=True, **kwargs)

Uses a UNet-based neural net to predict peaks in the traces defined by key in each ConditionArray. Adds two keys to ConditionArrays, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.

Parameters
  • key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.

  • weight_path (str) – Path to the model weights to use for a new UPeakModel instance

  • propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray

  • segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.

  • kwargs – Passed to segmentation function.

Return type

None

Returns

None

property regions

Returns list of the names of the regions in each ConditionArray.

remove_empty_sites()

Removes all sites that have one or more empty dimensions.

Return type

None

Returns

None
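"Empty dimension" can be read as an axis of length zero. A sketch under that assumption, with made-up site names and plain arrays in place of ConditionArrays:

```python
import numpy as np

sites = {
    'A1': np.zeros((1, 1, 1, 5, 10)),
    'A2': np.zeros((1, 1, 1, 0, 10)),   # no cells were found at this site
    'A3': np.zeros((1, 1, 1, 2, 10)),
}

# Keep only sites where every axis has at least one element.
kept = {name: arr for name, arr in sites.items() if 0 not in arr.shape}
print(list(kept))
```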

remove_short_traces(min_trace_length=0)

Applies a filter to each condition to remove cells with fewer non-nan frames than min_trace_length. The ‘label’ metric is used for determining non-nan frames.

Parameters

min_trace_length (int, default: 0) – Minimum number of non-nan frames allowed

Return type

None

Returns

None

save(path)

Saves all the Conditions in Experiment to an hdf5 file.

Loads the hdf5 file for each condition and then saves them in a single hdf5 file at path. Runs merge_conditions() first to ensure data doesn’t get overwritten.

Parameters

path (str) – Path to the location where file should be saved.

Return type

None

Returns

None

Raises

ValueError – If any cell or frame is greater than 2 ** 16.

set_conditions(condition_map={})

Updates names of all of the ConditionArrays

Parameters

condition_map (Dict[str, str], default: {}) – Dict of current_name : new_name for each ConditionArray

Return type

None

Returns

None
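The renaming that condition_map describes can be sketched on a list of names. Leaving names that are missing from the map unchanged is an assumption of this sketch.

```python
# Current names of the arrays, in storage order (made up for the example).
names = ['A1', 'A2', 'B1']

# current_name : new_name
condition_map = {'A1': 'control', 'A2': 'control'}

# Names without an entry in the map are kept as they are (assumed).
new_names = [condition_map.get(name, name) for name in names]
print(new_names)
```

Two arrays can end up with the same name this way, which is the situation merge_conditions() is described as handling.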

set_time(time=None)

Define the time axis in each ConditionArray. Time can be given as a frame interval or an array specifying the time for each frame.

Parameters

time (Optional[float], default: None) – If int or float, designates time between frames. If array, marks the frame time points.

Return type

None

Returns

None

property shape

Returns list of the shape of each ConditionArray.

property time

Returns list of the time axis of each ConditionArray.

values()

Use to iterate through all of the ConditionArrays.