Arrays

class celltk.core.arrays.ConditionArray(regions=['nuc'], channels=['TRITC'], metrics=['label'], cells=[0], frames=[0], name='default', time=None, pos_id=0)

Stores the results from a single condition in an experiment. For now, this class is only built by Extract; it is not yet meant to be built directly by the user.

Parameters

regions (List[str], default: ['nuc']) –
channels (List[str], default: ['TRITC']) –
metrics (List[str], default: ['label']) –
cells (List[int], default: [0]) –
frames (List[int], default: [0]) –
name (str, default: 'default') –
time (Union[float, ndarray, None], default: None) –
pos_id (int, default: 0) –

add_metric_slots(name)

Expands the ConditionArray to make room for more metrics.

Note

This method must be used before attempting to add new metrics to the ConditionArray.

Parameters: name (List[str]) – List of names of the metrics to add
Return type: None
Returns: None
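
The idea of making room for new metrics can be illustrated with plain NumPy. This is only a sketch, not the celltk implementation: it assumes the five axes are ordered (regions, channels, metrics, cells, frames), which the text above does not state, and the helper name expand_metric_axis is hypothetical.

```python
import numpy as np

# Assumption: axes are (regions, channels, metrics, cells, frames).
METRIC_AXIS = 2


def expand_metric_axis(arr, metrics, new_names):
    """Return a copy of arr with one NaN-filled slot per new metric name."""
    pad_shape = list(arr.shape)
    pad_shape[METRIC_AXIS] = len(new_names)
    padding = np.full(pad_shape, np.nan, dtype=arr.dtype)
    expanded = np.concatenate([arr, padding], axis=METRIC_AXIS)
    return expanded, list(metrics) + list(new_names)


# 1 region, 1 channel, 1 metric ('label'), 4 cells, 10 frames
arr = np.zeros((1, 1, 1, 4, 10))
expanded, names = expand_metric_axis(arr, ['label'], ['peaks', 'active'])
```

The new slots start out as NaN, so they can be told apart from measured values until they are filled.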

property channels: Names of the channels in the data array.

property condition: Name of the data array

property coordinate_dimensions: Dictionary with coordinate names as keys and coordinate lengths as values

property coordinates: Names of the coordinates (axes) in the data array

property dtype: Data type of the data array. Usually double.

filter_cells(mask=None, key=None, delete=True, *args, **kwargs)

Uses an arbitrary mask or a saved mask (key) to filter cells from the data. If delete is True, the underlying array is changed; otherwise, the filtered data are only returned.

Parameters

mask (Optional[ndarray], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.
key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overwrites mask if provided.
delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.

Returns

Array with the cells designated by mask or key removed.

Return type

np.ndarray

filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)

Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use

Parameters

value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.
metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.
thresholds (Collection[str]) – Lower threshold for each of the metrics given. A peak is removed if its metric is less than the corresponding threshold. Must be the same length as metrics.
kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.
peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key
propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray

Return type

None

Returns

None

generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, *args, **kwargs)

Generate a mask to remove cells using an arbitrary filter.

Parameters

function – If str, name of a function in filter_utils. Otherwise, should be a Callable that takes a 2D array and returns a 2D boolean array.
metric – Name of metric to use. Can be any key in the array.
region – Name of region to calculate the filter in.
channel – Name of channel to calculate filter in.
frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.
key – If given, saves the mask in ConditionArray as key.
args – passed to function
kwargs – passed to function

Returns

2D boolean array that masks cells outside filter
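
A custom filter passed as function only has to take a 2D array and return a 2D boolean array of the same shape. A minimal sketch of such a Callable follows. The name in_range and its low/high arguments are hypothetical, and treating True as "keep" is an assumption based on remove_parents, which puts False in the rows to be removed.

```python
import numpy as np


def in_range(values, low=0.0, high=1.0):
    """Return a 2D boolean array, True where low <= value <= high.

    NaN entries compare as False, so they fall outside the filter.
    """
    values = np.asarray(values, dtype=float)
    with np.errstate(invalid='ignore'):
        return (values >= low) & (values <= high)


# Rows are cells, columns are frames.
traces = np.array([[0.2, 0.5, 0.9],
                   [0.1, 1.5, 0.3],
                   [np.nan, 0.4, 0.6]])
mask = in_range(traces, low=0.0, high=1.0)
```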

get_mask(key)

Returns a saved mask. Masks are generated and saved using ConditionArray().generate_mask().

Parameters: key (str) – Name of mask to retrieve
Returns: The saved boolean mask
Return type: np.ndarray

interpolate_nans(keys=None)

Linear interpolation of nans for each cell. Modification is done in-place.

Parameters: keys (Optional[Collection[tuple]], default: None) – keys that will have nans removed. Each key should be a tuple of strings with length=3
Return type: None
Returns: None
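
Linear interpolation of nans along a single trace can be sketched with numpy.interp. This is an illustration under assumptions, not the library's code: it returns a copy instead of modifying in-place, and it holds the nearest valid value constant at the edges of a trace (the default numpy.interp behavior), which may differ from how celltk treats leading or trailing nans.

```python
import numpy as np


def interpolate_trace(trace):
    """Return a copy of a 1D trace with NaNs filled by linear interpolation."""
    trace = np.asarray(trace, dtype=float).copy()
    nans = np.isnan(trace)
    if nans.all() or not nans.any():
        return trace  # nothing to interpolate from, or nothing to fill
    idx = np.arange(trace.size)
    trace[nans] = np.interp(idx[nans], idx[~nans], trace[~nans])
    return trace


# Rows are cells, columns are frames.
traces = np.array([[1.0, np.nan, 3.0, np.nan, 5.0],
                   [np.nan, 2.0, np.nan, np.nan, 8.0]])
filled = np.vstack([interpolate_trace(t) for t in traces])
```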

property keys: All the possible keys that can be used to index the data array.

classmethod load(path)

Load an hdf5 file and convert to a ConditionArray.

Parameters: path (str) – Path to the hdf5 file to be loaded.
Returns: Loaded ConditionArray
Return type: ConditionArray

mark_active_cells(key, thres=1, propagate=True)

Uses peak labels to mark the frames in which cells are active.

Parameters

key (Tuple[int, str]) – Key defining peak labels
thres (float, default: 1) – Leave as 1, not currently used
propagate (bool, default: True) – if True, propagate active marks to other keys in ConditionArray

Return type

None

Returns

None

property metrics: Names of the metrics in the data array.

property ncells: Number of cells present in array

property ndim: Number of dimensions in the data array. Should equal 5.

predict_peaks(key, model=None, weight_path=None, propagate=True, segment=True, **kwargs)

Uses a UNet-based neural net to predict peaks in the traces defined by key. Adds two keys to ConditionArray, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.

Parameters

key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.
model (Optional[UPeakModel], default: None) – An instantiated UPeakModel to use
weight_path (Optional[str], default: None) – Path to the model weights to use for a new UPeakModel instance
propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray
segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.
kwargs – Passed to segmentation function. See utils.peak_utils.segment_peaks_agglomeration.

Return type

None

Returns

None

Raises

ValueError – If neither model nor weight_path is provided.

propagate_values(key, prop_to='both')

Propagates metric value to other keys in ConditionArray.

Parameters

key (Tuple[str]) – Key to metric containing the values to propagate
prop_to (str, default: 'both') – Define the keys to propagate values to. Options are ‘channel’, ‘region’, or ‘both’.

Return type

None

Returns

None

property regions: Names of the regions in the data array.

remove_parents(parent_daughter, cell_index)

Returns 1D boolean mask to remove traces of cells that divided. Daughter cell traces are kept. Use with reshape_mask to remove parent cells. Typically called by Extract.

Parameters

parent_daughter (Dict) – Dictionary of parent_label : daughter_label
cell_index (Dict) – Dictionary of cell_label : cell_index_in_array

Returns

Boolean mask with False in the rows of parent cells

Return type

np.ndarray
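
The two dictionaries combine into a mask roughly as follows. This is a sketch only: the function name parent_mask and the n_cells argument are hypothetical, and the labels are made-up example values.

```python
import numpy as np


def parent_mask(parent_daughter, cell_index, n_cells):
    """Return a 1D boolean mask with False in the rows of parent cells."""
    mask = np.ones(n_cells, dtype=bool)
    for parent_label in parent_daughter:
        # Skip parents that have no row in the array.
        if parent_label in cell_index:
            mask[cell_index[parent_label]] = False
    return mask


parent_daughter = {1: 4, 2: 5}                # parent_label : daughter_label
cell_index = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}   # cell_label : row in array
mask = parent_mask(parent_daughter, cell_index, n_cells=5)
```

Cells 1 and 2 divided, so their rows are False, while the daughters (4, 5) and the undivided cell 3 stay True.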

remove_short_traces(min_trace_length)

Removes cells with fewer than min_trace_length non-nan values. Uses label as the metric to determine non-nan values. Typically called by Extract.

Parameters: min_trace_length (int) – Shortest trace that should not be deleted
Returns: array with short traces removed
Return type: np.ndarray
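
The length criterion can be sketched on a 2D (cells x frames) array of label values. The helper name long_enough is hypothetical. A trace with exactly min_trace_length non-nan values is kept, in line with "shortest trace that should not be deleted".

```python
import numpy as np


def long_enough(label, min_trace_length):
    """Return a 1D mask, True for cells with enough non-NaN frames."""
    n_valid = (~np.isnan(label)).sum(axis=1)
    return n_valid >= min_trace_length


nan = np.nan
label = np.array([[1.0, 1.0, 1.0, 1.0],   # 4 valid frames
                  [2.0, nan, nan, 2.0],   # 2 valid frames
                  [nan, nan, 3.0, nan]])  # 1 valid frame
keep = long_enough(label, min_trace_length=2)
trimmed = label[keep]
```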

reshape_mask(mask)

Takes in a 1D, 2D, or 5D boolean mask and casts to tuple of 5-dimensional indices. Use this to apply a 1D or 2D mask to self._arr.

Note

Always assumes that filtering is to happen in cell axis.

Parameters: mask (ndarray) – Boolean mask to be used as filter.
Returns: Indices that can be used to index ConditionArray
Return type: Tuple
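
For the 1D case, casting a mask to 5-dimensional indices amounts to placing the mask on the cell axis and a full slice on every other axis. The sketch below assumes the cell axis is axis 3 of (regions, channels, metrics, cells, frames), which is not confirmed above, and mask_to_indices is a hypothetical name. The 2D and 5D cases are not covered.

```python
import numpy as np

CELL_AXIS = 3  # assumption: (regions, channels, metrics, cells, frames)


def mask_to_indices(mask, ndim=5, cell_axis=CELL_AXIS):
    """Cast a 1D boolean cell mask to a tuple of ndim indices."""
    mask = np.asarray(mask, dtype=bool)
    if mask.ndim != 1:
        raise ValueError('This sketch only handles 1D masks.')
    indices = [slice(None)] * ndim
    indices[cell_axis] = mask
    return tuple(indices)


# 1 region, 1 channel, 2 metrics, 4 cells, 3 frames
data = np.arange(24, dtype=float).reshape(1, 1, 2, 4, 3)
keep = np.array([True, False, True, True])
kept = data[mask_to_indices(keep)]
```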

save(path)

Saves ConditionArray to an hdf5 file.

Parameters: path (str) – Absolute path to save the file.
Return type: None
Returns: None

set_condition(condition)

Updates name of the ConditionArray.

Parameters: condition (str) – New name of the ConditionArray
Return type: None
Returns: None

set_position_id(pos=None)

Adds unique identifiers to the cells in ConditionArray. Typically called by Pipeline or ExperimentArray.

Parameters: pos (Optional[int], default: None) – Integer or string identifying the position
Return type: None
Returns: None

set_time(time)

Define the time axis. Time can be given as a frame interval or an array specifying the time for each frame.

Parameters: time (Union[float, ndarray]) – If int or float, designates time between frames. If array, marks the frame time points.
Return type: None
Returns: None
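
The two accepted forms of time can be illustrated as below. This is a sketch: build_time_axis and n_frames are hypothetical names, and starting the axis at 0 for the interval case is an assumption.

```python
import numpy as np


def build_time_axis(time, n_frames):
    """Return one time point per frame from an interval or an array."""
    if np.isscalar(time):
        # Scalar: time between frames, assumed to start at 0.
        return np.arange(n_frames) * float(time)
    time = np.asarray(time, dtype=float)
    if time.shape != (n_frames,):
        raise ValueError('time must have one entry per frame.')
    return time


from_interval = build_time_axis(5, n_frames=4)
from_array = build_time_axis([0.0, 2.0, 7.0, 9.0], n_frames=4)
```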

property shape: Shape of the data array

class celltk.core.arrays.ExperimentArray(arrays=None, name=None, time=None)

Base class to create arrays that can store an almost arbitrary number of ConditionArrays. Typically made by Orchestrator.build_experiment_file().

Parameters

arrays (Optional[List[ConditionArray]], default: None) –
name (Optional[str], default: None) –
time (Optional[float], default: None) –

add_metric_slots(name)

Expands each ConditionArray to make room for more metrics.

Note

This method must be used before attempting to add new metrics to the ConditionArray.

Parameters: name (List[str]) – List of names of the metrics to add
Return type: None
Returns: None

property channels: Returns list of the names of the channels in each ConditionArray.

property conditions: Returns list of the name of each ConditionArray.

property coordinates: Returns the names of the coordinates of each ConditionArray.

property dtype: Returns list of the data type of each ConditionArray.

filter_cells(mask=None, key=None, delete=True, *args, **kwargs)

Uses an arbitrary mask or a saved mask (key) to filter cells from each ConditionArray. If delete is True, the underlying data are changed. Otherwise, the filtered data are only returned.

Parameters

mask (Optional[List[ndarray]], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.
key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overwrites mask if provided.
delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.
args – Passed to filtering function.
kwargs – Passed to filtering function.

Returns

Array with the cells designated by mask or key removed.

Return type

np.ndarray

filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)

Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use

Parameters

value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.
metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.
thresholds (Collection[str]) – Lower threshold for each of the metrics given. A peak is removed if its metric is less than the corresponding threshold. Must be the same length as metrics.
kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.
peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key
propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray

Return type

None

Returns

None

generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, individual=True, *args, **kwargs)

Generates a boolean mask for each ConditionArray based on an arbitrary filter.

Parameters

function – If str, name of a function in filter_utils. Otherwise, should be a Callable that takes a 2D array and returns a 2D boolean array.
metric – Name of metric to use. Can be any key in the array.
region – Name of region to calculate the filter in.
channel – Name of channel to calculate filter in.
frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.
key – If given, saves the mask in ConditionArray as key.
individual – If true, the filter is calculated for each ConditionArray independently. Otherwise, calculated on the whole data set, then applied to ConditionArrays.
args – passed to function
kwargs – passed to function

Returns

List of 2D boolean arrays that mask cells outside filter
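
The effect of individual shows up when the filter depends on the data it is given. The sketch below uses a hypothetical median threshold on two made-up (cells x frames) arrays, once computed per array and once computed on the pooled data and then applied to each array. It illustrates the distinction only and does not reproduce the library's code.

```python
import numpy as np

# Two conditions, each cells x frames.
cond_a = np.array([[1.0, 2.0], [3.0, 4.0]])
cond_b = np.array([[10.0, 20.0], [30.0, 40.0]])
conditions = (cond_a, cond_b)

# individual=True: the threshold is computed separately for each array.
individual_masks = [arr > np.median(arr) for arr in conditions]

# individual=False: one threshold from all the data, applied to each array.
pooled_threshold = np.median(np.concatenate(conditions, axis=0))
pooled_masks = [arr > pooled_threshold for arr in conditions]
```

With per-array thresholds each condition keeps its own upper half, whereas the pooled threshold of 7.0 marks every value in cond_b and none in cond_a.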

interpolate_nans(keys=None)

Linear interpolation of nans for each cell in each ConditionArray. Modification is done in-place.

Parameters: keys (Optional[Collection[tuple]], default: None) – keys that will have nans removed. Each key should be a tuple of strings with length=3
Return type: None
Returns: None

items(): Use to iterate through the key and array for each ConditionArray.

keys(): Use to iterate through all the keys in ExperimentArray.

classmethod load(path)

Load an ExperimentArray from an hdf5 file.

Parameters: path (str) – Path to the hdf5 file
Returns: ExperimentArray
Return type: ExperimentArray

load_condition(array, name=None, pos_id=None)

Adds a ConditionArray to the ExperimentArray from an hdf5 file. The new ConditionArray gets saved as name + pos_id if provided, otherwise uses the name saved in the hdf5 file.

Parameters

array (Union[str, ConditionArray]) – ConditionArray or path to the hdf5 file with ConditionArray
name (Optional[str], default: None) – Name of the ConditionArray to be loaded.
pos_id (Optional[int], default: None) – Unique identifier for the ConditionArray.

Return type

None

Returns

None

mark_active_cells(key, thres=1, propagate=True)

Uses peak labels to mark the frames in which cells are active in each ConditionArray.

Parameters

key (Tuple[int, str]) – Key defining peak labels
thres (float, default: 1) – Leave as 1, not currently used
propagate (bool, default: True) – if True, propagate active marks to other keys in ConditionArray

Return type

None

Returns

None

merge_conditions()

Concatenate ConditionArrays with matching conditions. If no arrays have matching conditions, nothing is done. If matching conditions are found, looks for position map to label each uniquely, or will just number them in the order that they were saved in the ExperimentArray. Arrays are concatenated along the cell axis.

Return type: None
Returns: None

Note

Any masks that have been saved in the individual ConditionArrays will be lost.
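
Concatenation along the cell axis for matching condition names can be sketched with plain arrays. The assumptions are labeled in the code: the cell axis is taken to be axis 3, merge_by_condition is a hypothetical helper, and the position labelling described above is left out.

```python
import numpy as np
from collections import defaultdict

CELL_AXIS = 3  # assumption: (regions, channels, metrics, cells, frames)


def merge_by_condition(named_arrays):
    """Concatenate arrays that share a condition name along the cell axis."""
    grouped = defaultdict(list)
    for name, arr in named_arrays:
        grouped[name].append(arr)
    return {name: np.concatenate(arrs, axis=CELL_AXIS)
            for name, arrs in grouped.items()}


sites = [('control', np.zeros((1, 1, 1, 3, 5))),   # 3 cells
         ('control', np.ones((1, 1, 1, 2, 5))),    # 2 cells
         ('drug', np.ones((1, 1, 1, 4, 5)))]       # 4 cells
merged = merge_by_condition(sites)
```

All arrays that share a name must agree on every axis except the cell axis, otherwise numpy.concatenate raises a ValueError.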

property metrics: Returns list of the names of the metrics in each ConditionArray.

property ncells: Returns list of the number of cells in each ConditionArray.

property ndim: Returns list of the number of dimensoins of each ConditionArray.

predict_peaks(key, weight_path, propagate=True, segment=True, **kwargs)

Uses a UNet-based neural net to predict peaks in the traces defined by key in each ConditionArray. Adds two keys to ConditionArrays, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.

Parameters

key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.
weight_path (str) – Path to the model weights to use for a new UPeakModel instance
propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray
segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.
kwargs – Passed to segmentation function.

Return type

None

Returns

None

property regions: Returns list of the names of the regions in each ConditionArray.

remove_empty_sites()

Removes all sites that have one or more empty dimensions.

Return type: None
Returns: None

remove_short_traces(min_trace_length=0)

Applies a filter to each condition to remove cells with fewer non-nan frames than min_trace_length. The ‘label’ metric is used for determining non-nan frames.

Parameters: min_trace_length (int, default: 0) – Minimum number of non-nan frames allowed
Return type: None
Returns: None

save(path)

Saves all the Conditions in Experiment to an hdf5 file.

Loads the hdf5 file for each condition and then saves them in a single hdf5 file at path. Runs merge_conditions() first to ensure data doesn’t get overwritten.

Parameters: path (str) – Path to the location where file should be saved.
Return type: None
Returns: None
Raises: ValueError – If any cell or frame is greater than 2 ** 16.

set_conditions(condition_map={})

Updates names of all of the ConditionArrays

Parameters

condition_map (Dict[str, str], default: {}) – Dict of current_name : new_name for each ConditionArray

Return type

None

Returns

None

set_time(time=None)

Define the time axis in each ConditionArray. Time can be given as a frame interval or an array specifying the time for each frame.

Parameters: time (Optional[float], default: None) – If int or float, designates time between frames. If array, marks the frame time points.
Return type: None
Returns: None

property shape: Returns list of the shape of each ConditionArray.

property time: Returns list of the time axis of each ConditionArray.

values(): Use to iterate through all of the ConditionArrays.