Arrays
- class celltk.core.arrays.ConditionArray(regions=['nuc'], channels=['TRITC'], metrics=['label'], cells=[0], frames=[0], name='default', time=None, pos_id=0)
Stores the results from a single condition in an experiment. For now, this class is built only by Extract and is not yet meant to be built directly by the user.
- Parameters
regions (List[str], default: ['nuc'])
channels (List[str], default: ['TRITC'])
metrics (List[str], default: ['label'])
cells (List[int], default: [0])
frames (List[int], default: [0])
name (str, default: 'default')
time (Union[float, ndarray, None], default: None)
pos_id (int, default: 0)
- add_metric_slots(name)
Expands the ConditionArray to make room for more metrics.
Note
This method must be used before attempting to add new metrics to the ConditionArray.
- Parameters
name (List[str]) – List of names of the metrics to add
- Return type
None
- Returns
None
- property channels
Names of the channels in the data array.
- property condition
Name of the data array
- property coordinate_dimensions
Dictionary with coordinate names as keys and coordinate lengths as values
- property coordinates
Names of the coordinates (axes) in the data array
- property dtype
Data type of the data array. Usually double.
- filter_cells(mask=None, key=None, delete=True, *args, **kwargs)
Uses an arbitrary mask or a saved mask (key) to filter cells from the data. If delete is True, the underlying array is changed; otherwise, the filtered data are only returned.
- Parameters
mask (Optional[ndarray], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.
key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overrides mask if provided.
delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.
- Returns
Array with cells designated by mask or key removed.
- Return type
np.ndarray
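The effect of delete can be illustrated with a minimal sketch. TinyArray is a hypothetical stand-in, not part of celltk, and the sketch assumes that True in the mask marks a cell to keep; the library's actual mask polarity and storage may differ.

```python
from typing import List, Sequence


class TinyArray:
    """Hypothetical stand-in for ConditionArray: one entry per cell."""

    def __init__(self, cells: Sequence[str]) -> None:
        self.cells: List[str] = list(cells)

    def filter_cells(self, mask: Sequence[bool], delete: bool = True) -> List[str]:
        # Assumption: True in the mask means "keep this cell".
        kept = [cell for cell, keep in zip(self.cells, mask) if keep]
        if delete:
            # Mirrors delete=True: the stored data are changed as well.
            self.cells = kept
        return kept


arr = TinyArray(["cell_0", "cell_1", "cell_2"])
preview = arr.filter_cells([True, False, True], delete=False)
print(preview, len(arr.cells))  # filtered copy is returned, stored cells untouched
arr.filter_cells([True, False, True], delete=True)
print(len(arr.cells))  # stored cells are now filtered too
```

Calling with delete=False first is a cheap way to inspect what a mask would remove before committing to it.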
- filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)
Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use
- Parameters
value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.
metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.
thresholds (Collection[str]) – Lower threshold for the metrics given. If a peak metric is less than the threshold, it will be removed. Must be the same length as metrics.
kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.
peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key
propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray
- Return type
None
- Returns
None
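The threshold rule described above (a peak is removed when any metric falls below its lower threshold) can be sketched as follows. This is an illustration only: the function filter_peak_labels, the metric names 'amplitude' and 'width', and the dictionary layout are hypothetical and are not taken from celltk.utils.peak_utils.

```python
from typing import Dict, List, Sequence


def filter_peak_labels(
    peaks: Dict[int, Dict[str, float]],
    metrics: Sequence[str],
    thresholds: Sequence[float],
) -> List[int]:
    """Return the labels of peaks that pass every lower threshold."""
    if len(metrics) != len(thresholds):
        raise ValueError("metrics and thresholds must have the same length")
    kept = []
    for label, values in peaks.items():
        # A peak survives only if no metric is below its threshold.
        if all(values[m] >= t for m, t in zip(metrics, thresholds)):
            kept.append(label)
    return kept


# Hypothetical per-peak measurements keyed by peak label.
peaks = {
    1: {"amplitude": 0.8, "width": 5},
    2: {"amplitude": 0.2, "width": 9},
    3: {"amplitude": 0.9, "width": 1},
}
print(filter_peak_labels(peaks, ["amplitude", "width"], [0.5, 3]))
```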
- generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, *args, **kwargs)
Generate a mask to remove cells using an arbitrary filter.
- Parameters
function – If str, name of function in filter_utils. Otherwise, should be a Callable that inputs a 2D array and returns a 2D boolean array.
metric – Name of metric to use. Can be any key in the array.
region – Name of region to calculate the filter in.
channel – Name of channel to calculate filter in.
frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.
key – If given, saves the mask in ConditionArray as key.
args – passed to function
kwargs – passed to function
- Returns
2D boolean array that masks cells outside filter
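A custom filter passed as function takes a 2D array and returns a 2D boolean array. The sketch below shows one possible shape for such a filter together with the frame_rng rule, using nested lists in place of NumPy arrays. It assumes that a tuple frame_rng means a (start, stop) pair and that True marks a cell to keep; both are assumptions, and select_frames and keep_bright_cells are hypothetical names.

```python
from typing import List, Optional, Sequence, Tuple, Union

FrameRange = Union[int, Tuple[int, int], None]


def select_frames(trace: Sequence[float], frame_rng: FrameRange) -> List[float]:
    """Pick the frames a filter should look at."""
    if frame_rng is None:
        return list(trace)  # use every frame
    if isinstance(frame_rng, int):
        return list(trace[:frame_rng])  # that many frames from the start
    start, stop = frame_rng  # assumption: tuple is (start, stop)
    return list(trace[start:stop])


def keep_bright_cells(
    values: Sequence[Sequence[float]],
    frame_rng: FrameRange = None,
    threshold: float = 1.0,
) -> List[List[bool]]:
    """Return a cells x frames boolean mask; True marks cells to keep."""
    mask = []
    for trace in values:
        window = select_frames(trace, frame_rng)
        keep = all(v >= threshold for v in window)
        # The decision is per cell, so it is repeated across all frames.
        mask.append([keep] * len(trace))
    return mask


values = [
    [2.0, 2.5, 0.1, 3.0],  # bright early, dim in frame 2
    [0.2, 0.3, 4.0, 4.0],  # dim early, bright late
]
print(keep_bright_cells(values, frame_rng=2))
print(keep_bright_cells(values, frame_rng=(2, 4)))
```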
- get_mask(key)
Returns a saved mask. Masks are generated and saved using ConditionArray().generate_mask()
- Parameters
key (str) – Name of mask to retrieve
- Returns
The saved boolean mask
- Return type
np.ndarray
- interpolate_nans(keys=None)
Linear interpolation of nans for each cell. Modification is done in-place.
- Parameters
keys (Optional[Collection[tuple]], default: None) – Keys that will have nans removed. Each key should be a tuple of strings with length=3
- Return type
None
- Returns
None
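For a single trace, in-place linear interpolation of nans might look like the sketch below. It is not the library's implementation: interpolate_trace is a hypothetical helper operating on a plain list, and it leaves leading and trailing nans untouched, which is an assumption about edge handling.

```python
import math
from typing import List


def interpolate_trace(trace: List[float]) -> None:
    """Fill interior runs of nan in place, linearly between valid neighbours."""
    valid = [i for i, v in enumerate(trace) if not math.isnan(v)]
    # Walk over consecutive pairs of valid samples and fill the gap between them.
    for left, right in zip(valid, valid[1:]):
        gap = right - left
        for i in range(left + 1, right):
            frac = (i - left) / gap
            trace[i] = trace[left] + frac * (trace[right] - trace[left])


trace = [1.0, float("nan"), float("nan"), 4.0, float("nan")]
interpolate_trace(trace)
print(trace)  # interior gap is filled; the trailing nan remains
```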
- property keys
All the possible keys that can be used to index the data array.
- classmethod load(path)
Load an hdf5 file and convert to a ConditionArray.
- Parameters
path (str) – Path to the hdf5 file to be loaded.
- Returns
Loaded ConditionArray
- Return type
ConditionArray
- mark_active_cells(key, thres=1, propagate=True)
Uses peak labels to mark in what frames cells are active
- Parameters
key (Tuple[int, str]) – Key defining peak labels
thres (float, default: 1) – Leave as 1, not currently used
propagate (bool, default: True) – If True, propagate active marks to other keys in ConditionArray
- Return type
None
- Returns
None
- property metrics
Names of the metrics in the data array.
- property ncells
Number of cells present in array
- property ndim
Number of dimensions in the data array. Should equal 5.
- predict_peaks(key, model=None, weight_path=None, propagate=True, segment=True, **kwargs)
Uses a UNet-based neural net to predict peaks in the traces defined by key. Adds two keys to ConditionArray, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.
- Parameters
key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.
model (Optional[UPeakModel], default: None) – An instantiated UPeakModel to use
weight_path (Optional[str], default: None) – Path to the model weights to use for a new UPeakModel instance
propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray
segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.
kwargs – Passed to segmentation function. See utils.peak_utils.segment_peaks_agglomeration.
- Return type
None
- Returns
None
- Raises
ValueError – If neither model nor weights are provided.
- propagate_values(key, prop_to='both')
Propagates metric value to other keys in ConditionArray.
- Parameters
key (Tuple[str]) – Key to metric containing the values to propagate
prop_to (str, default: 'both') – Define the keys to propagate values to. Options are ‘channel’, ‘region’, or ‘both’.
- Return type
None
- Returns
None
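One way to read the prop_to options is as a choice of which other keys receive the values. The sketch below enumerates those target keys under two assumptions that this page does not confirm: that a key is ordered (region, channel, metric), and that 'channel' means the other channels of the same region while 'region' means the other regions of the same channel. propagation_targets is a hypothetical helper.

```python
from itertools import product
from typing import List, Sequence, Tuple

Key = Tuple[str, str, str]


def propagation_targets(
    key: Key,
    regions: Sequence[str],
    channels: Sequence[str],
    prop_to: str = "both",
) -> List[Key]:
    """List the keys that would receive the values stored under `key`."""
    region, channel, metric = key  # assumed key order
    if prop_to == "channel":
        targets = [(region, c, metric) for c in channels]
    elif prop_to == "region":
        targets = [(r, channel, metric) for r in regions]
    elif prop_to == "both":
        targets = [(r, c, metric) for r, c in product(regions, channels)]
    else:
        raise ValueError(f"Unknown prop_to option: {prop_to!r}")
    # The source key already holds the values, so it is not a target.
    return [t for t in targets if t != key]


key = ("nuc", "TRITC", "peaks")
print(propagation_targets(key, ["nuc", "cyto"], ["TRITC", "FITC"], "both"))
```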
- property regions
Names of the regions in the data array.
- remove_parents(parent_daughter, cell_index)
Returns 1D boolean mask to remove traces of cells that divided. Daughter cell traces are kept. Use with reshape_mask to remove parent cells. Typically called by Extract.
- Parameters
parent_daughter (Dict) – Dictionary of parent_label : daughter_label
cell_index (Dict) – Dictionary of cell_label : cell_index_in_array
- Returns
Boolean mask with False in the rows of parent cells
- Return type
np.ndarray
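The relationship between the two dictionaries and the returned mask can be sketched in a few lines. This is a simplified illustration with a plain list instead of an ndarray; parent_mask and its explicit ncells argument are hypothetical, since the real method reads the number of cells from the array itself.

```python
from typing import Dict, List


def parent_mask(
    parent_daughter: Dict[int, int],
    cell_index: Dict[int, int],
    ncells: int,
) -> List[bool]:
    """Build a 1D mask with False in the rows of cells that divided."""
    mask = [True] * ncells
    for parent_label in parent_daughter:
        # Only parents that are actually present in the array get masked.
        if parent_label in cell_index:
            mask[cell_index[parent_label]] = False
    return mask


# Cell 3 divided and produced cell 7; cell 5 never divided.
parent_daughter = {3: 7}
cell_index = {3: 0, 5: 1, 7: 2}
print(parent_mask(parent_daughter, cell_index, ncells=3))
```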
- remove_short_traces(min_trace_length)
Removes cells with fewer than min_trace_length non-nan values. Uses label as the metric to determine non-nan values. Typically called by Extract.
- Parameters
min_trace_length (int) – Shortest trace that should not be deleted
- Returns
array with short traces removed
- Return type
np.ndarray
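The length criterion can be expressed as a per-cell count of non-nan label values. The sketch below uses nested lists rather than the 5D array; long_enough is a hypothetical helper, and True marks a cell that would be kept.

```python
import math
from typing import List, Sequence


def long_enough(labels: Sequence[Sequence[float]], min_trace_length: int) -> List[bool]:
    """Return True for cells with at least min_trace_length non-nan frames."""
    return [
        sum(not math.isnan(v) for v in trace) >= min_trace_length
        for trace in labels
    ]


nan = float("nan")
labels = [
    [1.0, 1.0, nan, nan],  # tracked for 2 frames
    [2.0, 2.0, 2.0, 2.0],  # tracked for all 4 frames
    [nan, nan, nan, nan],  # never tracked
]
print(long_enough(labels, min_trace_length=3))
```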
- reshape_mask(mask)
Takes in a 1D, 2D, or 5D boolean mask and casts to tuple of 5-dimensional indices. Use this to apply a 1D or 2D mask to self._arr.
Note
Always assumes that filtering is to happen in cell axis.
- Parameters
mask (ndarray) – Boolean mask to be used as filter.
- Returns
Indices that can be used to index ConditionArray
- Return type
Tuple
- save(path)
Saves ConditionArray to an hdf5 file.
- Parameters
path (str) – Absolute path to save the file.
- Return type
None
- Returns
None
- set_condition(condition)
Updates name of the ConditionArray.
- Parameters
condition (str) – New name of the ConditionArray
- Return type
None
- Returns
None
- set_position_id(pos=None)
Adds unique identifiers to the cells in ConditionArray. Typically called by Pipeline or ExperimentArray.
- Parameters
pos (Optional[int], default: None) – Integer or string identifying the position
- Return type
None
- Returns
None
- set_time(time)
Define the time axis. Time can be given as a frame interval or an array specifying the time for each frame.
- Parameters
time (Union[float, ndarray]) – If int or float, designates time between frames. If array, marks the frame time points.
- Return type
None
- Returns
None
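The two accepted forms of time can be illustrated with a small sketch. build_time_axis is a hypothetical helper; it assumes that a frame interval is expanded into time points starting at 0 and that an explicit array must supply one value per frame.

```python
from numbers import Real
from typing import List, Sequence, Union


def build_time_axis(time: Union[Real, Sequence[float]], nframes: int) -> List[float]:
    """Expand a frame interval, or validate explicit time points."""
    if isinstance(time, Real):
        # A single number is the time between consecutive frames.
        return [i * time for i in range(nframes)]
    points = list(time)
    if len(points) != nframes:
        raise ValueError("need exactly one time point per frame")
    return points


print(build_time_axis(5, nframes=4))          # evenly spaced frames
print(build_time_axis([0, 2, 7], nframes=3))  # irregular imaging schedule
```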
- property shape
Shape of the data array
- class celltk.core.arrays.ExperimentArray(arrays=None, name=None, time=None)
Base class to create arrays that can store an almost arbitrary number of ConditionArrays. Typically made by Orchestrator.build_experiment_file()
- Parameters
arrays (Optional[List[ConditionArray]], default: None)
name (Optional[str], default: None)
time (Optional[float], default: None)
- add_metric_slots(name)
Expands each ConditionArray to make room for more metrics.
Note
This method must be used before attempting to add new metrics to the ConditionArray.
- Parameters
name (List[str]) – List of names of the metrics to add
- Return type
None
- Returns
None
- property channels
Returns list of the names of the channels in each ConditionArray.
- property conditions: List
Returns list of the name of each ConditionArray.
- Return type
List
- property coordinates
Returns the names of the coordinates of each ConditionArray.
- property dtype
Returns list of the data type of each ConditionArray.
- filter_cells(mask=None, key=None, delete=True, *args, **kwargs)
Uses an arbitrary mask or a saved mask (key) to filter cells from each ConditionArray. If delete, the underlying data are changed. Otherwise, the filtered data are only returned.
- Parameters
mask (Optional[List[ndarray]], default: None) – A boolean mask to filter cells with. Can be 1D, 2D or 5D.
key (Optional[str], default: None) – Name of a saved mask to use for filtering cells. Overrides mask if provided.
delete (bool, default: True) – If True, cells are removed in the base array. Otherwise they are only removed in the array that is returned.
args – Passed to filtering function.
kwargs – Passed to filtering function.
- Returns
Array with cells designated by mask or key removed.
- Return type
np.ndarray
- filter_peaks(value_key, metrics, thresholds, kwargs=[{}], peak_key=None, propagate=True)
Removes segmented peaks based on arbitrary peak criteria. See celltk.utils.peak_utils for more information about possible metrics to use
- Parameters
value_key (Tuple[int, str]) – Key to the traces used to calculate the peak metrics.
metrics (Collection[str]) – Names of the metrics to use for segmentation. See PeakHelper in celltk.utils.peak_utils.
thresholds (Collection[str]) – Lower threshold for the metrics given. If a peak metric is less than the threshold, it will be removed. Must be the same length as metrics.
kwargs (Collection[dict], default: [{}]) – Collection of dictionaries containing kwargs for the metrics given. If given, should be same length as metrics.
peak_key (Optional[Tuple[int, str]], default: None) – Key to the peak labels. If not given, will attempt to find peak labels based on the value key
propagate (bool, default: True) – If True, propagates filtered peak labels to the other keys in ConditionArray
- Return type
None
- Returns
None
- generate_mask(function, metric, region=0, channel=0, frame_rng=None, key=None, individual=True, *args, **kwargs)
Generates a boolean mask for each ConditionArray based on an arbitrary filter.
- Parameters
function – If str, name of function in filter_utils. Otherwise, should be a Callable that inputs a 2D array and returns a 2D boolean array.
metric – Name of metric to use. Can be any key in the array.
region – Name of region to calculate the filter in.
channel – Name of channel to calculate filter in.
frame_rng – Frames to use in calculation. If int, takes that many frames from start of trace. If tuple, uses passed frames.
key – If given, saves the mask in ConditionArray as key.
individual – If true, the filter is calculated for each ConditionArray independently. Otherwise, calculated on the whole data set, then applied to ConditionArrays.
args – passed to function
kwargs – passed to function
- Returns
List of 2D boolean arrays that mask cells outside the filter
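The difference between individual=True and individual=False matters whenever a filter depends on the distribution of the data. The sketch below uses a hypothetical above-median filter on one value per cell to show how the two modes can disagree; it is a simplified illustration, not the library's implementation, and above_median_masks is an invented name.

```python
from statistics import median
from typing import List, Sequence


def above_median_masks(
    conditions: Sequence[Sequence[float]],
    individual: bool = True,
) -> List[List[bool]]:
    """Return one boolean mask per condition; True marks cells above the median."""
    if individual:
        # The cutoff is computed separately inside each condition.
        return [[v > median(vals) for v in vals] for vals in conditions]
    # The cutoff is computed once on the pooled data, then applied everywhere.
    pooled = median([v for vals in conditions for v in vals])
    return [[v > pooled for v in vals] for vals in conditions]


control = [1, 2, 3]
treated = [10, 20, 30]
print(above_median_masks([control, treated], individual=True))
print(above_median_masks([control, treated], individual=False))
```

With a pooled cutoff, a condition with uniformly low values can be masked out entirely, so the choice depends on whether the filter is meant to compare cells within a condition or across the experiment.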
- interpolate_nans(keys=None)
Linear interpolation of nans for each cell in each ConditionArray. Modification is done in-place.
- Parameters
keys (Optional[Collection[tuple]], default: None) – Keys that will have nans removed. Each key should be a tuple of strings with length=3
- Return type
None
- Returns
None
- items()
Use to iterate through the key and array for each ConditionArray.
- keys()
Use to iterate through all the keys in ExperimentArray.
- classmethod load(path)
Load an ExperimentArray from an hdf5 file.
- Parameters
path (str) – Path to the hdf5 file
- Returns
Loaded ExperimentArray
- Return type
ExperimentArray
- load_condition(array, name=None, pos_id=None)
Adds a ConditionArray to the ExperimentArray from an hdf5 file. The new ConditionArray gets saved as name + pos_id if provided, otherwise uses the name saved in the hdf5 file.
- Parameters
array (Union[str, ConditionArray]) – ConditionArray or path to the hdf5 file with ConditionArray
name (Optional[str], default: None) – Name of the ConditionArray to be loaded.
pos_id (Optional[int], default: None) – Unique identifier for the ConditionArray.
- Return type
None
- Returns
None
- mark_active_cells(key, thres=1, propagate=True)
Uses peak labels to mark in what frames cells are active in each ConditionArray.
- Parameters
key (Tuple[int, str]) – Key defining peak labels
thres (float, default: 1) – Leave as 1, not currently used
propagate (bool, default: True) – If True, propagate active marks to other keys in ConditionArray
- Return type
None
- Returns
None
- merge_conditions()
Concatenate ConditionArrays with matching conditions. If no arrays have matching conditions, nothing is done. If matching conditions are found, looks for position map to label each uniquely, or will just number them in the order that they were saved in the ExperimentArray. Arrays are concatenated along the cell axis.
- Return type
None
- Returns
None
Note
Any masks that have been saved in the individual ConditionArrays will be lost.
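The merging behaviour can be sketched with plain Python containers: arrays that share a condition name are concatenated along the cell axis, and each cell remembers which array it came from. merge_by_condition is a hypothetical helper that covers only the fallback case where positions are numbered in the order the arrays were saved; the position map lookup is not modelled.

```python
from typing import Dict, List, Sequence, Tuple

Trace = List[float]


def merge_by_condition(
    arrays: Sequence[Tuple[str, List[Trace]]],
) -> Dict[str, Dict[str, list]]:
    """Concatenate cells of arrays that share a condition name."""
    merged: Dict[str, Dict[str, list]] = {}
    seen: Dict[str, int] = {}
    for name, cells in arrays:
        # Number the source arrays of each condition in the order encountered.
        pos = seen.get(name, 0)
        seen[name] = pos + 1
        entry = merged.setdefault(name, {"cells": [], "position": []})
        entry["cells"].extend(cells)                   # grow along the cell axis
        entry["position"].extend([pos] * len(cells))   # one id per cell
    return merged


arrays = [
    ("ctrl", [[1.0, 2.0]]),
    ("drug", [[3.0, 4.0]]),
    ("ctrl", [[5.0, 6.0], [7.0, 8.0]]),
]
for name, entry in merge_by_condition(arrays).items():
    print(name, entry)
```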
- property metrics
Returns list of the names of the metrics in each ConditionArray.
- property ncells
Returns list of the number of cells in each ConditionArray.
- property ndim
Returns list of the number of dimensions of each ConditionArray.
- predict_peaks(key, weight_path, propagate=True, segment=True, **kwargs)
Uses a UNet-based neural net to predict peaks in the traces defined by key in each ConditionArray. Adds two keys to ConditionArrays, ‘slope_prob’ and ‘plateau_prob’. If segment is True, also adds a ‘peaks’ key. ‘slope_prob’ is the probability that a point is on the upward or downward slope of a peak. ‘plateau_prob’ is the probability that a point is at the top of a peak.
- Parameters
key (Tuple[int, str]) – Key to the traces to predict peaks with. Must return a 2D array.
weight_path (str) – Path to the model weights to use for a new UPeakModel instance
propagate (bool, default: True) – If True, propagates peak probabilities to the other keys in ConditionArray
segment (bool, default: True) – If True, uses a watershed-based segmentation to label peaks based on the predictions.
kwargs – Passed to segmentation function.
- Return type
None
- Returns
None
- property regions
Returns list of the names of the regions in each ConditionArray.
- remove_empty_sites()
Removes all sites that have one or more empty dimensions.
- Return type
None
- Returns
None
- remove_short_traces(min_trace_length=0)
Applies a filter to each condition to remove cells with fewer non-nan frames than min_trace_length. The ‘label’ metric is used for determining non-nan frames.
- Parameters
min_trace_length (int, default: 0) – Minimum number of non-nan frames allowed
- Return type
None
- Returns
None
- save(path)
Saves all the Conditions in Experiment to an hdf5 file.
Loads the hdf5 file for each condition and then saves them in a single hdf5 file at path. Runs merge_conditions() first to ensure data doesn’t get overwritten.
- Parameters
path (str) – Path to the location where file should be saved.
- Return type
None
- Returns
None
- Raises
ValueError – If any cell or frame is greater than 2 ** 16.
- set_conditions(condition_map={})
Updates names of all of the ConditionArrays
- Parameters
condition_map (Dict[str, str], default: {}) – Dict of current_name : new_name for each ConditionArray
- Return type
None
- Returns
None
- set_time(time=None)
Define the time axis in each ConditionArray. Time can be given as a frame interval or an array specifying the time for each frame.
- Parameters
time (Optional[float], default: None) – If int or float, designates time between frames. If array, marks the frame time points.
- Return type
None
- Returns
None
- property shape
Returns list of the shape of each ConditionArray.
- property time
Returns list of the time axis of each ConditionArray.
- values()
Use to iterate through all of the ConditionArrays.