Compress paths
- connectome_interpreter.compress_paths.add_first_n_matrices(matrices, n)[source]
Adds the first N connectivity matrices from a list, where each matrix represents connectivity for a specific path length. The function works with both scipy sparse matrices and dense numpy arrays.
- Parameters:
matrices (list) – A list of connectivity matrices of different path lengths. The matrices can be scipy sparse matrices, or dense numpy arrays. The function expects all matrices in the list to be of the same type and shape.
n (int) – The number of initial matrices in the list to be summed. This number must not exceed the length of the matrices list.
- Returns:
The resulting matrix after summing the first N matrices. The type of the returned matrix matches the type of the input matrices (scipy sparse matrix, or numpy array).
- Return type:
matrix
- Raises:
ValueError – If the list of matrices is empty or if n is larger than the number of matrices available in the list.
Example
>>> from scipy.sparse import csc_matrix
>>> matrices = [csc_matrix([[1, 2], [3, 4]]), csc_matrix([[5, 6], [7, 8]]), csc_matrix([[9, 10], [11, 12]])]
>>> n = 2
>>> result_matrix = add_first_n_matrices(matrices, n)
>>> print(result_matrix.toarray())
[[ 6  8]
 [10 12]]
Note
Ensure that all matrices in the list are of compatible types and shapes before using this function.
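The following is a minimal sketch of the behaviour described above, using dense numpy arrays. The helper name sum_first_n is hypothetical and is not part of connectome_interpreter; it only illustrates the validation and summation the documentation describes. Rejecting n < 1 is an assumption of this sketch.

```python
import numpy as np


def sum_first_n(matrices, n):
    """Hypothetical stand-in that sums the first n matrices of a list."""
    if len(matrices) == 0:
        raise ValueError("The list of matrices is empty.")
    if n > len(matrices):
        raise ValueError("n exceeds the number of matrices in the list.")
    if n < 1:
        # Assumption of this sketch: at least one matrix must be summed.
        raise ValueError("n must be at least 1.")
    # Start from a copy so the caller's first matrix is not modified.
    total = matrices[0].copy()
    for m in matrices[1:n]:
        total = total + m
    return total


dense = [np.array([[1, 2], [3, 4]]),
         np.array([[5, 6], [7, 8]]),
         np.array([[9, 10], [11, 12]])]
result = sum_first_n(dense, 2)
print(result)
```

The same loop also works for scipy sparse matrices of a common shape, since they support + and .copy() as well.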
- connectome_interpreter.compress_paths.compress_paths(A: scipy.sparse.spmatrix, step_number: int, threshold: float = 0, output_threshold: float = 0.0001, root: bool = False, chunkSize=2000, device: torch.device = device(type='cpu'), save_to_disk: bool = False, save_path: str = './', save_prefix: str = 'step_', return_results: bool = True, high_cpu_ram: bool = True, output_dtype: numpy.dtype = numpy.float32, density_threshold: float = 0.2)[source]
Computes A^0 to A^n by chunking the computation to save memory. Results are thresholded and returned as a list of sparse scipy matrices or numpy arrays (if above density_threshold). If it’s too slow, try changing chunkSize.
- Parameters:
A (scipy.sparse.matrix) – The connectivity matrix as a scipy sparse matrix.
step_number (int) – Power to raise A to.
threshold (float) – Threshold to apply to the matrix after each multiplication. If 0, no thresholding is applied.
output_threshold (float) – Threshold to apply to the output (>=), but not during matrix multiplication.
root (bool) – If True, take the n-th root of the result in output.
chunkSize (int) – Size of the chunks to process at a time. This determines the memory usage (on GPU if available). For example, it might need the size of A plus (A.shape[0] * chunkSize) * 32 / 8 bytes, where 32 is the number of bits in a float32 value.
device (torch.device) – Device to use for computation. Uses GPU if available.
save_to_disk (bool) – If True, save the results to disk.
save_path (str) – Path to save the results.
save_prefix (str) – Prefix to use for the saved files.
return_results (bool, optional) – Whether to return the results as a list of sparse matrices. Defaults to True. If False, returns an empty list.
high_cpu_ram (bool) – If True, keep all the resulting chunked sparse matrices in memory, then combine them and write them to disk in one go. If False, write each chunk to a temporary directory on disk as soon as it is computed, and combine the chunks at the end.
output_dtype (np.dtype) – Data type to use for the output matrices. Defaults to np.float32.
density_threshold (float) – If the density of the matrix exceeds this threshold, the matrix is saved as dense. Defaults to 0.2.
- Returns:
List of matrices representing A^0 to A^n.
- Return type:
List[scipy.sparse.csr_matrix or numpy.ndarray]
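To illustrate the chunking idea and the memory estimate given for chunkSize, here is a minimal numpy sketch. The helper names dense_chunk_gb and chunked_matmul are hypothetical, and splitting the product by columns is an assumption; the library may chunk differently. For simplicity the sketch collects the result in a dense array, whereas the documented function thresholds the results and stores them as sparse matrices where possible.

```python
import numpy as np


def dense_chunk_gb(n_rows, chunk_size, bits_per_value=32):
    """Approximate size in GB of one dense (n_rows, chunk_size) chunk."""
    return n_rows * chunk_size * bits_per_value / 8 / 1e9


def chunked_matmul(left, right, chunk_size):
    """Compute left @ right one block of columns at a time."""
    out = np.zeros((left.shape[0], right.shape[1]), dtype=left.dtype)
    for start in range(0, right.shape[1], chunk_size):
        stop = min(start + chunk_size, right.shape[1])
        # Only this (n_rows, chunk_size) slice of the product is computed at once.
        out[:, start:stop] = left @ right[:, start:stop]
    return out


rng = np.random.default_rng(0)
A = rng.random((6, 6))
full = A @ A
chunked = chunked_matmul(A, A, chunk_size=4)
print(np.allclose(full, chunked))
print(dense_chunk_gb(160000, 2000))
```

A smaller chunk lowers peak memory at the cost of more multiplications, which is why adjusting chunkSize can change both speed and memory use.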
- connectome_interpreter.compress_paths.compress_paths_dense_chunked(inprop: csc_matrix, step_number: int, threshold: float = 0, output_threshold: float = 0.0001, root: bool = False, chunkSize: int = 10000, dtype: dtype = torch.float32, device: device = device(type='cpu'), save_to_disk: bool = False, save_path: str = './', save_prefix: str = 'step_') list[source]
Performs iterative multiplication of a sparse matrix inprop for a specified number of steps, applying thresholding to filter out values below a certain threshold to optimize memory usage and computation speed.
The function is optimized to run on GPU if available. It needs GPU memory of at least three times the size of the dense inprop (size_of_inprop.to_dense() * 3), for matrix multiplication and thresholding.
This function multiplies the connectivity matrix (input in rows; output in columns) inprop with itself step_number times, with each step’s result being thresholded to zero out elements below a given threshold. The function stores each step’s result in a list, where each result is further processed to drop values below the output_threshold to save memory.
- Parameters:
inprop (scipy.sparse.matrix) – The connectivity matrix as a scipy sparse matrix.
step_number (int) – The number of iterations to perform the matrix multiplication.
threshold (float, optional) – The threshold value to apply after each multiplication. Values below this threshold are set to zero. Defaults to 0.
output_threshold (float, optional) – The threshold value to apply to the final output, used to reduce output size. Defaults to 1e-4.
root (bool, optional) – Whether to take the nth root of the output. This can be understood as “the average direct connection strength” (when root=True), as opposed to “the proportion of influence among all partners n steps away” (when root=False). Defaults to False.
chunkSize (int, optional) – The size of the chunks to split the matrix into for matrix multiplication. Defaults to 10000.
dtype (torch.dtype, optional) – The data type to use for the tensor calculations. Defaults to torch.float32.
device (torch.device, optional) – The device to use for the tensor calculations. Defaults to torch.device(“cuda” if torch.cuda.is_available() else “cpu”).
save_to_disk (bool, optional) – Whether to save the output matrices to disk. Defaults to False.
save_path (str, optional) – The path to save the output matrices to. Defaults to “./” (the current folder).
save_prefix (str, optional) – The prefix to use for the output matrix filenames. Defaults to "step_".
- Returns:
A list of scipy.sparse.csc_matrix objects, each representing connectivity from all neurons to all neurons n steps away.
- Return type:
list
Note
This function requires PyTorch and is designed to automatically utilize CUDA-enabled GPU devices if available to accelerate computations. The input matrix inprop is converted to a dense tensor before processing.
Example
>>> from scipy.sparse import csc_matrix
>>> import numpy as np
>>> inprop = csc_matrix(np.array([[0.1, 0.2], [0.3, 0.4]]))
>>> step_number = 2
>>> compressed_paths = compress_paths(inprop, step_number, threshold=0.1, output_threshold=0.01)
>>> print(compressed_paths)
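The iterative multiplication and the two thresholds described above can be sketched with dense numpy arrays. The function compress_paths_sketch is a hypothetical illustration, not the library's implementation. It assumes that the returned list has step_number entries starting with inprop itself, that the root is taken element-wise before the output threshold is applied, and that each step multiplies the running product by inprop on the right.

```python
import numpy as np


def compress_paths_sketch(inprop, step_number, threshold=0.0,
                          output_threshold=1e-4, root=False):
    """Hypothetical dense re-implementation, for illustration only."""
    steps = []
    current = inprop.copy()
    for n in range(1, step_number + 1):
        if n > 1:
            # Assumption: each step multiplies the running product by inprop.
            current = current @ inprop
            # Values below `threshold` are zeroed and not passed on.
            current = np.where(current >= threshold, current, 0.0)
        out = current.copy()
        if root:
            # Assumption: the n-th root is taken element-wise.
            out = out ** (1.0 / n)
        # Values below `output_threshold` are dropped from the stored result only.
        out = np.where(out >= output_threshold, out, 0.0)
        steps.append(out)
    return steps


inprop = np.array([[0.1, 0.2], [0.3, 0.4]])
steps = compress_paths_sketch(inprop, step_number=2)
rooted = compress_paths_sketch(inprop, step_number=2, root=True)
print(steps[1])
```

The distinction between the two thresholds matters: threshold changes what is propagated to later steps, while output_threshold only trims what is stored.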
- connectome_interpreter.compress_paths.compress_paths_not_chunked(inprop, step_number, threshold=0, output_threshold=0.0001, root=False)[source]
Same as compress_paths_dense_chunked above, but without chunking.
Because the whole matrix is processed at once, this is more demanding on GPU RAM.
- connectome_interpreter.compress_paths.compress_paths_signed(inprop, idx_to_sign: dict, target_layer_number: int, threshold: float = 0, output_threshold: float = 0.0001, root: bool = False, chunkSize: int = 2000, save_to_disk: bool = False, save_path: str = './', return_results: bool = True)[source]
Calculates the excitatory and inhibitory influences across specified layers of a neural network, using PyTorch for GPU acceleration. An even number of inhibitory steps along a path is treated as excitation.
- Parameters:
inprop (scipy.sparse.csc_matrix) – The initial connectivity matrix representing direct connections between all neurons, shape (n_neurons, n_neurons).
idx_to_sign (dict) – A dictionary mapping neuron indices to the sign of output (1 for excitatory, -1 for inhibitory).
target_layer_number (int) – The number of layers through which to calculate influences. 1 for direct connections, 2 for one step away, etc.
threshold (float, optional) – A value to threshold the influences during calculation; influences below this value are set to zero, and not passed on. Defaults to 0.
output_threshold (float, optional) – A threshold for the final output to reduce output file size, with values below this threshold set to zero. Defaults to 1e-4.
root (bool, optional) – Whether to take the nth root of the output. This can be understood as “the average direct connection strength” (when root=True), as opposed to “the proportion of influence among all partners n steps away” (when root=False). Defaults to False.
chunkSize (int, optional) – The size of the chunks to split the matrix into for matrix multiplication. Defaults to 2000. A chunk is a dense matrix of size (chunkSize, n_neurons). This determines the memory usage: e.g. 2000 * 160000 * 32 bits (float32) / 8 bytes / 1e9 = 1.28 GB. Two copies are needed (one for excitation, one for inhibition), plus the sparse representations of the (n_neurons, n_neurons) matrix.
save_to_disk (bool, optional) – Whether to save the output matrices to disk.
save_path (str, optional) – Path to save the results. Defaults to “./”.
return_results (bool, optional) – Whether to return the results as a list of sparse matrices. Defaults to True.
- Returns:
Two lists of sparse matrices representing the excitatory and inhibitory influences, respectively, up to the specified target layer.
- Return type:
Tuple[List[scipy.sparse.csc_matrix], List[scipy.sparse.csc_matrix]]
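The sign bookkeeping can be illustrated with a small dense numpy sketch. The function signed_paths_sketch is hypothetical and omits chunking, thresholding and the root option. It assumes that rows are presynaptic neurons and that a path's sign is the product of the signs of the presynaptic neurons along it.

```python
import numpy as np


def signed_paths_sketch(inprop, idx_to_sign, target_layer_number):
    """Hypothetical dense illustration of sign tracking along paths."""
    n = inprop.shape[0]
    signs = np.array([idx_to_sign[i] for i in range(n)])
    # Rows are presynaptic neurons, so the row's sign is the sign of its output.
    from_exc = inprop * (signs == 1)[:, None]
    from_inh = inprop * (signs == -1)[:, None]
    exc, inh = from_exc.copy(), from_inh.copy()
    steps_exc, steps_inh = [exc], [inh]
    for _ in range(target_layer_number - 1):
        # Prepending an inhibitory edge flips the sign of the remaining path,
        # so inhibition of inhibition counts as excitation.
        exc, inh = (from_exc @ exc + from_inh @ inh,
                    from_exc @ inh + from_inh @ exc)
        steps_exc.append(exc)
        steps_inh.append(inh)
    return steps_exc, steps_inh


# Chain 0 -> 1 -> 2 in which neurons 0 and 1 are both inhibitory.
inprop = np.array([[0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, 0.0, 0.0]])
idx_to_sign = {0: -1, 1: -1, 2: 1}
steps_exc, steps_inh = signed_paths_sketch(inprop, idx_to_sign, 2)
print(steps_exc[1][0, 2], steps_inh[1][0, 2])
```

In this example the two-step influence of neuron 0 on neuron 2 passes through two inhibitory connections, so it appears in the excitatory list and not in the inhibitory one.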
- connectome_interpreter.compress_paths.compress_paths_signed_no_chunking(inprop, idx_to_sign, target_layer_number, threshold=0, output_threshold=0.0001, root=False)[source]
Calculates the excitatory and inhibitory influences across specified layers of a neural network, using PyTorch for GPU acceleration. This function processes a connectivity matrix (where presynaptic neurons are represented by rows and postsynaptic neurons by columns) to distinguish and compute the influence of excitatory and inhibitory neurons at each layer.
- Parameters:
inprop (scipy.sparse.csc_matrix) – The initial connectivity matrix representing direct connections between adjacent layers.
idx_to_sign (dict) – A dictionary mapping neuron indices to their types (1 for excitatory, -1 for inhibitory), used to differentiate between excitatory and inhibitory influences.
target_layer_number (int) – The number of layers through which to calculate influences, starting from the second layer (with the first layer’s influence being defined by the direct connectivity inprop).
threshold (float, optional) – A value to threshold the influences; influences below this value are set to zero, and not passed on in future layers. Defaults to 0.
output_threshold (float, optional) – A threshold for the final output to reduce memory usage, with values below this threshold set to zero. Defaults to 1e-4.
root (bool, optional) – Whether to take the nth root of the output. This can be understood as “the average direct connection strength” (when root=True), as opposed to “the proportion of influence among all partners n steps away” (when root=False). Defaults to False.
- Returns:
Two lists of sparse matrices representing the excitatory and inhibitory influences, respectively, up to the specified target layer.
- Return type:
Tuple[List[scipy.sparse.csc_matrix], List[scipy.sparse.csc_matrix]]
Note
This function is ideal with GPU support. Ensure your environment supports CUDA and that PyTorch is correctly installed.
- connectome_interpreter.compress_paths.contribution_by_path_lengths(steps, inidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], outidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], outidx_map: dict | None = None, inidx_map: dict | None = None, width: int = 800, height: int = 400)[source]
Plots the connection strength from all of inidx (grouped by inidx_map) to an average outidx (grouped by outidx_map) over different path lengths. At most one of inidx_map and outidx_map should be provided. If neither is provided, presynaptic neurons are grouped together. Direct connections have path_length 1.
- Parameters:
steps (list of scipy.sparse matrices or numpy.array) – List of sparse matrices, each representing synaptic strengths for a specific path length.
inidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing input (presynaptic) neurons.
outidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing output (postsynaptic) neurons.
outidx_map (dict) – Mapping from indices to postsynaptic neuron groups. Only one of inidx_map and outidx_map should be specified.
inidx_map (dict) – Mapping from indices to presynaptic neuron groups. Only one of inidx_map and outidx_map should be specified.
width (int, optional) – The width of the plot. Defaults to 800.
height (int, optional) – The height of the plot. Defaults to 400.
- Returns:
Displays an interactive line plot showing the connection strength from all of inidx to an average outidx over different path lengths.
- Return type:
None
- connectome_interpreter.compress_paths.contribution_by_path_lengths_data(steps, inidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], outidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], outidx_map: dict | None = None, inidx_map: dict | None = None)[source]
Calculates the contribution from all of inidx (grouped by inidx_map) to an average outidx (grouped by outidx_map) over different path lengths. At most one of inidx_map and outidx_map should be provided. If neither is provided, presynaptic neurons are grouped together. Direct connections have path_length 1.
- Parameters:
steps (list of scipy.sparse matrices or numpy.array) – List of sparse matrices, each representing synaptic strengths for a specific path length.
inidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing input (presynaptic) neurons.
outidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing output (postsynaptic) neurons.
outidx_map (dict) – Mapping from indices to postsynaptic neuron groups. Only one of inidx_map and outidx_map should be specified.
inidx_map (dict) – Mapping from indices to presynaptic neuron groups. Only one of inidx_map and outidx_map should be specified.
- Returns:
A DataFrame containing the contributions from presynaptic neurons to postsynaptic neurons over different path lengths. The DataFrame has three columns: ‘path_length’, ‘presynaptic_type’ (or ‘postsynaptic_type’), and ‘value’.
- Return type:
pd.DataFrame
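As a rough illustration of the quantity described above, the sketch below computes the ungrouped case with numpy: for each path length, inputs from all of inidx are summed and then averaged across outidx. The function contribution_sketch is hypothetical. It returns plain (path_length, value) pairs rather than a DataFrame, and it assumes that steps[0] holds the direct connections.

```python
import numpy as np


def contribution_sketch(steps, inidx, outidx):
    """Hypothetical ungrouped version: returns (path_length, value) pairs."""
    rows = []
    for path_length, step in enumerate(steps, start=1):
        # Select presynaptic rows and postsynaptic columns.
        sub = step[np.ix_(inidx, outidx)]
        # Sum over all presynaptic neurons, then average over postsynaptic ones.
        value = sub.sum(axis=0).mean()
        rows.append((path_length, float(value)))
    return rows


direct = np.array([[0.0, 0.5, 0.0],
                   [0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.0]])
steps = [direct, direct @ direct]
contributions = contribution_sketch(steps, inidx=[0], outidx=[1, 2])
print(contributions)
```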
- connectome_interpreter.compress_paths.contribution_by_path_lengths_heatmap(steps, inidx, outidx, inidx_map=None, outidx_map=None, sort_by_index=True, sort_names=None, pre_in_column=False, display_threshold=0, cmap='viridis', figsize=(30, 15))[source]
Display the contribution from inidx to outidx, grouped by inidx_map and outidx_map, across different path lengths.
- Parameters:
steps (list of scipy.sparse matrices) – List of sparse matrices, each representing synaptic strengths for a specific path length.
inidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing input (presynaptic) neurons.
outidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing output (postsynaptic) neurons.
inidx_map (dict, optional) – Mapping from indices to input neuron groups. Defaults to None, in which case neurons are not grouped.
outidx_map (dict, optional) – Mapping from indices to output neuron groups. Defaults to None, in which case it is set to be the same as inidx_map.
sort_by_index (bool, optional) – Whether to sort the output by index. Defaults to True.
sort_names (str or list, optional) – The column name(s) to sort the result by. If none is provided, the result is sorted by the first column.
pre_in_column (bool, optional) – Whether to have the presynaptic neuron groups as columns. Defaults to False (pre in rows, post in columns).
display_threshold (float, optional) – The threshold for displaying the output. Defaults to 0.
cmap (str, optional) – The colormap to use for the heatmap. Defaults to ‘viridis’.
figsize (tuple, optional) – The size of the figure to display. Defaults to (30, 15).
- Returns:
Displays an interactive heatmap showing the contribution from inidx to outidx, grouped by inidx_map and outidx_map, across different path lengths.
- Return type:
None
- connectome_interpreter.compress_paths.effective_conn_from_paths(paths, group_dict=None, wide=True)[source]
Calculate the effective connectivity between (groups of) neurons based only on the provided paths between neurons. This function runs on CPU, and doesn’t expect a big connectivity matrix as input.
- Parameters:
paths (pd.DataFrame) – A dataframe representing the paths between neurons, with columns ‘pre’, ‘post’, ‘weight’, and ‘layer’.
group_dict (dict, optional) – A dictionary mapping neuron indices (values in columns pre and post) to groups. Defaults to None.
wide (bool, optional) – Whether to pivot the output dataframe to a wide format. Defaults to True.
- Returns:
A dataframe representing the effective connectivity between groups of neurons.
- Return type:
pd.DataFrame
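One way to picture effective connectivity computed from a path table is sketched below in plain Python. This is not the library's implementation. It assumes that ‘layer’ k marks the k-th step of a path, and that effective connectivity is the sum, over all paths from a source to a target, of the product of the edge weights. The function effective_conn_sketch is hypothetical, takes a list of dicts instead of a DataFrame, and ignores grouping.

```python
from collections import defaultdict


def effective_conn_sketch(paths):
    """Hypothetical illustration; `paths` is a list of edge dicts."""
    n_layers = max(edge["layer"] for edge in paths)
    # influence[(source, neuron)]: summed weight of paths from source to neuron.
    influence = defaultdict(float)
    for edge in paths:
        if edge["layer"] == 1:
            influence[(edge["pre"], edge["post"])] += edge["weight"]
    for layer in range(2, n_layers + 1):
        extended = defaultdict(float)
        for edge in paths:
            if edge["layer"] != layer:
                continue
            for (source, mid), weight in influence.items():
                if mid == edge["pre"]:
                    # Extend every path that ends at this edge's presynaptic neuron.
                    extended[(source, edge["post"])] += weight * edge["weight"]
        influence = extended
    return dict(influence)


paths = [
    {"pre": 0, "post": 1, "weight": 0.5, "layer": 1},
    {"pre": 0, "post": 2, "weight": 0.2, "layer": 1},
    {"pre": 1, "post": 3, "weight": 0.4, "layer": 2},
    {"pre": 2, "post": 3, "weight": 0.5, "layer": 2},
]
effective = effective_conn_sketch(paths)
print(effective)
```

Here neuron 0 reaches neuron 3 through two parallel two-step paths, and their contributions add up.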
- connectome_interpreter.compress_paths.read_precomputed(prefix: str, file_path: str | None = None, first_n: int | None = None) List[source]
Reads the precomputed compressed paths.
- Parameters:
prefix (str) – The prefix/folder name (expected to be the same) of the files to read.
file_path (str, optional) – The path to the files. Defaults to None. If None, checks if running in Google Colab, and sets the path: if running in Colab, sets the path to “/content/”; otherwise, sets the path to “”.
first_n (int, optional) – Number of files to read. If None, reads all files.
- Returns:
A list of matrices (sparse or dense) representing the steps.
- Return type:
List
- connectome_interpreter.compress_paths.result_summary(stepsn, inidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], outidx: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], inidx_map: dict | None = None, outidx_map: dict | None = None, display_output: bool = True, display_threshold: float = 0.001, threshold_axis: str = 'row', sort_within: str = 'column', sort_names: str | List | None = None, pre_in_column: bool = False, include_undefined_groups: bool = False, outprop: bool = False, combining_method: str = 'mean')[source]
Generates a summary of connections between different types of neurons, represented by their input and output indexes. The function calculates the total synaptic input from presynaptic neuron groups to an average neuron in each postsynaptic neuron group.
- Parameters:
stepsn (scipy.sparse matrix or numpy.ndarray) – Matrix representing the synaptic strengths between neurons, can be dense or sparse.
inidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing the input (presynaptic) neurons, used to subset stepsn. nan values are removed.
outidx (int, float, list, set, numpy.ndarray, or pandas.Series) – Array of indices representing the output (postsynaptic) neurons.
inidx_map (dict, optional) – Mapping from indices to neuron groups for the input neurons. Defaults to None, in which case neurons are not grouped.
outidx_map (dict, optional) – Mapping from indices to neuron groups for the output neurons. Defaults to None, in which case it is set to be the same as inidx_map.
display_output (bool, optional) – Whether to display the output in a coloured dataframe. Defaults to True.
display_threshold (float, optional) – The minimum threshold for displaying the output. Defaults to 0.001.
threshold_axis (str, optional) – The axis to apply the display_threshold to. Defaults to ‘row’ (removing entire rows if no value exceeds display_threshold).
sort_within (str, optional) – The axis to sort the output in. Defaults to ‘column’.
sort_names (str or list, optional) – The column/row name(s) to sort the result by. If none is provided, the result is sorted by the first column/row.
pre_in_column (bool, optional) – Whether to have the presynaptic neuron groups as columns. Defaults to False (pre in rows, post in columns).
include_undefined_groups (bool, optional) – Whether to include undefined groups in the output. Defaults to False.
outprop (bool, optional) – If True, get the summed output proportion (across recipient single cells in the same cell type) for each average sender. If False (default), get the summed input proportion across all senders for each average recipient.
combining_method (str, optional) – Method to combine inputs (outprop=False) or outputs (outprop=True). Can be ‘sum’, ‘mean’, or ‘median’. Defaults to ‘mean’.
- Returns:
A dataframe representing the summed synaptic input from presynaptic neuron groups to an average neuron in each postsynaptic neuron group. This dataframe is always returned, regardless of the value of display_output.
- Return type:
pd.DataFrame
- Displays:
If display_output is True, the function will display a styled version of the resulting dataframe.
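The grouping described above (total input from a presynaptic group to an average neuron of a postsynaptic group) can be sketched with numpy for the default case of outprop=False and combining_method='mean'. The function summary_sketch is hypothetical. It returns a plain dict instead of a DataFrame and skips display, thresholding and sorting.

```python
import numpy as np


def summary_sketch(stepsn, inidx, outidx, inidx_map, outidx_map):
    """Hypothetical illustration returning {(pre_group, post_group): value}."""
    pre_groups = sorted({inidx_map[i] for i in inidx})
    post_groups = sorted({outidx_map[j] for j in outidx})
    summary = {}
    for pre in pre_groups:
        pre_members = [i for i in inidx if inidx_map[i] == pre]
        for post in post_groups:
            post_members = [j for j in outidx if outidx_map[j] == post]
            sub = stepsn[np.ix_(pre_members, post_members)]
            # Total input from the presynaptic group to each postsynaptic neuron,
            # then the mean over neurons in the postsynaptic group.
            summary[(pre, post)] = float(sub.sum(axis=0).mean())
    return summary


stepsn = np.array([[0.0, 0.2, 0.4],
                   [0.0, 0.1, 0.3],
                   [0.0, 0.0, 0.0]])
groups = {0: "A", 1: "A", 2: "B"}
summary = summary_sketch(stepsn, [0, 1], [1, 2], groups, groups)
print(summary)
```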
- connectome_interpreter.compress_paths.signed_effective_conn_from_paths(paths, group_dict=None, wide=True, idx_to_nt=None)[source]
Calculate the signed effective connectivity between (groups of) neurons based only on the provided paths between neurons. This function runs on CPU, and doesn’t expect a big connectivity matrix as input.
- Parameters:
paths (pd.DataFrame) – A dataframe representing the paths between neurons, with columns ‘pre’, ‘post’, ‘weight’, ‘layer’, and optionally ‘sign’.
group_dict (dict, optional) – A dictionary mapping neuron indices (values in columns pre and post) to groups. Defaults to None.
wide (bool, optional) – Whether to pivot the output dataframe to a wide format. Defaults to True.
idx_to_nt (dict, optional) – A dictionary mapping neuron indices (values in columns pre and post) to 1 (excitatory) / -1 (inhibitory). Defaults to None.
- Returns:
A list of two dataframes representing the effective connectivity between groups of neurons, one for effective excitation, the other for inhibition.
- Return type:
list