External paths

connectome_interpreter.external_paths.compute_flow_hitting_time(conn_df: DataFrame | spmatrix, flow_seed_idx: ndarray[int], flow_steps: int, flow_thre: float)[source]

Compute hitting time for all cells in conn_df. Hitting time is the average number of hops required to reach a cell from a set of seed cells. The main algorithm is implemented in the ‘navis’ library (https://github.com/navis-org/navis).

Parameters:

conn_df (Union[pd.DataFrame, spmatrix]) – Connections, given either as a sparse matrix or as a DataFrame with columns ‘pre’, ‘post’, and ‘weight’.
flow_seed_idx (np.ndarray) – Array of seed cell indices.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for activation in flow calculation.

Returns:

DataFrame with columns ‘idx’ and ‘hitting_time’, where ‘idx’ is the cell index and ‘hitting_time’ is the computed hitting time.

Return type:

pd.DataFrame
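To make the roles of flow_steps and flow_thre concrete, the sketch below uses a deliberately simplified, deterministic traversal. It is not the navis algorithm that this function delegates to. The name toy_hitting_time and the activation rule are assumptions made for illustration only: a cell becomes active at the first step where its summed input from already-active cells reaches the threshold.

```python
def toy_hitting_time(edges, seeds, flow_steps, flow_thre):
    """Toy, deterministic stand-in for a flow traversal (NOT the navis algorithm).

    edges: iterable of (pre, post, weight) tuples.
    Returns {cell: step at which it first became active}; seeds get step 0.
    Cells that are never activated are absent from the result.
    """
    hit = {s: 0 for s in seeds}
    for step in range(1, flow_steps + 1):
        # Sum the input each inactive cell receives from currently active cells.
        drive = {}
        for pre, post, weight in edges:
            if pre in hit and post not in hit:
                drive[post] = drive.get(post, 0.0) + weight
        newly_active = [cell for cell, total in drive.items() if total >= flow_thre]
        if not newly_active:
            break  # nothing else can be reached
        for cell in newly_active:
            hit[cell] = step
    return hit


# Made-up example data: cell 3 only receives a sub-threshold input.
edges = [(0, 1, 0.5), (1, 2, 0.2), (0, 2, 0.05), (2, 3, 0.01)]
times = toy_hitting_time(edges, seeds=[0], flow_steps=20, flow_thre=0.1)
print(times)
```

In this toy rule, raising flow_thre leaves more cells unreached, and flow_steps caps how far from the seeds the traversal can go.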

connectome_interpreter.external_paths.find_instance_flow(inprop: spmatrix | DataFrame, idx_to_group: dict, flow_seed_groups: list[str] = ['L1', 'L2', 'L3', 'R7p', 'R8p', 'R7y', 'R8y', 'R7d', 'R8d', 'HBeyelet'], file_path: str | None = None, save_flow: bool | None = True, save_prefix: str | None = 'flow_', flow_steps: int = 20, flow_thre: float = 0.1) → DataFrame[source]

Get the hitting time for all cell groups. The hitting time is computed for each neuron using the information flow algorithm (navis: https://github.com/navis-org/navis), and the median is then taken across the neurons of each cell group.

Parameters:

inprop (Union[spmatrix, pd.DataFrame]) – Input sparse matrix or DataFrame representing connections. If a DataFrame, it should have columns ‘pre’, ‘post’, and ‘weight’.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
flow_seed_groups (list) – List of cell groups to be used as seeds for flow calculation.
file_path (str) – Path to the directory containing the hitting time data.
save_flow (bool) – Whether to save the computed hitting time to a CSV file. Defaults to True.
save_prefix (str) – Prefix for the saved file names.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for flow calculation.

Returns:

DataFrame with columns ‘cell_group’ and ‘hitting_time’, where ‘cell_group’ is the name of the cell group and ‘hitting_time’ is the median hitting time for that group.

Return type:

pd.DataFrame
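The per-group aggregation described above can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the helper name group_median_hitting_time and the example values are hypothetical, and cells missing from idx_to_group are assumed to be skipped.

```python
from statistics import median


def group_median_hitting_time(hitting_time, idx_to_group):
    """Median per-cell hitting time for each cell group.

    hitting_time: {cell index: hitting time}
    idx_to_group: {cell index: group name}
    Cells without a group are skipped (an assumption of this sketch).
    """
    by_group = {}
    for idx, t in hitting_time.items():
        group = idx_to_group.get(idx)
        if group is None:
            continue
        by_group.setdefault(group, []).append(t)
    return {group: median(times) for group, times in by_group.items()}


# Made-up per-cell hitting times and group labels.
hitting_time = {0: 0.0, 1: 0.0, 2: 1.5, 3: 2.5, 4: 4.0}
idx_to_group = {0: "L1", 1: "L1", 2: "Mi1", 3: "Mi1", 4: "Mi1"}
group_times = group_median_hitting_time(hitting_time, idx_to_group)
print(group_times)
```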

connectome_interpreter.external_paths.find_shortest_paths(paths: DataFrame, start_nodes: list[str], end_nodes: list[str]) → list[list[str]][source]

Find the shortest paths between groups in start_nodes and end_nodes in a paths dataframe (paths is the output of find_path_iteratively).

Parameters:

paths (pd.DataFrame) – DataFrame containing the path data, including columns ‘weight’, ‘pre’, and ‘post’.
start_nodes (list) – List of ‘pre’ groups.
end_nodes (list) – List of ‘post’ groups.

Returns:

A list of shortest paths, where each path is a list of groups that connect the start and end nodes (ordered from start to end).

Return type:

list
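As an illustration of what such a search involves, here is a minimal breadth-first sketch over a plain edge list. It is not the library's code. The function names are hypothetical, edge weights are ignored, and it assumes that shortest paths are collected separately for every (start, end) pair.

```python
from collections import deque


def all_shortest_paths(adjacency, start, end):
    """Return every minimum-hop path from start to end (empty list if none)."""
    # BFS that records hop distance and all predecessors lying on shortest paths.
    dist = {start: 0}
    preds = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                preds[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                preds[nxt].append(node)
    if end not in dist:
        return []
    # Walk predecessors back from end to start, one hop per iteration.
    paths = [[end]]
    for _ in range(dist[end]):
        paths = [[p] + path for path in paths for p in preds[path[0]]]
    return paths


def shortest_paths_between(edges, start_nodes, end_nodes):
    """Collect shortest paths for every (start, end) pair in an edge list."""
    adjacency = {}
    for pre, post in edges:
        adjacency.setdefault(pre, []).append(post)
    found = []
    for start in start_nodes:
        for end in end_nodes:
            found.extend(all_shortest_paths(adjacency, start, end))
    return found


# Made-up groups: two 2-hop routes from A to D and one longer 3-hop route.
edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"),
         ("A", "E"), ("E", "F"), ("F", "D")]
paths = shortest_paths_between(edges, ["A"], ["D"])
print(paths)
```

Only the two 2-hop routes are returned. The longer route through E and F is discarded because it is not a shortest path.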

First finds paths within n steps, given the threshold (applied to direct connections), grouped by idx_to_group, and adds flow layers to the edge list based on the information flow hitting time (navis: https://github.com/navis-org/navis).

Parameters:

inprop (spmatrix) – Input sparse matrix representing connections.
steps (list) – A list of connectivity matrices, e.g. the result from compress_paths().
inidx (np.ndarray) – Array of input indices.
outidx (np.ndarray) – Array of output indices.
n (int) – The maximum number of hops. n=1 for direct connections.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
threshold (float) – The threshold for the weight of the direct connection between pre and post, after grouping by idx_to_group. Defaults to 0.

Returns:

DataFrame containing the grouped edge list with flow layers.

Return type:

pd.DataFrame
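The last step, attaching flow layers to a grouped edge list, might look like the sketch below. The helper name add_flow_layers is hypothetical, and two assumptions are made: a group's flow layer is taken to be its hitting time, and edges touching a group without a hitting time are dropped. The column names ‘pre_layer’ and ‘post_layer’ follow those expected by plot_flow_layered_paths.

```python
def add_flow_layers(edges, group_hitting_time):
    """Attach 'pre_layer' and 'post_layer' to each grouped edge.

    Assumption: a group's flow layer is simply its hitting time.
    Edges whose groups have no hitting time are skipped.
    """
    layered = []
    for pre, post, weight in edges:
        if pre not in group_hitting_time or post not in group_hitting_time:
            continue
        layered.append({
            "pre": pre, "post": post, "weight": weight,
            "pre_layer": group_hitting_time[pre],
            "post_layer": group_hitting_time[post],
        })
    return layered


# Made-up grouped edges and group-level hitting times.
edges = [("L1", "Mi1", 0.4), ("Mi1", "T4a", 0.2), ("Mi1", "unknown", 0.1)]
group_hitting_time = {"L1": 0.0, "Mi1": 1.0, "T4a": 2.0}
layered_edges = add_flow_layers(edges, group_hitting_time)
print(layered_edges)
```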

connectome_interpreter.external_paths.plot_flow_layered_paths(paths: DataFrame, figsize: tuple = (10, 8), weight_decimals: int = 2, neuron_to_sign: dict | None = None, sign_color_map: dict = {-1: 'blue', 1: 'red'}, neuron_to_color: dict | None = None, edge_text: bool = True, node_text: bool = True, highlight_nodes: list[str] = [], interactive: bool = True, save_plot: bool = False, file_name: str = 'layered_paths', label_pos: float = 0.7, default_neuron_color: str = 'lightblue', default_edge_color: str = 'lightgrey', node_size: int = 500) → None[source]

Plots a directed graph of layered paths based on flow layers. Similar to the plot_layered_paths function, but the x-axis is defined by the flow layers of the nodes.

Parameters:

paths (pandas.DataFrame) – A dataframe containing the columns ‘pre’, ‘post’, ‘weight’, ‘pre_layer’, and ‘post_layer’. Each row represents an edge in the graph. The ‘pre’ and ‘post’ columns refer to the source and target nodes, respectively, and ‘weight’ indicates the edge weight. ‘pre_layer’ and ‘post_layer’ are the flow layers of the corresponding nodes.
figsize (tuple, optional) – A tuple indicating the size of the matplotlib figure. Defaults to (10, 8).
weight_decimals (int, optional) – The number of decimal places to display for edge weights. Defaults to 2.
neuron_to_sign (dict, optional) – A dictionary mapping neuron names (as they appear in paths) to their signs (e.g. {‘KCg-m’: 1, ‘Delta7’: -1}). The values may instead be neurotransmitter names. Defaults to None.
sign_color_map (dict, optional) – A dictionary used to color edges. If neuron_to_sign is not provided, edges are drawn in lightgrey. If it is provided, edges are by default colored red when the pre-neuron is excitatory and blue when it is inhibitory. If the neuron_to_sign values are neurotransmitter names, provide a dictionary that maps neurotransmitter names to colors. If the keys of sign_color_map do not match the values of neuron_to_sign, a warning is printed and the default color is used for the unmatched values.
neuron_to_color (dict, optional) – A dictionary mapping neuron names to colors. If not provided, a default color is used for all nodes. The keys should match the node names (‘pre’ and ‘post’) in paths. Nodes missing from the dictionary are given the default color.
edge_text (bool, optional) – Whether to display edge weights as text on the plot. Defaults to True.
node_text (bool, optional) – Whether to display node names as text on the plot. Defaults to True.
highlight_nodes (list[str], optional) – A list of node names to highlight in bold in the plot. Defaults to an empty list.
interactive (bool, optional) – Whether to create an interactive plot using pyvis. Defaults to True. If False, a static matplotlib plot is created.
save_plot (bool, optional) – Whether to save the plot to a file. Defaults to False.
file_name (str, optional) – The name of the file to save the plot. Defaults to “layered_paths” in the local directory (.html if interactive and .pdf if static).
label_pos (float, optional) – The position of the edge labels. Defaults to 0.7. Bigger values move the labels closer to the left of the edge. Only works if interactive is False.
default_neuron_color (str, optional) – The default color for nodes if no specific color is provided in neuron_to_color. Defaults to “lightblue”.
default_edge_color (str, optional) – The default color for edges if no specific color is provided. Defaults to “lightgrey”.
node_size (int, optional) – The size of the nodes in the plot. Defaults to 500.

Returns:

This function does not return a value. It generates a plot using matplotlib or pyvis.

Return type:

None

Note

The positions of the nodes are determined by a custom positioning function (find_flow_positions).
This function requires the networkx library for graph operations and matplotlib for plotting. For interactive plots, it requires the pyvis library (where the node label has to be underneath the node).

connectome_interpreter.external_paths.trim_inprop_by_flow(inprop: spmatrix | DataFrame, idx_to_group: dict, flow_seed_groups: list[str] = ['L1', 'L2', 'L3', 'R7p', 'R8p', 'R7y', 'R8y', 'R7d', 'R8d', 'HBeyelet'], file_path: str | None = None, save_flow: bool | None = True, save_prefix: str | None = 'flow_', flow_steps: int = 20, flow_thre: float = 0.1, flow_diff_min: float = 0.5, flow_diff_max: float = 20) → csc_matrix[source]

Trim connections based on hitting time assigned by information flow algorithm (navis: https://github.com/navis-org/navis). The hitting time is the mean number of hops required to reach a neuron from a neuron in flow_seed_groups. If the hitting time of the post neuron is larger than that of the pre neuron (i.e., the difference is between flow_diff_min and flow_diff_max), then the connection is interpreted as a feedforward connection and kept. For similar hitting times, the connection is interpreted as a lateral connection. If the hitting time of the pre neuron is larger than that of the post neuron, the connection is interpreted as a feedback connection. Lateral and feedback connections are removed.

Parameters:

inprop (Union[spmatrix, pd.DataFrame]) – Input sparse matrix or DataFrame representing connections.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
flow_seed_groups (list) – List of cell groups to be used as seeds for flow calculation.
file_path (str) – Path to the directory containing the hitting time data.
save_flow (bool) – Whether to save the computed hitting time to a CSV file. Defaults to True.
save_prefix (str) – Prefix for the saved file names.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for flow calculation.
flow_diff_min (float) – Minimum difference in hitting time for connection retention.
flow_diff_max (float) – Maximum difference in hitting time for connection retention.

Returns:

Sparse matrix containing only the connections for which the hitting-time difference (post minus pre) lies between flow_diff_min and flow_diff_max.

Return type:

csc_matrix
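The retention rule can be sketched on a plain edge list. This is an illustration, not the library's implementation, which operates on sparse matrices. The helper name keep_feedforward is hypothetical. Treating both bounds as inclusive and dropping edges whose cells have no hitting time are both assumptions.

```python
def keep_feedforward(edges, hitting_time, flow_diff_min=0.5, flow_diff_max=20):
    """Keep edges whose hitting-time difference (post minus pre) is in range.

    edges: iterable of (pre, post, weight) tuples.
    hitting_time: {cell: hitting time}
    Inclusive bounds are an assumption of this sketch.
    """
    kept = []
    for pre, post, weight in edges:
        if pre not in hitting_time or post not in hitting_time:
            continue  # no hitting time available for one endpoint
        diff = hitting_time[post] - hitting_time[pre]
        if flow_diff_min <= diff <= flow_diff_max:
            kept.append((pre, post, weight))  # feedforward
        # Otherwise the edge is lateral (small diff) or feedback (negative diff).
    return kept


# Made-up hitting times: b and c are at similar depth, a is the seed.
hitting_time = {"a": 0.0, "b": 1.0, "c": 1.2}
edges = [("a", "b", 0.3), ("b", "c", 0.2), ("c", "a", 0.1), ("a", "c", 0.4)]
kept = keep_feedforward(edges, hitting_time)
print(kept)
```

Here the b-to-c edge is dropped as lateral (difference of about 0.2) and the c-to-a edge as feedback (negative difference).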