External paths
- connectome_interpreter.external_paths.compute_flow_hitting_time(conn_df: DataFrame | spmatrix, flow_seed_idx: ndarray[int], flow_steps: int, flow_thre: float)[source]
Compute hitting time for all cells in conn_df. Hitting time is the average number of hops required to reach a cell from a set of seed cells. The main algorithm is implemented in the ‘navis’ library (https://github.com/navis-org/navis).
- Parameters:
conn_df (Union[pd.DataFrame, spmatrix]) – Connections, given either as a sparse matrix or as a DataFrame with columns ‘pre’, ‘post’, and ‘weight’.
flow_seed_idx (np.ndarray) – Array of seed cell indices.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for activation in flow calculation.
- Returns:
DataFrame with columns ‘idx’ and ‘hitting_time’, where ‘idx’ is the cell index and ‘hitting_time’ is the computed hitting time.
- Return type:
pd.DataFrame
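To make the notion of a hitting time concrete, here is a deliberately simplified, deterministic sketch in plain Python. It is not the navis algorithm that this function delegates to, and it does not call `compute_flow_hitting_time`. The helper name `toy_hitting_time`, the tuple-based edge list, and the activation rule (a cell becomes active once its summed input weight from already-active cells reaches the threshold) are all assumptions made for illustration.

```python
def toy_hitting_time(edges, seeds, steps, thre):
    """Return {cell: step of first activation} for a toy threshold model.

    edges: iterable of (pre, post, weight) tuples (hypothetical stand-in for conn_df)
    seeds: cells that are active at step 0
    steps: maximum number of propagation steps
    thre:  summed input from active cells needed to activate a cell (assumed rule)
    """
    hit = {s: 0 for s in seeds}
    for step in range(1, steps + 1):
        # Sum the input each inactive cell receives from currently active cells.
        drive = {}
        for pre, post, w in edges:
            if pre in hit and post not in hit:
                drive[post] = drive.get(post, 0.0) + w
        newly_active = [cell for cell, d in drive.items() if d >= thre]
        if not newly_active:
            break  # nothing left to activate
        for cell in newly_active:
            hit[cell] = step
    return hit


edges = [(0, 1, 0.5), (0, 2, 0.05), (1, 2, 0.2), (2, 3, 0.3)]
hitting = toy_hitting_time(edges, seeds=[0], steps=20, thre=0.1)
```

In this toy network, cell 2 is not activated at step 1 because its input from the seed alone (0.05) is below the threshold; it activates at step 2, once cell 1 also contributes.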
- connectome_interpreter.external_paths.find_instance_flow(inprop: spmatrix | DataFrame, idx_to_group: dict, flow_seed_groups: list[str] = ['L1', 'L2', 'L3', 'R7p', 'R8p', 'R7y', 'R8y', 'R7d', 'R8d', 'HBeyelet'], file_path: str | None = None, save_flow: bool | None = True, save_prefix: str | None = 'flow_', flow_steps: int = 20, flow_thre: float = 0.1) DataFrame[source]
Get the hitting time for all cell groups. The hitting time is computed for each neuron using the information flow algorithm (navis: https://github.com/navis-org/navis), and the median is then taken across the neurons of each cell group.
- Parameters:
inprop (Union[spmatrix, pd.DataFrame]) – Input sparse matrix or DataFrame representing connections. If a DataFrame, it should have columns ‘pre’, ‘post’, and ‘weight’.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
flow_seed_groups (list) – List of cell groups to be used as seeds for flow calculation.
file_path (str) – Path to the directory containing the hitting time data.
save_flow (bool) – Whether to save the computed hitting time to a CSV file. Defaults to True.
save_prefix (str) – Prefix for the saved file names.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for flow calculation.
- Returns:
DataFrame with columns ‘cell_group’ and ‘hitting_time’, where ‘cell_group’ is the name of the cell group and ‘hitting_time’ is the median hitting time for that group.
- Return type:
pd.DataFrame
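The aggregation step described above (per-neuron hitting times reduced to a per-group median) can be sketched as follows. This is a minimal illustration using plain dictionaries instead of DataFrames; the helper name `median_hitting_time_by_group` and the example group labels are hypothetical, and the per-neuron hitting times are made-up values rather than output of the library.

```python
from collections import defaultdict
from statistics import median


def median_hitting_time_by_group(hitting_time, idx_to_group):
    """Reduce {cell index: hitting time} to {cell group: median hitting time}."""
    per_group = defaultdict(list)
    for idx, t in hitting_time.items():
        per_group[idx_to_group[idx]].append(t)
    return {group: median(times) for group, times in per_group.items()}


# Made-up per-neuron hitting times and an illustrative index-to-group mapping.
hitting_time = {0: 0.0, 1: 1.0, 2: 2.0, 3: 4.0, 4: 3.0}
idx_to_group = {0: "L1", 1: "Mi1", 2: "Mi1", 3: "T4a", 4: "T4a"}
group_flow = median_hitting_time_by_group(hitting_time, idx_to_group)
```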
- connectome_interpreter.external_paths.find_shortest_paths(paths: DataFrame, start_nodes: list[str], end_nodes: list[str]) list[list[str]][source]
Find the shortest paths between groups in start_nodes and end_nodes in a paths dataframe (paths is the output of find_path_iteratively).
- Parameters:
paths (pd.DataFrame) – DataFrame containing the path data, including columns ‘weight’, ‘pre’, and ‘post’.
start_nodes (list) – List of ‘pre’ groups.
end_nodes (list) – List of ‘post’ groups.
- Returns:
A list of shortest paths, where each path is a list of groups that connect the start and end nodes (ordered from start to end).
- Return type:
list
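As a rough illustration of what "shortest paths between groups" means on an edge list, the sketch below runs a breadth-first search over (pre, post) pairs and returns every shortest path for each start/end pair. It counts hops and ignores the ‘weight’ column; whether `find_shortest_paths` does the same is not stated here, so treat this as an assumption. The helper names and example group labels are hypothetical.

```python
from collections import defaultdict, deque


def shortest_group_paths(edges, start_nodes, end_nodes):
    """All hop-count shortest paths from each start to each end node."""
    graph = defaultdict(list)
    for pre, post in edges:
        graph[pre].append(post)

    found = []
    for start in start_nodes:
        # BFS that records the distance and every predecessor on a shortest route.
        dist = {start: 0}
        parents = defaultdict(list)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
                if dist[nxt] == dist[node] + 1 and node not in parents[nxt]:
                    parents[nxt].append(node)
        for end in end_nodes:
            if end in dist:
                found.extend(_unwind(end, start, parents))
    return found


def _unwind(node, start, parents):
    """Rebuild paths from start to node by walking the predecessor map backwards."""
    if node == start:
        return [[start]]
    return [path + [node] for p in parents[node] for path in _unwind(p, start, parents)]


edges = [("L1", "Mi1"), ("L1", "Tm3"), ("Mi1", "T4a"),
         ("Tm3", "T4a"), ("L1", "C3"), ("C3", "Mi1")]
result = shortest_group_paths(edges, ["L1"], ["T4a"])
```

The three-hop route through `C3` is excluded because two-hop routes exist.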
- connectome_interpreter.external_paths.layered_el(inprop: spmatrix, steps: list, inidx: ArrayLike, outidx: ArrayLike, n: int, idx_to_group: dict, threshold: float = 0, flow_steps: int = 20, flow_thre: float = 0.1)[source]
First finds paths within n steps, given the threshold (applied to direct connections), grouped by idx_to_group, and adds flow layers to the edge list based on the information flow hitting time (navis: https://github.com/navis-org/navis).
- Parameters:
inprop (spmatrix) – Input sparse matrix representing connections.
steps (list) – A list of connectivity matrices, e.g. the result from compress_paths().
inidx (np.ndarray) – Array of input indices.
outidx (np.ndarray) – Array of output indices.
n (int) – The maximum number of hops. n=1 for direct connections.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
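threshold (float) – The threshold for the weight of the direct connection between pre and post, after grouping by idx_to_group. Defaults to 0.
flow_steps (int) – Number of steps for flow calculation. Defaults to 20.
flow_thre (float) – Threshold for flow calculation. Defaults to 0.1.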
- Returns:
DataFrame containing the grouped edge list with flow layers.
- Return type:
pd.DataFrame
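The "flow layers" added to the edge list can be pictured as two extra fields per edge, one for each endpoint. The sketch below assumes the layer of a group is its hitting time and uses the column names ‘pre_layer’ and ‘post_layer’ expected by `plot_flow_layered_paths`; the helper name `add_flow_layers`, the list-of-dicts edge list, and the numeric values are hypothetical stand-ins for the DataFrame that `layered_el` returns.

```python
def add_flow_layers(edges, group_flow):
    """Attach 'pre_layer' and 'post_layer' to each edge from a group -> layer map."""
    return [
        {**edge,
         "pre_layer": group_flow[edge["pre"]],
         "post_layer": group_flow[edge["post"]]}
        for edge in edges
    ]


# Illustrative grouped edge list and made-up group-level hitting times.
edges = [
    {"pre": "L1", "post": "Mi1", "weight": 0.4},
    {"pre": "Mi1", "post": "T4a", "weight": 0.2},
]
group_flow = {"L1": 0.0, "Mi1": 1.5, "T4a": 3.5}
layered = add_flow_layers(edges, group_flow)
```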
- connectome_interpreter.external_paths.plot_flow_layered_paths(paths: DataFrame, figsize: tuple = (10, 8), weight_decimals: int = 2, neuron_to_sign: dict | None = None, sign_color_map: dict = {-1: 'blue', 1: 'red'}, neuron_to_color: dict | None = None, edge_text: bool = True, node_text: bool = True, highlight_nodes: list[str] = [], interactive: bool = True, save_plot: bool = False, file_name: str = 'layered_paths', label_pos: float = 0.7, default_neuron_color: str = 'lightblue', default_edge_color: str = 'lightgrey', node_size: int = 500) None[source]
Plots a directed graph of layered paths based on flow layers. Similar to the plot_layered_paths function, but the x-axis is defined by the flow layers of the nodes.
- Parameters:
paths (pandas.DataFrame) – A dataframe containing the columns ‘pre’, ‘post’, ‘weight’, ‘pre_layer’, and ‘post_layer’. Each row represents an edge in the graph. The ‘pre’ and ‘post’ columns refer to the source and target nodes, respectively, and ‘weight’ indicates the edge weight. ‘pre_layer’ and ‘post_layer’ are the flow layers of the corresponding nodes.
figsize (tuple, optional) – A tuple indicating the size of the matplotlib figure. Defaults to (10, 8).
weight_decimals (int, optional) – The number of decimal places to display for edge weights. Defaults to 2.
neuron_to_sign (dict, optional) – A dictionary mapping neuron names (as they appear in paths) to their signs (e.g. {‘KCg-m’: 1, ‘Delta7’: -1}). Alternatively, the dictionary can map neuron names to neurotransmitter names. Defaults to None.
sign_color_map (dict, optional) – A dictionary used to color edges. Defaults to {-1: ‘blue’, 1: ‘red’}. If neuron_to_sign is not provided, all edges are drawn in the default edge color (lightgrey). If it is provided, edges are by default colored red when the pre-neuron is excitatory and blue when it is inhibitory. If the values of neuron_to_sign are neurotransmitter names, provide a dictionary that maps neurotransmitter names to colors. If the keys of sign_color_map do not cover all values of neuron_to_sign, a warning is printed and the default color is used for the unmatched values.
neuron_to_color (dict, optional) – A dictionary mapping neuron names to colors. If not provided, a default color is used for all nodes. The keys should match the node names (‘pre’ and ‘post’) in paths. Nodes not listed in the dictionary are given the default color.
edge_text (bool, optional) – Whether to display edge weights as text on the plot. Defaults to True.
node_text (bool, optional) – Whether to display node names as text on the plot. Defaults to True.
highlight_nodes (list[str], optional) – A list of node names to highlight in bold in the plot. Defaults to an empty list.
interactive (bool, optional) – Whether to create an interactive plot using pyvis. Defaults to True. If False, a static matplotlib plot is created.
save_plot (bool, optional) – Whether to save the plot to a file. Defaults to False.
file_name (str, optional) – The name of the file to save the plot. Defaults to “layered_paths” in the local directory (.html if interactive and .pdf if static).
label_pos (float, optional) – The position of the edge labels. Defaults to 0.7. Bigger values move the labels closer to the left of the edge. Only works if interactive is False.
default_neuron_color (str, optional) – The default color for nodes if no specific color is provided in neuron_to_color. Defaults to “lightblue”.
default_edge_color (str, optional) – The default color for edges if no specific color is provided. Defaults to “lightgrey”.
node_size (int, optional) – The size of the nodes in the plot. Defaults to 500.
- Returns:
This function does not return a value. It generates a plot using matplotlib or pyvis.
- Return type:
None
Note
- The positions of the nodes are determined by a custom positioning function (find_flow_positions).
- This function requires the networkx library for graph operations and matplotlib for plotting. For interactive plots, it requires the pyvis library (where the node label has to be underneath the node).
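The edge-coloring rule described for neuron_to_sign and sign_color_map can be sketched in a few lines. This is an illustration of the documented behavior only, not the function's actual code: the helper name `pick_edge_colors` and the example neuron names on the ‘post’ side are hypothetical, and the warning for unmatched values is omitted.

```python
def pick_edge_colors(edges, neuron_to_sign=None, sign_color_map=None,
                     default_edge_color="lightgrey"):
    """Return one color per edge, chosen from the sign of the pre-neuron."""
    neuron_to_sign = neuron_to_sign or {}
    sign_color_map = sign_color_map or {-1: "blue", 1: "red"}
    colors = []
    for pre, _post in edges:
        sign = neuron_to_sign.get(pre)  # None if the neuron has no known sign
        colors.append(sign_color_map.get(sign, default_edge_color))
    return colors


edges = [("KCg-m", "MBON01"), ("Delta7", "EPG"), ("unlabelled", "EPG")]
signs = {"KCg-m": 1, "Delta7": -1}
colors = pick_edge_colors(edges, neuron_to_sign=signs)

# The same lookup works when the values are neurotransmitter names.
nt_colors = pick_edge_colors(
    edges,
    neuron_to_sign={"KCg-m": "acetylcholine", "Delta7": "glutamate"},
    sign_color_map={"acetylcholine": "red", "glutamate": "green"},
)
```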
- connectome_interpreter.external_paths.trim_inprop_by_flow(inprop: spmatrix | DataFrame, idx_to_group: dict, flow_seed_groups: list[str] = ['L1', 'L2', 'L3', 'R7p', 'R8p', 'R7y', 'R8y', 'R7d', 'R8d', 'HBeyelet'], file_path: str | None = None, save_flow: bool | None = True, save_prefix: str | None = 'flow_', flow_steps: int = 20, flow_thre: float = 0.1, flow_diff_min: float = 0.5, flow_diff_max: float = 20) csc_matrix[source]
Trim connections based on the hitting time assigned by the information flow algorithm (navis: https://github.com/navis-org/navis). The hitting time is the mean number of hops required to reach a neuron from a neuron in flow_seed_groups. If the hitting time of the post neuron exceeds that of the pre neuron by an amount between flow_diff_min and flow_diff_max, the connection is interpreted as a feedforward connection and kept. For similar hitting times, the connection is interpreted as a lateral connection. If the hitting time of the pre neuron is larger than that of the post neuron, the connection is interpreted as a feedback connection. Lateral and feedback connections are removed.
- Parameters:
inprop (Union[spmatrix, pd.DataFrame]) – Input sparse matrix or DataFrame representing connections.
idx_to_group (dict) – Dictionary mapping cell indices to their respective cell groups.
flow_seed_groups (list) – List of cell groups to be used as seeds for flow calculation.
file_path (str) – Path to the directory containing the hitting time data.
save_flow (bool) – Whether to save the computed hitting time to a CSV file. Defaults to True.
save_prefix (str) – Prefix for the saved file names.
flow_steps (int) – Number of steps for flow calculation.
flow_thre (float) – Threshold for flow calculation.
flow_diff_min (float) – Minimum difference in hitting time for connection retention.
flow_diff_max (float) – Maximum difference in hitting time for connection retention.
- Returns:
Sparse matrix containing only the connections whose hitting time difference (post minus pre) lies within the specified range.
- Return type:
csc_matrix
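The retention rule can be sketched on a plain edge list as follows. This is a minimal illustration, not the library's implementation: it works on (pre, post, weight) tuples rather than a sparse matrix, the helper name `trim_edges_by_flow` is hypothetical, the hitting times are made-up values, and treating both bounds as inclusive is an assumption.

```python
def trim_edges_by_flow(edges, hitting_time, flow_diff_min=0.5, flow_diff_max=20):
    """Keep edges whose post-minus-pre hitting time difference is within range."""
    kept = []
    for pre, post, weight in edges:
        diff = hitting_time[post] - hitting_time[pre]
        # Assumed inclusive bounds; lateral (small diff) and feedback
        # (negative diff) connections fall outside the range and are dropped.
        if flow_diff_min <= diff <= flow_diff_max:
            kept.append((pre, post, weight))
    return kept


hitting_time = {0: 0.0, 1: 1.0, 2: 1.2, 3: 3.0}
edges = [
    (0, 1, 0.3),  # diff = 1.0  -> feedforward, kept
    (1, 2, 0.2),  # diff = 0.2  -> lateral, removed
    (3, 1, 0.1),  # diff = -2.0 -> feedback, removed
    (2, 3, 0.4),  # diff = 1.8  -> feedforward, kept
]
trimmed = trim_edges_by_flow(edges, hitting_time)
```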