Complex Methods 2

Auditorium

Chair: Annick Vignes

The functions of biological networks are often hypothesized to be governed by an ensemble of regularly repeated, small subgraphs termed motifs. Network theory provides statistical tools to detect candidate motifs that may govern the functions of a network. However, these methods are grounded in random graph models that describe only the network topology, independently of the network's dynamics and function. In parallel, there exists a multitude of dynamic processes on networks, whose evolution implies dynamical representations that we generally apply to make sense of, or simply guess, the computations (i.e., the functions) occurring in biological motifs. Consequently, there is a methodological gap between how we infer motifs (with random graph models) and how we study them (with dynamical models). Furthermore, existing topological network similarity measures provide only one-dimensional representations of the difference between a pair of motifs and therefore cannot capture all relevant differences. We thus argue that biological network motifs are best compared not directly by their topology but by how a biologically relevant dynamic process unfolds on them.

Here, we report on our progress in designing dynamics-based dissimilarity measures between candidate network motifs, formally defined as subgraph isomorphism classes called graphlets. We formally introduce the concept of node-dynamics-based dissimilarity between graphs, and we investigate the subtleties involved in defining appropriate distances between individual node dynamics and aggregating them to compare labeled and unlabeled graphs. Given its pertinence in biological applications as well as its well-established characterization in the mathematics and physics literature, we take the Kuramoto model as a working example and propose specific dissimilarity measures based on node synchronization, which we use to compare and cluster directed graphlets of up to five nodes. We discuss how to design dissimilarities for graphlets with and without labeled nodes (which requires accounting for their automorphism and isomorphism groups, respectively) and how this leads to different perceptions of proximity between graphlets. We also show how dissimilarity measures based on non-linear synchronization dynamics differ from measures based on linear dynamics and from purely topological distance measures.
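To make the idea of a synchronization-based dissimilarity concrete, the following is a minimal sketch in Python. It is not the measure proposed in the talk. It assumes a directed Kuramoto model integrated with a simple Euler scheme, summarizes each graphlet by the time-averaged cosine of pairwise phase differences, and compares two graphlets by the Frobenius distance between these summaries. The unlabeled variant minimizes by brute force over all node relabelings. All function names, parameter values, and example graphlets are hypothetical.

```python
import itertools
import math


def simulate_kuramoto(adj, omega, theta0, coupling=1.0, dt=0.01, steps=2000):
    """Euler integration of d(theta_i)/dt = omega_i + K * sum_j adj[i][j] * sin(theta_j - theta_i).

    Convention (an assumption): adj[i][j] = 1 means node j drives node i.
    """
    n = len(adj)
    theta = list(theta0)
    trajectory = [list(theta)]
    for _ in range(steps):
        velocity = [
            omega[i]
            + coupling * sum(adj[i][j] * math.sin(theta[j] - theta[i]) for j in range(n))
            for i in range(n)
        ]
        theta = [theta[i] + dt * velocity[i] for i in range(n)]
        trajectory.append(list(theta))
    return trajectory


def sync_matrix(adj, omega, theta0):
    """Time-averaged cos(theta_i - theta_j) for every node pair."""
    trajectory = simulate_kuramoto(adj, omega, theta0)
    n = len(adj)
    return [
        [sum(math.cos(s[i] - s[j]) for s in trajectory) / len(trajectory) for j in range(n)]
        for i in range(n)
    ]


def labeled_dissimilarity(adj_a, adj_b, omega, theta0):
    """Frobenius distance between sync matrices; node i of A is matched to node i of B."""
    sa = sync_matrix(adj_a, omega, theta0)
    sb = sync_matrix(adj_b, omega, theta0)
    n = len(sa)
    return math.sqrt(sum((sa[i][j] - sb[i][j]) ** 2 for i in range(n) for j in range(n)))


def relabel(adj, perm):
    """Return the adjacency matrix with node i renamed to perm[i]."""
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[perm[i]][perm[j]] = adj[i][j]
    return out


def unlabeled_dissimilarity(adj_a, adj_b, omega, theta0):
    """Smallest labeled dissimilarity over all relabelings of B (brute force, small graphs only)."""
    n = len(adj_a)
    return min(
        labeled_dissimilarity(adj_a, relabel(adj_b, perm), omega, theta0)
        for perm in itertools.permutations(range(n))
    )


# Hypothetical three-node graphlets.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]           # 0 -> 1 -> 2
reversed_chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 2 -> 1 -> 0
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]           # no edges

omega = [0.0, 0.5, 1.0]   # natural frequencies (illustrative values)
theta0 = [0.0, 1.0, 2.0]  # initial phases (illustrative values)
```

In this sketch the two chains are isomorphic, so their unlabeled dissimilarity vanishes, whereas their labeled dissimilarity depends on which node carries which natural frequency and initial phase. A practical measure would presumably also average over initial conditions and frequencies; that choice is left open here.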

Alex Barbier-Chebbah, Nicolas Billy, Jean-Baptiste Masson, Srinivas Turaga and Christian Vestergaard

Network data are inherently high-dimensional, so identifying the functional roles of nodes and subgraphs in a whole biological network by exhaustive screening is generally infeasible. Thus, to characterize the structural features of such networks and understand their function, it is crucial to extract relevant low-dimensional representations of them. Network embedding provides a powerful statistical framework for analyzing network data, and in particular biological networks, naturally accounting for their nodes' real-space or latent structures and their tendency to cluster in functional groups. However, including the heterogeneous, asymmetrical, and weighted characteristics of many biological networks in a latent space model remains a significant challenge.

Building on recent tools, we here propose an embedding approach specifically tailored to model these structural constraints. Our model provides a tractable likelihood for the edge weights between each pair of nodes based on a dual-space latent embedding with learnable distance kernels. This makes it possible to simultaneously: (1) account for edge weights and deterministic asymmetries in connectivity by using two different spaces for incoming and outgoing connections; (2) learn the most appropriate distance kernels and embedding dimension from data, and extract a low-dimensional latent representation of a network; (3) identify specific latent space features to uncover hidden structural features; (4) generate artificial networks with realistic and tunable structural features, which may serve as null models and be used to investigate the functional importance of identified latent features in simulations of network dynamics. We validate our model on synthetic networks and apply it to characterize the complete adult Drosophila connectome.
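As an illustration of how a dual-space embedding can encode asymmetric, weighted connectivity, the following is a minimal sketch in Python. It is not the authors' model. It assumes that each node has one position in an "outgoing" space and one in an "incoming" space, that the expected weight of edge i -> j decays exponentially with the distance between the outgoing position of i and the incoming position of j, and that edge weights are Poisson distributed. The kernel form, its parameters, and all names are hypothetical.

```python
import math


def kernel(distance, amplitude=5.0, length_scale=1.0):
    """Assumed exponential distance kernel; in a fitted model these parameters would be learned."""
    return amplitude * math.exp(-distance / length_scale)


def expected_weight(z_out, z_in, i, j):
    """Expected weight of edge i -> j from the outgoing position of i and the incoming position of j."""
    return kernel(math.dist(z_out[i], z_in[j]))


def poisson_log_likelihood(weights, z_out, z_in):
    """Log-likelihood of integer edge weights under independent Poisson edges (self-loops ignored)."""
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rate = expected_weight(z_out, z_in, i, j)
            w = weights[i][j]
            total += w * math.log(rate) - rate - math.lgamma(w + 1)
    return total


# Hypothetical two-dimensional embeddings of three nodes.
z_out = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
z_in = [(0.0, 1.0), (0.0, 0.0), (3.0, 0.5)]

# Node 0 projects strongly to node 1, while the reverse connection is weak.
weights = [[0, 5, 0], [1, 0, 0], [0, 0, 0]]
transposed = [[0, 1, 0], [5, 0, 0], [0, 0, 0]]
```

Because the two spaces differ, `expected_weight(z_out, z_in, 0, 1)` and `expected_weight(z_out, z_in, 1, 0)` need not be equal, which is how a single pair of nodes can carry a strong connection in one direction and a weak one in the other. Under these assumptions, the embedding assigns a higher likelihood to `weights` than to `transposed`.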