Genonets Server
Analysis and Visualization of Genotype Networks
Tutorial
The complete tutorial is available in the following formats:
Online HTML
Downloadable PDF
Video tutorials
Quick start
Get started with the Genonets Server in less than 2 minutes with the sample input file.
Deep dives
Learn the input file format, how to use the input form, format of the results files, and how to use the visualization features to explore the genotype networks.
Terminology
We refer to a set of genotypes with a given phenotype as a genotype set, which is typically a very small subset of genotype space – the set of all possible genotypes. Each genotype set therefore corresponds to a single phenotype. A genotype set may comprise one or more genotype networks. In such networks, vertices represent genotypes and edges connect vertices if their corresponding genotypes are separated by a single small mutation. Vertices that share an edge are referred to as neighbors. Since individual genotypes may belong to multiple genotype sets, genotype networks may overlap.
We also construct and visualize phenotype networks. In such networks, vertices represent genotype sets, and edges connect vertices if any genotypes in the corresponding genotype sets can be interconverted via a single small mutation. Vertices that share an edge are referred to as adjacent. We refer to mutations that lead to a change in phenotype as non-neutral. For further information on these and related concepts, the reader is referred to [4].
Analyses
Please review the following important information that applies to analyses in general:
All analyses are performed on the dominant genotype network. That is, if the genotype set is fragmented into several genotype networks, analyses will be performed on the largest of these networks. All other components are ignored.
Genotypes with a score below the score threshold tau are not considered in any analysis.
The noise threshold delta is only used in landscape-related analyses, i.e., Paths, Peaks, and Epistasis (see below).
Evolvability
We define evolvability as the ability of mutation to bring forth a novel phenotype. Evolvability can be measured at the scale of an individual genotype or of a genotype set. In the Genonets Server, evolvability analyses are always enabled, resulting in the following:
Calculation of genotype evolvability
Calculation of phenotype evolvability
Construction of the phenotype network
These computational routines are based on [1], and are described in detail below.
Genotype evolvability
For a genotype g in a genotype set S, evolvability is the ratio of the number of genotype sets to which g can evolve via a single mutation, to the total number of genotype sets in the input data.
The higher the evolvability of a genotype g, the higher the number of genotype sets that can be reached by a single mutation from g.
Phenotype evolvability
For a genotype set S, phenotype evolvability is the ratio of the number of unique genotype sets to which genotypes in S can evolve via single mutations, to the total number of genotype sets available in the input data.
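The two ratios above can be illustrated with a minimal Python sketch. It is not the Genonets implementation: it assumes genotypes are equal-length strings, that a "single mutation" is a single-letter substitution, and that the denominator counts every genotype set in the input (including the focal one). The function names are hypothetical, and the restriction to the dominant genotype network and the threshold tau are left out.

```python
def hamming_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def genotype_evolvability(g, home, genotype_sets):
    """Return (target set names, evolvability) for genotype g of set `home`."""
    targets = {
        name
        for name, members in genotype_sets.items()
        if name != home and any(hamming_one(g, h) for h in members)
    }
    return targets, len(targets) / len(genotype_sets)


def phenotype_evolvability(home, genotype_sets):
    """Return (unique target set names, evolvability) for the set `home`."""
    targets = set()
    for g in genotype_sets[home]:
        targets |= genotype_evolvability(g, home, genotype_sets)[0]
    return targets, len(targets) / len(genotype_sets)


# Toy input: three genotype sets keyed by a (hypothetical) phenotype name.
sets = {"A": {"AAA", "AAC"}, "B": {"ACC"}, "C": {"GGG"}}
print(genotype_evolvability("AAC", "A", sets))
print(phenotype_evolvability("A", sets))
```

In this toy input only AAC has a neighbor outside its own set (ACC in set B), so set A can evolve to exactly one of the three genotype sets.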
Phenotype network
Each vertex in the phenotype network represents a genotype set.
The size of a vertex corresponds to the phenotype evolvability of the corresponding genotype set, i.e., the higher the phenotype evolvability, the larger the vertex size.
The higher the out-degree of a vertex, the higher the number of genotype sets to which genotypes in this genotype set can evolve.
The higher the in-degree of a vertex, the higher the number of genotype sets from which this genotype set is accessible.
The network is a directed graph, which captures the possibly asymmetric relation between vertices. This means that it is possible for a genotype set to be accessible from other genotype sets, despite having zero phenotype evolvability itself.
Robustness
We define robustness as the invariance of a phenotype in the face of genetic perturbation. Like evolvability, robustness can be measured at the scale of an individual genotype or of a genotype set. In the Genonets Server, robustness analyses can be selected from the list of analyses in the input form. These analyses result in the following computations:
Genotype robustness
Average robustness of the genotype set
These computational routines are based on [1], and are described in detail below.
Genotype robustness
For a genotype g in a genotype set S, robustness is the fraction of all possible mutational neighbors that are also in S. Thus, g is maximally robust if all possible neighbors are members of S.
Phenotype robustness
Phenotype robustness is the arithmetic mean of the genotype robustness values for all genotypes in the genotype set S.
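As a rough illustration of both robustness measures, the sketch below assumes that mutational neighbors are all single-letter substitutions over a known alphabet, so a genotype of length L over k letters has L(k-1) possible neighbors. The function names and the `alphabet` parameter are hypothetical, not part of Genonets.

```python
def all_neighbors(g, alphabet):
    """All sequences that differ from g by exactly one substitution."""
    return [
        g[:i] + c + g[i + 1:]
        for i in range(len(g))
        for c in alphabet
        if c != g[i]
    ]


def genotype_robustness(g, members, alphabet):
    """Fraction of g's possible neighbors that belong to the same genotype set."""
    neighbors = all_neighbors(g, alphabet)
    return sum(n in members for n in neighbors) / len(neighbors)


def phenotype_robustness(members, alphabet):
    """Arithmetic mean of genotype robustness over the genotype set."""
    return sum(genotype_robustness(g, members, alphabet) for g in members) / len(members)


S = {"AA", "AC", "CC"}
print(genotype_robustness("AC", S, "AC"))  # both neighbors (AA, CC) are in S
print(phenotype_robustness(S, "AC"))
```

Here AC is maximally robust, while AA and CC each have one neighbor (CA) outside the set.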
Accessibility
The accessibility of a genotype set S measures the potential for mutation to generate a genotype in S from genotypes in different genotype sets. In the Genonets Server, accessibility can be selected from the list of analyses in the input form, and is measured for all genotype sets in the input data. Specifically, for a genotype set S, accessibility is computed as follows:
For each pair of genotype sets (S, T), calculate the ratio of the number of genotypes in S that are separated by a single mutation from any genotype in T, to the total number of genotypes that are separated by a single mutation from genotypes in T.
Then calculate the accessibility of S as the sum of these ratios for all pairs (S, T).
Computational routines for accessibility are based on [2].
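One possible reading of these two steps is sketched below. The assumptions are mine, not confirmed by the source: genotypes are equal-length strings, a single mutation is a single-letter substitution, and the denominator counts only genotypes outside T (i.e., non-neutral neighbors of T). The function names are hypothetical.

```python
def hamming_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def accessibility(home, genotype_sets):
    """Sum, over all other sets T, of the share of T's outside neighbors lying in `home`."""
    everything = set().union(*genotype_sets.values())
    total = 0.0
    for name, members in genotype_sets.items():
        if name == home:
            continue
        # Genotypes outside T that are one mutation away from some genotype in T.
        outside = everything - members
        reachable = {g for g in outside if any(hamming_one(g, t) for t in members)}
        if reachable:
            total += len(reachable & genotype_sets[home]) / len(reachable)
    return total


sets = {"A": {"AAA", "AAC"}, "B": {"ACC"}, "C": {"CCC"}}
for name in sets:
    print(name, accessibility(name, sets))
```

In this toy input, set B is the only set reachable from A and one of two sets' worth of neighbors reachable from C, which gives it the highest accessibility.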
Neighbor abundance
The neighbor abundance of a genotype set S measures the size of adjacent genotype sets, in proportion to the probability that a mutation will generate a genotype in these adjacent genotype sets. In the Genonets Server, neighbor abundance can be selected from the list of analyses in the input form, and is measured for all genotype sets in the input data. Specifically, for a genotype set S, neighbor abundance is computed as follows:
Calculate the ratio of the number of genotypes in T that are accessible from S, to the total number of genotypes that are accessible from S.
Multiply this ratio by the number of genotypes in T.
Repeat this process for all genotype set pairs (S, T), taking the sum as the neighbor abundance of S.
Computational routines for neighbor abundance are based on [2].
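The sketch below illustrates the computation under explicit assumptions: each ratio is weighted by the size of the adjacent genotype set T, in line with the opening description of neighbor abundance as a measure of the size of adjacent genotype sets; a single mutation is a single-letter substitution; and "accessible from S" counts genotypes outside S that are one mutation away from some genotype in S. The function names are hypothetical, and this is not the Genonets code.

```python
def hamming_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def accessible_from(home, genotype_sets):
    """Map each other set T to its genotypes one mutation away from `home`."""
    source = genotype_sets[home]
    return {
        name: {t for t in members - source if any(hamming_one(s, t) for s in source)}
        for name, members in genotype_sets.items()
        if name != home
    }


def neighbor_abundance(home, genotype_sets):
    """Sum over T of (share of accessible genotypes in T) * (size of T)."""
    reach = accessible_from(home, genotype_sets)
    total = sum(len(v) for v in reach.values())
    if total == 0:
        return 0.0  # nothing is accessible; convention chosen for this sketch
    return sum(len(v) / total * len(genotype_sets[name]) for name, v in reach.items())


sets = {"A": {"AAA"}, "B": {"AAC", "ACC", "CCC"}, "C": {"CAA"}}
print(neighbor_abundance("A", sets))
```

From AAA, one genotype in B and one in C are accessible, so each ratio is 0.5 and the result is 0.5 * 3 + 0.5 * 1.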
Diversity index
The diversity index of a genotype set S gives the probability that two randomly chosen non-neutral mutations to genotypes in S yield genotypes that belong to different genotype sets. In the Genonets Server, the diversity index can be selected from the list of analyses in the input form, and is measured for all genotype sets in the input data. Specifically, the diversity index of a genotype set S is computed as follows:
Calculate the ratio of the number of genotypes in T that are accessible from S, to the total number of genotypes that are accessible from S.
Square this ratio.
Repeat this process for all genotype set pairs (S, T), summing up along the way.
The diversity index of S is one minus this sum.
Computational routines for diversity index are based on [2].
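A minimal sketch of these steps follows, assuming single-letter substitutions as mutations and counting as "accessible from S" the genotypes outside S that are one mutation away from some genotype in S. The function names are hypothetical, and returning `None` when nothing is accessible is a choice made for this sketch only.

```python
def hamming_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def accessible_counts(home, genotype_sets):
    """Number of genotypes in each other set T that are one mutation away from `home`."""
    source = genotype_sets[home]
    return {
        name: sum(any(hamming_one(s, t) for s in source) for t in members - source)
        for name, members in genotype_sets.items()
        if name != home
    }


def diversity_index(home, genotype_sets):
    """One minus the sum of squared ratios over all target sets T."""
    counts = accessible_counts(home, genotype_sets)
    total = sum(counts.values())
    if total == 0:
        return None  # undefined when no genotype is accessible
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


sets = {"A": {"AAA"}, "B": {"AAC", "ACC", "CCC"}, "C": {"CAA"}}
print(diversity_index("A", sets))
```

With two equally likely target sets the index is 1 - (0.25 + 0.25); with a single target set it drops to zero.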
Structure
The last two decades of research in network science have produced a wealth of measures for describing the structure of networks. The Genonets Server includes many of these analyses. They can be selected from the list of analyses in the input form, resulting in measures at the level of individual genotypes and genotype sets.
Computations performed at the level of the genotype set are:
Number of connected components, i.e., number of genotype networks within a single genotype set
Sizes of all connected components
Size of the giant component, i.e., size of the dominant genotype network
Proportional size of the dominant genotype network
Diameter of the dominant genotype network
Edge density of the dominant genotype network
Average clustering coefficient for the dominant genotype network
Computations performed at the level of genotypes are:
Coreness
Clustering coefficient
Computational routines for structural analysis are described in [5].
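Several of the genotype-set-level measures listed above can be sketched with the standard library alone. The example below builds a genotype network from a set of sequences (assuming edges join sequences that differ by one substitution), finds its connected components by breadth-first search, and reports component sizes, the proportional size of the dominant network, and its edge density. All function names are hypothetical, and the server's own routines may differ.

```python
from collections import deque


def hamming_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_network(genotypes):
    """Adjacency sets for a genotype network over the given sequences."""
    nodes = sorted(genotypes)
    adj = {g: set() for g in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if hamming_one(a, b):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def components(adj):
    """Connected components, largest (dominant) first."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            for w in adj[v] - seen:
                seen.add(w)
                comp.add(w)
                queue.append(w)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def edge_density(adj, comp):
    """Edges present among `comp`, divided by the number of possible edges."""
    n = len(comp)
    edges = sum(len(adj[v] & comp) for v in comp) // 2
    return 0.0 if n < 2 else 2 * edges / (n * (n - 1))


adj = build_network({"AAA", "AAC", "ACC", "GGG"})
comps = components(adj)
print([len(c) for c in comps])          # component sizes
print(len(comps[0]) / len(adj))         # proportional size of the dominant network
print(edge_density(adj, comps[0]))
```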
Overlap
Since some genotypes belong to more than one genotype set, genotype networks sometimes overlap. By selecting overlap from the list of analyses in the input form, the Genonets Server will characterize these regions of overlap for all pairs of genotype sets. Specifically, for each pair of genotype sets (S, T) available in the input data, this analysis calculates the number of genotypes that are common to both genotype sets S and T.
Overlap analysis can be selected from the list of analyses in the input form.
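The pairwise count amounts to a set intersection per pair of genotype sets. A minimal sketch, with a hypothetical function name and toy data:

```python
from itertools import combinations


def overlap_counts(genotype_sets):
    """Number of shared genotypes for every unordered pair of genotype sets."""
    return {
        (s, t): len(genotype_sets[s] & genotype_sets[t])
        for s, t in combinations(sorted(genotype_sets), 2)
    }


sets = {"A": {"AAA", "AAC"}, "B": {"AAC", "ACC"}, "C": {"GGG"}}
print(overlap_counts(sets))
```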
Epistasis
Epistasis – non-additive interactions between mutations – can impose severe constraints on molecular evolution because the mutations that are beneficial in one genetic background may be deleterious in another. Epistasis can be classified as magnitude, simple sign, or reciprocal sign epistasis depending on the sign (i.e., positive or negative) of the individual mutations and of the mutations in combination (please see [3] for details). In the Genonets Server, epistasis can be selected from the list of analyses in the input form, resulting in the following calculations:
Identify all squares in the dominant genotype network, as these represent pairs of mutations.
For each square, determine the class of epistasis (magnitude, simple sign, reciprocal sign).
For each epistasis class, calculate the proportion of all squares in the dominant genotype network that belong to this class.
Computational routines for epistasis are based on [3].
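The classification step for a single square can be sketched as follows. This uses the common textbook scheme, which may differ in detail from the routines in [3]: the square consists of a background genotype ab, the two single mutants Ab and aB, and the double mutant AB; a mutation shows sign epistasis if the sign of its effect differs between the two backgrounds. How the noise threshold delta and ties are handled here is an assumption, and the function names are hypothetical.

```python
def sign(x, delta=0.0):
    """Sign of x, treating values within +/- delta of zero as zero."""
    if x > delta:
        return 1
    if x < -delta:
        return -1
    return 0


def classify_square(f_ab, f_Ab, f_aB, f_AB, delta=0.0):
    """Classify epistasis in one square from the scores of its four genotypes."""
    # Deviation of the double mutant from the additive expectation.
    epsilon = f_AB + f_ab - f_Ab - f_aB
    if abs(epsilon) <= delta:
        return "no epistasis"
    # Does each mutation's effect change sign with the genetic background?
    flips_a = sign(f_Ab - f_ab, delta) != sign(f_AB - f_aB, delta)
    flips_b = sign(f_aB - f_ab, delta) != sign(f_AB - f_Ab, delta)
    if flips_a and flips_b:
        return "reciprocal sign"
    if flips_a or flips_b:
        return "simple sign"
    return "magnitude"


print(classify_square(1, 2, 3, 4))    # additive effects
print(classify_square(1, 2, 3, 6))    # same signs, different magnitudes
print(classify_square(1, 2, 3, 2.5))  # one mutation changes sign
print(classify_square(1, 3, 3, 2))    # both mutations change sign
```

Counting each label over all squares of the dominant genotype network and dividing by the number of squares would then give the per-class proportions.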
Peaks
In the input data, the user is required to provide a score for each genotype. Since these scores may reflect a quantitative phenotype that is related to organismal fitness, and because these scores vary amongst the genotypes in a genotype network, one may think of a genotype network as an adaptive landscape [6]. This opens the door to a slew of analyses that characterize the potential for mutation and selection to explore these landscapes. One of these analyses comprises determination of peaks in the landscape.
Peaks can be selected from the list of analyses in the input form, resulting in the determination of the global and all local peaks in the landscape. We refer to the genotype with the highest score in the genotype network as the summit. Please note that even though there can be multiple genotypes within a peak, when referring to the global peak within the Genonets Server documentation, we are in fact referring to the summit.
Computational routines for peaks are based on [3].
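One way to picture peak detection is sketched below. It is an illustration under assumptions, not the method of [3]: a peak is taken to be a connected group of genotypes with identical scores whose outside neighbors all score strictly lower, and the noise threshold delta is ignored. The network is given as a hypothetical adjacency dictionary, and the function name is made up for this example.

```python
def find_peaks(adj, score):
    """Return peaks as sets of genotypes, highest-scoring peak first."""
    seen, peaks = set(), []
    for start in adj:
        if start in seen:
            continue
        # Collect the plateau: connected genotypes sharing the same score.
        plateau, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in plateau and score[w] == score[start]:
                    plateau.add(w)
                    stack.append(w)
        seen |= plateau
        # A plateau is a peak if every neighbor outside it scores lower.
        outside = {w for v in plateau for w in adj[v]} - plateau
        if all(score[w] < score[start] for w in outside):
            peaks.append(plateau)
    peaks.sort(key=lambda p: max(score[g] for g in p), reverse=True)
    return peaks


# Toy linear network a - b - c - d - e with hypothetical scores.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
score = {"a": 3, "b": 1, "c": 2, "d": 5, "e": 5}
peaks = find_peaks(adj, score)
summit = max(score, key=score.get)
print(peaks, summit)
```

The first entry of the returned list plays the role of the global peak; the summit is a genotype with the highest score.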
Paths
Another analysis where the genotype network is considered an adaptive landscape [6] (see the introduction to Peaks analysis above) is the computation of accessible mutational paths.
Paths can be selected from the list of analyses in the input form. This analysis involves computing all accessible mutational paths from each genotype in the network to the summit. A path is accessible if and only if the scores for the genotypes on the path increase monotonically (plus or minus the user-supplied parameter delta), from the source genotype to the target genotype.
Computational routines for paths are based on [3].
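The accessibility criterion can be illustrated with a small depth-first enumeration. The sketch makes assumptions the text does not settle: it enumerates all simple paths (no genotype visited twice) rather than, say, only shortest ones, and it treats a step from v to w as allowed when score[w] + delta > score[v]. The adjacency dictionary, scores, and function name are hypothetical.

```python
def accessible_paths(adj, score, source, summit, delta=0.0):
    """List simple paths from source to summit whose scores never drop by delta or more."""
    paths = []

    def extend(path):
        v = path[-1]
        if v == summit:
            paths.append(list(path))
            return
        for w in sorted(adj[v]):
            # Allow the step only if the score increases, up to the tolerance delta.
            if w not in path and score[w] + delta > score[v]:
                path.append(w)
                extend(path)
                path.pop()

    extend([source])
    return paths


# Toy square AA - AC - CC - CA - AA with CC as the summit.
adj = {"AA": {"AC", "CA"}, "AC": {"AA", "CC"}, "CA": {"AA", "CC"}, "CC": {"AC", "CA"}}
score = {"AA": 1.0, "AC": 2.0, "CA": 0.5, "CC": 3.0}
print(accessible_paths(adj, score, "AA", "CC"))
print(accessible_paths(adj, score, "AA", "CC", delta=0.6))
```

With delta = 0 only the route through AC is accessible; a tolerance of 0.6 also admits the route through the lower-scoring CA.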
Mapping of visualization features to analysis types
The following table shows which analysis types are prerequisites for each visualization feature.
| Visualization feature | Required analysis type |
| --- | --- |
| Diameter path | Structure |
| Landscape view | Paths, Peaks |
| Path epistasis | Paths, Epistasis |
| Squares: All | Epistasis |
| Squares: No epistasis | Epistasis |
| Squares: Magnitude epistasis | Epistasis |
| Squares: Simple sign epistasis | Epistasis |
| Squares: Reciprocal sign epistasis | Epistasis |
| Overlap target sets | Overlap |
| Epistasis types: bar plot | Epistasis |
| Paths to summit | Paths |
| Highlight in landscape view | Paths, Epistasis |
Genotype set parameters
The following table describes the columns in the phenotype network table in the visualization, as well as the attributes in the Genotype_set_measures.txt results file.
| Attribute (Visualization) | Attribute (Genotype_set_measures.txt) | Description |
| --- | --- | --- |
| Name | Genotype_set | Name of the genotype set. |
| Accessibility | Accessibility | Accessibility value computed for the genotype set. Please refer to the accessibility analysis description for further details. |
| Neighbor abundance | Neighbor_abundance | Neighbor abundance value computed for the genotype set. Please refer to the neighbor abundance analysis description for further details. |
| Diversity index | Diversity_index | Diversity index value computed for the genotype set. Please refer to the diversity index analysis description for further details. |
| Robustness | Robustness | Average robustness value computed for the genotype set. Please refer to the robustness analysis description for further details. |
| Evolvability | Evolvability | Phenotype evolvability value computed for the genotype set. Please refer to the evolvability analysis description for further details. |
| Evolvability targets | Evolvability_targets | List of genotype sets accessible from this genotype set by a single mutation. |
| Summit | n/a | ID of the vertex in the corresponding genotype network that represents the genotype with the highest score. |
| n/a | Peaks | Dictionary, where the key is the peak ID, and the value is a list of genotypes in the peak. Peak ID '0' is always the global peak. |
| Number of peaks | Number_of_peaks | Total number of peaks in the genotype network, including the global peak. |
| Number of squares | Number_of_squares | Total number of squares in the genotype network. |
| Magnitude epistasis | Magnitude_epistasis | Ratio of the number of squares characterized by magnitude epistasis, to the total number of squares in the genotype network. |
| Simple sign epistasis | Simple_sign_epistasis | Ratio of the number of squares characterized by simple sign epistasis, to the total number of squares in the genotype network. |
| Reciprocal sign epistasis | Reciprocal_sign_epistasis | Ratio of the number of squares characterized by reciprocal sign epistasis, to the total number of squares in the genotype network. |
| Diameter | Diameter | Diameter of the genotype network. |
| Assortativity | Assortativity | Assortativity of the genotype network. |
| Edge density | Edge_density | Edge density of the genotype network. |
| Number of genotype networks | Number_of_genotype_networks | Number of genotype networks in the genotype set. |
| Genotype network sizes | Genotype_network_sizes | A list of sizes of all genotype networks in the genotype set. |
| Size of dominant genotype network | Size_of_dominant_genotype_network | Size of the dominant genotype network. |
| Proportional size of dominant network | Proportional_size_of_dominant_genotype_network | Proportion of the total number of genotypes in the genotype set that are in the dominant network. |
| Average clustering coefficient | Average_clustering_coefficient_of_dominant_genotype_network | Average clustering coefficient computed for the dominant genotype network. |
| Ratio of overlapping genotype sets | Ratio_of_overlapping_genotype_sets | Ratio of the number of overlapping genotype sets to the total number of genotype sets under consideration. |
| Overlapping genotype sets | Overlapping_genotype_sets | List of names of genotype sets, where each genotype set has at least one genotype in common with this genotype set. |
Genotype parameters
The following table describes the columns in the genotype network table in the visualization, as well as the attributes in the <>_genotype_measures.txt results file.
| Attribute (Visualization) | Attribute (<>_genotype_measures.txt) | Description |
| --- | --- | --- |
| Vertex ID | n/a | ID of the vertex that corresponds to the genotype. |
| Genotype | Sequence | The genotype. |
| Score | n/a | Score value corresponding to the genotype, as read from the input file. |
| Robustness | Robustness | Genotype robustness. Please refer to the robustness analysis description for further details. |
| Evolvability | Evolvability | Genotype evolvability. Please refer to the evolvability analysis description for further details. |
| Evolvability targets | Evolvability_targets | Dictionary, where the key is the genotype set name, and the value is a list of genotypes in the genotype set to which the focal genotype can evolve. |
| Evolves to genotypes in | n/a | List of names of the genotype sets to which this genotype can evolve. |
| Overlap with genotypes in | Overlaps_with_genotypes_in | List of names of the genotype sets which also contain this genotype. |
| Coreness | Coreness | Coreness is an alternative measure of mutational robustness. |
| Clustering coefficient | Clustering_coefficient | The clustering coefficient measures the proportion of a vertex’s neighbors that are neighbors themselves. |
| Distance from summit | Distance from Summit | The number of edges between this genotype and the summit. |
| Accessible paths through | Accessible_paths_through | The number of accessible mutational paths that pass through this genotype. Please note that this includes the paths of which the genotype is a starting or ending vertex. |
References
[1] Andreas Wagner. Robustness and evolvability: a paradox resolved. Proc. R. Soc. B 2008 275 91-100; DOI: 10.1098/rspb.2007.1137. Published 7 January 2008.
[2] Cowperthwaite MC, Economo EP, Harcombe WR, Miller EL, Meyers LA (2008) The Ascent of the Abundant: How Mutational Networks Constrain Evolution. PLoS Comput Biol 4(7): e1000110. doi:10.1371/journal.pcbi.1000110
[3] Jose Aguilar Rodriguez, Joshua L. Payne, Andreas Wagner. One thousand adaptive landscapes and their navigability. In review.
[4] Andreas Wagner. Neutralism and selectionism: a network-based reconciliation. Nature Reviews Genetics 9, 965-974 (December 2008).
[5] Mark Newman. Networks: An Introduction. Oxford University Press, Inc., New York, NY, USA. (2010).
[6] Sewall Wright. The roles of mutation, inbreeding, crossbreeding and selection in evolution. In Proc. Sixth Int. Congr. Genet. 356–366 (1932).