Extended documentation

This page describes some of CytoGEDEVO's functionality in more detail, and explains some background as well.

Alignment import

CytoGEDEVO supports importing alignments in two formats: a simple two-column text format without a header line, and a multi-column text format with a header line.
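To illustrate the simpler of the two formats, here is a minimal sketch of a reader for a headerless two-column file. It is not CytoGEDEVO's own importer: the function name is hypothetical, and the sketch assumes that columns are separated by whitespace and that each line holds one node from each network.

```python
from typing import Iterable, List, Tuple


def parse_two_column_alignment(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse headerless two-column text into (node_a, node_b) pairs.

    Hypothetical helper; the whitespace delimiter is an assumption.
    """
    pairs: List[Tuple[str, str]] = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs
```

Since the function accepts any iterable of strings, an open file handle can be passed directly, e.g. `parse_two_column_alignment(open(path))`.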


There are two ways of importing an alignment:

Resuming an alignment

CytoGEDEVO can resume existing alignments. Given an existing alignment (either imported or performed earlier), the procedure is the same as for performing a new one:

CytoGEDEVO checks whether gedevoSourceNetworks exists in a network's table. If it does, the network is considered an aligned network that is eligible for continuing.
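The check described above amounts to a simple presence test. The following sketch models a network's table as a plain mapping from column name to value; that representation and the function name are assumptions for illustration and do not reflect the Cytoscape table API.

```python
from typing import Any, Mapping

# Column name taken from the text above.
SOURCE_NETWORKS_KEY = "gedevoSourceNetworks"


def is_aligned_network(network_table: Mapping[str, Any]) -> bool:
    """Return True if the table carries the source-network entry.

    Hypothetical helper: only the presence of the entry is tested,
    not its contents.
    """
    return SOURCE_NETWORKS_KEY in network_table
```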

The source network names are used to populate certain UI controls and should not be changed by hand.

E.g. in the custom data panel:


Aligned node pairs are collected from the following sources, in order of priority:

The resulting list of pairs is then used to continue the alignment from this point.

Note that drawing a mapping edge between two nodes will not change the gedevoPartnerUID and gedevoPartnerName columns!

The easiest way to update these two columns after adding/changing mapping edges is to perform a subsequent re-alignment and have it finish immediately.

Mapping edges remain present after a re-alignment finishes, but some may be removed if the new alignment conflicts with the old configuration (i.e. if a pair has been broken and re-assigned, its mapping edge is removed).

Fixed mapping edges (and their corresponding node pairs) are never broken during a re-alignment, but they can be unfixed with the corresponding button (or simply deleted).
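The behaviour of mapping edges across a re-alignment can be sketched roughly as follows. This is an illustration only, not the plugin's code: the MappingEdge class, its fields, and the dictionary representation of the new alignment are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MappingEdge:
    """Hypothetical stand-in for a mapping edge between two networks."""
    source: str          # node in the first network
    target: str          # node in the second network
    fixed: bool = False  # fixed edges are never broken


def surviving_mapping_edges(
    edges: List[MappingEdge], new_alignment: Dict[str, str]
) -> List[MappingEdge]:
    """Keep edges that are fixed or still agree with the new alignment."""
    kept = []
    for edge in edges:
        still_paired = new_alignment.get(edge.source) == edge.target
        if edge.fixed or still_paired:
            kept.append(edge)
        # otherwise the pair was broken and re-assigned: the edge is dropped
    return kept
```

For example, an edge a-x survives if the new alignment still maps a to x, while an unfixed edge b-y is dropped once b has been re-assigned to another node.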

External data files

Any number of external data files can be imported (as long as your machine has sufficient RAM to hold the resulting matrices in memory).

Each matrix to import is configured like this:

There is usually no reason to specify anything other than Autodetect. If you know which column corresponds to which network, setting this explicitly saves some startup time because the autodetection is somewhat costly, although it takes less than 20 seconds on 2009-class hardware.

Data model is the type of score this matrix contains (distance or similarity).

Value range is how to interpret the score:

Manual/assisted alignment

There are two ways to align parts of networks by hand:

The second part of resuming an alignment explains how these mapping edges are applied when combined with the table data.

[ * If someone really wants a button that does this faster, drop me an email. -- Max]

Visualization

CytoGEDEVO uses Cytoscape's visualization capabilities. The quick visualization panel exists as a shortcut to apply the most often used coloring schemes to an aligned network pair.

The Avoid red/green checkbox supports people with color blindness, who might otherwise have trouble with the default coloring.

All buttons with a [+] on them let you select one column to be interpreted as the respective score type and then change the network style accordingly. Since it is not recorded which column represents which score type (distance, similarity, or anything else), you need to take care of selecting the right column yourself. Also see the score columns documentation.

The Graph Edits and CCS buttons also calculate their respective measures and add additional data to node and edge tables:

Important: These values are not recalculated automatically! If you perform a re-alignment, you need to click these buttons again; otherwise the table values and coloring will be out of date!

Scoring

The core of GEDEVO is an evolutionary algorithm that represents solutions as individuals. Each individual has a list of node pairs that it thinks should be aligned.

Due to its nature, there are two different types of scores in use:

This is why all scores have two weights, one for their pair contribution, and one for their overall contribution to the fitness.
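As a rough illustration of the two-weight idea, the sketch below gives every score one weight for its contribution to a single node pair and another for its contribution to the fitness of a whole individual. It is not GEDEVO's actual scoring code: the class, the function names, the weighted-average combination, and the use of the mean pair score as the per-individual value are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Pair = Tuple[str, str]  # (node in network 1, node in network 2)


@dataclass
class WeightedScore:
    """Hypothetical score with the two weights described above."""
    name: str
    pair_score: Callable[[Pair], float]  # score for one node pair
    pair_weight: float     # weight of the pair contribution
    fitness_weight: float  # weight of the overall fitness contribution


def combined_pair_score(scores: Sequence[WeightedScore], pair: Pair) -> float:
    """Weighted average of all scores for one pair (assumed combination)."""
    total = sum(s.pair_weight for s in scores)
    if total == 0:
        return 0.0
    return sum(s.pair_weight * s.pair_score(pair) for s in scores) / total


def fitness(scores: Sequence[WeightedScore], pairs: Sequence[Pair]) -> float:
    """Fitness of one individual, i.e. of its whole list of node pairs.

    Assumption: each score's overall value is its mean over all pairs.
    """
    total = sum(s.fitness_weight for s in scores)
    if total == 0 or not pairs:
        return 0.0
    weighted = 0.0
    for s in scores:
        mean_value = sum(s.pair_score(p) for p in pairs) / len(pairs)
        weighted += s.fitness_weight * mean_value
    return weighted / total
```

Keeping the two weights separate means a score can matter a great deal when judging individual pairs while having little or no influence on how whole individuals are ranked, and vice versa.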