(Left) Pictorial description of how the plot to the
right was generated. A protein is represented by a circle. Assume that
there are two super-families, identified by the two different colours,
blue (solid) and black (pattern). For each protein in turn, we computed
the distance to the closest protein of the same colour (used for the
red plot) and the distance to the closest protein of a different colour
(used for the green plot). In the figure, the distances used for one of
the blue proteins are shown.
(Right) Distribution of minimum E-values
within (red) and across (green) super-families in Astral-95, for E-values
between 1e-80 and 100.
[from A. Paccanaro, J. A. Casbon, M. A. S. Saqi (2006). Spectral
Clustering of Protein Sequences. Nucleic Acids Research,
2006 Mar 17;34(5):1571-80].
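
The procedure described for the left panel can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' code: the function name `min_distances`, the toy labels, and the toy distance matrix are all made up here, and the sketch assumes a symmetric pairwise matrix in which the "distance" between two proteins is their E-value.

```python
import math


def min_distances(distances, labels):
    """For each item, find the smallest distance to another item with the
    same label and the smallest distance to an item with a different label.

    `distances` is a square matrix (list of lists) and `labels` gives the
    super-family of each item. Both names are hypothetical.
    """
    n = len(labels)
    within, across = [], []
    for i in range(n):
        same = math.inf  # stays inf if no other item shares the label
        diff = math.inf  # stays inf if all items share one label
        for j in range(n):
            if i == j:
                continue  # a protein is not compared with itself
            d = distances[i][j]
            if labels[i] == labels[j]:
                same = min(same, d)
            else:
                diff = min(diff, d)
        within.append(same)
        across.append(diff)
    return within, across


# Toy data, invented for illustration only.
labels = ["blue", "blue", "black", "black"]
distances = [
    [0.0, 1e-50, 5.0, 20.0],
    [1e-50, 0.0, 0.5, 3.0],
    [5.0, 0.5, 0.0, 1e-30],
    [20.0, 3.0, 1e-30, 0.0],
]

within, across = min_distances(distances, labels)
print(within)  # values that would feed the "within" (red) distribution
print(across)  # values that would feed the "across" (green) distribution
```

Histogramming `within` and `across` separately (on a logarithmic axis, given the range of E-values) would give two curves analogous to the red and green ones in the right panel.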