From Time to Space Recurrences in Biopolymers

Recurrence Quantification Analysis

Part of the book series: Understanding Complex Systems ((UCS))

Abstract

This chapter introduces the application of recurrence-based techniques to biopolymers, with an emphasis on the differences between the analysis of strings whose information content is mainly logical (DNA) and strings whose information content is mainly chemico-physical (proteins). The unique features of RQA when applied to systems in which spatial order (sequence) takes the place of time are described, highlighting the emergence of ‘time distortions’. This is a metaphorical term stressing the fact that a one-dimensional array of amino acid residues (sequence), besides being formally identical to a discrete time series, is a physical object that folds in ordinary three-dimensional space. This behavior shows that RQA, as an analytical tool, is flexible enough to deal with complex networks in either the spatial or the temporal dimension. The comparison of DNA sequences with text strings helps to shed light on the particular nature of biological information coding, as well as on the role of RQA in bioinformatics and computational biology.

Notes

  1. Satellite DNA is the main component of functional centromeres and forms the main structural constituent of heterochromatin, i.e. the densely packed, non-expressed part of the DNA molecule. The name “satellite DNA” refers to the way repetitions tend to skew the frequencies of the nucleotides adenine, cytosine, guanine and thymine, giving these regions a density different from that of bulk DNA, so that they form a second or ‘satellite’ band when genomic DNA is separated on a density gradient [28].

References

  1. J.P. Eckmann, S.O. Kamphorst, D. Ruelle, Recurrence plots of dynamical systems. Europhys. Lett. 4, 973–977 (1987)

  2. C.L. Webber Jr., J.P. Zbilut, Dynamical assessment of physiological systems and states using recurrence plot strategies. J. Appl. Physiol. 76, 965–973 (1994)

  3. N. Marwan, N. Wessel, U. Meyerfeldt, A. Schirdewan, J. Kurths, Recurrence plot based measures of complexity and its application to heart rate variability data. Phys. Rev. E 66, 026702 (2002)

  4. N. Marwan, M.C. Romano, M. Thiel, J. Kurths, Recurrence plots for the analysis of complex systems. Phys. Rep. 438, 237–329 (2007)

  5. D.B. Vasconcelos, S.R. Lopes, R.L. Viana, J. Kurths, Spatial recurrence plots. Phys. Rev. E 73, 056207 (2006)

  6. A. Giuliani, R. Benigni, J.P. Zbilut, C.L. Webber Jr., P. Sirabella, A. Colosimo, Nonlinear signal analysis methods in the elucidation of protein sequence structure relationships. Chem. Rev. 102, 1471–1491 (2002)

  7. G. Oliva, L. Di Paola, A. Giuliani, F. Pascucci, R. Setola, Assessing protein resilience via a complex network approach, in Network Science Workshop (NSW), 2013 IEEE 2nd (IEEE, 2013), pp. 131–137

  8. L. Di Paola, M. De Ruvo, P. Paci, D. Santoni, A. Giuliani, Protein contact networks: an emerging paradigm in chemistry. Chem. Rev. 113, 1598–1613 (2013)

  9. C.L. Webber Jr., A. Giuliani, J.P. Zbilut, A. Colosimo, Elucidating protein secondary structures using alpha carbon recurrence quantifications. Proteins Struct. Funct. Genet. 44, 292–303 (2001)

  10. M. De Ruvo, A. Giuliani, P. Paci, D. Santoni, L. Di Paola, Shedding light on protein-ligand binding by graph theory: the topological nature of allostery. Biophys. Chem. 165–166, 21–29 (2012)

  11. S. Vishveshwara, K. Brinda, N. Kannan, Protein structure: insights from graph theory. J. Theor. Comput. Chem. 1, 187–212 (2002)

  12. C. Hansch, D. Hoekman, H. Gao, Comparative QSAR: toward a deeper understanding of chemico-biological interactions. Chem. Rev. 96, 1045–1075 (1996)

  13. S. Miyazawa, R.L. Jernigan, Estimation of effective inter-residue contact energies from protein crystal structure: quasi-chemical approximation. Macromolecules 18, 534–552 (1985)

  14. J. Kyte, R.F. Doolittle, A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132 (1982)

  15. A. Porrello, S. Soddu, J.P. Zbilut, M. Crescenzi, A. Giuliani, Discrimination of single aminoacid mutations of the p53 protein by means of recurrence quantification analysis. Proteins Struct. Funct. Bioinf. 55, 743–755 (2004)

  16. A. Giuliani, R. Benigni, P. Sirabella, J.P. Zbilut, A. Colosimo, Nonlinear methods in the analysis of protein sequences: a case study in rubredoxins. Biophys. J. 78, 136–149 (2000)

  17. S. Soddu, G. Blandino, R. Scardigli, R. Martinelli, M.G. Rizzo, M. Crescenzi, A. Sacchi, Wild-type p53 induces diverse effects in 32D cells expressing different oncogenes. Mol. Cell. Biol. 16, 487–495 (1996)

  18. T. Soussi, Y. Legros, R. Lubin, K. Ory, B. Schlichtholz, Multifactorial analysis of p53 alteration in human cancer: a review. Int. J. Cancer 57, 1–9 (1994)

  19. H.J. Jeffrey, Chaos game representation of gene structure. Nucleic Acids Res. 18, 2163–2170 (1990)

  20. O.C. Kulkarni, R. Vigneshwar, V.K. Jayaraman, B.D. Kulkarni, Identification of coding and noncoding sequences using local Hölder exponent formalism. Bioinformatics 21, 3818–3822 (2005)

  21. R.N. Mantegna, S.V. Buldyrev, A.L. Goldberger, S. Havlin, C.K. Peng, M. Simons, H.E. Stanley, Linguistic features of noncoding DNA sequences. Phys. Rev. Lett. 73, 3169–3175 (1994)

  22. E.A. Feingold, P.J. Good, M.S. Guyer, S. Kamholz, L. Liefer, K. Wetterstrand, F.S. Collins et al., The ENCODE (ENCyclopedia Of DNA Elements) project. Science 306, 636–640 (2004)

  23. J.O. Andersson, S.G. Andersson, Pseudogenes, junk DNA, and the dynamics of Rickettsia genomes. Mol. Biol. Evol. 18(5), 829–839 (2001)

  24. C. Frontali, E. Pizzi, Similarity in oligonucleotide usage in introns and intergenic regions contributes to long-range correlation in the Caenorhabditis elegans genome. Gene 232, 87–95 (1999)

  25. E. Bultrini, E. Pizzi, P. Del Giudice, C. Frontali, Pentamer vocabularies characterizing introns and intron-like intergenic tracts from Caenorhabditis elegans and Drosophila melanogaster. Gene 304, 183–192 (2003)

  26. F. Orsucci, A. Giuliani, C.L. Webber Jr., J.P. Zbilut, P. Fonagy, M. Mazza, Combinatorics and synchronization in natural semiotics. Physica A 361, 665–676 (2006)

  27. G. Leonardi, The study of language and conversation with recurrence analysis methods. Psychol. Lang. Commun. 16, 165–183 (2012)

  28. B. John, G.L. Miklos, Functional aspects of satellite DNA and heterochromatin. Int. Rev. Cytol. 58, 1–114 (1979)

  29. M.A. Montemurro, Beyond the Zipf-Mandelbrot law in quantitative linguistics. Physica A 300, 567–578 (2001)

  30. C.L. Webber Jr., J.P. Zbilut, Recurrence quantification analysis of nonlinear dynamical systems, in Tutorials in Contemporary Nonlinear Methods for the Behavioral Sciences, Chap. 2 (National Science Foundation, Washington, DC, 2005) pp. 26–94

Author information

Corresponding author

Correspondence to Alfredo Colosimo.

Appendices

Appendix 1: Cryptography

Cryptography was a strategically crucial discipline during the Second World War. The decipherment of encrypted messages (as in the Allies' cracking of the German code generated by the Enigma system) rested on the notion that any human language, despite its apparent randomness and arbitrariness, is endowed with regularities of various kinds (e.g. the relative abundance of words of a given length, the juxtaposition of pairs of symbols, etc.), and that no masking code can obscure these code-independent features of the original language. Such features are supposed to derive from general invariants common to all languages, like the so-called Zipf's law [29], which states that the frequency of occurrence of words in any sufficiently long text, written in any language, is negatively correlated with the number of letters according to a power law (Fig. 5.16).

Fig. 5.16

Scaling of the frequency of occurrence of words with word length, as computed from different book collections

Figure 5.16 reports, on a double-logarithmic scale, the strictly invariant relation between frequency of occurrence and word length in different book collections. Such regularities are clearly independent of the rich semantic information present in the analyzed books: the observed scaling comes from global constraints linked to the general features of human languages. The fact that these general features are largely content-independent was considered very important by many investigators involved in the analysis of DNA sequences: it allowed them to set aside the specific (and extremely heterogeneous) functions of different patches and to concentrate on the global statistical features of the billions-of-letters DNA text.
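The kind of scaling described above can be explored with a minimal sketch such as the following. It is not the procedure used to produce Fig. 5.16: the function names `length_frequencies` and `loglog_slope` are hypothetical, the tokenization rule (runs of letters count as words) is an assumption, and the sample sentence is far too short to support any claim about a power law. The sketch only shows how a slope on a double-logarithmic scale would be estimated from a text.

```python
import math
import re
from collections import Counter


def length_frequencies(text):
    """Return {word_length: relative frequency} for the words in text.

    Assumption: a "word" is any maximal run of letters.
    """
    words = re.findall(r"[^\W\d_]+", text)
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(word length).

    Requires at least two distinct word lengths.
    """
    xs = [math.log(length) for length in freqs]
    ys = [math.log(f) for f in freqs.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy input only; a real analysis needs a whole book collection.
sample = "Midway upon the journey of our life I found myself within a forest dark"
freqs = length_frequencies(sample)
print(freqs)
print(loglog_slope(freqs))
```

A negative slope that stays roughly constant across large, unrelated corpora would correspond to the content-independent regularity discussed in the text.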

Appendix 2: Strings from Human Languages

2.1 (A) Dante Alighieri - Inferno - I Canto (tercets 1–3) - FILTERED

Nelmezzodelcammindinostravitamiritrovaiperunaselvaoscuracheladirittaviaerasmarrita Ahiquantoadirqualeraecosaduraestaselvaselvaggiaeaspraefortechenelpensierrinovalapaura Tanteamarachepocoepiumortemapertrattardelbenchivitrovaidirodellaltrecosechivhoscorte

2.2 (B) Dante Alighieri - Inferno - I Canto (tercets 1–3) - NONFILTERED (English Translation by Henry Wadsworth Longfellow)

Midway upon the journey of our life

I found myself within a forest dark

For the straightforward pathway had been lost.

Ah me! how hard a thing it is to say

What was this forest savage, rough, and stern

Which in the very thought renews the fear.

So bitter is it, death is little more;

But of the good to treat, which there I found,

Speak will I of the other things I saw there.

2.3 (C) Dr. Seuss Poem - NONFILTERED

I do not like eggs in the file.

I do not like them in any style.

I will not take them fried or boiled.

I will not take them poached or broiled.

I will not take them soft or scrambled,

Despite an argument well-rambled.

No fan I am of the egg at hand.

Destroy that egg! Today! Today!

Today I say!

Without delay!

(A), (B) and (C) refer to the strings from spoken languages whose % Det is shown in Fig. 5.14. Notice that in all cases RQA was applied after filtering the original texts as indicated in [30]; the filtered text is shown only for (A). (A) and (B) show the first three of the 45 tercets analyzed in Fig. 5.14.
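As a rough illustration of how a determinism percentage can be obtained from a filtered text string, the sketch below makes several assumptions that may differ from the settings behind Fig. 5.14: the filter simply keeps letters and lowercases them (the exact procedure is the one given in [30]), two positions are recurrent when they carry the same symbol (embedding dimension 1, radius 0), and a diagonal line needs at least `lmin = 2` points. The names `filter_text` and `percent_det` are hypothetical.

```python
def filter_text(text):
    """Keep letters only and lowercase them (assumed filter, see lead-in)."""
    return "".join(ch.lower() for ch in text if ch.isalpha())


def percent_det(symbols, lmin=2):
    """Percentage of recurrent points lying on diagonal lines of >= lmin points.

    Two positions i and j are recurrent when symbols[i] == symbols[j].
    The main diagonal is excluded, and because the recurrence matrix is
    symmetric only the upper triangle is scanned.
    """
    n = len(symbols)
    recurrent = 0  # all recurrent points in the upper triangle
    in_lines = 0   # recurrent points belonging to long-enough diagonals
    for offset in range(1, n):
        run = 0  # length of the current diagonal segment
        for i in range(n - offset):
            if symbols[i] == symbols[i + offset]:
                recurrent += 1
                run += 1
            else:
                if run >= lmin:
                    in_lines += run
                run = 0
        if run >= lmin:  # close a segment that reaches the matrix edge
            in_lines += run
    return 100.0 * in_lines / recurrent if recurrent else 0.0


poem = "I do not like eggs in the file. I do not like them in any style."
print(percent_det(filter_text(poem)))
```

The double loop costs on the order of n² comparisons, which is acceptable for strings of a few thousand symbols such as those in these appendices.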

Appendix 3: Nucleotide Strings

3.1 Satellite DNA 1 - GenBank: BI067039.1 Homo sapiens Genomic Region Containing Hypervariable Minisatellites, mRNA Sequence

GTCCTCCGCCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCTCCACACTTATGGGGCACAGCCCACACTTCTGGTCCTCTGCCCCACACTTATGGGGCACAGTTGGGTGTTCTGCCCCACACTTATGGGGCACAGACAGCAGTTCCGGACCTCCACCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCCCCACACTTATGGGGCAGAACCCACAGTTTTGGTCCTCCGCTCCACACTTATGGGGCACAACAACCCACAGTTATGGGGCTTATGAGGTTCTGCCCCACACTTATGGGGCACAGACAGCAGTTCTGGTCCTCCGCCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCTCCACACTTATGGGGCACAGCCCACACTTCTGGTCCTCTGCCCCACACTTATGGGGCACAGCTGGGGGTCCTACCCCACACTTATGGGGCAGAACCCACAGTTCCGGTCCTCCACCCCACACTTATGGGGCACAGCTGGGGATTCTGTGCCACACTTATGGGGCAGAACCCACAGTTCCGGCCCTCCGCCCCACACTTATGGGGCAGNNCNNGCNGNNCGGG

3.2 Satellite DNA 2 - GenBank: BM439581.1 Homo sapiens Genomic Region Containing Hypervariable Minisatellites, mRNA Sequence

GGCACAGCTGGGGATTCTGCCCCACACTTATGCGGCACAACCCACAGTTCTGGTCCTCTCCCCCACACTTATGGGGCACAACAACCCACAGTTATGGGGCTTATGAGGTTCTGCCCCACACTTACGGGGCACAGACAGCAGTTCCAGTCCTCCGCCCCACACTTATGGGGCAGAACCCACAATTCCGGACCTCTGCCCCACACTTACGGGGCACAGCTGGGGATTCTGCCCCACACTTATGGGGCACAACCCACAGTTCTGGTCCTCTCCCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCCCCACACTTAGGGAGCAGAACCCACACTTCCGGTCCTCCGCCCCACACTTATGGGGCACAACAACCCACAGTTATGGGGCCTATGAGGTTCTGCCCCACACTTATGGGGCACAGACAGCAGTTCCGGACCTCTGCCCCACACTTATGGGGCACAGTTGGGGGTCCTACCCCACACTTATGGGGCAGAACCCACAGTTCCGGACCTCCGCCCCACACTTATGGGGCAGAACCCACACTTCCGNACCTCTGCCCCACACTTATGGGGCACA
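The repetitive character of these strings can be made visible with a simple k-mer count. The following is a minimal sketch and not part of the chapter's analysis: `kmer_counts` is a hypothetical helper, the choice `k = 8` is arbitrary, and `prefix` is only the opening stretch of string 3.1, copied here for illustration. Windows containing ambiguous bases such as `N` are skipped.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


# Opening stretch of string 3.1 (illustrative excerpt only).
prefix = "GTCCTCCGCCCCACACTTATGGGGCAGAACCCACACTTCCGGTCCTCCGCTCCACACTTATGGGGCA"

counts = kmer_counts(prefix, 8)
for kmer, n in counts.most_common(3):
    print(kmer, n)
```

Running the same count on the full strings, or on a non-repetitive sequence of equal length, would show how strongly a few motifs dominate satellite DNA.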

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Colosimo, A., Giuliani, A. (2015). From Time to Space Recurrences in Biopolymers. In: Webber, Jr., C., Marwan, N. (eds) Recurrence Quantification Analysis. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-07155-8_5