The NH2-terminal propeptides of fibrillar collagens: highly conserved domains with poorly understood functions
Introduction
Although the existence of non-triple helical regions or domains in collagens is now considered commonplace, this aspect of collagen structure was not appreciated some 30 years ago. At that time, known sequences that were not composed of repeating Gly-X-Y triplets were limited to the telo- or end-peptides of the α1 and α2 chains of type I collagen. Today, it is recognized that all of the 21 or more described collagen types contain ‘non-collagenous’ sequences, many of which separate triple helical segments. Sound experimental evidence for a biosynthetic precursor of type I collagen, termed procollagen, containing non-collagenous sequences, was first published by a number of different laboratories in 1971 (Bellamy and Bornstein, 1971, Layman et al., 1971, Jimenez et al., 1971, Lenaers et al., 1971, Stark et al., 1971). During the course of the following decade, the primary structures of types I, II and III procollagens were determined and a general molecular plan for these proteins was established. This plan consists of a central uninterrupted triple helix and NH2- and COOH-terminal propeptides that differ considerably in amino acid composition and sequence (see Bornstein, 1974, Martin et al., 1975 and Bornstein and Traub, 1979 for details and references to the earlier literature). Types V and XI procollagens also follow this model, but since the structure of the NH2-terminal (N)-propeptides in these proteins differs considerably from that in procollagens I, II and III (Takahara et al., 1995, Gregory et al., 2000), these proteins will not be considered in this review.
Of the fibrillar procollagens, types I, II and III are converted by procollagen N- and C-proteases to monomeric collagens, which are the direct precursors of collagen fibrils (Prockop and Kivirikko, 1995, Prockop et al., 1998, Hojima et al., 1994, Colige et al., 1997), and type V is partially processed by other enzymes (Unsöld et al., 2001). The non-triple helical regions of a number of other collagens are also subject to partial proteolysis. In the case of type XVIII collagen, a COOH-terminal sequence (endostatin) has been shown to be released enzymatically, both in cell culture and in vivo (Marneros and Olsen, 2001). However, such enzymatic processes, which can reveal putative cryptic functions in the precursor proteins, cannot be considered functionally analogous to the conversion of procollagens to collagens. Thus far, only fibrillar procollagens have been shown to require limited proteolysis to achieve a mature, functional state.
This review will compare the primary structures of the cysteine-rich repeats (CRR) in types I, II and III procollagens, which constitute almost the entire globular domains of the N-propeptides in these procollagens (Fig. 1), and will attempt to deduce the functions of these domains from new information obtained for both procollagens and other CRR-containing proteins. Since there is much less published information on the type III N-propeptide, attention will be focused on the N-propeptides of types I and II procollagens. A major conclusion of this review is that few, if any, of the previously proposed functions of the type I N-propeptide, some based on highly credible evidence in vitro, are supported by recent experiments in mice. We will therefore be obliged to rethink the possible functions of this domain in type I procollagen or at least consider the possibility that the mouse may not be an appropriate model for the function of the N-propeptide in other mammals.
Section snippets
The structure and function of cysteine-rich repeats (CRR) in proteins
CRR are found in a wide variety of proteins and are characterized by a signature of 10 cysteines arranged in the sequence CX1CX2CX3CX4CX5CX6CX7CCX8C. The segments X1 to X8 are variable in length, with X1 and X6 the most variable. When the sequences of evolutionarily distant CRR are compared, only a few amino acids other than the cysteines are conserved. These include glycine, tryptophan and proline in X1, and proline in X7.
However, in the N-propeptides of types I–III procollagens,
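The 10-cysteine signature described above can be read literally as a sequence pattern. The following Python sketch shows one way to express it as a regular expression. It rests on assumptions that are not stated in the text: each variable segment X1 to X8 is taken to contain at least one residue and no cysteine, and the function name and toy sequence are hypothetical. It is an illustration of the signature only, not a method for identifying CRR domains in real proteins.

```python
import re

# Assumption (not stated in the text): each variable segment X1..X8 holds
# one or more residues and contains no cysteine.
SEGMENT = r"([^C]+)"

# C X1 C X2 C X3 C X4 C X5 C X6 C X7 CC X8 C  -> 10 cysteines in total.
CRR_PATTERN = re.compile(("C" + SEGMENT) * 7 + "CC" + SEGMENT + "C")


def find_crr_segments(sequence):
    """Return the eight variable segments X1..X8 of the first CRR-like
    match in a one-letter amino acid sequence, or None if there is none."""
    match = CRR_PATTERN.search(sequence.upper())
    return list(match.groups()) if match else None


# Hypothetical toy sequence, assembled only to exercise the pattern;
# it is not a real procollagen sequence.
toy_segments = ["GWP", "AA", "DE", "K", "LL", "TTTT", "P", "NN"]
toy_sequence = (
    "MK"
    + "".join("C" + seg for seg in toy_segments[:7])
    + "CC" + toy_segments[7] + "C"
    + "QR"
)

segments = find_crr_segments(toy_sequence)
print(segments)                      # the eight segments X1..X8
print([len(s) for s in segments])    # their lengths
```

Capturing each segment separately makes it straightforward to compare segment lengths across sequences, for example to check which of X1 to X8 vary most in a given set.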
Amino acid sequences of the CRR in the N-propeptides of procollagens I–III
As shown in Fig. 1, the N-propeptide of the mouse proα1(I) chain is composed of a globular domain and a Gly-X-Y-repeating sequence that interacts with the corresponding sequences in a second proα1(I) chain and a proα2(I) chain to form a short triple helix. However, it should be noted that the proα2(I) chain lacks a CRR domain. In the mouse proα1(I) chain, the CRR constitutes the majority of the globular domain (56 of the 76 amino acids), and exon 2 encodes 65 amino acids that encompass the
Possible functions of the α1(I) collagen N-propeptide
Initial studies documenting the existence of a biosynthetic precursor of type I collagen, termed procollagen (Bellamy and Bornstein, 1971), focused on the presence of N-terminal ‘extensions’ now known as N-propeptides. A number of functions were attributed to these N-propeptides, including initiation of chain association in triple helix formation, inhibition of intracellular fibrillogenesis, and facilitation of both intra- and extracellular transport of procollagen (see Bornstein, 1974 and
Possible functions of the proα1(II) N-propeptide
The existence of a CRR in the proα1(II) N-propeptide was not recognized until well after the sequence of the cDNA encoding the major form of the proα1(II) chain had been determined. Initially, it was thought that the col2a1 gene lacked this sequence and resembled the col1a2 gene in this respect (Su et al., 1989). However, it was subsequently discovered that exon 2 does exist in the col2a1 gene, but is alternatively spliced (Ryan et al., 1990, Ryan and Sandell, 1990) in a pattern that may be
Conclusions and future directions
The possible functions of the N-propeptides of types I, II and III procollagens have remained a subject of interest, and to some extent an enigma, in collagen biology for the past three decades. The generation of mice with a targeted deletion of exon 2 in the col1a1 gene has revealed a phenotype that is unexpectedly mild and is characterized by an apparent absence of defects in procollagen chain synthesis, assembly, secretion, or proteolytic processing. The only abnormality determined thus far
Acknowledgments
Studies from the author's laboratory were supported by Grant AR 11248 from the National Institutes of Health. I thank Helene Sage, Lucas Armstrong, and Mary Lou Augustine for helpful comments on the manuscript.
References (67)
- et al. The chemistry and biology of collagen
- et al. The amino acid sequence encoded by exon 2 of the murine col1a1 gene is not required for normal secretion, processing by procollagen N-proteinase or fibrillogenesis of type I collagen in mice. J. Biol. Chem. (2002)
- et al. Identification of a substrate site for liver transglutaminase on the aminopropeptide of type III collagen. J. Biol. Chem. (1987)
- et al. Human Ehlers-Danlos syndrome type VIIC and bovine dermatosparaxis are caused by mutations in the procollagen I N-proteinase gene. Am. J. Hum. Genet. (1999)
- et al. Folding of carboxyl domain and assembly of procollagen I. J. Biol. Chem. (1986)
- et al. Genomic sequence of mouse COL1A1 encoding the collagen propeptides. Biochim. Biophys. Acta (1993)
- et al. Xenopus chordin and Drosophila short gastrulation genes encode homologous proteins functioning in dorsal-ventral axis formation. Cell (1995)
- et al. Structural organization of distinct domains within the non-collagenous N-terminal region of collagen type XI. J. Biol. Chem. (2000)
- et al. Characterization of type I procollagen N-proteinase from fetal bovine tendon and skin: Purification of the 500-kilodalton form of the enzyme from bovine tendon. J. Biol. Chem. (1994)
- et al. The Xenopus dorsalizing factor noggin ventralizes Drosophila embryos by preventing DPP from activating its receptor. Cell (1996)
- Further evidence for a transport form of collagen: its extrusion and extracellular conversion to tropocollagen in embryonic tendon. FEBS Letts.
- Role of thrombospondin-1-derived peptide, 4N1K, in FGF-2-induced angiogenesis. Exp. Cell Res.
- Biochemical characterization and expression analysis of neural thrombospondin-1-like proteins NELL1 and NELL2. Biochem. Biophys. Res. Commun.
- The CCN family of angiogenic regulators: the integrin connection. Exp. Cell Res.
- Deletion of the pro-alpha1(I) N-propeptide affects secretion of type-I collagen in Chinese hamster lung cells but not in Mov-13 mouse cells. J. Biol. Chem.
- The role of collagen-derived proteolytic fragments in angiogenesis. Matrix Biol.
- Production of a DPP activity gradient in the early Drosophila embryo through the opposing actions of the SOG and TLD proteins. Cell
- Molecular recognition in procollagen chain assembly. Matrix Biol.
- Physical characterization of the procollagen module of human thrombospondin 1 expressed in insect cells. J. Biol. Chem.
- Preferential expression of alternatively spliced messenger RNAs encoding type-II procollagen with a cysteine-rich amino-propeptide in differentiating cartilage and non-chondrogenic tissues during early mouse development. Dev. Biol.
- Production of human type I collagen in yeast reveals unexpected new insights into the molecular assembly of collagen trimers. J. Biol. Chem.
- Cleavage of chordin by Xolloid metalloprotease suggests a role for proteolytic processing in the regulation of Spemann organizer activity. Cell
- Dorsoventral patterning in Xenopus: inhibition of ventral signals by direct binding of chordin to BMP-4. Cell
- Procollagen N-proteinase and procollagen C-proteinase. Two unusual metalloproteinases that are essential for procollagen processing probably have important roles in development and cell signaling. Matrix Biol.
- Differential expression of a cysteine-rich domain in the amino-terminal propeptide of type II (cartilage) procollagen by alternative splicing of mRNA. J. Biol. Chem.
- The human type II procollagen gene: identification of an additional protein-coding domain and location of potential regulatory sequences in the promoter and first intron. Genomics
- Electronoptical studies of procollagen from the skin of dermatosparaxic calves. FEBS Letts.
- Organization of the exons coding for pro α1(II) collagen N-propeptide confirms a distinct evolutionary history of this domain of the fibrillar collagen genes. Genomics
- Complete structural organization of the human α1(V) collagen gene (COL5A1): divergence from the conserved organization of other characterized fibrillar collagen genes. Genomics
- Inhibiting effect of procollagen peptides on collagen biosynthesis in fibroblast cultures. J. Biol. Chem.
- Evidence for procollagen, a biosynthetic precursor of collagen. Proc. Natl. Acad. Sci. USA
- Nucleotide sequence of pre-pro-von Willebrand factor cDNA. Nucleic Acids Res.
- The biosynthesis of collagen. Annu. Rev. Biochem.