Primary structure of human nuclear ribonucleoprotein particle C proteins: conservation of sequence and domain structures in heterogeneous nuclear RNA, mRNA, and pre-rRNA-binding proteins

Mol Cell Biol. 1987 May;7(5):1731-9. doi: 10.1128/mcb.7.5.1731-1739.1987.

Abstract

In the eucaryotic nucleus, heterogeneous nuclear RNAs exist in a complex with a specific set of proteins to form heterogeneous nuclear ribonucleoprotein particles (hnRNPs). The C proteins, C1 and C2, are major constituents of hnRNPs and appear to play a role in RNA splicing as suggested by antibody inhibition and immunodepletion experiments. With the use of a previously described partial cDNA clone as a hybridization probe, full-length cDNAs for the human C proteins were isolated. All of the cDNAs isolated hybridized to two poly(A)+ RNAs of 1.9 and 1.4 kilobases (kb). DNA sequencing of a cDNA clone for the 1.9-kb mRNA (pHC12) revealed a single open reading frame of 290 amino acids coding for a protein of 31,931 daltons and two polyadenylation signals, AAUAAA, approximately 400 base pairs apart in the 3' untranslated region of the mRNA. DNA sequencing of a clone corresponding to the 1.4-kb mRNA (pHC5) indicated that the sequence of this mRNA is identical to that of the 1.9-kb mRNA up to the first polyadenylation signal, which it uses. Both mRNAs therefore have the same coding capacity and are probably transcribed from a single gene. Translation in vitro of the 1.9-kb mRNA selected by hybridization with a 3'-end subfragment of pHC12 demonstrated that it alone can direct the synthesis of both C1 and C2. The difference between the C1 and C2 proteins that results in their electrophoretic separation is not known, but most likely one of them is generated from the other posttranslationally. Since several hnRNP proteins appeared by sodium dodecyl sulfate-polyacrylamide gel electrophoresis as multiple antigenically related polypeptides, this finding raises the possibility that some of these other groups of hnRNP proteins are also each produced from a single mRNA.
The predicted amino acid sequence of the protein indicates that it is composed of two distinct domains: an amino terminus that contains what we have recently described as an RNP consensus sequence, which is the putative RNA-binding site, and a carboxy terminus that is very negatively charged, contains no aromatic amino acids or prolines, and contains a putative nucleoside triphosphate-binding fold, as well as a phosphorylation site for casein kinase type II. The RNP consensus sequence was also found in the yeast poly(A)-binding protein (PABP), the heterogeneous nuclear RNA-binding proteins A1 and A2, and the pre-rRNA binding protein C23. All of these proteins are also composed of at least two distinct domains: an amino terminus, which possesses one or more RNP consensus sequences, and a carboxy terminus, which is unique to each protein, being very acidic in the C proteins, rich in glycine in A1 and C23, and rich in proline in the poly(A)-binding protein. These findings suggest that the amino terminus of these proteins possesses a highly conserved RNA-binding domain, whereas the carboxy terminus contains a region essential to the unique function and interactions of each of the RNA-binding proteins.
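
As an illustration only, two of the sequence features mentioned above (AAUAAA polyadenylation signals and candidate casein kinase II sites) can be located by simple pattern scanning. The sketch below is not the authors' method. It assumes the commonly used minimal casein kinase II consensus, S/T-x-x-D/E, and it runs on short made-up sequences (`toy_utr`, `toy_peptide`), not on the pHC12 sequence. The function names are hypothetical.

```python
import re

# Lookahead patterns so that overlapping matches are also reported.
POLY_A_SIGNAL = re.compile(r"(?=AAUAAA)")
# Assumption: minimal casein kinase II consensus, Ser/Thr-x-x-Asp/Glu.
CK2_SITE = re.compile(r"(?=([ST]..[DE]))")


def find_poly_a_signals(rna):
    """Return 0-based start positions of every AAUAAA hexamer."""
    # Accept cDNA-style input by converting T to U first.
    rna = rna.upper().replace("T", "U")
    return [m.start() for m in POLY_A_SIGNAL.finditer(rna)]


def find_ck2_sites(protein):
    """Return (0-based position, matched tetrapeptide) for each candidate site."""
    return [(m.start(), m.group(1)) for m in CK2_SITE.finditer(protein.upper())]


# Made-up inputs for demonstration; these are NOT the published sequences.
toy_utr = "GC" + "AAUAAA" + "C" * 20 + "AAUAAA" + "GG"
toy_peptide = "MASNVTEEDKSGSEEE"

signals = find_poly_a_signals(toy_utr)
sites = find_ck2_sites(toy_peptide)
print("poly(A) signal positions:", signals)
print("distance between signals:", signals[1] - signals[0])
print("candidate CK2 sites:", sites)
```

A motif match of this kind only flags candidate sites; whether a site is actually used has to be established experimentally.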

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Carrier Proteins / genetics*
  • Cloning, Molecular
  • DNA / genetics
  • Heterogeneous-Nuclear Ribonucleoprotein Group C*
  • Heterogeneous-Nuclear Ribonucleoproteins
  • Humans
  • Molecular Sequence Data
  • Protein Biosynthesis
  • Protein Conformation
  • RNA Splicing*
  • RNA, Messenger / genetics
  • RNA, Ribosomal / metabolism
  • RNA-Binding Proteins
  • Ribonucleoproteins / genetics*
  • Structure-Activity Relationship

Substances

  • C1 HNRNP
  • Carrier Proteins
  • HNRNPC protein, human
  • Heterogeneous-Nuclear Ribonucleoprotein Group C
  • Heterogeneous-Nuclear Ribonucleoproteins
  • RNA, Messenger
  • RNA, Ribosomal
  • RNA-Binding Proteins
  • Ribonucleoproteins
  • messenger ribonucleoprotein
  • DNA

Associated data

  • GENBANK/M16342