A Collaborative Effort

The Trace Archive was established in 2001 as a collaborative effort between NCBI and the European Molecular Biology Laboratory (EMBL/ENSEMBL) to collect raw data produced at sequencing centers around the world. Today, these data are submitted to one of two central processing centers: NCBI or the Wellcome Trust Sanger Centre. The amount of data in the archive has doubled every 10 months since 2001, so that it is now an overwhelming 22 trillion bytes in size, large enough to fill a stack of compact disks 10 stories high. New sequencing technologies promise an even sharper increase in data volume in the future. NCBI works closely with the groups pioneering these new techniques to develop the necessary processing, storage, and retrieval technologies in advance of the anticipated data influx.

Traces are Pieces of a Puzzle

NCBI’s Trace Archive provides direct access to the raw traces, typically between 300 and 1,000 DNA letters in length. Researchers can view and evaluate over 850 assemblies of trace-derived sequences for influenza virus, such as the one shown in Fig. 1.
These assemblies are found in the Assembly Archive, a database that builds upon the sequences in the Trace Archive to provide a higher-level view.

A Vital Resource in the Fight Against Disease

Sequencing traces are vital to the hunt for polymorphisms in gene sequences, which are linked to disease when they occur in human DNA or to virulence when they occur in the DNA of a virus. To further support studies of DNA sequence variability, NCBI maintains the core dbSNP database with detailed information for over 25 million genetic variations, predominantly single DNA letter changes called ‘Single Nucleotide Polymorphisms’. The trace data, combined with that of dbSNP, are a boon to medical researchers seeking greater insight into the impact of genetic variation on health. Trace sequences may be searched using MegaBLAST or via the web-based form (see the ‘Mammoth found in Trace Archive’ section of the “Mammoths and Moas. . .” article in this issue).
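
For readers who want to check the growth figures quoted in the first section, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration: the archive size and the doubling time come from the article, while the disc capacity (700 MB) and disc thickness (1.2 mm) are nominal values assumed here, and the function names are hypothetical.

```python
import math

# Figures taken from the article.
ARCHIVE_BYTES = 22e12      # "22 trillion bytes"
DOUBLING_MONTHS = 10       # "doubled every 10 months"

# Assumptions, not from the article: nominal compact disc properties.
CD_BYTES = 700e6           # 700 MB per disc
CD_THICKNESS_M = 0.0012    # 1.2 mm per disc


def projected_bytes(months_ahead, current=ARCHIVE_BYTES, doubling=DOUBLING_MONTHS):
    """Archive size after months_ahead months if the doubling rate holds."""
    return current * 2 ** (months_ahead / doubling)


def cd_stack_height_m(total_bytes=ARCHIVE_BYTES):
    """Height in metres of a stack of bare discs holding total_bytes."""
    discs = math.ceil(total_bytes / CD_BYTES)
    return discs * CD_THICKNESS_M


if __name__ == "__main__":
    print(f"Stack height today: {cd_stack_height_m():.1f} m")
    print(f"Projected size in 20 months: {projected_bytes(20):.2e} bytes")
```

Under these assumptions the stack comes to a little under 38 m, which is in line with the ten-story comparison if a story is taken as roughly 3 to 4 m. The projection simply extrapolates the historical doubling rate; the article itself expects new sequencing technologies to push growth beyond it.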