accession

download a gene dataset by RefSeq nucleotide or protein accession

Name

datasets download gene accession - download a gene dataset by RefSeq nucleotide or protein accession

Synopsis

datasets download gene accession <refseq-accession ...> [flags]

Description

Download a gene dataset by RefSeq nucleotide or protein accession. Gene datasets include gene, transcript and protein sequences, a data table and a data report. Each dataset is downloaded as a zip file.

The default gene dataset includes the following files:

  • gene.fna (gene sequences)
  • rna.fna (transcript sequences)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with gene metadata)
  • data_table.tsv (data table with gene metadata, one transcript per row)
  • dataset_catalog.json (a list of files and file types included in the dataset)
  • annotation_report.jsonl (included with prokaryotic gene packages)

Refer to the NCBI Datasets Gene Package documentation for more information about the gene package.
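
As a rough sketch of checking what a download contains, the snippet below lists the archive and counts the records in protein.faa. It assumes the default file name ncbi_dataset.zip and that the unzip utility is installed. The helper name count_fasta_records is hypothetical, and the wildcard is used so that no internal directory layout has to be assumed.

```shell
# Count FASTA records on stdin: each record starts with a '>' header line.
# grep exits non-zero when nothing matches, so '|| true' keeps the status clean.
count_fasta_records() {
    grep -c '^>' || true
}

zip_file="ncbi_dataset.zip"   # default name, see the --filename option

if [ -f "$zip_file" ]; then
    # List the files in the package without extracting them.
    unzip -l "$zip_file"
    # Stream protein.faa to stdout and count its sequences.
    unzip -p "$zip_file" '*protein.faa' | count_fasta_records
fi
```

The same pattern should work for gene.fna and rna.fna by changing the wildcard.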

Examples

  datasets download gene accession NP_000483.3
  datasets download gene accession NM_000546.6 NM_000492.4
  datasets download gene accession WP_004675351.1
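
Accessions can be combined with the flags listed under Options. The wrapper below is only an illustrative sketch: the function name download_with_cds and the output file name in the usage comment are hypothetical, while the flags themselves are taken from the option list.

```shell
# Download a gene dataset with cds.fna added and gene.fna left out.
# $1 is the output zip name; the remaining arguments are RefSeq accessions.
download_with_cds() {
    out="$1"
    shift
    datasets download gene accession "$@" \
        --include-cds --exclude-gene --filename "$out"
}

# Usage (needs the datasets CLI and network access):
# download_with_cds tp53_cftr.zip NM_000546.6 NM_000492.4
```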

Options

      --api-key string             NCBI Datasets API Key
      --exclude-gene               exclude gene.fna (gene sequence file)
      --exclude-protein            exclude protein.faa (protein sequence file)
      --exclude-rna                exclude rna.fna (transcript sequence file)
      --fasta-filter strings       limit gene fasta download to a specific list of accessions
      --fasta-filter-file string   file of accessions to limit gene fasta download
      --filename string            specify a custom file name for the downloaded dataset (default "ncbi_dataset.zip")
  -h, --help                       help for accession
      --include-3p-utr             include 3p_utr.fna (3'-UTR sequence file)
      --include-5p-utr             include 5p_utr.fna (5'-UTR sequence file)
      --include-cds                include cds.fna (CDS sequence file)
      --include-flanks-bp int      include gene flanking sequence of the given length in base pairs, limited to prokaryotic genes (default: gene_flanks.fna is omitted from the package)
      --inputfile string           read a list of RefSeq nucleotide or protein accessions from a file to use as input
      --no-progressbar             hide progress bar
      --taxon-filter string        limit genes to a specified taxon (any rank)
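
For longer lists, --inputfile can take the place of accessions on the command line. The sketch below assumes the file holds one accession per line, which is not stated above. The file name accessions.txt and the function name download_from_file are hypothetical.

```shell
# Write the accessions to a file, one per line (the layout is an assumption).
acc_file="accessions.txt"
printf '%s\n' NP_000483.3 WP_004675351.1 > "$acc_file"

# Download the dataset for every accession listed in the file given as $1.
download_from_file() {
    datasets download gene accession --inputfile "$1" --no-progressbar
}

# Usage (needs the datasets CLI and network access):
# download_from_file "$acc_file"
```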

Generated October 14, 2021