datasets download ortholog symbol - download an ortholog dataset by gene symbol


datasets download ortholog symbol <gene_symbol> [flags]


Download an ortholog dataset by gene symbol and taxon (species name or species-level NCBI Taxonomy ID). If no taxon is specified, human is used. NCBI calculates ortholog data for vertebrates and insects. Ortholog datasets include gene, transcript, and protein sequences, a data table, and a data report. Datasets are downloaded as a zip file.

The default ortholog dataset includes the following files:

  • gene.fna (gene sequences)
  • rna.fna (transcript sequences)
  • protein.faa (protein sequences)
  • data_report.jsonl (data report with gene metadata)
  • data_table.tsv (data table with gene metadata, one transcript per row)
  • dataset_catalog.json (a list of files and file types included in the dataset)
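As a rough sketch of working with the files listed above, the following shell helpers locate a file by name in an extracted dataset and count the records in data_report.jsonl. They assume the zip has already been extracted (for example with unzip) and that data_report.jsonl holds one JSON record per non-empty line, the usual JSON Lines convention. Files are searched for by name because the directory layout inside the zip is not described here. The function names and example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of the datasets CLI.

# Print the path of the first file named "$2" found anywhere under directory "$1".
find_dataset_file() {
  find "$1" -type f -name "$2" | head -n 1
}

# Count non-empty lines in a JSON Lines file (assumed: one record per line).
count_jsonl_records() {
  grep -c . "$1"
}

# Example (assumes the downloaded zip was unpacked into ./orthologs beforehand):
#   report=$(find_dataset_file orthologs data_report.jsonl)
#   count_jsonl_records "$report"
```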

Refer to NCBI’s command-line quickstart documentation for information about getting started with the command-line tools.


  datasets download ortholog symbol tp53
  datasets download ortholog symbol brca1 --taxon mouse
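The basic examples above can be combined with the sequence and file name flags. As a minimal sketch, assuming the datasets CLI is installed and on PATH, this hypothetical wrapper requests CDS sequences instead of the default gene, transcript, and protein sequence files. The metadata files should still be included, since no flag used here excludes them. The function name and output file name are illustrative only.

```shell
# Hypothetical wrapper: download CDS sequences (without the default sequence files)
# for gene symbol "$1" in taxon "$2", saving the zip as "$3".
download_ortholog_cds() {
  datasets download ortholog symbol "$1" \
    --taxon "$2" \
    --exclude-gene --exclude-rna --exclude-protein \
    --include-cds \
    --filename "$3"
}

# Example call (requires network access):
#   download_ortholog_cds brca1 mouse brca1_cds.zip
```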


      --api-key string         NCBI Datasets API Key
      --exclude-gene           exclude gene.fna (gene sequence file)
      --exclude-protein        exclude protein.faa (protein sequence file)
      --exclude-rna            exclude rna.fna (transcript sequence file)
      --filename string        specify a custom file name for the downloaded dataset (default "")
  -h, --help                   help for symbol
      --include-3p-utr         include 3p_utr.fna (3'-UTR sequence file)
      --include-5p-utr         include 5p_utr.fna (5'-UTR sequence file)
      --include-cds            include cds.fna (CDS sequence file)
      --inputfile string       read a list of gene symbols from a file to use as input
      --no-progressbar         hide progress bar
      --taxon string           specify a species name (common or scientific) or species-level NCBI Taxonomy ID (default "human")
      --taxon-filter strings   limit results to ortholog data for a specified taxonomic group
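
To illustrate --inputfile, here is a sketch that writes gene symbols to a file and passes that file to the command. It assumes one symbol per line and that the positional <gene_symbol> argument is omitted when --inputfile is given; neither detail is specified on this page. The helper names and file names are hypothetical.

```shell
# Hypothetical helper: write every argument after the first to file "$1",
# one gene symbol per line (assumed input format).
write_symbol_list() {
  list_file=$1
  shift
  printf '%s\n' "$@" > "$list_file"
}

# Hypothetical wrapper: download orthologs for every symbol listed in file "$1",
# using taxon "$2", with the progress bar hidden.
download_orthologs_from_list() {
  datasets download ortholog symbol --inputfile "$1" --taxon "$2" --no-progressbar
}

# Example (requires network access):
#   write_symbol_list symbols.txt tp53 brca1
#   download_orthologs_from_list symbols.txt human
```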

Generated November 19, 2021