ensembl-database

Query the Ensembl genome database REST API for 250+ species. Supports gene lookups, sequence retrieval, variant analysis, comparative genomics, ortholog searches, and VEP predictions for genomic research.

Category

Other Tools

Install

Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-ensembl-database&locale=en&source=copy

Ensembl Database Skill Details

Skill Overview


The Ensembl Database skill lets you directly access the authoritative genome database maintained by EMBL-EBI via Python or REST API. It supports gene queries, sequence downloads, variant analysis, and cross-species comparisons for more than 250 species including human and mouse.

Use Cases

1. Gene Annotation and Sequence Retrieval


When you need to quickly look up detailed information for a gene (e.g., BRCA2, TP53) or obtain its DNA, transcript, or protein sequences, this skill can query by gene symbol or Ensembl ID and return complete annotations and sequence data without manually downloading large database files.

2. Genetic Variant Functional Prediction


When you need to predict the potential biological impact of a variant (identified, for example, by an rsID or by genomic coordinates), this skill uses VEP (Variant Effect Predictor) to predict whether the variant affects protein coding, splice sites, or other important functional regions.

3. Cross-Species Gene Comparison Studies


When you are studying the evolutionary relationships of genes across species and need to find the orthologs of a target gene in other species, this skill supports homology searches in a single query and returns gene trees and gene family information suitable for comparative genomics and evolutionary biology research.

Core Features

Gene Information Lookup


Supports gene lookup by gene symbol (e.g., "BRCA2"), Ensembl ID (e.g., "ENSG00000139618"), or external database IDs, returning chromosome location, transcripts, protein sequences, and external database cross-references (UniProt, RefSeq, etc.).

Sequence Data Retrieval


Provides multiple ways to retrieve sequences, including genomic DNA, transcript cDNA, protein sequences, and sequence extraction for specified genomic regions. Supports output formats such as JSON and FASTA for convenient downstream analysis.
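As a rough illustration of sequence retrieval, the sketch below builds a request against the REST endpoint /sequence/id and parses the FASTA text that comes back. The helper names (sequence_url, fetch_fasta, parse_fasta) are hypothetical and not part of the skill, and the standard library's urllib is used only to keep the example dependency-free.

```python
import urllib.parse
import urllib.request

SERVER = "https://rest.ensembl.org"
SEQ_TYPES = {"genomic", "cdna", "cds", "protein"}


def sequence_url(stable_id, seq_type="genomic"):
    """Build a /sequence/id URL for an Ensembl stable ID (hypothetical helper)."""
    if seq_type not in SEQ_TYPES:
        raise ValueError(f"seq_type must be one of {sorted(SEQ_TYPES)}")
    query = urllib.parse.urlencode({"type": seq_type})
    return f"{SERVER}/sequence/id/{urllib.parse.quote(stable_id)}?{query}"


def fetch_fasta(stable_id, seq_type="genomic"):
    """Request the sequence as FASTA text. This performs a network call."""
    request = urllib.request.Request(
        sequence_url(stable_id, seq_type),
        headers={"Content-Type": "text/x-fasta"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def parse_fasta(text):
    """Split FASTA text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}
```

A call such as parse_fasta(fetch_fasta("ENST00000380152", "cdna")) would then give the transcript cDNA keyed by its FASTA header.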

Variant Effect Prediction (VEP)


Input a variant in HGVS notation or rsID to predict the biological consequence of the variant, including whether it causes an amino acid change, affects splice sites, or impacts regulatory regions, along with population frequency data and phenotype association information.

Comparative Genomics Tools


Find orthologs and paralogs of a specified gene in other species, obtain gene trees and gene family information to help understand a gene’s evolutionary history and functional conservation.
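An ortholog search might look like the sketch below, which targets the REST endpoint /homology/symbol. The helper names are hypothetical, and the field names read in summarize_homologies ("data", "homologies", "species", "id", "type") are an assumption about the condensed response layout that should be checked against a live response.

```python
import urllib.parse

SERVER = "https://rest.ensembl.org"


def homology_url(species, symbol, target_species=None, homology_type="orthologues"):
    """Build a /homology/symbol URL (hypothetical helper)."""
    params = {"type": homology_type, "format": "condensed"}
    if target_species:
        params["target_species"] = target_species
    path = f"/homology/symbol/{urllib.parse.quote(species)}/{urllib.parse.quote(symbol)}"
    return f"{SERVER}{path}?{urllib.parse.urlencode(params)}"


def summarize_homologies(payload):
    """Flatten a decoded JSON response into (species, gene_id, type) tuples.

    The key names used here are assumed, not guaranteed by this document.
    """
    rows = []
    for entry in payload.get("data", []):
        for homology in entry.get("homologies", []):
            rows.append(
                (homology.get("species"), homology.get("id"), homology.get("type"))
            )
    return rows
```

Passing homology_type="paralogues" would restrict the same query to paralogs within the source species.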

Genome Coordinate Conversion


Supports mapping coordinates between different genome assembly versions (e.g., GRCh37/hg19 to GRCh38/hg38) to resolve mismatches between historical data and the latest reference assemblies.

Genomic Region Retrieval


Query all genes, transcripts, regulatory elements (promoters, enhancers), and structural variants within a specified chromosome region, suitable for regional genomic analyses.
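A region query can be sketched against the REST endpoint /overlap/region, which takes one feature parameter per feature type requested. The helper names below are hypothetical, and the tally assumes each returned item carries a "feature_type" key.

```python
import urllib.parse
from collections import Counter

SERVER = "https://rest.ensembl.org"


def overlap_url(species, region, features=("gene",)):
    """Build an /overlap/region URL; region looks like '7:140424943-140624564'."""
    # Repeat the "feature" parameter once per requested feature type.
    query = urllib.parse.urlencode([("feature", name) for name in features])
    path = f"/overlap/region/{urllib.parse.quote(species)}/{urllib.parse.quote(region, safe=':')}"
    return f"{SERVER}{path}?{query}"


def count_feature_types(features):
    """Tally the decoded JSON list of features by their 'feature_type' value."""
    return Counter(item.get("feature_type", "unknown") for item in features)
```

Requesting ("gene", "transcript", "regulatory") in one call keeps the number of requests down compared with one call per feature type.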

Frequently Asked Questions

Which species does the Ensembl database support?


The main Ensembl site covers over 250 species. Most are vertebrates, including human, mouse, rat, and zebrafish, along with many non-model organisms; a few non-vertebrate model organisms such as fruit fly are also included. You can query the full species list and the genome assembly for each species via the API. Beyond the main Ensembl site, Ensembl Genomes holds non-vertebrate data such as plants, fungi, and protists.
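The species list can be retrieved from the REST endpoint /info/species, optionally filtered by division. The sketch below is a minimal illustration with hypothetical helper names; it assumes the response holds a "species" list whose entries carry "name" and "assembly" fields.

```python
import json
import urllib.parse
import urllib.request

SERVER = "https://rest.ensembl.org"


def species_url(division="EnsemblVertebrates"):
    """Build an /info/species URL for one division, e.g. 'EnsemblPlants'."""
    return f"{SERVER}/info/species?{urllib.parse.urlencode({'division': division})}"


def fetch_species(division="EnsemblVertebrates"):
    """Download and decode the species list. This performs a network call."""
    request = urllib.request.Request(
        species_url(division), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def assemblies_by_species(payload):
    """Map each species name to its assembly name (field names are assumed)."""
    return {
        entry["name"]: entry.get("assembly")
        for entry in payload.get("species", [])
    }
```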

How do I use the Ensembl API to query gene information?


The simplest way is to install the Python package ensembl_rest and query through its EnsemblClient. For example, client.symbol_lookup(species='human', symbol='BRCA2') returns the gene record for BRCA2. You can also call the REST API endpoints directly with the requests library, without installing a dedicated client package.
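For the direct REST route, a minimal sketch of a symbol lookup against /lookup/symbol follows. It uses the standard library's urllib in place of requests so that nothing needs installing; the helper names are hypothetical, and describe_gene assumes the record carries the usual location fields (display_name, id, seq_region_name, start, end, strand).

```python
import json
import urllib.parse
import urllib.request

SERVER = "https://rest.ensembl.org"


def lookup_symbol_url(species, symbol, expand=False):
    """Build a /lookup/symbol URL; expand=True also requests transcripts."""
    path = f"/lookup/symbol/{urllib.parse.quote(species)}/{urllib.parse.quote(symbol)}"
    return SERVER + path + ("?expand=1" if expand else "")


def lookup_symbol(species, symbol, expand=False):
    """Fetch and decode the gene record. This performs a network call."""
    request = urllib.request.Request(
        lookup_symbol_url(species, symbol, expand),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_gene(record):
    """Format the location fields of a gene record as one line of text."""
    return "{display_name} ({id}) {seq_region_name}:{start}-{end} strand {strand}".format(
        **record
    )
```

With network access, describe_gene(lookup_symbol("human", "BRCA2")) would print a one-line summary of the gene's location.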

Are there rate limits for the Ensembl REST API?


Yes. Anonymous users can make up to 15 requests per second. If you exceed the limit, the API will return a 429 status code and include a Retry-After header in the response to indicate how long to wait. It is recommended to implement retry logic and rate limiting in your code, or use batch endpoints to reduce the number of requests.
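The retry logic described above could be sketched as follows. The function name and its opener and sleep parameters are hypothetical; they exist so the waiting behaviour can be exercised without touching the network. The sketch assumes Retry-After is given in seconds and falls back to one second if the header is missing or malformed.

```python
import json
import time
import urllib.error
import urllib.request


def get_json_with_retry(url, max_attempts=5,
                        opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a JSON document, waiting as instructed whenever the server returns 429."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(request, timeout=30) as response:
                return json.loads(response.read().decode("utf-8"))
        except urllib.error.HTTPError as err:
            # Only rate-limit responses are retried; other errors propagate,
            # as does a 429 on the final attempt.
            if err.code != 429 or attempt == max_attempts:
                raise
            try:
                delay = float(err.headers.get("Retry-After", 1))
            except (TypeError, ValueError):
                delay = 1.0
            sleep(delay)
```

Honouring the server's own Retry-After value, rather than a fixed back-off, keeps the client from retrying before the limit window has reset.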

How do I use VEP (Variant Effect Predictor)?


VEP accepts multiple variant input formats, including HGVS notation (e.g., ENST00000380152.7:c.803C>T), rsID (e.g., rs699), or genomic coordinates. Calling VEP returns predicted consequences such as synonymous, missense, splice site changes, etc., along with population frequencies and clinical association information.
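The three input formats map onto three REST endpoints: /vep/:species/hgvs, /vep/:species/id, and /vep/:species/region. The sketch below picks one from the shape of the input. The helper names are hypothetical, the "starts with rs" check is a deliberately simple assumption, and summarize_vep assumes each result carries "input", "most_severe_consequence", and a "transcript_consequences" list with "gene_symbol" entries.

```python
import urllib.parse

SERVER = "https://rest.ensembl.org"


def vep_url(species, variant, allele=None):
    """Choose a VEP endpoint from the input shape (hypothetical helper).

    With an allele, variant is treated as a region such as '9:22125503-22125502:1'.
    """
    base = f"{SERVER}/vep/{urllib.parse.quote(species)}"
    if allele is not None:
        region = urllib.parse.quote(variant, safe=":")
        return f"{base}/region/{region}/{urllib.parse.quote(allele)}"
    if variant.startswith("rs"):
        return f"{base}/id/{urllib.parse.quote(variant)}"
    # Anything else is assumed to be HGVS; '>' must be percent-encoded.
    return f"{base}/hgvs/{urllib.parse.quote(variant, safe=':')}"


def summarize_vep(results):
    """Reduce decoded VEP results to (input, most severe consequence, gene symbols)."""
    rows = []
    for item in results:
        genes = sorted(
            {
                consequence["gene_symbol"]
                for consequence in item.get("transcript_consequences", [])
                if "gene_symbol" in consequence
            }
        )
        rows.append((item.get("input"), item.get("most_severe_consequence"), genes))
    return rows
```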

How do I convert coordinates from an old genome assembly to a new one?


The AssemblyMapper feature handles coordinate conversion. For example, to convert from GRCh37 to GRCh38, specify asm_from='GRCh37' and asm_to='GRCh38', then supply the chromosome and coordinate to obtain the mapped result. Note that the dedicated domain grch37.rest.ensembl.org is needed when you query annotation directly against GRCh37 coordinates; the conversion itself is available on the main server.
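At the REST level, the same conversion goes through the endpoint /map/:species/:asm_one/:region/:asm_two. The sketch below builds that URL and reads the mapped coordinates out of a decoded response. The helper names are hypothetical, and the response layout (a "mappings" list of "original" and "mapped" objects) is an assumption to verify against a live call.

```python
import urllib.parse

SERVER = "https://rest.ensembl.org"


def map_url(species, asm_from, chrom, start, end, asm_to, strand=1):
    """Build a /map URL for converting one region between assemblies."""
    region = f"{chrom}:{start}..{end}:{strand}"
    quote = urllib.parse.quote
    return f"{SERVER}/map/{quote(species)}/{quote(asm_from)}/{region}/{quote(asm_to)}"


def mapped_regions(payload):
    """Return (chromosome, start, end, strand) for each mapped segment."""
    regions = []
    for mapping in payload.get("mappings", []):
        target = mapping["mapped"]
        regions.append(
            (target["seq_region_name"], target["start"], target["end"], target["strand"])
        )
    return regions
```

One input region can come back as several mapped segments, or as none when it has no equivalent in the target assembly, so callers should handle both cases rather than expect exactly one result.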