Map Ensembl IDs to Entrez Gene ID, HGNC symbol, and UniProt ID, with basic annotation information such as gene type.

Usage

grex(ensembl_id)

Arguments

ensembl_id

Character vector of Ensembl IDs

Value

This function returns a data frame with as many rows as there are input Ensembl IDs, containing the following columns:

  • ensembl_id - Input Ensembl ID

  • entrez_id - Entrez Gene ID

  • hgnc_symbol - HGNC gene symbol

  • hgnc_name - HGNC gene name

  • cyto_loc - Cytogenetic location

  • uniprot_id - UniProt ID

  • gene_biotype - Gene type

Elements that cannot be mapped are returned as NA.
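
Because unmapped elements come back as NA, downstream code usually needs to separate mapped from unmapped rows. The following is a minimal sketch of that step. It does not call grex(); instead it uses a small hand-built data frame as a hypothetical stand-in for the return value, with only three of the documented columns. The ID `ENSG_EXAMPLE_UNMAPPED` is made up for illustration, and storing `entrez_id` as character is an assumption rather than a statement about the actual column type.

```r
# Hypothetical stand-in for a grex() result, built by hand so the snippet
# runs without the package. Column names follow the list above; the last
# row uses a made-up ID to represent an input that could not be mapped.
# Storing entrez_id as character is an assumption; the real type may differ.
res <- data.frame(
  ensembl_id   = c("ENSG00000162576", "ENSG00000175756", "ENSG_EXAMPLE_UNMAPPED"),
  entrez_id    = c("54587", "54998", NA),
  gene_biotype = c("protein_coding", "protein_coding", NA),
  stringsAsFactors = FALSE
)

# Flag the rows that received an Entrez Gene ID
is_mapped <- !is.na(res$entrez_id)

# Keep the mapped rows, and collect the IDs that could not be mapped
mapped   <- res[is_mapped, ]
unmapped <- res$ensembl_id[!is_mapped]

# Fraction of inputs with an Entrez Gene ID
mapped_fraction <- mean(is_mapped)
```

The same pattern applies to any other column, for example `is.na(res$uniprot_id)` on a real result, since each column can be missing independently of the others.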

Examples

# Ensembl IDs in GTEx v6p gene count data
data("gtexv6p")
# Select 100 IDs as an example
id <- gtexv6p[101:200]
df <- grex(id)
# Rows that have a mapped Entrez ID
df[
  !is.na(df$entrez_id),
  c("ensembl_id", "entrez_id", "gene_biotype")
]
#>          ensembl_id entrez_id                     gene_biotype
#> 1   ENSG00000162576     54587                   protein_coding
#> 2   ENSG00000175756     54998                   protein_coding
#> 4   ENSG00000221978     81669                   protein_coding
#> 5   ENSG00000224870    148413             processed_transcript
#> 6   ENSG00000242485     55052                   protein_coding
#> 8   ENSG00000235098    441869                   protein_coding
#> 10  ENSG00000205116    643965                   protein_coding
#> 12  ENSG00000179403     64856                   protein_coding
#> 13  ENSG00000215915    219293                   protein_coding
#> 14  ENSG00000160072     83858                   protein_coding
#> 15  ENSG00000197785     55210                   protein_coding
#> 16  ENSG00000205090    339453                   protein_coding
#> 17  ENSG00000160075     29101                   protein_coding
#> 20  ENSG00000228594    643988                   protein_coding
#> 22  ENSG00000197530    142678                   protein_coding
#> 23  ENSG00000189409      8510                   protein_coding
#> 24  ENSG00000248333       984                   protein_coding
#> 26  ENSG00000189339    728661                   protein_coding
#> 29  ENSG00000215914      8511           unprocessed_pseudogene
#> 30  ENSG00000008128    728642                   protein_coding
#> 32  ENSG00000215790      9906                   protein_coding
#> 33  ENSG00000008130     65220                   protein_coding
#> 34  ENSG00000078369      2782                   protein_coding
#> 36  ENSG00000169885    163688                   protein_coding
#> 37  ENSG00000178821    339456                   protein_coding
#> 38  ENSG00000142609     85452                   protein_coding
#> 40  ENSG00000187730      2563                   protein_coding
#> 41  ENSG00000226969 105378591                        antisense
#> 42  ENSG00000067606      5590                   protein_coding
#> 44  ENSG00000182873 100506504                        antisense
#> 45  ENSG00000162585    199990                   protein_coding
#> 49  ENSG00000157933      6497                   protein_coding
#> 50  ENSG00000116151     79906                   protein_coding
#> 53  ENSG00000269896 100129534 transcribed_processed_pseudogene
#> 58  ENSG00000157916     11079                   protein_coding
#> 59  ENSG00000157911      5192                   protein_coding
#> 60  ENSG00000149527      9651                   protein_coding
#> 63  ENSG00000157881     55229                   protein_coding
#> 64  ENSG00000197921    388585                   protein_coding
#> 67  ENSG00000157873      8764                   protein_coding
#> 69  ENSG00000228037 100996583                          lincRNA
#> 70  ENSG00000157870    127281                   protein_coding
#> 71  ENSG00000142606     79258                   protein_coding
#> 73  ENSG00000215912 100287898                   protein_coding
#> 76  ENSG00000169717    140625                   protein_coding
#> 77  ENSG00000177133    440556                        antisense
#> 78  ENSG00000142611     63976                   protein_coding
#> 81  ENSG00000130762     27237                   protein_coding
#> 83  ENSG00000162591      1953                   protein_coding
#> 84  ENSG00000207776    693135                            miRNA
#> 86  ENSG00000158109    127262                   protein_coding
#> 87  ENSG00000116213     49856                   protein_coding
#> 88  ENSG00000078900      7161                   protein_coding
#> 91  ENSG00000227372     57212   transcribed_unitary_pseudogene
#> 92  ENSG00000162592    148870                   protein_coding
#> 93  ENSG00000235169    388588                   protein_coding
#> 94  ENSG00000130764     57470                   protein_coding
#> 97  ENSG00000116198      9731                   protein_coding
#> 98  ENSG00000169598      1677                   protein_coding
#> 99  ENSG00000198912    339448                   protein_coding
#> 100 ENSG00000236423 100133612                          lincRNA