Map Ensembl IDs to Entrez Gene ID, HGNC symbol, and UniProt ID, with basic annotation information such as gene type.
Value
This function returns a data frame with one row per input Ensembl ID, containing the following columns:
ensembl_id
- Input Ensembl ID
entrez_id
- Entrez Gene ID
hgnc_symbol
- HGNC gene symbol
hgnc_name
- HGNC gene name
cyto_loc
- Cytogenetic location
uniprot_id
- UniProt ID
gene_biotype
- Gene type
Elements that cannot be mapped will be NA.
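Because unmapped fields come back as NA, a common follow-up is to count or filter them. The sketch below is only an illustration: it uses a small hand-built data frame that mimics three of the columns listed above rather than real output. The placeholder ID and the missing values in the second row are hypothetical, and storing entrez_id as character is an assumption.

```r
# Toy stand-in for the returned data frame; only three columns are mimicked.
# The second row is a hypothetical unmapped ID, not actual function output.
toy <- data.frame(
  ensembl_id   = c("ENSG00000162576", "ENSG_PLACEHOLDER", "ENSG00000175756"),
  entrez_id    = c("54587", NA, "54998"),  # assumed to be character
  gene_biotype = c("protein_coding", NA, "protein_coding"),
  stringsAsFactors = FALSE
)

# Number of unmapped (NA) entries in each column
n_unmapped <- colSums(is.na(toy))

# Keep only the rows that have a mapped Entrez Gene ID
mapped <- toy[!is.na(toy$entrez_id), ]
```

The same two expressions should apply unchanged to a real result, since they rely only on the column names and on NA marking unmapped elements.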
Examples
# Ensembl IDs in GTEx v6p gene count data
data("gtexv6p")
# select 100 IDs as example
id <- gtexv6p[101:200]
df <- grex(id)
# Rows that have a mapped Entrez ID
df[
  !is.na(df$entrez_id),
  c("ensembl_id", "entrez_id", "gene_biotype")
]
#> ensembl_id entrez_id gene_biotype
#> 1 ENSG00000162576 54587 protein_coding
#> 2 ENSG00000175756 54998 protein_coding
#> 4 ENSG00000221978 81669 protein_coding
#> 5 ENSG00000224870 148413 processed_transcript
#> 6 ENSG00000242485 55052 protein_coding
#> 8 ENSG00000235098 441869 protein_coding
#> 10 ENSG00000205116 643965 protein_coding
#> 12 ENSG00000179403 64856 protein_coding
#> 13 ENSG00000215915 219293 protein_coding
#> 14 ENSG00000160072 83858 protein_coding
#> 15 ENSG00000197785 55210 protein_coding
#> 16 ENSG00000205090 339453 protein_coding
#> 17 ENSG00000160075 29101 protein_coding
#> 20 ENSG00000228594 643988 protein_coding
#> 22 ENSG00000197530 142678 protein_coding
#> 23 ENSG00000189409 8510 protein_coding
#> 24 ENSG00000248333 984 protein_coding
#> 26 ENSG00000189339 728661 protein_coding
#> 29 ENSG00000215914 8511 unprocessed_pseudogene
#> 30 ENSG00000008128 728642 protein_coding
#> 32 ENSG00000215790 9906 protein_coding
#> 33 ENSG00000008130 65220 protein_coding
#> 34 ENSG00000078369 2782 protein_coding
#> 36 ENSG00000169885 163688 protein_coding
#> 37 ENSG00000178821 339456 protein_coding
#> 38 ENSG00000142609 85452 protein_coding
#> 40 ENSG00000187730 2563 protein_coding
#> 41 ENSG00000226969 105378591 antisense
#> 42 ENSG00000067606 5590 protein_coding
#> 44 ENSG00000182873 100506504 antisense
#> 45 ENSG00000162585 199990 protein_coding
#> 49 ENSG00000157933 6497 protein_coding
#> 50 ENSG00000116151 79906 protein_coding
#> 53 ENSG00000269896 100129534 transcribed_processed_pseudogene
#> 58 ENSG00000157916 11079 protein_coding
#> 59 ENSG00000157911 5192 protein_coding
#> 60 ENSG00000149527 9651 protein_coding
#> 63 ENSG00000157881 55229 protein_coding
#> 64 ENSG00000197921 388585 protein_coding
#> 67 ENSG00000157873 8764 protein_coding
#> 69 ENSG00000228037 100996583 lincRNA
#> 70 ENSG00000157870 127281 protein_coding
#> 71 ENSG00000142606 79258 protein_coding
#> 73 ENSG00000215912 100287898 protein_coding
#> 76 ENSG00000169717 140625 protein_coding
#> 77 ENSG00000177133 440556 antisense
#> 78 ENSG00000142611 63976 protein_coding
#> 81 ENSG00000130762 27237 protein_coding
#> 83 ENSG00000162591 1953 protein_coding
#> 84 ENSG00000207776 693135 miRNA
#> 86 ENSG00000158109 127262 protein_coding
#> 87 ENSG00000116213 49856 protein_coding
#> 88 ENSG00000078900 7161 protein_coding
#> 91 ENSG00000227372 57212 transcribed_unitary_pseudogene
#> 92 ENSG00000162592 148870 protein_coding
#> 93 ENSG00000235169 388588 protein_coding
#> 94 ENSG00000130764 57470 protein_coding
#> 97 ENSG00000116198 9731 protein_coding
#> 98 ENSG00000169598 1677 protein_coding
#> 99 ENSG00000198912 339448 protein_coding
#> 100 ENSG00000236423 100133612 lincRNA