Retrieve Protein Sequence in FASTA Format from the UniProt Database

Usage

getFASTAFromUniProt(id, parallel = 5)

Arguments

id

A character vector of UniProt protein IDs.

parallel

An integer giving the number of processes to use for retrieving the data (using RCurl). The default is 5. For regular cases, we recommend a number less than 20.

Value

A list in which each component contains one protein sequence in FASTA format.

Details

This function retrieves protein sequences in FASTA format from the UniProt database.
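
As an illustration of the kind of request involved, the sketch below builds one download URL per protein ID. It is a minimal sketch under stated assumptions, not the package's implementation: the helper name `build_fasta_urls` is hypothetical, and the URL pattern `https://rest.uniprot.org/uniprotkb/<ID>.fasta` is assumed from the UniProt REST API rather than taken from this page. Fetching is left out so the sketch runs offline.

```r
# Hypothetical helper (not part of the package): build one REST URL per ID.
# Assumption: UniProt serves FASTA at https://rest.uniprot.org/uniprotkb/<ID>.fasta
build_fasta_urls <- function(id) {
  # Require a non-empty character vector without missing values
  stopifnot(is.character(id), length(id) > 0, !anyNA(id))
  # Strip surrounding whitespace, then paste each ID into the URL pattern
  paste0("https://rest.uniprot.org/uniprotkb/", trimws(id), ".fasta")
}

urls <- build_fasta_urls(c("P00750", "P00751"))
print(urls)
```

Each URL could then be downloaded on its own, which is what makes it natural to spread the requests over several processes.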

References

UniProt. https://www.uniprot.org/

UniProt REST API Documentation. https://www.uniprot.org/help/api

See also

See getSeqFromUniProt for retrieving proteins represented by their amino acid sequences from the UniProt database. See readFASTA for reading FASTA format files.

Examples

id = c('P00750', 'P00751', 'P00752')
getFASTAFromUniProt(id)
#> [1] ">sp|P00750|TPA_HUMAN Tissue-type plasminogen activator OS=Homo sapiens OX=9606 GN=PLAT PE=1 SV=1\nMDAMKRGLCCVLLLCGAVFVSPSQEIHARFRRGARSYQVICRDEKTQMIYQQHQSWLRPV\nLRSNRVEYCWCNSGRAQCHSVPVKSCSEPRCFNGGTCQQALYFSDFVCQCPEGFAGKCCE\nIDTRATCYEDQGISYRGTWSTAESGAECTNWNSSALAQKPYSGRRPDAIRLGLGNHNYCR\nNPDRDSKPWCYVFKAGKYSSEFCSTPACSEGNSDCYFGNGSAYRGTHSLTESGASCLPWN\nSMILIGKVYTAQNPSAQALGLGKHNYCRNPDGDAKPWCHVLKNRRLTWEYCDVPSCSTCG\nLRQYSQPQFRIKGGLFADIASHPWQAAIFAKHRRSPGERFLCGGILISSCWILSAAHCFQ\nERFPPHHLTVILGRTYRVVPGEEEQKFEVEKYIVHKEFDDDTYDNDIALLQLKSDSSRCA\nQESSVVRTVCLPPADLQLPDWTECELSGYGKHEALSPFYSERLKEAHVRLYPSSRCTSQH\nLLNRTVTDNMLCAGDTRSGGPQANLHDACQGDSGGPLVCLNDGRMTLVGIISWGLGCGQK\nDVPGVYTKVTNYLDWIRDNMRP\n"                                                                                                                                                                                                  
#> [2] ">sp|P00751|CFAB_HUMAN Complement factor B OS=Homo sapiens OX=9606 GN=CFB PE=1 SV=2\nMGSNLSPQLCLMPFILGLLSGGVTTTPWSLARPQGSCSLEGVEIKGGSFRLLQEGQALEY\nVCPSGFYPYPVQTRTCRSTGSWSTLKTQDQKTVRKAECRAIHCPRPHDFENGEYWPRSPY\nYNVSDEISFHCYDGYTLRGSANRTCQVNGRWSGQTAICDNGAGYCSNPGIPIGTRKVGSQ\nYRLEDSVTYHCSRGLTLRGSQRRTCQEGGSWSGTEPSCQDSFMYDTPQEVAEAFLSSLTE\nTIEGVDAEDGHGPGEQQKRKIVLDPSGSMNIYLVLDGSDSIGASNFTGAKKCLVNLIEKV\nASYGVKPRYGLVTYATYPKIWVKVSEADSSNADWVTKQLNEINYEDHKLKSGTNTKKALQ\nAVYSMMSWPDDVPPEGWNRTRHVIILMTDGLHNMGGDPITVIDEIRDLLYIGKDRKNPRE\nDYLDVYVFGVGPLVNQVNINALASKKDNEQHVFKVKDMENLEDVFYQMIDESQSLSLCGM\nVWEHRKGTDYHKQPWQAKISVIRPSKGHESCMGAVVSEYFVLTAAHCFTVDDKEHSIKVS\nVGGEKRDLEIEVVLFHPNYNINGKKEAGIPEFYDYDVALIKLKNKLKYGQTIRPICLPCT\nEGTTRALRLPPTTTCQQQKEELLPAQDIKALFVSEEEKKLTRKEVYIKNGDKKGSCERDA\nQYAPGYDKVKDISEVVTPRFLCTGGVSPYADPNTCRGDSGGPLIVHKRSRFIQVGVISWG\nVVDVCKNQKRQKQVPAHARDFHINLFQVLPWLKEKLQDEDLGFL\n"
#> [3] ">sp|P00752|KLK_PIG Glandular kallikrein OS=Sus scrofa OX=9823 PE=1 SV=4\nAPPIQSRIIGGRECEKNSHPWQVAIYHYSSFQCGGVLVNPKWVLTAAHCKNDNYEVWLGR\nHNLFENENTAQFFGVTADFPHPGFNLSLLKXHTKADGKDYSHDLMLLRLQSPAKITDAVK\nVLELPTQEPELGSTCEASGWGSIEPGPDBFEFPDEIQCVQLTLLQNTFCABAHPBKVTES\nMLCAGYLPGGKDTCMGDSGGPLICNGMWQGITSWGHTPCGSANKPSIYTKLIFYLDWIND\nTITENP\n"
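
As the output above shows, each result is a single string holding the header line followed by the sequence lines, separated by newline characters. A minimal sketch of splitting such a string into its header and a contiguous sequence follows. The helper `parse_fasta_record` is hypothetical and not part of the package, and the record passed to it is a shortened, made-up example used only to show the layout.

```r
# Hypothetical helper (not part of the package): split one FASTA record,
# held in a single string, into its header and its sequence.
parse_fasta_record <- function(txt) {
  # Break the record into lines and drop any empty pieces
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  lines <- lines[nzchar(lines)]
  # A valid record has a ">" header line and at least one sequence line
  stopifnot(length(lines) >= 2, startsWith(lines[1], ">"))
  list(
    header   = sub("^>", "", lines[1]),          # header without the ">"
    sequence = paste(lines[-1], collapse = "")   # sequence lines joined
  )
}

# Made-up record for illustration only (not a real UniProt entry)
rec <- parse_fasta_record(">sp|X00000|DEMO_HUMAN Demo protein\nMDAMKRGL\nCCVLLL\n")
rec$header
rec$sequence
nchar(rec$sequence)
```

Applied to the real output, the same helper could be mapped over the results with `lapply()`. For reading FASTA files from disk, readFASTA remains the documented route.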