This function converts protein identifiers in a data frame or file from one identifier type to another.
convert_protein_ids(
  data_table,
  column_name = "Protein",
  species = "hsapiens_gene_ensembl",
  host = "www.ensembl.org",
  mart = "ENSEMBL_MART_ENSEMBL",
  ID1 = "uniprotswissprot",
  ID2 = "hgnc_symbol",
  id.separator = "/",
  copy_nonconverted = TRUE,
  verbose = FALSE
)
data_table: A data frame or file name.
column_name: The name of the column that contains the original protein identifiers.
species: The species of the protein identifiers, in the term used by biomaRt (e.g. "hsapiens_gene_ensembl", "mmusculus_gene_ensembl", "drerio_gene_ensembl").
host: Host of the biomaRt database (e.g. "www.ensembl.org", "dec2017.archive.ensembl.org").
mart: The type of mart (e.g. "ENSEMBL_MART_ENSEMBL").
ID1: The type of the original protein identifiers (e.g. "uniprotswissprot", "ensembl_peptide_id").
ID2: The type of the converted protein identifiers (e.g. "hgnc_symbol", "mgi_symbol", "external_gene_name").
id.separator: Separator between protein identifiers of shared peptides.
copy_nonconverted: Whether identifiers that cannot be converted should be copied.
verbose: Whether to write a file containing the version of the database used.
The data frame with an added column of the converted protein identifiers.
Protein identifiers from shared peptides should be separated by the character given in id.separator (a forward slash by default). The host of an archived Ensembl database can also be specified (e.g. "dec2017.archive.ensembl.org").
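To illustrate what a separator between shared-peptide identifiers means in practice, here is a minimal base R sketch. The helper `split_protein_ids()` is hypothetical and not part of the package; it only shows how entries in the identifier column break into individual identifiers and can be joined again, and makes no claim about how convert_protein_ids() works internally.

```r
# Hypothetical helper (not part of the package): split each entry of an
# identifier column into its individual protein identifiers.
split_protein_ids <- function(ids, id.separator = "/") {
  # fixed = TRUE treats the separator as a literal string, not a regex
  strsplit(ids, split = id.separator, fixed = TRUE)
}

protein_column <- c("Q01581", "P49327", "P63261/P60709")
split_ids <- split_protein_ids(protein_column)

# Join the parts again with the same separator, as one would after
# converting each identifier individually.
rejoined <- vapply(split_ids, paste, character(1), collapse = "/")
```

Entries without the separator stay as single-element vectors, so the same code path handles unique and shared peptides.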
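if (FALSE) {
  # Example table; the third entry groups several identifiers with "/"
  data_table <- data.frame(
    "Protein" = c("Q01581", "P49327", "2/P63261/P60709"),
    "Abundance" = c(100, 3390, 43423))
  # Convert with the defaults (UniProt Swiss-Prot accessions to HGNC symbols)
  convert_protein_ids(data_table)
}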
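A variation with non-default arguments might look like the sketch below, assuming mouse data and the archived host mentioned above. The accessions are assumed to be mouse Swiss-Prot entries chosen for illustration. The call needs network access to Ensembl, so it is guarded in the same way as the example above and no output is shown.

```r
# Illustrative mouse table (accessions are assumptions, not package data)
mouse_table <- data.frame(
  "Protein" = c("P60710", "P63260"),
  "Abundance" = c(1200, 850))

if (FALSE) {
  # Query an archived Ensembl release and convert to MGI symbols
  convert_protein_ids(
    mouse_table,
    species = "mmusculus_gene_ensembl",
    host = "dec2017.archive.ensembl.org",
    ID1 = "uniprotswissprot",
    ID2 = "mgi_symbol")
}
```

Pinning `host` to an archive is one way to keep conversions reproducible across Ensembl releases.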