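This function converts the protein identifiers in a data frame or file from one identifier type to another (by default, from UniProt/Swiss-Prot accessions to HGNC symbols).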
convert_protein_ids(
  data_table,
  column_name = "Protein",
  species = "hsapiens_gene_ensembl",
  host = "www.ensembl.org",
  mart = "ENSEMBL_MART_ENSEMBL",
  ID1 = "uniprotswissprot",
  ID2 = "hgnc_symbol",
  id.separator = "/",
  copy_nonconverted = TRUE,
  verbose = FALSE
)
data_table: A data frame or file name.
column_name: The name of the column that contains the original protein identifiers.
species: The species of the protein identifiers, in the term used by biomaRt (e.g. "hsapiens_gene_ensembl", "mmusculus_gene_ensembl", "drerio_gene_ensembl").
host: Host of the biomaRt database (e.g. "www.ensembl.org", "dec2017.archive.ensembl.org").
mart: The type of mart (e.g. "ENSEMBL_MART_ENSEMBL").
ID1: The type of the original protein identifiers (e.g. "uniprotswissprot", "ensembl_peptide_id").
ID2: The type of the converted protein identifiers (e.g. "hgnc_symbol", "mgi_symbol", "external_gene_name").
id.separator: Separator between protein identifiers of shared peptides.
copy_nonconverted: Option defining whether identifiers that cannot be converted should be copied to the output.
verbose: Option to write a file containing the version of the database used.
The data frame with an added column of the converted protein identifiers.
Protein identifiers from shared peptides should be separated by a forward slash (the default id.separator). The host of an archived Ensembl database can also be supplied (e.g. "dec2017.archive.ensembl.org").
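The handling of shared-peptide entries can be illustrated with a small base R sketch. This is not the package's implementation: `convert_entry` is a hypothetical helper, and `lookup` is a hard-coded stand-in for the mapping that would normally be retrieved from Ensembl. How the function treats unmatched identifiers when copy_nonconverted = FALSE is not stated here, so the sketch simply drops them, which is an assumption.

```r
# Hypothetical lookup table standing in for the mapping retrieved from Ensembl.
# A real run would obtain this mapping from the database.
lookup <- c(P63261 = "ACTG1", P60709 = "ACTB")

# Illustrative helper (not part of the package): convert one entry that may
# hold several identifiers joined by `id.separator`.
convert_entry <- function(entry, lookup, id.separator = "/",
                          copy_nonconverted = TRUE) {
  # fixed = TRUE treats the separator literally, so "|" or "." also work
  ids <- strsplit(entry, id.separator, fixed = TRUE)[[1]]
  converted <- unname(lookup[ids])        # NA where no mapping exists
  missing <- is.na(converted)
  if (copy_nonconverted) {
    converted[missing] <- ids[missing]    # keep the original identifier
  } else {
    converted <- converted[!missing]      # assumption: drop unmatched ones
  }
  paste(converted, collapse = id.separator)
}

convert_entry("2/P63261/P60709", lookup)
# "2/ACTG1/ACTB"
convert_entry("2/P63261/P60709", lookup, copy_nonconverted = FALSE)
# "ACTG1/ACTB"
```

Splitting on the separator before the lookup means each identifier in a shared-peptide group is converted independently, and the group is reassembled with the same separator.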
if (FALSE) {
  data_table <- data.frame(
    "Protein" = c("Q01581", "P49327", "2/P63261/P60709"),
    "Abundance" = c(100, 3390, 43423)
  )
  convert_protein_ids(data_table)
}
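A call with the default arguments queries the current Ensembl release. For a conversion pinned to an archived release, a call might look like the sketch below. It uses only arguments listed in the usage; the column name "Accession" is a made-up example. The call itself is guarded with `if (FALSE)` because it needs the package to be loaded and network access to Ensembl.

```r
# Example input; the column name "Accession" is hypothetical.
data_table <- data.frame(
  Accession = c("Q01581", "P49327", "2/P63261/P60709"),
  Abundance = c(100, 3390, 43423)
)

if (FALSE) {
  # Needs convert_protein_ids() to be available and network access to Ensembl.
  converted <- convert_protein_ids(
    data_table,
    column_name = "Accession",
    host = "dec2017.archive.ensembl.org",  # archived host named in the details
    ID1 = "uniprotswissprot",
    ID2 = "hgnc_symbol",
    verbose = TRUE  # per the argument description, writes the database version to a file
  )
}
```

Setting verbose = TRUE together with an archived host records which database release produced the converted identifiers.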