Convert protein ids — convert_protein_ids

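This function converts the protein identifiers in a data frame or file from one identifier type to another using biomaRt (by default from "uniprotswissprot" to "hgnc_symbol").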

convert_protein_ids(
  data_table,
  column_name = "Protein",
  species = "hsapiens_gene_ensembl",
  host = "www.ensembl.org",
  mart = "ENSEMBL_MART_ENSEMBL",
  ID1 = "uniprotswissprot",
  ID2 = "hgnc_symbol",
  id.separator = "/",
  copy_nonconverted = TRUE,
  verbose = FALSE
)

Arguments

data_table: A data frame or file name.
column_name: The column name where the original protein identifiers are present.
species: The species of the protein identifiers, given as the dataset name used by biomaRt (e.g. "hsapiens_gene_ensembl", "mmusculus_gene_ensembl", "drerio_gene_ensembl").
host: Host address of the biomaRt database (e.g. "www.ensembl.org", "dec2017.archive.ensembl.org").
mart: The type of mart (e.g. "ENSEMBL_MART_ENSEMBL").
ID1: The type of the original protein identifiers (e.g. "uniprotswissprot", "ensembl_peptide_id").
ID2: The type of the converted protein identifiers (e.g. "hgnc_symbol", "mgi_symbol", "external_gene_name").
id.separator: Separator between protein identifiers of shared peptides.
copy_nonconverted: Option defining whether identifiers that cannot be converted should be copied over unchanged.
verbose: Option to write a file containing the version of the database used.
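
The role of id.separator and copy_nonconverted can be illustrated offline with a minimal sketch. This is not the function's actual implementation: convert_ids_sketch, the lookup vector (standing in for a biomaRt query result), and the column name Protein_converted are hypothetical names chosen for illustration. The sketch always copies unmatched parts, mirroring the default copy_nonconverted = TRUE.

```r
# Hypothetical lookup table standing in for the result of a biomaRt query
# (UniProt/Swiss-Prot accession -> HGNC symbol).
lookup <- c(Q01581 = "HMGCS1", P49327 = "FASN",
            P63261 = "ACTG1", P60709 = "ACTB")

# Illustrative helper, not part of the package: split each entry on the
# separator, convert every part, and copy parts without a match unchanged.
convert_ids_sketch <- function(ids, lookup, id.separator = "/") {
  vapply(as.character(ids), function(id) {
    parts <- strsplit(id, id.separator, fixed = TRUE)[[1]]
    converted <- unname(lookup[parts])      # NA where no mapping exists
    missing <- is.na(converted)
    converted[missing] <- parts[missing]    # keep non-converted parts as-is
    paste(converted, collapse = id.separator)
  }, character(1), USE.NAMES = FALSE)
}

data_table <- data.frame(
  Protein = c("Q01581", "P49327", "2/P63261/P60709"),
  Abundance = c(100, 3390, 43423)
)

# The name of the added column is an assumption made for this sketch.
data_table$Protein_converted <- convert_ids_sketch(data_table$Protein, lookup)
print(data_table)
```

In this sketch the leading "2" of the shared-peptide entry has no mapping and is therefore copied unchanged; how the real function treats such parts may differ.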

Value

The data frame with an added column of the converted protein identifiers.

Note

Protein identifiers from shared peptides should be separated by a forward slash (the default id.separator). The host of an archived Ensembl database can also be specified (e.g. "dec2017.archive.ensembl.org").
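
As a hedged illustration of the note above, the arguments for a mouse data set queried against an archived Ensembl release might be assembled as follows. The values are taken from the argument descriptions; mouse_args is a hypothetical name, and the call itself is wrapped in if (FALSE) because it needs a network connection, the package providing convert_protein_ids, and a data_table holding mouse accessions.

```r
# Arguments for converting mouse UniProt/Swiss-Prot accessions to MGI symbols
# against an archived Ensembl host (all values appear in the argument list).
mouse_args <- list(
  column_name = "Protein",
  species = "mmusculus_gene_ensembl",
  host = "dec2017.archive.ensembl.org",
  ID1 = "uniprotswissprot",
  ID2 = "mgi_symbol"
)

if (FALSE) {
  # Not run: requires network access and a data frame `data_table`
  # whose "Protein" column contains mouse identifiers.
  converted <- do.call(convert_protein_ids, c(list(data_table), mouse_args))
}
```

Pinning the host to an archive keeps the conversion reproducible, since the current release at "www.ensembl.org" changes over time.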

Author

Peter Blattmann

Examples

if (FALSE) {
  data_table <- data.frame(
    "Protein" = c("Q01581", "P49327", "2/P63261/P60709"),
    "Abundance" = c(100, 3390, 43423)
  )
  convert_protein_ids(data_table)
}