Peptide objects

ms2ml.peptide

Attributes

Classes

ms2ml.peptide.Peptide(sequence, properties, config, extras) -> None

Bases: ProForma

Represents a peptide sequence with modifications.

Examples:

>>> p = Peptide.from_sequence("MYPEPTIDE")
>>> p.mass
1093.46377747225
>>> p = Peptide.from_sequence("MYPEPTIDE/2")
>>> p.charge
2
>>> p = Peptide.from_sequence("J")
>>> p.mass
131.09462866083
>>> p = Peptide.from_sequence("X")
>>> p.mass
18.010564683699997
>>> p = Peptide.from_sequence("Z")

Note that this does not raise an error, although it should.

>>> p.mass
18.010564683699997
Attributes
config = config instance-attribute
extras = extras instance-attribute
stripped_sequence property

Returns the stripped sequence of the peptide.

Examples:

>>> p = Peptide.from_sequence("PEPTIDE")
>>> p.stripped_sequence
'PEPTIDE'
>>> p = Peptide.from_sequence("PEPTIDE/2")
>>> p.stripped_sequence
'PEPTIDE'
>>> p = Peptide.from_sequence("PEPM[Oxidation]ASDA")
>>> p.stripped_sequence
'PEPMASDA'
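
The stripping behavior shown in these examples can be approximated with a few regular expressions. This is a minimal sketch, not the library's implementation: `strip_proforma` is a hypothetical helper, and it only handles bracketed modifications, `<...>` global modification prefixes, terminal dashes, and a trailing `/charge`.

```python
import re


def strip_proforma(seq: str) -> str:
    """Hypothetical helper: reduce a ProForma-like string to bare residues."""
    seq = re.sub(r"<[^>]*>", "", seq)     # global modification rules, e.g. <[UNIMOD:4]@C>
    seq = re.sub(r"\[[^\]]*\]", "", seq)  # bracketed modifications, e.g. [Oxidation]
    seq = re.sub(r"/\d+$", "", seq)       # trailing charge state, e.g. /2
    return seq.replace("-", "")           # separators left behind by terminal modifications


print(strip_proforma("PEPM[Oxidation]ASDA"))  # PEPMASDA
print(strip_proforma("PEPTIDE/2"))            # PEPTIDE
```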
ProForma property
mz property

Returns the m/z of the peptide.
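
For orientation, m/z is conventionally derived from the neutral mass and the charge state. The sketch below assumes the textbook definition with a proton as the charge carrier. `PROTON_MASS` and `neutral_mass_to_mz` are illustrative names that do not come from ms2ml, and the library may use a slightly different constant.

```python
PROTON_MASS = 1.007276466  # Da; assumed charge carrier, not taken from ms2ml


def neutral_mass_to_mz(mass: float, charge: int) -> float:
    """Hypothetical helper: m/z of a neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON_MASS) / charge


# Neutral mass of "PEPTIDE" as reported in the examples on this page.
print(f"{neutral_mass_to_mz(799.3599640267099, 2):.4f}")
```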

mass_pyteomics: float property

Returns the mass of the peptide.

charge: int property
fragment_masses: list property
Functions
pre_parse_mods(seq, config) -> str staticmethod

Parse the modifications in the sequence.

from_proforma_seq(seq, config: Config | None = None, extras = None) -> Peptide classmethod

Generates a peptide from a proforma sequence.

Examples:

>>> p = Peptide.from_proforma_seq("PEPTIDE")
>>> p.mass
799.3599640267099
>>> p = Peptide.from_proforma_seq("PEPTIDE", extras={"test": 1})
>>> p.extras
{'test': 1}
from_sequence(*args, **kwargs) classmethod

Alias for from_proforma_seq.

from_ProForma(proforma: ProForma, config, extras = None) -> Peptide classmethod

Creates a peptide from a pyteomics.proforma.ProForma object.

Examples:

>>> from pyteomics.proforma import ProForma, parse
>>> config = Config()
>>> seq, props = parse("PEPTIDE")
>>> p = ProForma(seq, props)
>>> p = Peptide.from_ProForma(p, config)
>>> p.mass
799.3599
to_proforma() -> str

Converts the peptide to a string following the proforma specifications.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.to_proforma()
'<[UNIMOD:4]@C>AMC'
to_massdiff_seq() -> str

Converts the peptide to a string following the massdiff specifications.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.to_massdiff_seq()
'AMC[+57.021464]'
>>> p = Peptide.from_sequence("[UNIMOD:1]-AMC")
>>> p.to_massdiff_seq()
'A[+42.010565]MC[+57.021464]'
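
The mass-difference notation in these outputs is a signed mass shift with six decimals, written in brackets after the residue. A minimal sketch of that formatting follows. `to_massdiff` and its `(residue, delta)` input are hypothetical, and the N-terminal shift is passed on the first residue to mirror the second example.

```python
def to_massdiff(residues):
    """Hypothetical helper: render (residue, mass_shift_or_None) pairs."""
    parts = []
    for aa, delta in residues:
        # "+.6f" forces an explicit sign and six decimals, as in the outputs above.
        parts.append(aa if delta is None else f"{aa}[{delta:+.6f}]")
    return "".join(parts)


print(to_massdiff([("A", None), ("M", None), ("C", 57.021464)]))
# AMC[+57.021464]
print(to_massdiff([("A", 42.010565), ("M", None), ("C", 57.021464)]))
# A[+42.010565]MC[+57.021464]
```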
validate() -> bool

Validates the built peptide.

Not yet implemented.

mass() -> float

Calculates the mass of the peptide.

Examples:

>>> p = Peptide.from_sequence("MYPEPTIDE")
>>> p.mass
1093.46377747225
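
The reported value is consistent with the usual monoisotopic calculation: the sum of the residue masses plus one water. The sketch below reproduces it with a small hand-written mass table rounded to five decimals. `RESIDUE_MASS`, `WATER`, and `monoisotopic_mass` are illustrative names and this is not how ms2ml computes the mass internally.

```python
# Monoisotopic residue masses in Da (subset, rounded to 5 decimals).
RESIDUE_MASS = {
    "M": 131.04049, "Y": 163.06333, "P": 97.05276, "E": 129.04259,
    "T": 101.04768, "I": 113.08406, "D": 115.02694,
}
WATER = 18.01056  # one H2O for the free N- and C-termini


def monoisotopic_mass(seq: str) -> float:
    """Hypothetical helper: neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


print(f"{monoisotopic_mass('MYPEPTIDE'):.3f}")  # 1093.464
print(f"{monoisotopic_mass('PEPTIDE'):.3f}")    # 799.360
```

With no residues, only the water term remains, which agrees with the 18.0105 reported for "X" in the class examples.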
__str__() -> str
__getitem__(i)

Slices a peptide.

Examples:

>>> pep = Peptide._sample()
>>> pep.stripped_sequence
'PEPTIDEPINK'
>>> foo = pep[:2]
>>> foo.stripped_sequence
'PE'
__len__() -> int

Returns the length of the peptide sequence.

ion_series(ion_type: str, charge: int) -> NDArray[np.float32]

Calculates all the masses of an ion type.

Calculates the masses of all fragments of the peptide for a given ion type and charge.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.ion_series("a", 1)
array([ 44.05003, 175.0905 ], dtype=float32)
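
The a- and b-series values printed on this page can be reproduced by accumulating residue masses from the N-terminus. The sketch below handles singly charged ions only. It assumes a hydrogen atom mass (about 1.007825 Da) as the charge carrier because that matches the printed numbers. That constant is inferred from the outputs rather than taken from the library source, and all names are illustrative.

```python
RESIDUE_MASS = {
    "A": 71.03711,
    "M": 131.04049,
    "C": 103.00919 + 57.02146,  # cysteine plus carbamidomethyl (assumed fixed mod)
}
H_MASS = 1.007825   # Da; inferred from the printed values, an assumption
CO_MASS = 27.994915  # Da; a ions are b ions minus CO


def b_ions(seq: str) -> list[float]:
    """Hypothetical helper: singly charged b-ion values for prefixes of seq."""
    out, running = [], 0.0
    for aa in seq[:-1]:  # the full-length prefix is not a fragment
        running += RESIDUE_MASS[aa]
        out.append(running + H_MASS)
    return out


def a_ions(seq: str) -> list[float]:
    """Hypothetical helper: singly charged a-ion values (b minus CO)."""
    return [mz - CO_MASS for mz in b_ions(seq)]


print([f"{mz:.3f}" for mz in b_ions("AMC")])
print([f"{mz:.3f}" for mz in a_ions("AMC")])
```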
annotated_ion_series(ion_type: str, charge: int) -> list[AnnotatedIon]

Returns a list of annotated ions.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.annotated_ion_series("b", 1)
[AnnotatedIon(mass=72.044945, charge=1,
position=1, ion_series='b', intensity=0, neutral_loss=None),
AnnotatedIon(mass=203.08542,
charge=1, position=2, ion_series='b', intensity=0, neutral_loss=None)]
ion_dict() -> dict[str, AnnotatedIon]

Returns a dictionary of all ion series for the peptide.

Raises:

ValueError: If the peptide does not have a charge state.

Examples:

>>> p = Peptide.from_sequence("PEPPINK/2")
>>> p.ion_dict
{'y1^1': AnnotatedIon(mass=147.11334, ...
charge=2, position=6, ion_series='b', intensity=0, neutral_loss=None)}
>>> p.ion_dict["y5^1"].mass
568.34537
theoretical_ion_labels() -> np.ndarray
theoretical_ion_masses() -> np.ndarray
aa_to_onehot()

Converts the peptide sequence to a one-hot encoding.

Returns a binary array of shape

(nterm + peptide_length + cterm, len(self.config.encoding_aa_order))

The positions along the second axis are the one-hot encoding of the amino acid, matching the order of the encoding_aa_order argument in the config.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "C", "c_term"], the array would be:

[
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1]
]

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.aa_to_onehot()
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 1, 0]], dtype=int32)
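
The encoding itself is simple to express without the library. This is a minimal pure-Python sketch, assuming a five-token order `["n_term", "A", "B", "C", "c_term"]` and nested lists in place of a NumPy array. `one_hot` is a hypothetical helper.

```python
def one_hot(tokens, order):
    """Hypothetical helper: one row per token, one column per entry of `order`."""
    index = {tok: i for i, tok in enumerate(order)}
    rows = []
    for tok in tokens:
        row = [0] * len(order)
        row[index[tok]] = 1  # single hot position for this token
        rows.append(row)
    return rows


order = ["n_term", "A", "B", "C", "c_term"]
tokens = ["n_term", *"ABA", "c_term"]  # termini are encoded as their own tokens
for row in one_hot(tokens, order):
    print(row)
```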
mod_to_onehot()

Converts the peptide modifications to a one-hot encoding.

Returns a binary array of shape

(nterm + peptide_length + cterm, len(self.config.encoding_mod_order))

The positions along the second axis are the one-hot encoding of the modification, matching the order of the encoding_mod_order argument in the config.

For instance, if the peptide was "AC" and the encoding_mod_order was [None, "[UNIMOD:4]"], where [UNIMOD:4] is carbamidomethyl, the array would be:

[
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 0],
]

Note that the 3rd position shows up as modified due to the implicit carbamidomethylation of C.

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.mod_to_onehot()
array([[1, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0]], dtype=int32)
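
To make the role of the implicit modification concrete, the sketch below first derives one modification label per position (termini included) and then one-hot encodes the labels. It assumes a single fixed rule, carbamidomethyl on C. `FIXED_MODS`, `mod_labels`, and `mod_one_hot` are hypothetical names, not part of ms2ml.

```python
FIXED_MODS = {"C": "[UNIMOD:4]"}  # assumed fixed rule: carbamidomethyl on cysteine


def mod_labels(seq: str):
    """Hypothetical helper: modification label per position, termini included."""
    labels = [None]  # n_term carries no modification here
    labels += [FIXED_MODS.get(aa) for aa in seq]
    labels.append(None)  # c_term
    return labels


def mod_one_hot(labels, order):
    """Hypothetical helper: one row per position, one column per entry of `order`."""
    return [[int(label == entry) for entry in order] for label in labels]


labels = mod_labels("AC")
print(labels)  # [None, None, '[UNIMOD:4]', None]
print(mod_one_hot(labels, [None, "[UNIMOD:4]"]))  # [[1, 0], [1, 0], [0, 1], [1, 0]]
```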
decode_onehot(config: Config, seq_onehot: np.ndarray, mod_onehot: np.ndarray | None = None) -> Peptide staticmethod

Decodes a one-hot encoded vector into a peptide sequence.

Examples:

>>> config = Config()
>>> foo = Peptide.from_sequence("AMC", config=config)
>>> onehot = foo.aa_to_onehot()
>>> mod_onehot = foo.mod_to_onehot()
>>> Peptide.decode_onehot(config, onehot, mod_onehot)
Peptide([('A', None), ('M', None),
 ('C', [UnimodModification('4', None, None)])],
 {'n_term': None, 'c_term': None, 'unlocalized_modifications': [],
  'labile_modifications': [],
  'fixed_modifications':
      [ModificationRule(UnimodModification('4', None, None), ['C'])],
  'intervals': [], 'isotopes': [], 'group_ids': [], 'charge_state': None})
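
Conceptually, decoding takes the position of the hot entry in each row and looks it up in the configured order. The sketch below shows only that step on nested lists. `decode_one_hot` is a hypothetical helper, and it returns tokens rather than a Peptide.

```python
def decode_one_hot(rows, order):
    """Hypothetical helper: map each row back to the token at its hot index."""
    return [order[row.index(max(row))] for row in rows]


order = ["n_term", "A", "B", "C", "c_term"]
rows = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
tokens = decode_one_hot(rows, order)
print(tokens)  # ['n_term', 'A', 'B', 'A', 'c_term']
# Drop the terminal tokens to recover the bare sequence.
print("".join(t for t in tokens if t not in ("n_term", "c_term")))  # ABA
```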
decode_vector(config: Config, seq: np.ndarray, mod: np.ndarray | None, charge: int | None = None) -> Peptide staticmethod

Decodes integer-encoded amino acid and modification vectors into a peptide.

Examples:

>>> config = Config()
>>> foo = Peptide.from_sequence("AMC", config)
>>> foo.aa_to_vector()
array([ 0,  1, 13,  3, 27])
>>> foo.mod_to_vector()  # Default config has carbamido
array([0, 0, 0, 1, 0])
>>> Peptide.decode_vector(
...     foo.config, foo.aa_to_vector(), foo.mod_to_vector()
... )
Peptide([('A', None), ('M', None),
 ('C', [UnimodModification('4', None, None)])],
 {'n_term': None, 'c_term': None, 'unlocalized_modifications': [],
  'labile_modifications': [],
  'fixed_modifications':
      [ModificationRule(UnimodModification('4', None, None), ['C'])],
  'intervals': [], 'isotopes': [], 'group_ids': [], 'charge_state': None})
aa_to_count()

Converts the peptide sequence to a vector of amino acid counts.

Returns an integer array of length

len(self.config.encoding_aa_order)

Each position holds the number of times the corresponding entry of the encoding_aa_order argument in the config occurs in the peptide, with the terminal tokens counted once each.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "C", "c_term"], the vector would be:

[1, 2, 1, 0, 1]

Examples:

>>> foo = Peptide.from_sequence("AAMC")
>>> foo.aa_to_count()
array([1, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0])
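
A count encoding of this kind can be sketched with `collections.Counter`. `count_vector` is a hypothetical helper, and the five-token order is an assumption chosen to keep the output short.

```python
from collections import Counter


def count_vector(tokens, order):
    """Hypothetical helper: occurrences of each entry of `order` in `tokens`."""
    counts = Counter(tokens)
    return [counts[tok] for tok in order]  # missing tokens count as 0


order = ["n_term", "A", "B", "C", "c_term"]
tokens = ["n_term", *"ABA", "c_term"]
print(count_vector(tokens, order))  # [1, 2, 1, 0, 1]
```

Unlike the one-hot and index encodings, this representation discards residue order, so it cannot be decoded back into a unique sequence.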
mod_to_count()
aa_to_vector()

Converts the peptide sequence to a vector encoding.

Returns an integer array of length

(nterm + peptide_length + cterm)

The number at each position is the index of the corresponding amino acid in the encoding_aa_order argument of the config.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "c_term"], the vector would be:

[0, 1, 2, 1, 3]

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.aa_to_vector()
array([ 0,  1, 13,  3, 27])
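
The index encoding and its inverse can be sketched in a few lines. Both helpers are hypothetical and use plain lists. The four-token order matches the "ABA" illustration for this method.

```python
def to_index_vector(tokens, order):
    """Hypothetical helper: replace each token by its index in `order`."""
    index = {tok: i for i, tok in enumerate(order)}
    return [index[tok] for tok in tokens]


def from_index_vector(vector, order):
    """Hypothetical helper: inverse lookup, index back to token."""
    return [order[i] for i in vector]


order = ["n_term", "A", "B", "c_term"]
tokens = ["n_term", *"ABA", "c_term"]
vector = to_index_vector(tokens, order)
print(vector)                            # [0, 1, 2, 1, 3]
print(from_index_vector(vector, order))  # ['n_term', 'A', 'B', 'A', 'c_term']
```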
mod_seq()

Returns the sequence of modifications matching the amino acid positions.

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.mod_seq
[None, None, None, '[UNIMOD:4]', None]
mod_to_vector()

Converts modifications to vectors.

Converts the modifications of the peptide sequence to a vector encoding.

Examples:

>>> foo = Peptide.from_sequence("AMC")  # Implicit Carbamido.
>>> foo.mod_to_vector()
array([0, 0, 0, 1, 0])
from_vector(aa_vector: list[int], mod_vector, config: Config) classmethod

Converts vectors back to peptides.

Examples:

>>> foo = Peptide.from_vector([0, 1, 13, 3, 27], [0, 0, 0, 1, 0], Config())
>>> foo.to_proforma()
'<[UNIMOD:4]@C>AMC'
__iter__() -> Iterator[tuple[str, list[str] | None]]

Iterates over the peptide sequence.

Yields:

Iterator[tuple[str, list[str] | None]]: (aa, mod) tuples.

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> [x for x in foo]
[('n_term', None), ('A', None), ('M', None),
 ('C', ['[UNIMOD:4]']), ('c_term', None)]
>>> foo = Peptide.from_sequence("AMS[Phospho]C")
>>> [x for x in foo]
[('n_term', None), ('A', None), ('M', None),
('S', ['[UNIMOD:21]']), ('C', ['[UNIMOD:4]']), ('c_term', None)]
from_iter(it, config: Config) staticmethod

Creates a peptide from an iterator of (aa, mod) tuples.

Examples:

>>> foo = Peptide.from_iter(
...     [
...         ("n_term", None),
...         ("A", None),
...         ("M", None),
...         ("C", ["[UNIMOD:4]"]),
...         ("c_term", None),
...     ],
...     config=Config(),
... )
>>> foo.to_proforma()
'<[UNIMOD:4]@C>AMC[UNIMOD:4]'
>>> foo = Peptide._sample()
>>> foo.to_proforma()
'[UNIMOD:1]-PEPT[UNIMOD:21]IDEPINK'
>>> elems = [x for x in foo]
>>> foo = Peptide.from_iter(elems, config=Config())
>>> foo.to_proforma()
'[UNIMOD:1]-PEPT[UNIMOD:21]IDEPINK'
__iter_base() -> list[tuple[str, list[str] | None]]
get_mod_isoforms() -> list[Peptide]

Returns a list of possible modification isoforms of a peptide.

Examples:

>>> foo = Peptide.from_sequence("AM[UNIMOD:35]AMK")
>>> out = foo.get_mod_isoforms()
>>> sorted([x.to_proforma() for x in out])
['AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK']
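
Judging from the example, isoforms keep the number of modifications fixed and move them across the eligible residues. A minimal sketch of that enumeration for a single modification type follows, using `itertools.combinations`. `mod_isoforms` and its arguments are hypothetical, and the real method may handle more cases.

```python
from itertools import combinations


def mod_isoforms(seq, target, mod, n_mods):
    """Hypothetical helper: place `n_mods` copies of `mod` on `target` residues."""
    sites = [i for i, aa in enumerate(seq) if aa == target]
    isoforms = []
    for chosen in combinations(sites, n_mods):
        isoforms.append(
            "".join(aa + (mod if i in chosen else "") for i, aa in enumerate(seq))
        )
    return isoforms


print(sorted(mod_isoforms("AMAMK", "M", "[UNIMOD:35]", 1)))
# ['AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK']
```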
get_variable_possible_mods()

Returns a list of possible modifications for each amino acid.

Examples:

>>> foo = Peptide.from_sequence("AMAMK")
>>> out = foo.get_variable_possible_mods()
>>> sorted([x.to_proforma() for x in out])
['AMAMK', 'AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK',
 'AM[UNIMOD:35]AM[UNIMOD:35]K']
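
The four outputs correspond to every on/off combination of the variable modification across the eligible residues. The sketch below reproduces that enumeration with `itertools.product`. `variable_mod_forms` and the `variable_mods` mapping are hypothetical stand-ins for whatever the config defines.

```python
from itertools import product


def variable_mod_forms(seq, variable_mods):
    """Hypothetical helper: all combinations of optional mods per residue."""
    # Each residue may stay unmodified ("") or take any of its allowed mods.
    options = [("",) + tuple(variable_mods.get(aa, ())) for aa in seq]
    return [
        "".join(aa + mod for aa, mod in zip(seq, choice))
        for choice in product(*options)
    ]


forms = variable_mod_forms("AMAMK", {"M": ["[UNIMOD:35]"]})
print(sorted(forms))
# ['AMAMK', 'AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK', 'AM[UNIMOD:35]AM[UNIMOD:35]K']
```

The number of forms grows exponentially with the number of eligible sites, which is worth keeping in mind for long peptides.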

Functions