Peptide objects

`ms2ml.peptide`

Attributes

Classes

`ms2ml.peptide.Peptide(sequence, properties, config, extras) -> None`

Bases: ProForma

Represents a peptide sequence with modifications.

Examples:

>>> p = Peptide.from_sequence("MYPEPTIDE")
>>> p.mass
1093.46377747225
>>> p = Peptide.from_sequence("MYPEPTIDE/2")
>>> p.charge
2
>>> p = Peptide.from_sequence("J")
>>> p.mass
131.09462866083
>>> p = Peptide.from_sequence("X")
>>> p.mass
18.010564683699997
>>> p = Peptide.from_sequence("Z")

Note that parsing "Z" does not raise an error, although it should:

>>> p.mass
18.010564683699997

Attributes

`config = config` `instance-attribute`

`extras = extras` `instance-attribute`

`stripped_sequence` `property`

Returns the stripped sequence of the peptide.

Examples:

>>> p = Peptide.from_sequence("PEPTIDE")
>>> p.stripped_sequence
'PEPTIDE'
>>> p = Peptide.from_sequence("PEPTIDE/2")
>>> p.stripped_sequence
'PEPTIDE'
>>> p = Peptide.from_sequence("PEPM[Oxidation]ASDA")
>>> p.stripped_sequence
'PEPMASDA'
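As a rough illustration of what "stripping" means here, the sketch below removes modifications, fixed-modification rules, and the charge suffix from a ProForma-like string using regular expressions. `strip_proforma` is a hypothetical helper written for this example; it is not part of ms2ml, and the library's own parsing may well work differently.

```python
import re


def strip_proforma(seq: str) -> str:
    """Hypothetical helper: drop modifications and charge from a ProForma-like string."""
    seq = re.sub(r"<[^>]*>", "", seq)     # global fixed-modification rules, e.g. <[UNIMOD:4]@C>
    seq = re.sub(r"\[[^\]]*\]", "", seq)  # bracketed modifications, e.g. [Oxidation]
    seq = re.sub(r"/\d+$", "", seq)       # trailing charge state, e.g. /2
    return seq.replace("-", "")           # separators left behind by terminal modifications


print(strip_proforma("PEPM[Oxidation]ASDA"))
print(strip_proforma("PEPTIDE/2"))
```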

`ProForma` `property`

`mz` `property`

Returns the mz of the peptide.
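For orientation, m/z is conventionally derived from the neutral mass and the charge as (M + z * m_proton) / z. The sketch below implements that textbook relation. `mz_from_mass` and the `PROTON_MASS` constant are assumptions made for this example; whether ms2ml uses exactly this constant, or how it behaves for a peptide without a charge, is not stated in this reference.

```python
PROTON_MASS = 1.007276466812  # Da; assumed constant, not taken from ms2ml


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """Hypothetical helper: m/z of a positively charged ion from its neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Neutral mass of "PEPTIDE" as reported in the examples on this page, at charge 2.
print(mz_from_mass(799.3599640267099, 2))
```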

`mass_pyteomics: float` `property`

Returns the mass of the peptide.

`charge: int` `property`

`fragment_masses: list` `property`

Functions

`pre_parse_mods(seq, config) -> str` `staticmethod`

Parse the modifications in the sequence.

`from_proforma_seq(seq, config: Config | None = None, extras = None) -> Peptide` `classmethod`

Generates a peptide from a proforma sequence.

Examples:

>>> p = Peptide.from_proforma_seq("PEPTIDE")
>>> p.mass
799.3599640267099
>>> p = Peptide.from_proforma_seq("PEPTIDE", extras={"test": 1})
>>> p.extras
{'test': 1}

`from_sequence(*args, **kwargs)` `classmethod`

Alias for from_proforma_seq.

`from_ProForma(proforma: ProForma, config, extras = None) -> Peptide` `classmethod`

Creates a peptide from a pyteomics.proforma.ProForma object.

Examples:

>>> from pyteomics.proforma import ProForma, parse
>>> config = Config()
>>> seq, props = parse("PEPTIDE")
>>> p = ProForma(seq, props)
>>> p = Peptide.from_ProForma(p, config)
>>> p.mass
799.3599

`to_proforma() -> str`

Converts the peptide to a string following the proforma specifications.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.to_proforma()
'<[UNIMOD:4]@C>AMC'

`to_massdiff_seq() -> str`

Converts the peptide to a string following the massdiff specifications.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.to_massdiff_seq()
'AMC[+57.021464]'
>>> p = Peptide.from_sequence("[UNIMOD:1]-AMC")
>>> p.to_massdiff_seq()
'A[+42.010565]MC[+57.021464]'
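The mass-diff notation shown above can be pictured as each modified residue followed by its signed mass delta in brackets. The sketch below renders that format from a list of (amino acid, delta) pairs. `to_massdiff` and its input layout are hypothetical and only mimic the output shape of the examples; they do not reproduce how ms2ml resolves modification masses.

```python
def to_massdiff(residues):
    """Hypothetical helper: render (amino_acid, mass_delta_or_None) pairs as a mass-diff string."""
    parts = []
    for aa, delta in residues:
        # Unmodified residues are written bare; modified ones get a signed, 6-decimal delta.
        parts.append(aa if delta is None else f"{aa}[{delta:+.6f}]")
    return "".join(parts)


# Acetylated N-terminus folded onto the first residue, plus carbamidomethylated C.
print(to_massdiff([("A", 42.010565), ("M", None), ("C", 57.021464)]))
```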

`validate() -> bool`

Validates the built peptide.

Not yet implemented.

`mass() -> float`

Calculates the mass of the peptide.

Examples:

>>> p = Peptide.from_sequence("MYPEPTIDE")
>>> p.mass
1093.46377747225
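The value above is consistent with the usual monoisotopic calculation: the sum of the residue masses plus one water. The sketch below reproduces it for unmodified sequences. The mass table is rounded to five decimals, and `peptide_mass` is a hypothetical helper that ignores modifications (including any implicit carbamidomethylation of C), so it is only an approximation of what the library computes.

```python
# Monoisotopic residue masses in Da, rounded to 5 decimals (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565  # Da; one water accounts for the free N- and C-termini


def peptide_mass(sequence: str) -> float:
    """Hypothetical helper: monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


print(peptide_mass("MYPEPTIDE"))
```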

`__str__() -> str`

`__getitem__(i)`

Slices a peptide.

Examples:

>>> pep = Peptide._sample()
>>> pep.stripped_sequence
'PEPTIDEPINK'
>>> foo = pep[:2]
>>> foo.stripped_sequence
'PE'

`__len__() -> int`

Returns the length of the peptide sequence.

`ion_series(ion_type: str, charge: int) -> NDArray[np.float32]`

Calculates all the masses of an ion type.

Calculates the masses of all fragments of the peptide for a given ion type and charge.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.ion_series("a", 1)
array([ 44.05003, 175.0905 ], dtype=float32)
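To make the idea of an ion series concrete, the sketch below computes b- and a-ion m/z values with the textbook definitions: a b-ion is the sum of the first k residue masses plus the charge carriers, and an a-ion is the corresponding b-ion minus CO. The helper names, the partial mass table, and the constants are assumptions for this example. The results agree with the doctest output above to roughly 1e-3 Da; the exact constants ms2ml uses are not given in this reference.

```python
RESIDUE_MASS = {"A": 71.03711, "M": 131.04049, "C": 103.00919}  # only what the example needs
PROTON_MASS = 1.007276  # Da; assumed constant
CO_MASS = 27.994915     # Da; mass difference between b- and a-ions


def b_ions(sequence: str, charge: int = 1) -> list:
    """Hypothetical helper: m/z of the b-ions for prefixes of length 1..n-1."""
    mzs, running = [], 0.0
    for aa in sequence[:-1]:  # the full-length prefix is not a fragment
        running += RESIDUE_MASS[aa]
        mzs.append((running + charge * PROTON_MASS) / charge)
    return mzs


def a_ions(sequence: str, charge: int = 1) -> list:
    """Hypothetical helper: a-ions are b-ions that have lost CO."""
    return [mz - CO_MASS / charge for mz in b_ions(sequence, charge)]


print(a_ions("AMC", 1))
```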

`annotated_ion_series(ion_type: str, charge: int) -> list[AnnotatedIon]`

Returns a list of annotated ions.

Examples:

>>> p = Peptide.from_sequence("AMC")
>>> p.annotated_ion_series("b", 1)
[AnnotatedIon(mass=72.044945, charge=1,
position=1, ion_series='b', intensity=0, neutral_loss=None),
AnnotatedIon(mass=203.08542,
charge=1, position=2, ion_series='b', intensity=0, neutral_loss=None)]

`ion_dict() -> dict[str, AnnotatedIon]`

Returns a dictionary of all ion series for the peptide.

| RAISES | DESCRIPTION |
| --- | --- |
| `ValueError` | If the peptide does not have a charge state. |

Examples:

>>> p = Peptide.from_sequence("PEPPINK/2")
>>> p.ion_dict
{'y1^1': AnnotatedIon(mass=147.11334, ...
charge=2, position=6, ion_series='b', intensity=0, neutral_loss=None)}
>>> p.ion_dict["y5^1"].mass
568.34537

`theoretical_ion_labels() -> np.ndarray`

`theoretical_ion_masses() -> np.ndarray`

`aa_to_onehot()`

Converts the peptide sequence to a one-hot encoding.

Returns a binary array of shape

(nterm + peptide_length + cterm, len(self.config.encoding_aa_order))

The positions along the second axis are the one-hot encoding of the amino acid, matching the order of the encoding_aa_order argument in the config.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "C", "c_term"], the vector would be:

[
    [1, 0, 0, 0 ,0],
    [0, 1, 0, 0 ,0],
    [0, 0, 1, 0 ,0],
    [0, 1, 0, 0 ,0],
    [0, 0, 0, 0 ,1]
]

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.aa_to_onehot()
array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 1, 0]], dtype=int32)
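The encoding described above can be sketched in a few lines of plain Python: wrap the sequence in terminus tokens, look up each token's column, and set that column to 1. `one_hot` is a hypothetical stand-in that returns nested lists rather than a NumPy array, and the five-entry order used here is the toy alphabet from the description, not the library's default.

```python
def one_hot(sequence: str, order: list) -> list:
    """Hypothetical helper: one row per position (termini included), one column per token."""
    column = {token: i for i, token in enumerate(order)}
    rows = []
    for token in ["n_term", *sequence, "c_term"]:
        row = [0] * len(order)
        row[column[token]] = 1  # mark the column that matches this token
        rows.append(row)
    return rows


for row in one_hot("ABA", ["n_term", "A", "B", "C", "c_term"]):
    print(row)
```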

`mod_to_onehot()`

Converts the peptide's modifications to a one-hot encoding.

Returns a binary array of shape

(nterm + peptide_length + cterm, len(self.config.encoding_mod_order))

The positions along the second axis are the one-hot encoding of the modification, matching the order of the encoding_mod_order argument in the config.

For instance, if the peptide was "AC" and the encoding_mod_order was [None, "[UNIMOD:4]"], where [UNIMOD:4] is carbamidomethyl, the vector would be:

[
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 0],
]

Note that the 3rd position shows up as modified due to the implicit carbamidomethylation of C.

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.mod_to_onehot()
array([[1, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0]], dtype=int32)

`decode_onehot(config: Config, seq_onehot: np.ndarray, mod_onehot: np.ndarray | None = None) -> Peptide` `staticmethod`

Decodes a one-hot encoded vector into a peptide sequence.

Examples:

>>> config = Config()
>>> foo = Peptide.from_sequence("AMC", config=config)
>>> onehot = foo.aa_to_onehot()
>>> mod_onehot = foo.mod_to_onehot()
>>> Peptide.decode_onehot(config, onehot, mod_onehot)
Peptide([('A', None), ('M', None),
 ('C', [UnimodModification('4', None, None)])],
 {'n_term': None, 'c_term': None, 'unlocalized_modifications': [],
  'labile_modifications': [],
  'fixed_modifications':
      [ModificationRule(UnimodModification('4', None, None), ['C'])],
  'intervals': [], 'isotopes': [], 'group_ids': [], 'charge_state': None})

`decode_vector(config: Config, seq: np.ndarray, mod: np.ndarray | None, charge: int | None = None) -> Peptide` `staticmethod`

Decodes index-encoded sequence and modification vectors into a peptide.

Examples:

>>> config = Config()
>>> foo = Peptide.from_sequence("AMC", config)
>>> foo.aa_to_vector()
array([ 0,  1, 13,  3, 27])
>>> foo.mod_to_vector()  # Default config has carbamido
array([0, 0, 0, 1, 0])
>>> Peptide.decode_vector(
...     foo.config, foo.aa_to_vector(), foo.mod_to_vector()
... )
Peptide([('A', None), ('M', None),
 ('C', [UnimodModification('4', None, None)])],
 {'n_term': None, 'c_term': None, 'unlocalized_modifications': [],
  'labile_modifications': [],
  'fixed_modifications':
      [ModificationRule(UnimodModification('4', None, None), ['C'])],
  'intervals': [], 'isotopes': [], 'group_ids': [], 'charge_state': None})

`aa_to_count()`

Converts the peptide sequence to a vector of amino acid counts.

Returns an integer array of length

len(self.config.encoding_aa_order)

Each position holds the number of times the corresponding entry of the encoding_aa_order argument in the config occurs in the peptide, termini included.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "C", "c_term"], the vector would be:

[1, 2, 1, 0, 1]

Examples:

>>> foo = Peptide.from_sequence("AAMC")
>>> foo.aa_to_count()
array([1, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0])
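A count encoding of this kind amounts to tallying tokens and reading the tallies back in alphabet order. The sketch below shows one way to do that with `collections.Counter`. `count_vector` is a hypothetical helper, and the five-entry order is a toy alphabet rather than the library's default.

```python
from collections import Counter


def count_vector(sequence: str, order: list) -> list:
    """Hypothetical helper: occurrences of each token in `order`, termini included."""
    counts = Counter(["n_term", *sequence, "c_term"])
    # Counter returns 0 for tokens that never occur, so absent residues need no special case.
    return [counts[token] for token in order]


print(count_vector("ABA", ["n_term", "A", "B", "C", "c_term"]))
```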

`mod_to_count()`

`aa_to_vector()`

Converts the peptide sequence to a vector encoding.

Returns an integer array of length

(nterm + peptide_length + cterm)

The number in each position is the index of that amino acid in the encoding_aa_order argument in the config.

For instance, if the peptide was "ABA" and the encoding_aa_order was ["n_term", "A", "B", "c_term"], the vector would be:

[0, 1, 2, 1, 3]

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.aa_to_vector()
array([ 0,  1,  13,  3, 27])
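The index encoding and its inverse can be sketched as a pair of dictionary and list lookups. `encode` and `decode` below are hypothetical helpers that work on plain strings and lists; they ignore modifications and use the toy four-entry alphabet from the description.

```python
def encode(sequence: str, order: list) -> list:
    """Hypothetical helper: index of each token (termini included) in `order`."""
    index = {token: i for i, token in enumerate(order)}
    return [index[token] for token in ["n_term", *sequence, "c_term"]]


def decode(vector: list, order: list) -> str:
    """Hypothetical helper: map indices back to residues and drop the terminus tokens."""
    tokens = [order[i] for i in vector]
    return "".join(t for t in tokens if t not in ("n_term", "c_term"))


toy_order = ["n_term", "A", "B", "c_term"]
vec = encode("ABA", toy_order)
print(vec, decode(vec, toy_order))
```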

`mod_seq()`

Returns the sequence of modifications matching the amino acid positions.

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> foo.mod_seq
[None, None, None, '[UNIMOD:4]', None]

`mod_to_vector()`

Converts modifications to vectors.

Converts the modifications of the peptide sequence to a vector encoding, with one entry per position (termini included).

Examples:

>>> foo = Peptide.from_sequence("AMC")  # Implicit Carbamido.
>>> foo.mod_to_vector()
array([0, 0, 0, 1, 0])

`from_vector(aa_vector: list[int], mod_vector, config: Config)` `classmethod`

Converts vectors back to peptides.

Examples:

>>> foo = Peptide.from_vector([0, 1, 13, 3, 27], [0, 0, 0, 1, 0], Config())
>>> foo.to_proforma()
'<[UNIMOD:4]@C>AMC'

`__iter__() -> Iterator[tuple[str, list[str] | None]]`

Iterates over the peptide sequence.

| YIELDS | DESCRIPTION |
| --- | --- |
| `Iterator[tuple[str, list[str] \| None]]` | (aa, mod) tuples |

Examples:

>>> foo = Peptide.from_sequence("AMC")
>>> [x for x in foo]
[('n_term', None), ('A', None), ('M', None),
 ('C', ['[UNIMOD:4]']), ('c_term', None)]
>>> foo = Peptide.from_sequence("AMS[Phospho]C")
>>> [x for x in foo]
[('n_term', None), ('A', None), ('M', None),
('S', ['[UNIMOD:21]']), ('C', ['[UNIMOD:4]']), ('c_term', None)]

`from_iter(it, config: Config)` `staticmethod`

Creates a peptide from an iterator of (aa, mod) tuples.

Examples:

>>> foo = Peptide.from_iter(
...     [
...         ("n_term", None),
...         ("A", None),
...         ("M", None),
...         ("C", ["[UNIMOD:4]"]),
...         ("c_term", None),
...     ],
...     config=Config(),
... )
>>> foo.to_proforma()
'<[UNIMOD:4]@C>AMC[UNIMOD:4]'
>>> foo = Peptide._sample()
>>> foo.to_proforma()
'[UNIMOD:1]-PEPT[UNIMOD:21]IDEPINK'
>>> elems = [x for x in foo]
>>> foo = Peptide.from_iter(elems, config=Config())
>>> foo.to_proforma()
'[UNIMOD:1]-PEPT[UNIMOD:21]IDEPINK'

`__iter_base() -> list[tuple[str, list[str] | None]]`

`get_mod_isoforms() -> list[Peptide]`

Returns a list of the possible modification isoforms of a peptide.

Examples:

>>> foo = Peptide.from_sequence("AM[UNIMOD:35]AMK")
>>> out = foo.get_mod_isoforms()
>>> sorted([x.to_proforma() for x in out])
['AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK']
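One way to picture isoform enumeration is to keep the number of modifications fixed and move them across every eligible residue. The sketch below does this for a single modification type with `itertools.combinations`. `mod_isoforms` and its arguments are hypothetical, and how ms2ml decides which residues are eligible is an assumption here (simple residue matching).

```python
from itertools import combinations


def mod_isoforms(stripped: str, target: str, mod: str, n_mods: int) -> list:
    """Hypothetical helper: all placements of `n_mods` copies of `mod` on `target` residues."""
    sites = [i for i, aa in enumerate(stripped) if aa == target]
    isoforms = []
    for chosen in combinations(sites, n_mods):
        isoforms.append(
            "".join(aa + (mod if i in chosen else "") for i, aa in enumerate(stripped))
        )
    return isoforms


print(sorted(mod_isoforms("AMAMK", "M", "[UNIMOD:35]", 1)))
```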

`get_variable_possible_mods()`

Returns the peptide forms obtained by applying every combination of the possible variable modifications to each amino acid.

Examples:

>>> foo = Peptide.from_sequence("AMAMK")
>>> out = foo.get_variable_possible_mods()
>>> sorted([x.to_proforma() for x in out])
['AMAMK', 'AMAM[UNIMOD:35]K', 'AM[UNIMOD:35]AMK',
 'AM[UNIMOD:35]AM[UNIMOD:35]K']
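The output above looks like every on/off combination of a variable modification across its eligible sites. A minimal sketch of that expansion, assuming a single modification type and simple residue matching, is shown below. `variable_mod_forms` is a hypothetical helper; in ms2ml the set of variable modifications presumably comes from the config.

```python
from itertools import product


def variable_mod_forms(stripped: str, target: str, mod: str) -> list:
    """Hypothetical helper: every on/off combination of `mod` over the `target` residues."""
    sites = [i for i, aa in enumerate(stripped) if aa == target]
    forms = []
    for flags in product([False, True], repeat=len(sites)):
        modified = {site for site, on in zip(sites, flags) if on}
        forms.append(
            "".join(aa + (mod if i in modified else "") for i, aa in enumerate(stripped))
        )
    return forms


print(sorted(variable_mod_forms("AMAMK", "M", "[UNIMOD:35]")))
```

The number of forms grows as 2 to the power of the number of eligible sites, so peptides with many candidate residues can produce very long lists.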
