pyproteome package¶

Module contents¶

pyproteome.import_all(line=None)[source]¶

Initialize and import many packages using IPython Notebook magic.

Imports the numpy, pandas, seaborn, sklearn, and pyproteome packages. Sets visual display options for matplotlib and adds a logging handler. Also enables auto-reload of pyproteome for developers.

Examples

>>> from pyproteome import *
>>> %import_all

Subpackages¶

Submodules¶

pyproteome.levels module¶

This module provides functionality for normalizing protein data.

Levels can be extracted from supernatant or phosphotyrosine runs using median or mean peptide levels across multiple channels.

pyproteome.levels.get_channel_levels(data, norm_channels=None, method='median', cols=2)[source]¶

Calculate channel normalization levels. Each level is the location of the peak of a Gaussian KDE distribution fitted to that channel's ratio values.

Parameters:	data : `pyproteome.data_sets.DataSet`
	norm_channels : list of str, optional
		Sample names of channels to use for normalization.
	method : str, optional
		Normalize to the ‘mean’ or ‘median’ of each row.
	cols : int, optional
		Number of columns used when displaying KDE distributions.
Returns:	fig : `matplotlib.figure.Figure`
	channel_levels : dict of str, float

pyproteome.levels.kde_max(points)[source]¶

Estimate the center of a quantification channel by fitting a Gaussian KDE function and finding its maximum.

Parameters:	points : list of float
Returns:	float
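As a rough illustration of the idea, the following stdlib-only sketch evaluates a Gaussian KDE on a grid and returns the location of its maximum. The name `kde_peak`, the Silverman bandwidth rule, and the grid size are assumptions made for this example; the actual pyproteome implementation may differ.

```python
import math
import statistics


def kde_peak(points, grid_size=512):
    """Return the x value at which a Gaussian KDE of `points` is highest."""
    points = [float(p) for p in points]
    if len(points) < 2:
        raise ValueError("need at least two points")

    # Silverman's rule of thumb for the bandwidth (an assumption of this sketch).
    sigma = statistics.stdev(points)
    if sigma == 0:
        return points[0]  # all points identical: the peak is that value
    bandwidth = 1.06 * sigma * len(points) ** (-1 / 5)

    def density(x):
        # Unnormalized sum of Gaussian kernels centered on each point.
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)

    # Evaluate the density on an evenly spaced grid spanning the data.
    lo, hi = min(points), max(points)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return max(grid, key=density)


ratios = [0.9, 1.0, 1.0, 1.1, 1.05, 0.95, 3.0]  # one outlier at 3.0
peak = kde_peak(ratios)
```

Because the peak follows the densest cluster of values, a single outlier shifts it far less than it shifts the mean.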

pyproteome.loading module¶

This module provides functionality for loading data sets.

Functionality includes loading CAMV and Proteome Discoverer data sets.

pyproteome.loading.load_psms(basename, pick_best_psm=True)[source]¶

Load a list of peptide-spectrum matches (PSMs) from a .msf file produced by Proteome Discoverer.

Parameters:	basename : str
		Base name of the data set (e.g. ‘CK-H1-pY’ for ‘CK-H1-pY.msf’).
	pick_best_psm : bool, optional
		Select the best scoring PSM for a given scan, otherwise load all PSMs.
Returns:	psms : `pandas.DataFrame`
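The effect of `pick_best_psm` can be pictured with a small, self-contained sketch. The record layout (`scan`, `sequence`, `score` keys) and the assumption that a higher score is better are hypothetical and chosen only for illustration; they are not taken from pyproteome or Proteome Discoverer.

```python
def pick_best_psms(psms):
    """Keep only the highest-scoring PSM for each scan number."""
    best = {}
    for psm in psms:
        current = best.get(psm["scan"])
        # Assumes a higher score means a better match.
        if current is None or psm["score"] > current["score"]:
            best[psm["scan"]] = psm
    return list(best.values())


psms = [
    {"scan": 101, "sequence": "PEPTIDEK", "score": 42.0},
    {"scan": 101, "sequence": "PEPTLDEK", "score": 17.5},
    {"scan": 102, "sequence": "SAMPLER", "score": 30.1},
]
best = pick_best_psms(psms)
```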

pyproteome.paths module¶

This module tracks the path to user data files. Developers can override paths here when using a custom data hierarchy.

pyproteome.paths.BASE_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/pyproteome/checkouts/latest/docs'¶: Location of the base directory containing proteomics data. By default this is set to the current or parent directory, whichever contains any folders matching the expected directory structure.

pyproteome.paths.CAMV_NAME = 'CAMV Output'¶: Name of the directory containing validated CAMV data.

pyproteome.paths.CAMV_OUT_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/pyproteome/checkouts/latest/docs/CAMV Output'¶: Location of the directory containing validated CAMV data. By default it is set to CAMV_NAME in the current or parent directory.

pyproteome.paths.FIGURES_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/pyproteome/checkouts/latest/docs/Figures'¶: Location of the directory for saving output figures. By default it is set to FIGURES_NAME in the current or parent directory.

pyproteome.paths.FIGURES_NAME = 'Figures'¶: Name of the directory for saving output figures.

pyproteome.paths.MS_RAW_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/pyproteome/checkouts/latest/docs/MS RAW'¶: Location of the directory containing raw mass spectrometry files. By default it is set to MS_RAW_NAME in the current or parent directory.

pyproteome.paths.MS_RAW_NAME = 'MS RAW'¶: Name of the directory containing raw mass spectrometry files.

pyproteome.paths.MS_SEARCHED_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/pyproteome/checkouts/latest/docs/Searched'¶: Location of the directory containing Proteome Discoverer .msf search files. By default it is set to MS_SEARCHED_NAME in the current or parent directory.

pyproteome.paths.MS_SEARCHED_NAME = 'Searched'¶: Name of the directory containing Proteome Discoverer .msf search files.

pyproteome.paths.find_base_dir()[source]¶

Finds the base directory containing the search / raw / scripts / figures folders. May be the current working directory or a parent of it.

Returns:	path : str
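The lookup described here can be sketched as follows. This is not the library's code: the helper name `find_base_dir_sketch`, the `Scripts` folder name, and the fallback to the starting directory are assumptions for the example.

```python
import os
import tempfile

# Folder names taken from the *_NAME constants documented above.
EXPECTED_DIRS = ("CAMV Output", "Figures", "MS RAW", "Searched")


def find_base_dir_sketch(start):
    """Return `start` or its parent, whichever contains an expected folder."""
    start = os.path.abspath(start)
    for candidate in (start, os.path.dirname(start)):
        if any(os.path.isdir(os.path.join(candidate, d)) for d in EXPECTED_DIRS):
            return candidate
    return start  # fall back to the starting directory (an assumption)


with tempfile.TemporaryDirectory() as root:
    root = os.path.realpath(root)
    os.makedirs(os.path.join(root, "Searched"))
    scripts = os.path.join(root, "Scripts")  # hypothetical scripts folder
    os.makedirs(scripts)
    found_from_child = find_base_dir_sketch(scripts)
    found_from_root = find_base_dir_sketch(root)
    matches = (found_from_child == root, found_from_root == root)
```

Starting from either the data root or a scripts folder inside it resolves to the same base directory.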

pyproteome.paths.set_base_dir(path)[source]¶

Set the base directory containing the search / raw / figures folders.

Parameters:	path : str

pyproteome.species module¶

This module includes functions for mapping species names.

pyproteome.species.INV_ORGANISM_MAPPING = {'cow': 'Bos taurus', 'dog': 'Canis familiaris', 'ferret': 'Mustela putorius', 'fruit fly': 'Drosophila melanogaster', 'horse': 'Equus caballus', 'human': 'Homo sapiens', 'mouse': 'Mus musculus', 'rat': 'Rattus norvegicus'}¶: Mapping from a species’ colloquial name to its scientific name (e.g. ‘human’ > ‘Homo sapiens’).

pyproteome.species.ORGANISM_MAPPING = {'Bos taurus': 'cow', 'Canis familiaris': 'dog', 'Drosophila melanogaster': 'fruit fly', 'Equus caballus': 'horse', 'Homo sapiens': 'human', 'Mus musculus': 'mouse', 'Mustela putorius': 'ferret', 'Rattus norvegicus': 'rat'}¶: Mapping from a species’ scientific name to its colloquial name (e.g. ‘Homo sapiens’ > ‘human’).
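The two dictionaries are inverses of each other. A minimal sketch of that relationship, using a three-entry subset of the values listed above (how the library actually builds the inverse is not documented here):

```python
ORGANISM_MAPPING = {
    "Homo sapiens": "human",
    "Mus musculus": "mouse",
    "Rattus norvegicus": "rat",
}

# Swap keys and values to obtain the reverse lookup.
INV_ORGANISM_MAPPING = {
    common: latin for latin, common in ORGANISM_MAPPING.items()
}

common_name = ORGANISM_MAPPING["Mus musculus"]
latin_name = INV_ORGANISM_MAPPING["human"]
```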

pyproteome.utils module¶

Utility functions used in other modules.

pyproteome.utils.DEFAULT_DPI = 300¶: The DPI to use when generating all image figures.

class pyproteome.utils.DefaultOrderedDict(default_factory=None, *a, **kw)[source]¶

Bases: collections.OrderedDict

copy() → a shallow copy of od[source]¶
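The signature suggests a combination of `collections.OrderedDict` and `collections.defaultdict`. The sketch below shows one common way to build such a class; the name `DefaultOrderedDictSketch` and all details are assumptions, not the pyproteome source.

```python
from collections import OrderedDict


class DefaultOrderedDictSketch(OrderedDict):
    """An OrderedDict that creates missing values with `default_factory`."""

    def __init__(self, default_factory=None, *args, **kwargs):
        if default_factory is not None and not callable(default_factory):
            raise TypeError("default_factory must be callable or None")
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by dict.__getitem__ when `key` is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self[key] = self.default_factory()
        return value

    def copy(self):
        # Shallow copy that keeps the same default factory.
        return type(self)(self.default_factory, self)


groups = DefaultOrderedDictSketch(list)
groups["b"].append(1)
groups["a"].append(2)
groups["b"].append(3)
clone = groups.copy()
```

Keys keep their insertion order (`b` before `a`), and missing keys are created on first access.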

pyproteome.utils.PICKLE_DIR = '.pyproteome'¶: Default directory to use for saving / loading pickle files.

pyproteome.utils.adjust_text(*args, **kwargs)[source]¶: Wraps importing and calling adjustText.adjust_text().

pyproteome.utils.flatten_list(lst)[source]¶

Flattens an Iterable with arbitrary nesting into a single list.

Parameters:	lst : Iterable
Returns:	flattened : list

Examples

>>> utils.flatten_list([0, [1, 2], [[3]], 'string'])
[0, 1, 2, 3, 'string']
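A recursive sketch that reproduces the example above. The name `flatten_list_sketch` is hypothetical, and treating strings as atomic values is inferred from the documented output rather than from the source code.

```python
from collections.abc import Iterable


def flatten_list_sketch(lst):
    """Flatten arbitrarily nested iterables into one list."""
    flattened = []
    for item in lst:
        # Strings are iterable, but are treated as atoms here.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            flattened.extend(flatten_list_sketch(item))
        else:
            flattened.append(item)
    return flattened


result = flatten_list_sketch([0, [1, 2], [[3]], "string"])
as_set = set(result)  # the same traversal can feed a set instead of a list
```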

pyproteome.utils.flatten_set(lst)[source]¶

Flattens an Iterable with arbitrary nesting into a single set.

Parameters:	lst : Iterable
Returns:	flattened : set

Examples

>>> utils.flatten_set([0, [1, 2], [[3]], 'string'])
set([0, 1, 2, 3, 'string'])

pyproteome.utils.fuzzy_find(needle, haystack)[source]¶

Find the longest matching subsequence of needle within haystack.

Returns the corresponding index from the beginning of needle.

Parameters:	needle : str
	haystack : str
Returns:	index : int
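The description leaves some room for interpretation. One possible reading, sketched with the standard library's `difflib`, finds the longest block shared by both strings and reports the haystack offset at which the start of needle would align. The name `fuzzy_find_sketch` and the `-1` return value for "no match" are assumptions of this example.

```python
from difflib import SequenceMatcher


def fuzzy_find_sketch(needle, haystack):
    """Locate where `needle` best aligns within `haystack`."""
    matcher = SequenceMatcher(None, needle, haystack, autojunk=False)
    match = matcher.find_longest_match(0, len(needle), 0, len(haystack))
    if match.size == 0:
        return -1  # no characters in common (convention of this sketch)
    # Shift the haystack offset back so it refers to the start of needle.
    return match.b - match.a


exact = fuzzy_find_sketch("CDE", "ABCDEFG")
partial = fuzzy_find_sketch("XCDEF", "ABCDEFGH")
```

In the second call the leading `X` does not match, so the reported index is one position before the shared block `CDEF`.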

pyproteome.utils.get_name(proteins)[source]¶

Generates a shortened version of a protein name. For peptides that map to multiple proteins, this function finds the longest common prefix (excluding digits) that matches all proteins.

Parameters:	proteins : `data_sets.protein.Proteins`
Returns:	str

Examples

>>> pyp.utils.get_name(
...     protein.Proteins([
...         protein.Protein(gene='Dpysl2'),
...         protein.Protein(gene='Dpysl3'),
...     ])
... )
'Dpysl2/3'
>>> pyp.utils.get_name(
...     protein.Proteins([
...         protein.Protein(gene='Src'),
...         protein.Protein(gene='Fgr'),
...         protein.Protein(gene='Fyn'),
...     ])
... )
'Src / Fgr / Fyn'
>>> pyp.utils.get_name(
...     protein.Proteins([
...         protein.Protein(gene='Tuba1a'),
...         protein.Protein(gene='Tuba1b'),
...         protein.Protein(gene='Tuba1c'),
...         protein.Protein(gene='Tuba3a'),
...         protein.Protein(gene='Tuba4a'),
...         protein.Protein(gene='Tuba8'),
...     ])
... )
'Tuba1a/1b/1c/3a/4a/8'
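The prefix logic can be approximated on plain gene-name strings. This sketch does not use the `Proteins` class; the helper name `short_name_sketch` and its handling of edge cases are assumptions.

```python
import os


def short_name_sketch(genes):
    """Collapse gene names that share a non-numeric prefix."""
    genes = list(dict.fromkeys(genes))  # drop duplicates, keep order
    # Longest common prefix, with trailing digits excluded from it.
    prefix = os.path.commonprefix(genes).rstrip("0123456789")
    if len(genes) < 2 or not prefix:
        return " / ".join(genes)
    return prefix + "/".join(gene[len(prefix):] for gene in genes)


shared = short_name_sketch(["Dpysl2", "Dpysl3"])
unrelated = short_name_sketch(["Src", "Fgr", "Fyn"])
```

Names with a shared prefix are merged behind it, while unrelated names are simply joined with ` / `.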

pyproteome.utils.load(name, default=None)[source]¶

Load a variable using the pickle module.

Parameters:	name : str
		The name to use for data storage.
	default : object, optional
Returns:	val : object

pyproteome.utils.make_folder(data=None, folder_name=None, sub='Output')[source]¶

pyproteome.utils.makedirs(folder_name=None)[source]¶

Creates a folder if it does not exist.

Parameters:	folder_name : str, optional
Returns:	folder_name : str

pyproteome.utils.memoize(func)[source]¶

Memoize a function, saving its returned value for a given set of parameters in an in-memory cache.

Parameters:	func : func
Returns:	memoized : func

Examples

>>> from pyproteome import utils
>>> @utils.memoize
... def download_data(species):
...    ...  # Fetch / calculate the return value once
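For readers unfamiliar with the pattern, a decorator of this kind can be sketched in a few lines. The name `memoize_sketch` is hypothetical and the cache-key scheme is an assumption; the standard library's `functools.lru_cache` offers similar behavior.

```python
import functools


def memoize_sketch(func):
    """Cache results of `func` keyed by its (hashable) arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


calls = []


@memoize_sketch
def slow_square(x):
    calls.append(x)  # record each real invocation
    return x * x


results = [slow_square(4), slow_square(4), slow_square(5)]
```

The second call with the argument `4` is served from the cache, so the wrapped function body runs only twice.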

pyproteome.utils.norm(channels)[source]¶

Converts a list of channels to their normalized names.

Parameters:	channels : list of str or dict of (str, str) or None
Returns:	new_channels : list of str or dict of (str, str)

pyproteome.utils.save(name, val=None)[source]¶

Save a variable using the pickle module.

Parameters:	name : str
		The name to use for data storage.
	val : object, optional
Returns:	val : object
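A save / load round trip built on the pickle module might look like the sketch below. The `.pkl` file extension, the `pickle_dir` parameter, and the behavior on a missing file are assumptions; only the default directory name comes from `PICKLE_DIR` above.

```python
import os
import pickle
import tempfile


def save_sketch(name, val, pickle_dir=".pyproteome"):
    """Pickle `val` to <pickle_dir>/<name>.pkl and return it."""
    os.makedirs(pickle_dir, exist_ok=True)
    with open(os.path.join(pickle_dir, name + ".pkl"), "wb") as f:
        pickle.dump(val, f)
    return val


def load_sketch(name, default=None, pickle_dir=".pyproteome"):
    """Unpickle <pickle_dir>/<name>.pkl, or return `default` if it is missing."""
    try:
        with open(os.path.join(pickle_dir, name + ".pkl"), "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, ".pyproteome")
    save_sketch("levels", {"126": 1.0, "127": 0.93}, pickle_dir=store)
    restored = load_sketch("levels", pickle_dir=store)
    missing = load_sketch("absent", default={}, pickle_dir=store)
```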

pyproteome.utils.stars(p, ns='ns')[source]¶

Return the star annotation that indicates the significance level of a p-value.

**** : p < 1e-4

*** : p < 1e-3

** : p < 1e-2

* : p < 5e-2

ns : not significant

Parameters:	p : float
	ns : str, optional
Returns:	str
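The thresholds listed above translate directly into a lookup. This is a sketch under the stated cutoffs, with the hypothetical name `stars_sketch`, not a copy of the library function.

```python
def stars_sketch(p, ns="ns"):
    """Map a p-value to a significance marker."""
    # Thresholds are checked from most to least stringent.
    thresholds = ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*"))
    for threshold, label in thresholds:
        if p < threshold:
            return label
    return ns


labels = [stars_sketch(p) for p in (5e-5, 5e-4, 5e-3, 0.03, 0.2)]
```

Passing `ns=''` would leave non-significant comparisons unlabeled.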

pyproteome.utils.which(program)[source]¶

Checks if a program exists in PATH’s list of directories.

Parameters:	program : str
Returns:	path : str or None
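The standard library provides equivalent behavior through `shutil.which`. The wrapper below, with the hypothetical name `which_sketch`, illustrates the documented contract and is not necessarily how pyproteome implements it.

```python
import shutil


def which_sketch(program):
    """Return the full path to `program` if it is on PATH, else None."""
    return shutil.which(program)


missing = which_sketch("surely-not-a-real-program-name-12345")
```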

pyproteome.version module¶

pyproteome.version.version = '0.12.0'¶: The version of pyproteome that is installed.
