pyproteome.analysis package

This module provides functionality for data set analysis.

Functions include volcano plots, sorted tables, and plots of sequence levels.

pyproteome.analysis.correlation module

pyproteome.analysis.correlation.correlate_data_sets(data1, data2, adjust=True, label_cutoff=1.5, show_labels=False, show_title=True, ax=None)[source]

Plot the correlation between peptide levels in two different data sets.

Parameters:
data1 : pyproteome.data_sets.DataSet
data2 : pyproteome.data_sets.DataSet
filename : str, optional
pyproteome.analysis.correlation.correlate_signal(data, signal, corr_cutoff=0.8, scatter_cols=4, options=None, title=None, show_duplicates=False, scatter_colors=None, scatter_symbols=None, show_scatter=True, ax=None, xlabel='')[source]

Calculate the correlation between levels of each peptide in a data set and a given signal variable.

Parameters:
data : pyproteome.data_sets.DataSet
signal : pandas.Series
Returns:
f_corr : matplotlib.figure.Figure
f_scatter : matplotlib.figure.Figure
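
The role of corr_cutoff can be illustrated with a minimal, standard-library sketch. It is not pyproteome's implementation: the use of the Pearson coefficient, the comparison on its absolute value, and the names pearson, correlated_peptides, and the sample peptide keys are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_peptides(levels, signal, corr_cutoff=0.8):
    """Keep peptides whose |r| against the signal meets the cutoff.

    Assumption: the cutoff is applied to the absolute value of r, so strong
    negative correlations are kept as well.
    """
    hits = {}
    for peptide, values in levels.items():
        r = pearson(values, signal)
        if abs(r) >= corr_cutoff:
            hits[peptide] = r
    return hits


# Hypothetical per-sample levels for three peptides and one signal variable.
signal = [1.0, 2.0, 3.0, 4.0]
levels = {
    "PEPTIDE_A": [2.0, 4.0, 6.0, 8.0],  # rises with the signal
    "PEPTIDE_B": [8.0, 6.0, 4.0, 2.0],  # falls as the signal rises
    "PEPTIDE_C": [1.0, 3.0, 1.0, 3.0],  # only weakly related to the signal
}
hits = correlated_peptides(levels, signal)
```

With these made-up values, only the first two peptides pass the default cutoff of 0.8.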

pyproteome.analysis.heatmap module

pyproteome.analysis.heatmap.hierarchical_heatmap(data, cmp_groups=None, minmax=0, zscore=False, show_y=False, title=None, **kwargs)[source]

Plot a hierarchically clustered heatmap of a data set.

Parameters:
data : pyproteome.data_sets.DataSet
cmp_groups : list of list of str
minmax : float, optional
zscore : bool, optional
show_y : bool, optional
title : str, optional
kwargs : dict

Keyword arguments passed directly to seaborn.clustermap().

Returns:
map : seaborn.ClusterGrid
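
As a rough illustration of what a zscore option usually means for a heatmap, the sketch below standardizes each row before plotting. This is an assumption about the general technique, not a description of pyproteome's code: the helper name zscore_rows, the row-wise direction, and the use of the population standard deviation are all hypothetical choices.

```python
import statistics


def zscore_rows(rows):
    """Standardize each row to mean 0 and unit (population) standard deviation."""
    scaled = []
    for row in rows:
        mean = statistics.fmean(row)
        std = statistics.pstdev(row)
        # A constant row has no spread; map it to zeros instead of dividing by 0.
        scaled.append([(v - mean) / std if std else 0.0 for v in row])
    return scaled


# Hypothetical peptide levels: one varying row and one constant row.
rows = [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]
z = zscore_rows(rows)
```

Standardizing rows puts peptides with very different absolute levels on a common color scale, which is typically why a heatmap offers such an option.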

pyproteome.analysis.plot module

Plot calculated levels of a given sequence across channels or groups.

pyproteome.analysis.plot.plot(data, title=None, ax=None, log_2=True, box=True)[source]

Plot the levels of a sequence across multiple channels.

Parameters:
data : pyproteome.data_sets.DataSet
title : str, optional
figsize : tuple of (int, int), optional
Returns:
figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_all(data, cmp_groups=None)[source]

Run plot() and plot_group() for every peptide in a data set.

Parameters:
data : pyproteome.data_sets.DataSet
cmp_groups : list of tuple, optional
Returns:
figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_group(data, cmp_groups=None, cmp_groups_star=None, title=None, ax=None, box=True, show_p=True, show_ns=False, log_2=True, offset_frac=20, title_mods=None, size=4, y_max=None, p_ha='center', cmap='cool', linecolor='#000000', swarmcolor='#000000')[source]

Plot the levels of a sequence across each group.

Parameters:
data : pyproteome.data_sets.DataSet
cmp_groups : list of tuple, optional
cmp_groups_star : list of tuple, optional
title : str, optional
ax : matplotlib.axes.Axes, optional
box : bool, optional
show_p : bool, optional
show_ns : bool, optional
log_2 : bool, optional
offset_frac : float, optional
title_mods : list of str, optional
size : float, optional
y_max : float, optional
p_ha : str, optional
cmap : str, optional
Returns:
figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_together(data, cmp_groups=None, title=None, ax=None, show_p=True, log_2=True, cmap='cool')[source]

Plot the levels of a sequence across each group in one shared plot.

Parameters:
data : pyproteome.data_sets.DataSet
cmp_groups : list of tuple, optional
title : str, optional
ax : matplotlib.axes.Axes, optional
show_p : bool, optional
log_2 : bool, optional
cmap : str, optional
Returns:
figs : list of matplotlib.figure.Figure

pyproteome.analysis.protein module

pyproteome.analysis.protein.draw_protein_seq(ds, genes, max_col=50, p_cutoff=0.01, upper_fc_cutoff=1.05, lower_fc_cutoff=0.95, missed_cleavage=1)[source]

Generate a figure showing all peptides in a data set mapping to the full sequence of their respective proteins.

Differential regulation is indicated by bars for full peptide sequences and by circles for phosphorylated residues.

Bars and circles are colored red for upregulation and blue for downregulation. Dark grey bars indicate an unmodified peptide with no change. Light grey bars indicate that only the phosphorylated version of that peptide was identified.

Parameters:
ds : pyproteome.data_sets.data_set.DataSet
genes : list of str
max_col : int, optional
p_cutoff : float, optional
upper_fc_cutoff : float, optional
lower_fc_cutoff : float, optional
missed_cleavage : int, optional
Returns:
figs : list of matplotlib.figure.Figure

Examples

>>> figs = analysis.protein.draw_protein_seq(
...     ds, ['Mapt']
... )    
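
The interplay of p_cutoff, upper_fc_cutoff, and lower_fc_cutoff can be sketched as a small decision rule. This is an illustrative guess at the logic, using the default values from the signature above; the helper name classify_peptide, the returned labels, and the exact comparison operators are assumptions rather than pyproteome's actual code.

```python
def classify_peptide(fold_change, p_value, p_cutoff=0.01,
                     upper_fc_cutoff=1.05, lower_fc_cutoff=0.95):
    """Return 'up', 'down', or 'unchanged' for one peptide.

    Assumption: a peptide counts as regulated only when its p-value is below
    p_cutoff AND its fold change lies outside the two fold-change cutoffs.
    """
    if p_value < p_cutoff:
        if fold_change >= upper_fc_cutoff:
            return "up"    # would be drawn in red
        if fold_change <= lower_fc_cutoff:
            return "down"  # would be drawn in blue
    return "unchanged"     # would be drawn in grey


# Hypothetical (fold change, p-value) pairs.
examples = {
    "strong increase": classify_peptide(1.5, 0.001),
    "strong decrease": classify_peptide(0.5, 0.001),
    "not significant": classify_peptide(1.5, 0.2),
    "significant but flat": classify_peptide(1.0, 0.001),
}
```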

pyproteome.analysis.tables module

pyproteome.analysis.tables.changes_table(data, sort='p-value')[source]

Show a table of fold changes and p-values for each unique peptide in a data set.

Parameters:
data : pyproteome.data_sets.DataSet
sort : str, optional
Returns:
df : pandas.DataFrame
pyproteome.analysis.tables.motif_table(data, f, p=0.05, sort='p-value', **kwargs)[source]

Run a motif enrichment algorithm on a data set and display the significantly enriched motifs.

Parameters:
data : pyproteome.data_sets.DataSet
f : dict or list of dict
p : float, optional
sort : str, optional
Returns:
df : pandas.DataFrame
pyproteome.analysis.tables.ptmsigdb_changes_table(data, sort='p-value', folder_name=None, csv_name=None)[source]

Show a table of fold changes and p-values for PTMSigDB.

Parameters:
data : pyproteome.data_sets.DataSet
sort : str, optional
folder_name : str, optional
csv_name : str, optional
Returns:
df : pandas.DataFrame
pyproteome.analysis.tables.write_csv(data, folder_name=None, out_name='DataSet.csv')[source]

Write information for a single data set to a .csv file.

The file is populated with protein, peptide, scan, and quantification values for all peptide-spectrum matches contained within a data set.

Parameters:
data : pyproteome.data_sets.DataSet
folder_name : str, optional
out_name : str, optional
Returns:
path : str

Path to .csv file.

pyproteome.analysis.tables.write_full_tables(datas, save_cols=None, sample_values=True, folder_name=None, out_name='Full Data.xlsx')[source]

Write information for a list of data sets to sheets of a .xlsx file.

Sheets are populated with protein, peptide, scan, and quantification values for all peptide-spectrum matches contained within a data set.

Parameters:
datas : list of pyproteome.data_sets.DataSet
save_cols : list of str, optional

Extra column names to save from each data set.

sample_values : bool, optional

Save normalized TMT values for each sample to the output.

folder_name : str, optional
out_name : str, optional
Returns:
path : str

Path to .xlsx file.

pyproteome.analysis.volcano module

pyproteome.analysis.volcano.plot_volcano(data, group_a=None, group_b=None, p=0.05, fold=1.25, xminmax=None, yminmax=None, title=None, ax=None, show_xlabel=True, show_ylabel=True, log2_fold=True, log10_p=True, bonferoni=False, **kwargs)[source]

Display a volcano plot of data.

The plot shows the fold change of each peptide and the p-value associated with that change.

Parameters:
data : pyproteome.data_sets.DataSet
group_a : str or list of str, optional
group_b : str or list of str, optional
p : float, optional
fold : float, optional
xminmax : tuple of (float, float), optional
yminmax : tuple of (float, float), optional
title : str, optional
ax : matplotlib.axes.Axes, optional
show_xlabel : bool, optional
show_ylabel : bool, optional
log2_fold : bool, optional
log10_p : bool, optional
bonferoni : bool, optional
kwargs : dict

Arguments passed to plot_volcano_labels().

Returns:
f : matplotlib.figure.Figure
ax : matplotlib.axes.Axes
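
The p, fold, log2_fold, log10_p, and bonferoni parameters map onto the standard volcano-plot transforms, sketched below with the standard library only. This is not pyproteome's implementation: the helper names volcano_point and is_significant, the n_tests argument, and the symmetric treatment of the fold cutoff (fold and 1 / fold) are assumptions.

```python
import math


def volcano_point(fold_change, p_value):
    """Map a (fold change, p-value) pair to (log2 fold change, -log10 p)."""
    return math.log2(fold_change), -math.log10(p_value)


def is_significant(fold_change, p_value, p=0.05, fold=1.25,
                   bonferroni=False, n_tests=1):
    """Return True if a point passes both the fold-change and p-value cutoffs."""
    # Bonferroni correction: divide the p-value threshold by the number of tests.
    p_threshold = p / n_tests if bonferroni else p
    # Assumed symmetric cutoff: changed if >= fold or <= 1 / fold.
    changed = fold_change >= fold or fold_change <= 1 / fold
    return changed and p_value <= p_threshold


# A hypothetical peptide that doubled with p = 0.01.
x, y = volcano_point(2.0, 0.01)
passes_plain = is_significant(2.0, 0.01)
passes_corrected = is_significant(2.0, 0.01, bonferroni=True, n_tests=100)
```

In this sketch the same peptide passes the uncorrected threshold but fails once the threshold is divided across 100 tests, which is the trade-off a Bonferroni option introduces.
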
pyproteome.analysis.volcano.plot_volcano_filtered(data, f, **kwargs)[source]

Display a volcano plot, showing only peptides that are included by a given filter.

Parameters:
data : pyproteome.data_sets.DataSet
f : dict or list of dict

Filters passed to pyproteome.data_sets.DataSet.filter().

kwargs : dict

Extra arguments that are passed directly to plot_volcano().

Returns:
f : matplotlib.figure.Figure
ax : matplotlib.axes.Axes
pyproteome.analysis.volcano.plot_volcano_labels(data, ax, upper_fold=None, lower_fold=None, p=None, fold_and_p=True, sequence_labels=False, options=None, show_duplicates=False, compress_sym=True, adjust=True, mods=None)[source]

Plot labels on a volcano plot.

Parameters:
data : pyproteome.data_sets.DataSet
ax : matplotlib.axes.Axes
upper_fold : float, optional
lower_fold : float, optional
p : float, optional
fold_and_p : bool, optional
sequence_labels : bool, optional
options : dict, optional
show_duplicates : bool, optional
compress_sym : bool, optional
adjust : bool, optional
mods : str or list of str, optional
Returns:
labels : pandas.DataFrame
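
One plausible reading of fold_and_p is that it switches between requiring both cutoffs and requiring either one before a point is labeled. The sketch below illustrates that reading only; the helper name should_label and the AND/OR semantics are assumptions that have not been checked against pyproteome's source.

```python
def should_label(fold_change, p_value, upper_fold, lower_fold, p,
                 fold_and_p=True):
    """Decide whether a volcano-plot point gets a text label.

    Assumption: fold_and_p=True requires both cutoffs to pass, while
    fold_and_p=False labels a point that passes either one.
    """
    passes_fold = fold_change >= upper_fold or fold_change <= lower_fold
    passes_p = p_value <= p
    if fold_and_p:
        return passes_fold and passes_p
    return passes_fold or passes_p


# Hypothetical point: large fold change, unconvincing p-value.
strict = should_label(2.0, 0.2, upper_fold=1.25, lower_fold=0.8, p=0.05)
loose = should_label(2.0, 0.2, upper_fold=1.25, lower_fold=0.8, p=0.05,
                     fold_and_p=False)
```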