pyproteome.analysis package
This module provides functionality for data set analysis.
Functions include volcano plots, sorted tables, and plotting sequence levels.
pyproteome.analysis.correlation module
pyproteome.analysis.correlation.correlate_data_sets(data1, data2, adjust=True, label_cutoff=1.5, show_labels=False, show_title=True, ax=None)
Plot the correlation between peptide levels in two different data sets.
Parameters:
- data1 : pyproteome.data_sets.DataSet
- data2 : pyproteome.data_sets.DataSet
- filename : str, optional
pyproteome.analysis.correlation.correlate_signal(data, signal, corr_cutoff=0.8, scatter_cols=4, options=None, title=None, show_duplicates=False, scatter_colors=None, scatter_symbols=None, show_scatter=True, ax=None, xlabel='')
Calculate the correlation between levels of each peptide in a data set and a given signal variable.
Parameters:
- data : pyproteome.data_sets.DataSet
- signal : pandas.Series
Returns:
- f_corr : matplotlib.figure.Figure
- f_scatter : matplotlib.figure.Figure
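The idea behind corr_cutoff can be sketched in plain Python. This is an illustration only, not pyproteome's implementation: it assumes a Pearson correlation and an absolute-value comparison against the cutoff, and the helper names (pearson, correlated_peptides) and the sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_peptides(levels, signal, corr_cutoff=0.8):
    """Return {peptide: r} for peptides with |r| >= corr_cutoff.

    Assumption: both strong positive and strong negative correlations
    count as hits. The library may apply the cutoff differently.
    """
    hits = {}
    for peptide, values in levels.items():
        r = pearson(values, signal)
        if abs(r) >= corr_cutoff:
            hits[peptide] = r
    return hits


# Hypothetical per-channel levels for three peptides and one signal variable.
levels = {
    "PEPTIDE_A": [1.0, 2.0, 3.0, 4.0],
    "PEPTIDE_B": [4.0, 3.0, 2.0, 1.0],
    "PEPTIDE_C": [1.0, 3.0, 2.0, 2.5],
}
signal = [0.1, 0.2, 0.3, 0.4]

hits = correlated_peptides(levels, signal)
print(sorted(hits))
```

With this sample data, PEPTIDE_A tracks the signal, PEPTIDE_B moves against it, and PEPTIDE_C falls below the cutoff.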
pyproteome.analysis.heatmap module
pyproteome.analysis.heatmap.hierarchical_heatmap(data, cmp_groups=None, minmax=0, zscore=False, show_y=False, title=None, **kwargs)
Plot a hierarchically-clustered heatmap of a data set.
Parameters:
- data : pyproteome.data_sets.DataSet
- cmp_groups : list of list of str
- minmax : float, optional
- zscore : bool, optional
- show_y : bool, optional
- title : str, optional
- kwargs : dict
  Keyword arguments passed directly to seaborn.clustermap().
Returns:
- map : seaborn.ClusterGrid
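For readers unfamiliar with z-scoring before clustering, the following sketch shows the general transformation in plain Python. It assumes each row (one peptide across samples) is standardized independently. Whether the zscore option in hierarchical_heatmap works on rows or columns is not stated here, and the helper name zscore_rows is hypothetical.

```python
import statistics


def zscore_rows(matrix):
    """Standardize each row to mean 0 and (population) standard deviation 1.

    Rows with no variation are mapped to all zeros to avoid dividing by zero.
    """
    scaled = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.pstdev(row)
        scaled.append([(v - mean) / sd if sd else 0.0 for v in row])
    return scaled


# Hypothetical peptide levels: rows are peptides, columns are samples.
matrix = [
    [1.0, 2.0, 3.0],
    [5.0, 5.0, 5.0],
]
scaled = zscore_rows(matrix)
print(scaled)
```

Standardizing rows this way puts peptides with very different absolute levels on a common scale, so that clustering reflects the pattern across samples rather than the magnitude.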
pyproteome.analysis.plot module
Plot calculated levels of a given sequence across channels or groups.
pyproteome.analysis.plot.plot(data, title=None, ax=None, log_2=True, box=True)
Plot the levels of a sequence across multiple channels.
Parameters:
- data : pyproteome.data_sets.DataSet
- title : str, optional
- figsize : tuple of (int, int), optional
Returns:
- figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_all(data, cmp_groups=None)
Runs plot() and plot_group() for all peptides in a data set.
Parameters:
- data : pyproteome.data_sets.DataSet
- cmp_groups : list of tuple, optional
Returns:
- figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_group(data, cmp_groups=None, cmp_groups_star=None, title=None, ax=None, box=True, show_p=True, show_ns=False, log_2=True, offset_frac=20, title_mods=None, size=4, y_max=None, p_ha='center', cmap='cool', linecolor='#000000', swarmcolor='#000000')
Plot the levels of a sequence across each group.
Parameters:
- data : pyproteome.data_sets.DataSet
- cmp_groups : list of tuple, optional
- cmp_groups_star : list of tuple, optional
- title : str, optional
- ax : matplotlib.axes.Axes, optional
- box : bool, optional
- show_p : bool, optional
- show_ns : bool, optional
- log_2 : bool, optional
- offset_frac : float, optional
- title_mods : list of str, optional
- size : float, optional
- y_max : float, optional
- p_ha : str, optional
- cmap : str, optional
Returns:
- figs : list of matplotlib.figure.Figure
pyproteome.analysis.plot.plot_together(data, cmp_groups=None, title=None, ax=None, show_p=True, log_2=True, cmap='cool')
Plot the levels of a sequence across each group in one shared plot.
Parameters:
- data : pyproteome.data_sets.DataSet
- cmp_groups : list of tuple, optional
- title : str, optional
- ax : matplotlib.axes.Axes, optional
- show_p : bool, optional
- log_2 : bool, optional
- cmap : str, optional
Returns:
- figs : list of matplotlib.figure.Figure
pyproteome.analysis.protein module
pyproteome.analysis.protein.draw_protein_seq(ds, genes, max_col=50, p_cutoff=0.01, upper_fc_cutoff=1.05, lower_fc_cutoff=0.95, missed_cleavage=1)
Generate a figure showing all peptides in a data set mapping to the full sequence of their respective proteins.
Peptide differential regulation is indicated by bars for full peptide sequences and circles indicating phosphorylated residues.
Bars and circles are colored red for upregulation and blue for downregulation. Dark grey bars indicate an unmodified peptide with no change. Light grey bars indicate that only the phosphorylated version of that peptide was identified.
Parameters:
- ds : pyproteome.data_sets.data_set.DataSet
- genes : list of str
- max_col : int, optional
- p_cutoff : float, optional
- upper_fc_cutoff : float, optional
- lower_fc_cutoff : float, optional
- missed_cleavage : int, optional
Returns:
- figs : list of matplotlib.figure.Figure
Examples
>>> figs = analysis.protein.draw_protein_seq(
...     ds, ['Mapt']
... )
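The red/blue/grey coloring described for draw_protein_seq implies a three-way classification of each peptide. A minimal sketch of such a rule is shown below, using the default cutoffs from the signature. The function name classify_change and the exact boundary handling (strict versus inclusive comparisons) are assumptions, not taken from the library.

```python
def classify_change(fold_change, p_value, p_cutoff=0.01,
                    upper_fc_cutoff=1.05, lower_fc_cutoff=0.95):
    """Label a peptide as 'up', 'down', or 'unchanged'.

    Assumption: a peptide must pass the p-value cutoff before its
    fold change is considered at all.
    """
    if p_value >= p_cutoff:
        return "unchanged"
    if fold_change >= upper_fc_cutoff:
        return "up"      # drawn in red in the description above
    if fold_change <= lower_fc_cutoff:
        return "down"    # drawn in blue in the description above
    return "unchanged"


# Hypothetical fold changes and p-values.
print(classify_change(1.40, 0.001))  # significant increase
print(classify_change(0.60, 0.001))  # significant decrease
print(classify_change(1.40, 0.200))  # large change, but not significant
print(classify_change(1.00, 0.001))  # significant p-value, negligible change
```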
pyproteome.analysis.tables module
pyproteome.analysis.tables.changes_table(data, sort='p-value')
Show a table of fold changes and p-values for each unique peptide in a data set.
Parameters:
- data : pyproteome.data_sets.DataSet
- sort : str, optional
Returns:
- df : pandas.DataFrame
pyproteome.analysis.tables.motif_table(data, f, p=0.05, sort='p-value', **kwargs)
Run a motif enrichment algorithm on a data set and display the significantly enriched motifs.
Parameters:
- data : pyproteome.data_sets.DataSet
- f : dict or list of dict
- p : float, optional
- sort : str, optional
Returns:
- df : pandas.DataFrame
pyproteome.analysis.tables.ptmsigdb_changes_table(data, sort='p-value', folder_name=None, csv_name=None)
Show a table of fold changes and p-values for PTMSigDB.
Parameters:
- data : pyproteome.data_sets.DataSet
- sort : str, optional
- folder_name : str, optional
- csv_name : str, optional
Returns:
- df : pandas.DataFrame
pyproteome.analysis.tables.write_csv(data, folder_name=None, out_name='DataSet.csv')
Write information for a single data set to a .csv file.
The file is populated with protein, peptide, scan, and quantification values for all peptide-spectrum matches contained within a data set.
Parameters:
- data : pyproteome.data_sets.DataSet
- folder_name : str, optional
- out_name : str, optional
Returns:
- path : str
  Path to .csv file.
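As a rough picture of what a per-PSM export like write_csv produces, the sketch below writes a few rows with Python's standard csv module and returns the output path. It does not use pyproteome. The function name write_rows_csv, the column names, and the sample rows are hypothetical, and falling back to the current directory when folder_name is None is an assumption.

```python
import csv
import os
import tempfile


def write_rows_csv(rows, folder_name=None, out_name="DataSet.csv"):
    """Write a list of dicts to a CSV file and return the file path."""
    folder = folder_name or os.getcwd()  # assumed default location
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, out_name)
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return path


# Hypothetical rows: one per peptide-spectrum match.
rows = [
    {"Protein": "Mapt", "Sequence": "PEPTIDEK", "Scan": 1234, "Fold Change": 1.8},
    {"Protein": "Mapt", "Sequence": "ANOTHERK", "Scan": 1240, "Fold Change": 0.7},
]

with tempfile.TemporaryDirectory() as tmp:
    path = write_rows_csv(rows, folder_name=tmp)
    with open(path, newline="") as handle:
        read_back = list(csv.DictReader(handle))

print(os.path.basename(path), len(read_back))
```

Note that values read back from a CSV file are strings, so numeric columns such as the fold change need converting before further analysis.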
pyproteome.analysis.tables.write_full_tables(datas, save_cols=None, sample_values=True, folder_name=None, out_name='Full Data.xlsx')
Write information for a list of data sets to sheets of a .xlsx file.
Sheets are populated with protein, peptide, scan, and quantification values for all peptide-spectrum matches contained within a data set.
Parameters:
- datas : list of pyproteome.data_sets.DataSet
- save_cols : list of str, optional
  Extra column names to save from each data set.
- sample_values : bool, optional
  Save normalized TMT values for each sample to the output.
- folder_name : str, optional
- out_name : str, optional
Returns:
- path : str
  Path to .xlsx file.
pyproteome.analysis.volcano module
pyproteome.analysis.volcano.plot_volcano(data, group_a=None, group_b=None, p=0.05, fold=1.25, xminmax=None, yminmax=None, title=None, ax=None, show_xlabel=True, show_ylabel=True, log2_fold=True, log10_p=True, bonferoni=False, **kwargs)
Display a volcano plot of data.
This plot includes the fold changes and the p-values associated with those changes.
Parameters:
- data : pyproteome.data_sets.DataSet
- group_a : str or list of str, optional
- group_b : str or list of str, optional
- p : float, optional
- fold : float, optional
- xminmax : tuple of (float, float), optional
- yminmax : tuple of (float, float), optional
- title : str, optional
- ax : matplotlib.axes.Axes
- show_xlabel : bool, optional
- show_ylabel : bool, optional
- log2_fold : bool, optional
- log10_p : bool, optional
- bonferoni : bool, optional
- kwargs : dict
  Arguments passed to plot_volcano_labels().
Returns: - data :
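The coordinates of a volcano plot and the role of the p and fold cutoffs can be sketched without any plotting library. The code below is illustrative and is not pyproteome's implementation. It assumes the fold cutoff is applied symmetrically on the log2 scale (fold and 1/fold) and that the bonferoni option divides the p-value cutoff by the number of tests. The function name volcano_points and the sample rows are hypothetical.

```python
import math


def volcano_points(rows, p=0.05, fold=1.25, bonferoni=False):
    """Compute volcano plot coordinates and a significance flag.

    rows: list of (name, fold_change, p_value) tuples.
    Returns a list of (name, log2_fold, neg_log10_p, significant).
    """
    # Assumed Bonferroni correction: divide the cutoff by the number of tests.
    p_cutoff = p / len(rows) if bonferoni and rows else p
    log2_cutoff = math.log2(fold)
    points = []
    for name, fold_change, p_value in rows:
        x = math.log2(fold_change)    # x-axis: log2 fold change
        y = -math.log10(p_value)      # y-axis: -log10 p-value
        significant = p_value < p_cutoff and abs(x) >= log2_cutoff
        points.append((name, x, y, significant))
    return points


# Hypothetical peptides with fold changes and p-values.
rows = [
    ("A", 2.0, 0.001),
    ("B", 1.1, 0.001),
    ("C", 0.5, 0.04),
    ("D", 1.0, 0.5),
]

flags = {name: sig for name, _, _, sig in volcano_points(rows)}
strict_flags = {
    name: sig for name, _, _, sig in volcano_points(rows, bonferoni=True)
}
print(flags)
print(strict_flags)
```

In this sample, peptide C passes the uncorrected cutoff but drops out once the stricter corrected cutoff is used, which is the usual effect of a multiple-testing correction.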
pyproteome.analysis.volcano.plot_volcano_filtered(data, f, **kwargs)
Display a volcano plot, showing only peptides that are included by a given filter.
Parameters:
- data : pyproteome.data_sets.DataSet
- f : dict or list of dict
  Filters passed to pyproteome.data_sets.DataSet.filter().
- kwargs : dict
  Extra arguments that are passed directly to plot_volcano().
Returns: - data :
pyproteome.analysis.volcano.plot_volcano_labels(data, ax, upper_fold=None, lower_fold=None, p=None, fold_and_p=True, sequence_labels=False, options=None, show_duplicates=False, compress_sym=True, adjust=True, mods=None)
Plot labels on a volcano plot.
Parameters:
- data : pyproteome.data_sets.DataSet
- ax : matplotlib.axes.Axes
- upper_fold : float, optional
- lower_fold : float, optional
- p : float, optional
- fold_and_p : bool, optional
- sequence_labels : bool, optional
- options : dict, optional
- show_duplicates : bool, optional
- compress_sym : bool, optional
- adjust : bool, optional
- mods : str or list of str, optional
Returns:
- labels : pandas.DataFrame