pyproteome.cluster package

pyproteome.cluster.auto module

pyproteome.cluster.auto.auto_clusterer(data, get_data_kwargs=None, cluster_kwargs=None, plot_clusters_kwargs=None, volcano_kwargs=None, plots=False, close=False)

Cluster and generate plots for a data set.

Parameters:
data : pyproteome.data_sets.DataSet
get_data_kwargs : dict, optional

Arguments passed to clusterer.get_data().

cluster_kwargs : dict, optional

Arguments passed to clusterer.cluster().

plot_clusters_kwargs : dict, optional

Arguments passed to plot.plot_all_clusters().

volcano_kwargs : dict, optional

plots : bool, optional

Generate plots for each cluster.

close : bool, optional

Automatically close all figures.

Returns:
data : dict

Dictionary containing the data set and exact matrix used for clustering, as well as accessory objects.

y_pred : numpy.array

Cluster ID assigned to each peptide.

clr : sklearn.base.ClusterMixin

The scikit-learn clusterer object.

Examples

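>>> import sklearn.cluster
>>> from pyproteome import cluster
>>> # ds is a previously loaded pyproteome.data_sets.DataSet
>>> n = 20  # number of clusters; example value
>>> data, y_pred, clr = cluster.auto.auto_clusterer(
...     ds,
...     get_data_kwargs={
...         'dropna': True,
...         'corrcoef': False,
...     },
...     cluster_kwargs={
...         'clr': sklearn.cluster.MiniBatchKMeans(
...             n_clusters=n,
...             random_state=0,
...         ),
...     },
...     plots=False,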
... )
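
The returned y_pred holds one cluster ID per peptide, so a common next step is to collect the peptides that share a cluster. The following is a minimal sketch of that grouping. It uses a plain Python list as a stand-in for y_pred, and the helper name group_by_cluster and the example values are hypothetical, not part of pyproteome.

```python
from collections import defaultdict


def group_by_cluster(y_pred):
    """Map each cluster ID to the row positions assigned to it."""
    groups = defaultdict(list)
    for position, cluster_id in enumerate(y_pred):
        groups[int(cluster_id)].append(position)
    return dict(groups)


# Hypothetical cluster IDs for six peptides (illustrative values only).
example_y_pred = [0, 2, 1, 0, 2, 2]

groups = group_by_cluster(example_y_pred)

# Number of peptides in each cluster.
sizes = {cluster_id: len(rows) for cluster_id, rows in groups.items()}
```

The same loop should work on any iterable of integer-like IDs, which is why the sketch converts each ID with int() before using it as a key.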

pyproteome.cluster.clusterer module

pyproteome.cluster.clusterer.cluster(data, z=False, log2=True, clr=None, n_clusters=20)

Cluster a data set using scikit-learn.

Parameters:
data : dict

Object returned from get_data().

z : bool, optional
log2 : bool, optional
clr : object, optional

Clusterer object. Defaults to sklearn.cluster.MiniBatchKMeans.

n_clusters : int, optional

Returns:
clr : sklearn.base.ClusterMixin
y_pred : pandas.Series of int
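
The z and log2 options are not described in this reference. Purely as an illustration, and assuming they correspond to a log2 transform and a z-score standardization of each row before clustering (an assumption, not confirmed by the source), the two operations could be sketched as follows. The function name log2_then_zscore is hypothetical.

```python
import math
import statistics


def log2_then_zscore(row):
    """Log2-transform a row of positive values, then z-score it.

    Assumption: this mirrors what the `log2` and `z` flags might do;
    it is not taken from pyproteome's implementation.
    """
    logged = [math.log2(value) for value in row]
    mean = statistics.fmean(logged)
    stdev = statistics.pstdev(logged)  # population standard deviation
    if stdev == 0:
        # A constant row has no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in logged]
    return [(value - mean) / stdev for value in logged]


# Fold changes of 0.5x, 1x, and 2x are symmetric around zero in log2 space.
row = [0.5, 1.0, 2.0]
scaled = log2_then_zscore(row)
```

After scaling, each row has mean 0 and unit spread, which keeps high-variance rows from dominating distance-based clusterers such as k-means.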

pyproteome.cluster.clusterer.get_data(ds, dropna=True, corrcoef=True, groups=None)

Extract the exact data matrix that will be used for clustering.

Parameters:
ds : pyproteome.data_sets.DataSet
dropna : bool, optional
corrcoef : bool, optional
groups : list of str, optional

Returns:
dict
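
The corrcoef option is not described in this reference. Assuming it replaces the raw data matrix with a matrix of Pearson correlation coefficients between rows, as the name suggests (an assumption, not confirmed by the source), the transformation could be sketched in plain Python as follows. The helper names pearson and correlation_matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(rows):
    """Correlate every row against every other row."""
    return [[pearson(a, b) for b in rows] for a in rows]


# Illustrative values: row 1 is a scaled copy of row 0, row 2 is reversed.
rows = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 2.0, 1.0],
]
corr = correlation_matrix(rows)
```

Under this assumption, clustering on the correlation matrix groups rows by the shape of their profile rather than by magnitude.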

pyproteome.cluster.plot module

pyproteome.cluster.plot.cluster_corrmap(data, y_pred, colorbar=True, f=None, ax=None, div_scale=None, show_names=None)

pyproteome.cluster.plot.cluster_range(data, min_clusters=2, max_clusters=20, cols=3)

pyproteome.cluster.plot.hierarchical_clusters(data, y_pred)

pyproteome.cluster.plot.pca(data)

pyproteome.cluster.plot.plot_all_clusters(data, y_pred, cols=4)

pyproteome.cluster.plot.plot_cluster(data, y_pred, cluster_n, f=None, ax=None, div_scale=None, ylabel=True, title=None, color=None)

pyproteome.cluster.plot.show_cluster(data, y_pred, seq=None, protein=None, ylabel=True, f=None, ax=None, color=None, div_scale=None)

pyproteome.cluster.plot.show_peptide_clusters(data, y_pred, filters, new_colors=False, div_scale=None, cols=4)