Principal components analysis

Utility functions for running principal components analysis and plotting the results.

See also

sklearn.decomposition.PCA, anhima.ld.ld_prune_pairwise

Notes

The function anhima.ld.ld_prune_pairwise() can be used to obtain a set of variants in approximate linkage equilibrium prior to running PCA.
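
As a rough illustration of where the functions below fit in, the following sketch fits a sklearn.decomposition.PCA model and obtains the transformed coordinates. The toy genotype array `gn`, its 0/1/2 coding, and its (n_variants, n_samples) orientation are assumptions made for the example, not something this module prescribes; the LD pruning step is only indicated in a comment.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data: 200 variants x 12 samples, genotypes coded 0/1/2.
rng = np.random.default_rng(42)
gn = rng.integers(0, 3, size=(200, 12)).astype(float)

# In a real analysis the variants would normally be thinned to approximate
# linkage equilibrium first (see the note on ld_prune_pairwise above).

# scikit-learn expects one row per sample, so transpose the assumed
# (n_variants, n_samples) layout before fitting.
model = PCA(n_components=4)
coords = model.fit_transform(gn.T)

print(coords.shape)  # (n_samples, n_components)
```

The resulting `model` and `coords` correspond to the `model` and `coords` parameters documented below.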

anhima.pca.plot_coords(model, coords, pcx=1, pcy=2, ax=None, colors=u'b', sizes=20, labels=None, scatter_kwargs=None, annotate_kwargs=None)

Scatter plot of transformed coordinates from principal components analysis.

Parameters:

model : sklearn.decomposition.PCA

The fitted model.

coords : ndarray, shape (n_samples, n_components)

The transformed coordinates.

pcx : int, optional

The principal component to plot on the X axis. N.B., this is one-based, so 1 is the first principal component, 2 is the second component, etc.

pcy : int, optional

The principal component to plot on the Y axis. N.B., this is one-based, so 1 is the first principal component, 2 is the second component, etc.

ax : axes, optional

The axes on which to draw. If not provided, a new figure will be created.

colors : color or sequence of color, optional

Can be a single color format string, or a sequence of color specifications of length n_samples.

sizes : scalar or array_like, shape (n_samples), optional

Marker size(s) in points^2.

labels : sequence of strings, optional

If provided, will be used to label points in the plot.

scatter_kwargs : dict-like, optional

Additional keyword arguments passed through to plt.scatter.

annotate_kwargs : dict-like, optional

Additional keyword arguments passed through to plt.annotate when labelling points.

Returns:

ax : axes

The axes on which the plot was drawn.
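
To make the one-based pcx and pcy convention concrete, here is a minimal sketch of a comparable scatter plot written directly against matplotlib. It is not the anhima implementation; the helper name `sketch_plot_coords`, the axis labels, and the toy data are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data: 10 samples x 50 variants.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(10, 50)).astype(float)
model = PCA(n_components=4)
coords = model.fit_transform(X)


def sketch_plot_coords(coords, pcx=1, pcy=2, ax=None, labels=None):
    """Illustrative stand-in for a PCA coordinate scatter plot."""
    if ax is None:
        fig, ax = plt.subplots()
    # pcx and pcy are one-based, so subtract 1 to get the column index.
    x = coords[:, pcx - 1]
    y = coords[:, pcy - 1]
    ax.scatter(x, y, c="b", s=20)
    if labels is not None:
        # Attach one text label to each point.
        for label, xi, yi in zip(labels, x, y):
            ax.annotate(label, (xi, yi))
    ax.set_xlabel("PC%d" % pcx)
    ax.set_ylabel("PC%d" % pcy)
    return ax


sample_labels = ["s%d" % i for i in range(coords.shape[0])]
ax = sketch_plot_coords(coords, pcx=1, pcy=3, labels=sample_labels)
```

Going by the signature above, the corresponding call to this module would be along the lines of anhima.pca.plot_coords(model, coords, pcx=1, pcy=3, labels=sample_labels).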

anhima.pca.plot_variance_explained(model, bar_kwargs=None, ax=None)

Plot the variance explained by each principal component.

Parameters:

model : sklearn.decomposition.PCA

The fitted model.

bar_kwargs : dict-like, optional

Additional keyword arguments passed through to ax.bar().

ax : axes, optional

The axes on which to draw. If not provided, a new figure will be created.

Returns:

ax : axes

The axes on which the plot was drawn.
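
A comparable bar chart can be sketched from the fitted model's explained_variance_ratio_ attribute, which is part of the scikit-learn PCA API. Whether anhima plots the ratio, a percentage, or the raw variance is not stated here, so the percentage scaling and the axis labels below are assumptions, as is the toy data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data: 12 samples x 80 variants.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(12, 80)).astype(float)
model = PCA(n_components=5).fit(X)

# Fraction of total variance captured by each component, as a percentage.
percent = model.explained_variance_ratio_ * 100

fig, ax = plt.subplots()
# One bar per component, numbered from 1 to match the one-based convention.
component_numbers = np.arange(1, len(percent) + 1)
ax.bar(component_numbers, percent)
ax.set_xlabel("principal component")
ax.set_ylabel("variance explained (%)")
```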

anhima.pca.plot_loadings(model, pc=1, pos=None, plot_kwargs=None, ax=None)

Plot loadings for the given principal component.

Parameters:

model : sklearn.decomposition.PCA

The fitted model.

pc : int, optional

The principal component to plot loadings for. N.B., this is one-based, so 1 is the first principal component, 2 is the second component, etc.

pos : array_like, int, optional

An array of variant positions to use for the X axis. If not given, variant index will be used for the X axis.

plot_kwargs : dict-like, optional

Additional keyword arguments passed through to ax.plot().

ax : axes, optional

The axes on which to draw. If not provided, a new figure will be created.

Returns:

ax : axes

The axes on which the plot was drawn.
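
The loadings for a component can be read from the fitted model's components_ attribute, one row per component. The sketch below plots them against either variant positions or variant index; it is an illustration under assumed toy data, not the anhima implementation, and the helper name `sketch_plot_loadings` and the axis labels are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data: 12 samples x 60 variants, with made-up positions.
rng = np.random.default_rng(2)
n_variants = 60
X = rng.integers(0, 3, size=(12, n_variants)).astype(float)
pos = np.sort(rng.choice(100000, size=n_variants, replace=False))
model = PCA(n_components=3).fit(X)


def sketch_plot_loadings(model, pc=1, pos=None, ax=None):
    """Illustrative stand-in for a per-variant loadings plot."""
    if ax is None:
        fig, ax = plt.subplots()
    # pc is one-based; components_ has shape (n_components, n_features).
    loadings = model.components_[pc - 1]
    if pos is None:
        # Fall back to the variant index on the X axis.
        pos = np.arange(loadings.shape[0])
    ax.plot(pos, loadings, linestyle="none", marker=".")
    ax.set_xlabel("position")
    ax.set_ylabel("PC%d loading" % pc)
    return ax


ax = sketch_plot_loadings(model, pc=2, pos=pos)
```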