2 Answers
Luis Argerich

Both PCA and ICA try to find a set of vectors, a basis, for the data. So you can write any point (vector) in your data as a linear combination of the basis.

In PCA the basis you want to find is the one that best explains the variability of your data. The first vector of the PCA basis is the direction along which your data varies the most (the principal direction); the second vector explains the next-most variability and must be orthogonal to the first one, and so on.
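A rough sketch of this in Python, assuming NumPy and scikit-learn are available (the data and variable names here are made up for illustration):

```python
# Fit PCA on correlated 2-D data: the first basis vector should point
# along the direction of greatest variance, and the basis is orthogonal.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.normal(size=500)
# Two correlated coordinates: most of the spread lies along one diagonal.
data = np.column_stack([x, 0.9 * x + 0.1 * rng.normal(size=500)])

pca = PCA(n_components=2).fit(data)
v1, v2 = pca.components_

print(np.dot(v1, v2))                 # basis vectors are orthogonal (~0)
print(pca.explained_variance_ratio_)  # first direction dominates
```

Here `explained_variance_ratio_` is exactly the "best explains the variability" ranking: the first component accounts for nearly all of the variance in this toy data.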

In ICA the basis you want to find is the one in which each vector is an independent component of your data. You can think of your data as a mixture of signals; the ICA basis will then have one vector for each independent signal.

As an example of ICA consider these two images:

I mixed them in different proportions producing these two mixes:

If we now apply ICA to these images we get this result:

While not 100% perfect, it is an excellent separation of the two mixed images.
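The same demo works with 1-D signals instead of images. Here is a hedged sketch assuming scikit-learn's FastICA (the mixing matrix and signals are made up):

```python
# Mix two independent signals in different proportions, then let ICA
# separate them again -- a 1-D analogue of the image example above.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sign(np.sin(3 * t))   # square wave
s2 = np.sin(5 * t)            # sine wave
S = np.column_stack([s1, s2])

# Two mixes of the two sources, in different proportions.
A = np.array([[0.6, 0.4],
              [0.3, 0.7]])
X = S @ A.T

# ICA recovers the sources, up to sign, scale, and ordering.
ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(X)

# Each true source should correlate strongly with one recovered component.
corr = np.corrcoef(np.column_stack([S, recovered]).T)[:2, 2:]
print(np.abs(corr).max(axis=1))
```

The sign/scale/order ambiguity is why the separated images above are "not 100% perfect": ICA can only recover each source up to those indeterminacies.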

More practically, PCA helps when you want a reduced-rank representation of your data, and ICA helps when you want to represent your data as a combination of independent sub-elements. In layman's terms, PCA helps to compress data and ICA helps to separate data.

Note: PCA is essentially the SVD of the centered data matrix, and it's usually better to compute it that way, because SVD algorithms are faster and more numerically stable than forming the covariance matrix and diagonalizing it.
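A minimal NumPy sketch of that note (the test data is arbitrary):

```python
# PCA via SVD: center the data, take the SVD, and read off the
# principal directions and explained variances.
import numpy as np

rng = np.random.default_rng(1)
# Arbitrary 3-D data with different spreads along different directions.
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.0, 0.0, 0.2]])

Xc = X - X.mean(axis=0)                     # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                              # rows = principal directions
explained_variance = s**2 / (len(X) - 1)     # eigenvalues of the covariance
print(explained_variance)
```

The singular values squared (divided by n-1) are exactly the eigenvalues of the sample covariance matrix, which is what classical PCA diagonalizes.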

Note 2: In some cases NMF (non-negative matrix factorization) can work like ICA. In NMF the basis you want to find is the one that lets you reconstruct the data as a non-negative sum of the basis vectors. This means the basis vectors represent parts of your original data: if your data contains images, the NMF basis contains parts of images that can be combined to reconstruct any image in the dataset.
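A hedged sketch of that parts-based idea, assuming scikit-learn's NMF (the "parts" here are a made-up toy example, not images):

```python
# Build non-negative data from two non-negative "parts", then check
# that NMF recovers a non-negative basis that reconstructs the data.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
parts = np.array([[1.0, 1.0, 0.0, 0.0],     # part 1: left half
                  [0.0, 0.0, 1.0, 1.0]])    # part 2: right half
weights = rng.random((100, 2))              # non-negative activations
V = weights @ parts

model = NMF(n_components=2, init="random", random_state=0, max_iter=1000)
W = model.fit_transform(V)    # non-negative weights per sample
H = model.components_         # non-negative basis ("parts")

print(np.abs(V - W @ H).max())  # reconstruction error is small
```

Because both factors are constrained to be non-negative, nothing can cancel out, which is what pushes the basis toward additive "parts" rather than the signed directions PCA finds.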

Hope it helps,

Luis

Both techniques try to obtain new sources by linearly combining the original signals. PCA attempts to find uncorrelated sources, whereas ICA attempts to find independent sources.

"Uncorrelatedness" has a precise definition that can readily be used in an optimization scheme; it boils down to least-squares minimization. There are several ways of approaching "independence", however. One way, motivated by the central limit theorem, is to find the source space that maximizes the "non-gaussianity" of all sources, which can itself be measured in a few different ways. ICA is really a class of blind source separation techniques rather than a single algorithm.
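One common non-gaussianity measure is excess kurtosis; a small sketch assuming SciPy is available (the distributions are just examples):

```python
# Excess kurtosis as a non-gaussianity measure: ~0 for a Gaussian,
# negative for flatter (sub-gaussian) and positive for peakier
# (super-gaussian) distributions.
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
gaussian = rng.normal(size=100_000)
uniform = rng.uniform(-1, 1, size=100_000)  # sub-gaussian
laplace = rng.laplace(size=100_000)         # super-gaussian

print(kurtosis(gaussian))  # near 0
print(kurtosis(uniform))   # clearly negative
print(kurtosis(laplace))   # clearly positive
```

An ICA algorithm like FastICA rotates the (whitened) data to drive measures like this as far from zero as possible, since a mixture of independent sources looks more Gaussian than the sources themselves.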

PCA can also rank its sources by how much variance each explains. ICA does not have this property, which makes it a poor tool for dimensionality reduction.