Is it fine if I say that the reason PCA needs the data to be standardized is that PCA merges similar features (with small variance) into one principal component, and if the data is not standardized, features which behave similarly in a particular case may be seen as different, since their different ranges of values cause their variances to differ a lot?
#Question about PCA and standardization
It's more accurate to say PCA needs standardized data (when features are on different scales) because it relies on variance to identify principal components. Without standardization, features with larger ranges dominate the total variance, so PCA prioritizes them even if they behave similarly to features with smaller ranges. This distorts the structure PCA captures, and can screw you in the long run.
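If it helps to see it, here is a rough sketch of that effect with two made-up features that move together but live on very different scales. The data, the variable names `small` and `large`, and the helper functions are all assumptions for illustration; it uses the closed-form principal axis of a 2x2 covariance matrix rather than a real PCA library, so treat it as a toy, not a reference implementation.

```python
import math
from statistics import mean, pstdev

def covariance(xs, ys):
    """Population covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def first_pc(xs, ys):
    """Unit vector of the first principal component for two features.

    For the covariance matrix [[a, b], [b, c]], the major axis makes an
    angle theta with the first feature, where tan(2*theta) = 2b / (a - c).
    """
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    theta = 0.5 * math.atan2(2 * b, a - c)
    return (math.cos(theta), math.sin(theta))

def standardize(xs):
    """Rescale to zero mean and unit (population) standard deviation."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

# Hypothetical data: both features rise together, but `large` has a
# range roughly a thousand times bigger than `small`.
small = [1.0, 2.0, 3.0, 4.0, 5.0]
large = [1100.0, 1900.0, 3200.0, 3900.0, 5000.0]

raw_pc = first_pc(small, large)
std_pc = first_pc(standardize(small), standardize(large))

print("first PC, raw data:         ", raw_pc)
print("first PC, standardized data:", std_pc)
```

On the raw data the first component points almost entirely along `large`, so `small` contributes next to nothing. After standardizing, both features get roughly equal weight (about 0.707 each), which matches the intuition that they carry the same signal.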
I see
thx