My understanding is that PCA does not identify which variables depend on each other and does not specifically identify each component, so how do analysts interpret the factors/components in PCA?
A good example is the PCA factors for US Treasuries of different maturities. We are told PC1 is parallel shifts, PC2 is curve shifts (steepeners and flatteners) and PC3 is convexity (curvature) changes - how do we know these descriptors are the factors/components, and why in that order? What could PC>3 be?
Actually, it's the opposite. PCA works from the covariance matrix of the standardized values (equivalently, the correlation matrix) to determine how much UNIQUE information, i.e. variance, each component carries. Because it uses a covariance matrix, it takes co-movement between the variables into account.
Procedure: 1. convert your data to z-scores -> 2. create a covariance matrix of the z-scores -> 3. compute the eigenvalues and the corresponding eigenvectors of that matrix
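The three steps can be sketched in a few lines of NumPy. This is only a minimal illustration, not a reference implementation: the function name `pca_from_zscores` and the synthetic three-column data set are assumptions made up for the example.

```python
import numpy as np

def pca_from_zscores(X):
    """Hypothetical helper. X has shape (n_samples, n_features).

    Returns eigenvalues in descending order and the matching
    eigenvectors as columns.
    """
    # Step 1: convert each column to z-scores (sample std, ddof=1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the z-scores
    # (this equals the correlation matrix of X).
    C = np.cov(Z, rowvar=False)
    # Step 3: eigen-decomposition. eigh is meant for symmetric matrices
    # and returns eigenvalues in ascending order, so reverse them.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Synthetic data (an assumption for illustration): three series that
# share one common driver plus independent noise.
rng = np.random.default_rng(0)
common = rng.normal(size=500)
X = np.column_stack([common + 0.3 * rng.normal(size=500) for _ in range(3)])

eigvals, eigvecs = pca_from_zscores(X)
print(eigvals)        # variance carried by each component
print(eigvecs[:, 0])  # loadings of the first component
```

In this toy data all three series are positively correlated, so the loadings in the first eigenvector all share one sign. Reading loading patterns like this one is how a label such as "parallel shift" gets attached to a component: the name comes from inspecting the eigenvector, not from PCA itself.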
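After performing this procedure, the eigenvectors with the largest eigenvalues are the components that capture the most unique information (variance). If you would like, you can compute each component's contribution % to the model as its eigenvalue divided by the sum of all eigenvalues.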
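As a rough sketch of the contribution calculation, each eigenvalue can be divided by the sum of all eigenvalues. The helper name `explained_variance_ratio` and the eigenvalues below are made-up illustrative values, not results from any real data set.

```python
import numpy as np

def explained_variance_ratio(eigvals):
    """Hypothetical helper: share of total variance per component."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals / eigvals.sum()

# Illustrative eigenvalues only (assumed, sorted in descending order).
eigvals = np.array([2.7, 0.25, 0.05])

ratios = explained_variance_ratio(eigvals)
cumulative = np.cumsum(ratios)  # running total across PC1, PC2, ...

print(ratios)
print(cumulative)
```

The ordering PC1, PC2, PC3 simply follows this ranking: components are sorted by how much variance they account for.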
Forgetting to standardize your data into z-scores typically results in incomparable measures, as covariance, unlike correlation, is not a unitless measure: it depends on the scale of each variable.
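To see why standardizing matters, here is a small sketch using synthetic data (the series `a` and `b` are assumptions for illustration). Rescaling one variable, say from percent to basis points, changes the covariance by the same factor but leaves the correlation unchanged.

```python
import numpy as np

# Synthetic, positively related series (assumed for illustration).
rng = np.random.default_rng(1)
a = rng.normal(size=1000)
b = 0.5 * a + rng.normal(size=1000)

# Covariance and correlation in the original units.
cov_ab = np.cov(a, b)[0, 1]
corr_ab = np.corrcoef(a, b)[0, 1]

# Same quantities after multiplying `a` by 100 (a change of units).
cov_scaled = np.cov(100 * a, b)[0, 1]
corr_scaled = np.corrcoef(100 * a, b)[0, 1]

print(cov_ab, cov_scaled)    # covariance is multiplied by 100
print(corr_ab, corr_scaled)  # correlation is the same
```

Without z-scores, whichever variable happens to be measured in the largest units would dominate the leading component.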
Undergraduate: accounting, finance, information systems; Graduate: MBA/finance; Graduate certificates: data science, applied statistics, advanced valuation; PhD candidate - data science