Causal Clustering for 1-Factor Measurement Models
Erich Kummerfeld*, University of Pittsburgh; Joseph Ramsey, Carnegie Mellon University
Many scientific research programs aim to learn the causal structure of real-world phenomena. This learning problem is made more difficult when the target of study cannot be directly observed. One strategy commonly used by social scientists is to create measurable “indicator” variables that covary with the latent variables of interest. Before leveraging the indicator variables to learn about the latent variables, however, one needs a measurement model of the causal relations between the indicators and their corresponding latents. These measurement models are a special class of Bayesian networks. This paper addresses the problem of reliably inferring measurement models from measured indicators, without prior knowledge of the causal relations or the number of latent variables. We present a novel, provably correct algorithm, FindOneFactorClusters (FOFC), for solving this inference problem. Compared with other state-of-the-art algorithms, FOFC is faster, scales to larger sets of indicators, and is more reliable at small sample sizes. We also present the first correctness proofs for this problem that do not assume linearity or acyclicity among the latent variables.