Tree Ensemble Algorithms for Causal Machine Learning

190865-Thumbnail Image.png
Description
This dissertation centers on treatment effect estimation in the field of causal inference, and aims to expand the toolkit for effect estimation when the treatment variable is binary. Two new stochastic tree-ensemble methods for treatment effect estimation in the continuous

This dissertation centers on treatment effect estimation in the field of causal inference, and aims to expand the toolkit for effect estimation when the treatment variable is binary. Two new stochastic tree-ensemble methods for treatment effect estimation in the continuous outcome setting are presented. The Accelerated Bayesian Causal Forrest (XBCF) model handles variance via a group-specific parameter, and the Heteroskedastic version of XBCF (H-XBCF) uses a separate tree ensemble to learn covariate-dependent variance. This work also contributes to the field of survival analysis by proposing a new framework for estimating survival probabilities via density regression. Within this framework, the Heteroskedastic Accelerated Bayesian Additive Regression Trees (H-XBART) model, which is also developed as part of this work, is utilized in treatment effect estimation for right-censored survival outcomes. All models have been implemented as part of the XBART R package, and their performance is evaluated via extensive simulation studies with appropriate sets of comparators. The contributed methods achieve similar levels of performance, while being orders of magnitude (sometimes as much as 100x) faster than comparator state-of-the-art methods, thus offering an exciting opportunity for treatment effect estimation in the large data setting.
Date Created
2023
Agent

Machine Learning and Causal Inference: Theory, Examples, and Computational Results

187808-Thumbnail Image.png
Description
This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review

This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review and extension of frameworks for estimating causal effects and in-depth theoretical study. Next, various computational approaches to estimating causal effects with machine learning methods are compared with these theoretical desiderata in mind. Several improvements to current methods for causal machine learning are identified and compelling angles for further study are pinpointed. Finally, a common method used for “explaining” predictions of machine learning algorithms, SHAP, is evaluated critically through a statistical lens.
Date Created
2023
Agent