Separation in Optimal Designs for the Logistic Regression Model

Description
Optimal design theory provides a general framework for the construction of experimental designs for categorical responses. For a binary response, where the possible result is one of two outcomes, the logistic regression model is widely used to relate a set of experimental factors with the probability of a positive (or negative) outcome. This research investigates and proposes alternative designs to alleviate the problem of separation in small-sample D-optimal designs for the logistic regression model. Separation causes the non-existence of maximum likelihood parameter estimates and presents a serious problem for model fitting purposes.

First, it is shown that exact, multi-factor D-optimal designs for the logistic regression model can be susceptible to separation. Several logistic regression models are specified, and exact D-optimal designs of fixed sizes are constructed for each model. Sets of simulated response data are generated to estimate the probability of separation in each design. This study shows through simulation that small-sample D-optimal designs are prone to separation and that separation risk depends on the specified model. Additionally, it is demonstrated that exact designs of equal size constructed for the same models may have significantly different chances of encountering separation.
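The simulation idea can be illustrated with a minimal sketch. It is deliberately simplified to a one-factor model (intercept plus slope), where separation can be checked by comparing the factor ranges of the two response groups; the multi-factor designs studied in this research would need a more general check. The function names, the example design, and the parameter values are all hypothetical.

```python
import math
import random

def has_separation(x, y):
    """Return True if a one-factor logistic fit (intercept + slope) has no
    finite maximum likelihood estimate for responses y observed at levels x.
    Assumes the design uses at least two distinct factor levels."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return True  # all responses identical
    # Complete or quasi-complete separation: the two groups do not interleave.
    return max(x0) <= min(x1) or max(x1) <= min(x0)

def separation_probability(design, beta0, beta1, n_sim=5000, seed=0):
    """Monte Carlo estimate of the probability that simulated responses
    from the assumed model (beta0, beta1) are separated on this design."""
    rng = random.Random(seed)
    probs = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * xi))) for xi in design]
    hits = 0
    for _ in range(n_sim):
        y = [1 if rng.random() < p else 0 for p in probs]
        hits += has_separation(design, y)
    return hits / n_sim

# Hypothetical 4-run and 40-run designs on the same two levels.
small = [-1, -1, 1, 1]
large = [-1] * 20 + [1] * 20
print(separation_probability(small, 0.0, 1.0))
print(separation_probability(large, 0.0, 1.0))
```

Running both calls with the same assumed parameters gives a rough sense of how strongly the separation risk depends on the number of runs.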

The second portion of this research establishes an effective strategy for augmentation, where additional design runs are judiciously added to eliminate separation that has occurred in an initial design. A simulation study is used to demonstrate that augmenting runs in regions of maximum prediction variance (MPV), where the predicted probability of either response category is 50%, most reliably eliminates separation. However, it is also shown that MPV augmentation tends to yield augmented designs with lower D-efficiencies.
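As a rough illustration of the MPV idea, the sketch below ranks candidate runs by the Bernoulli variance p(1 - p), which peaks where the predicted probability is 50%. This is one reading of the description above, not the procedure used in the research. The candidate grid, the parameter guess `beta` (needed because separated data yield no maximum likelihood estimate), and the function name are assumptions.

```python
import math
from itertools import product

def mpv_candidates(candidates, beta, k=2):
    """Return the k candidate points whose predicted probability under the
    assumed parameters beta = (intercept, slopes...) is closest to 0.5,
    i.e. where the response variance p * (1 - p) is largest."""
    def response_variance(x):
        eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
        p = 1.0 / (1.0 + math.exp(-eta))
        return p * (1.0 - p)
    return sorted(candidates, key=response_variance, reverse=True)[:k]

# Hypothetical two-factor candidate grid and parameter guess.
grid = list(product([-1, 0, 1], repeat=2))
print(mpv_candidates(grid, beta=(0.0, 1.0, -1.0), k=3))
```

With this parameter guess the linear predictor is zero along the diagonal of the grid, so the selected points lie on the line where either outcome is equally likely.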

The final portion of this research proposes a novel compound optimality criterion, DMP, that is used to construct locally optimal and robust compromise designs. A two-phase coordinate exchange algorithm is implemented to construct exact locally DMP-optimal designs. To address design dependence issues, a maximin strategy is proposed for designating a robust DMP-optimal design. A case study demonstrates that the maximin DMP-optimal design maintains comparable D-efficiencies to a corresponding Bayesian D-optimal design while offering significantly improved separation performance.
Date Created
2019
Agent

Categorical responses in mixture experiments

Description
Mixture experiments are useful when the interest is in determining how changes in the proportion of an experimental component affect the response. This research focuses on the modeling and design of mixture experiments when the response is categorical, namely binary or ordinal. Data from mixture experiments are characterized by the perfect collinearity of the experimental components, resulting in model matrices that are singular and parameters that are inestimable under likelihood estimation procedures. To alleviate problems with estimation, this research proposes the reparameterization of two nonlinear models for ordinal data -- the proportional-odds model with a logistic link and the stereotype model. A study involving subjective ordinal responses from a mixture experiment demonstrates that the stereotype model reveals useful information about the relationship between mixture components and the ordinality of the response, which the proportional-odds model fails to detect.
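The collinearity can be made concrete with the standard mixture constraint. The identity below is the usual Scheffé-type argument for a linear predictor with q components; the symbols q, x_i, and beta_i are illustrative notation, and this is not necessarily the reparameterization proposed in this research.

```latex
\sum_{i=1}^{q} x_i = 1
\quad\Longrightarrow\quad
\beta_0 + \sum_{i=1}^{q} \beta_i x_i
  = \sum_{i=1}^{q} (\beta_0 + \beta_i)\, x_i
  = \sum_{i=1}^{q} \beta_i^{*} x_i ,
\qquad \beta_i^{*} = \beta_0 + \beta_i .
```

Because the intercept column equals the sum of the component columns, the intercept and the component coefficients are not separately identifiable; only the q combined coefficients are.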

The second half of this research deals with the construction of exact D-optimal designs for binary and ordinal responses. For both types, the base models fall under the class of Generalized Linear Models (GLMs) with a logistic link. First, the properties of the exact D-optimal mixture designs for binary responses are investigated. It is shown that standard mixture designs and designs proposed for normal-theory responses are poor surrogates for the true D-optimal designs. In contrast with the D-optimal designs for normal-theory responses, which locate support points at the boundaries of the mixture region, exact D-optimal designs for GLMs tend to locate support points in regions of uncertainty. Alternate D-optimal designs for binary responses with high D-efficiencies are proposed by utilizing information about these regions.

The Mixture Exchange Algorithm (MEA), a search heuristic tailored to the construction of efficient mixture designs with GLM-type responses, is proposed. MEA introduces a new and efficient updating formula that lessens the computational expense of calculating the D-criterion for multi-categorical response systems, such as ordinal response models. MEA computationally outperforms comparable search heuristics by several orders of magnitude. Further, its computational expense grows more slowly with increasing problem size. Finally, local and robust D-optimal designs for ordinal-response mixture systems are constructed using MEA, investigated, and shown to have high D-efficiency performance.
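To give a sense of what a determinant updating formula buys, the sketch below uses the well-known rank-one identity det(M + w x x') = det(M) (1 + w x' M^{-1} x) for a binary logistic model. This is the classical update, not the new formula introduced by MEA for multi-categorical responses, which is not reproduced here. All function names, the example design, and the parameter values are hypothetical.

```python
import math

def solve_and_det(M, b):
    """Gaussian elimination with partial pivoting: returns (M^{-1} b, det M)."""
    n = len(M)
    A = [row[:] + [bi] for row, bi in zip(M, b)]
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-12:
            raise ValueError("information matrix is singular")
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (A[r][n] - sum(A[r][k] * z[k] for k in range(r + 1, n))) / A[r][r]
    return z, det

def glm_weight(x, beta):
    """Logistic GLM weight p * (1 - p) at model row x (x[0] is the intercept 1)."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    p = 1.0 / (1.0 + math.exp(-eta))
    return p * (1.0 - p)

def information(design, beta):
    """Fisher information: sum over runs of w(x) * x x'."""
    n = len(beta)
    M = [[0.0] * n for _ in range(n)]
    for x in design:
        w = glm_weight(x, beta)
        for i in range(n):
            for j in range(n):
                M[i][j] += w * x[i] * x[j]
    return M

def det_after_adding(M, x, beta):
    """D-criterion after adding run x, via the rank-one determinant identity."""
    z, det_M = solve_and_det(M, x)
    w = glm_weight(x, beta)
    return det_M * (1.0 + w * sum(xi * zi for xi, zi in zip(x, z)))

# Hypothetical 4-run design (intercept column included) and parameter guess.
design = [(1, -1, -1), (1, -1, 1), (1, 1, -1), (1, 1, 1)]
beta = (0.5, 1.0, -1.0)
new_run = (1, 0.0, 0.5)
M = information(design, beta)
print(det_after_adding(M, new_run, beta))
```

In an exchange search, a practical implementation would typically cache the inverse of the information matrix rather than re-solve at each step; the sketch re-solves for brevity.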
Date Created
2016
Agent