Analysis of Multivariate Data with Random Cluster Size

Title: Analysis of Multivariate Data with Random Cluster Size.
Name(s): Li, Xiaoyun, author
Sinha, Debajyoti, professor directing dissertation
Zhou, Yi, university representative
McGee, Dan, committee member
Lipsitz, Stuart, committee member
Department of Statistics, degree granting department
Florida State University, degree granting institution
Type of Resource: text
Genre: Text
Issuance: monographic
Date Issued: 2011
Publisher: Florida State University
Place of Publication: Tallahassee, Florida
Physical Form: computer
online resource
Extent: 1 online resource
Language(s): English
Abstract/Description: In this dissertation, we examine correlated binary data with present/absent components or missing data that are related to the binary responses of interest. Depending on the data structure, correlated binary data can be referred to as clustered data if the sampling unit is a cluster of subjects, or as longitudinal data when they involve repeated measurements of the same subject over time. We propose novel models for these two data structures and illustrate them with real data applications. In biomedical studies involving clustered binary responses, the cluster size can vary because some components of the cluster can be absent. When both the presence of a cluster component and the binary disease status of a present component are treated as responses of interest, we propose a novel two-stage random effects logistic regression framework. For ease of interpretation of regression effects, both the marginal probability of presence/absence of a component and the conditional probability of disease status of a present component preserve approximate logistic regression forms. We present a maximum likelihood method of estimation implementable using standard statistical software. We compare our models and the physical interpretation of regression effects with competing methods from the literature. We also present a simulation study to assess the robustness of our procedure to misspecification of the random effects distribution and to compare the finite-sample performance of the estimates with existing methods. The methodology is illustrated by analyzing a study of periodontal health status in a diabetic Gullah population. We extend this model to longitudinal studies with a binary longitudinal response and informative missing data. In longitudinal studies, when each subject is treated as a cluster, the cluster size is the total number of observations for that subject. When data are informatively missing, the cluster size of each subject can vary and is related to the binary response of interest, and we are also interested in the missing-data mechanism. This is a modified version of the clustered binary data setting with present components. We modify and adapt our proposed two-stage random effects logistic regression model so that both the marginal probabilities of the binary response and missing indicator and the conditional probabilities of the binary response and missing indicator preserve logistic regression forms. We present a Bayesian framework for this model and illustrate it on an AIDS data example.
Identifier: FSU_migr_etd-1425 (IID)
Submitted Note: A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Degree Awarded: Spring Semester, 2011.
Date of Defense: December 2, 2010.
Keywords: Clustered data, Longitudinal data analysis, Informative missing, Categorical data analysis, Logistic regression, Bridge distribution
Bibliography Note: Includes bibliographical references.
Advisory Committee: Debajyoti Sinha, Professor Directing Dissertation; Yi Zhou, University Representative; Dan McGee, Committee Member; Stuart Lipsitz, Committee Member.
Subject(s): Statistics
Persistent Link to This Record:
Use and Reproduction: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them.
Host Institution: FSU

Li, X. (2011). Analysis of Multivariate Data with Random Cluster Size. Retrieved from