Masaoud, Elmabrok A. M. Statistical Models for Binary Repeated Measures and Hierarchical Data in Veterinary Science. 2009. University of Prince Edward Island, Dissertation/Thesis, https://scholar2.islandarchives.ca/islandora/object/ir%3A21637.

Genre

  • Dissertation/Thesis
Contributors
Author: Masaoud, Elmabrok A. M.
Thesis advisor: Stryhn, Henrik
Date Issued
2009
Publisher
University of Prince Edward Island
Place Published
Charlottetown, PE
Extent
279
Abstract

The objective of the thesis was to assess the performance of statistical procedures for the analysis of binary longitudinal data in veterinary science, specifically, to describe and quantify their performance in terms of statistical properties such as unbiasedness, confidence interval coverage and efficiency. The focus was on marginal and random effects procedures including: ordinary logistic regression (OLR), alternating logistic regression (ALR), generalized estimating equations (GEE), marginal quasi likelihood (MQL), penalized quasi likelihood (PQL), pseudo likelihood (REPL), maximum likelihood (ML) and Bayesian Markov chain Monte Carlo (MCMC). The marginal and random effects procedures handle the within-subject dependence differently, and they offer different interpretations of regression estimates for binary longitudinal data. Several simulation studies covered a wide range of data structures and designs including a two-level balanced longitudinal design, a three-level balanced setting of binary repeated measures data, and repeated measures data with missing values. A statistical simulation approach was used as the tool for the assessment.
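
As an illustration of why marginal and random effects procedures give differently interpreted regression estimates for binary data, the following minimal sketch simulates a two-level random intercept logistic model and compares the subject-specific (conditional) coefficient with the population-averaged (marginal) log odds ratio. It is not taken from the thesis: the function name, the intercept of -0.5, and all parameter values are hypothetical choices made for this example only.

```python
import math
import random


def simulate_marginal_vs_conditional(beta=1.0, sigma_u=2.0,
                                     n_subjects=20000, n_times=4, seed=0):
    """Simulate binary repeated measures from a random intercept logistic
    model and return (conditional beta, marginal log odds ratio).

    All defaults are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    successes = [0, 0]  # number of y == 1 for treatment x = 0 and x = 1
    totals = [0, 0]     # number of observations for x = 0 and x = 1

    for _ in range(n_subjects):
        # Subject-level random intercept induces within-subject dependence.
        u = rng.gauss(0.0, sigma_u)
        for t in range(n_times):
            # Within-subject treatment: alternate 0/1 over time points.
            x = t % 2
            eta = -0.5 + beta * x + u
            p = 1.0 / (1.0 + math.exp(-eta))
            y = 1 if rng.random() < p else 0
            successes[x] += y
            totals[x] += 1

    def logit(q):
        return math.log(q / (1.0 - q))

    # Population-averaged effect: log odds ratio from the pooled 2x2 table,
    # which ignores the subject identity altogether.
    p0 = successes[0] / totals[0]
    p1 = successes[1] / totals[1]
    return beta, logit(p1) - logit(p0)


if __name__ == "__main__":
    conditional, marginal = simulate_marginal_vs_conditional()
    print(f"conditional (subject-specific) beta: {conditional:.3f}")
    print(f"marginal (population-averaged) log OR: {marginal:.3f}")
```

Under these assumptions the marginal log odds ratio comes out smaller in magnitude than the conditional coefficient, which is why estimates from the two families of procedures are not directly comparable without accounting for the random effects variance.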

The first study involved a two-level setting of binary repeated measures data. Results for the marginal model data showed that the autoregressive GEE was highly efficient when treatment was within subjects, even with strongly correlated responses. For treatment between subjects, random effects methods also performed well in some situations; however, a small number of subjects with short time series proved a challenge for both marginal and random effects methods. Results for the random effects model data showed bias in estimates from random effects methods while the marginal model produced estimates close to the marginal parameters.

The second study involved binary repeated measures data with an additional hierarchical structure. Results indicate that in data generated by random intercept models, the ML and MCMC procedures performed well and had fairly similar estimation errors. The PQL regression estimates were attenuated, and the PQL variance estimates were less accurate than those from ML and MCMC, with the direction of the bias depending on whether binomial or extra-binomial dispersion was assumed. In datasets with autocorrelation, random effects estimation procedures gave downward-biased estimates, while marginal estimates were little affected by the presence of autocorrelation. The results also indicate that in addition to ALR, a GEE procedure that accounts for clustering at the highest hierarchical level is sufficient. The REPL procedure performed poorly and produced unsatisfactory estimates regardless of autocorrelation values.

The third study involved binary repeated measures data with an additional hierarchical structure and missing values, where five different scenarios of simulated incomplete datasets were considered. The first scenario corresponded to a combination of three types of missingness patterns present in a real (scc40) dataset: delayed entry and drop-outs as well as intermittent missing values. The remaining scenarios involved only drop-outs, and corresponded to moderate or high percentages of values that were either missing at random (MAR) or not missing at random (NMAR).

In the first scenario, all estimation procedures except OLR performed well and produced estimates with small relative bias (generally less than 5%) for levels of missingness that roughly corresponded to the scc40 data. In MAR missingness scenarios, some biases were found for ALR, GEE and PQL procedures, whereas the likelihood-based procedures were largely unaffected by the missing values. In NMAR scenarios, all procedures experienced similar and strong biases in the time coefficient; however, fixed effects estimates at the subject and cluster level were relatively unaffected. The presence of autocorrelation in the data did not substantially alter the impact of missing values although the shrinkage of random effects estimates was marginally less pronounced than in the full datasets.

Additionally, a hierarchical data structure arising in an aquaculture vaccine trial on Infectious Salmon Anaemia Virus (ISAV), where multiple treatment groups of fish in the same tanks were observed over time, was studied. The focus was to assess and account for neighbour treatment effects. By neighbour treatment effects in an incomplete block design setting, we mean that treatments present in the same block (tank) may affect each other in their performance. Two statistical models were proposed to assess and account for neighbour treatment effects. The first approach was based on a non-linear model, and the second involved cross-classified and multiple membership models. The performance of the models was evaluated by simulation.

Results demonstrated that both proposed models show promise in capturing neighbour treatment effects of the type assumed, whenever such neighbour effects are of at least moderate magnitude. Analyses of the ISAV trial data by both models did not provide any evidence of substantial neighbour effects.

Note

Source: Dissertation Abstracts International, Volume: 70-08, Section: B, page: 4570.

Language

  • English

ETD Degree Name

  • Doctor of Philosophy

ETD Degree Level

  • Doctoral

ETD Degree Discipline

  • Faculty of Veterinary Medicine. Department of Health Management.
Degree Grantor
University of Prince Edward Island

Subjects

  • Biology, Biostatistics
ISBN
9780494498606
LAC Identifier
TC-PCU-21637

Department