Research Interests

  • Spatial Statistics

  • Small Area Estimation

  • Official Statistics

  • Survey Methodology

  • Spatio-temporal Statistics

  • Bayesian Statistics

  • Network Analysis

  • Statistics for Public Policy

Research Projects

The following is a list of recently concluded or ongoing research projects in which I am involved.

Covariate measurement error in multi-type spatial data

To varying degrees, almost all statistical data contain measurement error. The error may result from self-reporting, survey design, the incorporation of variables from other data sources, and so on. The smaller the geography (domain) under consideration, the greater the impact of measurement error on the reported estimates. Failure to account for this error in the modeling framework can lead to biased and inconsistent estimates of regression parameters, severely impede inference, and often result in unreliable predictions. We propose a multivariate multi-type Bayesian hierarchical mixed effects model for area-level data whose responses may follow multiple classes of distributions, Gaussian or non-Gaussian, using auxiliary data that are measured with error and are spatially dependent, under the Hierarchical Generalized Transformation model specification.
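
The bias caused by ignoring covariate measurement error can be seen in a small simulation. The sketch below is only a textbook illustration with a single covariate and classical additive error, not the hierarchical model proposed in the paper; the sample size, variances, and variable names are all assumptions chosen for the example.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(0)
n = 50_000          # illustrative sample size
beta = 2.0          # true regression slope (assumed)

x_true = rng.normal(0.0, 1.0, n)            # error-free covariate, variance 1
w_obs = x_true + rng.normal(0.0, 1.0, n)    # observed covariate with error variance 1
y = beta * x_true + rng.normal(0.0, 0.5, n)

slope_true = ols_slope(x_true, y)    # regression on the error-free covariate
slope_naive = ols_slope(w_obs, y)    # regression that ignores the measurement error

# Classical attenuation factor: var(x) / (var(x) + var(error)) = 0.5 here.
reliability = 1.0 / (1.0 + 1.0)
print(f"slope using true covariate:     {slope_true:.3f}")
print(f"slope using observed covariate: {slope_naive:.3f}")
print(f"attenuated target beta * 0.5:   {beta * reliability:.3f}")
```

Under these assumptions the naive slope is pulled toward zero by roughly the reliability ratio, which is the kind of bias a measurement error model is meant to correct.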

Co-authors: Dr. Scott H. Holan, Dr. Jonathan R. Bradley, Dr. Christopher K. Wikle.

Paper: Preprint: https://arxiv.org/abs/2211.09797. 

          Publication: (under invited revision) The Annals of Applied Statistics.

Socio-Spatial Neighborhoods

Many existing models of spatial and spatio-temporal data assume that near things are more related than distant things, in line with the first law of geography. While geography may be important, it may not be all-important, for at least two reasons. First, technology helps bridge distance, so that events in one region can be affected by events in distant regions. As a consequence, regions separated by large distances may be more similar than would be expected based on geographical distance alone. Second, geographical, political, and social divisions can make neighboring regions dissimilar. We introduce a novel statistical approach to learning from spatial data which spatial units are close in an unobserved socio-demographic space, and hence which units are similar. As a by-product, the proposed approach helps quantify the importance of socio-demographic space relative to geographical space, along with the associated uncertainty.

Co-authors: Dr. Scott H. Holan, Dr. Michael Schweinberger 

Paper: Preprint: https://arxiv.org/abs/2304.03331

          Publication: (under review) Journal of the American Statistical Association, Applications & Case Studies.

Probability of Regional Seismic Phase Observation

High-frequency seismic wave blockage is often the result of strong attenuation, and the regional phase Sn is particularly prone to blockage compared with the other regional phases, including Lg. Sn is a high-frequency (0.5-5 Hz) shear wave that propagates within the lithospheric mantle lid at a velocity of 4.3-4.7 km/s. Sn is likely blocked by intrinsic attenuation within the uppermost mantle, since we see a good correlation between blockage and young volcanism. Because widespread blockage can make it difficult to estimate source parameters or path attenuation, accurate characterization of efficient regional wave propagation is necessary. We propose a newly developed Bayesian logistic regression model that predicts the probability of phase blockage. As a by-product of our Bayesian approach, we obtain measures of uncertainty for the probability of blockage.
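
To show how a Bayesian logistic regression yields both a blockage probability and an uncertainty measure, here is a minimal sketch on simulated data. It is not the model from the paper: it uses one generic covariate, independent normal priors, and a Laplace (normal) approximation to the posterior, all of which are assumptions made for illustration, as are the function and variable names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_logistic(X, y, prior_sd=10.0, n_iter=25):
    """Posterior mode and approximate posterior covariance for logistic
    regression with independent N(0, prior_sd^2) priors (Laplace approximation)."""
    p = X.shape[1]
    prior_prec = np.eye(p) / prior_sd**2
    beta = np.zeros(p)
    for _ in range(n_iter):  # Newton steps on the log posterior
        mu = sigmoid(X @ beta)
        grad = X.T @ (y - mu) - prior_prec @ beta
        neg_hess = (X * (mu * (1.0 - mu))[:, None]).T @ X + prior_prec
        beta = beta + np.linalg.solve(neg_hess, grad)
    mu = sigmoid(X @ beta)
    neg_hess = (X * (mu * (1.0 - mu))[:, None]).T @ X + prior_prec
    return beta, np.linalg.inv(neg_hess)

rng = np.random.default_rng(2)
n = 2_000
x = rng.normal(size=n)                       # hypothetical path-level covariate
X = np.column_stack([np.ones(n), x])
beta_true = np.array([-0.5, 1.5])            # assumed coefficients for the simulation
blocked = rng.binomial(1, sigmoid(X @ beta_true))   # 1 = phase blocked

beta_map, cov = laplace_logistic(X, blocked)

# Approximate posterior draws give a distribution for the blockage probability
# on a new path with covariate value 1.0.
draws = rng.multivariate_normal(beta_map, cov, size=4_000)
x_new = np.array([1.0, 1.0])
p_draws = sigmoid(draws @ x_new)
p_mean = p_draws.mean()
p_lo, p_hi = np.quantile(p_draws, [0.025, 0.975])
print(f"P(blocked) = {p_mean:.3f}, 95% interval ({p_lo:.3f}, {p_hi:.3f})")
```

The interval around the predicted probability is the uncertainty measure; a full analysis would typically replace the Laplace approximation with posterior sampling.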

Co-authors: Hongjun Hui, Dr. Scott H. Holan, Jingjing Pan, Duyi Li, Dr. Eric A. Sandvol.

Paper: https://doi.org/10.1785/0120220032

Gaussian Pseudo-likelihood Data Sketching

Many modern survey datasets are of substantial size, with the potential for millions of individuals to be included in the sample. As greater model complexity becomes necessary to extract the most information from the data, large sample sizes can hinder computational feasibility. The Bayesian pseudo-likelihood is an important tool for modeling data from complex surveys, as it accounts for potentially informative sample designs while allowing the use of Bayesian hierarchical modeling to capture complex dependence structures. To handle this situation, we propose the use of data sketching within a pseudo-likelihood for Gaussian survey data.
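
The general idea of sketching a weighted Gaussian pseudo-likelihood can be illustrated with a toy regression. The example below is a rough sketch under my own assumptions, not the method developed in this project: it uses simulated data, hypothetical survey weights, a dense Gaussian random projection, and point estimates rather than a full Bayesian hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, m = 5_000, 3, 500   # sample size, covariates, sketch size (all illustrative)

X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(size=n)
w = rng.uniform(1.0, 5.0, size=n)   # hypothetical survey weights

# Gaussian pseudo-likelihood: sum_i w_i * log N(y_i | x_i' beta, sigma^2).
# Its maximizer in beta is weighted least squares, which equals ordinary
# least squares on data scaled by sqrt(w_i).
sw = np.sqrt(w)
A, b = X * sw[:, None], y * sw
beta_full, *_ = np.linalg.lstsq(A, b, rcond=None)

# Data sketching: compress the n scaled rows to m rows with a random
# projection S, then fit on the much smaller sketched data set.
S = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))
beta_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

print("full-data estimate:", np.round(beta_full, 3))
print("sketched estimate: ", np.round(beta_sketch, 3))
```

Here the fit uses 500 sketched rows instead of 5,000 observations; the sketched estimate is noisier than the full-data one, which is the trade-off between computation and precision that any sketching approach has to manage.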

Co-authors: Dr. Scott H. Holan, Dr. Paul Parker.
