November 4, 2014:  Dr. Todd MacKenzie, PhD, Associate Professor, Dartmouth College

“Causal Hazard Ratio Estimation Using Instrumental Variables or Principal Strata”

Estimation of treatment effects is a primary goal of statistics in medicine. Estimates from observational studies are subject to selection bias, while estimates from non-observational (i.e., randomized) studies are subject to bias due to non-compliance. In observational studies, confounding by unmeasured confounders cannot be overcome by regression adjustment, conditioning on propensity scores, or inverse propensity weighting. The method of instrumental variables (IVs) can overcome bias due to unmeasured confounding. In the first part of this talk, a method for using IVs to estimate hazard ratios is proposed and evaluated. In the second part, the principal strata approach to deriving treatment effects in randomized studies subject to all-or-nothing compliance is reviewed, and an estimate of the complier hazard ratio is proposed and evaluated.

 

October 21, 2014:  Dr. Wei Ding, PhD, Associate Professor, UMass Boston

“Data Mining with Big Data”

Big Data concerns large-volume, complex, growing data sets with multiple, autonomous sources.  In this talk, I will give an overview of our recent machine learning and data mining results in feature selection, distance metric learning, and least-squares-based optimization, with applications to NASA mission data analysis, extreme weather prediction, and physical activity analysis for childhood obesity.

October 7, 2014:  Dr. Tam Nguyen, PhD, Assistant Professor, Boston College, Connell School of Nursing

“Application of Item Response Theory in the Development of Patient Reported Outcome Measures: An Overview”

The growing emphasis on patient-centered care has accelerated the demand for high-quality data from patient reported outcome measures (e.g., quality of life, depression, physical functioning).  Traditionally, the development and validation of these measures has been guided by Classical Test Theory.  However, Item Response Theory, an alternative measurement framework, offers promise for addressing practical measurement problems found in health-related research that have been difficult to solve through Classical methods.  This talk will introduce foundational concepts in Item Response Theory, as well as commonly used models and their assumptions.  Examples will be provided that exemplify typical applications of Item Response Theory.  These examples will illustrate how Item Response Theory can be used to improve the development, refinement, and evaluation of patient reported outcome measures.  Greater use of methods based on this framework can increase the accuracy and efficiency with which patient reported outcomes are measured.
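For readers new to the framework, a minimal sketch of the two-parameter logistic (2PL) item response function, one of the commonly used IRT models, is shown below; the item parameters and ability values are invented for illustration and are not taken from the talk.

    import numpy as np

    def irt_2pl(theta, a, b):
        # 2PL model: probability of endorsing an item given latent trait theta,
        # item discrimination a, and item difficulty b.
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    theta = np.linspace(-3, 3, 7)            # hypothetical range of latent trait values
    print(irt_2pl(theta, a=1.5, b=0.0))      # item characteristic curve at those values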

September 16, 2014:  Dr. Amresh Hanchate, Assistant Professor, Health Care Disparities Research Program, Boston University School of Medicine

Did MA reform increase or decrease use of ED services?  An Application of Difference-in-Differences Analysis

This presentation will focus on difference-in-differences regression models as an approach to estimating causal relationships. The method is commonly applied in the context of “natural experiments”; I will examine its application to evaluating the impact of Massachusetts health reform on the use of emergency department (ED) services. Two previous studies applying this approach (Miller 2012 & Smulowitz 2014) obtained contrasting results, one finding an increase in ED use and the other a decrease, following the Massachusetts insurance expansion (2006-2007).  I will report the findings of a comparative assessment of these contrasting results, based on side-by-side replication of the original analyses using similar data.
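For readers unfamiliar with the method, a minimal difference-in-differences sketch follows; the two-group, two-period data and the variable names (ed_visits, treated, post) are hypothetical and only illustrate the interaction-term formulation, not the actual Massachusetts analyses.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical panel: ED visit rates for a reform state (treated=1) and a
    # comparison state (treated=0), before (post=0) and after (post=1) reform.
    df = pd.DataFrame({
        "ed_visits": [40, 42, 38, 39, 44, 47, 37, 38],
        "treated":   [1, 1, 1, 1, 0, 0, 0, 0],
        "post":      [0, 0, 1, 1, 0, 0, 1, 1],
    })

    # The coefficient on treated:post is the difference-in-differences estimate:
    # (treated post - treated pre) minus (control post - control pre).
    model = smf.ols("ed_visits ~ treated + post + treated:post", data=df).fit()
    print(model.params["treated:post"])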

 

September 2, 2014:  Dr. Stavroula Chrysanthopoulou, PhD, Department of Biostatistics, Brown University School of Public Health

 “Statistical Methods in Microsimulation Modeling:  Calibration and Predictive Accuracy”

This presentation is concerned with the statistical properties of MicroSimulation Models (MSMs) used in Medical Decision Making. The MIcrosimulation Lung Cancer (MILC) model, a new, streamlined MSM describing the natural history of lung cancer, has been used as a tool for the implementation and comparison of complex statistical techniques for calibrating and assessing the predictive accuracy of continuous-time, dynamic MSMs. We present the main features of the MILC model along with the major findings and conclusions, as well as the challenges posed by the implementation of the suggested statistical methods.

May 20, 2014:  Dr. Craig Wells, Ph.D., Department of Educational Policy, Research and Administration, UMass Amherst

"Applications of Item Response Theory"

Item response theory (IRT) is a powerful, model-based technique for developing scales and assessments. Due to the attractive features of IRT models, it is the statistical engine used to develop many types of assessments. The purpose of the presentation will be to describe the fundamental concepts of IRT as well as its applications in a variety of contexts. The presentation will address the advantages of IRT over classical methods and describe popular IRT models and their applications.

 

May 6, 2014:  Jeffrey Brown, Ph.D., Department of Population Medicine, Harvard Medical School

FDA's Mini-Sentinel Program to Evaluate the Safety of Marketed Medical Products

“The Sentinel Initiative began in 2008 as a multi-year effort to create a national electronic system for monitoring the safety of FDA-regulated medical products (e.g., drugs, biologics, vaccines, and devices). The Initiative is the FDA’s response to the Food and Drug Administration Amendments Act requirement that the FDA develop a system to obtain information from existing electronic health care data from multiple sources to assess the safety of approved medical products. The Mini-Sentinel pilot is part of the Sentinel Initiative. Mini-Sentinel uses a distributed data approach in which data partners retain control over data in their possession obtained as part of normal care and reimbursement activities. This approach allows Mini-Sentinel queries to be executed behind the firewalls of data partners, with only summary-level or minimum necessary information returned for analysis. The Mini-Sentinel network allows FDA to initiate hundreds of queries a year across a network of 18 data partners and over 350 million person-years of electronic health data. These queries use privacy-preserving approaches that have greatly minimized the need to share protected health data. Mini-Sentinel analyses have been used to support several regulatory decisions.”

 

April 15, 2014:  Jessica Meyers Franklin, Ph.D., Department of Medicine, Division of Pharmacoepidemiology & Pharmacoeconomics, Harvard Medical School

 "High-dimensional simulation for evaluating high-dimensional methods: Comparing high-dimensional propensity score versus lasso variable selection for confounding adjustment in a novel simulation framework"

“The high-dimensional propensity score (hdPS) algorithm has been shown to reduce bias in nonrandomized studies of treatments in administrative claims databases through empirical selection of confounders. Lasso regression provides an alternative confounder selection method and allows for direct modeling of the outcome in a high-dimensional covariate space through shrinkage of coefficient estimates. However, these methods have been difficult to compare because of limitations in ordinary simulation techniques. In this talk, I will discuss a novel "plasmode" simulation framework that is better suited to evaluating methods in the context of a high-dimensional covariate space, and I will present a study in progress that uses this framework to compare the performance of hdPS to that of a lasso outcome regression model for reduction of confounding bias.”
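As background for the comparison, a minimal sketch of an L1-penalized (lasso) outcome regression on a high-dimensional covariate matrix is given below; the simulated data, penalty setting, and the choice to penalize all coefficients are illustrative assumptions and do not reproduce the hdPS algorithm or the plasmode framework described in the talk.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, p = 500, 200                           # many covariates relative to sample size
    X = rng.normal(size=(n, p))               # claims-like covariates (simulated)
    treat = rng.binomial(1, 0.5, size=n)
    # Outcome depends on treatment and a few covariates (the "true" confounders).
    logit = -1.0 + 0.5 * treat + X[:, 0] - 0.8 * X[:, 1]
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    # L1-penalized outcome model: shrinkage selects a sparse set of covariates.
    # In practice the treatment coefficient would usually be left unpenalized.
    design = np.column_stack([treat, X])
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    lasso.fit(design, y)
    print("nonzero coefficients:", int(np.sum(lasso.coef_ != 0)))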

 

Tuesday, April 1, 2014: Presented by: Michael Ash, Ph.D., Chair, Department of Economics, Professor of Economics and Public Policy, University of Massachusetts Amherst

"Critical Replication for Learning and Research"

“Critical replication asks students to replicate a published quantitative empirical paper and to extend the original study either by applying the same model and methods to new data or by applying new models or methods to the same data. Replication helps students come rapidly up to speed as practitioners. It also benefits the discipline by checking published work for accuracy and robustness. Extension gives a practical introduction to internal or external validity and can yield publishable results for students. I will discuss critical replication of three published papers: Growth in a Time of Debt (Reinhart and Rogoff 2010); Mortality, inequality and race in American cities and states (Deaton and Lubotsky 2003); and Stock markets, banks, and growth (Levine and Zervos 1998).”

Social Science & Medicine
Cambridge Journal
International Review of Applied Economics

Tuesday, March 18, 2014: Presented by: Balgobin Nandram, Ph.D., Professor, Mathematical Sciences, Worcester Polytechnic Institute

"A Bayesian Test of Independence for Sparse Contingency Tables of BMD and BMI"

“Interest is focused on a test of independence in contingency tables of body mass index (BMI) and bone mineral density (BMD) for small places. Techniques of small area estimation are implemented to borrow strength across U.S. counties using a hierarchical Bayesian model. For each county a pooled Bayesian test of independence of BMD and BMI is obtained. We use the Bayes factor to perform the test, and computation is performed using Monte Carlo integration via random samples rather than Gibbs samples. We show that our pooled Bayesian test is preferred over many competitors.”
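For readers who want a concrete picture of the computation, a minimal sketch of a Monte Carlo estimate of a Bayes factor for independence in a single 2x2 table follows; the cell counts, the flat Dirichlet priors, and the single-table (non-pooled) setup are illustrative assumptions and do not reproduce the hierarchical small-area model described in the talk.

    import numpy as np

    rng = np.random.default_rng(1)
    table = np.array([[18, 7], [12, 23]])      # hypothetical 2x2 counts (e.g., BMI level by BMD level)
    counts = table.ravel().astype(float)
    draws = 20000

    def mc_marginal_likelihood(cell_probs):
        # Average the multinomial likelihood over prior draws of the cell probabilities.
        # The multinomial coefficient is omitted because it cancels in the Bayes factor.
        loglik = np.log(cell_probs) @ counts
        return np.exp(loglik).mean()

    # Saturated (association) model: flat Dirichlet prior on the four cell probabilities.
    p_sat = rng.dirichlet(np.ones(4), size=draws)

    # Independence model: independent flat priors on the row and column margins;
    # cell probabilities are the outer product of the margins.
    p_row = rng.dirichlet(np.ones(2), size=draws)
    p_col = rng.dirichlet(np.ones(2), size=draws)
    p_ind = (p_row[:, :, None] * p_col[:, None, :]).reshape(draws, 4)

    bf = mc_marginal_likelihood(p_sat) / mc_marginal_likelihood(p_ind)
    print("Bayes factor (association vs independence):", bf)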

Key Words: Bayes factor, Contingency tables, Cressie-Read test, Gibbs sampler, Monte Carlo integration, NHANES III, Power, Sensitivity analysis, Small area estimation.

Tuesday, March 4, 2014: Presented by: Krista Gile, Ph.D., Assistant Professor, Department of Mathematics and Statistics, University of Massachusetts

"Inference and Diagnostics for Respondent-Driven Sampling Data"

“Respondent-Driven Sampling is a type of link-tracing network sampling used to study hard-to-reach populations. Beginning with a convenience sample, each person sampled is given 2-3 uniquely identified coupons to distribute to other members of the target population, making them eligible for enrollment in the study. This approach is effective at collecting large, diverse samples from many populations.

Unfortunately, sampling is affected by many features of the network and sampling process. In this talk, we present advances in sample diagnostics for these features, as well as advances in inference adjusting for such features.

This talk includes joint work with Mark S. Handcock, Lisa G. Johnston and Matthew J. Salganik.”

Tuesday, February 18, 2014: Presented by: John Griffith, Ph.D., Associate Dean for Research, Bouve College of Health Sciences, Northeastern University

 

"Translating Science to Health Care: the Use of Predictive Models in Decision Making"

“Clinical predictive models take information about a patient or subject and synthesize it into a composite score that can then assist with decision making concerning treatment for the individual patient. To be useful, these tools need to accurately categorize the risk of events for patients and their use needs to positively impact treatment decisions and patient outcomes. Statistical approaches can be used for internal validation of these models. However, clinical trials are often needed to show treatment effectiveness. The issues that arise with the development, testing, and implementation of such models will be discussed.”

John Griffith Presentation Slides

Tuesday, February 4, 2014: Presented by: Christopher Schmid, Ph.D., Professor of Biostatistics, Center for Evidence Based Medicine, Brown University School of Public Health

 

"N-of-1 Trials"

“N-of-1 trials are a promising tool to enhance clinical decision-making and patient outcomes. These trials are single-patient multiple-crossover studies for determining the relative comparative effectiveness of two or more treatments within each individual patient. Patient and clinician select treatments and outcomes of interest to them, carry out the trial, and then make a final treatment decision together based on results of the trial. This talk will discuss the advantages and challenges in conducting N-of-1 trials, along with some of the design and analytic considerations. A study to test the effectiveness of the N-of-1 trial as a clinical decision tool comparing patients randomized to N-of-1 vs. usual care is ongoing. The challenges of implementing the decision strategy in such a context will be discussed.”

Christopher Schmid slides

Tuesday, January 21, 2014: Presented by: David MacKinnon, Ph.D., Professor, Arizona State University, Author of "Introduction to Statistical Mediation Analysis"

"Mediation Analysis"

Learning Objective: Understanding and Running Mediation Analyses

Bring in your laptops and run step-wise mediation analyses with the speaker using SAS and free Mplus demo program (http://www.statmodel.com/demo.shtml).
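As a warm-up for the hands-on session, a minimal sketch of the product-of-coefficients approach to a single-mediator analysis is shown below, using simulated data and Python's statsmodels rather than the SAS and Mplus programs used in the seminar; the variable names x, m, and y are placeholders.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 300
    x = rng.normal(size=n)                      # exposure / treatment
    m = 0.5 * x + rng.normal(size=n)            # mediator
    y = 0.4 * m + 0.2 * x + rng.normal(size=n)  # outcome
    df = pd.DataFrame({"x": x, "m": m, "y": y})

    # Path a: exposure -> mediator; paths b and c': mediator and exposure -> outcome.
    fit_a = smf.ols("m ~ x", data=df).fit()
    fit_b = smf.ols("y ~ m + x", data=df).fit()
    a, b = fit_a.params["x"], fit_b.params["m"]

    # Product-of-coefficients estimate of the mediated (indirect) effect,
    # with the first-order Sobel standard error.
    indirect = a * b
    se = np.sqrt(a**2 * fit_b.bse["m"]**2 + b**2 * fit_a.bse["x"]**2)
    print("indirect effect:", indirect, "Sobel z:", indirect / se)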

Dr. David MacKinnon's Seminar Materials

MacKinnon PM2002

MacKinnon MBR2004

MacKinnon AR2007

MacKinnon Slides

Tuesday, December 3, 2013: Presented by: Erin M. Conlon, Ph.D., Associate Professor, Department of Mathematics and Statistics, Lederle Graduate Research Center, University of Massachusetts, Amherst

 

"Bayesian Meta-Analysis Models for Gene Expression Studies"

Biologists often conduct multiple independent gene expression studies that all target the same biological system or pathway. Pooling information across studies can help more accurately identify true target genes. Here, we introduce a Bayesian hierarchical model to combine gene expression data across studies to identify differentially expressed genes. Each study has several sources of variation, i.e. replicate slides within repeated experiments. Our model produces the gene-specific posterior probability of differential expression, which is the basis for inference. We further develop the models to identify up- and down-regulated genes separately, and by including gene dependence information. We evaluate the models using both simulation data and biological data for the model organisms Bacillus subtilis and Geobacter sulfurreducens.

Tuesday, November 19, 2013: Presented by: Jing Qian, Ph.D., Assistant Professor of Biostatistics, Division of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts, Amherst

"Statistical Methods for Analyzing Censored Medical Cost and Sojourn Time in Progressive Disease Process"

To conduct comprehensive evaluations in clinical studies of chronic diseases like cancer, features of the disease process, such as lifetime medical cost and sojourn time in a progressive disease process, are often assessed in addition to overall survival time. However, statistical analysis of these features is challenged by dependent censoring and identifiability issues arising from incomplete follow-up data in clinical studies. In this talk, I will first present a semiparametric regression model for analyzing censored lifetime medical cost, which can be used to address cost differences between treatments in the motivating example of a lung cancer clinical trial. Next, I will discuss how a similar inference approach can be used to estimate sojourn time in a progressive disease process, motivated by a colon cancer study in which patients progress through cancer-free and cancer-recurrence states. Inference procedures and simulation studies will be described. The methods will be illustrated with data from a lung cancer clinical trial and a colon cancer clinical trial.

Thursday, November 7, 2013: Presented by: Bei-Hung Chang, Sc.D., Associate Professor, Boston University School of Public Health, VA Boston Healthcare System

 

"Mind and Body Medicine Research: Study Design and Statistical Method Demonstrations"

The nature of mind/body practices, such as meditation and acupuncture, poses a challenge for evaluating intervention effects. Blinding, randomization, control group selection, and placebo effects are among these challenges. This talk will present two studies that employed innovative study designs to overcome these challenges in investigating the health effects of acupuncture and the relaxation response/meditation. The use of statistical methods in these studies, including a 2-slope regression model and mixed-effects regression models, will also be demonstrated.

Tuesday, October 15, 2013: Presented by: Laura Forsberg White, Ph.D., Associate Professor, Department of Biostatistics, Boston University School of Public Health

"Characterizing Infectious Disease Outbreaks: Traditional and Novel Approaches"

Infectious disease outbreaks continue to be a significant public health concern. Quantitative methods for characterizing an outbreak rapidly are of great interest in order to mount an appropriate and effective response. In this talk, I will review some traditional approaches to doing this and discuss more recent work. In particular, this talk will focus on methods for quantifying the spread of an illness through estimation of the reproductive number. We will also briefly discuss methods to determine the severity of an outbreak through estimation of the case fatality ratio and attack rate. Applications of this work to the 2009 Influenza A H1N1 outbreak will be discussed. We will also discuss methods to estimate heterogeneity in the reproductive number.

Laura Forsberg White, Ph.D. slides

Tuesday, October 1, 2013: Presented by Molin Wang, Ph.D., Assistant Professor, Department of Medicine, Harvard Medical School, Departments of Biostatistics and Epidemiology, Harvard School of Public Health

"Statistical Methods and SAS Macros for Disease Heterogeneity Analysis"

Epidemiologic research typically investigates the associations between exposures and the risk of a disease, in which the disease of interest is treated as a single outcome. However, many human diseases, including colon cancer, type II diabetes mellitus and myocardial infarction, comprise a range of heterogeneous molecular and pathologic processes, likely reflecting the influences of diverse exposures. Molecular Pathological Epidemiology, an approach that incorporates data on the molecular and pathologic features of a disease directly into epidemiologic studies, has been proposed to better identify causal factors and better understand how potential etiologic factors influence disease development. In this talk, I will present statistical methods for evaluating whether the effect of a potential risk factor varies by subtype of the disease, in cohort studies, case-control studies and case-case study designs. Efficiency of the tests will also be discussed. SAS macros will be presented to implement these methods. The macros test overall heterogeneity through the common effect test (i.e., the null hypothesis is that all of the effects of exposure on the different subtypes are the same) as well as pair-wise differences in exposure effects. In adjusting for confounding, the effects are allowed to vary for the different subtypes or they can be assumed to be the same across the different subtypes. To illustrate the methods, we evaluate the effect of alcohol intake on LINE-1 methylation subtypes of colon cancer in the Health Professionals Follow-up Study, in which 51,529 men have been followed since 1986 and 268 cases of colon cancer have occurred. Results are presented for all 3 possible study designs for comparison purposes. This is joint work with Aya Kuchiba and Donna Spiegelman.

Tuesday, September 17, 2013: Presented by Zheyang Wu, Ph.D., Assistant Professor, Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA.

"Genetic Effects and Statistical Power of Gene Hunting Using GWAS and Sequence Data"

Genome-wide association studies (GWAS) use high-density genotyping platforms to reveal single-nucleotide and copy number variants over the whole genome for gene hunting. Although many significant genetic factors have been identified, genes discovered so far account for a relatively small proportion of the genetic contribution to most complex traits, the so-called “missing heritability”. A key statistical research problem in championing the discovery of novel disease genes is to reveal the capacity of association-based detection strategies and to design optimal methods. We study this problem from the view of statistical signal detection for high-dimensional data, while considering three major features of these unfound genetic factors: weak effects of association, sparse signals among all genotyped variants, and complex correlations and gene-gene interactions. In this talk, I will discuss two relevant results. First, we address how gene-gene interaction and linkage disequilibrium among variants influence the capacity of model selection strategies for searching and testing genes. In particular, we developed a novel power calculation framework for model selection strategies to pick up proper signals of disease genes. Second, the requirement for signal strength in gene detection can be reduced when we target the detection of groups of signals instead of individual signals. Specifically, we established a theory of detection boundary, which clarifies the limit of statistical analysis: genetic effects below the boundary are simply too rare and weak to be reliably detected by any statistical method. Meanwhile, we developed optimal tests that work for these minimally detectable signals. These results are also applicable in designing statistical association tests for detecting rare variants in exome or whole-genome sequence data analysis.

2009 ZhaoWuPlosGenetPowerModelSelectionGWAS

NIHMS451862

PaperWu0526

Wu Slides

Tuesday, September 3, 2013: Presented by Raji Balasubramanian, Sc.D., Assistant Professor of Biostatistics, Division of Biostatistics and Epidemiology, UMass Amherst

 

Variable importance in matched case control studies in settings of high dimensional data

In this talk, I’ll describe a method for assessing variable importance in matched case-control investigations and other highly stratified studies characterized by high dimensional data (p >> n). The proposed methods are motivated by a cardiovascular disease systems biology study involving matched cases and controls. In simulated and real datasets, we show that the proposed algorithm performs better than a conventional univariate method (conditional logistic regression) and a popular multivariable algorithm (Random Forests) that does not take the matching into account.

This is joint work with E. Andres Houseman (Oregon State University), Rebecca A. Betensky (Harvard School of Public Health) and Brent A. Coull (Harvard School of Public Health).

Powerpoint slides from presentation

  

Tuesday, May 21, 2013:  Presented by Alexander Turchin, MD, MS, Director of Informatics Research, Department of Endocrinology, Diabetes and Hypertension, Harvard Medical School

Using Electronic Medical Records Data for Clinical Research:  Experience and Practical Implications

Electronic medical records (EMR) systems represent a rich source of clinical data that can be utilized for research, quality assurance, and pay-for-performance, among others. However, it is important to recognize that, like any other data source, EMR data has its own pitfalls that need to be approached in a rigorous fashion. In particular, a large fraction of data in EMR is “locked” in narrative documents and can therefore be especially challenging to extract. This presentation will discuss common flaws in EMR data with a special focus on a systematic approach to using data from narrative electronic documents. The discussion will be illustrated by specific examples of clinical research using EMR data, including narrative text.

Learning Objectives:

1. To understand limitations and caveats of EMR data

2. To learn how to approach development of NLP algorithms

3. To learn how to evaluate NLP algorithms

 

Tuesday, May 7, 2013:  Presented by Tingjian Ge, PhD, Assistant Professor, Department of Computer Science, UMass Lowell

How Recent Data Management and Mining Research can Benefit Biomedical Sciences

Data management (a.k.a. databases, traditionally) and data mining have been active research topics in Computer Science since the 1960s, both in academia and in the research and development groups of companies (for example, IBM Research). In recent years we have seen a surge in this research due to the “big data” trend. On the other hand, various areas in the biomedical sciences are producing increasingly large amounts of data due to the prevalence of automatic data-generating devices. It is natural to consider what some of the most recent results from data management and mining can do for state-of-the-art biomedical research and practice.

In this talk, I will discuss the potential applications of my research in data management and mining to various biomedical studies. They include: (1) complex event detection over correlated and noisy time series data, such as ECG monitoring signals and real-time dietary logs; (2) ranking and pooled analysis of noisy and conflicting data, such as microarray results and emergency medical responses in disaster scenes (e.g., terrorist attacks or earthquakes); and (3) association rule mining on mixed categorical and numerical data, such as the dietary logs, for food recommendation and weight control.

 

Tuesday, April 16, 2013:  Presented by Jeffrey Bailey, MD, PhD,

Computational Approaches for Analyzing Copy Number Variation and Standing Segmental Duplication

Segmental duplication represents the key route for the evolution of new genes within an organism.  Regions of duplication are often copy number variable, providing increased functional diversity.  Detecting regions of duplication and copy number variation is still a challenge even with high-throughput sequencing.  The lecture will review the key methods for identifying duplicated sequence and copy number variant regions within genomic sequence and provide an overview of our laboratory's ongoing work to detect, type, and correlate such regions with phenotype, particularly vis-à-vis malaria.

Tuesday, April 2, 2013:  Presented by Becky Briesacher, PhD

"Offsetting Effects of Medicare Part D on Health Outcomes and Hospitalization?"

This presentation will cover a Medicare Part D policy evaluation and the novel use of time-series and bootstrapping methods. My early results challenge the assumption of the US Congressional Budget Office that Medicare prescription drug costs are offset by medical service savings. I will also describe how we used Pre-Part D data to create simulated post-Part D outcomes. Confidence intervals were constructed using bootstrapping and the test for differences was based on the proportion of simulated values that exceeded/fell below the observed value.

BBriesacher 4/2/2013 Intermediate Level Policy Eval Paper

BBriesacher 4/2/2013 Advanced Methods paper

Tuesday, March 5, 2013:  Presented by David Hoaglin, PhD, Professor, Biostatistics and Health Services Research

 

"Regressions Gone Wrong: Why Many Reports of Regression Analyses Mislead"

Regression methods play an important role in many analyses: multiple regression, logistic regression, survival models, longitudinal analysis. Surprisingly, many articles and books describe certain results of such analyses in ways that lead readers astray. The talk will examine reasons for these problems and suggest remedies.

Speed 2012 DHoaglin 3/5/13

Making Sense DHoaglin 3/5/13

February 19, 2013:  Presented by Wenjun Li, PhD, Associate Professor, Preventive and Behavioral Medicine

Use of Small Area Health Statistics to Inform and Evaluate Community Health Promotion Programs

This presentation discusses the application of small area estimation methods to identify priority communities for public health intervention programs, to tailor community-specific intervention strategies, and to evaluate their effectiveness at the community level.

December 4, 2012:  Presented by Thomas Houston, MD, MPH, Professor and Chief

Comparative Effectiveness Research (CER) Seminar Series -- Pragmatic Clinical Trials (PCT II) (following Bruce Barton's PCT 1 on Sept. 18)

Dr. Houston will describe a series of cluster-randomized trials in which the Internet and informatics were used to support interventions for providers and patients. He will also review the PRECIS tool, a way to characterize pragmatic trials, and the Stages of Implementation Completion (SIC) measure, a time-and-milestone-based method to assess success in implementation.

 November 20, 2012:  Presented by Jennifer Tjia, MD, MSCE, Associate Professor of Medicine

Pharmacoepidemiologic Approaches to Evaluate Outcomes of Medication Discontinuation

The self-controlled case series method, or case series method for short, can be used to study the association between an acute event and a transient exposure using data only on cases; no separate controls are needed. The method uses exposure histories that are retrospectively ascertained in cases to estimate the relative incidence: that is, the incidence of events within risk periods—windows of time during or after experiencing the exposure when people are hypothesized to be at greater risk—relative to the incidence of events within control periods, which include all time before the case experienced the exposure and after the risk has returned to the baseline value. For many researchers, the main appeal of the self-controlled case series method is the implicit control of fixed confounders. We will discuss the application of this method in pharmacoepidemiologic outcomes studies, and explore whether this approach offers advantages over more conventional cohort studies when evaluating adverse drug withdrawal events following medication discontinuation. We will use examples from a linked Medicare Part D and Minimum Data Set database to facilitate discussion.
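As a rough illustration of how such an analysis can be set up, the sketch below fits a self-controlled case series as a Poisson regression with person fixed effects and a log person-time offset, which gives the same exposure estimate as the conditional likelihood; the person-period data are invented, and the example ignores age adjustment and other refinements used in practice.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical case-series data: one row per person-period, with the number of
    # events, the length of the period in days, and an indicator for the
    # post-exposure "risk" window.
    df = pd.DataFrame({
        "person": [1, 1, 2, 2, 3, 3, 4, 4],
        "risk":   [0, 1, 0, 1, 0, 1, 0, 1],
        "days":   [300, 65, 280, 85, 310, 55, 295, 70],
        "events": [1, 1, 2, 1, 0, 1, 1, 2],
    })

    # Poisson model with person fixed effects and a log person-time offset;
    # exp(coefficient on risk) is the relative incidence in the risk window
    # versus each person's own baseline time.
    fit = smf.glm("events ~ risk + C(person)", data=df,
                  family=sm.families.Poisson(),
                  offset=np.log(df["days"])).fit()
    print("relative incidence:", np.exp(fit.params["risk"]))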

November 6, 2012:  Presented by Molin Wang, PhD, Harvard University

Latency Analysis under the Cox Model when the effect may change over time

We consider estimation and inference for latency in the Cox proportional hazards model framework, where time to event is the outcome. In many public health settings, it is of interest to assess whether exposure effects are subject to a latency period, in which the risk of developing disease associated with the exposure varies over time, perhaps affecting risk only during times near the occurrence of the outcome, or perhaps affecting risk only during times preceding a lag of some duration. Identification of the latency period, if any, is an important aspect of assessing risks of environmental and occupational exposures. For example, in air pollution epidemiology, interest often lies not only in the effect of the m-year moving cumulative average air pollution level on the risk of all-cause mortality, but also in point and interval estimation of m itself. In this talk, we will focus on methods for point and interval estimation of the latency period under several models for the timing of exposure effects which have previously appeared in the epidemiologic literature. Computational methods will be discussed. The method will be illustrated in a study of the timing of the effects of constituents of air pollution on mortality in the Nurses’ Health Study.

October 16, 2012:  Presented by Dr. Sherry Pagoto and Deepak Ganesan, PhD

mHealth-based Behavioral Sensing and Interventions

This presentation will review mHealth and sensing research and methodologies at the UMass Amherst and UMass Medical School campuses. We will discuss ongoing research in mobile and on-body sensing to obtain physiological data in the field, and to design a toolkit for processing such data to derive high-quality features, deal with data quality issues (e.g., loose sensors, missing data), and leverage diverse sensor modalities to improve inference quality. To demonstrate the methodologies, we will discuss a recently funded pilot project in which mobile and sensing technology will be used to assess and predict physiological and environmental factors that impact eating behavior. Once eating behavior can be predicted accurately, interventions will be delivered via technology at the precise moments when individuals are most likely to overeat. The purpose of this research is to improve the impact of behavioral weight loss interventions.

October 2, 2012:  Presented by Amy Rosen, PhD

Assessing the Validity of the Agency for Healthcare Research and Quality (AHRQ) Patient Safety Indicators (PSIs) in the VA

This presentation will review general patient safety concepts and ways in which patient safety events are identified.  Background on the PSIs will be provided, and a recent multi-faceted validation study that was conducted in the VA to examine both the criterion and attributional validity of the indicators will be presented.  Two questions will be specifically addressed:  1) Do the PSIs Accurately Identify True Safety Events? 2) Are PSI rates associated with structures/processes of care?

PSI 1  PSI 2

September 18, 2012:  Presented by Bruce Barton, PhD

Pragmatic Clinical Trials:  Different Strokes for Different Folks

Pragmatic clinical trials (PCTs) are relatively new on the clinical research scene and are being proposed routinely for NIH funding.  In a sense, PCTs are comparative effectiveness studies on steroids!  This presentation will discuss the concepts behind this new breed of clinical trial, how PCTs differ from the usual randomized clinical trial, and what to be careful of when developing one.  We will review two PCTs as case studies to look at different approaches to the study design.  The references are two of the more recent papers on PCT methodology and approaches.

PCT1 Reference    PCT2 Reference   PCT Slides

July 17, 2012:  Presented by Dianne Finkelstein, PhD; Mass General Hospital and Harvard School of Public Health, Boston, MA

Developing Biostatistics Resources at an Academic Health Center.

Although biostatistics plays an important role in health-related research, biostatistics resources are often fragmented, ad hoc, or oversubscribed within Academic Health Centers (AHCs).  Given the increasing complexity and quantity of health-related data, the emphasis on accelerating clinical and translational science, and the importance of reproducible research, there is a need for the thoughtful development of biostatistics resources within AHCs.  I will be reporting on a recent collaboration of CTSA biostatisticians who identified strategies for developing biostatistics resources in three areas:  (1) recruiting and retaining biostatisticians; (2) using biostatistics resources efficiently; and (3) improving science through biostatistics collaborations.  Ultimately, it was recommended that AHCs centralize biostatistics resources in a unit rather than disperse them across clinical departments, as the former offers distinct advantages to investigator collaborators, biostatisticians, and ultimately to the success of the research and education missions of AHCs.

May 15, 2012: Presented by George Reed, PhD

Modeling disease states using Markov models with covariate dependence and time varying intervals.

This talk presents an example of modeling transitions among multiple disease states in which measurements are not made at fixed, equal time intervals and the primary interest is in factors associated with the transition probabilities. Both first-order and higher-order Markov models are considered.
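For readers unfamiliar with multi-state models of this kind, the minimal sketch below shows how transition probabilities over unequal observation intervals follow from a transition intensity matrix; the three-state structure and the rate values are invented for illustration, and covariate dependence (not shown) is typically introduced by modeling the rates as log-linear functions of covariates.

    import numpy as np
    from scipy.linalg import expm

    # Hypothetical transition intensity matrix Q for three disease states
    # (each row sums to zero); entries are instantaneous transition rates per year.
    Q = np.array([
        [-0.20,  0.15,  0.05],
        [ 0.10, -0.30,  0.20],
        [ 0.00,  0.00,  0.00],   # absorbing state (e.g., death)
    ])

    # With unequal observation intervals, the transition probability matrix for an
    # interval of length t is the matrix exponential P(t) = expm(Q * t).
    for t in (0.5, 1.0, 2.5):
        print(f"t = {t} years:")
        print(expm(Q * t))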

May 1, 2012: Presented by Becky Briesacher, PhD

"Medicare Prescription Drug program and Using Part D Data for Research"

In 2006, the Medicare program began offering coverage for prescription drugs, and as of June 2008, Part D data have been available to researchers. This presentation will briefly introduce the audience to the Medicare Part D program and Part D data for research purposes. The presentation will include personal reflections on becoming a drug policy researcher and excerpts from my own program evaluation research.

April 17, 2012: Presented by David Hoaglin, PhD

"Indirect Treatment Comparisons and Network Meta-Analysis: Relative Efficacy and a Basis for Comparative Effectiveness"

Evidence on the relative efficacy of two treatments may come from sets of trials that compared them directly (head to head); but often one must rely on indirect evidence, from trials that studied them separately with a common comparator (e.g., placebo) or from a connected network of treatments. The talk will review basic meta-analysis, discuss steps and assumptions in network meta-analysis, and comment on applications to comparative effectiveness.
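As a concrete example of the simplest indirect comparison underlying such networks, the sketch below applies a Bucher-style adjusted indirect comparison of two treatments through a common comparator; the log odds ratios and standard errors are invented for illustration.

    import numpy as np

    # Hypothetical summary data: log odds ratios (with standard errors) of
    # treatment A vs placebo and treatment B vs placebo from separate trials.
    logor_ac, se_ac = -0.40, 0.15    # A vs common comparator C
    logor_bc, se_bc = -0.10, 0.18    # B vs common comparator C

    # Indirect comparison of A vs B through the common comparator:
    # the indirect log OR is the difference, and the variances add.
    logor_ab = logor_ac - logor_bc
    se_ab = np.sqrt(se_ac**2 + se_bc**2)
    ci = (logor_ab - 1.96 * se_ab, logor_ab + 1.96 * se_ab)
    print("indirect log OR (A vs B):", logor_ab, "95% CI:", ci)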

  • ISPOR States Its Position on Network Meta-Analysis
  • Conducting Indirect-Treatment-Comparison and Network-Meta-Analysis Hoaglin 2011 ViH
  • Appendix: Examples of Bayesian Network Hoaglin Appen 2011
  • Jansen 2011 ViH
  • Luce 2010 Millbank

March 20, 2012: Presented by Thomas English, PhD

"Using Allscripts Data at UMass for Clinical Research"

I will discuss work that I have done that has been enabled by EHRs. This should give an idea of how the current EHR at UMass could help your research.

February 28, 2012: Presented by Nancy Baxter, MD, PhD, FRCSC, FACRS

"Room for Improvement in Quality Improvement"

In most circumstances in clinical medicine, randomized clinical trials proving efficacy are required before widespread adoption of interventions. However, in the area of quality improvement, many strategies have been implemented with little supporting evidence. Why is this, and why worry? These topics will be explored in my presentation.

February 21, 2012: Presented by Stephen Baker, MScPH

"Sequentially Rejective Procedures for Multiple Comparisons in Genome Wide Association Studies (GWAS)"

The problem of additive type I error due to multiple comparisons has been well known for many years; however, with the introduction of microarrays and other technologies, it has become one of the central problems in data analysis in molecular biology. Sequential testing procedures have been popular but have limitations with these new technologies. I will discuss some popular methods and some new ones, and illustrate them with microarray data for associating gene expression with disease status.

February 7, 2012: Presented by Arlene Ash, PhD

Risk Adjustment Matters

What variables should be included, and how, in models designed to either detect differences in quality among providers with very different “case-mix” or to isolate the effect of some patient characteristic on outcome? What role does the purpose of the modeling effort play? What are the consequences of different modeling choices? What does “do no harm” mean for a statistical analyst?

January 17, 2012: Presented by Zhiping Weng

Computational Identification of Transposon Movement With Whole Genome Sequencing

Transposons evolve rapidly and can mobilize and trigger genetic instability. In Drosophila melanogaster, paternally inherited transposons can escape silencing and trigger a hybrid sterility syndrome termed hybrid dysgenesis. We developed computational methods to identify transposon movement in the host genome and uncover heritable changes in genome structure that appear to enhance transposon silencing during the recovery to hybrid dysgenesis.

December 20, 2011: Presented by Jacob Gagnon, PhD

Gene Set Analysis Applied to a Leukemia Data Set

Gene set analysis allows us to determine which groups of genes are differentially expressed when comparing two subtypes of a given disease. We propose a logistic kernel machine approach to determine the gene set differences between B-cell and T-cell Acute Lymphocytic Leukemia (ALL). Compared to previous work, our method has some key advantages: 1) our hypothesis testing is self-contained rather than being competitive, 2) we can model gene-gene interactions and complex pathway effects, and 3) we test for differential expression adjusting for clinical covariates. Results from simulation studies and from an application of our methods to an ALL dataset will be discussed.

December 14, 2011: Presented by Yunsheng Ma, MD, PhD

Determinants of Racial/Ethnic Disparities in Incidence of Clinical Diabetes in Postmenopausal Women in the United States: The Women’s Health Initiative 1993-2009

Although racial/ethnic disparities in diabetes risk have been identified, determinants of these differences have not been well-studied. Previous studies have considered dietary and lifestyle factors individually, but few studies have considered these factors in aggregate in order to estimate the proportion of diabetes that might be avoided by adopting a pattern of low-risk behaviors. Using data from the Women’s Health Initiative, we examined determinants of racial/ethnic differences in diabetes incidence.

  • This paper, “Diet, lifestyle, and the risk of type 2 diabetes mellitus in women", by Hu et al., presented ways to analyze diabetes risk factors in aggregate in order to estimate the proportion of diabetes that might be avoided by adopting a pattern of low-risk behaviors.

    Technical Level: Intermediate
    Focus: Application
    Data: Nurses’ Health Study
    Methods: Cox proportional hazards models

November 9, 2011: Presented by: Nanyin Zhang, Ph.D.

In the presentation I will introduce the fundamental mechanisms of fMRI. I will also talk about potential applications of fMRI in understanding different mental disorders.

  • Article #1 (Functional Connectivity and Brain Networks in Schizophrenia), by Mary-Ellen Lynall et al., tested the hypothesis that schizophrenia is a disorder of connectivity between components of large-scale brain networks by measuring aspects of both functional connectivity and functional network topology derived from resting-state fMRI time series acquired at 72 cerebral regions over 17 min from 15 healthy volunteers (14 male, 1 female) and 12 people diagnosed with schizophrenia (10 male, 2 female).

    Technical Level: Intermediate
    Focus: Application
    Data: Real
    Methods: Proof
       
  • Article #2 (Hyperactivity and hyperconnectivity of the default network in schizophrenia and in first-degree relatives of persons with schizophrenia), by Susan Whitfield-Gabrieli, examined the status of the neural network mediating the default mode of brain function in patients in the early phase of schizophrenia and in young first-degree relatives of persons with schizophrenia.

    Technical Level: Intermediate
    Focus: Application
    Data: Real
    Methods: Proof

October 18, 2011: Presented by: Bruce A. Barton, Ph.D.

The Continuing Evolution of Randomized Clinical Trials – the Next Steps

Continuing the discussion initiated by Wenjun Li, Ph.D., at the April QHS/QMC Methods Workshop (“Role of Probability Sampling in Clinical and Population Health Research”), this workshop will discuss some proposed designs for randomized clinical trials (RCTs) that provide partial answers to some of the problems with the current design of RCTs, as well as possible next evolutionary steps in RCT design to better address the primary issues of patient heterogeneity and generalizability of results.

September 20, 2011: Presented by Zi Zhang, MD, MPH

Using Address-Based Sampling (ABS) to Conduct Survey Research

The traditional random-digit-dial (RDD) approach for telephone surveys has become more problematic due to landline erosion and coverage bias. A dual-frame method employing both landlines and cell phones is costly and complicated. We will discuss the use of the U.S. Postal Service Delivery Sequence File as an alternative sampling source in survey research. We will focus on sample coverage and response rates in reviewing this emerging approach.

July 19, 2011: Presented by: Jennifer Tjia, MD, MSCE:

Addressing the issue of channeling bias in observational drug studies

Channeling occurs when drug therapies with similar indications are preferentially prescribed to groups of patients with varying baseline prognoses. In this session, we will discuss the phenomenon of channeling using a specific example from the Worcester Heart Attack Study.

June 21, 2011: Presented by Mark Glickman, PhD:

Multiple Testing: Is Slicing Significance Levels Producing Statistical Bologna?

Procedures for adjusting significance levels when performing many hypothesis tests are commonplace in health/medical studies. Such procedures, most notably the Bonferroni adjustment, control for study-wide false positive rates, and recognize that the probability of a single false positive result increases with the number of tests. In this talk we establish, in contrast to common wisdom, that significance level adjustments based on the number of tests performed are, in fact, unreasonable procedures, and lead to absurd conclusions if applied consistently. We argue that confusion may exist between an increased number of tests being performed with a low (prior) probability of each null hypothesis being true. This confusion may lead to the unwarranted multiplicity adjustment. We finally demonstrate how false discovery rate adjustments are a more principled approach to significance level adjustments in health and medical studies.
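To make the contrast concrete, the sketch below applies both a Bonferroni adjustment and a Benjamini-Hochberg false discovery rate adjustment to the same set of hypothetical p-values; the p-values are invented and the example is not drawn from the talk.

    import numpy as np
    from statsmodels.stats.multitest import multipletests

    # Hypothetical p-values from many hypothesis tests.
    pvals = np.array([0.0005, 0.004, 0.012, 0.03, 0.04, 0.21, 0.38, 0.56, 0.74, 0.91])

    # Family-wise error control (Bonferroni) versus false discovery rate control
    # (Benjamini-Hochberg): the FDR approach typically rejects more hypotheses.
    for method in ("bonferroni", "fdr_bh"):
        reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method=method)
        print(method, "rejections:", int(reject.sum()),
              "adjusted p-values:", np.round(adjusted, 3))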

April 19, 2011: Presented by: Wenjun Li, PhD:

Role of Probability Sampling in Clinical and Population Health Research

This workshop uses practical examples to illustrate the use of probability sampling in RCTs and population health studies. The approach is used to optimize generalizability, increase statistical power, and add value to the collected data by preserving the possibility of subgroup analyses.

March 15, 2011:

The Peters-Belson Approach to study Health Disparities: Application to the National Health Interview Survey

This workshop will discuss how cancer screening rates vary substantially by race/ethnicity; identifying factors that contribute to this disparity between minority groups and the white majority should aid in designing successful programs. The traditional approach for examining the role of race/ethnicity is to include a categorical variable, indicating minority status, in a regression-type model, whose coefficient estimates this effect. We applied the Peters-Belson (PB) approach, used in wage discrimination studies, to analyze disparities in cancer screening rates between different race/ethnic groups from the 1998 National Health Interview Survey (NHIS), and to decompose the difference into a component due to differences in the covariate values in the two groups and a residual difference. Regression models were estimated accounting for the complex sample design. Variances were estimated by the jackknife method, where a single primary sampling unit was considered as the deleted group, and compared to analytic variances derived from Taylor linearization. We found that among both men and women, most of the disparity in colorectal cancer screening and digital rectal exam rates between whites and blacks was explained by the covariates, but the same was not true for the disparity between whites and Hispanics.
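A minimal sketch of the Peters-Belson idea on simulated data is given below: fit the screening model in the majority group, predict screening for the minority group under that model, and split the observed gap into explained and residual parts. The variable names and data are invented, and the sketch ignores the complex survey design and jackknife variance estimation discussed above.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 1000
    # Hypothetical data: screening (0/1), one covariate (income), and group
    # membership (0 = majority, 1 = minority); all values are simulated.
    group = rng.binomial(1, 0.3, size=n)
    income = rng.normal(50 - 8 * group, 10, size=n)
    p = 1 / (1 + np.exp(-(-3 + 0.05 * income - 0.3 * group)))
    screened = rng.binomial(1, p)
    df = pd.DataFrame({"screened": screened, "income": income, "group": group})

    # Fit the model in the majority group only, then predict screening rates for
    # the minority group under that model.  The part of the gap reproduced by the
    # predictions is "explained" by covariates; the remainder is the residual gap.
    fit = smf.logit("screened ~ income", data=df[df.group == 0]).fit(disp=False)
    minority = df[df.group == 1]
    expected = fit.predict(minority).mean()
    observed_min = minority["screened"].mean()
    observed_maj = df.loc[df.group == 0, "screened"].mean()

    total_gap = observed_maj - observed_min
    explained = observed_maj - expected
    print("total gap:", total_gap,
          "explained:", explained,
          "unexplained:", total_gap - explained)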

Dr. Rao would also like to suggest a book for anyone who wants to analyze national surveys: "Analysis of Health Surveys" by Korn EL and Graubard BI, published by Wiley, New York, NY, in 1999.

February 15, 2011:

Multivariable Modeling Strategies: Uses and Abuses

This workshop will be hosted by George Reed, PhD and will discuss regression modeling strategies including predictor complexity and variable selection.  The workshop will examine the flaws and uses of methods like stepwise procedures, and discuss how modeling strategies should be tailored to particular problems.

Dr. Reed would also like to recommend Chapter 4 of Frank Harrell's book, Regression Modeling Strategies.

“REGRESSION MODELING STRATEGIES: Chapter 4: ‘Multivariable Modeling Strategies’ by Frank E. Harrell, Jr. Copyright 2001 by Springer. Reprinted by permission of Springer via the Copyright Clearance Center’s Annual Academic Copyright License.”

November 16, 2010:

Bootstrapping: A Nonparametric Approach to Statistical Inference

This workshop will discuss analytic approaches to situations where the sampling distribution of a variable is not known and cannot be assumed to be normal.  Bootstrap resampling is a feasible alternative to conventional nonparametric statistics and can also be used to estimate the power of a comparison.
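For a concrete example, the sketch below computes a percentile bootstrap confidence interval for a median from a skewed sample; the data are simulated and the number of resamples is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=60)     # skewed sample; normality not assumed

    # Percentile bootstrap: resample with replacement, recompute the statistic,
    # and take empirical quantiles of the resampled statistics as the interval.
    boots = np.array([
        np.median(rng.choice(data, size=data.size, replace=True))
        for _ in range(5000)
    ])
    ci = np.percentile(boots, [2.5, 97.5])
    print("sample median:", np.median(data), "95% bootstrap CI:", ci)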

October 19, 2010:

Propensity Score Analyses, Part II

Last month's workshop spent a lot of time on the propensity score (PS) "basics" and ended with a rather hurried discussion of what variables do and don't belong in a PS model.  This month we will address a range of more advanced issues, including the previously promised discussion of why and when it may not be a good idea to include "all available" variables in a PS analysis, and the pros and cons of PS matching vs. weighting vs. covariate adjustment.
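As background for the discussion, a minimal sketch of the first steps of a weighting-based propensity score analysis is shown below: estimating the propensity score by logistic regression and forming inverse probability of treatment weights. The simulated covariates, variable names, and model specification are illustrative assumptions, not a recommended analysis.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 2000
    age = rng.normal(60, 10, size=n)
    severity = rng.normal(0, 1, size=n)
    # Treatment assignment depends on the covariates (confounding by indication).
    treat = rng.binomial(1, 1 / (1 + np.exp(-(-0.05 * (age - 60) + 0.8 * severity))))
    df = pd.DataFrame({"age": age, "severity": severity, "treat": treat})

    # Propensity score: modeled probability of treatment given covariates.
    ps_model = smf.logit("treat ~ age + severity", data=df).fit(disp=False)
    df["ps"] = ps_model.predict(df)

    # Inverse probability of treatment weights: treated get 1/ps, controls 1/(1-ps).
    df["iptw"] = np.where(df["treat"] == 1, 1 / df["ps"], 1 / (1 - df["ps"]))
    print(df["iptw"].describe())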

September 21, 2010:

Propensity Score Analyses, Part I

This meeting will discuss the separate roles of propensity scores and instrumental variables. Time permitting, we will explore implementation issues in constructing propensity score models.

  • Analyzing Observational Data:  Focus on Propensity Scores (Powerpoint presentation by Arlene Ash, PhD)
  • This draft article, Observational Studies in Cardiology, by Marcus et al., provides a fairly straightforward, non-technical "review of three statistical approaches for addressing selection bias: propensity score matching, instrumental variables, and sensitivity analyses." There are many other places where such issues are discussed.

    Technical Level: Introductory
    Focus: Application
    Data: Real
    Methods: Case Study
  • This paper, "Variable Selection for Propensity Score Models", by Brookhart et al., presented "the results of two simulation studies designed to help epidemiologists gain insight into the variable selection Problem" in a propensity score analysis.

    Technical Level: Intermediate
    Focus: Theory
    Data: Simulated
    Methods: Simulation