Tuesday, May 7, 2013: Presented by Tingjian Ge, PhD, Assistant Professor, Department of Computer Science, UMass Lowell
How Recent Data Management and Mining Research can Benefit Biomedical Sciences
Data management (a.k.a. databases, traditionally) and data mining have been active research topics in Computer Science since the 1960s, both in academia and in the research and development groups of companies (for example IBM Research). In recent years we have seen a surge in this research due to the “big data” trend. On the other hand, various areas in the biomedical sciences are producing increasingly large amounts of data due to the prevalence of automatic data-generating devices. It is natural to consider what some of the most recent results from data management and mining can do for state-of-the-art biomedical research and practice.
In this talk, I will discuss the potential applications of my research in data management and mining to various biomedical studies. They include: (1) complex event detection over correlated and noisy time series data, such as ECG monitoring signals and real-time dietary logs; (2) ranking and pooled analysis of noisy and conflicting data, such as microarray results and emergency medical responses in disaster scenes (e.g., terrorist attacks or earthquakes); and (3) association rule mining on mixed categorical and numerical data, such as the dietary logs, for food recommendation and weight control.
Tuesday, April 16, 2013: Presented by Jeffrey Bailey, MD, PhD
Computational Approaches for Analyzing Copy Number Variation and Standing Segmental Duplication
Segmental duplication represents the key route for the evolution of new genes within an organism. Regions of duplication are also often copy number variant, providing increased functional diversity. Detecting regions of duplication and copy number variation is still a challenge even with high-throughput sequencing. The lecture will review the key methods for identifying duplicated sequence and copy number variant regions within genomic sequence and provide an overview of our laboratory's ongoing work to detect, type, and correlate such regions with phenotype, particularly vis-à-vis malaria.
Tuesday, April 2, 2013: Presented by Becky Briesacher, PhD
"Offsetting Effects of Medicare Part D on Health Outcomes and Hospitalization?"
This presentation will cover a Medicare Part D policy evaluation and the novel use of time-series and bootstrapping methods. My early results challenge the assumption of the US Congressional Budget Office that Medicare prescription drug costs are offset by medical service savings. I will also describe how we used pre-Part D data to create simulated post-Part D outcomes. Confidence intervals were constructed using bootstrapping, and the test for differences was based on the proportion of simulated values that exceeded or fell below the observed value.
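The test described above, comparing an observed value with a set of simulated values, can be sketched as follows. This is only an illustration: the function name `empirical_p_value` and the two-sided doubling rule are assumptions, not details taken from the presentation.

```python
def empirical_p_value(simulated, observed):
    """Two-sided empirical p-value from simulated outcomes (illustrative).

    The p-value is twice the smaller of the two tail proportions:
    the share of simulated values at or above the observed value and
    the share at or below it. The doubling rule is an assumption.
    """
    n = len(simulated)
    share_above = sum(s >= observed for s in simulated) / n
    share_below = sum(s <= observed for s in simulated) / n
    return min(1.0, 2 * min(share_above, share_below))


if __name__ == "__main__":
    # Hypothetical simulated outcomes; not data from the study.
    simulated_outcomes = list(range(1, 101))
    print(empirical_p_value(simulated_outcomes, observed=98))
```

A small p-value here means the observed post-policy value sits in the tail of what the pre-policy data would have predicted.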
Tuesday, March 5, 2013: Presented by David Hoaglin, PhD, Professor, Biostatistics and Health Services Research
"Regressions Gone Wrong: Why Many Reports of Regression Analyses Mislead"
Regression methods play an important role in many analyses: multiple regression, logistic regression, survival models, longitudinal analysis. Surprisingly, many articles and books describe certain results of such analyses in ways that lead readers astray. The talk will examine reasons for these problems and suggest remedies.
Speed 2012 DHoaglin 3/5/13
Making Sense DHoaglin 3/5/13
February 19, 2013: Presented by Wenjun Li, PhD, Associate Professor, Preventive and Behavioral Medicine
Use of Small Area Health Statistics to Inform and Evaluate Community Health Promotion Programs
This presentation discusses the application of small area estimation methods to identify priority communities for public health intervention programs, to tailor community-specific intervention strategies, and to evaluate the effectiveness at the community level.
December 4, 2012: Presented by Thomas Houston, MD, MPH, Professor and Chief
Comparative Effectiveness Research (CER) Seminar Series -- Pragmatic Clinical Trials (PCT II) (following Bruce Barton's PCT 1 on Sept. 18)
Dr. Houston will describe a series of cluster-randomized trials in which they have used the Internet and informatics to support interventions for providers and patients. He will also review the PRECIS tool, a way to characterize pragmatic trials, and the Stages of Implementation Completion (SIC) measure, a time-and-milestone-based method to assess success in implementation.
November 20, 2012: Presented by Jennifer Tjia, MD, MSCE, Associate Professor of Medicine
Pharmacoepidemiologic Approaches to Evaluate Outcomes of Medication Discontinuation
The self-controlled case series method, or case series method for short, can be used to study the association between an acute event and a transient exposure using data only on cases; no separate controls are needed. The method uses exposure histories that are retrospectively ascertained in cases to estimate the relative incidence: the incidence of events within risk periods (windows of time during or after the exposure when people are hypothesized to be at greater risk) relative to the incidence of events within control periods, which include all time before the case experienced the exposure and after the risk has returned to the baseline value. For many researchers, the main appeal of the self-controlled case series method is the implicit control of fixed confounders. We will discuss the application of this method in pharmacoepidemiologic outcomes studies, and explore whether this approach offers advantages over more conventional cohort studies when evaluating adverse drug withdrawal events following medication discontinuation. We will use examples from a linked Medicare Part D and Minimum Data Set database to facilitate discussion.
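As a rough illustration of the relative incidence idea, the sketch below fits the simplest version of the model: one risk period per case, a constant baseline rate, and no age or season effects. The function name, the input layout, and the bisection solver are all assumptions made for this example; real analyses typically use dedicated software and richer models.

```python
import math


def sccs_relative_incidence(cases, lo=1e-6, hi=1e6, tol=1e-10):
    """Conditional maximum likelihood estimate of the relative incidence.

    cases: list of tuples (events_in_risk, events_in_control,
                           risk_time, control_time), one per case.
    Simplifying assumptions: a single risk period and a constant
    baseline rate within each person.
    """
    def score(rho):
        # Derivative of the conditional log-likelihood, multiplied by rho.
        total = 0.0
        for r, c, e1, e0 in cases:
            n = r + c
            total += r - n * rho * e1 / (e0 + rho * e1)
        return total

    # The score decreases in rho, so bisect on the log scale.
    a, b = math.log(lo), math.log(hi)
    while b - a > tol:
        mid = (a + b) / 2
        if score(math.exp(mid)) > 0:
            a = mid
        else:
            b = mid
    return math.exp((a + b) / 2)


if __name__ == "__main__":
    # Hypothetical cases: 30-day risk window, 335 days of control time.
    example = [(1, 0, 30, 335)] * 6 + [(0, 1, 30, 335)] * 20
    print(sccs_relative_incidence(example))
```

When every case has the same risk and control time, the estimate reduces to the ratio of the event rate in the risk window to the event rate in the control time. Because each person is compared only with themselves, fixed characteristics cancel out of the likelihood.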
November 6, 2012: Presented by Molin Wang, PhD, Harvard University
Latency Analysis under the Cox Model when the effect may change over time
We consider estimation and inference for latency in the Cox proportional hazard model framework, where time to event is the outcome. In many public health settings, it is of interest to assess whether exposure effects are subject to a latency period, where the risk of developing disease depending on the exposure level varies over time, perhaps affecting risk only during times near the occurrence of the outcome, or perhaps affecting risk only during times preceding a lag of some duration. Identification of the latency period, if any, is an important aspect of assessing risks of environmental and occupational exposures. For example, in air pollution epidemiology, of interest is often not only the effect of the m-year moving cumulative average air pollution level on risk of all cause mortality, but also point and interval estimation of m itself. In this talk, we will focus on methods for point and interval estimation of the latency period under several models for the timing of exposure which have previously appeared in the epidemiologic literature. Computational methods will be discussed. The method will be illustrated in the study of the timing of the effects of constituents of air pollution on mortality in the Nurses’ Health Study.
October 16, 2012: Presented by Dr. Sherry Pagoto and Deepak Ganesan, PhD
mHealth-based Behavioral Sensing and Interventions
This presentation will review mHealth and sensing research and methodologies at the UMass Amherst and UMass Medical School campuses. We will discuss ongoing research in mobile and on-body sensing to obtain physiological data in the field, and to design a toolkit for processing such data to derive high quality features, deal with data quality issues (e.g. loose sensors, missing data), and leverage diverse sensor modalities to improve inference quality. To demonstrate the methodologies, we will discuss a recently funded pilot project in which mobile and sensing technology will be used to assess and predict physiological and environmental factors that impact eating behavior. Once eating behavior is predictable with accuracy, interventions will be delivered via technology at the precise moments when individuals are the most likely to overeat. The purpose of this research is to improve the impact of behavioral weight loss interventions.
October 2, 2012: Presented by Amy Rosen, PhD
Assessing the Validity of the Agency for Healthcare Research and Quality (AHRQ) Patient Safety Indicators (PSIs) in the VA
This presentation will review general patient safety concepts and ways in which patient safety events are identified. Background on the PSIs will be provided, and a recent multi-faceted validation study that was conducted in the VA to examine both the criterion and attributional validity of the indicators will be presented. Two questions will be specifically addressed: 1) Do the PSIs accurately identify true safety events? 2) Are PSI rates associated with structures/processes of care?
PSI 1 PSI 2
September 18, 2012: Presented by Bruce Barton, PhD
Pragmatic Clinical Trials: Different Strokes for Different Folks
Pragmatic clinical trials (PCTs) are relatively new on the clinical research scene and are being proposed routinely for NIH funding. In a sense, PCTs are comparative effectiveness studies on steroids! This presentation will discuss the concepts behind this new breed of clinical trial, how PCTs differ from the usual randomized clinical trial, and what to be careful of when developing one. We will review two PCTs as case studies to look at different approaches to the study design. The references are two of the more recent papers on PCT methodology and approaches.
PCT1 Reference PCT2 Reference PCT Slides
July 17, 2012: Presented by Dianne Finkelstein, PhD; Mass General Hospital and Harvard School of Public Health, Boston, MA
Developing Biostatistics Resources at an Academic Health Center.
Although biostatistics plays an important role in health-related research, biostatistics resources are often fragmented, or ad hoc, or oversubscribed within Academic Health Centers (AHCs). Given the increasing complexity and quantity of health-related data, the emphasis on accelerating clinical and translational science, and the importance of reproducible research, there is a need for the thoughtful development of biostatistics resources within AHCs. I will be reporting on a recent collaboration of CTSA biostatisticians who identified strategies for developing biostatistics resources in three areas: (1) recruiting and retaining biostatisticians; (2) using biostatistics resources efficiently; and (3) improving science through biostatistics collaborations. Ultimately, it was recommended that AHCs centralize biostatistics resources in a unit rather than disperse them across clinical departments, as the former offers distinct advantages to investigator collaborators, biostatisticians, and ultimately to the success of the research and education missions of AHCs.
May 15, 2012: Presented by George Reed, PhD
Modeling disease states using Markov models with covariate dependence and time varying intervals.
This talk presents an example of modeling transitions among multiple disease states where measurements are not made at fixed and equal time intervals and the primary interest is in factors associated with the transition probabilities. Both first order and higher order Markov models are considered.
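One common way to handle unequal observation intervals is a continuous-time Markov model, in which transition probabilities over any gap follow from the transition rates. The sketch below uses a two-state model with log-linear covariate effects on the rates. It is not necessarily the approach used in the talk, and the function name and coefficient values are hypothetical.

```python
import math


def transition_matrix(t, x, beta01=(-1.0, 0.5), beta10=(-0.5, -0.3)):
    """Two-state continuous-time Markov transition probabilities.

    t: length of the interval between two measurements.
    x: a single covariate value (illustrative).
    beta01, beta10: hypothetical (intercept, slope) pairs for the log
    transition rates from state 0 to 1 and from state 1 to 0.
    """
    q01 = math.exp(beta01[0] + beta01[1] * x)  # rate of leaving state 0
    q10 = math.exp(beta10[0] + beta10[1] * x)  # rate of leaving state 1
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1 - decay)
    p10 = q10 / total * (1 - decay)
    return [[1 - p01, p01], [p10, 1 - p10]]


if __name__ == "__main__":
    # The same covariate value gives different matrices for different gaps.
    for gap in (0.5, 1.0, 3.0):
        print(gap, transition_matrix(gap, x=1.0))
```

Because the rates, not the probabilities, carry the covariate effects, a visit gap of any length can be used in the likelihood without rounding visits to a fixed grid.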
May 1, 2012: Presented by Becky Briesacher, PhD
"Medicare Prescription Drug program and Using Part D Data for Research"
In 2006, the Medicare program began offering coverage for prescription drugs, and as of June 2008, Part D data have been available to researchers. This presentation will briefly introduce the audience to the Medicare Part D program and Part D data for research purposes. The presentation will include personal reflections on becoming a drug policy researcher and excerpts from my own program evaluation research.
April 17, 2012: Presented by David Hoaglin, PhD
"Indirect Treatment Comparisons and Network Meta-Analysis: Relative Efficacy and a Basis for Comparative Effectiveness"
Evidence on the relative efficacy of two treatments may come from sets of trials that compared them directly (head to head); but often one must rely on indirect evidence, from trials that studied them separately with a common comparator (e.g., placebo) or from a connected network of treatments. The talk will review basic meta-analysis, discuss steps and assumptions in network meta-analysis, and comment on applications to comparative effectiveness.
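For the simplest case of two treatments, A and B, each compared with a common comparator C, the indirect estimate of A versus B is the difference of the two direct estimates, and its variance is the sum of their variances (the adjusted indirect comparison often attributed to Bucher and colleagues). The sketch below assumes effects on an additive scale such as log odds ratios; the function name and the input numbers are made up for illustration.

```python
import math


def indirect_comparison(d_ac, se_ac, d_bc, se_bc, z=1.96):
    """Adjusted indirect comparison of A vs. B through comparator C.

    d_ac, d_bc: effect estimates of A vs. C and B vs. C on an additive
                scale (for example, log odds ratios).
    se_ac, se_bc: their standard errors.
    Returns the indirect estimate, its standard error, and a
    normal-approximation confidence interval.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab, (d_ab - z * se_ab, d_ab + z * se_ab)


if __name__ == "__main__":
    # Hypothetical log odds ratios versus placebo.
    estimate, se, ci = indirect_comparison(-0.5, 0.3, -0.2, 0.4)
    print(estimate, se, ci)
```

The summed variance shows why indirect evidence is less precise than a head-to-head trial of the same size, and the calculation is valid only if the two sets of trials are similar enough in their effect modifiers.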
- ISPOR States Its Position on Network Meta-Analysis
- Conducting Indirect-Treatment-Comparison and Network-Meta-Analysis Hoaglin 2011 ViH
- Appendix: Examples of Bayesian Network Hoaglin Appen 2011
- Jansen 2011 ViH
- Luce 2010 Millbank
March 20, 2012: Presented by Thomas English, PhD
"Using Allscripts Data at UMass for Clinical Research"
I will discuss work that I have done that has been enabled by EHRs. This should give an idea of how the current EHR at UMass could help your research.
February 28, 2012: Presented by Nancy Baxter, MD, PhD, FRCSC, FACRS
"Room for Improvement in Quality Improvement"
In most circumstances in clinical medicine, randomized clinical trials proving efficacy are required before widespread adoption of interventions. However, in the area of quality improvement, many strategies have been implemented with little supporting evidence. Why is this, and why worry? These are topics that will be explored in my presentation.
February 21, 2012: Presented by Stephen Baker, MScPH
"Sequentially Rejective Procedures for Multiple Comparisons in Genome Wide Association Studies (GWAS)"
The problem of additive type I error due to multiple comparisons has been well known for many years; however, with the introduction of microarrays and other technologies, it has become one of the central problems in data analysis in molecular biology. Sequential testing procedures have been popular but have limitations with these new technologies. I will discuss some popular methods and some new ones, and illustrate them with microarray data for associating gene expression with disease status.
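A classic sequentially rejective procedure is Holm's step-down method, sketched below as an illustration. The abstract does not say which procedures the talk covers, so the choice of Holm's method, the function name, and the example p-values are assumptions.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure controlling the family-wise error rate.

    Returns a list of booleans in the original order of p_values,
    True where the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for four tests.
    print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

Each rejection relaxes the threshold for the next smallest p-value, which makes the procedure at least as powerful as a plain Bonferroni correction while controlling the same error rate.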
February 7, 2012: Presented by Arlene Ash, PhD
Risk Adjustment Matters
What variables should be included, and how, in models designed to either detect differences in quality among providers with very different “case-mix” or to isolate the effect of some patient characteristic on outcome? What role does the purpose of the modeling effort play? What are the consequences of different modeling choices? What does “do no harm” mean for a statistical analyst?
January 17, 2012: Presented by Zhiping Weng
Computational Identification of Transposon Movement With Whole Genome Sequencing
Transposons evolve rapidly and can mobilize and trigger genetic instability. In Drosophila melanogaster, paternally inherited transposons can escape silencing and trigger a hybrid sterility syndrome termed hybrid dysgenesis. We developed computational methods to identify transposon movement in the host genome and uncover heritable changes in genome structure that appear to enhance transposon silencing during the recovery to hybrid dysgenesis.
December 20, 2011: Presented by Jacob Gagnon, PhD
Gene Set Analysis Applied to a Leukemia Data Set
Gene set analysis allows us to determine which groups of genes are differentially expressed when comparing two subtypes of a given disease. We propose a logistic kernel machine approach to determine the gene set differences between B-cell and T-cell Acute Lymphocytic Leukemia (ALL). Compared to previous work, our method has some key advantages: 1) our hypothesis testing is self-contained rather than being competitive, 2) we can model gene-gene interactions and complex pathway effects, and 3) we test for differential expression adjusting for clinical covariates. Results from simulation studies and from an application of our methods to an ALL dataset will be discussed.
December 14, 2011: Presented by Yunsheng Ma, MD, PhD
Determinants of Racial/Ethnic Disparities in Incidence of Clinical Diabetes in Postmenopausal Women in the United States: The Women’s Health Initiative 1993-2009
Although racial/ethnic disparities in diabetes risk have been identified, determinants of these differences have not been well-studied. Previous studies have considered dietary and lifestyle factors individually, but few studies have considered these factors in aggregate in order to estimate the proportion of diabetes that might be avoided by adopting a pattern of low-risk behaviors. Using data from the Women’s Health Initiative, we examined determinants of racial/ethnic differences in diabetes incidence.
- This paper, “Diet, lifestyle, and the risk of type 2 diabetes mellitus in women", by Hu et al., presented ways to analyze diabetes risk factors in aggregate in order to estimate the proportion of diabetes that might be avoided by adopting a pattern of low-risk behaviors.
| Technical Level | Intermediate |
|---|---|
| Focus | Application |
| Data | Nurses’ Health Study |
| Methods | Cox proportional hazards models |
November 9, 2011: Presented by: Nanyin Zhang, Ph.D.
In the presentation I will introduce the fundamental mechanisms of fMRI. I will also talk about potential applications of fMRI in understanding different mental disorders.
- Article #1 (Functional Connectivity and Brain Networks in Schizophrenia), by Mary-Ellen Lynall et al., tested the hypothesis that schizophrenia is a disorder of connectivity between components of large-scale brain networks by measuring aspects of both functional connectivity and functional network topology derived from resting-state fMRI time series acquired at 72 cerebral regions over 17 min from 15 healthy volunteers (14 male, 1 female) and 12 people diagnosed with schizophrenia (10 male, 2 female).
| Technical Level | Intermediate |
|---|---|
| Focus | Application |
| Data | Real |
| Methods | Proof |
- Article #2 (Hyperactivity and hyperconnectivity of the default network in schizophrenia and in first-degree relatives of persons with schizophrenia), by Susan Whitfield-Gabrieli, examined the status of the neural network mediating the default mode of brain function in patients in the early phase of schizophrenia and in young first-degree relatives of persons with schizophrenia.
| Technical Level | Intermediate |
|---|---|
| Focus | Application |
| Data | Real |
| Methods | Proof |
October 18, 2011: Presented by: Bruce A. Barton, Ph.D.
The Continuing Evolution of Randomized Clinical Trials – the Next Steps: Continuing the discussion initiated by Wenjun Li, Ph.D., at the April QHS/QMC Methods Workshop (“Role of Probability Sampling in Clinical and Population Health Research”), this workshop will discuss some proposed designs for randomized clinical trials (RCTs) which provide partial answers to some of the problems with the current design of RCTs – as well as possible next evolutionary steps in RCT design to better address the primary issues of patient heterogeneity and of generalizability of results.
September 20, 2011: Presented by Zi Zhang, MD, MPH
Using Address-Based Sampling (ABS) to Conduct Survey Research
The traditional random-digit-dial (RDD) approach for telephone surveys has become more problematic due to landline erosion and coverage bias. The dual-frame method, which employs both landlines and cell phones, is costly and complicated. We will discuss the use of the U.S. Postal Service Delivery Sequence File as an alternative sampling source in survey research. We will focus on sample coverage and response rate in reviewing this emerging approach.
July 19, 2011: Presented by: Jennifer Tjia, MD, MSCE
Addressing the issue of channeling bias in observational drug studies
Channeling occurs when drug therapies with similar indications are preferentially prescribed to groups of patients with varying baseline prognoses. In this session, we will discuss the phenomenon of channeling using a specific example from the Worcester Heart Attack Study.
June 21, 2011: Presented by Mark Glickman, PhD
Multiple Testing: Is Slicing Significance Levels Producing Statistical Bologna?
Procedures for adjusting significance levels when performing many hypothesis tests are commonplace in health/medical studies. Such procedures, most notably the Bonferroni adjustment, control for study-wide false positive rates, and recognize that the probability of a single false positive result increases with the number of tests. In this talk we establish, in contrast to common wisdom, that significance level adjustments based on the number of tests performed are, in fact, unreasonable procedures, and lead to absurd conclusions if applied consistently. We argue that confusion may exist between an increased number of tests being performed and a low (prior) probability of each null hypothesis being true. This confusion may lead to the unwarranted multiplicity adjustment. We finally demonstrate how false discovery rate adjustments are a more principled approach to significance level adjustments in health and medical studies.
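To make the contrast concrete, the sketch below places a Bonferroni adjustment next to the Benjamini-Hochberg step-up procedure, one widely used false discovery rate method. The abstract does not name a specific FDR procedure, so this choice, the function names, and the example p-values are assumptions.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at false discovery rate q.

    Finds the largest rank k with p_(k) <= k * q / m and rejects the
    k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank
    reject = [False] * m
    for i in order[:cutoff_rank]:
        reject[i] = True
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for four tests.
    p = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni_reject(p))
    print(benjamini_hochberg_reject(p))
```

Bonferroni divides the significance level by the number of tests regardless of the results, whereas the false discovery rate threshold adapts to how many small p-values are actually observed.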
April 19, 2011: Presented by: Wenjun Li, PhD
Role of Probability Sampling in Clinical and Population Health Research
This workshop uses practical examples to illustrate the use of probability sampling in RCTs and population health studies. The approach is used to optimize generalizability, increase statistical power, and add value to the collected data by preserving the possibility of subgroup analysis.
March 15, 2011:
The Peters-Belson Approach to study Health Disparities: Application to the National Health Interview Survey
Cancer screening rates vary substantially by race/ethnicity, and identifying factors that contribute to this disparity between minority groups and the white majority should aid in designing successful programs. The traditional approach for examining the role of race/ethnicity is to include a categorical variable, indicating minority status, in a regression-type model, whose coefficient estimates this effect. We applied the Peters-Belson (PB) approach, used in wage discrimination studies, to analyze disparities in cancer screening rates between different race/ethnic groups from the 1998 National Health Interview Survey (NHIS), and to decompose the difference into a component due to differences in the covariate values in the two groups and a residual difference. The regression model was estimated accounting for the complex sample design. Variances were estimated by the jackknife method, where a single primary sampling unit was considered as the deleted group, and compared to analytic variances derived from Taylor linearization. We found that among both men and women, most of the disparity in colorectal cancer screening and digital rectal exam rates between whites and blacks was explained by the covariates, but the same was not true for the disparity between whites and Hispanics.
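The decomposition at the heart of the Peters-Belson approach can be sketched in a few lines. The version below is deliberately simplified: it uses an unweighted straight-line fit with a single covariate, whereas the study described above fitted a regression model that accounted for the complex survey design. All names and numbers are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def peters_belson(x_major, y_major, x_minor, y_minor):
    """Split a between-group difference into explained and unexplained parts.

    The model is fitted in the majority group only, then used to predict
    what minority-group members would be expected to have if they were
    treated like majority-group members with the same covariate values.
    """
    intercept, slope = fit_line(x_major, y_major)
    expected_minor = mean(intercept + slope * xi for xi in x_minor)
    return {
        "total": mean(y_major) - mean(y_minor),
        "explained": mean(y_major) - expected_minor,
        "unexplained": expected_minor - mean(y_minor),
    }


if __name__ == "__main__":
    # Made-up covariate values and screening propensities.
    result = peters_belson(
        x_major=[0, 1, 2, 3], y_major=[0.2, 0.4, 0.6, 0.8],
        x_minor=[0, 1, 1, 2], y_minor=[0.1, 0.3, 0.3, 0.5],
    )
    print(result)
```

The explained part reflects differences in covariate values between the groups, and the unexplained part is the residual disparity that remains after those differences are accounted for.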
- This article, "Understanding the Factors Underlying Disparities in Cancer Screening Rates Using the Peters-Belson Approach", is authored by Rao, Sowmya; Graubard, BI; Breen, Nancy; and Gastwirth, Joseph.
| Technical Level | Introductory |
|---|---|
| Focus | Application |
| Data | Real (NHIS, 1998) |
| Methods | Case Study |
- This article, "Using the Peters-Belson method to measure health care disparities from complex survey data", is by Graubard, BI; Rao, Sowmya; and Gastwirth, Joseph.
| Technical Level | Intermediate |
|---|---|
| Focus | Theory and Application |
| Data | Real (NHIS, 1998) |
| Methods | Proof |
Dr Rao also would like to suggest a book for anyone who wants to analyze national surveys. This is "Analysis of Health Surveys" by Korn EL and Graubard BI. It was published by Wiley, New York, NY in 1999.
February 15, 2011:
Multivariable Modeling Strategies: Uses and Abuses
This workshop will be hosted by George Reed, PhD and will discuss regression modeling strategies including predictor complexity and variable selection. The workshop will examine the flaws and uses of methods like stepwise procedures, and discuss how modeling strategies should be tailored to particular problems.
- This article, "Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets", is authored by Steyerberg, Ewout; Eijkemans, Marinus; Harrell Jr, Frank; and Habbema, J. Dik.
| Technical Level | Intermediate |
|---|---|
| Focus | Application |
| Data | Real |
| Methods | Case Study |
- This article, "Selection of important variables and determination of functional form for continuous predictors in multivariable model building", is by Sauerbrei, Willi, Royston, Patrick and Binder, Harald.
| Technical Level | Intermediate |
|---|---|
| Focus | Application |
| Data | Real |
| Methods | Case Study |
Dr Reed would also like to recommend Chapter 4 of Frank Harrell's book, "Regression Modeling Strategies."
“REGRESSION MODELING STRATEGIES: Chapter 4: ‘Multivariable Modeling Strategies’ by Frank E. Harrell, Jr. Copyright 2001 by Springer. Reprinted by permission of Springer via the Copyright Clearance Center’s Annual Academic Copyright License.”
November 16, 2010:
Bootstrapping: A Nonparametric Approach to Statistical Inference
This workshop will discuss analytic approaches to situations where the sampling distribution of a variable is not known and cannot be assumed to be normal. Bootstrap resampling is a feasible alternative to conventional nonparametric statistics and can also be used to estimate the power of a comparison.
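A minimal sketch of the idea is the percentile bootstrap confidence interval below, which resamples the data with replacement and reads the interval off the sorted resampled statistics. The function name, the default of 2000 resamples, and the example data are illustrative assumptions, not material from the workshop.

```python
import random
from statistics import median


def bootstrap_ci(data, stat=median, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    data: list of observations.
    stat: function mapping a list of observations to a number.
    The seed makes the resampling reproducible.
    """
    rng = random.Random(seed)
    n = len(data)
    resampled = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lower = resampled[int((alpha / 2) * n_boot)]
    upper = resampled[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical sample; the interval is for its median.
    sample = list(range(1, 22))
    print(bootstrap_ci(sample))
```

No distributional form is assumed for the statistic; the spread of the resampled values stands in for the unknown sampling distribution.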
October 19, 2010:
Propensity Score Analyses, Part II
Last month's workshop spent a lot of time on the propensity score (PS) "basics" and ended with a rather hurried discussion of what variables do and don't belong in a PS model. This month we will address a range of more advanced issues, including the previously promised discussion of why and when it may not be a good idea to include "all available" variables in a PS analysis, and the pros and cons of PS matching vs. weighting vs. covariate adjustment.
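Of the three uses of the propensity score mentioned above, weighting is the easiest to show compactly. The sketch below computes a normalized inverse-probability-weighted difference in mean outcomes, assuming the propensity scores have already been estimated. The function name and the toy data are hypothetical.

```python
def ipw_mean_difference(treated, outcome, ps):
    """Normalized inverse-probability-weighted treated-vs-control difference.

    treated: list of 0/1 treatment indicators.
    outcome: list of observed outcomes.
    ps: list of estimated propensity scores, each strictly between 0 and 1.
    """
    if not all(0 < p < 1 for p in ps):
        raise ValueError("propensity scores must lie strictly between 0 and 1")
    # Treated subjects are weighted by 1/ps, controls by 1/(1 - ps).
    w_treated = [t / p for t, p in zip(treated, ps)]
    w_control = [(1 - t) / (1 - p) for t, p in zip(treated, ps)]
    mean_treated = sum(w * y for w, y in zip(w_treated, outcome)) / sum(w_treated)
    mean_control = sum(w * y for w, y in zip(w_control, outcome)) / sum(w_control)
    return mean_treated - mean_control


if __name__ == "__main__":
    # Toy data with made-up propensity scores.
    print(ipw_mean_difference(
        treated=[1, 1, 0, 0],
        outcome=[3.0, 5.0, 1.0, 2.0],
        ps=[0.8, 0.4, 0.3, 0.5],
    ))
```

Subjects who received a treatment they were unlikely to receive get large weights, so propensity scores near 0 or 1 can make the estimate unstable. That sensitivity is one of the practical trade-offs between weighting, matching, and covariate adjustment.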
September 21, 2010:
Propensity Score Analyses, Part I
This meeting will discuss the separate roles of propensity scores and instrumental variables. Time permitting, we will explore implementation issues in constructing propensity score models.
- Analyzing Observational Data: Focus on Propensity Scores (Powerpoint presentation by Arlene Ash, PhD)
- This draft article, Observational Studies in Cardiology, by Marcus et al., provides a fairly straightforward, non-technical "review of three statistical approaches for addressing selection bias: propensity score matching, instrumental variables, and sensitivity analyses." There are many other places where such issues are discussed.
| Technical Level | Introductory |
|---|---|
| Focus | Application |
| Data | Real |
| Methods | Case Study |
- This paper, "Variable Selection for Propensity Score Models", by Brookhart et al., presented "the results of two simulation studies designed to help epidemiologists gain insight into the variable selection problem" in a propensity score analysis.
| Technical Level | Intermediate |
|---|---|
| Focus | Theory |
| Data | Simulated |
| Methods | Simulation |