Tuesday, May 27, 2014

Data collection for health outcomes research: twenty-first century options

Here is an interesting article outlining an effort supported by the government of India to compile vital statistics for determining the leading causes of death in that country between 2001 and 2014 (New York Times, Door by Door, India Strives to Know More About Death, May 22, 2014).  While the sentiment is admirable, the approach seems byzantine and early 20th century.  Government registry officials visit the homes of deceased citizens, using autopsy forms as a guide to gathering verbal information about the circumstances surrounding cause of death for individuals who died at home without medical supervision.  Good old-fashioned epidemiology.  Of course, collecting data this way creates enormous opportunity for bias and reliability errors, including those introduced by language barriers, imperfect human recall, and respondents' unwillingness to disclose intimate family details to a stranger representing the national government.  It is also time-consuming:  results will not be final for at least five years.  With great insight, the article's author notes a more modern objection to building data sets of this type, one that might not have been considered even 20 years ago:  "A great reservoir of information will be valuable to public health specialists, but will probably bring little to the families who were its subjects."

Contrast conventional vital statistics data collection methods with crowdsourcing approaches.  The crowdsourcing concept is characterized by collecting large volumes of data, in a systematic manner, at no or very low cost to any single entity.  Millions of ordinary people, fueled by passion for the cause, each contribute information about an observation made during the course of their normal activities, at only a micro-cost to themselves.  Individual contributions are turned into digital data and amassed into a single, gigantic data set so the statisticians can do their magic.  In terms an economist would understand, using crowdsourcing to create a useful data set is like harnessing the antithesis of the forces that lead to the tragedy of the commons.

An interesting article also published by the New York Times (Crowdsourcing, for the Birds, August 19, 2013) provides an illustration of two variations of crowdsourcing used by wildlife biologists to monitor bird populations.  It provides an articulate description of both efforts, each of which harnesses the power of crowdsourcing techniques to produce data for bird epidemiology research:  (1) the Breeding Bird Survey, coordinated by the United States Geological Survey (USGS), and (2) the eBird project, a non-profit, global ornithological network of high-tech data collectors.  The app-driven eBird project seems more productive, efficient, and appealing, despite criticisms lobbed against it by more conventional Bird Survey proponents, including the usual generic validity and reliability snarks.  The Bird Survey, for its part, is adept at organizing and training volunteer bird-watchers to count birds in a systematic, quasi-controlled manner and transfer observations by hand to a paper survey form; it then requires conventional data processing methods to key in and compile data points to produce a final data set.  Still, the final data sets from both projects are freely available to the public, so anyone can use the data for their own analysis.  Open data!  A highlight of the data gathering process for the eBird project is its ability to produce a real-time view of bird populations around the world using heat maps:  a visually appealing, intuitively easy-to-understand depiction of bird species population density, location, and migration over time.  Imagine being able to track human health epidemics, diseases, conditions, and health outcomes in response to various interventions using similar methods.
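At its core, the heat-map idea is just binning individual sightings into geographic grid cells and counting them.  Here is a minimal sketch in Python of that aggregation step; the checklist records and the one-degree grid are made up for illustration, not real eBird data:

```python
from collections import Counter

def grid_cell(lat, lon, cell_deg=1.0):
    """Snap a coordinate to its grid cell (floor division handles negatives)."""
    return (int(lat // cell_deg), int(lon // cell_deg))

# Hypothetical eBird-style checklist records: (species, latitude, longitude)
sightings = [
    ("Barn Swallow", 42.45, -76.47),
    ("Barn Swallow", 42.48, -76.50),
    ("Barn Swallow", 40.71, -74.01),
    ("Wood Thrush",  42.45, -76.47),
]

# Density surface for one species: count of sightings per grid cell
density = Counter(
    grid_cell(lat, lon)
    for species, lat, lon in sightings
    if species == "Barn Swallow"
)

for cell, count in sorted(density.items()):
    print(cell, count)
```

A real heat map would render these per-cell counts as color intensities; the counting step shown here is the data set that makes such a rendering possible.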

My favorite quote from the article is, "Birds are notoriously hard to count."  Really, as if birds are harder to count than any other species?  Scientists who count birds must not have any cross-training in the epidemiology of human health.  How is it that birds are any harder to count than babies?  Adults?  People with heart disease?  People living in poverty or substandard living conditions?  My mind almost short-circuited with thoughts about how crowdsourcing might be applied to compiling data sets for the study of human health and the effects of public policy and other interventions on health outcomes.  Crowdsourcing:  an untapped resource for producing data that could answer crucial questions in medical care research, particularly useful for very large, heterogeneous populations that were heretofore impossible to study because producing the requisite data at a nominal cost was impractical.

Twenty-five hundred grams

Twenty-five hundred grams (2,500 g) is the upper cut-off point for classifying infant birth weight as undesirably low.  A customary public health metric, low birth weight is defined as an infant weight of less than 2,500 g (approximately 5 pounds, 8 ounces), measured within one hour of birth.  As infant birth weight declines, the risk of early death and of acute and chronic morbidity increases.  Low birth weight is a noteworthy public health concern because:  (1) it is a major risk factor for infant mortality; (2) it consumes a disproportionately high level of health care resources; and (3) its prevalence continues to rise despite investment of public resources aimed at improving birth outcomes.
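The classification itself is a simple threshold.  A minimal sketch of the cut-off and the metric-to-customary conversion, assuming the definition above (note that the familiar "5 pounds, 8 ounces" is the customary rounding; the exact conversion of 2,500 g comes out slightly higher):

```python
GRAMS_PER_POUND = 453.59237  # exact definition of the avoirdupois pound

def is_low_birth_weight(grams):
    """Low birth weight: a measured weight under 2,500 g."""
    return grams < 2500

def grams_to_lb_oz(grams):
    """Convert grams to whole pounds plus remaining ounces."""
    total_oz = grams / GRAMS_PER_POUND * 16
    return int(total_oz // 16), total_oz % 16

print(is_low_birth_weight(2499))   # True  -- just under the cut-off
print(is_low_birth_weight(2500))   # False -- 2,500 g itself is not "low"
print(grams_to_lb_oz(2500))        # about (5, 8.18), i.e., roughly 5 lb 8 oz
```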

Infant birth weight is one of the primary measures of infant health at birth, making it a central target of much public health policy, both in the U.S. and globally.  It is a well-documented predictor of neonatal health, infant survival, and future health and productivity (Currie, 2011, p. 12; Martin et al., 2011; Almond, Chay, & Lee, 2005).  Public health entities and policy makers traditionally view infant birth weight as an indicator of overall health at the population level because poor birth weight outcomes reflect maternal undernutrition, chronic ill health, excessive physical exertion or stress, and poor health care during pregnancy (World Health Organization [WHO], n.d.; Stevens-Simon & Orleans, 1999).  The WHO and others also note its importance as an indicator of the effectiveness of a country's or region's health care system in delivering life-saving, life-enhancing interventions to its citizens (Almond et al., 2005; WHO, 2000).

Federal and state governments are heavily invested in financing medical care for childbirth-associated maternal and infant health outcomes.  In the U.S., public funding sources finance the cost of health insurance coverage for a substantial number of infant deliveries and subsequent medical care.  The Federal-State Medical Assistance Program, better known as Medicaid, is the primary source of insurance for publicly financed infant deliveries.  Federal and state governments share the cost of providing public insurance for low-income residents who meet certain eligibility criteria, and states administer their respective programs.

In 2010, the U.S. recorded 3,854,224 births.  Of those, Medicaid provided insurance coverage for 45% (1.75 million) of associated deliveries, and just under half (48%, or 1.86 million) were billed to private payers; the remaining portion (3%, or 131,205) were either uninsured or covered by other miscellaneous government programs (Source:  Agency for Healthcare Research and Quality [AHRQ], Nationwide Inpatient Sample [NIS], n.d.).  AHRQ indicator data that track sources of insurance coverage for all U.S. hospital deliveries illustrate a steadily rising trend in public insurance financing for infant deliveries under current eligibility criteria.  And as federal health care reform progresses toward full implementation, eligibility criteria are likely to expand, at least in many states.


Sunday, May 4, 2014

Sex & Econometrics

You think it's impossible that these two topics are related?  Not so.  Imagine my delight when I came upon the following quote, embedded in a dry old journal article published in 1983 in The American Economic Review.

"Methodology, like sex, is better demonstrated than discussed, though often better anticipated than experienced" (Leamer, 1983, p. 41).

I think it's poetic. 

Although Leamer wrote this paper in 1983 (based on a public lecture presented at the University of Toronto in 1982), it is crammed full of wisdom relevant to current events in applied health research.  Take, for one example, the presumed superiority of experimental over nonexperimental data for producing credible inferences.  Customary assumptions hold that the randomized controlled trial (RCT) is the gold standard in medical care research:  maintaining control over randomization automatically equates to "a rigorous study."  However, Leamer addresses this question directly when he asks, "Is Randomization Essential?" (p. 31).  The answer:  "the randomized experiment and the nonrandomized experiment are exactly the same" (p. 32), at least in terms of drawing credible inferences.

Randomization is customarily the linchpin for judging the value of research today, but it might better be viewed as one consideration among many within the context of overall study design.  Here is a perspective that levels the playing field between research produced by RCTs and by econometric methods as far as their potential to generate valid inferences.  Regardless, all applied research depends on measurement, whether it uses experimental or nonexperimental data.

My third favorite point:  "The job of a researcher is then to report economically and informatively the mapping from assumptions into inferences" (p. 38).  Sensitivity analyses ameliorate doubt about the reality of some assumptions, but others are neither objective nor value-free.  When that is the case, changes in assumptions can change inferences.
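Leamer's mapping from assumptions into inferences can be made concrete with a toy example: the same data yield different conclusions depending on whether we assume a confounding variable can be ignored.  This sketch uses fabricated numbers purely for illustration; y is built as y = 2x + 3z exactly, yet under the assumption that z doesn't matter, the simple regression of y on x attributes part of z's effect to x:

```python
x = [1, 2, 3, 4]
z = [1, 1, 2, 2]                      # a confounder that rises with x
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]   # true direct effect of x is 2

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den

naive = slope(x, y)   # inference under the assumption "ignore z"
print(naive)          # 3.2 -- well above the true direct effect of 2.0
```

Change the assumption (control for z instead of ignoring it) and the inference changes; which assumption is "right" is exactly the kind of judgment Leamer says the researcher must report.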

In conclusion, most facts are really opinions, and many opinions are based on conventions rather than truth.  So it seems that, just as there are similarities between sex and econometrics, questioning assumptions about econometric modeling conventions may resemble the way we determine the truths by which we live our lives.