The Future

One million

The number of people who will fuel the government’s official body of medical data.
The government wants your medical data

The U.S. is constructing its own data set from one million people to be used for medical research; whether it will help people or reinforce structural issues remains a question.

On May 6, America took a big step towards a new medical future. Across the country, universities, hospitals and research institutes began the process of recruiting the biggest research cohort ever assembled, a group that will be at least one million people strong when the project is complete. Over a period of years, these million research participants will contribute genetic samples, information about their lifestyles and living environment, and access to electronic health records and healthcare claims. The aim is to compile a vast dataset on health that will then be made publicly available, with anonymized information presented through a research portal, and more sensitive data available for researchers who have completed bioethics training. But while the benefits of such rich data are large, some researchers have raised concerns that it could also reproduce biases against groups that already suffer from poor health outcomes.

The data collection project, titled the All of Us Research Program, forms part of a presidential Precision Medicine Initiative announced in 2015 by Barack Obama, one aspect of the 44th president’s mission to transform and improve the US healthcare system. “Just like analyzing our DNA teaches us more about who we are than ever before,” Obama said in a speech to the press, “analyzing data from one of the largest research populations ever assembled will teach us more about the connections between us than ever before. And this new information will help doctors discover the causes, and one day the cures, of some of the most deadly diseases that we face.” But where the blame for these medical issues will land is unclear; data like this can pinpoint only personal responsibility, and may allow everyone involved to ignore the real structural issues that are behind poor health.

The All of Us website defines precision medicine as “a revolutionary approach for disease prevention and treatment that takes into account individual differences in lifestyle, environment, and biology.” In more human terms, this means that I, as a 32-year-old male of South Asian descent living in a cold northern climate, could be at greater risk of certain diseases than an older Jewish man who lives in Florida, or might reap greater benefits from certain healthy behaviors than would a woman of the same age. Perhaps the fact that I’m vegan and drink alcohol makes me a candidate for ailments that a sober meat-eater might avoid, or maybe my light build gives me less chance of back problems than someone with a similar lifestyle and a larger frame.

Teasing out correlations and predictions from a sea of data is where computer systems excel over the limited brain power of human analysts. We’ve already seen machine learning used to improve cancer diagnosis or refine depression treatments, and the possible uses multiply by the day, to understandable acclaim. But while there is widespread excitement about the potential of precision medicine, there’s also a danger in thinking that more data and computational power will always bring us closer to solving problems with complex social causes.

In theory, of course, more data on health problems does give us a better understanding. But under a free-market healthcare system, the type of data available is often that which has been collected through systems designed by, and for, the needs of insurance providers first, and patients second. During a recent appearance on Stephen Dubner’s Freakonomics podcast the surgeon and author Atul Gawande told the host: “Our systems are incredibly optimized for sending bills. I can send the bill in three keystrokes. But recording an allergy can be four different screens. So it’s not built to set a goal for care and then accomplish it.” In fact, beyond just creating an administrative burden on medical staff, an ethnographic study conducted in an obstetrical unit found that nurses frequently created a record of care that fit the logic of the Electronic Medical Record system, but was at odds with the actual sequence of events: what the study authors described as a “perfect” but inaccurate account.

There is a tendency to see conclusions drawn from data as objective. But as artificial intelligence experts have pointed out, input data skewed by human bias reflects that bias in the output. In a report titled Fairness in Precision Medicine, researchers Kadija Ferryman and Mikaela Pitcan of the Data & Society Institute explore this issue through a series of interviews with academics and medical professionals. When cataloguing the possibilities for bias in datasets, Ferryman and Pitcan identify risk factors such as a lack of diversity in genetic research data (which globally skews heavily white), the aforementioned problems with electronic health records, and the design of analytical systems by software engineers who have limited exposure to the practice of healthcare.

Besides bias in the datasets, there is also the risk of bias in outcomes: the concern that a precision medicine approach could unintentionally weaken the case for large-scale public health interventions, like taxing sugary drinks or restricting tobacco advertising, by attributing health issues to factors intrinsic to people when they may be caused by environmental, social, or behavioral factors instead.

“One future orientation of precision medicine could really put the responsibility for health outcomes onto the individual without taking into account any of the structural influences on health,” Ferryman told The Outline in a Skype call. “So if you imagine that in the future individuals will be getting very personalized risk reports, or medical interventions, the scope and scale of intervention may then be really narrowed down to the individual.”

There is also a question of whether presenting information is enough to bring about improvements, in a context where the findings are complex.

“One of the visions of the field of precision medicine is that, with increased information about health, you’ll have more autonomy and agency over controlling your own health outcomes,” says Pitcan. “But that relies on the individual receiving the information having a certain level of literacy, and a certain level of access to interventions.” In other words, it’s unsurprising that populations with the worst health outcomes tend to have limited access to healthcare or healthy living choices in the first place, and there’s little to be gained from an algorithm that is able to predict this unless other more substantive changes occur.

For their part, coordinators of the Precision Medicine Initiative frame their work as being one piece of the larger landscape of public health. In an email, Dr Joni Rutter, Director of Scientific Programs for All of Us, said that the research “requires a shared burden on the part of the researchers, the participants, and the providers,” and that “a deeper understanding of how biology, environment, and lifestyle play into overall health provides all of these stakeholders with more weapons to combat disease.” Effective precision medicine should be seen as a way to provide more paths to better health outcomes, she said, rather than being the only path.

In our call, Pitcan and Ferryman recognized that All of Us has worked hard to engage a diverse research cohort, and Dr Rutter confirmed that the program has a target of recruiting 70-75 percent of participants from communities that have been underrepresented in biomedical research, whether due to ethnicity, gender, sexuality or other factors. The project holds the promise of a more accurate dataset than can be collected through indirect means like medical records alone, but it’s important to clarify that this is no guarantee of an equitable distribution of the results of precision medicine research as a whole.

For the healthcare system in America, lack of access to affordable care has always been a bigger problem than a lack of data. The most important factor in making precision medicine work for marginalized groups may be the continuation of Obama’s other great project: the struggle for universal coverage.