bmichelsonz

Data entrepreneurs, I'll choose my own movie. Go study cancer.

by bmichelsonz on 25-04-2012 01:23 PM

Why is it that we can predict a consumer's propensity to read The Hunger Games, upgrade an iPad, or download music featured on The Voice, yet we fail miserably at predicting life-threatening events, such as a woman's propensity to develop breast cancer?

That, in essence, is one of the big questions posed in a new study by the Kauffman Foundation on tackling problems in US health care.

Retail, awash in data:

"Digital merchants, social networks, and data mining entrepreneurs are assembling countless bits and bytes of information about consumer preferences, transactions, and outcomes, agglomerating them, and creating algorithms that can predict what people need, help them find it, and deliver it efficiently."

Breast cancer research, disconnected data islands:

"about 70 percent to 80 percent of women who develop breast cancer do not have a first-degree-relative family history of the disease, so clearly non-genetic factors must be at work. As one clinician puts it, “We’re missing something big.” Or, perhaps more accurately, we’re missing much that is small. The country faces more than 200,000 new breast cancer cases every year, each treated as individual cases, or a few occasionally bundled together for research purposes. Studies can compare selected patient populations in detail on a small scale, and they can make gross comparisons on a large scale. But the factor or factors that cause those 70 percent to 80 percent of unpredictable breast cancer cases are, as of today, falling through the cracks, invisible to the crude optics of the health care sector’s data systems."

Now imagine -- as the study recommends -- what could happen if we applied the same data collection, cleansing, and analysis efforts to health care that we apply in retail:

"Instead, imagine a world in which breast cancer cases, their courses of treatment, and their outcomes were routinely uploaded to a database. Another river of data would flow in from women who have not had breast cancer. Pattern analysis could search and compare many thousands of cases across hundreds of variables for clues as to which factors increase or decrease risk of disease, which methods most effectively and safely extend life, which do so at the lowest cost relative to years gained, which treatments produce highest patient satisfaction, and, no less important, which therapies are not cost-effective.

In principle, with proper privacy safeguards, medical data could be cross-referenced with DNA data to uncover new targets for drug research, to design individualized therapies, and to tailor best-practice guidelines not just for whole diseases but for particular patients. For cancer, for example, doctors prescribe first-line therapies that they know will not work in three-fourths of patients with metastatic breast and colon cancer—they just don’t know which three-fourths. Combining larger datasets on drug response with genomic data on patients could steer therapies to the people they are most likely to help. The result would be to reduce substantially the need for trial-and-error medicine, with all its discomforts, high costs, and sometimes tragically wrong guesses."

So, what's the holdup? According to the study, there are four primary obstacles:

  1. Legal barriers and privacy concerns
  2. Technical and semantic issues
  3. Constraints on talent and expertise
  4. Cultural and policy resistance

The common answer to breaking down these barriers is policy reform with open data provisions. This is exactly the kind of big, slow answer the Kauffman Task Force sought to avoid:

"We canvassed what we call the adjacent possible—that is, incremental, but important, workable reforms that should improve the productivity of health care and its value independent of whether and how the recently enacted Affordable Care Act of 2010 is ultimately implemented. We did not seek giant, dramatic steps; we avoided sweeping claims and rejected purported magic bullets. We believe that a quest for sweeping, comprehensive, one-shot reform is problematic because it misconceives the health care system as an engineered “system” rather than a natural ecosystem, perhaps as intricate and complex as anything to be found in nature.

Instead, we focus primarily on incremental changes which, taken together, can cumulate to significantly advance both productivity of health care and its outcomes."

On the data problem, the task force focused on "strategies that change incentives at the health sector's data choke points, or which bypass those bottlenecks altogether".

One of these strategies is cultivating data entrepreneurs for the health care sector:

"...The role of the data entrepreneur, then, is to invest time and expertise prospecting for patterns. Doing that, in turn, requires that the data supply be reasonably large and the cost of accessing and analyzing it be reasonably low—conditions that do not exist in American health care today. Indeed, the cost of data entrepreneurship is probably higher in health care than in almost any other sector of the economy."

To fill the data entrepreneurs' pipelines, the study recommends four data-related policies:

  1. Portable consent: Allow patients and research subjects in studies to give their consent for their health data to be included in large research databases
  2. Data from outside the health care system: Sidestep both the health care system, which is not designed for data collection, and legal privacy concerns by collecting data outside the medical system
  3. Sharing publicly funded data: Similar to NIH research grants, data developed from federal grants should be publicly available
  4. Curating data: As more data becomes available, interoperability and ease of use become even more important

Having clients in the health care space, I'm not naive about the complexities and politics of the above. But, as a pragmatist, and as one of millions who have suffered loss to cancer, I have to believe there are adjacent possibilities.

[Read the Kauffman Foundation Health care report (PDF).]

Comments
by Tony Barnes (anon) on 25-07-2012 05:26 PM

I read your summary of the Kauffman report, and wanted to respond as a person who has been applying big data math to biological data for a while. The world has tried everything from expensive MRI to dogs with trained noses to find a really early signal for cancer (i.e., later than the BRCA genes, but earlier and with more selectivity than MRI). Each of these studies, done in isolation, has failed to find something of significance. It is my belief that there are very small amounts of predictive information in many observations that are quite subtle. Unfortunately, large databases that measure these subtle variables and also follow women from asymptomatic disease to biopsy-confirmed breast cancer across 4-30 years simply do not exist. Importantly, collecting good data about prospective variables is far more expensive than simply monitoring the keystrokes and behaviors of people visiting the web to consider buying a car.

What I do think is that in the short run there will be a combination of chemical measurements, imaging measurements, and genetic/gene expression signals that will put some asymptomatic women into a higher risk class than any one kind of information alone would. Unfortunately, the companies that could combine their work are incentivized to compete rather than combine, and this hurts all of us.

 

Tony


The HP Input Output site is sponsored by HP and features articles and content from HP and third-party contributors. Third-party articles and content, while paid for by HP, do not necessarily represent the views and opinions of HP. HP does not endorse this content and is not responsible for its accuracy, availability and quality.
