Your Origin Story for Data Science

There’s an origin story for every superhero; even those without superpowers (like Batman – that’s right) got started somewhere. What we sometimes forget is that there is also an origin story for every regular person, every profession, every hobby.

Source: Wikimedia Commons


If you’re a radiologist looking to learn a few things about radiology data science, a simple web search will reveal a seemingly overwhelming amount of material you might have to know.

Fortunately, only a very small subset is necessary to start being productive.  Here are a few resources I used to get started.


The Agitator, Innovator, and Orchestrator Model

A well-written framework in Stanford Social Innovation Review describes three distinct roles in transforming a practice.

An agitator brings the grievances of specific individuals or groups to the forefront of public awareness. An innovator creates an actionable solution to address these grievances. And an orchestrator coordinates action across groups, organizations, and sectors to scale the proposed solution.

The key observation is that transformation requires all three in harmony.  In medicine, the voices of agitators are frequently met with top-down repression or with silence from leadership. “This is just the way we’ve always done it,” they might say.

The Stanford article focuses on building a team consisting of people in all three domains in order to bring about social innovation.  In medicine, practices tend to be resistant to change partly due to the higher stakes but also due to the highly regulated climate of modern health care.  (This is not necessarily good or bad – it just is.)

Although medicine often places more weight on orchestration – coordination of interdisciplinary care to benefit patient health – it stands to reason that a healthy dose of the other two is also necessary. If you see yourself as an agitator, know that a thorough understanding of stakeholder analysis can help you better differentiate between a simple inconvenience and an opportunity to create value. If you are an innovator, your strength may lie in an intuitive visualization of connections between disparate organizational units. Know that what seems obvious to you is probably opaque to others. In the end:

Agitation without innovation means complaints without ways forward, and innovation without orchestration means ideas without impact.

The most dangerous AI – Mike’s blog

Artificial intelligence is the hottest topic in medical informatics.  The promise of intelligent automation in medicine’s future inspires equal parts optimism, hype, and fear.

In this post, Mike Hearn struggles to reconcile the supposedly objective, data-driven approach of AI with the opinion-charged, highly political world from which AI draws its data.

The post focuses on broader applications, but in medicine, a similar problem exists. If AI is expected to extract insight from the text of original research articles, statistical analyses, and systematic reviews, its “insights” are marred by human biases.

The difference, of course, is that AI may bury such biases in a machine learning black box.  We have an increasing body of research on latent human biases, but machine biases are much harder to discover, particularly when they reflect the inherent biases in the data from which the machine draws its conclusions. Our own biases.

AI acts as a mirror. Sometimes we don’t like the face staring back at us.

Source: The most dangerous AI – Mike’s blog

Are You Solving the Right Problems?

This month’s Harvard Business Review has an article highlighting one of the most fascinating emerging trends in quality improvement: the idea that a “root cause” exists may be a myth.  As healthcare QI/QA moves towards eliminating errors and improving metric-based performance, the growing obsession with solving a quality problem is laudable but sometimes misguided.

This excellent HBR article focuses on reframing.  In short, what you say after discovering a complex problem is important.  Before saying “Let’s start making a Pareto chart and collecting some data!” try inserting a 30-second pause with, “Is that the problem we should be solving?”

Without spoiling the fun of reading the article, try thinking through this issue first: you have received multiple complaints about the speed of your building’s elevators.  How would you address this problem?


In fact, the very idea that a single root problem exists may be misleading; problems are typically multicausal and can be addressed in many ways.

Source: Are You Solving the Right Problems?

Propel healthcare data science by solving the boring problems

Data cleaning is boring but critical.

If you have been paying attention to data science in healthcare you will have noticed the gradual shift from 2016’s Big Data to 2017’s Machine Learning.  Specifically, deep learning techniques attract much of the attention. The FDA recently approved the use of deep learning techniques in cardiac diagnoses.  Enlitic promises to automate the process of radiologic diagnosis for medical imaging.  And with the advent of wearables, there is an ever-increasing volume of health data that requires “smart” algorithms to parse out the signal from the noise.

William Chen’s answer to What are the top 5 skills needed to become a data scientist? – Quora

A Quora answer/article about data science.

Incidentally, the same 5 skills are also highly relevant to being a physician-informatician, particularly in radiology.  Give it a read.

Source: William Chen’s answer to What are the top 5 skills needed to become a data scientist? – Quora

DICOM Processing and Segmentation in Python – Radiology Data Quest

There is something strangely satisfying about being able to take things apart and put them back together.  Inspired by the popularity of Lego sets in our childhoods, Minecraft brought this sense of wonder to video games.

For those of us who are life-long tinkerers and happen to be radiologists, I published in Radiology Data Quest a DIY guide on how to take DICOM apart and manipulate it.  All in Python, no less.
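
To give a flavor of what "taking DICOM apart" can look like, here is a minimal sketch, not the code from the linked tutorial. It assumes a CT slice whose stored pixel values have already been read into nested lists, and it applies the standard linear rescale (stored value × slope + intercept) to get Hounsfield units before a simple threshold. The function names, the sample values, and the 300 HU bone cutoff are all hypothetical choices for illustration.

```python
# Illustrative sketch only: function names, sample values, and the
# 300 HU bone cutoff are assumptions, not taken from the linked tutorial.

def to_hounsfield(stored_pixels, rescale_slope, rescale_intercept):
    """Apply the linear DICOM rescale: HU = stored value * slope + intercept."""
    return [
        [value * rescale_slope + rescale_intercept for value in row]
        for row in stored_pixels
    ]


def threshold_mask(hu_image, lower, upper):
    """Mark pixels whose HU value lies in [lower, upper] (a crude segmentation)."""
    return [[lower <= hu <= upper for hu in row] for row in hu_image]


# A made-up 2x2 "slice" of stored values (air, water, soft tissue, dense bone).
stored = [[0, 1024], [1064, 2024]]

hu = to_hounsfield(stored, rescale_slope=1, rescale_intercept=-1024)
bone_mask = threshold_mask(hu, lower=300, upper=3000)

print(hu)         # [[-1024, 0], [40, 1000]]
print(bone_mask)  # [[False, False], [False, True]]
```

With a real file, the slope and intercept would come from the DICOM header (the RescaleSlope and RescaleIntercept attributes) rather than being typed in by hand, and an array library would replace the nested lists.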


DICOM is a pain in the neck.  It also happens to be very helpful.  As clinical radiologists, we expect post-processing, even taking them for granted. However, the magic that occurs behind the scene…

Source: DICOM Processing and Segmentation in Python – Radiology Data Quest