How Will AI Acknowledge Our Differences in Healthcare?

A few weeks ago for a hackathon, my team was looking into heart attacks for our heart attack detection wearable. Halfway through our research, we learned that women and men can actually experience different heart attack symptoms, which often go unacknowledged. This led us in a new direction: a woman-centered heart attack wearable utilizing AI trained on female-specific data called Heartware.

It was this project that made me realize the need for awareness of bias in AI, especially in the context of healthcare.

Why is AI being used in the healthcare industry?

AI and machine learning can be leveraged as tools to help healthcare workers make faster, cheaper, and more accurate diagnoses, such as:

  • Diagnosing skin cancer like a dermatologist
  • Analyzing a CT scan for strokes like a radiologist
  • Detecting cancer on a colonoscopy like a gastroenterologist

Put simply, AI works by training on past data to make predictions about future data it hasn’t seen before. Just as an experienced doctor has studied many examples of lesions that are and aren’t skin cancer, AI can learn from labeled examples to classify new cases with precision.
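The train-then-predict loop described above can be sketched with a toy 1-nearest-neighbor classifier. Everything here is illustrative: the lesion features and values are made up, and real diagnostic models are far more sophisticated.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(training_data, new_example):
    """Label a new example with the label of its closest training example."""
    nearest = min(training_data, key=lambda item: distance(item[0], new_example))
    return nearest[1]

# "Training": past, labeled examples as (features, label) pairs.
# Hypothetical features: (lesion diameter in mm, border irregularity 0-1)
training_data = [
    ((2.0, 0.1), "benign"),
    ((1.5, 0.2), "benign"),
    ((6.5, 0.9), "malignant"),
    ((7.0, 0.8), "malignant"),
]

# "Inference": predicting on data the model hasn't seen before.
print(predict(training_data, (6.8, 0.85)))  # → malignant
print(predict(training_data, (1.8, 0.15)))  # → benign
```

The key point is that the model can only be as good (and as fair) as the labeled examples it learned from.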

Why do biases matter in healthcare?

In certain cases, such as the impact of gender on heart attacks, factors relating to one’s identity may actually affect the way certain conditions need to be diagnosed or treated.

In healthcare, it is important to acknowledge these factors for the benefit of the patient when designing algorithms for these use cases. On the flip side, it is just as important not to exacerbate the systemic divides already deeply ingrained in our healthcare system.

Most often, the harmful bias in AI emerges from a training dataset that does not accurately represent the population it will be deployed on — such as a skin cancer detection algorithm that was not trained on enough data from darker-skinned individuals.

Unwanted AI bias can creep into healthcare data through:

  • Samples: We must ensure that there is adequate diversity in training datasets.
  • Design: What was the original intention behind creating the algorithm? Was it to maximize profit?
  • Usage: Physicians must be aware of the details of the AI models they’re using to ensure ethical deployment.
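As a sketch of the "Samples" point, here is one simple representativeness check: compare each group's share of the training data against its share of the population the model will serve. The group labels and percentages below are hypothetical.

```python
from collections import Counter

def representation_gap(train_labels, population_share):
    """Return each group's share of the training data minus its share
    of the target population (negative = underrepresented)."""
    counts = Counter(train_labels)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_share.items()
    }

# Hypothetical skin-tone groups in a dermatology training set
train_labels = ["light"] * 80 + ["medium"] * 15 + ["dark"] * 5
population_share = {"light": 0.60, "medium": 0.25, "dark": 0.15}

for group, gap in representation_gap(train_labels, population_share).items():
    flag = "UNDERREPRESENTED" if gap < -0.05 else "ok"
    print(f"{group}: {gap:+.2f} ({flag})")
```

A check like this won't catch every problem, but it makes the gap between training data and deployment population explicit before the model ever sees a patient.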

Case Study: Healthcare Risk-Management Algorithm

Consider an algorithm that detects which patients are at the highest risk and may benefit most from specialized care programs, and classifies which individuals should qualify for this extra care.

This type of algorithm is used on over 200 million individuals in the US. However, a 2019 study found significant racial bias in this algorithm: at the same risk score, Black patients had 26.3% more chronic illnesses than their white counterparts.

The algorithm’s ground truth rested on the assumption that patients who spend more money on healthcare have greater health needs. Cost was a convenient proxy: the data was easy to obtain, and it was a clean quantitative indicator that made data preparation simple.

However, when Black and white patients spent the same amount, it did not reflect the same level of need, because healthcare costs and race are highly correlated: patients of color are more likely to face reduced access to medical care due to time, location, and cost constraints.

This bias, rooted in a flawed ground truth, went unidentified until the algorithm had already been deployed on millions of Americans. This exemplifies the importance of considering context when building technology meant to truly benefit everyone.
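A toy simulation can illustrate how the flawed ground truth produces this disparity. Assuming, purely for illustration, that two groups have identical health needs but one faces access barriers that cut its spending in half, an algorithm that ranks patients by spending will flag the two groups at very different rates:

```python
import random

random.seed(0)

def simulate_patient(access_barrier):
    """True need is drawn identically for everyone; access barriers
    cut spending roughly in half at the same level of need."""
    need = random.uniform(0, 10)
    spending = need * (0.5 if access_barrier else 1.0)
    return need, spending

group_a = [simulate_patient(access_barrier=False) for _ in range(10_000)]
group_b = [simulate_patient(access_barrier=True) for _ in range(10_000)]

# The algorithm ranks patients by spending and flags the top 20%
# across both groups as "high risk".
all_spending = sorted((s for _, s in group_a + group_b), reverse=True)
cutoff = all_spending[len(all_spending) // 5]

def flagged_share(group):
    """Share of a group flagged as high risk by the spending cutoff."""
    return sum(1 for _, s in group if s > cutoff) / len(group)

print(f"Group A flagged as high risk: {flagged_share(group_a):.0%}")
print(f"Group B flagged as high risk: {flagged_share(group_b):.0%}")
```

Despite identical underlying need, the group with access barriers is flagged far less often, which is exactly the pattern the 2019 study found when cost stood in for need.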

How can we improve our algorithms?

Because AI has so much potential for good in the healthcare industry, it is unreasonable to rule it out completely. So what can be done to improve?

Obtaining Better Data

At the moment, it is very difficult to compile a large, diverse, high-quality medical dataset.

The public has grown more reluctant to share their data, even for a worthy cause, and hospitals have little incentive to share theirs for fear of losing patients to competitors. In addition, privacy laws protecting data, the sanctity of medical data, and the consequences of errors in sharing it all make good data harder to obtain.

On top of this, there is also a technical barrier due to the limited interoperability between medical record systems.

Many datasets will be inherently biased as a result of the sourcing method; for example, military datasets underrepresent women because most service members are male. Therefore, datasets used to train medical AI at scale will probably need to be curated in a very intentional manner.

Diversity Among Developers

A diverse training dataset alone will not guarantee the elimination of unwanted bias.

A lack of diversity among the developers and investors behind medical AI tools can implant bias through the framing of problems from the perspective of majority groups, as in the case study above. Implicit biases and unexamined assumptions about data can let major biases go unnoticed.

What to look out for:

Steps to spot bias before an algorithm is deployed.

  1. Audit algorithms for potential pre-identified biases.
  2. Dig deeper into where and how the data was obtained and look for flaws in the data that could lead to bias.
  3. Consider how this will be deployed across a diverse array of patient populations.
  4. Follow this process into deployment by continually monitoring bias in real-time for unanticipated outcomes.
  5. At every step, ensure communication and transparency with providers and patients.
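Step 1 above can be sketched as a simple subgroup audit: compare the model's false-negative rate (the share of truly high-need patients it failed to flag) across patient groups. All records below are hypothetical.

```python
def false_negative_rate(records):
    """Share of truly high-need patients the model failed to flag."""
    positives = [r for r in records if r["needs_care"]]
    missed = [r for r in positives if not r["flagged"]]
    return len(missed) / len(positives) if positives else 0.0

# Each record: did the model flag the patient, and did they truly need care?
records = [
    {"group": "A", "flagged": True,  "needs_care": True},
    {"group": "A", "flagged": True,  "needs_care": True},
    {"group": "A", "flagged": False, "needs_care": True},
    {"group": "B", "flagged": False, "needs_care": True},
    {"group": "B", "flagged": False, "needs_care": True},
    {"group": "B", "flagged": True,  "needs_care": True},
]

for group in ("A", "B"):
    subset = [r for r in records if r["group"] == group]
    print(f"Group {group} false-negative rate: {false_negative_rate(subset):.0%}")
```

A large gap between groups is a red flag worth investigating, and the same check can keep running after deployment, as step 4 suggests.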

How will we move forward?

There is still a lot we don’t know.

A potential solution to the data issue is a “more with less” approach to training models. This would hopefully allow us to create accurate models with more limited data, decreasing the need for huge datasets.

There are also many questions regarding how these algorithms will be used. Will health insurance companies use AI to rack up insurance costs for people of color due to their higher risk? Will different treatments be used depending on a patient’s insurance status or ability to pay? We must keep asking these questions as we phase more AI into our decision-making.

The 21st century physician will need to have at least a basic understanding of how the algorithms they use work and who they were built for.

Key Takeaways

  • It is especially critical in the healthcare field that we are aware of both the harmful biases that exist (like racial bias in risk assessment) and the necessary distinctions that are missing (like sex-specific differences in heart attack symptoms).
  • Algorithms built upon initial assumptions about data that are missing important context (such as the correlation between race and healthcare costs) can be incredibly harmful in the long-run.
  • Though necessary, obtaining more diverse medical datasets is extremely challenging; a “more with less” approach to training models could help.