Machine learning systems today are not free of bias, and in medicine that bias can degrade a model's performance and skew clinical decisions.
AI is only as good as its data, and that data reflects the people who create it. AI models are trained on a pre-set data input, and if that data carries any bias, the trained models will pick it up and reproduce it in their predictions.
That said, a completely unbiased human mind may not exist, so complete impartiality cannot be expected from an AI system either, at least until the “noise” can be cleaned out.
Many AI prediction models are being tested in medical diagnostics. These models predict whether a patient is likely to develop a certain disease, using demographic factors and other medical data entered by humans. Even so, bias can creep in.
Researchers in medicine across the world are therefore trying to ensure the “fairness of the AI models’ underlying algorithms”. Models need to be developed on data that covers a diverse set of medical information across diverse populations.
In a recent study, medical researchers examined a group recalibration approach and found that it yielded a better risk calculator for guiding prescription decisions for people of different races.
So, what is this approach about? It re-adjusts the risk model for each subgroup of patients so that predicted risk matches the frequency of outcomes actually observed in that subgroup. An alternative approach, known as ‘equalized odds’, constrains the model so that error rates are similar across all groups; in the study, the decisions it produced were not found to match the guidelines’ recommendations.
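As a rough illustration, group recalibration can be implemented by fitting a separate calibration step for each subgroup, so that each subgroup's predicted risks line up with its observed outcome rates. The sketch below is a minimal, assumed implementation using isotonic regression from scikit-learn; the arrays, subgroup labels, and choice of calibrator are illustrative assumptions, not the method of any particular study.

```python
# Minimal sketch of group recalibration (illustrative assumptions throughout):
# fit one calibrator per subgroup so that predicted risk tracks the outcome
# frequency actually observed in that subgroup.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def recalibrate_by_group(raw_scores, outcomes, groups):
    """Fit one isotonic calibrator per subgroup and return them in a dict."""
    calibrators = {}
    for g in np.unique(groups):
        mask = groups == g
        cal = IsotonicRegression(out_of_bounds="clip")
        cal.fit(raw_scores[mask], outcomes[mask])  # map raw score -> observed risk
        calibrators[g] = cal
    return calibrators

def predict_calibrated(calibrators, raw_scores, groups):
    """Apply each subgroup's calibrator to that subgroup's members."""
    calibrated = np.empty(len(raw_scores), dtype=float)
    for g, cal in calibrators.items():
        mask = groups == g
        calibrated[mask] = cal.predict(raw_scores[mask])
    return calibrated
```

The design choice worth noting is that the base risk model is left untouched; only the mapping from its raw score to a probability is re-fitted within each subgroup.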
Thus, when building such algorithms, the context of the population under consideration needs to be accounted for in order to keep bias out.
There is a need to work towards fair evaluation of disease risk prediction models, and thus greater accountability when their predictions inform diagnoses.
Caution in Clinical Guidelines
Tools and apps that are increasingly used alongside clinical practice guidelines should be designed with “algorithmic fairness” in mind so that they apply fairly across a diverse population.
Consider atherosclerotic cardiovascular disease, a condition in which fats, cholesterol, and plaque narrow the artery walls; it can lead to strokes and kidney failure.
The American College of Cardiology and the American Heart Association have both provided guidelines and recommendations on when patients with cardiovascular disease should start medications called statins. Statins are drugs known to reduce levels of the cholesterol that causes the narrowing of arterial walls. The guidelines suggest the use of a calculator that estimates the patient’s risk of developing the disease within 10 years from a defined set of parameters.
This can introduce bias when recommending statins to patients identified as being at intermediate or high risk, as against those estimated to be at borderline risk.
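For context, the sketch below shows how a 10-year risk estimate is typically mapped to a statin recommendation through fixed cut-offs. The thresholds follow commonly cited ACC/AHA risk categories, but the function and its wording are purely illustrative assumptions, not the guidelines' exact text.

```python
# Hypothetical illustration: mapping an estimated 10-year ASCVD risk to a
# risk category via fixed cut-offs. Thresholds and wording are illustrative,
# not quoted from the guidelines.
def statin_recommendation(ten_year_risk: float) -> str:
    """Map an estimated 10-year risk (0.0-1.0) to a risk category."""
    if ten_year_risk < 0.05:
        return "low risk: statin generally not indicated"
    elif ten_year_risk < 0.075:
        return "borderline risk: discuss risk-enhancing factors"
    elif ten_year_risk < 0.20:
        return "intermediate risk: statin generally recommended"
    else:
        return "high risk: statin recommended"

print(statin_recommendation(0.12))  # falls in the intermediate-risk band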
Currently, especially during and after the pandemic, medical decision-support calculators have proliferated on phones and other devices used in clinical settings, so such apps are often right at hand. The algorithms behind them, however, can carry inherent risks and dangers, and those can translate into bias in the clinical decision-making outcome.
Lessening the Risk of Perpetuating Bias
Although ML and AI in healthcare have made great strides, complex data-driven prediction models require careful interpretation against the relevant clinical guidelines before they are applied and disseminated in treatment decisions.
Those following clinical guidelines need to understand that although a model may perform well on individual performance characteristics, it may not always translate into the “correct and actionable” clinical decision required. An algorithm may predict a disease state correctly, yet once other parameters or comorbidities are added it may contribute very little to the clinical decision-making.
Nonetheless, being aware of the bias and of the existing constraints, and working towards fairness when designing the algorithm, has made it easier to use healthcare information in clinical decision-making.
A framework for structured quality assessment of the entire AI-based prediction model (AIPM) is much needed to ensure the safe and responsible application of AIPMs in healthcare. It can help to prevent faulty decision-making based on overfitted models. For example, inaccurate computer-derived ECG interpretations can lead to a faulty diagnosis; this happens when algorithms are derived in populations that are distinct from the populations to which they are applied.
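One piece of such a quality assessment can be sketched as a simple external-validation check: compare the model's discrimination on the cohort it was developed in with its discrimination on a cohort drawn from a different population. The example below assumes a scikit-learn-style classifier and hypothetical development and external datasets.

```python
# Sketch of an external-validation check (data loading is assumed/hypothetical):
# a large gap between development and external performance is a warning sign of
# overfitting or of applying the model to a different population.
from sklearn.metrics import roc_auc_score

def external_validation_gap(model, X_dev, y_dev, X_ext, y_ext):
    """Return AUCs on the development and external cohorts, and their gap."""
    auc_dev = roc_auc_score(y_dev, model.predict_proba(X_dev)[:, 1])
    auc_ext = roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1])
    return auc_dev, auc_ext, auc_dev - auc_ext
```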
Thus, clinical decision-makers, and the algorithm designers as well, should account for which fairness metrics are the “correct” ones to use when evaluating the predictions, and which metrics should be employed for model adjustment. They need to keep in mind that any erroneous prediction could lead to an adverse clinical decision.
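As a hedged sketch of what such an evaluation can look like, the snippet below computes two candidate fairness views for each subgroup: calibration-in-the-large (mean predicted risk versus observed outcome rate) and the per-group error rates that ‘equalized odds’ compares. The arrays, group labels, and decision threshold are placeholders, not a clinical specification.

```python
# Illustrative per-subgroup fairness metrics (all inputs are NumPy arrays and
# the 0.5 threshold is a placeholder, not a clinical cut-off).
import numpy as np

def per_group_metrics(y_true, y_prob, groups, threshold=0.5):
    metrics = {}
    for g in np.unique(groups):
        m = groups == g
        y, p = y_true[m], y_prob[m]
        pred = (p >= threshold).astype(int)
        fpr = ((pred == 1) & (y == 0)).sum() / max((y == 0).sum(), 1)
        fnr = ((pred == 0) & (y == 1)).sum() / max((y == 1).sum(), 1)
        metrics[g] = {
            "mean_predicted_risk": p.mean(),      # calibration-in-the-large
            "observed_outcome_rate": y.mean(),
            "false_positive_rate": fpr,           # equalized odds compares these
            "false_negative_rate": fnr,           # two rates across groups
        }
    return metrics
```

Which of these views should drive model adjustment is the choice that decision-makers and designers are being asked to make explicitly.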
In conclusion, a diverse sample that includes a diverse patient population, varied means of data collection, rigorous external validation research practices, and rigorous assessment of the model is very much needed to keep bias at bay.