/ 29 May, 2025
/ Hann Yee Son

A recipe for good SOUP: Part 3 — Large Language Models as SOUP

In the modern software landscape, leveraging software of unknown provenance (SOUP) presents both benefits and risks. Nowhere is this trade-off more pronounced than with large language models (LLMs), which offer extraordinary potential for Software as a Medical Device (SaMD) development.

Capping off our blog series on SOUP, we’ll explore one of the most recent developments with the potential to transform modern medical devices: integrating LLMs such as ChatGPT, Claude, or Med-PaLM into SaMD. The potential is enormous, particularly for clinical decision support. However, incorporating LLMs into medical software introduces regulatory challenges, especially around software risk management, verification, and change management.

In this blog post, we’ll explain why LLMs are usually treated as SOUP and discuss their unique challenges compared to more traditional SOUP components.

Are LLMs always SOUP?

In medical device software, LLMs typically share these characteristics:

  • Developed by third parties
  • Not developed exclusively for medical device use
  • Trained on vast, generally undocumented corpora
  • Opaque APIs, with limited access to internal logic

The first two characteristics alone would qualify a typical LLM as SOUP. However, even an LLM trained in-house by a SaMD manufacturer, for the explicit purpose of integration into their own SaMD product, would have the following characteristics:

  • The model is still opaque. Manufacturers generally cannot explain its internal logic or retroactively determine how it arrives at specific outputs.
  • The in-house LLM may still leverage third-party components and training data.
  • The data the model is trained on may not have clear provenance or licensing.

Based on the above, and referencing the requirements of IEC 62304, most instances of LLMs in SaMD would qualify as SOUP.

The risks of incorporating LLMs into SaMD

State-of-the-art LLMs pose a number of regulatory challenges.

  • No control or transparency on the training data
  • Models are subject to biases inherent within the training data
  • Model outputs are non-deterministic
  • No confidence score is associated with the model outputs
  • Unpredictable failure modes: e.g. hallucinations
  • Lack of transparency on the internal logic of the model
  • Fast-moving technology with regular undisclosed updates
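The last point in particular can be partially controlled in software. As a hedged illustration, one change-management control is to pin an exact model version with fixed decoding settings and verify it at runtime, so that an undisclosed provider update cannot silently replace the verified model. A minimal sketch (the client-side request shape, model identifier, and field names are hypothetical, not a real provider API):

```python
# Hypothetical change-management control for a third-party LLM in a SaMD
# component: pin an exact model version, use fixed decoding settings, and
# verify the model identifier reported in every response.

PINNED_MODEL = "example-llm-2025-01-15"  # hypothetical version identifier

def build_request(prompt: str) -> dict:
    """Assemble a request with fixed, documented decoding settings."""
    return {
        "model": PINNED_MODEL,
        "prompt": prompt,
        "temperature": 0.0,   # reduces (but does not eliminate) output variability
        "max_tokens": 256,
    }

def check_response_model(response: dict) -> None:
    """Reject responses produced by any model other than the verified one."""
    actual = response.get("model")
    if actual != PINNED_MODEL:
        raise RuntimeError(
            f"Model mismatch: expected {PINNED_MODEL}, got {actual!r}"
        )
```

Note that even at temperature 0, outputs from hosted LLMs may still vary between calls, so version pinning complements rather than replaces output validation.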

It is essential that the manufacturer comprehensively documents and mitigates the risks related to LLMs as part of their risk management activities. If these risks are left uncontrolled, incorporating an LLM into a SaMD may lead to:

  • Unsafe recommendations: the LLM may generate inaccurate or unsafe suggestions in a clinical support context
  • False confidence in output: clinicians may over-trust well-formatted but incorrect output
  • Inconsistent behaviour: variability across sessions can confuse users or lead to misinterpretation of outputs
  • Lack of explainability: difficulty justifying recommendations or outputs for clinical audit in retrospect

How to control the risks

To mitigate the LLM-related risks, manufacturers should introduce adequate risk controls, prioritising those that ensure inherently safe software design. These may include:

  • Constraining the LLM’s role so that its outputs inform, rather than replace, clinical judgement (human-in-the-loop review)
  • Validating outputs against predefined schemas or permitted value sets before they are displayed to users
  • Pinning and verifying model versions, and re-running verification before adopting any model update
  • Clearly communicating the model’s limitations and uncertainty in the user interface
  • Monitoring outputs in production and defining safe fallback behaviour when validation fails
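One such control — validating LLM output against a constrained schema before it reaches the clinician — can be sketched as follows. This is a minimal illustration only; the field names and the category set are hypothetical, not a real clinical vocabulary:

```python
import json

# Hypothetical output-validation guardrail: the LLM is asked to return
# structured JSON, and the SaMD layer rejects anything malformed or outside
# a closed set of permitted values before it reaches the clinician.

ALLOWED_URGENCY = {"routine", "urgent", "emergency"}  # illustrative categories

def validate_llm_output(raw: str) -> dict:
    """Parse and validate structured LLM output; reject anything malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("LLM output is not valid JSON") from exc

    urgency = data.get("urgency")
    if urgency not in ALLOWED_URGENCY:
        raise ValueError(f"Urgency {urgency!r} outside permitted set")

    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        raise ValueError("Missing or empty summary")

    # Flag for mandatory human review: the output informs, never decides.
    data["requires_clinician_review"] = True
    return data
```

A design choice worth noting: the guardrail fails closed — any output that cannot be parsed and validated is discarded rather than shown, which is the inherently safer behaviour for a clinical support context.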

Summary

LLMs are a new, exciting technology that offers extraordinary potential in the field of SaMD. However, given the opacity of these models, and their generally non-deterministic behaviour, they carry unique challenges that must be accurately defined and mitigated through risk management, inherently safe software design, and rigorous change management.
