Clinician perspectives on explainability in AI-driven closed-loop neurotechnology

Qualitative results

Our qualitative analysis reveals a nuanced landscape of clinicians’ attitudes, informational needs, and preferences concerning AI-driven closed-loop medical neurotechnology. To structure our findings, we organized clinician responses across three dimensions of the algorithmic system: its input (training data), its core architecture (algorithm), and its output (prediction and clinical decision). We also report clinicians’ desiderata regarding user interfaces and XAI visualisations. We begin with clinicians’ views on the algorithm itself, given its centrality to current debates on explainability. A full overview of the explainability-related concepts is provided in the concept map below (Fig. 1).

Fig. 1

Explainability-related concept map. The map illustrates the themes that emerged from our thematic analysis of the expert interviews.

Algorithmic specifications and perceived relevance

A consistent theme across interviews was clinicians’ limited interest in the technical specifications of AI models embedded in closed-loop medical neurotechnological systems. Several participants (9/20) explicitly reported seeing little value in understanding technical details such as algorithm type, number of layers, or parameter counts. These aspects were generally perceived as overly technical, falling outside their clinical expertise and responsibilities, and offering minimal practical or clinical utility for decision-making in patient care.

One participant noted that many doctors lack the expertise to distinguish between different machine learning (ML) algorithms, and several clinicians questioned whether explainability—and by extension, XAI methods—is even necessary in the context of closed-loop medical neurotechnology. Six out of twenty acknowledged the inherent opacity of AI models but expressed openness to using them nonetheless. One participant remarked that detailed understanding of the underlying AI model is rarely essential for physicians, while another observed that patients typically show little concern about algorithmic transparency. This pragmatic stance was echoed in comparisons to non-AI-driven interventions such as conventional deep brain stimulation (DBS) for Parkinson’s disease, which often prove clinically effective despite incomplete mechanistic understanding. Still, one clinician repeatedly emphasized that algorithmic transparency remains crucial for AI developers and engineers, particularly for system validation, safety assurance, and error detection. From this perspective, even if clinicians do not require direct algorithmic insight, they depend on technical and regulatory actors to ensure such understanding is firmly embedded at the system level.

Input data information request

Our analysis reveals that clinicians’ interest in input data is shaped by underlying concerns about clinical representativeness, data accessibility, data quality, and relevance to real-world therapeutic decision-making. Several interviewees highlighted the critical role of knowledge about input data in shaping clinicians’ trust in, and expectations of, future AI-driven medical closed-loop neurotechnologies.

Representativeness of training data

Clinicians (4/20) voiced scepticism about whether the full spectrum of neurological and psychiatric symptoms can be adequately captured by training datasets. In disorders such as Parkinson’s disease, where symptoms such as tremor vary widely across individuals, participants stressed the need to understand whether the AI models had been trained on data representative of their own patient populations.

Access to input data

Some research-oriented clinicians (3/20) emphasized the need for transparent access to the raw data that underpin AI decisions, to better understand AI and determine its applicability. Especially in high-stakes contexts like critical care, they expressed a desire to inspect and interpret input signals independently. This demand for data transparency was closely linked to clinicians’ need to validate AI-driven interventions and retain clinical oversight, especially when the treatment mechanism is not fully explainable even without AI.

Multimodal input and new biomarkers

Multiple interviewees (4/20) agreed that neural activity data alone is insufficient for building and implementing high-performing closed-loop systems. Clinicians strongly advocated for the integration of complementary inputs such as wearable sensor data, video-based movement assessments, and subjective patient-reported outcomes, including measures of quality of life. One clinician even remarked that therapeutic efficacy does not require full mechanistic understanding, so long as the patient’s well-being improves. Consequently, clinicians also expressed a desire to know whether multimodal and subjective patient data was used to train a specific AI system. At the same time, four clinicians noted that certain biomarkers for neurological and psychiatric conditions are already too complex to be fully understood, even without the use of AI, with two of them reporting that parameter settings in some neurostimulation devices are often determined through trial and error. This highlights a shift toward outcome-oriented validation and supports the integration of multimodal data pipelines in AI system design.
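To make the notion of feature-level multimodal fusion concrete, the minimal Python sketch below combines neural, wearable, and patient-reported features into a single vector per time window. The function name fuse_features, the per-modality normalization, and the dimensions are illustrative assumptions, not a description of any system the interviewees discussed.

```python
import numpy as np

def fuse_features(neural, wearable, patient_reported):
    """Feature-level fusion: z-score each modality separately,
    then concatenate into one feature vector per time window."""
    def zscore(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    return np.hstack([zscore(neural), zscore(wearable),
                      zscore(patient_reported)])

# Toy example: 50 time windows of neural band powers, accelerometer
# statistics, and a single quality-of-life score (all synthetic).
rng = np.random.default_rng(1)
fused = fuse_features(rng.normal(size=(50, 8)),   # neural features
                      rng.normal(size=(50, 6)),   # wearable features
                      rng.normal(size=(50, 1)))   # patient-reported
print(fused.shape)  # -> (50, 15)
```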

High data quality and generalizability

Three out of twenty clinicians raised concerns about the noise and artifact susceptibility of brain data, particularly when acquired under clinical conditions. They emphasized the need for robust preprocessing pipelines and artifact removal to ensure that algorithms learn from signal rather than noise, and demanded that the preprocessing steps applied be reported. Additionally, one participant questioned the adequacy of existing datasets, citing small sample sizes and variability in patient condition, device configuration, and electrode placement as potential limitations.
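As an illustration of the kind of preprocessing and artifact handling clinicians asked to see reported, the following minimal sketch band-pass filters a neural trace and flags high-amplitude samples. The function name, pass band, and threshold are illustrative assumptions, not recommendations derived from the interviews.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_lfp(signal, fs, band=(4.0, 40.0), artifact_z=5.0):
    """Band-pass filter a neural trace and mark high-amplitude
    samples as artifacts (thresholds are purely illustrative)."""
    # 4th-order Butterworth band-pass, applied forward and backward
    # (filtfilt) to avoid phase distortion.
    nyq = fs / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, signal)

    # Flag samples whose amplitude deviates strongly from the mean;
    # real pipelines would use more sophisticated artifact detection.
    z = (filtered - filtered.mean()) / filtered.std()
    artifact_mask = np.abs(z) > artifact_z
    return filtered, artifact_mask

# Example: 10 s of synthetic data at 250 Hz with an injected artifact.
fs = 250
t = np.arange(0, 10, 1 / fs)
raw = np.sin(2 * np.pi * 13 * t) + 0.3 * np.random.randn(t.size)
raw[1000:1010] += 25.0  # simulated movement artifact
clean, mask = preprocess_lfp(raw, fs)
print(f"{mask.sum()} samples flagged as artifact")
```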

Key information about the output

Clinicians expressed their strongest concerns (11/20) about the output of AI-driven systems, particularly in terms of safety, patient benefit and clinical relevance, and connected this to their informational needs and preferences. Many emphasized that their trust in AI is shaped not by insight into the algorithms themselves, but by the real-world consequences of their decisions and actions.

Safety of the output and operational transparency

One clinician raised the safety of algorithmic outputs as a concern, especially in scenarios where systems can autonomously adjust neurostimulation parameters in real time. Four out of twenty clinicians emphasized the need to understand not only the system’s accuracy, but also how its output is operationalized and translated into clinical action. This concern was particularly acute in sensitive or unpredictable environments involving potential real-time interventions, such as seizure detection by responsive neurostimulation (RNS) systems. Seven out of twenty clinicians noted the importance of clearly defined safety boundaries to prevent outputs that could result in harmful or socially dangerous actions (e.g., during driving or unsupervised movement). Several participants stressed that AI models adjusting stimulation parameters autonomously must be constrained by hard safety limits, ensuring that output decisions cannot exceed clinically safe thresholds.
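The hard safety limits participants called for can be pictured as a simple clamping layer between model output and device. The sketch below is a minimal illustration under assumed parameter names; the numerical bounds are placeholders, not clinical recommendations.

```python
from dataclasses import dataclass

@dataclass
class StimulationLimits:
    """Clinician-defined hard bounds; values are placeholders,
    not clinical recommendations."""
    min_amplitude_ma: float = 0.0
    max_amplitude_ma: float = 4.0
    min_frequency_hz: float = 60.0
    max_frequency_hz: float = 180.0

def clamp_stimulation(proposed_amplitude_ma, proposed_frequency_hz,
                      limits: StimulationLimits):
    """Constrain AI-proposed stimulation parameters so the device
    can never exceed clinically approved thresholds, regardless of
    what the model outputs."""
    amplitude = min(max(proposed_amplitude_ma, limits.min_amplitude_ma),
                    limits.max_amplitude_ma)
    frequency = min(max(proposed_frequency_hz, limits.min_frequency_hz),
                    limits.max_frequency_hz)
    return amplitude, frequency

# An out-of-range model suggestion is capped at the hard limit.
amp, freq = clamp_stimulation(7.5, 130.0, StimulationLimits())
print(amp, freq)  # -> 4.0 130.0
```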

Alignment with clinical reasoning and evidence of patient benefit

Clinicians were more likely to trust AI models when the system’s recommendations were aligned with their clinical reasoning or “gut feeling”. In their view, trust was not gained through detailed algorithmic explanations, but rather through consistent congruence between AI recommendations and clinician intuition. Although most clinicians expressed optimism about AI’s future role in closed-loop systems, they voiced frustration at the current gap between technical potential and clinical application. Some (3/20) described the field as “overhyped,” arguing that claims of transformative impact lack sufficient empirical support. For example, one clinician cautioned that adding adaptive features to DBS treatment is unlikely to improve outcomes by more than 30%. While acknowledging the strengths of AI in data processing and predictive modelling, participants consistently emphasized that tangible patient benefits must be demonstrated through realistic, clinically grounded trials. Several noted that new implantable pulse generators (IPGs), which enable long-term neural data collection, present a valuable opportunity to generate the kind of real-world datasets needed to rigorously assess AI’s clinical utility. Ultimately, participants underscored that AI should be deployed only when it leads to clear, demonstrable improvements in patient outcomes, and that these benefits must be carefully evaluated through transparent and ethically sound clinical trials.

Meaningful clinician-AI interaction and clinical relevance

Eight participants emphasized that AI should support, not replace, clinical judgment. One clinician cautioned that AI models might identify statistically significant patterns that lack medical validity, reinforcing the need for clinicians’ oversight in defining use cases and validating model outputs. Several clinicians also emphasized that AI systems require clear hypotheses to address real clinical problems and cannot generate meaningful solutions in isolation.

AI user interface design requirements

When prompted, clinicians provided valuable insights into the design of user interfaces for AI-driven closed-loop neurotechnology, emphasizing the importance of intuitive, context-specific visualizations that support understanding without demanding technical expertise in ML.

Data transparency through visualization

There was broad agreement among participants (9/20) that descriptive statistics and visual summaries, such as charts, enhance understanding of the training data. These tools were considered essential for assessing the representativeness and relevance of the dataset to individual patient cases. One clinician suggested incorporating visualizations of symptom clusters to verify whether an individual patient matches specific subgroups in the training set.
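One way such a representativeness check could be surfaced in an interface is sketched below: the patient’s feature value is located within the training cohort’s distribution. The feature, cohort, and values are synthetic and purely illustrative.

```python
import numpy as np

def percentile_in_cohort(training_values, patient_value):
    """Locate a patient's feature value within the training cohort:
    the share of training cases at or below the patient's value."""
    return 100.0 * np.mean(np.asarray(training_values) <= patient_value)

# Hypothetical tremor-amplitude feature for a synthetic training cohort.
rng = np.random.default_rng(0)
cohort_tremor = rng.normal(loc=2.0, scale=0.5, size=500)

patient_tremor = 3.4
pct = percentile_in_cohort(cohort_tremor, patient_tremor)
print(f"Patient sits at the {pct:.0f}th percentile of the training data")
# Values near 0 or 100 would flag the patient as poorly represented
# by the training set, echoing the subgroup-matching idea above.
```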

Linking to scientific evidence

A clinician welcomed the idea of embedding hyperlinks to peer-reviewed publications that underpin model decisions. Such links could strengthen trust by showing that algorithmic outputs are grounded in established clinical knowledge and by allowing users to verify the underlying medical rationale.

Explainability tools

Only three out of twenty participants spontaneously expressed interest in formal XAI methods such as feature relevance, feature importance rankings, and counterfactual examples. Those who did valued the ability to identify top predictors or explore counterfactuals and what-if scenarios. One participant also called for access to source data within the interface, particularly for research or validation purposes. Interestingly, two clinicians preferred paper-based formats over digital dashboards when reviewing complex patient profiles, suggesting that interface flexibility remains crucial.
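For readers unfamiliar with the feature-importance methods these participants referred to, the sketch below demonstrates one standard technique, permutation importance, using scikit-learn on synthetic data. The classification task and the feature names are invented for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a symptom-classification task; feature names
# are invented for illustration.
X, y = make_classification(n_samples=400, n_features=5, n_informative=3,
                           random_state=0)
feature_names = ["beta_power", "tremor_amp", "gait_var",
                 "sleep_score", "qol_index"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does shuffling each feature hurt
# held-out accuracy? Larger drops indicate stronger predictors.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name:12s} {score:+.3f}")
```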

The limits of visualization

Despite general appreciation for transparency tools, several participants cautioned that data visualization alone does not fully resolve the challenges of algorithmic opacity. As a result, clinicians advocated not for complete transparency but for practical intelligibility: interfaces that convey enough information to safely and confidently integrate AI support into clinical workflows, without overwhelming users with unnecessary complexity.

Illustrative theme-specific quotes are presented in Table 1.

Table 1 Illustrative quotes for the respective themes.


