
Artificial intelligence (AI) is frequently described as having the capacity to dramatically change and improve healthcare. One extensively studied application of AI in ophthalmology involves the diagnosis of diabetic retinopathy (DR) or diabetic maculopathy (DM) using retinal imaging.
An emerging area of research is the use of AI to predict DR progression using retinal imaging, which holds significant potential for application within the UK National Diabetic Eye Screening Programme (DESP). The UK DESP captures fundus photographs from over three million people with diabetes yearly, to identify retinopathy that needs referral to the hospital eye service. The high volume of participants lends itself to the application of AI, which could be scalable at a national level.
Prognostic AI in diabetic screening
Artificial intelligence models that can predict DR progression could have significant implications for DR screening. Previous diagnostic AIs for DR detection have targeted the debated replacement of human graders [1–3]. Prognostic AI can instead complement clinical practice: human graders continue to detect DR, while AI forecasts DR progression. This is pertinent since graders are neither expected nor able to predict DR progression, so prognostic AI creates a new capability which can improve screening efficacy, lower costs and reduce the burden of screening for low-risk patients.

The first proof-of-concept study by Arcadu, et al. found prognostic AI could predict two-year DR progression risk with area under the receiver operating characteristic curve (AUROC) values between 0.70 and 0.80 (values above 0.8 indicate good performance), demonstrating that AI could extract prognostic features from retinal images [4]. Several studies have since advanced the concept. Google Health trained prognostic AI systems to predict the two-year incidence of any DR in patients with no baseline DR, with an AUROC of 0.79 [5]. Rom, et al. reported a prognostic AI that used multiple retinal images from the same patient, which improved the model’s prediction of DR progression from an AUROC of 0.76 to 0.81 [6]. Dai, et al. recently developed a prognostic AI to predict the time to DR progressing to a sight-threatening stage over a five-year time horizon using retinal images, with AUROCs between 0.75 and 0.85 [7].
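To make the AUROC figures above concrete, the following is a minimal sketch of how an AUROC can be computed from risk scores and observed outcomes. It uses the rank-based definition (the probability that a randomly chosen progressor receives a higher score than a randomly chosen non-progressor). The function name `auroc` and the toy data are assumptions for illustration only and are not taken from any of the cited studies.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen progressor (label 1) scores higher
    than a randomly chosen non-progressor (label 0); ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative data only: 1 = progressed within two years, 0 = did not.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.50, 0.90]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.2f}")  # prints "AUROC = 0.81"
```

In this toy example, 13 of the 16 progressor/non-progressor pairs are ranked correctly, giving 0.8125. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.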
Issues for UK adoption
While the rapid development of prognostic AI has been encouraging, there are multiple implementation considerations for use in the UK DESP population:
- Prior prognostic AI systems have largely been trained on American populations and DR screening protocols that differ from UK National Screening Committee (NSC) DESP protocols. Differences in the populations and imaging devices used to train a model could lead to poor AI model generalisation when applied to UK DESP populations.
- Grading definitions for DR also differ between the UK and other countries. Prior prognostic AI systems have used the International Clinical Diabetic Retinopathy (ICDR) disease severity scale, whose criteria do not overlap completely with UK NSC criteria, and this mismatch could invalidate model predictions.
- Prognostic AI models could benefit from using clinical data rather than retinal images alone. It has been reported that using AI on electronic health record data alone can provide a high-performing predictive AI [8], and not integrating such data into a model may risk suboptimal performance when utilised over a whole nation’s diabetic population. However, a balance needs to be struck, as risk calculators may use invasive tests (e.g. HbA1c, eGFR) that may not be readily available at a standard UK screening visit [9].
A UK DESP multimodal algorithm
One newly developed prognostic AI system addresses these issues in the study design and model development [10]. Using data from the South London DESP, a deep learning system was developed that combined fundus images taken from routine DESP appointments with non-invasive risk factors that are readily available at screening visits (age, physician-reported sex, self-reported ethnicity, diabetes type, diabetes duration, visual acuity and deprivation) to create a multimodal prognostic AI model. Using data from 200,000 eyes collected over a six-year period, prognostic AI models were trained to predict the one-, two- and three-year incidence of referable DR or maculopathy (cases that need referral to hospital).
Combining clinicodemographic data and retinal images into a multimodal prognostic AI model produced a better performing algorithm than using either data source alone. The AUROC for predicting two-year incident referable DR or maculopathy was ~0.80 and was maintained on external testing in the Birmingham DESP [10]. To the best of our knowledge, this is the first prognostic AI that uses retinal images to predict DR progression, developed and tested using UK DESP data, for the specific purpose of individualising and optimising screening recall intervals.
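As a rough illustration of what combining image and clinicodemographic inputs can look like, the sketch below uses a simple late-fusion approach: an image-derived score and two clinical variables are combined in a logistic model to give a single risk estimate. This is an assumption made for illustration and is not the architecture of the published model [10]. The class `ScreeningVisit`, the function `fused_risk`, the coefficients and the intercept are all hypothetical and have not been fitted to any data.

```python
import math
from dataclasses import dataclass


@dataclass
class ScreeningVisit:
    image_logit: float              # hypothetical image-model output (log-odds)
    age_years: float
    diabetes_duration_years: float
    # Other factors named in the text (sex, ethnicity, diabetes type,
    # visual acuity, deprivation) are omitted to keep the sketch short.


# Hypothetical coefficients chosen for illustration only, not fitted to data.
WEIGHTS = {
    "image_logit": 1.0,
    "age_years": -0.01,
    "diabetes_duration_years": 0.05,
}
INTERCEPT = -1.0


def fused_risk(visit: ScreeningVisit) -> float:
    """Late fusion: weighted sum of image and clinical inputs passed through
    a sigmoid to give a risk estimate between 0 and 1."""
    z = INTERCEPT
    z += WEIGHTS["image_logit"] * visit.image_logit
    z += WEIGHTS["age_years"] * visit.age_years
    z += WEIGHTS["diabetes_duration_years"] * visit.diabetes_duration_years
    return 1.0 / (1.0 + math.exp(-z))


example = ScreeningVisit(image_logit=0.5, age_years=60, diabetes_duration_years=10)
print(f"Illustrative two-year risk: {fused_risk(example):.2f}")
```

The point of the sketch is only that both data sources contribute to one prediction. A deep learning system would typically learn the fusion jointly rather than rely on hand-set weights.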
The path to implementation
The pathway to implementing AI in healthcare requires prospective evaluations to support applications for regulatory approval. One option is a ‘silent trial’, which is an evaluation of the AI model in the background of the current standard of care. In this scenario, clinicians are not informed of the AI outputs, ensuring that patient care remains unchanged and mitigating concerns about influencing patient management prior to establishing the safety of the AI. At the end of the study period, ‘real-world’ outcomes are compared with the outcomes that would have occurred had the patients’ care been determined by the AI. This allows a detailed comparison of safety, efficacy and cost-effectiveness, a critical step prior to the deployment of AI systems within clinical environments.
The first prospective ‘silent trial’ of a prognostic AI in UK DESP
A novel prospective silent trial of the prognostic AI will imminently begin and will recruit a consecutive cohort of 50,000+ patients from DESP sites across London. The primary objective is to prospectively evaluate the prognostic AI model we developed on a large, ethnically diverse population that is representative of the UK. Eligibility criteria will be open and unrestrictive to include all patients with non-referable DR at baseline. Eligible patients will not have their recall altered by prognostic AI but will instead attend screening at intervals (one or two yearly) according to the local standard of care. However, at baseline, demographic data and retinal images will be run through the AI model to determine the risk of developing referable DR or maculopathy over two years. The prognostic AI risk prediction will be used to assign patients to either a high-risk group with a one-year recall or a low-risk group with a two-year recall. At the end of two years, we will compare the safety, efficacy and cost-effectiveness of the prognostic AI to current screening recall intervals. We will pay particular attention to ‘missed’ or ‘delayed’ diagnoses of proliferative DR, whereby erroneous stratification to a low-risk group with two-year recall would pose the greatest risk of sight loss.
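The stratification and comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions and not the trial’s analysis plan: the risk threshold of 0.2 is hypothetical, the names `Participant`, `assign_recall` and `summarise` are invented for the example, and every participant is assumed to have had an observed outcome at year one, which will not hold for patients already on two-yearly recall.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Participant:
    predicted_risk: float             # AI two-year risk estimate at baseline
    referable_at_year: Optional[int]  # 1 or 2 if referable disease found, else None


def assign_recall(predicted_risk: float, threshold: float) -> int:
    """Return the AI-suggested recall in years: 1 for high risk, 2 for low risk."""
    return 1 if predicted_risk >= threshold else 2


def summarise(cohort: Iterable[Participant], threshold: float) -> dict:
    """Count how many participants the AI would place on two-year recall and
    how many of those had referable disease detected at year one."""
    total = low_risk = delayed = 0
    for p in cohort:
        total += 1
        if assign_recall(p.predicted_risk, threshold) == 2:
            low_risk += 1
            # Disease found at year one would not have been seen until
            # year two under AI-determined recall.
            if p.referable_at_year == 1:
                delayed += 1
    return {
        "participants": total,
        "ai_two_year_recall": low_risk,
        "potentially_delayed": delayed,
    }


# Illustrative cohort only; the values are made up.
toy_cohort = [
    Participant(0.05, None),
    Participant(0.10, 1),
    Participant(0.40, 1),
    Participant(0.08, 2),
    Participant(0.30, None),
]
summary = summarise(toy_cohort, threshold=0.2)
print(summary)
```

In this toy cohort, three of five participants would move to two-year recall, and one of those three had referable disease at year one. That last group is the ‘delayed’ category the safety analysis needs to scrutinise most closely.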
Impact of the prognostic silent trial for UK DESP
The silent trial will provide invaluable data on the safety, efficacy and cost-effectiveness of prognostic AI within the UK DESP. Added technology costs would hopefully be outweighed by savings in appointment costs from correctly identifying more low-risk patients and extending their recall interval from yearly to every two years. Concerns about the current standard of care disadvantaging ethnic minorities would also hopefully be addressed by the inclusion of demographic data in model design [11]. This trial will be the first of its kind in the UK and should provide real-world data on how AI might reduce the strain on screening services while reducing inequalities and freeing up resources, to ensure those at the highest risk of vision loss are prioritised.
References
1. Abràmoff MD, Lavin PT, Birch M, et al. Pivotal Trial of an Autonomous AI-Based Diagnostic System for Diabetic Retinopathy in Primary Care Offices. NPJ Digit Med 2018;1:39.
2. Ting DSW, Cheung CY-L, Lim G, et al. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes. JAMA 2017;318(22):2211–23.
3. Tufail A, Rudisill C, Egan C, et al. Automated Diabetic Retinopathy Image Assessment Software: Diagnostic Accuracy and Cost-Effectiveness Compared with Human Graders. Ophthalmology 2017;124(3):343–51.
4. Arcadu F, Benmansour F, Maunz A, et al. Deep learning algorithm predicts diabetic retinopathy progression in individual patients. NPJ Digit Med 2019;2:92.
5. Bora A, Balasubramanian S, Babenko B, et al. Predicting the risk of developing diabetic retinopathy using deep learning. Lancet Digit Health 2021;3(1):e10–9.
6. Rom Y, Aviv R, Ianchulev T, Dvey-Aharon Z. Predicting the future development of diabetic retinopathy using a deep learning algorithm for the analysis of non-invasive retinal imaging. BMJ Open Ophthalmol 2022;7:e001140.
7. Dai L, Sheng B, Chen T, et al. A deep learning system for predicting time to progression of diabetic retinopathy. Nat Med 2024;30(2):584–94.
8. Romero-Aroca P, Verges R, Pascual-Fontanilles J, et al. Referable Diabetic Retinopathy Prediction Algorithm Applied to a Population of 120,389 Type 2 Diabetics over 11 Years Follow-Up. Diagnostics (Basel) 2024;14(8):833.
9. Eleuteri A, Fisher AC, Broadbent DM, et al. Individualised variable-interval risk-based screening for sight-threatening diabetic retinopathy: the Liverpool Risk Calculation Engine. Diabetologia 2017;60(11):2174–82.
10. Nderitu P, Lipman M, Anton H, et al. Predicting 1, 2 and 3 year emergent referable diabetic retinopathy and maculopathy using deep learning. Commun Med (Lond) 2024;4(1):167.
11. Olvera-Barrios A, Rudnicka AR, Anderson J, et al. Two-year recall for people with no diabetic retinopathy: a multi-ethnic population-based retrospective cohort study using real-world data to quantify the effect. Br J Ophthalmol 2023;107(12):1839–45.
Declaration of competing interests: None declared.
Acknowledgements: The authors thank Sight Research UK for funding and supporting this project through their Translational Research Award.


