1. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States;
2. National Eye Institute (NEI), National Institutes of Health (NIH), Bethesda, Maryland, United States;
* These authors contributed equally to this work.
† zhiyong.lu@nih.gov; echew@nei.nih.gov
Age-related macular degeneration (AMD) is the leading cause of incurable blindness worldwide in people over the age of 65. The Age-Related Eye Disease Study (AREDS) Simplified Severity Scale uses two risk factors found in color fundus photographs (drusen and pigmentary abnormalities) to provide convenient risk categories for the development of late AMD. However, manual assignment remains time-consuming and expensive, and it requires domain expertise.
Our model, DeepSeeNet, mimics the human grading process by first detecting risk factors in each eye (large drusen and pigmentary abnormalities) and then calculating a patient-based AMD severity score. DeepSeeNet was trained and validated on 59,302 color fundus photographs from 4,549 participants.
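The two-step process described above (per-eye risk factors first, then one patient-level score) can be sketched as follows. This is a minimal illustration, not the DeepSeeNet implementation: the `EyeFindings` type, its field names, and the drusen labels are hypothetical, and two scoring rules are assumptions taken from the commonly described form of the AREDS Simplified Severity Scale rather than from this text (medium drusen in both eyes counting as one risk factor, and late AMD in either eye mapping to a score of 5).

```python
from dataclasses import dataclass


@dataclass
class EyeFindings:
    """Hypothetical per-eye outputs of the risk-factor detectors."""
    drusen: str            # assumed labels: "none_small", "medium", "large"
    pigmentary: bool       # pigmentary abnormalities present
    late_amd: bool = False # late AMD present


def simplified_severity_score(left: EyeFindings, right: EyeFindings) -> int:
    """Combine findings from both eyes into one patient-level score (0 to 5)."""
    eyes = (left, right)

    # Assumption: late AMD in either eye maps to the top score.
    if any(eye.late_amd for eye in eyes):
        return 5

    # One point per eye with large drusen, one per eye with pigmentary abnormalities.
    score = sum(eye.drusen == "large" for eye in eyes)
    score += sum(eye.pigmentary for eye in eyes)

    # Assumption: medium drusen in both eyes (so no large drusen in either)
    # count together as a single risk factor.
    if all(eye.drusen == "medium" for eye in eyes):
        score += 1

    return score


if __name__ == "__main__":
    patient = (EyeFindings("large", True), EyeFindings("medium", False))
    print(simplified_severity_score(*patient))
```

Keeping the scoring rule separate from the image classifiers means each detector can be evaluated on its own, while the patient-level score stays a simple, auditable function of their outputs.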
The AREDS dataset, released by the NIH, contains retinal color fundus images from over 4,549 patients. Grades obtained from a central reading center were used to calculate AMD severity scores for ground truth labels. The performance of DeepSeeNet was compared with that of retinal specialists, who independently assessed 450 AREDS participants as part of a qualification survey used to determine initial AMD severity for each eye.
As seen to the right, DeepSeeNet's performance (accuracy=0.671; kappa=0.558) exceeded that of the retinal specialists (accuracy=0.599; kappa=0.467) in assigning AREDS Simplified Severity Scale scores. DeepSeeNet was also compared with two other deep learning models trained with different strategies, and it outperformed both.
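For readers less familiar with the two metrics reported above, the sketch below shows how accuracy and kappa can be computed from paired severity scores. It assumes the reported kappa is the unweighted Cohen's kappa, which this text does not state; the function names and the toy labels are illustrative only and are not study data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted score matches the reference score."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)

    # Chance agreement: for each class, the product of the two marginal rates.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    # Undefined when p_expected == 1 (both raters always use one identical class).
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Toy severity scores (0 to 5) for ten hypothetical patients.
    reference = [0, 0, 1, 1, 2, 2, 3, 4, 5, 5]
    predicted = [0, 1, 1, 1, 2, 3, 3, 4, 5, 4]
    print(f"accuracy={accuracy(reference, predicted):.3f}")
    print(f"kappa={cohen_kappa(reference, predicted):.3f}")
```

Kappa is lower than accuracy whenever some agreement would be expected by chance, which is why both figures are useful when classes are unevenly distributed.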
While several automated deep learning systems have been developed for classifying color fundus photographs of individual eyes by AMD severity score, none to date have utilized a patient-based scoring system that employs images from both eyes to obtain one classification score for the individual.
DeepSeeNet, trained on one of the largest color fundus image datasets for AMD analysis, shows high classification accuracy in the AREDS dataset and can be used to assign individual patients to AMD risk categories based on the AREDS Simplified Severity Scale. DeepSeeNet performed better on patient-based, multi-class classification (accuracy=0.671; kappa=0.558) than retinal specialists (accuracy=0.599; kappa=0.467), with high AUCs in the detection of large drusen (0.94), pigmentary abnormalities (0.93), and late AMD (0.97). Its superior performance highlights the potential of deep learning systems to enhance clinical decision-making and to support a better understanding of retinal disease.
This work was supported by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine and National Eye Institute.