DeepSeeNet: A deep learning framework for classifying patient-based age-related macular degeneration severity in retinal color fundus photographs

Yifan Peng1*, Shazia Dharssi1,2*, Qingyu Chen1, Tiarnan D. Keenan2, Elvira Agrón2, Wai Wong2, Emily Y. Chew2†, Zhiyong Lu1†

1. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland, United States;

2. National Eye Institute (NEI), National Institutes of Health (NIH), Bethesda, Maryland, United States;

* These authors contributed equally to this work.

† zhiyong.lu@nih.gov; echew@nei.nih.gov

We developed a deep learning framework that classifies retinal color fundus photographs on a six-class, patient-based age-related macular degeneration (AMD) severity scale, at a level of accuracy exceeding that of retinal specialists.

Age-related macular degeneration (AMD) is the leading cause of incurable blindness worldwide in people over the age of 65. The Age-Related Eye Disease Study (AREDS) Simplified Severity Scale uses two risk factors found in color fundus photographs (large drusen and pigmentary abnormalities) to provide convenient risk categories for the development of late AMD. However, manual assignment is still time-consuming and expensive, and requires domain expertise.
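To make the scale concrete, below is a minimal sketch of the patient-level scoring rule as we understand it from the AREDS publications; the function name, the dictionary encoding of per-eye gradings, and the field names are illustrative, not taken from the original paper.

```python
def simplified_severity_score(eye1, eye2):
    """Patient-based AREDS Simplified Severity Scale (0-4), plus 5 for late AMD.

    Each eye is a dict such as:
        {"drusen": "large",      # one of "none/small", "intermediate", "large"
         "pigment": True,        # pigmentary abnormalities present?
         "late_amd": False}      # late AMD present?
    """
    # Late AMD in either eye maps directly to the highest (sixth) class.
    if eye1["late_amd"] or eye2["late_amd"]:
        return 5

    score = 0
    for eye in (eye1, eye2):
        score += eye["drusen"] == "large"  # 1 risk factor per eye with large drusen
        score += eye["pigment"]            # 1 risk factor per eye with pigment changes

    # Special rule in the Simplified Severity Scale: intermediate drusen in
    # both eyes (hence no large drusen) counts as one additional risk factor.
    if eye1["drusen"] == "intermediate" and eye2["drusen"] == "intermediate":
        score += 1

    return min(score, 4)  # defensive cap; the risk-factor total runs 0-4
```

For example, a patient with large drusen in both eyes and pigmentary abnormalities in one eye would score 3, while a patient with late AMD in either eye falls into the sixth class regardless of the other gradings.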

Our model, DeepSeeNet, mimics the human grading process by first detecting the risk factors for each eye (large drusen and pigmentary abnormalities) and then calculating patient-based AMD severity scores. DeepSeeNet was trained and validated on 59,302 color fundus photographs from 4,549 participants.
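The following Keras sketch illustrates this two-stage design: independent per-eye risk-factor classifiers whose outputs feed the rule-based patient score. The Inception-v3 backbone, input size, and helper names are our assumptions for illustration, not a specification of the published model.

```python
import numpy as np
import tensorflow as tf

def risk_factor_classifier(n_classes):
    """One per-eye sub-network; the backbone choice here is an assumption."""
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", pooling="avg",
        input_shape=(299, 299, 3))
    return tf.keras.Sequential(
        [base, tf.keras.layers.Dense(n_classes, activation="softmax")])

# Three independent sub-networks, mirroring the human grading workflow.
drusen_net   = risk_factor_classifier(3)  # none/small, intermediate, large
pigment_net  = risk_factor_classifier(2)  # absent, present
late_amd_net = risk_factor_classifier(2)  # absent, present

def grade_eye(image):
    """Run all three sub-networks on one fundus image (numpy, 299x299x3)."""
    x = tf.keras.applications.inception_v3.preprocess_input(image[None, ...])
    return {
        "drusen": ["none/small", "intermediate", "large"][
            int(np.argmax(drusen_net.predict(x, verbose=0)))],
        "pigment": bool(np.argmax(pigment_net.predict(x, verbose=0))),
        "late_amd": bool(np.argmax(late_amd_net.predict(x, verbose=0))),
    }

# The per-eye outputs then feed the scoring rule sketched earlier:
# score = simplified_severity_score(grade_eye(left_img), grade_eye(right_img))
```

Keeping the risk-factor detectors separate from the scoring rule means the final patient-based score stays interpretable: each prediction can be traced back to the per-eye gradings that produced it.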

DeepSeeNet was trained on the NIH AREDS dataset, the largest publicly available dataset of color fundus images for AMD analysis.

This dataset, released by the NIH, contains retinal color fundus images from 4,549 participants. Grades obtained from a central reading center were used to calculate AMD severity scores as ground-truth labels. The performance of DeepSeeNet was compared with that of retinal specialists, who independently assessed 450 AREDS participants as part of a qualification survey used to determine the initial AMD severity of each eye.

Our model consistently outperforms retinal specialists in detecting large drusen and pigmentary abnormalities, and performs comparably in detecting late AMD.

DeepSeeNet's performance (accuracy = 0.671; kappa = 0.558) exceeded that of retinal specialists (accuracy = 0.599; kappa = 0.467) in identifying AREDS Simplified Severity Scale scores. DeepSeeNet was also compared with two other deep learning models trained with different strategies, and it outperformed both.
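The reported accuracy and kappa are standard multi-class agreement metrics; a minimal sketch of how they can be computed with scikit-learn, using made-up predictions on the six-class scale, is shown below (the paper's exact kappa weighting is not stated here).

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Hypothetical per-participant labels on the 6-class scale (0-4; 5 = late AMD).
y_true = [0, 1, 2, 3, 4, 5, 2, 1]
y_pred = [0, 1, 2, 4, 4, 5, 1, 1]

print(accuracy_score(y_true, y_pred))     # fraction of exact matches
print(cohen_kappa_score(y_true, y_pred))  # chance-corrected agreement
```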

Summary

While several automated deep learning systems have been developed to classify color fundus photographs of individual eyes by AMD severity, none to date has used a patient-based scoring system that combines images from both eyes to produce a single classification score for each individual.

DeepSeeNet, trained on one of the largest color fundus image datasets available for AMD analysis, achieves high classification accuracy on the AREDS dataset and can be used to assign individual patients to AMD risk categories based on the AREDS Simplified Severity Scale. DeepSeeNet performed better on patient-based, multi-class classification (accuracy = 0.671; kappa = 0.558) than retinal specialists (accuracy = 0.599; kappa = 0.467), with high AUCs for the detection of large drusen (0.94), pigmentary abnormalities (0.93), and late AMD (0.97). This performance highlights the potential of deep learning systems to enhance clinical decision-making and to deepen our understanding of retinal disease.

Acknowledgments

This work was supported by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine and National Eye Institute.