Deep Learning Combining Mammography and Ultrasound Images to Predict the Malignancy of BI-RADS US 4A Lesions in Women with Dense Breasts: A Diagnostic Study


Breast cancer is the most common malignant tumor in women, with a relatively high incidence and mortality rate. Previous studies have found that women with dense breasts are more likely to develop breast cancer. Research indicates that breast density in Asian women is generally higher than in African and Caucasian women, making it particularly important to study Asian women with high breast density.

Mammography (MG) is an essential breast cancer screening tool and has been reported to reduce breast cancer-related mortality by 30%. However, MG performs poorly in women with dense breasts, where its sensitivity falls to 48%-85%, mainly because dense glandular tissue can obscure lesions. In this setting, ultrasound (US) plays an indispensable role in screening and diagnosis. Combining US with MG improves the detection rate in patients with dense breasts: a meta-analysis showed that adding US to MG increases the detection rate of asymptomatic breast cancer by an average of 40%.

The 5th edition of the Breast Imaging Reporting and Data System (BI-RADS) atlas, released by the American College of Radiology (ACR) in 2013, subdivides BI-RADS category 4 lesions into subcategories 4A, 4B, and 4C, which carry increasing probabilities of malignancy. BI-RADS 4A lesions have a relatively low malignancy rate, typically 2%-10%, yet they are the most common category 4 lesions, accounting for 55.6%. Routine biopsy of these lesions may therefore cause unnecessary patient anxiety, occasional complications, and avoidable medical costs. In clinical practice, radiologists usually make the final assessment according to BI-RADS criteria, which leaves room for subjective interpretation. An objective method is therefore needed to improve the early diagnosis of BI-RADS 4A breast lesions. Artificial intelligence (AI) can identify complex information in imaging data and may provide a rapid, quantitative alternative.


The paper was written by Yaping Yang, Ying Zhong, Junwei Li, Jiahao Feng, Chang Gong, Yunfang Yu, Yue Hu, Ran Gu, Hongli Wang, Fengtao Liu, Jingsi Mei, Xiaofang Jiang, Jin Wang, Qinyue Yao, Wei Wu, Qiang Liu, and Herui Yao. The research was conducted jointly by the Breast Tumor Center of Sun Yat-sen University Cancer Center and Cellvision (Guangzhou) Medical Technology Co., Ltd. The paper was published online on February 12, 2024, in the International Journal of Surgery.


Study Design

A total of 1210 women were enrolled across retrospective and prospective cohorts. Of these, 992 patients were randomly assigned to the training cohort (789) and the test cohort (203) in an approximately 4:1 ratio, and a further 218 patients formed a prospective validation cohort. ResNet18 served as the base deep learning model. During training, transfer learning and weighted random sampling were employed to address the imbalance between positive and negative samples and to improve model generalization.
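
The idea behind weighted random sampling can be illustrated with a minimal, standard-library sketch. It is not the authors' training code: the function names, the toy cohort of 9 malignant lesions among 100, and the inverse-frequency weighting rule are assumptions chosen for illustration. Each example receives a weight inversely proportional to the size of its class, so that malignant and benign lesions are drawn about equally often within an epoch.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Return one sampling weight per example, inversely proportional
    to how often that example's class occurs."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def sample_epoch(labels, num_samples, seed=0):
    """Draw example indices with replacement using class-balanced weights."""
    rng = random.Random(seed)
    weights = class_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Hypothetical toy cohort (not study data): 9 malignant (1), 91 benign (0).
labels = [1] * 9 + [0] * 91
indices = sample_epoch(labels, num_samples=10_000)
drawn = Counter(labels[i] for i in indices)
print(drawn)  # the two classes appear in roughly equal numbers
```

In a deep learning framework the same weights would typically be handed to the data loader's sampler rather than used directly as above.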

Data Processing and Experimental Design

All breast ultrasound and mammography images were normalized to the same resolution. To reduce overfitting during training, the deep learning model used multimodal feature elimination and online data augmentation. Model performance was evaluated with the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and sensitivity and specificity were compared between models to assess their ability to predict malignancy.
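
For readers less familiar with these metrics, the sketch below shows one common way to compute them from predicted scores. It is an illustration under assumptions, not the study's evaluation code: the function names, the toy labels and scores, and the 0.5 threshold are all hypothetical. AUC is computed as the probability that a randomly chosen malignant lesion receives a higher score than a randomly chosen benign one, with ties counted as one half.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called malignant."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (not study data): 1 = malignant, 0 = benign.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.40, 0.35, 0.60, 0.70, 0.90]

auc = roc_auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; rank-based implementations give the same value more efficiently.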

Data Analysis

The baseline clinical and pathological characteristics of patients in the training, test, and validation cohorts are presented as frequencies and percentages. To address sample imbalance, weighted random sampling was applied during training. ROC curves and AUC values were used to evaluate the performance of the different models. In addition, decision curve analysis (DCA) was used to quantify the clinical utility of the model across probability thresholds.
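
Decision curve analysis is usually built on the standard net benefit quantity, NB(pt) = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability at which a biopsy would be recommended. The sketch below applies that formula to hypothetical toy data and compares the model with a biopsy-all strategy. The function names and data are assumptions for illustration and do not reproduce the study's analysis.

```python
def net_benefit(y_true, probs, pt):
    """Net benefit of recommending biopsy when predicted probability >= pt."""
    if not 0 < pt < 1:
        raise ValueError("pt must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= pt and y == 0)
    return tp / n - (fp / n) * pt / (1 - pt)


def net_benefit_biopsy_all(y_true, pt):
    """Net benefit of the reference strategy that biopsies every lesion."""
    if not 0 < pt < 1:
        raise ValueError("pt must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


# Hypothetical toy data (not study data): 1 = malignant, 0 = benign.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.10, 0.20, 0.30, 0.40, 0.35, 0.60, 0.70, 0.90]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, probs, pt)
    all_nb = net_benefit_biopsy_all(y_true, pt)
    print(f"pt={pt:.1f} model={model_nb:.5f} biopsy-all={all_nb:.5f}")
```

A decision curve plots these values over a range of thresholds; the model is clinically useful at thresholds where its net benefit exceeds both the biopsy-all strategy and the biopsy-none strategy, whose net benefit is zero.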


Baseline Characteristics

The study included a total of 1210 participants, with a median age of 44 years. All patients were diagnosed with BI-RADS US 4A lesions and had dense breasts. The training and test cohorts were comparable in terms of pathological results, age, and lesion size, but differed in family history and imaging features. The malignancy rate was 8.7% in the training cohort, 12.3% in the test cohort, and 9.2% in the validation cohort.

Model Performance

ResNet18 was chosen as the base model because it achieved the highest sensitivity (92%) and specificity (88.2%) in the test cohort. In the test cohort, the combined model incorporating US and MG images achieved the best AUC for predicting malignancy, 0.940 (95% CI: 0.874-1.000). In the validation cohort, the combined model achieved an AUC of 0.906 (95% CI: 0.817-0.995).


This study developed an efficient and objective deep learning model that combines US and MG image features and significantly improves the accuracy of predicting malignancy in BI-RADS US 4A breast lesions. The model performed well in identifying lesions for which biopsy could be avoided. It may help clinicians make better clinical decisions, avoid unnecessary biopsies, and improve the reproducibility and reliability of the clinical workflow.

Research Highlights

  1. High Accuracy: The model demonstrated high AUC scores in predicting the malignancy of BI-RADS US 4A breast lesions.
  2. Multimodal Fusion: The model combining US and MG image features significantly outperformed the single-modality models.
  3. AI in Medical Imaging: The study demonstrated the potential of AI and deep learning to improve diagnostic accuracy and reduce subjectivity.

Research Importance and Application Value

This study provides an efficient AI-assisted diagnostic method that combines multiple medical imaging modalities and significantly improves diagnostic accuracy. By reducing unnecessary biopsies, the model can save medical resources and relieve patient anxiety. In clinical practice, it has the potential to become an important tool for physicians in diagnosing and treating breast cancer. Broader multi-center validation is still needed to confirm the model’s generalization and clinical applicability.