Chemical Space-Property Predictor Model of Perovskite Materials by High-Throughput Synthesis and Artificial Neural Networks

Academic Background

Perovskite materials have attracted extensive attention for their wide use in solar cells and other electronic devices. Their optical properties (such as bandgap and lattice vibrations) can be flexibly tuned through chemical composition. Although predicting optical properties from perovskite structure is well developed, the inverse problem of predicting chemical composition from optical data has remained a major challenge. Solving it would help accelerate the development and production of perovskite materials. In large-scale industrial manufacturing in particular, rapid screening and verification of the chemical composition of new materials could substantially improve production efficiency.

To address this challenge, the researchers proposed an innovative approach that integrates high-throughput synthesis, high-resolution spectroscopy, and machine learning (specifically, artificial neural networks, ANN). Through this method, they were able not only to efficiently synthesize various perovskite materials with different chemical compositions, but also to accurately predict their chemical composition from optical data. This work provides a new tool for the rapid screening and optimization of perovskite materials.

Source of the Article

This research was jointly completed by Md. Ataur Rahman, Md. Shahjahan, Yaqing Zhang, Rihan Wu, and Elad Harel from Michigan State University. Elad Harel is the corresponding author, responsible for overall project design and guidance. The paper was published in the journal Chem on April 10, 2025, titled “Chemical Space-Property Predictor Model of Perovskite Materials by High-Throughput Synthesis and Artificial Neural Networks”.

Research Workflow

1. High-Throughput Synthesis of Perovskite Materials

The first step of the study was to synthesize a series of perovskite single crystals with different chemical compositions. The research team adopted a high-throughput synthesis method, utilizing a liquid-handling robot (Opentrons OT-2) to mix precursor solutions (such as MAPbCl₃, CsPbBr₃, and CsPbI₃) in a 96-well plate, achieving uniform mixing by heating and shaking. Subsequently, these solutions were transformed into microdroplets and deposited onto coverslips to form single crystals.
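The mixing step can be pictured as a composition grid laid out over the plate. The following is a minimal sketch, not the team's actual robot protocol: it assumes three equal-molarity stock solutions (hypothetically named here only by position), a total of 100 µL per well, and halide fractions x and y with the third fraction equal to 1 - x - y. All function names are illustrative.

```python
from itertools import product

ROWS = "ABCDEFGH"   # a 96-well plate has 8 rows and 12 columns
N_COLS = 12

def halide_grid(steps):
    """Return all (x, y) fractions on a regular grid with x + y <= 1."""
    points = []
    for i, j in product(range(steps + 1), repeat=2):
        if i + j <= steps:
            points.append((i / steps, j / steps))
    return points

def well_name(index):
    """Map a running index (0-95) to a plate label such as 'A1' or 'H12'."""
    row, col = divmod(index, N_COLS)
    return f"{ROWS[row]}{col + 1}"

def stock_volumes(x, y, total_ul=100.0):
    """Volumes (in microliters) of three equal-molarity stocks for one well.

    Assumption: the mixed composition follows the volume fractions directly.
    """
    z = max(0.0, 1.0 - x - y)  # clamp tiny negative rounding errors
    return (x * total_ul, y * total_ul, z * total_ul)

# steps=12 gives 91 compositions, which fits on a single 96-well plate.
grid = halide_grid(steps=12)
plan = {well_name(k): stock_volumes(x, y) for k, (x, y) in enumerate(grid)}
```

A plan like this could then be translated into pipetting commands for a liquid-handling robot; that translation is omitted because it depends on the specific robot software.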

To ensure accuracy and reproducibility, all experiments were conducted in a temperature- and humidity-controlled laboratory. The team also investigated the effect of different solvents (such as DMSO, DMF, and GBL) on the size and shape of the crystals, discovering that GBL solvent is particularly effective in improving the size and shape of bromide perovskite single crystals.

2. Spectroscopic Data Collection and Analysis

The synthesized perovskite single crystals were characterized by UV-visible absorption (UV-Vis), photoluminescence (PL), and terahertz Raman (THz Raman) spectroscopy. These spectra served as the input features for the artificial neural network, which was trained to predict the chemical composition of the perovskites.

  • UV-Visible Spectra: The research team recorded the absorption spectra of all perovskite samples and analyzed their bandgap variations. As the halide composition shifted from chloride to bromide to iodide, the absorption peak shifted from the blue region (around 400 nm) to the red region (around 700 nm).

  • Photoluminescence Spectra: The PL spectra and images showed that the emission color of the perovskite single crystals changed from blue to green to red, consistent with the change in halide composition. However, due to phase segregation in mixed-halide perovskites, the reproducibility of PL data was relatively low, limiting its role in the model.

  • Terahertz Raman Spectra: THz Raman spectra can probe low-frequency vibrational modes of the perovskite lattice, such as the symmetric stretching and bending modes of the Pb-X bond. Changes in these vibrational modes are closely related to halide composition, making THz Raman data stand out in predicting chemical composition.
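To relate the absorption wavelengths quoted above to energies, the standard photon-energy relation E = hc/λ can be used. The sketch below only converts wavelength to photon energy; treating an absorption peak position as a bandgap estimate is a simplification, and the function name is illustrative.

```python
H_EV_S = 4.135667696e-15   # Planck constant in eV*s
C_M_S = 2.99792458e8       # speed of light in m/s

def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nanometers (E = h*c/lambda)."""
    return H_EV_S * C_M_S / (wavelength_nm * 1e-9)

blue_edge = photon_energy_ev(400)   # about 3.10 eV
red_edge = photon_energy_ev(700)    # about 1.77 eV
```

So a shift of the absorption peak from roughly 400 nm to roughly 700 nm corresponds to a drop in photon energy from about 3.1 eV to about 1.8 eV.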

3. Construction and Training of the Artificial Neural Network Model

The research team developed a chemical space-property predictor model based on artificial neural networks (ANN), capable of handling multiple inputs and outputs simultaneously. The model inputs included UV-visible spectra, PL spectra, and THz Raman spectra, while the outputs were the chemical compositions (x and y values) of the halides in the perovskite.
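A multi-input, two-output network of this kind can be sketched as follows. This is an illustration under assumptions, not the published architecture: the spectra are synthetic random numbers, each spectrum is min-max normalized and concatenated into one feature vector, and a single hidden layer of 10 tanh units with random (untrained) weights maps it to two outputs standing in for x and y.

```python
import math
import random

def normalize(spectrum):
    """Scale one spectrum to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]

def build_input(uv_vis, pl, thz_raman):
    """Concatenate the three normalized spectra into one feature vector."""
    return normalize(uv_vis) + normalize(pl) + normalize(thz_raman)

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j] holds the weights of output unit j."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = bias + sum(w * v for w, v in zip(unit_weights, inputs))
        outputs.append(activation(total) if activation else total)
    return outputs

def predict(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: tanh hidden layer, then a linear layer giving [x, y]."""
    hidden = dense(features, hidden_w, hidden_b, activation=math.tanh)
    return dense(hidden, out_w, out_b)

rng = random.Random(0)
uv_vis = [rng.random() for _ in range(50)]   # synthetic placeholder spectra
pl = [rng.random() for _ in range(40)]
thz = [rng.random() for _ in range(30)]
features = build_input(uv_vis, pl, thz)

n_hidden = 10                                # assumed hidden-layer size
hidden_w = [[rng.gauss(0, 0.1) for _ in features] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
out_w = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(2)]
out_b = [0.0, 0.0]
prediction = predict(features, hidden_w, hidden_b, out_w, out_b)
```

Dropping one spectrum from `build_input` is a simple way to compare how much each measurement type contributes once the weights have been trained.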

To train the model, the researchers employed two algorithms, Levenberg-Marquardt and Bayesian regularization, and divided the data into a training set (70%), a validation set (15%), and a test set (15%). Results showed that, among the individual spectral inputs, THz Raman data performed best in predicting chemical composition, with a regression coefficient of 0.851. When THz Raman data were combined with UV-visible spectral data, the regression coefficient rose further to 0.917 (about 92%).
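The 70/15/15 partition can be sketched as a shuffled index split. This is a minimal illustration: the sample count of 200, the fixed seed, and the function name are assumptions, and the source does not say how the samples were assigned to each set.

```python
import random

def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Shuffle sample indices and split them into train, validation, and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train * n_samples)
    n_val = round(val * n_samples)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],   # the remainder is the test set
    )

train_idx, val_idx, test_idx = split_indices(200)
```

The validation set is typically used to decide when to stop training, while the test set is held back for the final performance estimate.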

Main Results and Conclusions

1. Correlation between Spectral Data and Chemical Composition

The results showed that THz Raman spectral data offer a significant advantage in predicting the chemical composition of perovskites. In particular, the low-frequency vibrational modes (such as bending and stretching) of the Pb–X bond are closely correlated with halide composition. By contrast, PL data are less predictive due to the influence of phase segregation.

2. Performance of the Artificial Neural Network Model

The ANN model based on THz Raman and UV-visible spectra exhibited excellent performance in predicting perovskite chemical composition, achieving a regression coefficient of 0.917. The success of this model provides a powerful tool for the rapid screening and optimization of perovskite materials.
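A regression coefficient of this kind is commonly computed as the Pearson correlation R between predicted and target values. Whether the paper uses exactly this definition is an assumption here, and the numbers below are made up purely for illustration.

```python
import math

def pearson_r(targets, predictions):
    """Pearson correlation between target and predicted values."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)

true_x = [0.0, 0.25, 0.5, 0.75, 1.0]       # hypothetical target halide fractions
pred_x = [0.05, 0.20, 0.55, 0.70, 1.00]    # hypothetical model outputs
r_value = pearson_r(true_x, pred_x)
```

Values of R close to 1 mean the predictions track the targets closely; R alone does not reveal a constant offset or scale error, so it is usually read alongside an error measure.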

3. Scientific Value and Application Prospects

This study addresses the key problem of predicting perovskite chemical composition from optical data and provides a new method for the efficient development and industrial production of perovskite materials. By combining high-throughput synthesis and machine learning, the chemical composition of materials can be verified quickly, which could enable real-time quality control in production. In addition, the model can be applied to explore local and structural heterogeneity as well as phase segregation dynamics in perovskite materials.

Research Highlights

  1. Innovative Approach: This research, for the first time, integrates high-throughput synthesis, high-resolution spectroscopy, and artificial neural networks to successfully predict the chemical composition of perovskite materials.
  2. High-Accuracy Prediction: The ANN model based on THz Raman and UV-visible spectral data achieved a regression coefficient of 0.917 (about 92%), significantly outperforming traditional methods.
  3. Broad Application Prospects: This model can be widely applied to rapid screening, optimization, and industrial production of perovskite materials, and possesses important application value.

Other Valuable Information

The research team also conducted feature importance analysis and found that, for UV-visible spectral data, the wavelength regions of 305–315 nm, 370–470 nm, and 540–600 nm have a significant impact on model prediction. For THz Raman spectral data, the regions of 10–19 cm⁻¹ (Pb-X bending mode), 40–60 cm⁻¹ (X-Pb-X symmetric stretching), and 120–185 cm⁻¹ (X-Pb-X asymmetric stretching) are especially important for prediction.
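One practical use of these regions is to restrict a spectrum to the points that matter most before feeding it to a model. The sketch below only filters data by the ranges quoted above; it does not reproduce the feature importance analysis itself, and the axis and intensities are synthetic placeholders.

```python
# Important regions as quoted in the text (inclusive bounds).
UV_VIS_REGIONS_NM = [(305, 315), (370, 470), (540, 600)]
THZ_RAMAN_REGIONS_CM1 = [(10, 19), (40, 60), (120, 185)]

def in_regions(value, regions):
    """True if value falls inside any (low, high) region."""
    return any(lo <= value <= hi for lo, hi in regions)

def select_region_points(axis, intensities, regions):
    """Keep only the (axis, intensity) pairs that lie in the given regions."""
    return [(a, i) for a, i in zip(axis, intensities) if in_regions(a, regions)]

wavelengths = list(range(300, 701, 5))    # synthetic 300-700 nm axis, 5 nm steps
absorbance = [1.0 for _ in wavelengths]   # placeholder intensities
kept = select_region_points(wavelengths, absorbance, UV_VIS_REGIONS_NM)
```

The same helper works for a THz Raman axis in cm⁻¹ by passing `THZ_RAMAN_REGIONS_CM1` instead.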

Summary

By combining high-throughput synthesis, spectroscopic techniques, and artificial neural networks, this research developed a high-accuracy model capable of predicting the chemical composition of perovskites from optical data. This achievement addresses a key challenge in perovskite research and provides new tools and methods for the efficient development and industrial production of perovskite materials. With larger datasets and further model optimization, the prediction accuracy and application scope of this method are expected to improve.