
School of Chemistry and Chemical Engineering, South China University of Technology, Guangzhou 510640, China
Peric Special Gases Co., Ltd., Handan 057550, China
Corresponding author. E-mail address: cejingxiao@scut.edu.cn (J. Xiao).
Received: 2025-07-15
Revised: 2025-09-22
Accepted: 2025-09-22
Published online: 2025-10-18
Published in print: 2026-02
Wu Zhikang, Wu Ying, Miao Guang, et al. Machine learning for adsorption-related parameters prediction of electronic specialty gases: DFT-based dataset construction and balanced data augmentation[J]. Chinese Journal of Chemical Engineering, 2026, 90(2): 261-271.
Electronic specialty gases play vital roles in key chip manufacturing processes such as lithography, etching, deposition, and cleaning. Although their ultra-high purity requirement (≥99.999%) makes separation challenging, insufficient physicochemical data has hindered adsorbent development. To bridge this gap, we constructed a multidimensional database covering 101 semiconductor-related molecules with 19 physical parameters, and developed a Bayesian regression-based collaborative prediction model that demonstrates high accuracy (R² = 0.95–0.97) on test sets. We further constructed the balanced data-augmented Transformer-based molecular property prediction (BD-TMPP) model to address the overfitting problem in small-sample learning. This model achieves end-to-end prediction of molecular quadrupole moment (R² = 0.99) and polarizability (R² = 0.98) by capturing interatomic spatial correlations. Compared with traditional density functional theory calculations, the model achieves a five-orders-of-magnitude improvement in computational efficiency while maintaining accuracy, demonstrating a successful application of the "structure-property relationship" theory in chemical machine learning.
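The accuracy figures in the abstract are reported as R², which is conventionally the coefficient of determination between reference values (here presumably the DFT results) and model predictions on held-out molecules. The sketch below shows how such a score is computed under that assumption. The function name `r_squared` and the numbers are hypothetical and chosen for illustration only; they are not taken from the paper's dataset or code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted property values (illustrative only).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
score = r_squared(reference, predicted)
print(f"R^2 = {score:.3f}")
```

A score of 1 means the predictions reproduce the reference values exactly, while a score near 0 means the model does no better than predicting the mean of the reference values.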