Efficient parallel linear scaling method to get the response density matrix in all-electron real-space density-functional perturbation theory

Cited by: 6
Authors
Shang, Honghui [1]
Liang, WanZhen [2, 3, 4, 5]
Zhang, Yunquan [1]
Yang, Jinlong [6, 7]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
[2] Xiamen Univ, State Key Lab Phys Chem Solid Surfaces, Xiamen 361005, Fujian, Peoples R China
[3] Xiamen Univ, Collaborat Innovat Ctr Chem Energy Mat, Xiamen 361005, Fujian, Peoples R China
[4] Xiamen Univ, Coll Chem & Chem Engn, Fujian Prov Key Lab Theoret & Computat Chem, Xiamen 361005, Fujian, Peoples R China
[5] Xiamen Univ, Coll Chem & Chem Engn, Dept Chem, Xiamen 361005, Fujian, Peoples R China
[6] Univ Sci & Technol China, Dept Chem Phys, Hefei Natl Lab Phys Sci Microscale, Hefei 230026, Anhui, Peoples R China
[7] Univ Sci & Technol China, Synerget Innovat Ctr Quantum Informat & Quantum P, Hefei 230026, Anhui, Peoples R China
Keywords
Density-functional perturbation theory; Linear scaling; MPI; Numeric atomic orbitals; Density-functional theory; HARTREE-FOCK; FORCE-CONSTANTS; DERIVATIVES; IMPLEMENTATION
DOI
10.1016/j.cpc.2020.107613
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The real-space density-functional perturbation theory (DFPT) for computing response properties with respect to atomic displacements and homogeneous electric-field perturbations has recently been developed and implemented in the all-electron, numeric atom-centered orbital electronic-structure package FHI-aims. The bottleneck for large-scale applications is the computation of the response density matrix, which scales as O(N^3). Here, for response properties with respect to a homogeneous electric field, we present an efficient parallel linear-scaling algorithm for the response density matrix calculation. Our scheme is based on second-order trace-correcting purification and parallel sparse matrix-matrix multiplication algorithms. The new scheme reduces the formal scaling from O(N^3) to O(N) and shows good parallel scalability over tens of thousands of cores. As demonstrated by extensive validation, we achieve rapid computation of accurate polarizabilities using DFPT. Finally, the computational efficiency of the scheme is illustrated by scaling and scalability tests on massively parallel computer systems. (C) 2020 Elsevier B.V. All rights reserved.
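To make the purification step named in the abstract concrete, the following is a minimal sketch of second-order trace-correcting purification for a ground-state density matrix. It is not the authors' implementation: it assumes a small dense symmetric Hamiltonian in an orthonormal basis, uses plain NumPy matrix products instead of parallel sparse multiplication, and does not treat the response (perturbed) density matrix. The function name `tc2_density_matrix`, its parameters, and the toy Hamiltonian are hypothetical and chosen only for illustration.

```python
import numpy as np


def tc2_density_matrix(h, n_occ, tol=1e-10, max_iter=100):
    """Ground-state density matrix by second-order trace-correcting purification.

    Assumptions (illustrative only): `h` is a dense symmetric Hamiltonian in an
    orthonormal basis, and there is a nonzero gap between the n_occ-th and
    (n_occ + 1)-th eigenvalues.
    """
    n = h.shape[0]
    # Gershgorin circles give cheap lower and upper bounds on the spectrum.
    radii = np.sum(np.abs(h), axis=1) - np.abs(np.diag(h))
    e_min = np.min(np.diag(h) - radii)
    e_max = np.max(np.diag(h) + radii)
    # Map the spectrum into [0, 1], with low-energy states close to 1.
    x = (e_max * np.eye(n) - h) / (e_max - e_min)
    for _ in range(max_iter):
        x2 = x @ x
        # tr(X - X^2) is zero only when X is idempotent.
        if abs(np.trace(x) - np.trace(x2)) < tol:
            break
        if np.trace(x) > n_occ:
            x = x2            # X^2 lowers the trace
        else:
            x = 2.0 * x - x2  # 2X - X^2 raises the trace
    return x


# Toy Hamiltonian with a known, gapped spectrum (hypothetical test data).
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
energies = np.array([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0])
h = q @ np.diag(energies) @ q.T
h = 0.5 * (h + h.T)  # remove round-off asymmetry

p = tc2_density_matrix(h, n_occ=3)
print("trace of purified matrix:", np.trace(p))
```

Each iteration needs only one matrix-matrix product, which is why the cost can approach O(N) once the matrices are sparse and small elements are dropped; this dense sketch does not show that sparsity.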
Pages: 11