The MMSE Sandbox

Question

Similar to this sandbox, you're encouraged to write drafts of your questions here before posting them on the main site, if you think this may be beneficial. I was tempted to come up with a set of "rules" for this, or to simply say "don't over use this", but my guess is that this won't be necessary. Also, if you "over use" this, bringing more traffic into our Meta site might not be such a bad thing :)

If you have an answer to any of these questions, please leave a comment and the question can be moved from The MMSE Sandbox to the main site, if the asker wishes.

Jaafar Mehrez · Accepted Answer · 2025-10-26 03:48:14Z

Why does our PySCF-MRCC interface work for N but not for O?

Introduction

My colleague and I have made a PySCF-MRCC interface that can generate fort.55 files using PySCF for RHF or UHF references. It works for every all-electron calculation that I have tried. I was also able to get the frozen-core energies and the effect of frozen core orbitals the integrals. The entire script looks like the following:

import pyscf import numpy as np import warnings from functools import reduce from pyscf import gto from pyscf.gto.basis import parse_gaussian from pyscf.mp.mp2 import get_frozen_mask from pyscf import lib, ao2mo from pyscf.scf import atom_hf from pyscf import __config__ np.set_printoptions(threshold=np.inf) def _normalize_spin_masks(mask, nmo): if isinstance(mask, (tuple, list)) and len(mask) == 2: ma = np.asarray(mask[0], dtype=bool) mb = np.asarray(mask[1], dtype=bool) else: m = np.asarray(mask, dtype=bool) if m.size == 2 * nmo: ma = m[:nmo].copy() mb = m[nmo:].copy() elif m.size == nmo: ma = mb = m.copy() else: raise ValueError(f"Mask length {m.size} not compatible with nmo={nmo}.") return ma, mb def freezeCore(oneBody_a, oneBody_b, twoBody_a, twoBody_b, twoBody_ab, frozen_core, active=None): nmo_a = np.asarray(oneBody_a).shape[0] nmo_b = np.asarray(oneBody_b).shape[0] if nmo_a != nmo_b: raise ValueError(f"Different nmo in oneBody_a ({nmo_a}) and oneBody_b ({nmo_b}).") nmo = nmo_a core_a, core_b = _normalize_spin_masks(frozen_core, nmo) if active is None: valence_a = ~core_a valence_b = ~core_b else: valence_a, valence_b = _normalize_spin_masks(active, nmo) assert core_a.shape == (nmo,) and core_b.shape == (nmo,) assert valence_a.shape == (nmo,) and valence_b.shape == (nmo,) core_idx_a = np.where(core_a)[0] core_idx_b = np.where(core_b)[0] val_idx_a = np.where(valence_a)[0] val_idx_b = np.where(valence_b)[0] constant = np.einsum('ii->', oneBody_a[np.ix_(core_idx_a, core_idx_a)]) constant += np.einsum('ii->', oneBody_b[np.ix_(core_idx_b, core_idx_b)]) core_aa = twoBody_a[np.ix_(core_idx_a, core_idx_a, core_idx_a, core_idx_a)] core_bb = twoBody_b[np.ix_(core_idx_b, core_idx_b, core_idx_b, core_idx_b)] core_ab = twoBody_ab[np.ix_(core_idx_a, core_idx_a, core_idx_b, core_idx_b)] constant += 0.5 * ( np.einsum('iijj->', core_aa) - np.einsum('ijji->', core_aa) + np.einsum('iijj->', core_bb) - np.einsum('ijji->', core_bb) + 2.0 * np.einsum('iijj->', core_ab) ) h_active_a = oneBody_a[np.ix_(val_idx_a, val_idx_a)].copy() coul_aa = twoBody_a[np.ix_(val_idx_a, val_idx_a, core_idx_a, core_idx_a)] coul_ab = twoBody_ab[np.ix_(val_idx_a, val_idx_a, core_idx_b, core_idx_b)] h_active_a += np.einsum('pqkk->pq', coul_aa) h_active_a += np.einsum('pqkk->pq', coul_ab) exch_aa = twoBody_a[np.ix_(val_idx_a, core_idx_a, core_idx_a, val_idx_a)] h_active_a -= np.einsum('pkkq->pq', exch_aa) h_active_b = oneBody_b[np.ix_(val_idx_b, val_idx_b)].copy() coul_bb = twoBody_b[np.ix_(val_idx_b, val_idx_b, core_idx_b, core_idx_b)] h_active_b += np.einsum('pqkk->pq', coul_bb) twoBody_ab_T = twoBody_ab.transpose(2,3,0,1) coul_ba = twoBody_ab_T[np.ix_(val_idx_b, val_idx_b, core_idx_a, core_idx_a)] h_active_b += np.einsum('pqkk->pq', coul_ba) exch_bb = twoBody_b[np.ix_(val_idx_b, core_idx_b, core_idx_b, val_idx_b)] h_active_b -= np.einsum('pkkq->pq', exch_bb) twoBody_active_a = twoBody_a [np.ix_(val_idx_a, val_idx_a, val_idx_a, val_idx_a)].copy() twoBody_active_b = twoBody_b [np.ix_(val_idx_b, val_idx_b, val_idx_b, val_idx_b)].copy() twoBody_active_ab = twoBody_ab[np.ix_(val_idx_a, val_idx_a, val_idx_b, val_idx_b)].copy() return h_active_a, h_active_b, twoBody_active_a, twoBody_active_b, twoBody_active_ab, constant def main(): from pyscf import gto, scf, ao2mo name = 'out' mol = pyscf.M( atom='O', unit='angstrom', basis={'O': parse_gaussian.load('O-6-31G-EMSL.gbs', 'O')}, charge=0, spin=2, verbose=9, symmetry=True, output=name + '.txt', symmetry_subgroup='D2h', max_memory=4000, ) original_AtomSphAverageRHF = atom_hf.AtomSphAverageRHF class CustomAtomSphAverageRHF(original_AtomSphAverageRHF): def __init__(self, mol): super().__init__(mol) self.max_cycle = 9999 self.direct_scf = False atom_hf.AtomSphAverageRHF = CustomAtomSphAverageRHF mymf = mol.UHF().set( conv_tol=1e-14, max_cycle=9999, ddm_tol=1e-15, direct_scf=False, chkfile=name + '.chk', init_guess='atom', irrep_nelec={'Ag': 4, 'B3u': 2, 'B2u': 1, 'B1u': 1} ) mymf.kernel() atom_hf.AtomSphAverageRHF = original_AtomSphAverageRHF def compute_mo_irreps(mol, mo_coeff): symm_orbs = mol.symm_orb irrep_labels = mol.irrep_name mo_irreps = [] for mo in mo_coeff.T: projections = [np.linalg.norm(symm_orbs[i].T @ mo) for i in range(len(symm_orbs))] irrep_idx = np.argmax(projections) mo_irreps.append(irrep_labels[irrep_idx]) return mo_irreps def align_beta_orbitals_to_alpha(mol, mo_coeff): alpha_orbs, beta_orbs = mo_coeff[0], mo_coeff[1] alpha_irreps = compute_mo_irreps(mol, mo_coeff[0]) beta_irreps = compute_mo_irreps(mol, mo_coeff[1]) beta_orbs_sorted = [] used_indices = set() for target_irrep in alpha_irreps: for idx, beta_irrep in enumerate(beta_irreps): if beta_irrep == target_irrep and idx not in used_indices: beta_orbs_sorted.append(beta_orbs[:, idx]) used_indices.add(idx) break else: raise ValueError(f"No matching beta orbital found for alpha irrep: {target_irrep}") beta_orbs_sorted = np.column_stack(beta_orbs_sorted) return alpha_orbs, beta_orbs_sorted mo_coeff = mymf.mo_coeff mol = mymf.mol alpha_irreps = compute_mo_irreps(mol, mo_coeff[0]) beta_irreps = compute_mo_irreps(mol, mo_coeff[1]) print(alpha_irreps) print(beta_irreps) assert mo_coeff[0].dtype == np.double and mo_coeff[1].dtype == np.double mo_coeff_a, mo_coeff_b = align_beta_orbitals_to_alpha(mol, mo_coeff) mymf.mo_coeff = (mo_coeff_a, mo_coeff_b) alpha_irreps = compute_mo_irreps(mol, mo_coeff_a) beta_irreps = compute_mo_irreps(mol, mo_coeff_b) print(alpha_irreps) print(beta_irreps) from pyscf import cc mycc = cc.UCCSD(mymf, frozen=1) orbsym_full = getattr(mo_coeff_a, 'orbsym', None) print(orbsym_full) nuc = mymf.energy_nuc() nmo = mo_coeff_a.shape[0] h1e_a = reduce(np.dot, (mo_coeff_a.T, mymf.get_hcore(), mo_coeff_a)) h1e_b = reduce(np.dot, (mo_coeff_b.T, mymf.get_hcore(), mo_coeff_b)) eri_a = ao2mo.restore(1,ao2mo.incore.general(mymf._eri,(mo_coeff_a, mo_coeff_a, mo_coeff_a, mo_coeff_a),compact=False),nmo) eri_b = ao2mo.restore(1,ao2mo.incore.general(mymf._eri,(mo_coeff_b, mo_coeff_b, mo_coeff_b, mo_coeff_b),compact=False),nmo) eri_ab = ao2mo.restore(1,ao2mo.incore.general(mymf._eri,(mo_coeff_a, mo_coeff_a, mo_coeff_b, mo_coeff_b),compact=False),nmo) active = get_frozen_mask(mycc) active_in = active mo_occ_obj = getattr(getattr(mycc, '_scf', None) or getattr(mycc, 'mf', None) or mycc, 'mo_occ', None) if isinstance(mo_occ_obj, np.ndarray) and mo_occ_obj.ndim == 2 and mo_occ_obj.shape[0] == 2: mo_occ_a = mo_occ_obj[0] mo_occ_b = mo_occ_obj[1] elif isinstance(mo_occ_obj, (list, tuple)) and len(mo_occ_obj) == 2: mo_occ_a = np.asarray(mo_occ_obj[0]) mo_occ_b = np.asarray(mo_occ_obj[1]) else: mo_occ_a = mo_occ_b = np.asarray(mo_occ_obj) nmo = mo_occ_a.size if isinstance(active_in, (tuple, list)) and len(active_in) == 2: act_a = np.asarray(active_in[0], dtype=bool) act_b = np.asarray(active_in[1], dtype=bool) else: act = np.asarray(active_in, dtype=bool) if act.size == 2 * nmo: act_a = act[:nmo].copy() act_b = act[nmo:].copy() elif act.size == nmo: act_a = act_b = act.copy() else: raise ValueError(f"Unexpected active length {act.size}; expected {nmo} or {2*nmo}") shared = np.asarray(act_a, dtype=bool) & np.asarray(act_b, dtype=bool) act_a[:] = shared act_b[:] = shared active_full = np.empty(2 * nmo, dtype=bool) active_full[:nmo] = act_a active_full[nmo:] = act_b active = active_full frozen_core = np.zeros_like(active, dtype=np.bool_) nocc_full = mol.nelectron // 2 frozen_core[:nocc_full] = ~active[:nocc_full] nocc_a = int(np.count_nonzero(np.asarray(mo_occ_a) > 0.5)) nocc_b = int(np.count_nonzero(np.asarray(mo_occ_b) > 0.5)) frozen_core_a = np.zeros_like(act_a, dtype=bool) frozen_core_b = np.zeros_like(act_b, dtype=bool) frozen_core_a[:nocc_a] = ~act_a[:nocc_a] frozen_core_b[:nocc_b] = ~act_b[:nocc_b] h1e_a, h1e_b, eri_a, eri_b, eri_ab, constant = freezeCore(h1e_a, h1e_b, eri_a, eri_b, eri_ab, frozen_core=(frozen_core_a, frozen_core_b), active=(act_a, act_b)) nmo_active = h1e_a.shape[0] if orbsym_full is None: orbsym_active = None else: orbsym_arr = np.asarray(orbsym_full, dtype=int) valence_idx = np.where(act_a)[0] orbsym_active = [int(x) for x in orbsym_arr[valence_idx]] if orbsym_active is not None: if len(orbsym_active) != nmo_active: raise RuntimeError(f"Length mismatch: orbsym_active has length {len(orbsym_active)} " f"but number of active orbitals is {nmo_active}.") filename = 'fort.55' nelec_full = mol.nelectron spin = mol.spin nalpha_full = (nelec_full + spin) // 2 nbeta_full = (nelec_full - spin) // 2 nocc_full = mol.nelectron // 2 nfrozen_core = int(np.count_nonzero(frozen_core[:nocc_full])) nalpha_active = nalpha_full - nfrozen_core nbeta_active = nbeta_full - nfrozen_core nelec_active = (nalpha_active, nbeta_active) DEFAULT_FLOAT_FORMAT = getattr(__config__, 'fcidump_float_format', ' %.16g') TOL = getattr(__config__, 'fcidump_write_tol', 1e-15) def write_hcore_uhf(fout, h1e_a, h1e_b, nmo, tol=TOL, float_format=DEFAULT_FLOAT_FORMAT): h1e_a = h1e_a.reshape(nmo,nmo) h1e_b = h1e_b.reshape(nmo,nmo) indx = [i+1 for i in range(nmo)] output_format = float_format + ' %5d %5d 0 0\n' for i in range(nmo): for j in range(i, nmo): if abs(h1e_a[i,j]) > TOL: fout.write(output_format % (h1e_a[i,j], indx[i], indx[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') for i in range(nmo): for j in range(i, nmo): if abs(h1e_b[i,j]) > TOL: fout.write(output_format % (h1e_b[i,j], indx[i], indx[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') print("DEBUG: eri shapes / dtypes") print("eri_a.shape =", np.asarray(eri_a).shape, "dtype=", np.asarray(eri_a).dtype) print("eri_b.shape =", np.asarray(eri_b).shape, "dtype=", np.asarray(eri_b).dtype) print("eri_ab.shape=", np.asarray(eri_ab).shape, "dtype=", np.asarray(eri_ab).dtype) def write_eri_uhf(fout, eri_a, eri_b, eri_ab, nmo, tol=TOL, float_format=DEFAULT_FLOAT_FORMAT): eri_a = np.asarray(eri_a) eri_b = np.asarray(eri_b) eri_ab = np.asarray(eri_ab) npair = nmo * (nmo + 1) // 2 output_format = float_format + ' %5d %5d %5d %5d\n' indx = [i + 1 for i in range(nmo)] def pair_index(i, j): return i * (i + 1) // 2 + j if eri_a.ndim == 2 and eri_b.ndim == 2 and eri_ab.ndim == 2: assert eri_a.shape == (npair, npair) and eri_b.shape == (npair, npair) and eri_ab.shape == (npair, npair) kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_a[ij, kl]) > tol: fout.write(output_format % (eri_a[ij, kl], indx[i], indx[j], indx[k], indx[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_b[ij, kl]) > tol: fout.write(output_format % (eri_b[ij, kl], indx[i], indx[j], indx[k], indx[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') ij = 0 for j in range(nmo): for i in range(0, j+1): kl = 0 for k in range(nmo): for l in range(0, k+1): if abs(eri_ab[ij, kl]) > tol: fout.write(output_format % (eri_ab[ij, kl], indx[i], indx[j], indx[k], indx[l])) kl += 1 ij += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') return # CASE B: full 4D arrays if eri_a.ndim == 4 and eri_b.ndim == 4 and eri_ab.ndim == 4: for i in range(nmo): for j in range(0, i + 1): ij_idx = pair_index(i, j) for k in range(nmo): for l in range(0, k + 1): kl_idx = pair_index(k, l) if ij_idx >= kl_idx: val = eri_a[i, j, k, l] if abs(val) > tol: fout.write(output_format % (val, indx[i], indx[j], indx[k], indx[l])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') for i in range(nmo): for j in range(0, i + 1): ij_idx = pair_index(i, j) for k in range(nmo): for l in range(0, k + 1): kl_idx = pair_index(k, l) if ij_idx >= kl_idx: val = eri_b[i, j, k, l] if abs(val) > tol: fout.write(output_format % (val, indx[i], indx[j], indx[k], indx[l])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') for i in range(nmo): for j in range(0, i + 1): for k in range(nmo): for l in range(0, k + 1): val = eri_ab[i, j, k, l] if abs(val) > tol: fout.write(output_format % (val, indx[i], indx[j], indx[k], indx[l])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') return raise RuntimeError(f"Unsupported ERI shapes: eri_a {eri_a.shape}, eri_b {eri_b.shape}, eri_ab {eri_ab.shape}") def write_head(fout, nmo, nelec, ms=0, orbsym=None): is_uhf = isinstance(nelec, (list, tuple)) and len(nelec) == 2 and nelec[0] != nelec[1] if not isinstance(nelec, (int, np.number)): ms = abs(nelec[0] - nelec[1]) nelec = nelec[0] + nelec[1] fout.write(' &FCI NORB=%4d,NELEC=%2d,MS2=%d,\n' % (nmo, nelec, ms)) if orbsym is not None and len(orbsym) > 0: fout.write(' ORBSYM=%s\n' % ','.join([str(x) for x in orbsym])) else: fout.write(' ORBSYM=%s\n' % ('1,' * nmo)) fout.write(' ISYM=1,\n') if is_uhf: fout.write(' IUHF=1,\n') fout.write(' &END\n') def write_head55(fout, nmo, nelec, ms=0, orbsym=None): if not isinstance(nelec, (int, np.number)): ms = abs(nelec[0] - nelec[1]) nelec = nelec[0] + nelec[1] fout.write(f"{nmo:1d} {nelec:1d}\n") if orbsym is not None and len(orbsym) > 0: orbsym = [x + 1 for x in orbsym] fout.write(f"{' '.join([str(x) for x in orbsym])}\n") else: fout.write(f"{' 1' * nmo}\n") fout.write(' 150000\n') def from_integrals_uhf(filename, h1e_a, h1e_b, eri_a, eri_b, eri_ab, nmo, nelec, nuc=0, ms=0, orbsym=None, tol=TOL, float_format=DEFAULT_FLOAT_FORMAT): with open(filename, 'w') as fout: if filename == 'fort.55': write_head55(fout, nmo, nelec, ms, orbsym) else: write_head(fout, nmo, nelec, ms, orbsym) write_eri_uhf(fout, eri_a, eri_b, eri_ab, nmo, tol=tol, float_format=float_format) write_hcore_uhf(fout, h1e_a, h1e_b, nmo, tol=tol, float_format=float_format) output_format = float_format + ' 0 0 0 0\n' fout.write(output_format % nuc) from_integrals_uhf(filename, h1e_a, h1e_b, eri_a, eri_b, eri_ab, nmo_active, nelec_active, nuc + constant, 0, orbsym_active, tol=1e-18, float_format='% 0.20E') if __name__ == '__main__': main()

The first part is the freezeCore() function which is responsible for calculating the frozen core energy and updating the Hamiltonians accordingly. Next is a test for atom='O', and the rest of the script is the function for writing the integrals to a fort.55 file.

Testing

I calculated CCSD with CFOUR-MRCC (SCF and integrals calculated in CFOUR, and CCSD calculated in MRCC), MRCC-MRCC, and PySCF-MRCC.

All three CCSD energies matched for (N/STO-3G), (N/aVDZ), (N/6-31G), and (O/STO-3G), but not for (O/6-31G), and (O/aVDZ).

(O)/6-31G UHF CCSD MRCC-MRCC : -74.780 309 895 517 -74.838 166 288 782 CFOUR-MRCC : -74.780 309 895 517 -74.838 166 288 782 PySCF-MRCC : -74.780 309 895 517 -74.837 845 006 946

Debugging

In my previous work on generating the fort.55 from a UHF reference in PySCF, I found that I should align the orbitals according to their irreps between alpha/beta before building the eri matrices. However, with the the above implementation of the frozen core approximation, I am finding that there is something wrong with the symmetry in the final generated fort.55 file which is leading to the difference in CCSD energy but I am not able to figure it out. I can get the energies to match by turning symmetry off, but want to use symmetry.

The following (repository) contains the inputs/outputs for the studied cases, and here I will focus on 6-31G as an example:

When I try to use the generated fort.55 with mrcc, I am finding that mrcc is requesting two different values for the`symm` keyword, for instance, it is asking for `symm=2` in the case of oxygen while `symm=5` for nitrogen. Eventually,

the CCSD energy value for nitrogen is correct while it is not for oxygen.

Further Debugging

I was skeptical that there might be something different in the tensors of the two-body Hamiltonian for those two systems as they have different numbers of electrons, especially for the p orbital, I am thinking that the p orbitals have better degeneracy in the case of N where each of 2$p_x$, 2$p_y$, and 2$p_z$ is holding only one electron, while for oxygen the 2$p_x$ is holding one electron for alpha and another for beta which I am thinking makes the p orbital less isotropic in the case of the oxygen atom. I have furher investigated the tensors of the two-body Hamiltonian for the alpha orbital (eri_a) between N and O, where I am finding some differences that could be relevant to the issue. The full tensors are available for (eri_a Oxygen) and (eri_a Nitrogen) but I will just show the first matrices for comparison:

Nitrogen eri_a (sample) [[[[ 0.69909025 0. 0. 0. 0. 0. 0. 0.15240456 ] [ 0. 0.67781933 0. 0. -0.17633027 0. 0. 0. ] [ 0. 0. 0.67781933 0. 0. -0.17633027 0. 0. ] [ 0. 0. 0. 0.67781933 0. 0. -0.17633027 0. ] [ 0. -0.17633027 0. 0. 0.54071259 0. 0. 0. ] [ 0. 0. -0.17633027 0. 0. 0.54071259 0. 0. ] [ 0. 0. 0. -0.17633027 0. 0. 0.54071259 0. ] [ 0.15240456 0. 0. 0. 0. 0. 0. 0.54209725]]]] Oxygen eri_a (sample) [[[[ 0.8077749 0. 0. 0. 0. 0. 0. -0.1773966 ] [ 0. 0.78896728 0. 0. -0.20465148 0. 0. 0. ] [ 0. 0. 0.78896728 0. 0. -0.20465148 0. 0. ] [ 0. 0. 0. 0.7801885 0. 0. -0.20822529 0. ] [ 0. -0.20465148 0. 0. 0.61210763 0. 0. 0. ] [ 0. 0. -0.20465148 0. 0. 0.61210763 0. 0. ] [ 0. 0. 0. -0.20822529 0. 0. 0.6208864 0. ] [-0.1773966 0. 0. 0. 0. 0. 0. 0.61939121]]]]

If we take a closer look at the above values, we will see that for the (N) atom the "three" values above and below the diagonal are equal. Moreover, within the diagonal, (1,1), (2,2), (3,3) are equal to each other. Also (4,4), (5,5) (6,6) are equal to each other. However for (O), you might notice that the third value among those above and below the diagonal is different compared to the other two values. Similar observations could be seen for the diagonal values.

I guess that this difference in symmetry is because of the difference in the number of electrons and how they occupy the corresponding orbitals.

Eventually, if the two are of different symmetries, why do we arrive to the same indices and same number of integrals and the same irreps for orbsym between O and N?

Thank you for making it this far

I know it is a long question, but I would appreciate any indicators or possible modifications that might lead towards a solution to the above problem.

+1 but I think the question needs some work, even after the edits that I made so far. As for the symm values for O and N, they make sense according to the HPQC cheat sheet for quantum chemistry: O can be considered to have unpaired electrons in px and py orbitals, which leads to xy symmetry; and N can be considered to have unpaired electrons in px, py, and pz orbitals, which leads to xyz symmetry. xy=2 and xyz=5 in MRCC's convention. — Nike Dattani
– Nike Dattani, Commented Oct 25 at 18:44
@NikeDattani Thank you for the edits, I agree that we need to discuss further and edit the question. — Jaafar Mehrez
– Jaafar Mehrez, Commented Oct 26 at 1:36

Nike Dattani · Accepted Answer · 2023-02-17 00:18:54Z

3

Can this procedure for multiply augmenting a basis set with the "basis_set_exchange" package be improved?

python,basis-sets,basis-set-exchange,quantum-chemistry,electronic-structure

1) Install the `basis_set_exchange` package

git clone and pip install -e are required because the default version that's installed with pip does not include the basis_set_exchange.manip.geometric_augmentation() function for multiply augmenting a basis set:

git clone https://github.com/MolSSI-BSE/basis_set_exchange.git pip install -e basis_set_exchange

2) Create a basis set "dictionary" for the original basis set

We will multiply augment the aug-cc-pVDZ basis set for the H atom. We can also do it for other atoms by using something like elements=[6,8] or even elements=['C', 8, 'Ne', '16']:

bs_dict = basis_set_exchange.get_basis('aug-cc-pVDZ',elements=1)

3) Add the extra diffuse functions:

The 1 indicates that we're constructing a daug basis set, but we could use 2 and get a taug basis set, or 3 and get a qaug basis set:

daug=basis_set_exchange.manip.geometric_augmentation(bs_dict, 1, use_copy=True, as_component=False, steep=False)

4) Get the new basis set into the format we desire

In this case I'll use the Guassian basis set format:

daug_in_gaussian_format=basis_set_exchange.write_formatted_basis_str(daug, 'gaussian94', header=None)

4) Print the new basis set to a file!

print(daug_in_gaussian_format, file=open('daug-cc-pVDZ.gbs', 'w')

answered Feb 17, 2023 at 0:18

Nike Dattani

38.6k2 gold badges8 silver badges41 bronze badges

$\begingroup$ Hi @Nike, could you please explain the specific aspects you would like to be improved in this procedure? $\endgroup$

Jaafar Mehrez
– Jaafar Mehrez

2024-10-22 06:41:01 +00:00
Commented Oct 22, 2024 at 6:41
$\begingroup$ @JaafarMehrez when I originally did this, it felt like the procedure was too lengthy and complicated to be the intended way to augment a basis set! $\endgroup$

Nike Dattani
– Nike Dattani

2024-10-22 10:26:34 +00:00
Commented Oct 22, 2024 at 10:26
$\begingroup$ After looking at the source file of bse, I think you should be able to achieve the same thing just by passing the arguments to 'get_basis' function: 'basis_set = bse.get_basis('aug-cc-pVDZ',elements=1,augment_diffuse=2,fmt='gaussian94',header=False)', the only argument that we are not able to access directly from get_basis() is 'use_copy' which is by default 'use_copy=False'. $\endgroup$

Jaafar Mehrez
– Jaafar Mehrez

2024-10-24 14:17:52 +00:00
Commented Oct 24, 2024 at 14:17
$\begingroup$ @JaafarMehrez it has been almost 2 years since I made this draft of a question, so none of this is fresh in my mind, but it does look like you've been able to combine some of the steps in the procedure that was presented in the original question, so I think you have a good answer to the question. If you can confirm in your answer that what you are doing is undocumented (it sounds like that's the case, but you can correct me if I'm wrong) and that you found it by digging through the source code, that would be helpful too. I can post the question on the main site once your answer is ready. $\endgroup$

Nike Dattani
– Nike Dattani

2024-10-31 08:05:51 +00:00
Commented Oct 31, 2024 at 8:05
$\begingroup$ If you want, you can make an MMSE.Meta post which can act as a sandbox for answers (and this may be used by others in the future), or if you think it's more appropriate/practical, you can just draft your answer on the main site and save it somewhere until I post the question. Either way, once you have an answer ready to be posted, I can post my question on the main site (among other things, this will limit my question's exposure in the unanswered queue, so it doesn't steal other questions' attention). $\endgroup$

Nike Dattani
– Nike Dattani

2024-10-31 08:07:39 +00:00
Commented Oct 31, 2024 at 8:07
$\begingroup$ I was following the steps you mentioned so I had the source code right in front of me and I was able to see arguments of the function, but after your comment I checked the BSE documentation, the section dealing with augmentation molssi-bse.github.io/basis_set_exchange/augmentation.html is only mentioning 'basis_set_exchange.manip.geometric_augmentation()'. However, if you go to the main user API or the Developer Documentation (molssi-bse.github.io/basis_set_exchange/…), you will find more details about those arguments. $\endgroup$

Jaafar Mehrez
– Jaafar Mehrez

2024-11-01 03:28:04 +00:00
Commented Nov 1, 2024 at 3:28
$\begingroup$ Thus I think you could say it is documented, and my answer wouldn't make much sense. $\endgroup$

Jaafar Mehrez
– Jaafar Mehrez

2024-11-01 03:29:17 +00:00
Commented Nov 1, 2024 at 3:29
$\begingroup$ @JaafarMehrez You can say that it's not mentioned in the main documentation under Augmenting basis sets by extra functions, but it can be found if you look at the "Developer API" documentation. However, your comment suggests to use "basis_set = bse.get_basis", which is not mentioned at all in the Developer API docs, which mention basis_set_exchange.api.get_basis instead. Perhaps you could say that basis_set_exchange.api.get_basis works (if it does), then say that basis_set_exchange.get_basis also works but is undocumented. $\endgroup$

Nike Dattani
– Nike Dattani

2024-11-02 15:52:14 +00:00
Commented Nov 2, 2024 at 15:52
$\begingroup$ Also, while trying to see if the pages that you found existed when I asked this question, I found that this did exist since August 2022, but I don't see the Developer API docs here. $\endgroup$

Nike Dattani
– Nike Dattani

2024-11-02 15:57:49 +00:00
Commented Nov 2, 2024 at 15:57

Add a comment |

Jaafar Mehrez · Accepted Answer · 2024-10-19 13:53:59Z

Generated Fort.55 File in PySCF for MRCC, how to handle symmetry from UHF reference calculation?

Background

I am trying to build an interface between PySCF and MRCC using UHF reference. The basic idea of this interface is to generate 1e, 2e integrals using PySCF, save those integrals with ASCII format in what it is known as fort.55 file in MRCC or FCIDUMP file in PySCF, the format of this file follows the UHF standard designed by Knowles and Handy, in which after the header part, separate 4-index and 2-index integrals are written for alpha-alpha, beta-beta, and alph-beta orbitals, with a line of zeroes separating them. Unfortunately, this format and despite being used in some major quantum chemistry codes like MRCC or CFOUR, is not natively supported in PySCF.

Progress

I have implied this format to PySCF FCIDUMP file while taking care of the header part such that the user can generate and save the designated fort.55 file which can be later used as an input for MRCC in order to perform state-of-the-art post-HF calculations at the level of FCI or CC(n).

Code

The following are details about the script I am using to build the fort.55 file. Suppose that we have defined a Mol object for open-shell system, followed by running a UHF calculation. In the next step, I convert the 2e integrals using 4-fold permutation symmetry for alph-alpha, beta-beta, and alpha-beta orbitals which is saved in corresponding eri_aaaa, eri_bbbb, and eri_aabb, respectively. Moreover, the transformed 1e core Hamiltonian for alpha and beta are also calculated and stored in h_aa and h_bb, respectively.

orbs = mf.mo_coeff nmo = orbs[0].shape[0] eri_aaaa = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[0],orbs[0],orbs[0],orbs[0]), compact=False),nmo) eri_bbbb = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[1],orbs[1],orbs[1],orbs[1]), compact=False),nmo) eri_aabb = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[0],orbs[0],orbs[1],orbs[1]), compact=False),nmo) h_core = mf.get_hcore(mol) h_aa = reduce(numpy.dot, (orbs[0].T, h_core, orbs[0])) h_bb = reduce(numpy.dot, (orbs[1].T, h_core, orbs[1])) nuc = mol.energy_nuc()

In the next section of the code, I try to handle the symmetry of the orbitals in a way that will match the format of fort.55, I also define the range of values for integral indices.

if mol.symmetry: groupname = mol.groupname if groupname in ('SO3', 'Dooh'): logger.info(mol, 'Lower symmetry from %s to D2h', groupname) raise RuntimeError('Lower symmetry from %s to D2h' % groupname) elif groupname == 'Coov': logger.info(mol, 'Lower symmetry from Coov to C2v') raise RuntimeError('''Lower symmetry from Coov to C2v''') orbsym = pyscf.symm.label_orb_symm(mol,mol.irrep_name,mol.symm_orb,orbs[0]) orbsym = numpy.array(orbsym) orbsym = [param.IRREP_ID_TABLE[groupname][i]+1 for i in orbsym] a_inds = [i+1 for i in range(orbs[0].shape[0])] b_inds = [i+1 for i in range(orbs[1].shape[1])] nelec = mol.nelec tol=1e-18

The last part is considered the main part of the script, which is supposed to handle the header of the generated file in a way that will match the fort.55 format, it is supposed to deal with the 4-index, and 2-index integrals in which I am assuming that we are using a 4-fold permutation symmetry. The sections after '4-fold symmetry' are in the sequence of alpha-alpha, beta-beta, alpha-beta 2e integrals, followed by alpha, beta 1e integrals, and finally the nuclear repulsion energy.

with open('fort.55', 'w') as fout: if not isinstance(nelec, (int, numpy.number)): ms = abs(nelec[0] - nelec[1]) nelec = nelec[0] + nelec[1] else: ms=0 fout.write(f"{nmo:1d} {nelec:1d}\n") if orbsym is not None and len(orbsym) > 0: fout.write(f"{' '.join([str(x) for x in orbsym])}\n") else: fout.write(f"{' 1' * nmo}\n") fout.write(' 150000\n') output_format = float_format + ' %5d %5d %5d %5d\n' #4-fold symmetry kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_aaaa[ij,kl]) > tol: fout.write(output_format % (eri_aaaa[ij,kl], a_inds[i], a_inds[j], a_inds[k], a_inds[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_bbbb[ij,kl]) > tol: fout.write(output_format % (eri_bbbb[ij,kl], b_inds[i], b_inds[j], b_inds[k], b_inds[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') ij = 0 for j in range(nmo): for i in range(0, j+1): kl = 0 for k in range(nmo): for l in range(0, k+1): if abs(eri_aabb[ij,kl]) > tol: fout.write(output_format % (eri_aabb[ij,kl], a_inds[i], a_inds[j], b_inds[k], b_inds[l])) kl += 1 ij +=1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') h_aa = h_aa.reshape(nmo,nmo) h_bb = h_bb.reshape(nmo,nmo) output_format = float_format + ' %5d %5d 0 0\n' for i in range(nmo): for j in range(nmo): fout.write(output_format % (h_aa[i,j], a_inds[i], a_inds[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') for i in range(nmo): for j in range(nmo): fout.write(output_format % (h_bb[i,j], b_inds[i], b_inds[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') output_format = float_format + ' 0 0 0 0\n' fout.write(output_format % nuc)

Test & Debugging

In order to test the above method and further arrive at my question, I will consider two open-shell systems, namely Be(+) and Be(-). The former having 3 electrons occupying s-type orbitals, while the latter having 5 electrons with s- and p-type orbitals being occupied. by using the generated fort.55 file as described above with the following input file in MRCC:

basis=STO-XG-EMSL iface=cfour uncontract=off calc=CC(5) core=corr itol=18 scftol=13 cctol=7 ccmaxit=999 scfmaxit=9999 scfiguess=ao scftype=UHF rohftype=semicanonical rest=2 charge=-1 refdet=serialno 1,2 3 symm=0 mult=2 occ=2,0,0,0,0,0,0,1/2,0,0,0,0,0,0,0 geom BE tprint=0.01 verbosity=3

You can see that I have turned off the symmetry 'symm=0', upon which, the calculation will run normally, to confirm that, I compare the energies I get with those generated using standalone MRCC, and I can confirm that the energies are identical in that case for both Be(-) and Be(+).

Those values are for Be(-) using STO-XG mrcc-mrcc: CCSD -14.274723492321 CCSD[T] -14.274740253256 CCSD(T) -14.274739121789 CCSDT -14.274741793166 CCSDT[Q] -14.274743492727 CCSDT(Q)/A -14.274743594299 CCSDT(Q)/B -14.274743590678 CCSDTQ -14.274743599645 CCSDTQ[P] -14.274743599645 CCSDTQ(P)/A -14.274743599645 CCSDTQ(P)/B -14.274743599645 CCSDTQP -14.274743599073 pyscf-mrcc (symmetry is off): CCSD -14.274723492321 CCSD[T] -14.274740253256 CCSD(T) -14.274739121789 CCSDT -14.274741793166 CCSDT[Q] -14.274743492727 CCSDT(Q)/A -14.274743594299 CCSDT(Q)/B -14.274743590678 CCSDTQ -14.274743599645 CCSDTQ[P] -14.274743599645 CCSDTQ(P)/A -14.274743599645 CCSDTQ(P)/B -14.274743599645 CCSDTQP -14.274743599073

The Problem

Running the above calculation without using symmetry is affordable in terms of computational power and time for small systems with small basis sets, however, my attempt for such interface is to perform higher-order coupled cluster calculations such as CCSDT(Q), CCSDTQ, CCSDTQ(P), CCSDTQP, etc. with large basis sets such as aCV9Z. Thus, symmetry shall be applied as well.

I will start with Be(+) which has 3 electrons occupying s-type orbitals, using the generated fort.55 file in PySCF with STO-XG basis set, which in this case has the following ORBSYM: 'Ag' 'Ag' 'B1u' 'B2u' 'B3u' 'Ag', I should be using 'symm=1' in the input file of MRCC, I find that energies are identical to the energies generated with standalone MRCC.

Be+ (STO-XG) pyscf-mrcc (symmetry is on): CCSD -14.101017848371 CCSD[T] -14.101018583402 CCSD(T) -14.101018579146 CCSDT -14.101018557319 mrcc-mrcc CCSD -14.101017848371 CCSD[T] -14.101018583402 CCSD(T) -14.101018579146 CCSDT -14.101018557319

On the other hand, in the case of Be(-) which has 5 electrons occupying s- and p-type orbitals, the generated fort.55 file in PySCF would have the following ORBSYM: Ag' 'Ag' 'B3u' 'B1u' 'B2u' 'Ag', I should be using 'symm=8' in this case. However, the final energies are found to be different (lower) than the energies generated with standalone MRCC.

pyscf-mrcc (symmetry is on): CCSD -14.256521081042 CCSD[T] -14.256521081042 CCSD(T) -14.256521081042 CCSDT -14.256521081190 CCSDT[Q] -14.256521081190 CCSDT(Q)/A -14.256521081190 CCSDT(Q)/B -14.256521081190 CCSDTQ -14.256521081240 CCSDTQ[P] -14.256521081240 CCSDTQ(P)/A -14.256521081240 CCSDTQ(P)/B -14.256521081240 CCSDTQP -14.256521081257

You can find full files for the inputs and outputs generated with PySCF and MRCC in the following Input_Output_Files.

The Question

Based on the above description, what is the reason that Be(+) works correctly with symmetry while not for Be(-)? What could possibly be wrong/missing in my code in order to correctly handle the symmetry of spatial orbitals in systems where p-type orbitals are occupied?

Nike Dattani · Accepted Answer · 2022-09-22 20:03:50Z

VibRot in MOLCAS doesn't find all levels

bug, molcas, spectroscopy, vibrations, software-assistance

The (7,7) isotopologue of the b-state of Li₂ supports over 90 vibrational levels, as you can see from Table IV of my 2015 paper, or lines 506 to 607 of the output of my LEVEL program (the corresponding input can be found on lines 365 to 467 of this file).

Using the VibRot program in OpenMOLCAS, I was only able to find about 80 vibrational levels (see here for an output file that contains a printout of the input file at the top). Using the Scale keyword didn't change things. The range (from 0.3 to 3000 in a.u.) of internuclear distances for numerical integration already has an extremely low lower bound and extremely high upper bound, and the Step value (Step = 0.000000001 a.u.) for energies is already extremely low. Changing the Range or the Step values also didn't improve things. I'll be more specific:

Increasing the lower bound of the integration range from 0.3 to 0.4 (ironically, since an increase in the lower bound should make the numerical solution less accurate) gave me 2 more vibrational levels, but increasing it further to 0.5 or 0.6 didn't give me more levels.
Decreasing the upper bound of the integration range from 3000 to 1000 gave me fewer levels, but increasing it to 4000 caused the program to crash prematurely and making it 3500 also gave me fewer levels. I could try 2000 and other values but the calculations take extremely long (for a rovibrational solver) and this type of trial-and-error work is unusual for trying to find over a dozen missing levels (perhaps I need to explicitly incorporate the C₃/r³ long-range tail of the potential like LEVEL does?).
Strangely, increasing the Grid value usually made the numerical solution worse (leading to more convergence problems and/or fewer vibrational energies found). For example, the above-linked output file shows that I used Grid = 470 and when using 500 I got marginally more levels, but when using Grid = 510 the program found fewer levels again.
Decreasing Step surprisingly does reduce the number of levels (marginally) but increasing it would make the cost outrageously long for the standards of a rovibrational level solver (the runtime is already extremely long with the Step value in the above calculation).

Is it possible with VibRot to get a number of vibrational levels closer to what the above-linked LEVEL output gave me (which was 98 vibrational levels including $v=0$)?

Nike Dattani · Accepted Answer · 2023-03-21 17:46:02Z

What are the seminal papers for ERIs?

1950 Boys

Based on the Rys quadrature:

1976,1983 Dupuis, Rys, and King x2 (Rys quadrature)
1978 Pople and Hehre (uses Rys quadrature, good for highly contracted s and p functions)
1986, 1988 Obara and Saika x 3 (reduce FLOPS by recurrence relation)
1988, 1989, 1990, 1991 Gill, Head-Gordon and Pople x 4 (reduce FLOPS, contraction scheme, "horizontal transfer equations")
1991, 1993 Lindh, Ryu, and Liu x2 (OS and GHP are less efficient than DRK when there's not much contraction)
1991,1996 Ishida x 2 (ACE has "attractive" FLOPS)
2001 Dupuis and Marquez (combination of the above to optimize FLOPS) [implementable! but notation isn't familiar to all]
2004 Valeev libint
2014 Sun (libcint, uses DRK algorithm)
2014 Schwenke
2015 Shizgal
2016 King

Based on the McMurchie-Davidson scheme

1978 McMurchie Davidson (angular momentum transfer)
2022 Neese (SHARK) [implementable!]

Independent/largely-unnoticed

2018 Milovanovic

Parallelization and hardware-related optimizations

1994 Blelloch et al.
2007 Yasuda
2010 Asadschev et al.
2011 Luehr et al.
2012 Asadchev and Gordon
2013 Titove et al.

Reviews

1994 M. W. Gill,Adv. Quantum Chem.1994,25, 141.
2002 R. Lindh, in Encylopedia Comp Chem (Ed: P. von Rague-Schleyer), JohnWiley & Sons, Chichester
2018 Samu (PhD thesis, notation still complicated) PDF/ertekezes.pdf

Nike Dattani · Accepted Answer · 2022-09-06 23:24:21Z

Who was JK Cashion?

Searching "JK Cashion" in Google seems to result in nothing but papers. I cannot even find his or her first name!

However JK Cashion remains legendary in the field of diatomic spectroscopy due to his seminal papers in the 1960s on numerically solving the Schroedinger equation to get the energies and wavefuncions of vibrational and rotational states of molecules with potentials that are more realistic than the very simple ones for which the problem is solvable analytically.

JK Cashion was mentioned in the Nobel Prize lecture of John Polyani (see this PDF). Polanyi's official website also mentions JK Cashion, saying that Cashion was Polanyi's graduate student and that their two-author 1958 paper introduced a method that "promises to provide for the first time information concerning the distribution of vibrational and possibly rotational energy among the products of a three-center reaction" which was then realized about a decade later, after the "work of a substantial group of graduate students". Interestingly, Google Scholar provides a citation to Cashion's PhD thesis from 5 years earlier (and 2 years before Polanyi joined that university, see this PDF). It would appear from his two-author paper with Donald James LeRoy (father of Robert J. Le Roy), published 1 year after the 1953 PhD thesis (see this PDF) that he did a first PhD with DJ LeRoy followed by becoming a grad student of Polanyi. Maybe Google Scholar has erroneously called his Bachelors or Masters level thesis of 1953 a PhD thesis, but in any case I was eventually able to find his 1960 PhD thesis (which also finally gives his full name: John Kenneth Cashion!). Searching "John Kenneth Cashion" basically gives 1 new hit, which is a black-and-white picture of a professor emeritus at University of Wisconsin-Parkside but not much more.

After a 1961 paper with Polanyi, Cashion seems to have joined the "Aeronical Research Laboratories" where he completed a solo-author technical report in 1962, followed by a move to what is now called the Lawrence Berkeley National Lab where in 1963 he published papers by himself and with Nobel Prize winner Dudley Herschbach's then-PhD-student Ricahrd Zare and then a papers with Herschbach himself in 1964 including an empirical potential energy surface for H3 which has been described as the Cashion-Herschbach potential in the title of a paper as recently as 2000.

When Herschbach (and Zare) moved to Harvard, it looks like Cashion came along too. Cashion and Herschbach have a 1964 paper together with Berkeley affiliation and also a 1964 paper together with Harvard affiliation, but then he stopped publishing with them and instead did an intermolecular potential for (H₂)₂ published in 1965 with Nobel Prize winner Van Vleck's Harvard PhD student Roy Gordon, followed by three solo-author papers with Harvard affiliation in 1966 (here, here, and here), seemingly nothing in 1967, then a solo-author 1968 paper with IBM Waston Lab affiliation.

I'm not able to find any 1969 papers by Cashion, but at IBM he published 4 papers with Dean E. Eastman and others from 1970-1971 including three in PRL! But Cashion's career publishing in scientific journals seems to have ended in 1971, Dean Eastman continued publishing until this seemingly final 1988 Science paper and went on to become a very successful executive at IBM and then the Arognne National lab (part of a pattern of success with Cashion's other collaborators: Nobel Prize winners Polanyi and Herschbach, extremely successful professors Zare and Gordon, and high-ranking executives LeRoy and Eastman), but Cashion seems to have fallen off the visible record after 1971.

After referring to some of the original papers (including Cashion's) on solving the Schroedinger equation to get rovibrational energies and wavefunctions of diatomic molecules, and not being able to find anything (not even a first name!) in a Google search, I felt like asking "who was JK Cashion?" here. In my requisite research prior to posting, I was able to trace his life from Toronto to Berkeley to Harvard to IBM, who he worked with, and when he published (almost exclusively through Google Scholar and looking at actual papers), but I wonder if someone who knows better might be able to write a biography of him in an answer here?

You'll find more references to "Ken Cashion, including on Dudley Herschbach's list of former group members, as well as in a couple of Herschbach's writings doi.org/10.1002/anie.201610968 doi.org/10.1146/annurev.physchem.51.1.1 Polanyi also referred to him as Ken in some old autobiographical notes that mention Ken was his first grad student. — Anyon
– Anyon, Commented Sep 7, 2022 at 3:23
Here's a version of Cashion's old website that lists his different appointments, and here's a later version that removes most of the information but has a photo and mentions his emeritus status. — Anyon
– Anyon, Commented Sep 7, 2022 at 3:25
And here's a 2014 obituary for a UW history professor that features some personal comments from Cashion. And then there is this travel account (downloads a .doc file!) that mentions some personal anecdotes about Cashion. — Anyon
– Anyon, Commented Sep 7, 2022 at 3:37
@Anyon very interesting! Did you know him as "Ken"? When I searched "JK Cashion" I found almost nothing in terms of biographical information: Polanyi's website briefly mentioned him, but didn't reveal his first or middle names, and it was by digging through papers found on Google Scholar that I was able to find where he worked (and who he worked with), and then eventually after having spent a lot of time writing this post, I found that a BibTeX entry from Google Scholar for his PhD thesis gave the full name, but searching "John Cashion" or "John K Cashion" didn't provide much more than JK did. — Nike Dattani
– Nike Dattani, Commented Sep 7, 2022 at 13:54
Well, I don't think I've ever met the man, so probably I first read or heard "Ken Cashion" in some account about the discovery of chemiluminescence. — Anyon
– Anyon, Commented Sep 7, 2022 at 23:54
@Anyon Cool! You had an advantage over me in that you'd heard or seen him as "Ken" rather than "JK" or "John". I looked up "Ken Cashion" on Google just now and still it wasn't until the 20th hit (last hit on the 2nd page; i.e. I had to click "next" once, and still scroll to the bottom) that his University of Wisconsin page came up (the first 19 of them for me didn't seem to be about him). That page also didn't have any biographical information, so your searching was still impressive! Interestingly the Polanyi website linked in my post only mentioned "JK Cashion" once, but yours has Ken 3 times — Nike Dattani
– Nike Dattani, Commented Sep 8, 2022 at 1:24

Nike Dattani · Accepted Answer · 2024-11-02 16:15:53Z

Why does our PySCF-MRCC interface work for B++ but not for Li?

Background

My colleague and I have built an interface between PySCF and MRCC which seems to work all the time for RHF references, but only sometimes for UHF references. We use PySCF to calculate the integrals and print them in FCIDUMP format, then we convert the FCIDUMP file into MRCC's fort.55 format.

The format of the FCIDUMP follows the UHF standard designed by Knowles and Handy, in which after the header part, separate 4-index and 2-index integrals are written for alpha-alpha, beta-beta, and alph-beta orbitals, with a line of zeroes separating them. Unfortunately, despite this format being used in MOLPRO, and being compatible with the fort.55 file in some major quantum chemistry codes like MRCC or CFOUR, it is not natively supported in PySCF.

The Problem

We considered two open-shell systems, Li and B(+2). Both systems have 3 electrons occupying s-type orbitals. For B(+2), the coupled cluster energies obtained from my PySCF-MRCC interface are identical to the ones obtained using MRCC-MRCC (which is how we will label "standalone MRCC" in which the integrals, SCF and AO->MO transformation are calculated using MRCC, and the post-SCF calculations are also done by MRCC).

B[+2] (6-31G) (symmetry is on) mrcc-mrcc CCSD -23.372809509970 CCSD(T) -23.372810050030 CCSDT -23.372810049178 pyscf-mrcc CCSD -23.372809509970 CCSD(T) -23.372810050030 CCSDT -23.372810049178

But for Li, this is what I get:

Li (6-31G) (symmetry is on) mrcc-mrcc UHF. <please add this to show that it's the same> CCSD -7.431553874113 CCSD(T) -7.431554248616 CCSDT -7.431554228106 pysc-mrcc UHF <please add this to show that it's the same> CCSD -7.431274408997 CCSD(T) -7.431274447587 CCSDT -7.431274443531

Upon some further debugging, I am finding that the generated fort.55 in the case B(+2) is similar to the one generated in MRCC in terms of having the same ORBSYM and the same indices for the integrals. However, for the case of Li, the generated fort.55 is having similar ORBSYM with that generated with MRCC but the indices and the number of integrals are not the same. You can find full files for the inputs and outputs generated with PySCF and MRCC in the following Input/Output Files.

I will point out that the energies actually do match if I turn off symmetry, but then the RAM and CPU time are significantly greater, which means that we won't be able to do very-high-order coupled cluster calculations with big basis sets.

Why is the code described below working for B++ but not for Li?

The code

Suppose that we have defined a Mol object for an open-shell system, followed by running a UHF calculation. In the next step, we convert the 2e integrals using 4-fold permutation symmetry for alpha-alpha, beta-beta, and alpha-beta orbitals which is saved in corresponding eri_aaaa, eri_bbbb, and eri_aabb, respectively. Moreover, the transformed 1e core Hamiltonian for alpha and beta are also calculated and stored in h_aa and h_bb, respectively.

orbs = mf.mo_coeff nmo = orbs[0].shape[0] eri_aaaa = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[0],orbs[0],orbs[0],orbs[0]), compact=False),nmo) eri_bbbb = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[1],orbs[1],orbs[1],orbs[1]), compact=False),nmo) eri_aabb = pyscf.ao2mo.restore(4,pyscf.ao2mo.incore.general(mf._eri, (orbs[0],orbs[0],orbs[1],orbs[1]), compact=False),nmo) h_core = mf.get_hcore(mol) h_aa = reduce(numpy.dot, (orbs[0].T, h_core, orbs[0])) h_bb = reduce(numpy.dot, (orbs[1].T, h_core, orbs[1])) nuc = mol.energy_nuc()

In the next section of the code, we try to handle the symmetry of the orbitals in a way that will match the format of fort.55. We also define the range of values for integral indices:

if mol.symmetry: groupname = mol.groupname if groupname in ('SO3', 'Dooh'): logger.info(mol, 'Lower symmetry from %s to D2h', groupname) raise RuntimeError('Lower symmetry from %s to D2h' % groupname) elif groupname == 'Coov': logger.info(mol, 'Lower symmetry from Coov to C2v') raise RuntimeError('''Lower symmetry from Coov to C2v''') orbsym = pyscf.symm.label_orb_symm(mol,mol.irrep_name,mol.symm_orb,orbs[0]) orbsym = numpy.array(orbsym) orbsym = [param.IRREP_ID_TABLE[groupname][i]+1 for i in orbsym] a_inds = [i+1 for i in range(orbs[0].shape[0])] b_inds = [i+1 for i in range(orbs[1].shape[1])] nelec = mol.nelec tol=1e-18

The last part is considered the main part of the script, which is supposed to handle the header of the generated file in a way that will match the fort.55 format. It is supposed to deal with the 4-index, and 2-index integrals in which we are assuming that we are using a 4-fold permutation symmetry. The sections after '4-fold symmetry' are in the sequence of alpha-alpha, beta-beta, alpha-beta 2e integrals, followed by alpha, beta 1e integrals, and finally the nuclear repulsion energy.

with open('fort.55', 'w') as fout: if not isinstance(nelec, (int, numpy.number)): ms = abs(nelec[0] - nelec[1]) nelec = nelec[0] + nelec[1] else: ms=0 fout.write(f"{nmo:1d} {nelec:1d}\n") if orbsym is not None and len(orbsym) > 0: fout.write(f"{' '.join([str(x) for x in orbsym])}\n") else: fout.write(f"{' 1' * nmo}\n") fout.write(' 150000\n') output_format = float_format + ' %5d %5d %5d %5d\n' #4-fold symmetry kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_aaaa[ij,kl]) > tol: fout.write(output_format % (eri_aaaa[ij,kl], a_inds[i], a_inds[j], a_inds[k], a_inds[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') kl = 0 for l in range(nmo): for k in range(0, l+1): ij = 0 for i in range(0, nmo): for j in range(0, i+1): if i >= k: if abs(eri_bbbb[ij,kl]) > tol: fout.write(output_format % (eri_bbbb[ij,kl], b_inds[i], b_inds[j], b_inds[k], b_inds[l])) ij += 1 kl += 1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') ij = 0 for j in range(nmo): for i in range(0, j+1): kl = 0 for k in range(nmo): for l in range(0, k+1): if abs(eri_aabb[ij,kl]) > tol: fout.write(output_format % (eri_aabb[ij,kl], a_inds[i], a_inds[j], b_inds[k], b_inds[l])) kl += 1 ij +=1 fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') h_aa = h_aa.reshape(nmo,nmo) h_bb = h_bb.reshape(nmo,nmo) output_format = float_format + ' %5d %5d 0 0\n' for i in range(nmo): for j in range(nmo): fout.write(output_format % (h_aa[i,j], a_inds[i], a_inds[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') for i in range(nmo): for j in range(nmo): fout.write(output_format % (h_bb[i,j], b_inds[i], b_inds[j])) fout.write(' 0.00000000000000000000E+00' + ' 0 0 0 0\n') output_format = float_format + ' 0 0 0 0\n' fout.write(output_format % nuc)

AbdulMuhaymin · Accepted Answer · 2025-04-03 09:28:15Z

Delta SCF for periodic system

Following up on @Schoblerone's comment in this answer to How to obtain a defect configuration coordinate diagram using first-principles calculation?:

After some reading, I think Δ SCF procedure doesn't work for periodic system. I tried manually fixing the occupation to simulate an excited (promoted 1 electron in the highest occupied defect level to a higher level) state but self-consistency was not reached. Is it possible to obtain this within the realm of DFT by maybe making the system charged and then relaxing the coordinates?

I'd very interested how you came to that conclusion. Regarding your calculation: ΔSCF often does not work out of the box as easily as regular DFT calculations.Can you describe your system further? Especially relevant: are you exciting into a degenerate level? Unless you have a solid case to make that the defect is excited by some kind of charge switching, I don't believe adding an electron is a good idea.

References that says delta SCF is not possible for periodic system:

Explanation on why it should not work:

References where they have used delta SCF for periodic system and it worked:

Question: Why is the discrepancy

Why do the PBE pseudopotentials for Zn and S from pseudo-dojo give such bad lattice constant (~10%) in siesta but it's reasonable for the PBEsol(~1%) or even LDA pseudopotentials(~3%)?

How to generate my own pseudopotential in psml format for siesta 5.2?

Nike Dattani · Accepted Answer · 2025-04-03 12:55:47Z

I now tried to post the following question on MMSE but the "Review your question" button doesn't seem to work on this old version of Safari for the tablet:

Have I found a way to restart perturbative coupled cluster calculations in MRCC?

Preamble

More than 6 years ago I started the thread "Restarts during perturbative corrections" on the MRCC forum. Mihaly Kallay, the lead author of MRCC wrote:

"Dear Nike,
Unfortunately, it is not possible to restart the perturbative corrections."

Current problem

We are dealing with a calculation that involves 4 spin cases, in which even with 64 cores we are likely not to be able to finish the calculation within the maximum wall time limit set by the administrator of our national supercomputer centre. 64 cores is the most that we have available for the nodes with this amount of RAM.

We can easily finish 3 of the spin cases, but not all 4.

Proposed solution

If the job crashes after completing spin cases 1, 2, and 3, and during the calculation for spin case 4, then can we not in theory start another CCSDT(Q) calculation with rest=1 (including the fort.16 file), and tell the program to only do spin case 4?

If so, then in MINP we can also provide the cumulative energy from all of the completed spin cases, and pert.f can add the energy for spin case 4 to this cumulative sum.

Question

Would this not mean that we can "restart" the perturbative CCSDT(Q) calculation as long as we add the proposed adjustments to the code?

Nike Dattani · Accepted Answer · 2025-09-28 01:24:35Z

https://github.com/HPQC-LABS/QCBugs/issues/27. Also maybe the restart for the RHF case that was working before.

The MMSE Sandbox

10 Answers 10

Why does our PySCF-MRCC interface work for N but not for O?

Introduction

Testing

Debugging

Further Debugging

Thank you for making it this far

Can this procedure for multiply augmenting a basis set with the "basis_set_exchange" package be improved?

1) Install the basis_set_exchange package

2) Create a basis set "dictionary" for the original basis set

3) Add the extra diffuse functions:

4) Get the new basis set into the format we desire

4) Print the new basis set to a file!

Generated Fort.55 File in PySCF for MRCC, how to handle symmetry from UHF reference calculation?

Background

Progress

Code

Test & Debugging

The Problem

The Question

VibRot in MOLCAS doesn't find all levels

What are the seminal papers for ERIs?

Based on the Rys quadrature:

Based on the McMurchie-Davidson scheme

Independent/largely-unnoticed

Parallelization and hardware-related optimizations

Reviews

Who was JK Cashion?

Why does our PySCF-MRCC interface work for B++ but not for Li?

Background

The Problem

Why is the code described below working for B++ but not for Li?

The code

Delta SCF for periodic system

References that says delta SCF is not possible for periodic system:

Explanation on why it should not work:

References where they have used delta SCF for periodic system and it worked:

Question: Why is the discrepancy

Why do the PBE pseudopotentials for Zn and S from pseudo-dojo give such bad lattice constant (~10%) in siesta but it's reasonable for the PBEsol(~1%) or even LDA pseudopotentials(~3%)?

How to generate my own pseudopotential in psml format for siesta 5.2?

Have I found a way to restart perturbative coupled cluster calculations in MRCC?

Preamble

Current problem

Proposed solution

Question

You must log in to answer this question.

Linked

Related

Hot Network Questions

1) Install the `basis_set_exchange` package