Average-of-configuration open-shell Hartree-Fock

This is a short introduction to the theory behind average-of-configuration open-shell Hartree-Fock as implemented in DIRAC. For a more complete description the reader may consult chapter 3 of the PhD thesis of Jørn Thyssen [Thyssen2004].

It should first be noted that there is no restricted open-shell Hartree-Fock (ROHF) code in DIRAC. The reason is that spin-orbit interaction couples spin and spatial degrees of freedom. This makes the formalism much more complicated, since one cannot exploit spin symmetry alone to fix the expansion coefficients in the reference configuration state function (CSF) which serves as the trial function.

Instead of optimizing the energy for a single open-shell state, we shall optimize the energy for a limited set of open-shell states.

Energy expression

Suppose that we have a set of $N_{d e t}$ Slater determinants, constituting our N-particle basis. We next construct and diagonalize a CI matrix in this basis. This gives $N_{d e t}$ solutions of the form

| Ψ_{I} ⟩ = \sum_{P = 1}^{N_{d e t}} | Φ_{P} ⟩ C_{P I}

We will now find the set of orbitals which minimizes the average energy

E_{a v} = \frac{1}{N_{d e t}} \sum_{I = 1}^{N_{d e t}} ⟨ Ψ_{I} | H | Ψ_{I} ⟩

Inserting the expansion of the solutions in terms of Slater determinants and using the fact that the expansion coefficients $C_{P I}$ are elements of a unitary matrix we obtain

E_{a v} = \frac{1}{N_{d e t}} \sum_{P = 1}^{N_{d e t}} \sum_{Q = 1}^{N_{d e t}} ⟨ Φ_{P} | H | Φ_{Q} ⟩ \sum_{I = 1}^{N_{d e t}} C_{P I}^{*} C_{Q I} = \frac{1}{N_{d e t}} \sum_{P = 1}^{N_{d e t}} ⟨ Φ_{P} | H | Φ_{P} ⟩

showing that the average can also be taken over the N-particle basis itself.
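
The invariance of the average can be checked numerically. The following is a minimal sketch, assuming a random Hermitian matrix as a stand-in for the CI matrix $⟨ Φ_{P} | H | Φ_{Q} ⟩$; the matrix size and the variable names are illustrative and not taken from DIRAC.

```python
import numpy as np

rng = np.random.default_rng(0)
n_det = 6  # hypothetical number of Slater determinants

# Random Hermitian stand-in for the CI matrix H_PQ = <Phi_P|H|Phi_Q>
A = rng.normal(size=(n_det, n_det)) + 1j * rng.normal(size=(n_det, n_det))
H = 0.5 * (A + A.conj().T)

# Diagonalize: column I of C holds the coefficients C_PI of solution Psi_I
energies, C = np.linalg.eigh(H)

# Average over the CI solutions versus average over the determinants
avg_over_states = energies.sum() / n_det
avg_over_dets = np.trace(H).real / n_det

print(avg_over_states, avg_over_dets)
```

Both numbers agree to machine precision because the trace of a matrix is invariant under a unitary transformation.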

Introducing open shells and active electrons

The above average energy expression is a functional of the orbitals entering the Slater determinants. We will make a distinction between:

inactive orbitals, present in all Slater determinants, represented by indices $i j k l$
active orbitals, present in some, but not all Slater determinants, represented by indices $u v x y$
secondary orbitals, not present in any Slater determinant, represented by indices $a b c d$

We shall also employ indices $p q r s$ for general orbitals.

In order to generate our N-particle basis for averaging we will distribute the orbitals into a number of shells. Each shell $S$ is specified by $M_{S}$ orbitals and $N_{S}$ electrons. Inactive and secondary shells have $N_{S} = M_{S}$ and $N_{S} = 0$, respectively, whereas active shells have $0 < N_{S} < M_{S}$. We generate our N-electron basis by distributing all active electrons in all possible ways within their respective shells. The total energy can then be written in terms of orbitals rather than Slater determinants as

E_{a v} = \sum_{S} f_{S} \sum_{p \in S} \left( h_{p p} + \frac{1}{2} \sum_{S^{'}} Q_{p p}^{S^{'}} + \frac{1}{2} (a_{S} - 1) Q_{p p}^{S} \right)

where we have introduced

the fractional occupation $f_{S} = \frac{N_{S}}{M_{S}}$ of shell $S$
the coupling coefficient $a_{S}$
- $a_{S} = \frac{M_{S} (N_{S} - 1)}{N_{S} (M_{S} - 1)}$ for $f_{S} \neq 0$
- $a_{S} = 1$ for $f_{S} = 0$
two-electron contributions $Q_{p q}^{S} = f_{S} \sum_{r \in S} ⟨ p r | | q r ⟩$
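
The meaning of the coupling coefficient can be illustrated by brute-force enumeration. The sketch below is an illustration only (the function names are hypothetical and shells with at least two orbitals are assumed): it distributes $N_{S}$ electrons over $M_{S}$ orbitals in all possible ways and confirms that the fraction of determinants in which two given distinct orbitals are simultaneously occupied equals $f_{S}^{2} a_{S}$.

```python
from fractions import Fraction
from itertools import combinations

def fractional_occupation(n_elec, n_orb):
    """f_S = N_S / M_S."""
    return Fraction(n_elec, n_orb)

def coupling_coefficient(n_elec, n_orb):
    """a_S as defined above; assumes n_orb >= 2, and a_S = 1 for an empty shell."""
    if n_elec == 0:
        return Fraction(1)
    return Fraction(n_orb * (n_elec - 1), n_elec * (n_orb - 1))

def pair_probability(n_elec, n_orb):
    """Fraction of determinants in which orbitals 0 and 1 are both occupied."""
    dets = list(combinations(range(n_orb), n_elec))
    both = sum(1 for det in dets if 0 in det and 1 in det)
    return Fraction(both, len(dets))

# Example: three electrons in a shell of six orbitals
n_elec, n_orb = 3, 6
f = fractional_occupation(n_elec, n_orb)
a = coupling_coefficient(n_elec, n_orb)
print(f, a, pair_probability(n_elec, n_orb))  # 1/2 4/5 1/5
```

For a closed shell ($N_{S} = M_{S}$) the same function returns $a_{S} = 1$, and for a single electron in an open shell it returns $a_{S} = 0$, so that the intra-shell two-electron term vanishes.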

Orbital rotations

We will use an exponential parametrization for the rotation of orbitals

{\tilde{ϕ}}_{p} = \sum_{q} ϕ_{q} {[\exp (- κ)]}_{p q}

where $κ$ is an anti-Hermitian matrix to ensure unitarity of the transformation. The exponential parametrization allows for unconstrained optimization (no Lagrange multipliers). It also allows the easy identification of redundant variational parameters, that is, parameters whose variation does not change the energy. In this particular case one finds that rotations within shells are redundant and the corresponding matrix elements $κ_{p q}$ can be set to zero.
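
That an anti-Hermitian $κ$ always yields a unitary transformation can be verified with a few lines of NumPy. This is a minimal sketch with a random $κ$; the matrix exponential is evaluated through the eigendecomposition of the Hermitian matrix $i κ$, which is a choice made here for self-containedness and not a statement about how DIRAC evaluates it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # hypothetical number of orbitals

# Random anti-Hermitian matrix: kappa^dagger = -kappa
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
kappa = A - A.conj().T

# X = i*kappa is Hermitian, and -kappa = i*X, so
# exp(-kappa) = V diag(exp(i*w)) V^dagger with X = V diag(w) V^dagger
X = 1j * kappa
w, V = np.linalg.eigh(X)
U = (V * np.exp(1j * w)) @ V.conj().T

# Deviation of U^dagger U from the identity matrix
unitarity_error = np.linalg.norm(U.conj().T @ U - np.eye(n))
print(unitarity_error)
```

Since every anti-Hermitian $κ$ gives a valid orbital rotation, its independent elements can be varied freely, which is why no orthonormality constraints have to be imposed.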

Gradient elements and off-diagonal blocks of the Fock matrix

The generally non-zero elements of the gradient vector are:

inactive-secondary rotations:

g_{i a} = h_{a i} + \sum_{S^{'}} Q_{a i}^{S^{'}} = F_{a i}^{I}

active-secondary rotations:

g_{u a} = f_{U} [F_{a u}^{I} + (a_{U} - 1) Q_{a u}^{U}]; u \in U

inactive-active rotations:

g_{i u} = (1 - f_{U}) [F_{u i}^{I} + f_{U} α_{U} Q_{u i}^{U}]; u \in U

inter-shell active-active rotations:

g_{u v} = (f_{U} - f_{V}) F_{v u}^{I} - (α_{U} - 1) f_{U} Q_{v u}^{U} - (α_{V} - 1) f_{V} Q_{v u}^{V}; u \in U, v \in V

where we have introduced

α_{S} = \frac{1 - a_{S}}{1 - f_{S}}

These gradient elements allow the definition of the off-diagonal elements of the Fock matrix:

inactive-secondary block:

F_{i a} = F_{i a}^{I}

active-secondary block:

F_{u a} = F_{u a}^{I} + (a_{U} - 1) Q_{u a}^{U}; u \in U

inactive-active block:

F_{i u} = F_{i u}^{I} + f_{U} α_{U} Q_{i u}^{U}; u \in U

inter-shell active-active block $(U \neq V)$:

F_{u v} = \begin{cases} F_{u v}^{I} + \frac{(a_{U} - 1)}{(f_{U} - f_{V})} Q_{u v}^{U} + \frac{(a_{V} - 1)}{(f_{V} - f_{U})} Q_{u v}^{V} & \text{for } f_{U} \neq f_{V} \\ (a_{U} - 1) Q_{u v}^{U} + (a_{V} - 1) Q_{u v}^{V} & \text{for } f_{U} = f_{V} \end{cases}

Diagonal blocks of the Fock matrix

The diagonal blocks of the Fock matrix are a priori not related to gradient elements and there is therefore freedom of choice in their specification. The specific choice will not affect the total energy, but will affect orbital energies as well as the convergence of the AOC HF calculation.

In order to obtain a meaningful definition of the diagonal blocks of the Fock matrix we will consider an extension of Koopmans’ theorem to average-of-configuration Hartree-Fock, that is, we consider the average energy after removal of an electron from a specific shell $T$, using the same orbital set as for the original N-electron system.

The energy difference becomes

E_{a v}^{N} - E_{a v}^{N - 1} = \frac{1}{M_{T}} \sum_{t \in T} [h_{t t} + \sum_{S} Q_{t t}^{S} + (a_{T} - 1) Q_{t t}^{T}]

If we now define the diagonal block of the Fock matrix corresponding to shell $T$ as

F_{p q} = h_{p q} + \sum_{S} Q_{p q}^{S} + (a_{T} - 1) Q_{p q}^{T}

the ionization potential associated with the electron removal becomes

I P = E_{a v}^{N - 1} - E_{a v}^{N} = - \frac{1}{M_{T}} \sum_{t \in T} ε_{t}

In the case of a degenerate shell we simply find

I P = E_{a v}^{N - 1} - E_{a v}^{N} = - ε_{t}, t \in T

identical to the original Koopmans’ theorem, whereas in the general case one obtains an average over the orbital energies of the shell.
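
The energy difference given above follows exactly from the average energy expression, which can be confirmed numerically. The sketch below assumes random numbers for $h_{p p}$ and for the antisymmetrized integrals $⟨ p r | | p r ⟩$, as well as a hypothetical layout of one closed and two open shells; all function and variable names are illustrative.

```python
import numpy as np

def coupling(n_elec, n_orb):
    """Coupling coefficient a_S (a_S = 1 for an empty shell)."""
    return 1.0 if n_elec == 0 else n_orb * (n_elec - 1) / (n_elec * (n_orb - 1))

def q_diag(shells, occ, G, S):
    """Q^S_pp = f_S * sum_{r in S} <pr||pr> for every orbital p."""
    orbs = shells[S]
    return occ[S] / len(orbs) * G[:, orbs].sum(axis=1)

def average_energy(shells, occ, h, G):
    """Average energy written in terms of orbitals and shells."""
    q = [q_diag(shells, occ, G, S) for S in range(len(shells))]
    q_total = sum(q)
    energy = 0.0
    for S, orbs in enumerate(shells):
        f = occ[S] / len(orbs)
        a = coupling(occ[S], len(orbs))
        for p in orbs:
            energy += f * (h[p] + 0.5 * q_total[p] + 0.5 * (a - 1.0) * q[S][p])
    return energy

# Hypothetical system: one closed shell and two open shells
shells = [[0, 1], [2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
occ = [2, 1, 3]
n = 12

rng = np.random.default_rng(2)
h = rng.normal(size=n)            # stand-in for h_pp
B = rng.normal(size=(n, n))
G = B + B.T                       # stand-in for <pr||pr>, symmetric in p and r
np.fill_diagonal(G, 0.0)          # <pp||pp> = 0

# Remove one electron from shell T, keeping the orbitals fixed
T = 2
occ_ion = list(occ)
occ_ion[T] -= 1
lhs = average_energy(shells, occ, h, G) - average_energy(shells, occ_ion, h, G)

# Right-hand side, evaluated with the occupations of the N-electron system
q = [q_diag(shells, occ, G, S) for S in range(len(shells))]
q_total = sum(q)
a_T = coupling(occ[T], len(shells[T]))
rhs = np.mean([h[t] + q_total[t] + (a_T - 1.0) * q[T][t] for t in shells[T]])

print(lhs, rhs)
```

The two printed numbers coincide up to rounding errors, independently of the random integrals chosen.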

Based on these observations we define the diagonal blocks of the AOC Fock matrix as

inactive-inactive block:

F_{i j} = F_{i j}^{I}

secondary-secondary block:

F_{a b} = F_{a b}^{I}

active-active block:

F_{u v} = F_{u v}^{I} + (a_{U} - 1) Q_{u v}^{U}; u, v \in U

These are the definitions employed in DIRAC12 and onwards (and also the definition found in the thesis of Jørn Thyssen [Thyssen2004]). In previous versions the term $(a_{U} - 1) Q_{u v}^{U}$ was missing from the active-active block. Since $Q_{u u}^{U}$ is positive and $a_{U} < 1$ for an open shell, removal of this term tends to shift orbital energies of the open shell upwards.

Convergence problems typically occur when orbital energies of different shells have similar values, such that the selection of occupied orbitals for the construction of the Fock matrix becomes ambiguous. In a closed-shell system this will for instance happen when the HOMO-LUMO gap closes. The definition of the active-active block in pre-DIRAC12 versions (which was in fact an unintended omission) can in some instances lead to improved convergence. More specifically, this happens when the orbitals of an open shell and the closed shell (or another open shell) are almost degenerate. However, such situations are often symptomatic of a wrong partitioning of orbitals into closed and open shells. Furthermore, the definition of the active-active block in pre-DIRAC12 versions tends to close the HOMO-LUMO gap, which may hamper convergence.

Level shift

Whenever orbitals of different shells are nearly degenerate, the recommended strategy is to exploit the freedom in the definition of the diagonal blocks of the Fock matrix and introduce a level shift $λ$, that is

F_{u v}^{U} \to F_{u v}^{U} + λ δ_{u v}

The level shift of secondary (virtual) orbitals is controlled by the keyword .LSHIFT, whereas open shells can be shifted using the keyword .OLEVEL.
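
The effect of such a shift on a diagonal block can be illustrated with a small numerical example. The sketch below uses made-up matrix elements (in hartree) for a two-orbital open-shell block that lies close to the highest closed-shell orbital energy; the numbers and the value of $λ$ are assumptions chosen for illustration and do not correspond to DIRAC defaults.

```python
import numpy as np

# Hypothetical highest closed-shell orbital energy and open-shell diagonal block
eps_closed_max = -0.50
F_open = np.array([[-0.49, 0.01],
                   [0.01, -0.48]])
shift = 0.2  # level shift lambda

# F_uv -> F_uv + lambda * delta_uv within the block
F_open_shifted = F_open + shift * np.eye(2)

eps, vec = np.linalg.eigh(F_open)
eps_shifted, vec_shifted = np.linalg.eigh(F_open_shifted)

# Separation between the open shell and the closed shell before and after the shift
gap_before = eps[0] - eps_closed_max
gap_after = eps_shifted[0] - eps_closed_max
print(gap_before, gap_after)
```

Adding $λ δ_{u v}$ moves all eigenvalues of the block by $λ$ and leaves its eigenvectors unchanged, so the shells are separated in energy without altering the orbitals that diagonalize the block.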

Convergence issues

Open-shell systems tend to be more difficult to converge than closed-shell ones because of the additional orbital classes and the greater number of possible near-degeneracies between them. It is important to understand that DIRAC will generally order orbitals according to energy. Furthermore, DIRAC starts by filling closed-shell orbitals, then open-shell ones. In the case of uranium ($[Rn] 5f^{3} 6d^{1} 7s^{2}$) the closed-shell $7s$ orbitals will have higher orbital energies than the $6d$ open-shell ones, and this may lead to convergence problems. Fortunately, since DIRAC21 convergence of open-shell atoms is unproblematic thanks to atomic supersymmetry, see keyword .KPSELE.