A serious flaw of Mulliken population analysis is its basis set sensitivity. This is well known from the non-relativistic domain. In the relativistic domain a further disadvantage is that the expansion of large and small components in separate basis setz means large and small component populations come separately and it is not easy to see how they are connected. Furthermore, if the large and small component are expanded in scalar basis functions, as is the case of the DIRAC code, then the poulation analysis can not distinguish distributions from different spin-orbit components of atomic orbitals, for instance, \(p_{1/2}\) and \(p_{3/2}\).

Projection analysis eliminates all these inconveniences. It is based on the simple concept of molecular orbitals as linear combination of atomic orbitals and allows analysis of electronic structure in terms of well-defined atomic (or fragment) orbitals.

Projection analysis requires as input a set of pre-calculated orbitals {\(\psi^A_p\)} for the constituent atoms (or fragment) of the molecule. Here \(A\) refers to the atomic center (or fragment) and \(p\) is on orbital index for that center. The molecular orbitals are then expanded as

\[\begin{split}\left|\psi^{MO}_i\right> = \sum_A \sum_{p\in A} \psi^A_p c^A_{pi} + \left|\psi^{pol}_i\right>\end{split}\]

Typically one will select the occupied orbitals of the constituent atoms. They are not guaranteed to fully span the molecular orbitals and therefore the expansion includes their orthogonal complement \(\left|\psi^{pol}_i\right>\), denoted the polarization contribution. The expansion coefficients \(c^A_{pi}\) are found by solving the equation

\[\begin{split}\left<\psi^A_p|\psi^{MO}_i\right> = \sum_B\sum_{q\in B}\left<\psi^A_p|\psi^B_q\right>c^B_{qi}\end{split}\]

obtained by projection.

As a first simple example (albeit showing some subtle points) we shall consider the methane molecule.
The molecular input file `CH4.mol` is

```
INTGRL
methane
cc-pVDZ basis
C 2 A
6. 1
C 0.0000 0.0000 0.0000
LARGE BASIS cc-pVDZ
1. 4
H1 .629889144 .629889144 .629889144
H2 -.629889144 -.629889144 .629889144
H3 .629889144 -.629889144 -.629889144
H4 -.629889144 .629889144 -.629889144
LARGE BASIS cc-pVDZ
FINISH
```

Suppose that we have run a DFT(PBE) calculation using:

```
pam --inp=PBE --mol=CH4 --outcmo
```

where we keep the coefficients and the menu file `PBE.inp` is

```
**DIRAC
.WAVE FUNCTION
**HAMILTONIAN
.DFT
PBE
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
10
*END OF
```

We would now like to analyze the electronic structure of the methane molecule in terms of the orbitals of the constituent atoms. Now, looking in the output we find:

```
Full group is: T(d)
Represented as: D2
```

This means that DIRAC correctly detected \(T_d\) as the molecular point group symmetry. However, since the spatial symmetry of DIRAC is limited to \(D_{2h}\) and subgroups the program will use the \(D_2\) subgroup for the calculation. In this subgroup, as in the parent \(T_d\) opint group, the four hydrogen atoms are related by symmetry. Since DIRAC presently can only do projection analysis in terms of fragments complying to the molecular point group, we have to lower the symmetry if we want to carry out the projection analysis in terms of individual atoms. A simple solution is simply to have DIRAC dump the coeffients in \(C_1\) symmetry. We then simply add:

```
**GENERAL
.ACMOUT
```

to `PBE.inp` and run the calculation as:

`pam --inp=PBE --mol=CH4 --get "DFACMO=ac.CH4"`

where the latter argument means that the file `DFACMO` of coefficients in \(C_1\) (no) symmetry
is saved under the name `ac.CH4`. It is also possible to convert an input `DFCOEF` file to corresponding `DFACMO` by using

```
**DIRAC
.WAVE FUNCTION
**GENERAL
.ACMOUT
**WAVE FUNCTION
*END OF
```

and the command:

`pam --incmo --inp=convert --mol=CH4 --get "DFACMO=ac.CH4"`

However, before turning to the analysis itself there is one subtle point. Looking in the DIRAC output again we find the lines:

```
Original Coordinates
--------------------
6 0.00000000 0.00000000 0.00000000 1
1 1.19031798 1.19031798 1.19031798 1
1 -1.19031798 -1.19031798 1.19031798 1
1 1.19031798 -1.19031798 -1.19031798 1
1 -1.19031798 1.19031798 -1.19031798 1
Symmetry class found: T(d)
Centered and Rotated
--------------------
1 1.19031798 -1.19031798 1.19031798 1
1 -1.19031798 1.19031798 1.19031798 1
1 -1.19031798 -1.19031798 -1.19031798 1
1 1.19031798 1.19031798 -1.19031798 1
6 0.00000000 0.00000000 0.00000000 1
```

showing that DIRAC will (translate and) rotate molecules during the process of symmetry detection.
For the projection analysis we can therefore not use the original coefficient file since we
now will turn off symmetry detection and insist on running without symmetry. The simplest solution
to this complication is to use the xyz-file that we also find in the output (we shall call it `CH4.xyz`)

```
5
C 0.0000000000 0.0000000000 0.0000000000
H 0.6298891440 0.6298891440 -0.6298891440
H 0.6298891440 -0.6298891440 0.6298891440
H -0.6298891440 0.6298891440 0.6298891440
H -0.6298891440 -0.6298891440 -0.6298891440
```

(we have replaced `H4` by `H`)
We now generate the atomic fragments. I recommend calculating the atoms in the highest possible
symmetry and the converting the coefficients to no symmetry. This allows a very precise identification
of the individual atomic orbitals (including \(m_j\) quantum number). We accordingly run:

```
pam --inp=H --mol=H --get "DFACMO=ac.H"
pam --inp=C --mol=C --get "DFACMO=ac.C"
```

with corresponding input files `H.inp`,

```
**DIRAC
.WAVE FUNCTION
.ANALYZE
**GENERAL
.ACMOUT
**HAMILTONIAN
.DFT
PBE
**WAVE FUNCTION
.SCF
*SCF
.OPEN SHELL
1
1/2,0
**ANALYZE
.MULPOP
*END OF
```

`H.mol`,

```
INTGRL
hydrogen
cc-pVDZ basis
C 1 A
1. 1
H 0.0000 0.0000 0.0000
LARGE BASIS cc-pVDZ
FINISH
```

`C.inp`

```
**DIRAC
.WAVE FUNCTION
.ANALYZE
**GENERAL
.ACMOUT
**HAMILTONIAN
.DFT
PBE
**WAVE FUNCTION
.SCF
*SCF
.OPEN SHELL
1
1/2,0
**ANALYZE
.MULPOP
*END OF
```

and `C.mol`

```
INTGRL
carbon atom
cc-pVDZ basis
C 1 A
6. 1
C 0.0000 0.0000 0.0000
LARGE BASIS cc-pVDZ
FINISH
```

We are now ready for the projection analysis itself. We prepare the input file `prj.inp`

```
**DIRAC
.ANALYZE
**HAMILTONIAN
.DFT
PBE
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
10
**ANALYZE
.PROJEC
*PROJEC
.OWNBAS
.VECREF
5
AFCXXX
1
1..5
AFH1XX
1
1
AFH2XX
1
1
AFH3XX
1
1
AFH4XX
1
1
**MOLECULE
*BASIS
.DEFAULT
cc-pVDZ
*SYMMETRY
.NOSYM
*END OF
```

- In the
`*PROJEC`section we specify five fragments identified by the name of their coefficient files. - We also specify
`.OWNBAS`, which means that the fragments have been calculated in their own basis and not the full molecular one. Note also that since we will use a*xyz*- file for this calculation

we have added a `**MOLECULE` section where we specify basis *and* we explicitly turn off symmetry.

As a final preparation we make symbolic links to the needed coefficient files:

```
$ ln -s ac.C AFCXXX
$ ln -s ac.H AFH1XX
$ ln -s ac.H AFH2XX
$ ln -s ac.H AFH3XX
$ ln -s ac.H AFH4XX
$ ln -sf ac.CH4 DFCOEF
```

We now run the projection analysis using:

```
pam --inp=prj --mol=CH4.xyz --incmo --copy="AF*"
```

The first thing to look for in the output is the section:

```
* Total gross contributions:
AFCXXX 6.4301 E - 6.4301 P - 0.0000
AFH1XX 0.8716 E - 0.8716 P - 0.0000
AFH2XX 0.8716 E - 0.8716 P - 0.0000
AFH3XX 0.8716 E - 0.8716 P - 0.0000
AFH4XX 0.8716 E - 0.8716 P - 0.0000
Polarization: 0.0834
```

It shows the gross population (in Mulliken sense, but in terms of the fragment orbitals) of each atom and then, most importantly, the gross population of the polarization contribution. In this example it is small and perfectly acceptable. If it becomes too large the analysis looses meaning because it impies that significant electron density has not been assigned to any fragment. In the case of significant polarization contribution one can either increase the number of atomic orbitals used in the analysis or turn on repolarization.

Having concluded that the polarization contribution is acceptable, we can proceed with the analysis. From the atomic gross population we can already extract atomic charges (keeping in mind that some density is not assigned to any atom). We can see that the carbon atom has a small negative charge of -0.43 atomic units, whereas the hydrogens have positive charges +0.13.

We next turn to this section:

```
* Total reference orbital contributions:
AFCXXX E 1 1.99999 -0.10660951E+02
AFCXXX E 2 1.27874 -0.97828469E+00
AFCXXX E 3 1.05077 -0.65580701E+00
AFCXXX E 4 1.05030 -0.65541446E+00
AFCXXX E 5 1.05030 -0.65541446E+00
AFH1XX E 1 0.87162 -0.23118508E+00
AFH2XX E 1 0.87162 -0.23118508E+00
AFH3XX E 1 0.87162 -0.23118508E+00
AFH4XX E 1 0.87162 -0.23118508E+00
```

which gives the accumulated gross population of each fragment orbital. We can identify
the fragments from the name of the coefficient file and its orbital energy. For instance
the first orbital on the list is clearly carbon *1s* followed by carbon *2s*. Next comes three
orbitals that are almost degenerate. These are cleary the carbon *2p* orbitals, the first one
being \(2p_{1/2}\) followed by the two \(2p_{3/2}\). By furthermore looking into the
output of the carbon atom calculation (note that we asked for Mulliken population analysis in the
input) we find that the first \(2p_{3/2}\) orbital has \(m_j=1/2\) and the
second \(m_j=-3/2\). From these gross population we can deduce the atomic configurations of
the atoms in the molecule. For carbon we find \([He]2s^{1.3}2p^{3.2}\) and for hydrogen
\(1s^{0.9}\).

There is also detailed output for each molecular orbital in the output. For the third molecular orbital we find for instance:

```
* Electronic eigenvalue nr. 3: -0.3411330794652 (Occupation : f = 1.0000)
================================================
Orbital Total Eigenvalue Kramers partner 1 Kramers partner 2
AFCXXX E 3 0.57945 -0.65580701E+00 (-1.0000,-0.0000) ( 0.0000,-0.0000)
AFH1XX E 1 0.31559 -0.23118508E+00 ( 0.0000, 0.5774) (-0.5774,-0.5774)
AFH2XX E 1 0.31559 -0.23118508E+00 ( 0.0000,-0.5774) ( 0.5774,-0.5774)
AFH3XX E 1 0.31559 -0.23118508E+00 ( 0.0000,-0.5774) (-0.5774, 0.5774)
AFH4XX E 1 0.31559 -0.23118508E+00 ( 0.0000, 0.5774) ( 0.5774, 0.5774)
* Gross contributions:
AFCXXX 0.5254 E - 0.5254 P - 0.0000
AFH1XX 0.1165 E - 0.1165 P - 0.0000
AFH2XX 0.1165 E - 0.1165 P - 0.0000
AFH3XX 0.1165 E - 0.1165 P - 0.0000
AFH4XX 0.1165 E - 0.1165 P - 0.0000
Polarization: 0.0085
```

To fully understand the output we have to keep in mind that all calculations are Kramers-restricted such that all orbitals come in pairs. The expansion of the molecular orbitals should therefore rather be written as

\[\begin{split}\left|\psi^{MO}_i\right> = \sum_A \sum_{p\in A} \left(\psi^A_p c^A_{pi} + \psi^A_\overline{p} c^A_{\overline{p}i}\right) + \left|\psi^{pol}_i\right>\end{split}\]

where \(c^A_{pi}\) and \(c^A_{\overline{p}i}\) is the expansion coefficient of Kramers partner 1 and 2, respectively. Let us now define the absolute value

\[\left|c^A_{Pi}\right| = \sqrt{\left|c^A_{pi}\right|^2+\left|c^A_{\overline{p}i}\right|^2}\]

This is the real number reported under the heading `Total`. Under the heading
`Kramers partner 1` is reported the complex number

\[z_1 = \left(Re(z_1),Im(z_1)\right) = \frac{c^A_{pi}}{\left|c^A_{Pi}\right|},\]

and analogously for Kramers partner 2, from which we get the detailed decomposition of the molecular orbital. We find that this molecular orbital is composed of the carbon \(2p_{1/2}\) orbital and the hydrogen \(1s\) orbitals.

We next turn to a molecule with a heavy atom and a more complicated electronic structure, namely uranyl \(UO_2^{2+}\). Here the two oxygens are related by symmetry, and so one would at first consider running the projection analysis with no symmetry, following the procedure outlined in the methane example above. However, we can reduce the symmetry from \(D_{\infty h}\) to \(C_{\infty v}\) by introducing a ghost center. Our molecular input file then looks like

```
DIRAC
uranyl
....
C 3 A
92. 1
U 0.0 0.0 0.00000
LARGE BASIS dyall.v3z
8. 2
O1 0.0 0.0 1.70447861
O2 0.0 0.0 -1.70447861
LARGE BASIS cc-pVTZ
0. 1
Gh 0.0 0.0 10.0
LARGE 0
FINISH
```

Now the oxygen atoms are no longer connected by symmetry and we can carry out the projection analysis in \(C_{\infty v}\) symmetry.

Using the menu file `HF.inp`

```
**DIRAC
.ANALYZE
.WAVE FUNCTION
**INTEGRALS
*READIN
.UNCONT
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
106
**END OF
```

we run 4-component Hartree-Fock:

`pam --mw=120 --inp=HF --mol=UO2 --get "DFCOEF=cf.UO2"`

saving the coefficients. Next we generate atomic reference orbitals. For oxygen this is quite straightforward:

`pam --inp=O --mol=O --get "DFCOEF cf.O"`

using molecule input `O.mol`

```
DIRAC
UO2
cc-pVTZ basis
C 2 A
8. 1
O 0.0 0.0 0.0
LARGE BASIS cc-pVTZ
0. 1
Gh 0.0 0.0 10.0
LARGE 0
FINISH
```

and the menu file `O.inp`

```
**DIRAC
.ANALYZE
.WAVE FUNCTION
**INTEGRALS
*READIN
.UNCONT
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
4
.OPEN SHELL
1
4/6
**ANALYZE
.MULPOP
**END OF
```

For the uranium we must be a bit more careful since the ground state configuration is an open-shell one: \([Rn]5f^36d^17s^2\).
Using the molecular input `U.mol`

```
DIRAC
UO2
cc-pVTZ basis
C 2 A
92. 1
U 0.0 0.0 0.00000
LARGE BASIS dyall.v3z
0. 1
Gh 0.0 0.0 10.0
LARGE 0
FINISH
```

we first calculate the \(U^{3+}\) cation with configuration \([Rn]5f^3\)

```
pam --inp=UIII --mol=U --outcmo
```

which converges easily using the menu file `UIII.inp`

```
**DIRAC
.ANALYZE
.WAVE FUNCTION
**INTEGRALS
*READIN
.UNCONT
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
86
.OPEN SHELL
1
3/14
**ANALYZE
.MULPOP
*MULPOP
.VECPOP
1..oo
**END OF
```

Note that we ask for a Mulliken population analysis of all positive-energy orbitals. This is in order to find the *6d* and *7s* orbitals.
Once identified, we bring them into correct position using reordering and keep them there using static overlap selection as shown in
the menu file `U.inp`

```
**DIRAC
.ANALYZE
.WAVE FUNCTION
**INTEGRALS
*READIN
.UNCONT
**WAVE FUNCTION
.REORDER
1..43,51,44..50,52..56
.SCF
*SCF
.CLOSED SHELL
88
.OPEN SHELL
2
3/14
1/10
.OVLSEL
.NODYNSEL
**ANALYZE
.MULPOP
*MULPOP
.VECPOP
1..oo
**END OF
```

Convergence is now straightforward

`pam --incmo --inp=U --mol=U --get "DFCOEF=cf.U"`

and the correct configuration is confirmed by Mulliken population analysis.

We are now ready to do the projection analysis. We prepare the input file `prj.inp`

```
**DIRAC
.ANALYZE
**INTEGRALS
*READIN
.UNCONT
**WAVE FUNCTION
.SCF
*SCF
.CLOSED SHELL
106
**ANALYZE
.PROJEC
*PROJEC
.OWNBAS
.VECREF
3
AFUXXX
1
1..56
AFO1XX
1
1..5
AFO2XX
1
1..5
**END OF
```

and prepare the coefficient files:

```
$ ln -s cf.O AFO1XX
$ ln -s cf.O AFO2XX
$ ln -s cf.U AFUXXX
$ cp cf.UO2 DFCOEF
```

We run the projection analysis:

```
pam --incmo --inp=prj --mol=UO2lin --copy="AF*" --mw=120
```

As in the methane case above we first look at the polarization contribution:

```
* Total gross contributions:
AFUXXX 89.0635 E - 89.0635 P - 0.0000
AFO1XX 8.3098 E - 8.3098 P - 0.0000
AFO2XX 8.3098 E - 8.3098 P - 0.0000
Polarization: 0.3169
```

We see that it is more sizable than before. Later we shall see that we can eliminate it by polarization of the fragment orbitals. The atomic gross populations tell us that the oxygen atoms has a charge -0.32, whereas the uranium atom has charge +2.94, considerably far from the formal oxidation state +VI.

We next turn to the electron configuration of the atoms in the molecule by looking at the section:

```
* Total reference orbital contributions:
AFUXXX E 1 2.00000 -0.42818128E+04 1s
AFUXXX E 2 2.00000 -0.80663682E+03 2s
AFUXXX E 3 2.00000 -0.77703498E+03 2p-
AFUXXX E 4 2.00000 -0.63578312E+03 2p+
AFUXXX E 5 2.00000 -0.63578312E+03 2p+
AFUXXX E 6 2.00000 -0.20673001E+03 3s
AFUXXX E 7 2.00000 -0.19325143E+03 3p-
AFUXXX E 8 2.00000 -0.16037784E+03 3p+
AFUXXX E 9 2.00000 -0.16037784E+03 3p+
AFUXXX E 10 2.00000 -0.13906994E+03 3d-
AFUXXX E 11 2.00000 -0.13906994E+03 3d-
AFUXXX E 12 2.00000 -0.13242590E+03 3d+
AFUXXX E 13 2.00000 -0.13242590E+03 3d+
AFUXXX E 14 2.00000 -0.13242590E+03 3d+
AFUXXX E 15 2.00000 -0.54355461E+02 4s
AFUXXX E 16 2.00000 -0.48231971E+02 4p-
AFUXXX E 17 2.00000 -0.39554187E+02 4p+
AFUXXX E 18 2.00000 -0.39554187E+02 4p+
AFUXXX E 19 2.00000 -0.29743891E+02 4d-
AFUXXX E 20 2.00000 -0.29743891E+02 4d-
AFUXXX E 21 2.00000 -0.28130201E+02 4d+
AFUXXX E 22 2.00000 -0.28130201E+02 4d+
AFUXXX E 23 2.00000 -0.28130201E+02 4d+
AFUXXX E 24 2.00000 -0.15202067E+02 4f-
AFUXXX E 25 2.00000 -0.15202067E+02 4f-
AFUXXX E 26 2.00000 -0.15202067E+02 4f-
AFUXXX E 27 2.00000 -0.14785575E+02 4f+
AFUXXX E 28 2.00000 -0.14785575E+02 4f+
AFUXXX E 29 2.00000 -0.14785575E+02 4f+
AFUXXX E 30 2.00000 -0.14785575E+02 4f+
AFUXXX E 31 2.00001 -0.12603340E+02 5s
AFUXXX E 32 1.99988 -0.10135890E+02 5p-
AFUXXX E 33 1.99992 -0.80948982E+01 5p+
AFUXXX E 34 1.99950 -0.80948982E+01 5p+
AFUXXX E 35 1.99885 -0.43524477E+01 5d-
AFUXXX E 36 1.99968 -0.43524477E+01 5d-
AFUXXX E 37 1.99973 -0.40407150E+01 5d+
AFUXXX E 38 1.99932 -0.40407150E+01 5d+
AFUXXX E 39 1.99828 -0.40407150E+01 5d+
AFUXXX E 40 1.98041 -0.21392050E+01 6s
AFUXXX E 41 1.93129 -0.13442976E+01 6p-
AFUXXX E 42 1.73850 -0.98481977E+00 6p+ ( 1/2)
AFUXXX E 43 1.96897 -0.98481977E+00 6p+ ( 3/2)
AFUXXX E 44 0.03884 -0.20241412E+00 7s
AFUXXX E 45 0.82231 -0.34596371E+00 5f- ( 1/2)
AFUXXX E 46 0.15542 -0.34596371E+00 5f- (-3/2)
AFUXXX E 47 0.00000 -0.34596371E+00 5f- ( 5/2)
AFUXXX E 48 0.00000 -0.31822326E+00 5f+ (-7/2)
AFUXXX E 49 0.00000 -0.31822326E+00 5f+ ( 5/2)
AFUXXX E 50 0.33671 -0.31822326E+00 5f+ (-3/2)
AFUXXX E 51 0.93693 -0.31822326E+00 5f+ ( 1/2)
AFUXXX E 52 0.41017 -0.19266767E+00 6d- ( 1/2)
AFUXXX E 53 0.09354 -0.19266767E+00 6d- (-3/2)
AFUXXX E 54 0.33147 -0.18309790E+00 6d+ ( 1/2)
AFUXXX E 55 0.32364 -0.18309790E+00 6d+ (-3/2)
AFUXXX E 56 0.00010 -0.18309790E+00 6d+ ( 5/2)
AFO1XX E 1 2.00005 -0.20696087E+02 1s
AFO1XX E 2 1.89559 -0.12503665E+01 2s
AFO1XX E 3 1.47986 -0.61398417E+00 2p-
AFO1XX E 4 1.41895 -0.61259798E+00 2p+
AFO1XX E 5 1.51536 -0.61259798E+00 2p+
AFO2XX E 1 2.00005 -0.20696087E+02 1s
AFO2XX E 2 1.89559 -0.12503665E+01 2s
AFO2XX E 3 1.47986 -0.61398417E+00 2p-
AFO2XX E 4 1.41895 -0.61259798E+00 2p+
AFO2XX E 5 1.51536 -0.61259798E+00 2p+
```

We can to a large extent deduce the identity of the various atomic orbitals by just looking at orbital energies
and I have added it by hand to the above output in a hopefully obvious notation. Even more precise information,
such as \(m_j\) value (indicated in parenthesis) can be obtained from looking at the Mulliken population analysis of each atom.
From this list of gross population we extract the atomic valence configuration \(5f^{2.25}6d^{1.16}7s^{0.98}\) for
the uranium atom and \(2s^{1.90}2p^{4.41}\) for oxygen. An interesting feature is that the uranium *6p* population
adds up to *5.64* and not six electrons. This is a manifestation of the so-called *6p-hole*, due to overlap with the oxygen ligands,
and mostly located to the \(6p_{3/2.1/2}\) orbital.