
Hartree-Fock approximation for inverse many-body problems

J. C. Lemm and J. Uhlig
Institut für Theoretische Physik I
Universität Münster, 48149 Münster, Germany

Abstract:

A new method is presented to reconstruct the potential of a quantum mechanical many-body system from observational data, combining a nonparametric Bayesian approach with a Hartree-Fock approximation. A priori information is implemented as a stochastic process, defined on the space of potentials. The method is computationally feasible and provides a general framework to treat inverse problems for quantum mechanical many-body systems.

The reconstruction of inter-particle forces from observational data is of key importance for any application of quantum mechanics to real-world systems. Such inverse problems have been studied intensively in inverse scattering theory and in inverse spectral theory for one-body systems in one and, later, in three dimensions [1,2]. In this Letter we outline a method designed to deal with inverse problems for many-body systems.

Inverse problems are notoriously ill-posed [3]. It is well known that for ill-posed problems additional a priori information is required to obtain a unique and stable solution. In this Letter we refer to a nonparametric Bayesian framework [4,5] in which a priori information is implemented explicitly, i.e., in the form of stochastic processes over potentials [6].

Calculating an exact solution is typically not feasible for inverse many-body problems. As an example of a possible approximation scheme, we study in the following an `Inverse Hartree-Fock Approximation' (IHFA). For situations where a Hartree-Fock (HF) ansatz is not sufficient, the inverse problem would have to be solved on top of other approximation schemes. A Random Phase Approximation or a full Time-Dependent Hartree-Fock Approximation [7,8,9,10], for example, would go beyond HF.

Bayesian methods can easily be adapted to different learning situations and have therefore been applied to a variety of empirical learning problems, including classification, regression, density estimation [11,12,13], and, recently, to quantum statistics [6]. In particular, using a Bayesian approach for quantum systems it is straightforward to deal with measurements of arbitrary quantum mechanical observables, to include classical noise processes, and to implement a priori information explicitly in terms of the potential.

Computationally, on the other hand, working with stochastic processes, or discretized versions thereof, is much more demanding than, for example, fitting a small number of parameters. This holds especially for applications to quantum mechanics, where one cannot take full advantage of the convenient analytical features of Gaussian processes. Owing to increasing computational resources, however, the corresponding learning algorithms are now becoming numerically feasible.

To define the problem let us consider many-fermion systems with Hamiltonians, $H$ = $T+V$, consisting of a one-body part $T$ (e.g., in coordinate space representation $-(1/2m)\Delta$, with Laplacian $\Delta$, mass $m$, $\hbar$ = 1), and a two-body potential $V$. Introducing fermionic creation and annihilation operators $a_\alpha^\dagger$, $a_\alpha$, corresponding to a complete single particle basis $\mbox{$\vert\,\varphi_\alpha\!>$}$, such Hamiltonians can be written,

\begin{displaymath}
H
= \sum_{\alpha\beta}T_{\alpha\beta} \,a^\dagger_\alpha a_\beta
+\frac{1}{4}\sum_{\alpha\beta\gamma\delta} V_{\alpha\beta\gamma\delta}
\, a^\dagger_\alpha a^\dagger_\beta a_\delta a_\gamma
,
\end{displaymath} (1)

with $T_{\alpha\beta}$ = $<\!\!\varphi_\alpha\vert T\vert\varphi_\beta\!\!>$ and antisymmetrized two-body matrix elements $V_{\alpha\beta\gamma\delta}$ = $<\!\!\varphi_\alpha\varphi_\beta\vert V\vert\varphi_\gamma\varphi_\delta\!\!>$. In particular, in order not to complicate the numerical calculations unnecessarily, we will study local two-body potentials depending only on the distance between two particles $z$ = $\vert x_{(1)}\!-\!x_{(2)}\vert$, i.e., $V_{x_{(1)} x_{(2)} {x^\prime}_{(1)} {x^\prime}_{(2)}}$ = $v(z)
\big[
\delta_{x_{(1)},x_{(1)}^\prime} \delta_{x_{(2)},x_{(2)}^\prime}
-
\delta_{x_{(1)},x_{(2)}^\prime} \delta_{x_{(2)},x_{(1)}^\prime}\big]$. (Being interested in numerical applications, we will use discretized $x$ variables.) Our aim is to reconstruct the function $v(z)$ from observational data.
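On a discretized grid, these antisymmetrized matrix elements can be assembled directly from the two Kronecker-delta terms above. The following sketch builds the four-index tensor and checks its antisymmetry; the grid size and the specific $v(z)$ are illustrative choices, not taken from the experiments below.

```python
import numpy as np

# Antisymmetrized matrix elements V[x1, x2, x1p, x2p] of a local two-body
# potential v(z), z = |x1 - x2|, built from the two Kronecker-delta terms
# above; the grid size and the specific v are illustrative choices.
def two_body_tensor(v, L):
    V = np.zeros((L, L, L, L))
    for x1 in range(L):
        for x2 in range(L):
            z = abs(x1 - x2)
            V[x1, x2, x1, x2] += v(z)   # direct term
            V[x1, x2, x2, x1] -= v(z)   # exchange term
    return V

L = 6
V = two_body_tensor(lambda z: np.exp(-0.5 * z), L)

# Antisymmetry under exchange within either pair of indices
assert np.allclose(V, -V.transpose(0, 1, 3, 2))
assert np.allclose(V, -V.transpose(1, 0, 2, 3))
```

Note that the direct and exchange contributions cancel for coinciding coordinates, as required for fermions.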

In order to apply the Bayesian framework we need two model inputs: firstly, the probability $p(D\vert v)$ of measuring the observational data $D$ given a potential $v$ (for fixed $D$ also known as likelihood of $v$), and, secondly, a prior probability $p(v)$ for $v$. The probability for $v$ given data $D$, also called the posterior probability for $v$, is then obtained according to Bayes' rule,

\begin{displaymath}
p(v\vert D) \propto p(D\vert v) p(v)
.
\end{displaymath} (2)

As the solution of the reconstruction problem we will consider the potential with the maximal posterior probability (2). This approach, also known as Maximum A Posteriori Approximation (MAP), can be regarded as resulting from an approximation for the predictive probability $p(D_F\vert D)$ = $\sum_v p(D_F\vert v) p(v\vert D)$ of future observations $D_F$ given data $D$. The sum $\sum_v$, standing for a summation over all values $v(z)$ of the potential $v$ (a functional integral for continuous $z$), can typically not be calculated exactly. Assuming that the main contribution to the sum comes from the potential with maximal posterior probability (MAP), i.e., $p(D_F\vert D)\approx p(D_F\vert v^*)$ with $v^*$ = ${\rm argmax}_v p(v\vert D)$, we are left with maximizing (2). This can be done by setting the derivative of the posterior probability (or, technically often more convenient, of its logarithm) with respect to $v(z)$ to zero.
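The MAP step can be illustrated with a toy discrete model: when the posterior is sharply peaked, replacing the sum over $v$ by its maximizing term changes the predictive probability only slightly. All numbers below are illustrative stand-ins, not quantities from the paper.

```python
import numpy as np

# Toy discrete illustration of the MAP step: with a sharply peaked
# posterior, the full sum over v is well approximated by its largest
# term.  All numbers are illustrative stand-ins.
posterior = np.array([0.90, 0.07, 0.03])     # p(v | D) for three candidates
p_DF_given_v = np.array([0.40, 0.35, 0.10])  # p(D_F | v)

exact = np.dot(p_DF_given_v, posterior)      # sum_v p(D_F|v) p(v|D)
v_star = np.argmax(posterior)                # MAP candidate v*
map_approx = p_DF_given_v[v_star]            # p(D_F | v*)

assert abs(exact - map_approx) < 0.05        # close for a peaked posterior
```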

As the first step, we have to identify the likelihood $p(D\vert v)$ of $v$ for observational data $D$. In order to be able to obtain information about the potential, the system has to be prepared in a $v$-dependent state. Such a state can be a stationary statistical state, e.g. a canonical ensemble, or a time-dependent state evolving according to the Hamiltonian of the system. In the following we will discuss many-body systems prepared in their ground state $\psi_0$. The (normalized) $N$-particle ground state wave function $\psi_0$ depends on $v$ and is antisymmetrized for fermions. In particular, we will study two kinds of observational data $D$ = $\{D_i\vert 1\le i\le n\}$: (A) $n$ simultaneous measurements of the coordinates of all $N$ particles, (B) $n$ measurements of the coordinates of a single particle. In case $A$, the $i$th measurement results in a vector $D_i$ = $\vec x_i$, consisting of $N$ components $x_{i(j)}$, each representing the coordinates of a single particle (which may themselves form a vector, e.g., a three-dimensional one). According to the axioms of quantum mechanics, the probability of measuring the coordinate vector $\vec x_i$, given $v$, is,

\begin{displaymath}
p(\vec x_i\vert v)
= \vert\psi_0(\vec x_i)\vert^2
= \vert\psi_0(x_{i(1)},\cdots, x_{i(N)})\vert^2
.
\end{displaymath} (3)

In case $B$, the probability for the single particle coordinates $x$ to take the values $D_i$ = $x_i$ is obtained by summing over the unobserved coordinates
\begin{displaymath}
p(x_i\vert v)
=
\sum_{x_{(2)}}\cdots\sum_{x_{(N)}}
\vert\psi_0(x_i,x_{(2)},\cdots,x_{(N)})\vert^2
.
\end{displaymath} (4)

Similarly, it is for example straightforward to obtain the likelihoods for measuring inter-particle distances $z$, the particle number in a certain region, or particle momenta.

In contrast to an ideal measurement of a classical system, the state of a quantum system is typically changed by the measurement process. In particular, its state is projected onto the eigenspace of the measured observable corresponding to the measured eigenvalue. Hence, if we want to deal with independent, identically distributed data, the system must be prepared in the same state before each measurement. Under that condition the total likelihood factorizes,

\begin{displaymath}
p(D\vert v) = \prod_{i=1}^n p(D_i\vert v)
.
\end{displaymath} (5)

As the second step, we have to choose a prior probability $p(v)$. A common and convenient choice is a Gaussian prior probability (or, for functions, a Gaussian process), for which

\begin{displaymath}
p(v) =
\left(\det \frac{{\bf K}_0}{2\pi}\right)^\frac{1}{2}
e^{-\frac{1}{2} < v-v_0 \vert {\bf K}_0 \vert v-v_0 >}
,
\end{displaymath} (6)

with positive (semi-)definite covariance ${{\bf K}_0}^{-1}$ and mean $v_0(z)$, playing the role of a reference potential. A typical choice for the inverse covariance is the (discretized) negative Laplacian multiplied by a `regularization parameter' $\lambda$, i.e., ${{\bf K}_0}$ = $-\lambda \Delta$, favoring smooth potentials. Higher-order differential operators, for example, are often included in ${\bf K}_0$ for regression problems to obtain differentiable regression functions. Covariances can also be constructed to implement general approximate symmetries, such as an approximate periodicity of $v$ [6].
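A minimal sketch of this smoothness prior, using a discretized negative Laplacian for ${\bf K}_0$ and verifying the gradient of the log-prior, Eq. (8), against a finite difference; the grid size and $\lambda$ are arbitrary illustrative choices.

```python
import numpy as np

# Discretized smoothness prior: K0 = lambda * (negative Laplacian), and a
# finite-difference check of the log-prior gradient, Eq. (8).  Grid size
# and lambda are arbitrary illustrative choices.
L, lam = 21, 5.0
K0 = lam * (2 * np.eye(L) - np.eye(L, k=1) - np.eye(L, k=-1))

v0 = np.zeros(L)                              # reference potential
v = np.random.default_rng(0).normal(size=L)

log_prior = lambda v: -0.5 * (v - v0) @ K0 @ (v - v0)   # up to a constant
log_prior_grad = lambda v: -K0 @ (v - v0)               # Eq. (8)

i, eps = 7, 1e-6
e = np.zeros(L); e[i] = eps
fd = (log_prior(v + e) - log_prior(v - e)) / (2 * eps)
assert abs(fd - log_prior_grad(v)[i]) < 1e-5
```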

Having defined a likelihood $p(D\vert v)$ for many-body quantum systems and a prior probability $p(v)$, the next step is to solve the stationarity equation for the posterior probability (2),

\begin{displaymath}
0
=\partial_{v(z)} \ln p(v)+\sum_i \partial_{v(z)} \ln p(\vec x_i\vert v)
,
\end{displaymath} (7)

where we introduced the notation $\partial_{v(z)}$ = $\partial/\partial v(z)$. The derivative of a Gaussian prior with respect to $v(z)$ is easily found as
\begin{displaymath}
\partial_{v}
\ln p (v)
= -{\bf K}_0(v-v_0)
,
\end{displaymath} (8)

$\partial_v$ denoting the gradient with respect to $v$, having components $\partial_{v(z)}$. To calculate the derivatives of the likelihoods (3) and (4) we need to know $\partial_{v(z)} \psi_0$. By taking the derivative of $H\psi_0$ = $E_0\psi_0$ with respect to $v(z)$, it is straightforward to show that, for a nondegenerate ground state,
\begin{displaymath}
\vert\partial_{v(z)} \psi_0\!>
=\sum_{\gamma\ne 0}
\frac{1}{E_0-E_\gamma}
\,\vert\,\psi_\gamma\!><\!\psi_\gamma\vert\partial_{v(z)} H
\,\vert\,\psi_0\!>
,
\end{displaymath} (9)

with a complete basis of eigenstates $\psi_\gamma$, energies $E_\gamma$, and requiring $<\!\psi_0 \vert\partial_{v(z)}\psi_0\!>$ = 0 to fix norm and phase. Furthermore, from $\partial_{v(z)} v(z)$ = $\delta_{z,\vert x_{(1)}-x_{(2)}\vert}$ directly follows
\begin{displaymath}
\partial_{v(z)} H
=
\frac{1}{2}\sum_{x}
a^\dagger_{x} (a^\dagger_{x-z} a_{x-z} +
a^\dagger_{x+z} a_{x+z} ) a_{x}
.
\end{displaymath} (10)

Typically, a direct solution of the many-body equation (9) is not feasible. To get a solvable problem we treat the many-body system in Hartree-Fock approximation [7,8,10] (for non-Hermitian $H$ see [14]). Thus, as the first step of an IHFA, we approximate the many-body Hamiltonian $H$ by a one-body Hamiltonian $h$ defined self-consistently by
\begin{displaymath}
h_{xx^\prime}
=
T_{xx^\prime}
+ \sum_{k=1}^N <\!x \varphi_k\,\vert\,V\,\vert\,x^\prime \varphi_k\!>
,
\end{displaymath} (11)

$\varphi_k$ being the $N$ lowest orthonormalized eigenstates of $h$, i.e.,
\begin{displaymath}
h\varphi_k = \epsilon_k \varphi_k,
\qquad \epsilon_1\le \epsilon_k\le \epsilon_N
.
\end{displaymath} (12)
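Equations (11) and (12) can be iterated self-consistently on a grid. The sketch below does this for a local two-body potential of the form discussed above, with the infinite-wall confinement represented implicitly by the finite grid; the damping of the update is one simple way to tame oscillations of the HF iteration. All parameters are illustrative.

```python
import numpy as np

# Self-consistent solution of the direct HF equations (11)-(12) on a
# 1-D grid, for a local two-body potential v(z) as in the text.  The
# infinite-wall one-body potential is represented implicitly by the
# finite grid; grid size, mass and v are illustrative choices.
L, N, m = 21, 2, 0.1
T = (1.0 / (2 * m)) * (2 * np.eye(L) - np.eye(L, k=1) - np.eye(L, k=-1))
v = lambda z: 1.0 / (1.0 + np.exp(-2.0 * 10 * (z - L / 2) / L))
Z = np.abs(np.arange(L)[:, None] - np.arange(L)[None, :])
W = v(Z)                                   # W[x, y] = v(|x - y|)

def hf_hamiltonian(phi):
    """One-body HF Hamiltonian h of Eq. (11) for occupied orbitals phi[:, k]:
    h = T + direct (Hartree) term - exchange (Fock) term."""
    rho = phi @ phi.T                      # density matrix rho[x, x']
    direct = np.diag(W @ np.diag(rho))     # delta_{xx'} sum_y v(|x-y|) rho[y,y]
    exchange = W * rho                     # v(|x - x'|) rho[x, x']
    return T + direct - exchange

phi = np.linalg.eigh(T)[1][:, :N]          # start from the free problem
for _ in range(200):
    eps, U = np.linalg.eigh(hf_hamiltonian(phi))
    phi_new = U[:, :N]                     # the N lowest orbitals, Eq. (12)
    if np.linalg.norm(phi_new @ phi_new.T - phi @ phi.T) < 1e-10:
        phi = phi_new
        break
    phi, _ = np.linalg.qr(0.5 * phi + 0.5 * phi_new)  # damped, re-orthonormalized

assert np.allclose(phi.T @ phi, np.eye(N), atol=1e-8)
```

The convergence test compares density matrices rather than orbitals, so it is insensitive to the sign ambiguity of the eigenvectors.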

The corresponding normalized $N$-particle Hartree-Fock ground state $\vert\Phi_0\!>$ = $\vert\varphi_1,\cdots,\varphi_N\!>$ is the Slater determinant built from the $N$ lowest single particle orbitals $\varphi_k$. The Hartree-Fock likelihood replacing (3) becomes,
\begin{displaymath}
p_{\scriptscriptstyle \rm HF}(\vec x_i\vert v)
= \vert\Phi_0(\vec x_i)\vert^2
= \vert\det B_i\vert^2,
\end{displaymath} (13)

defining the overlap matrix $B_i$ with matrix elements $B_{kl;i}$ = $\varphi_k(x_{i(l)})$. Analogously, the likelihood (4) becomes
\begin{displaymath}
p_{\scriptscriptstyle \rm HF}(x_i\vert v)
= \sum_{k=1}^N \vert\varphi_k (x_i)\vert^2
.
\end{displaymath} (14)
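Both HF likelihoods are cheap to evaluate once the occupied orbitals are known. The sketch below uses random orthonormal columns as stand-in orbitals (not actual HF solutions) and illustrates, in particular, that (13) vanishes for coinciding fermion coordinates.

```python
import numpy as np

# HF likelihoods (13) and (14) for given occupied orbitals phi[:, k].
# Random orthonormal columns serve as stand-in orbitals here, not actual
# HF solutions.
rng = np.random.default_rng(1)
L, N = 21, 2
phi, _ = np.linalg.qr(rng.normal(size=(L, N)))   # orthonormal columns

def p_hf_many(xs):
    """|det B_i|^2 with B[k, l] = phi_k(x_{i(l)}), Eq. (13)."""
    B = phi[np.array(xs), :].T
    return np.abs(np.linalg.det(B)) ** 2

def p_hf_single(x):
    """Sum of occupied orbital densities, Eq. (14); sums to N over the grid."""
    return np.sum(np.abs(phi[x, :]) ** 2)

# The Slater-determinant likelihood vanishes for coinciding coordinates
assert p_hf_many([5, 5]) < 1e-20
assert abs(sum(p_hf_single(x) for x in range(L)) - N) < 1e-10
```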

From (13) follows $\partial_{v(z)} p_{\rm HF}(\vec x_i\vert v)$ = $\Phi_0(\vec x_i) \partial_{v(z)} \Phi_0^*(\vec x_i)$ + $\Phi_0^*(\vec x_i) \partial_{v(z)}\Phi_0(\vec x_i)$, so that expanding the determinant $\det B_i$ = $\sum_{l=1}^N M_{kl;i} B_{kl;i}$ in terms of the cofactors, $M_{kl;i}$ = $(B_i^{-1})_{lk}\det B_i$, yields
\begin{displaymath}
\partial_{v(z)} \Phi_0(\vec x_i)
= \sum_{kl}^N M_{kl;i} \, \partial_{v(z)} {\varphi_k}(x_{i(l)})
.
\end{displaymath} (15)

Hence, to obtain the derivatives of the HF likelihoods (13) or (14) with respect to $v(z)$, we have to find $\partial_{v(z)} \varphi_k$.

Again, we proceed by taking the derivative of Eq. (12) and obtain after standard manipulations (for nondegenerate $\epsilon_k$ and $<\!\varphi_k\vert\partial_{v(z)} \varphi_k\!>$ = 0),

\begin{displaymath}
\vert\partial_{v(z)} \varphi_k>
=
\sum_{l\ne k} \frac{1}{\epsilon_k-\epsilon_l}
\,\vert\varphi_l><\varphi_l\vert\partial_{v(z)} h
\vert\varphi_k>
.
\end{displaymath} (16)

Furthermore, from Eq. (11) follows
\begin{displaymath}
\partial_{v(z)} h_{x x^{\prime}}
=
\sum_{j=1}^N \Big(
<\!x \,\varphi_j\,\vert\,\partial_{v(z)}V\,\vert\,x^{\prime}\,\varphi_j\!>
+ <\!x \,\partial_{v(z)} \varphi_j\,\vert\,V\,\vert\,x^{\prime}\,\varphi_j\!>
+ <\!x\,\varphi_j\,\vert\,V\,\vert\,x^{\prime}\,\partial_{v(z)}\varphi_j\!>
\Big)
.
\end{displaymath} (17)

Therefore, the derivative of the orbitals is found by inserting Eq. (17) into Eq. (16)
\begin{displaymath}
\partial_{v(z)} \varphi_k (x)
=
\sum_{l\ne k} \frac{\varphi_l(x)}{\epsilon_k-\epsilon_l}
\sum_{j=1}^N \Big(
<\!\varphi_l\varphi_j\,\vert\,\partial_{v(z)} V\,\vert\,\varphi_k\varphi_j\!>
+ <\!\varphi_l\partial_{v(z)}\varphi_j\,\vert\,V\,\vert\,\varphi_k\varphi_j\!>
+ <\!\varphi_l\varphi_j\,\vert\,V\,\vert\,\varphi_k\partial_{v(z)}\varphi_j\!>
\Big)
.
\end{displaymath} (18)

This ``inverse HF equation'' (18) can be solved quite effectively by iteration, starting, for example, from the initial guess $\partial_{v(z)}\varphi_j(x)$ = 0. Because $\partial_{v(z)}\varphi_k(x)$ depends on $z$ and $x$, Eq. (18), being the central equation of the IHFA, has the dimension of a two-body equation for the lowest $N$ orbitals. Introducing, analogously to $B_i$, the matrix $\Delta_{i}(z)$ with $\Delta_{kl;i}(z)$ = $\partial_{v(z)} {\varphi_k}(x_{i(l)})$ (for $z > 0$, $1\le l\le N$, $1\le k\le N$, $1\le i\le n$), it is straightforward to check that the derivatives of the logarithm of the HF likelihoods (13) and (14) are given by
\begin{displaymath}
\partial_{v(z)} \!\ln p_{\scriptscriptstyle \rm HF}(\vec x_i\vert v)
= 2 {\rm Re}\left[{\rm Tr} \big(B_i^{-1}\Delta_{i}(z)\big)\right]
,
\end{displaymath} (19)

\begin{displaymath}
\partial_{v(z)} \!\ln p_{\scriptscriptstyle \rm HF}(x_i\vert v)
= 2\,
\frac{{\rm Re}\left[\sum_{k=1}^N
\varphi_k^*(x_i)\,
\partial_{v(z)}\varphi_k(x_i)
\right]}
{\sum_{l=1}^N \vert\varphi_l(x_i)\vert^2}
.
\end{displaymath} (20)
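The structure of Eq. (19) is the standard identity for the derivative of a log-determinant. It can be checked numerically with random matrices standing in for $B_i$ and $\Delta_i(z)$.

```python
import numpy as np

# Numerical check of the log-determinant identity behind Eq. (19):
# d/dt ln |det B(t)|^2 = 2 Re Tr(B^{-1} dB/dt).
# Random matrices stand in for B_i and Delta_i(z).
rng = np.random.default_rng(2)
N = 4
B = rng.normal(size=(N, N)) + 3.0 * np.eye(N)   # shifted for good conditioning
dB = rng.normal(size=(N, N))                    # plays the role of Delta_i(z)

analytic = 2.0 * np.real(np.trace(np.linalg.solve(B, dB)))

f = lambda M: np.log(np.abs(np.linalg.det(M)) ** 2)
eps = 1e-6
fd = (f(B + eps * dB) - f(B - eps * dB)) / (2 * eps)  # central difference

assert abs(analytic - fd) < 1e-5
```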

The stationarity equation (7) can now be solved by iteration, for example,
\begin{displaymath}
v^{(r+1)}
=
v^{(r)} +
\eta \big[ v_0 - v^{(r)}
+ {\bf K}_0^{-1} \sum_i \partial_{v}
\ln p_{\scriptscriptstyle \rm HF}(\vec x_i\vert v^{(r)})
\big]
,
\end{displaymath} (21)

choosing a positive step width $\eta$ and starting from an initial guess $v^{(0)}$.
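The update rule (21) can be exercised on a toy problem in which the log-likelihood is itself quadratic, so that the exact posterior maximum is known in closed form and convergence can be verified. All matrices below are illustrative stand-ins, not the quantum mechanical likelihoods of the text.

```python
import numpy as np

# The iteration of Eq. (21) on a toy quadratic problem: if the total
# log-likelihood is -0.5 (v - mu)^T A (v - mu), the posterior maximum is
# v* = (K0 + A)^{-1} (K0 v0 + A mu), so convergence can be verified.
# All matrices are illustrative stand-ins for the quantities in the text.
L = 10
rng = np.random.default_rng(3)
K0 = 5.0 * (2 * np.eye(L) - np.eye(L, k=1) - np.eye(L, k=-1)) + 0.1 * np.eye(L)
A = np.diag(rng.uniform(0.5, 2.0, size=L))    # toy likelihood curvature
v0 = np.zeros(L)                              # reference potential
mu = rng.normal(size=L)

grad_log_like = lambda v: -A @ (v - mu)       # gradient of the toy log-likelihood

v, eta = v0.copy(), 0.2                       # initial guess and step width
for _ in range(2000):
    v = v + eta * (v0 - v + np.linalg.solve(K0, grad_log_like(v)))

v_exact = np.linalg.solve(K0 + A, K0 @ v0 + A @ mu)
assert np.linalg.norm(v - v_exact) < 1e-8
```

Multiplying the raw gradient by ${\bf K}_0^{-1}$, as in (21), preconditions the iteration and keeps the update smooth.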

In conclusion, reconstructing a potential from data by IHFA is based on the definition of a prior probability for $v$ and requires the iterative solution of

1. the stationarity equation for the potential (7), which needs as input for each iteration step (21)

2. the derivatives of the likelihoods (19) or (20), obtained by solving the (two-body-like) equation (18) for given

3. single particle orbitals, defined in (12) as solutions of the direct (one-body) Hartree-Fock equation.

We tested the numerical feasibility of the IHFA for a Hamiltonian $H$ = $T+V_1+V$ including a local two-body potential $V$ to be reconstructed and a local one-body potential $V_1$, with diagonal elements $v_1(x)$ = 0 for $-a\le x\le a$ and $v_1(x)$ = $\infty$ elsewhere, to confine the particles and to break translational symmetry.

To check the validity of the IHFA we must be able to sample artificial data from the exact true many-body likelihood (3) for a given true potential $V_{\rm true}$. Because this requires solving the corresponding many-body problem exactly, we have chosen a two-particle system with one-dimensional $x$ (on a grid with 21 points, $a$ = 10) for which the true ground state can be calculated by diagonalizing $H$ numerically. We want to stress, however, that, in the case of real data, application of the IHFA to systems with $N>2$ particles is straightforward and only requires solving Eq. (18) for $N$ instead of two orbitals.

We selected a local true two-body potential $V_{\rm true}$ with diagonal elements $v_{\rm true}(z)$ = $b/(1+e^{-2 \gamma (z-L/2)/L})$ and parameter values ($b$ = 1, $\gamma$ = 10, $L$ = $2a+1$ = 21 for mass $m$ = $0.1$) for which the iteration of the HF equation (12) converges. (For two-body systems the HF iteration easily leads to oscillations.) Calculating the true ground state $\psi_0$ for $V_{\rm true}$ and the corresponding true likelihoods (3) and (4), we then sampled: (A) $n$ = 100 two-particle data from the true likelihood (3) and (B) $n$ = 200 single particle data from the true likelihood (4). (See Figs. 1a and 2a.)
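Sampling such artificial data from a likelihood given on a grid amounts to drawing from a discrete distribution. A minimal sketch, with a placeholder density instead of the true ground-state likelihood:

```python
import numpy as np

# Drawing i.i.d. coordinate data from a likelihood defined on a grid,
# as for the single-particle data of case B; the density below is a
# placeholder, not the true ground-state likelihood.
rng = np.random.default_rng(4)
L, n = 21, 200
p = np.exp(-0.5 * ((np.arange(L) - 10) / 4.0) ** 2)
p /= p.sum()                        # normalized grid likelihood p(x | v)

data = rng.choice(L, size=n, p=p)   # n independent draws, cf. Eq. (5)

# Empirical probability (relative frequency), shown as bars in the figures
p_emp = np.bincount(data, minlength=L) / n
assert abs(p_emp.sum() - 1.0) < 1e-12
```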

The calculations have been done for a Gaussian prior probability $p(v)$ as in (6) with $\lambda ({\bf I}-\Delta)/2$ (with identity ${\bf I}$, $\lambda$ = $5$) as inverse covariance ${\bf K}_0$, and a reference potential $v_{0}(z)$ of the form of $v_{\rm true}$, but with $\gamma$ = 1 (so that it becomes nearly linear in the shown interval). Furthermore, we have set all potentials to zero at the origin and constant beyond the right boundary. The reconstructed potential $v_{\scriptscriptstyle \rm IHFA}$ has then been obtained by iterating according to Eq. (21) and solving Eqs. (12) and (18) within each iteration step.

The resulting IHFA likelihoods $p(\vec x_i\vert v_{\scriptscriptstyle \rm IHFA})$ (case A, Fig. 1a) or $p(x_i\vert v_{\scriptscriptstyle \rm IHFA})$ (case B, Fig. 2a) did indeed fit the true likelihoods $p(\vec x_i\vert v_{\rm true})$ or $p(x_i\vert v_{\rm true})$ well, respectively. (Instead of the two-dimensional $p(\vec x_i\vert v)$ for vectors $\vec x_i$, Fig. 1a shows the one-dimensional $p(z\vert v)$ = $\sum_{x} \left[ p(x,x-z\vert v)+p(x,x+z\vert v) \right]$ for the inter-particle distance $z$.) In particular, in case A the reconstructed likelihood $p(z\vert v_{\scriptscriptstyle \rm IHFA})$ is over the whole range an improvement over the reference likelihood $p(z\vert v_{0})$, while in case B the IHFA solution $p(x_i\vert v_{\scriptscriptstyle \rm IHFA})$ is nearly exactly the same as the true likelihood $p(x_i\vert v_{\rm true})$. That nearly perfect result for case B is due to the fact that reconstructing the likelihood $p(x_i\vert v)$ for single particle data is a much simpler task than reconstructing the full $p(\vec x_i\vert v)$.

The situation is more complex for the potentials (Figs. 1b and 2b). Firstly, one sees that the correlation information contained in the two-particle data of case A yields a better reconstruction $v_{\scriptscriptstyle \rm IHFA}$ than the less informative single particle data of case B. In both cases, however, the true potential is only well approximated at medium inter-particle distances. For large and small distances, on the other hand, the IHFA solution is still dominated by the reference potential of the prior probability $p(v)$. This effect is a consequence of the lack of empirical data in those regions: the probability of finding particles at large distances is small, because the true potential has its maximum at large distances. Likewise, measuring small distances is unlikely, because antisymmetry forbids two fermions to be at the same place. In such low-data regions one must therefore rely on a priori information.

We are grateful to A. Weiguny for stimulating discussions.

Figure 1: Two-particle data (case A): (a) True likelihood $p(z\vert v_{\rm true})$ (thin line), empirical probability (relative frequency) $p_{\rm emp}(z)$ = $(1/n)\sum _{i=1}^n\delta _{z,\vert x_{i(1)}-x_{i(2)}\vert}$ (bars) obtained from the $n$ = 100 sampled two-particle data $\vec x_i$, reference likelihood $p(z\vert v_0)$ (dashed), and reconstructed likelihood $p(z\vert v_{\scriptscriptstyle \rm IHFA})$ (thick line) as a function of the inter-particle distance $z$ = $\vert x_{(1)}-x_{(2)}\vert$. (b) True potential $v_{\rm true}(z)$ (thin line), reference potential $v_0(z)$ (dashed), and reconstructed IHFA potential $v_{\scriptscriptstyle \rm IHFA}(z)$ (thick line) as a function of the inter-particle distance $z$.
\begin{figure}
[Graphics: figure1.eps, panels (a) and (b); horizontal axis: inter-particle distance $z$.]
\end{figure}

Figure 2: Single particle data (case B): (a) True likelihood $p(x\vert v_{\rm true})$ (occluded by thick line), empirical probability $p_{\rm emp}(x)$ = $(1/n)\sum _{i=1}^n\delta _{x,x_i}$ (bars) obtained from the $n$ = 200 sampled single particle data $x_i$, reference likelihood $p(x\vert v_0)$ (dashed), and reconstructed likelihood $p(x\vert v_{\scriptscriptstyle \rm IHFA})$ (thick line) as a function of the single particle coordinate $x$. (b) True potential $v_{\rm true}(z)$ (thin line), reference potential $v_0(z)$ (dashed), and reconstructed IHFA potential $v_{\scriptscriptstyle \rm IHFA}(z)$ (thick line) as a function of the inter-particle distance $z$.
\begin{figure}
[Graphics: figure3.eps, panels (a) and (b); horizontal axes: single particle coordinate $x$ (a) and inter-particle distance $z$ (b).]
\end{figure}




Joerg_Lemm 1999-12-21