The Method of Linear Variations

In this section we show that the method of linear variations (also called the Ritz method [11]) is equivalent to the matrix formulation ${\bf H c} = E {\bf c}$ of the Schrödinger equation $\hat{H} \vert \Psi \rangle = E \vert \Psi \rangle$. Our treatment is similar to that of Szabo and Ostlund [1], p. 116. In the linear variation method, given the linear expansion

\begin{displaymath}
\vert \Psi \rangle = \sum_i c_i \vert \Phi_i \rangle
\end{displaymath} (3.1)

we vary the coefficients $c_i$ so as to minimize $E = \langle \Psi \vert \hat{H} \vert \Psi \rangle / \langle \Psi \vert \Psi \rangle$. We begin by requiring that the wavefunction be normalized, $\langle \Psi \vert \Psi \rangle = 1$. This normalization constraint means that we cannot minimize $E$ simply by solving

\begin{displaymath}
\frac{\delta}{\delta c_k} \langle \Psi \vert \hat{H} \vert \Psi \rangle = 0
\hspace{0.5in} k=1,2,\ldots,N
\end{displaymath} (3.2)

because the $c_i$'s are not independent. In this case we have a constrained minimization, so we apply Lagrange's method of undetermined multipliers and minimize the functional

\begin{displaymath}
{\cal L} = \langle \Psi \vert \hat{H} \vert \Psi \rangle - E ( \langle \Psi \vert \Psi \rangle - 1)
\end{displaymath} (3.3)

which has the same minimum as E when $\vert \Psi \rangle$ is normalized. When we substitute equation (3.1) into equation (3.3), we obtain

\begin{displaymath}
{\cal L} = \sum_{ij} c_i^{*} c_j \langle \Phi_i \vert \hat{H} \vert \Phi_j \rangle -
E \left( \sum_{ij} c_i^{*} c_j \langle \Phi_i \vert \Phi_j \rangle - 1 \right)
\end{displaymath} (3.4)

which we may rewrite as

\begin{displaymath}
{\cal L} = \sum_{ij} c_i^{*} c_j H_{ij} -
E \left( \sum_{ij} c_i^{*} c_j S_{ij} - 1 \right)
\end{displaymath} (3.5)

where of course $H_{ij} = \langle \Phi_i \vert \hat{H} \vert \Phi_j \rangle$ and $S_{ij} = \langle \Phi_i \vert \Phi_j \rangle$. Now set the first variation in ${\cal L}$ equal to zero:

\begin{displaymath}
\delta {\cal L} = \sum_{ij} \delta c_i^{*} c_j H_{ij} -
E \sum_{ij} \delta c_i^{*} c_j S_{ij} +
\sum_{ij} c_i^{*} \delta c_j H_{ij} -
E \sum_{ij} c_i^{*} \delta c_j S_{ij} = 0
\end{displaymath} (3.6)
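To make the simplification explicit, relabel the dummy indices $i \leftrightarrow j$ in the last two sums of (3.6); since the summations run over all $i$ and $j$, and since $H_{ij} = H_{ji}^{*}$, the third sum is the complex conjugate of the first,

\begin{displaymath}
\sum_{ij} c_i^{*} \delta c_j H_{ij} = \sum_{ij} c_j^{*} \delta c_i H_{ji}
= \left( \sum_{ij} \delta c_i^{*} c_j H_{ij} \right)^{*}
\end{displaymath}

and likewise, since $S_{ij} = S_{ji}^{*}$, the fourth sum is the complex conjugate of the second.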

We can therefore collect the first two sums inside a single bracket and abbreviate the remaining terms as a complex conjugate:

\begin{displaymath}
\delta {\cal L} = \sum_i \delta c_i^{*} \left[
\sum_j H_{ij} c_j - E S_{ij} c_j \right] +
{\rm complex \hspace{5pt} conj.} = 0
\end{displaymath} (3.7)

Each term in (3.7) is the sum of a number and its complex conjugate, so the imaginary parts cancel; the real part, however, need not vanish on its own. Since all the $\delta c_i$'s are arbitrary (that is the whole point of using Lagrange's method), $\delta {\cal L}$ can be zero only if the term in brackets vanishes for each $i$:
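\begin{displaymath}
\sum_j \left( H_{ij} - E S_{ij} \right) c_j = 0
\hspace{0.5in} i=1,2,\ldots,N
\end{displaymath}

In matrix notation, this is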

\begin{displaymath}
{\bf H c} = E {\bf S c}
\end{displaymath} (3.8)

If the basis functions $\{ \vert \Phi_i \rangle \}$ are chosen orthonormal (as is usually the case), then ${\bf S} = {\bf I}$, the identity matrix, and we have ${\bf H c} = E {\bf c}$. Of course ${\bf c}$ is the column-vector representation of $\vert \Psi \rangle$ in the basis $\{ \vert \Phi_i \rangle \}$. We thus have two equivalent ways of viewing a CI: either as the matrix formulation of the Schrödinger equation within the given linear vector space of $N$-electron basis functions, or as the minimization of the energy with respect to the linear expansion coefficients $c_i$ of (3.1), subject to the constraint that the wavefunction remain normalized. Another way of viewing the results of this section is to note that only eigenvectors of the Hamiltonian matrix ${\bf H}$ are stable with respect to variations in the linear expansion coefficients.
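As a concrete numerical illustration, the following is a minimal sketch (assuming NumPy and SciPy are available; the $2 \times 2$ matrices are hypothetical) that solves the generalized eigenvalue problem (3.8) directly:

\begin{verbatim}
import numpy as np
from scipy.linalg import eigh

# Hypothetical Hamiltonian and overlap matrices in a
# non-orthogonal two-function basis
H = np.array([[-1.0, -0.2],
              [-0.2, -0.5]])
S = np.array([[ 1.0,  0.1],
              [ 0.1,  1.0]])

# eigh solves H c = E S c; eigenvalues are returned in ascending
# order, so E[0] is the lowest (variational) energy in this basis
E, C = eigh(H, S)
print("Energies:", E)
print("Ground-state coefficients:", C[:, 0])

# The eigenvectors satisfy the normalization constraint
# c^T S c = 1 used in the derivation above
print("Norm check:", C[:, 0] @ S @ C[:, 0])  # ~1.0
\end{verbatim}

If the basis is orthonormal, ${\bf S} = {\bf I}$ and this reduces to an ordinary Hermitian eigenvalue problem.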

At this point it is reasonable to ask why we wish to minimize the energy by varying the coefficients in equation (3.1). How do we know that this will give us the best estimate of the wavefunction? There are two answers to this. First, as we have just shown, minimizing the energy by variation of the linear expansion coefficients gives the Schrödinger equation in matrix form; thus the procedure is justified a posteriori by the validity of its result. The other reason is that, for the ground state, the linear expansion in equation (3.1) gives an expectation value for the energy E which is always an upper bound to the exact nonrelativistic ground state energy ${\cal E}_0$. We will prove this assertion in the next section; the result is called the Variational Theorem. The best estimate of E, then, is the minimum value which can be obtained by varying the coefficients in equation (3.1) (while also maintaining normalization). These arguments also hold for excited states, so long as each excited state is made orthogonal to all lower states.
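The upper-bound property is easy to check numerically in a model problem. The following sketch (assuming NumPy; the model Hamiltonian is arbitrary) treats the lowest eigenvalue of the full matrix as the "exact" ground-state energy of the model and shows that the variational estimate in a truncated basis never falls below it:

\begin{verbatim}
import numpy as np

# Arbitrary 4x4 real symmetric model Hamiltonian (orthonormal basis)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2.0

E_exact = np.linalg.eigvalsh(H)[0]          # lowest eigenvalue, full basis
E_trunc = np.linalg.eigvalsh(H[:2, :2])[0]  # lowest eigenvalue with the
                                            # basis truncated to 2 functions

# A more constrained minimization can never do better:
assert E_trunc >= E_exact
print("Full:", E_exact, "  Truncated:", E_trunc)
\end{verbatim}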


C. David Sherrill
2000-04-18