Next: Dynamic memory management Up: Program Structure Previous: SCF procedure   Contents

Diagonalization

In the next phase the eigenvalue problem is solved, and the mixing coefficients $\{ c_j \}$ are updated.

Figure 6.16: Steps in solving the eigenvalue problem.
\begin{figure}\centerline{ \psfig{file=tex/fig/scf_diag.eps}}\end{figure}

During this phase diag() performs the process referred to as a "simultaneous optimization" in three distinct steps. In the first step the matrix elements are assembled and the matrix is diagonalized. Diagonalization occurs in one of the routines diag_memory_all(), diag_disk_clst(), diag_disk_ico(), diag_disk_hmx(). They differ in the storage method used, which depends on the available memory. The main loop runs over all coefficients, coeff, which are contained in the file c.lst. It is a complicated loop with a number of logical statements, and it mixes single and double precision arithmetic; this limits the Mflop performance to less than 30% of the theoretical peak for floating-point calculations. The fastest routine is diag_memory_all(), which performs no disk I/O. The main loop is shown below:

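      do ii = 1, n_cf
         n_count_tmp = ncoef + ii
*        .. move to the next matrix element once ii passes ico()
         if (ii.gt.ico(nijcurr+nz)) nijcurr = nijcurr + 1
*        .. accumulate coefficient times integral value
         hmx(nijcurr) = hmx(nijcurr) +
     :        coeff(n_count_tmp)*value(inptr(n_count_tmp))
*        .. have we also changed column?
         if (nijcurr.gt.jptr(jjh)) then
            jjh = jjh + 1
            max_col = max_col + 1
         end if
      end do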

This fragment combines the matrix elements based on the information in the column index ico, the column pointer jptr and the coefficients coeff. The remaining diag_*() routines perform the same task but rely on disk I/O to varying degrees; the worst case is diag_disk_hmx(), which reads each of the files c.lst, ih.nn.lst and ico.lst.
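The accumulation pattern of the loop above can be tried out in isolation. The following is a minimal free-form sketch, not the mchf code: the array sizes and data are invented, and it assumes that ico(k) holds the index of the last coefficient belonging to matrix element k and that every matrix element has at least one coefficient. The column bookkeeping (jptr, jjh, max_col) is left out.

```fortran
program assemble_sketch
  implicit none
  ! Hypothetical toy data: 3 matrix elements built from 5 coefficients.
  integer, parameter :: n_cf = 5, n_elem = 3, n_int = 3
  ! Assumption: ico(k) is the index of the last coefficient of element k.
  integer, parameter :: ico(n_elem) = (/ 2, 3, 5 /)
  ! inptr(i) selects the integral value that multiplies coeff(i).
  integer, parameter :: inptr(n_cf) = (/ 1, 2, 1, 3, 2 /)
  double precision, parameter :: coeff(n_cf) = &
       (/ 1.0d0, 0.5d0, 2.0d0, -1.0d0, 0.25d0 /)
  double precision, parameter :: value(n_int) = (/ 2.0d0, 4.0d0, 8.0d0 /)
  double precision :: hmx(n_elem)
  integer :: i, k

  hmx = 0.0d0
  k = 1
  do i = 1, n_cf
     ! Move to the next matrix element once i passes ico(k).
     if (i > ico(k)) k = k + 1
     ! Accumulate coefficient times integral value.
     hmx(k) = hmx(k) + coeff(i)*value(inptr(i))
  end do

  print '(3f8.2)', hmx
end program assemble_sketch
```

With these toy inputs the three accumulated elements are 4.00, 4.00 and -7.00. Because each coefficient is visited exactly once and in order, the same loop can also be fed from sequentially read files, which is presumably what makes the disk-based variants possible.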

The next step computes the selected eigenvalues using the Davidson algorithm. The eigenvectors for each requested eigenvalue are saved and used to update the coefficients. This step proceeds consecutively for each block; after the last block, the weighted total energy is computed and displayed in each iteration.

Figure 6.17: Steps in solving the eigenvalue problem.
\begin{figure}\centerline{\psfig{file=tex/fig/diag_steps.eps}}\end{figure}

The optimization is performed on the functional

\begin{displaymath}{\cal E} = \sum_{T_i} w_{T_i} {\cal E}(T_i) /\sum_{T_i}w_{T_i}\end{displaymath}

where $w_{T_i}$ is the weight for ${T_i}$ and ${\cal E}(T_i)$ represents an energy functional for term $T$ and eigenvalue $i$. Updating the coefficients proceeds similarly to the calculation of the matrix elements. Depending on storage requirements and availability, one of the routines UPDATC_memory_all(), UPDATC_disk_ico(), UPDATC_disk_clst(), UPDATC_disk_ih() is applied to update the coefficients of all integrals that contribute to the energy. This procedure is applied to all blocks (figure 6.17). For every coefficient, the weighted contribution of the mixing coefficients is accumulated in a complicated loop containing multiple logical constructs and both floating-point and integer arithmetic. Upon convergence, during the last iteration, the routine prprty() is called to compute a number of wave function properties, which are saved in the file summry.
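As a small illustration of the weighted functional ${\cal E}$ defined above, the sketch below evaluates it for three eigenvalues. The weights and energies are invented numbers, and the names w, e and etot are hypothetical; they do not correspond to variables in mchf.

```fortran
program weighted_energy_sketch
  implicit none
  integer, parameter :: n = 3
  ! Hypothetical weights w_{T_i} and energies E(T_i).
  double precision, parameter :: w(n) = (/ 1.0d0, 2.0d0, 1.0d0 /)
  double precision, parameter :: e(n) = (/ -2.0d0, -1.5d0, -1.0d0 /)
  double precision :: etot

  ! E = sum_i w_i E_i / sum_i w_i
  etot = sum(w*e)/sum(w)

  print '(f10.4)', etot
end program weighted_energy_sketch
```

Here the weighted sum is -6 and the total weight is 4, so the program prints -1.5000. Dividing by the total weight means that only the ratios of the weights matter, not their absolute scale.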

The listing below shows the complexity of the coefficient updating process.

 
      if (iblock == 1) ncoef = 0
      do i = 1, cf_tot(iblock)
         n_count_tmp = ncoef + i
         if (i.gt.ico(nijcurr)) then
            nijcurr = nijcurr + 1
*           .. have we also changed column?
            if (nijcurr.gt.jptr(max_col)) then
               jjh = jjh + 1
               max_col = max_col + 1
            end if
         end if
         iih = ihh(nijcurr)
         im1 = 0
*        .. loop over the eigenvectors selected for this block
         do j = 1, maxev
            ioffw = (j-1)*ncfg
            if (leigen(j,iblock)) then
               wcoef = eigst_weight(j,iblock)
               W = wcoef*wt(ioffw+iih)*wt(ioffw+jjh)
               T = W*coeff(n_count_tmp)
*              .. off-diagonal elements are counted twice
               IF (IIH .NE. JJH) T = T+T
               coef(inptr(n_count_tmp)) = coef(inptr(n_count_tmp)) + T
*              .. last iteration: also store the unweighted term
*                 separately for each selected eigenvector
               if (last) then
                  W0 = wt(ioffw+iih)*wt(ioffw+jjh)
                  T0 = W0*coeff(n_count_tmp)
                  IF (IIH .NE. JJH) T0 = T0 + T0
                  itmp_s = (inptr(n_count_tmp)) + idim*im1
                  tmp_coef(itmp_s) = tmp_coef(itmp_s) + T0
                  im1 = im1 + 1
               end if
            end if
         end do
      end do
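The core of this update, stripped of the pointer bookkeeping, is a product of two eigenvector components and an eigenvalue weight, doubled for off-diagonal elements. The sketch below is a hedged, self-contained illustration of that pattern only. It assumes a single eigenvector, one coefficient per matrix element, and explicit row and column arrays (row, col) in place of the ico/jptr pointers; all names except wt, coeff, coef and inptr are hypothetical.

```fortran
program update_sketch
  implicit none
  ! Hypothetical toy problem: 2 configurations, 1 eigenvector,
  ! 3 matrix elements of the lower triangle, 2 integrals.
  integer, parameter :: ncfg = 2, n_elem = 3, n_int = 2
  integer, parameter :: row(n_elem) = (/ 1, 2, 2 /)
  integer, parameter :: col(n_elem) = (/ 1, 1, 2 /)
  ! inptr(k) selects the integral whose coefficient is updated.
  integer, parameter :: inptr(n_elem) = (/ 1, 2, 1 /)
  double precision, parameter :: coeff(n_elem) = (/ 1.0d0, 0.5d0, 1.0d0 /)
  ! Eigenvector components (normalized) and eigenvalue weight.
  double precision, parameter :: wt(ncfg) = (/ 0.6d0, 0.8d0 /)
  double precision, parameter :: weight = 1.0d0
  double precision :: coef(n_int), t
  integer :: k

  coef = 0.0d0
  do k = 1, n_elem
     t = weight*wt(row(k))*wt(col(k))*coeff(k)
     ! Only one triangle is stored, so off-diagonal terms count twice.
     if (row(k) /= col(k)) t = t + t
     coef(inptr(k)) = coef(inptr(k)) + t
  end do

  print '(2f8.4)', coef
end program update_sketch
```

For this input the two integral coefficients come out as 1.0000 and 0.4800: the diagonal elements contribute 0.36 and 0.64 to the first integral, and the doubled off-diagonal element contributes 0.48 to the second.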

Figure  6.18 shows the implementation and the call sequence in diag().

Figure 6.18: diag() computes the elements of the H matrix and solves the eigenvalue problem. At the beginning, memory is allocated for the diagonalization procedure. Then one of the versions of diag_*() is called, and the matrix elements are computed and diagonalized. Next, the dvdson() routines return the eigenvector(s), and the memory used by dvdson() is deallocated. Finally, the coefficients coef are updated in one of the versions of updatc_*(). NOTE: the diag_*() versions differ only with regard to memory and disk use.
\begin{figure}\begin{center}
\centerline{\psfig{figure=tex/fig/mchf_diag.epsi}}\end{center}\end{figure}

Each step has its own storage requirements. In the first step (hmx_diag), the arrays inptr, coeff and ico are accessed only once. In contrast, the iterative solution of the eigenvalue problem (dvdson) requires multiple read accesses to the interaction matrix (only hmx and ih are used). Therefore, hmx and ih were given higher priority for storage in memory. After dvdson, the memory used in diag_hmx and dvdson is deallocated and reused in updatc. Finally, updating the coefficients requires a single access to ih and ico, suggesting a higher priority for ico over coeff. Table 6.3 shows the multilevel storage scheduling derived from the frequency of data access. The storage scheduling depends on the size of the problem and the system capacity, and mchf is designed to select the best level with respect to computational efficiency.
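The priority ordering described above (hmx and ih first, then ico, then coeff) can be illustrated with a simple level selection. This is only a sketch under assumptions: the array sizes and memory budget are invented, the function pick_level is hypothetical, and the mapping of each routine name to the set of arrays kept on disk is inferred from the routine names rather than taken from Table 6.3, which remains the authoritative description.

```fortran
program storage_level_sketch
  implicit none
  ! Hypothetical array sizes, in words.
  integer, parameter :: size_hmx_ih = 400, size_ico = 300, size_coeff = 500

  ! Example: a budget of 800 words holds hmx, ih and ico, but not coeff.
  print '(a)', trim(pick_level(800))

contains

  ! Pick the fastest variant whose in-memory arrays fit into mem words.
  function pick_level(mem) result(name)
    integer, intent(in) :: mem
    character(len=20) :: name
    if (mem >= size_hmx_ih + size_ico + size_coeff) then
       name = 'diag_memory_all'   ! everything in memory
    else if (mem >= size_hmx_ih + size_ico) then
       name = 'diag_disk_clst'    ! assumed: coeff read from c.lst
    else if (mem >= size_hmx_ih) then
       name = 'diag_disk_ico'     ! assumed: coeff and ico read from disk
    else
       name = 'diag_disk_hmx'     ! all arrays read from disk
    end if
  end function pick_level

end program storage_level_sketch
```

With the sizes chosen here, a budget of 800 words selects diag_disk_clst. The arrays that are read repeatedly stay in memory longest, so shrinking the budget costs the least extra I/O at each step down.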

Figure 6.19: Storage access requirements in diag().
\begin{figure}\centerline{\psfig{file=tex/fig/diag_storage.epsi}}\end{figure}


2001-10-11