 JOINT   INSTITUTE   FOR   NUCLEAR   RESEARCH  #### F498

Authors: P.G.Akishin,A.P.Sapojnikov Language: Fortran  It sounds strange but usage of disks while solving large linear systems still looks as actual problem!
In fact, 2000x2000 matrix requires 32 Mb of memory, not every PC has it.
Usage of virtual memory involves a lot of swoping, because all usual solvers have to scan matrix chaotically, both "col by col" and "row by row".

Program proposed here uses classic Gauss scheme with pivot element in a column. We keep a matrix in a disk storage and process it "portion by portion" in a buffer given by user. Partially, if this buffer is large enough to contain all matrix, we don't use disks at all!
Actually, it's a special way of swoping which minimizes the number of disk exchanges. Details of matrix placement are fully hidden for users.

The user have to do only 2 actions:

1. to prepare a function for source matrix and system right parts elements calculation;
2. to prepare an array for matrix buffering, the larger it is - the less will be swoping.

So, let us start. The N-dimensional system A*X = C is solving,
where A - matrix [ N : N ], X,C - vectors with lehgth N.
Practically very often we're solving the whole family of NC systems

A * X(k) = C(k)    k=1,2,...,NC

with the same matrix A. At first we transform the input matrix A into two triangle matrix L and R, such as A = ~L * R .
Than for each right part C(k) of source system the solution X(k) results as trivial multiplication: X(k) = ~R * ( L * C(k) ).

Process of solving consist of 3 stages:

1. CALL XLLS_Solver(buf,lbuf,n,elm) - matrix input and factorization.

 n - system dimension; buf - onedimensional array with length not less than lbuf REAL*8 words; lbuf - size of working storage for Solver. lbuf must be not less than 3*n. For the other hand, if lbuf > 2*n*(n+nc+1)+n we needn't use disk storage at all. While using disk storage the logical numbers 78 and 79 used; elm(i,j) - REAL*8 function for A(i,j) calculation. Solver inputs the source matrix A only once by calling elm(i,j) strictly "col by col", i.e. in such a sequence: (1,1),(2,1),...,(n,1), (1,2),(2,2),...,(n,2),...,(n-1,n),(n,n).

2. CALL XLLS_Solver(buf,0,nc,elc) - solving for given Right Parts.

 buf - the same working area as for stage 1; nc - number of system right parts C; elc(i,j) - REAL*8 function calculates C(i,j), i=1,2,...,n; j=1,2,...,nc set of nc system right parts (n rows, nc columns). elc also is called only once and strictly "col by col". The solution stays in a working area.

3. CALL XLLS_Extractor(buf,x,k) - extract one solution X(k), corresponding the right part C(k).

 buf - the same working area as for stage 1; k - 1,2,...,nc - number of solution; x - REAL*8 array with length n. Extracted solution'll be placed here.

If one need not solve the system but get the inverse matrix ~A, stages 1 and 2 are combined into the single calling with negative N:

CALL XLLS_Solver(buf,lbuf,-n,elm)

In this case the Solver, having performed factorization of matrix A, just starts the solving of A * X = E system with n right parts.

Before the 1-st call of Solver the User can change some of his global
parameters if uses

subroutine XLLS_Params(np), where:
 np = 1 - economize of disk storage. Only 4 bytes of each REAL*8 number will be written on disk. It reduces twice required 16*n*(n+nc) amount of disk storage, thought decreases accuracy. np = 2 - restore standard regime of writing. np = 7 - print disk exchange counter. np = 8 - some computers, for example VAXes, require to measure disk record sizes in words but not in bytes!      