VECFEM3 Reference Manual: LSOLPP
Type: FORTRAN routine
NAME
LSOLPP - LINSOL: Solves a linear system of equations MAT*x = b
SYNOPSIS
call LSOLPP(lmat,lprec,liprec,lindex,l,ldw,liw,nproc,lsym,ia1,info,MAT,prec,x,b,dw,iprec,index,iw,ilin,lmatbk,ptrinf,tids,epslin,iconv,ierr)
integer lmat,lprec,liprec,lindex,l,ldw,liw,nproc,ia1,iconv,ierr
integer index(lindex),info(ia1,7),ilin(50),iprec(liprec),iw(liw),lmatbk(nproc),ptrinf(12,nproc),tids(nproc)
double precision MAT(lmat),prec(lprec),dw(ldw),x(l),b(l),epslin
logical lsym
PURPOSE
LSOLPP is a routine to solve a linear system MAT*x = b
with a large and sparse l x l real matrix MAT
and a right hand side b of length l.
Different solution methods of the generalised CG family are available
and can be selected by the user. To accelerate the iteration
and to improve stability, the matrix MAT is
normalised by a diagonal matrix and/or preconditioned from the left-hand
side and/or the right-hand side. MAT is handed over in
a sparse matrix storage scheme.
ARGUMENTS
If a parameter has the attribute 'local', the parameter values
can vary on different processors. If a parameter has the
attribute 'global', it must have the same value on all processors.
The label *PAR means that the parameter only has to be
declared, but does not have to be set (initialised) on NON-PARALLEL
computers. The label **PAR means that the parameter
only has to be declared, but does not have to be set, if nproc is
set (initialised) to 1 (on NON-PARALLEL computers nproc is
internally set to 1).
- lmat integer, scalar, input, local
- Length of (local) matrix array MAT.
- lprec integer, scalar, input, local
- Length of the preconditioning matrix prec. lprec
must be greater than or equal to 5*l.
- lindex integer, scalar, input, local
- Length of the integer array index.
- liprec integer, scalar, input, local
- Length of the integer array iprec. liprec
must be greater than or equal to 1.
- l integer, scalar, input, global
- Maximal number of unknowns on a processor (it must hold:
l = max{lmatbk(i),i=1,..,nproc}).
- ldw integer, scalar, input, local
- Length of the real work array dw.
ldw >= 22*l for PRES20 and the Polyalgorithm,
ldw >= 22*l for BICO, the QMR-Simulator and CG,
ldw >= 22*l otherwise.
- liw integer, scalar, input, local
- Length of the integer work array iw.
liw >= 2*nproc, if optim <> 100 and optim <> 200,
liw >= max(2*nproc, 4*l), if optim = 100 or optim = 200.
- nproc integer, scalar, input, global
<<< *PAR
- Number of processors (processes), see combgn.
- lsym logical, scalar, input, global
- is .true., if the matrix MAT is symmetric.
- ia1 integer, scalar, input, local
- First dimension of matrix information array info, see
sparse matrix storage scheme.
The parameter must be greater than or equal to max{ptrinf(12,i), i=1,..,nproc}
on each processor.
- info integer, array: info(ia1,ia2),
input, local
- Information array for the matrix MAT (ia2 {= 7}
is a constant), see sparse matrix
storage scheme.
- MAT double precision, array: MAT(lmat),
input/output, local
- Matrix array, see sparse matrix
storage scheme.
- prec double precision, array: prec(lprec),
input, local
- Preconditioning matrix.
- x double precision, array: x(l),
input/output, local
- Solution vector.
- b double precision, array: b(l),
input/output, local
- Right hand side.
- dw double precision, array: dw(ldw),
internal, local
- Real work array.
- iprec integer, array: iprec(liprec),
input, local
- Information vector of preconditioning matrix.
- index integer, array: index(lindex),
input, local
- Array of indices allowing indirect access to the matrix
array MAT, see sparse
matrix storage scheme.
- iw integer, array: iw(liw),
internal, local
- Integer work array.
- ilin integer, array: ilin(nilin),
input/output, global
- Information vector (nilin {=50} is a constant).
- (1) = ms, input, global
- Method selection.
ms = 1 : PRES20,
ms = 2 : BICO,
ms = 3 : Bi-CGSTAB,
ms = 4 : AT-PRES,
ms = 5 : CGS,
ms = 6 : QMR-Simulator,
ms = 7 : GMERR,
ms = 9 : CG-AT for non-symmetric matrices,
ms = 10 : Polyalgorithm PRES20 -> BICO -> AT-PRES,
ms = 123 : Classical CG for symmetric matrices.
- (2) = itmax, input, global
- Maximal number of iteration steps.
- (3) = nmsg, input/output, global
<<< *PAR
- Message counter for the number of communication
units.
- (4) not used
- (5) not used
- (6) = isprec, input/output, global
- indicates whether the matrix MAT is already
normalised.
isprec = 0 : the matrix is not normalised,
isprec = 1 : the matrix is normalised.
- (7) not used
- (8) = msprec, input, global
- selects the method of preconditioning. (The
preconditioning matrices are stored as one-dimensional
arrays in prec and accessed via the
information array iprec.)
msprec = 0 : no preconditioning,
msprec = 1 : preconditioning by (complete) LU factorisation << in preparation >>.
- (9) = noprec, input, global
noprec = 0 : the normalised system is considered when checking the stopping criterion,
noprec = 1 : the original system is considered when checking the stopping criterion.
- (10) = nmvm, output, global
- Number of computed matrix-vector multiplications.
- (11) not used
- (12) = lout, input, local
- Unit number of the standard output file (default
is 6).
- (13) = idoku, input, global
- Output control.
idoku = 0 : print only error messages,
idoku = 1 : print further information, but no output during the iteration,
idoku = i > 1 : output after every i iteration steps on the processor with logical process(or) number 1,
idoku = -i < 0 : the same as for idoku > 0, but with output on all processors.
- (14) not used
- (15) = msnorm, input, global
- selects the method of normalisation. Note: "A"
stands for the usually stored matrix (the matrix MAT is
stored as a one-dimensional array and accessed
via the array info).
msnorm = 0 : no normalisation,
msnorm = 1 : normalisation by row sum (prec(i) = sign(A_ii)/sum(abs(A_ij), j=1,..,l)),
msnorm = 2 : normalisation by the main diagonal elements (prec(i) = 1/A_ii),
msnorm = 3 : normalisation by square sum (prec(i) = 1/sqrt(sum(A_ij**2, j=1,..,l))),
msnorm = 4 : Frobenius normalisation (prec(i) = A_ii/sum(A_ij**2, j=1,..,l)),
msnorm = 11, 12, 13, 14 : same as msnorm = 1, 2, 3, 4, but with the square root
of the preconditioning matrix applied from the left and from the right
(automatically selected if the matrix MAT is symmetric).
- (16) = startx, input, global
startx <> 4711 : the iteration starts from the zero vector,
startx = 4711 : an initial guess is given in x.
- (17) = myproc, input, local
<<< **PAR
- Logical process(or) number.
- (18) = smooth, input, global
smooth = 0 : the iteration returns the non-smoothed solution,
smooth <> 0 : the iteration returns the smoothed solution.
- (19) = misc, input, global
misc = 00000 : standard LSOLPP,
misc = 00001 : printing and checking - as far as possible - of vector terms (you can set ilin(21)),
misc = 00010 : check whether recursions emerge (only of use on vector computers) <not yet implemented>,
misc = 00100 : reading of the (distributed) matrix MAT in LSOLPP format (you must set ilin(22)),
misc = 01000 : storing of the (distributed) matrix MAT in LSOLPP format (you must set ilin(23)),
misc = 10000 : output of errors.
- (20) = optim, input, global
optim = 000 : no optimisation,
optim = 001 : optimisation of data structures <not yet implemented>,
optim = 010 : optimisation of data structures with printing of statistics regarding the data structures <not yet implemented>,
optim = 100 : safe cache reuse, i.e. the arrays MAT and index do not change,
optim = 200 : unsafe cache reuse, i.e. the arrays MAT and index do change.
- (21) = vtunit, input, global
- I/O unit for printing of vector terms (should be
greater than 20 and less than 100; if not set,
printing goes to standard output).
- (22) = matin, input, global
- I/O unit for reading of the matrix MAT (should
be greater than 20 and less than 100; if not set,
LSOLPP stops).
- (23) = matout, input, global
- I/O unit for storing of the matrix MAT (should
be greater than 20 and less than 100; if not set,
LSOLPP stops).
- lmatbk integer, array: lmatbk(nproc),
input, global <<< **PAR
- lmatbk(i) returns the number of unknowns on
process(or) i, see sparse
matrix storage scheme.
- ptrinf integer, array: ptrinf(ntyp+1,nproc),
input, local
- ptrinf(ityp,i)+1 points to the first entry
within the array info pointing to vector terms
of storage pattern ityp in block i; the array info
must be sorted by storage patterns (ntyp {=11}
is a constant) and by blocks, see sparse matrix storage scheme.
- tids integer, array: tids(nproc),
input, global <<< **PAR
- tids defines the mapping of the logical
process(or) numbers to the physical process ID's, see combgn.
- iconv integer, output, global
- iconv indicates the behaviour of LSOLPP.
iconv = 1 : convergence,
iconv = 2 : the iteration reached the maximal number of matrix-vector multiplications (declared in ilin(2)),
iconv = 3 : divergence,
iconv = 4 : the matrix is not suitable for LSOLPP, e.g. a non-regular (singular) matrix,
iconv = 99 : fatal error.
- ierr integer, output, local
- ierr is an error number with information about
the error. ierr = ijk, where
i = 0,..,9 indicates the level of routines on which the error occurred,
j indicates the type of error:
j = 0 : wrong parameters,
j = 1 : error during normalisation,
j = 2 : error during the iteration process,
j = 3 : error during communication,
and k is a general error counter with 0 < k < 100.
- epslin double precision, input, global
- epslin is the relative stopping factor. If (lcond1
.and. lcond2) holds, the iteration is stopped. If (lcond1 .and.
.not. lcond2) holds, a new stopping criterion is computed.
lcond1 = (||(preconditioned) smoothed residual||_2 <= epslin * ||(preconditioned) r.h.s.||_2)
and
noprec = 1 : lcond2 = (||original smoothed residual||_max <= epslin * ||original r.h.s.||_max),
noprec = 0 : lcond2 = (||(preconditioned) smoothed residual||_max <= epslin * ||(preconditioned) r.h.s.||_max).
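As an illustration of this test, the following Fortran sketch (not part of LSOLPP; the vector names sresp, rhsp, sres and rhs are assumptions) evaluates lcond1 and lcond2 for the case noprec = 1:

    ! Sketch: evaluation of the stopping conditions for noprec = 1.
    ! sresp/rhsp: preconditioned (normalised) smoothed residual and r.h.s.,
    ! sres/rhs:   original smoothed residual and r.h.s. (hypothetical names).
    logical function stop_test(n, sresp, rhsp, sres, rhs, epslin)
      implicit none
      integer, intent(in)          :: n
      double precision, intent(in) :: sresp(n), rhsp(n), sres(n), rhs(n), epslin
      logical :: lcond1, lcond2
      lcond1 = sqrt(sum(sresp**2)) <= epslin*sqrt(sum(rhsp**2))   ! Euclidean norms
      lcond2 = maxval(abs(sres))   <= epslin*maxval(abs(rhs))     ! maximum norms
      stop_test = lcond1 .and. lcond2
    end function stop_test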
EXAMPLE
N/A
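Purely for orientation, a hypothetical Fortran 90 calling sequence might look as sketched below. This is not an official example: all dimensioning values are placeholders, and the set-up of the arrays MAT, info, index, ptrinf, prec and iprec according to the sparse matrix storage scheme is omitted.

    program lsolpp_sketch
      ! Hypothetical calling sketch; the sparse-matrix set-up is not shown.
      implicit none
      integer, parameter :: l = 1000, nproc = 1, ia1 = 1
      integer, parameter :: lmat = 1, lindex = 1   ! placeholders, depend on the storage scheme
      integer, parameter :: lprec = 5*l, liprec = 1
      integer, parameter :: ldw = 22*l, liw = max(2*nproc, 4*l)
      integer :: info(ia1,7), index(lindex), iprec(liprec), iw(liw)
      integer :: ilin(50), lmatbk(nproc), ptrinf(12,nproc), tids(nproc)
      integer :: iconv, ierr
      double precision :: MAT(lmat), prec(lprec), dw(ldw), x(l), b(l), epslin
      logical :: lsym

      ! ... fill MAT, info, index, ptrinf and b according to the
      !     sparse matrix storage scheme (omitted here) ...

      lsym      = .false.
      lmatbk(1) = l
      ilin      = 0
      ilin(1)   = 10        ! ms:     Polyalgorithm PRES20 -> BICO -> AT-PRES
      ilin(2)   = 500       ! itmax:  maximal number of iteration steps
      ilin(6)   = 0         ! isprec: matrix is not yet normalised
      ilin(8)   = 0         ! msprec: no preconditioning
      ilin(9)   = 1         ! noprec: stopping test on the original system
      ilin(12)  = 6         ! lout:   standard output unit
      ilin(13)  = 1         ! idoku:  print further information
      ilin(15)  = 1         ! msnorm: normalisation by row sum
      ilin(16)  = 0         ! startx: start from the zero vector
      ilin(17)  = 1         ! myproc: logical process(or) number
      ilin(18)  = 1         ! smooth: return the smoothed solution
      epslin    = 1.0d-6

      call LSOLPP(lmat,lprec,liprec,lindex,l,ldw,liw,nproc,lsym,ia1,info,MAT,prec, &
                  x,b,dw,iprec,index,iw,ilin,lmatbk,ptrinf,tids,epslin,iconv,ierr)
      if (iconv /= 1) write(*,*) 'no convergence: iconv =', iconv, ', ierr =', ierr
    end program lsolpp_sketch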
METHOD
The iterative methods, smoothing, and normalisation are
described in the LINSOL user's guide [10].
Normalisation
At the beginning the matrix MAT is normalised from
the left by a diagonal matrix prec(1..l), if MAT
is not yet normalised (ilin(6) = isprec = 0) and a
normalisation method is chosen (ilin(15) = msnorm
<> 0). If MAT is symmetric or a symmetric
normalisation is selected, it is additionally normalised from the
right to preserve symmetry. The user can choose between different methods
of normalisation. We recommend the normalisation by row sum (ilin(15)
= 1), because in most cases it is the best method with regard
to the number of matrix-vector multiplications performed. On return from LSOLPP
the array MAT is normalised (ilin(6) = isprec
= 1). This is important to remember if LSOLPP
is called again or the matrix is to be used in other contexts after
calling LSOLPP!
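Purely as an illustration of the recommended row-sum variant (ilin(15) = msnorm = 1), the following sketch applies the normalisation to a dense n x n matrix A and a right hand side b; LSOLPP performs the equivalent operation on its sparse storage scheme, and the names used here are assumptions:

    ! Sketch of row-sum normalisation (msnorm = 1) on a dense matrix.
    subroutine rowsum_normalise(n, A, b, prec)
      implicit none
      integer, intent(in)             :: n
      double precision, intent(inout) :: A(n,n), b(n)
      double precision, intent(out)   :: prec(n)
      integer :: i
      do i = 1, n
        ! prec(i) = sign(A_ii) / sum_j |A_ij|
        prec(i) = sign(1.0d0, A(i,i)) / sum(abs(A(i,:)))
        A(i,:)  = prec(i) * A(i,:)   ! scale row i of the matrix
        b(i)    = prec(i) * b(i)     ! scale the right hand side
      end do
    end subroutine rowsum_normalise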
Smoothing
The residual of a generalised CG-method can oscillate strongly.
Therefore both the residual and the approximate solution are
smoothed in every iteration step to ensure that the Euclidean
norm of the residual does not increase. By setting the parameter ilin(18)
to zero it is possible to obtain the original solution instead of
the smoothed one, but the iteration process is always controlled
by the smoothed residual.
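The following sketch shows one minimal residual smoothing step in the sense of [9] (dense vectors and hypothetical names; the LINSOL-internal implementation may differ in detail):

    ! One minimal residual smoothing step: given the new iterate x and
    ! residual r, update the smoothed pair (y, s) such that the Euclidean
    ! norm of the smoothed residual s cannot increase.
    subroutine smooth_step(n, x, r, y, s)
      implicit none
      integer, intent(in)             :: n
      double precision, intent(in)    :: x(n), r(n)   ! unsmoothed iterate and residual
      double precision, intent(inout) :: y(n), s(n)   ! smoothed iterate and residual
      double precision :: d(n), eta
      d = r - s
      if (sum(d**2) > 0.0d0) then
        eta = -sum(s*d) / sum(d**2)   ! minimises || s + eta*d ||_2
        s = s + eta*d
        y = y + eta*(x - y)
      end if
    end subroutine smooth_step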
Non-symmetric
For non-symmetric problems, LSOLPP provides seven different
solution algorithms. These algorithms are implemented both with
and without smoothing, so that the seven implemented algorithms
yield thirteen methods. In the following, the ten most popular
and well-known basic methods, one error-minimising method and
the polyalgorithm are described.
- 1. PRES20 [1] (Pseudo-RESidual 20),
ilin(1)=1, ilin(18)<>0
- For sufficient diagonal dominance of the matrix this is a
quickly converging method. If the norm of the residual
decreases only in the first iteration steps, then choose
one of the other methods. The residuals decrease
monotonically.
- 2a. BICO [1] (BICOnjugate gradients
smoothed), ilin(1)=2, ilin(18)<>0
- BICO is more robust than PRES20 and converges quickly
even if the matrix is not diagonally dominant, but BICO
uses two matrix-vector multiplications (MVM) by both the
matrix MAT and its transposed matrix MAT^T
per iteration step. The residuals decrease monotonically.
- 2b. BCG [2] (BiConjugate Gradients),
ilin(1)=2, ilin(18)=0
- BCG is more robust than PRES20 and converges quickly even
if the matrix is not diagonally dominant, but BCG uses
two matrix-vector multiplications (MVM) by both the
matrix MAT and its transposed matrix MAT^T
per iteration step. The residuals oscillate heavily.
Therefore, use BICO or QMR instead.
- 3a. Bi-CGSTAB smoothed [3,8]
, ilin(1)=3, ilin(18)<>0
- The Bi-Conjugate Gradient STABilized (Bi-CGSTAB) method
is a modification of CGS that should be more stable. The
residuals decrease monotonically.
- 3b. Bi-CGSTAB [3] , ilin(1)=3, ilin(18)=0
- This is a modification of CGS that should be more stable.
- 4a. AT-PRES = CGNR [4] , ilin(1)=4,
ilin(18)<>0
- The A_Transposed-Pseudo_RESidual (AT-PRES) method is
equivalent to the CG Normal equations Residual-minimising
(CGNR) method and should only be used if one of the
methods BICO, CGS or Bi-CGSTAB does not converge. It may
converge slowly but it is very robust. The residuals
decrease monotonically.
- 4b. CGNE (CG Normal eq. Error-min.) [5],
ilin(1)=4, ilin(18)=0
- CGNE should only be used if one of the methods BICO, CGS
or Bi-CGSTAB does not converge. It may converge slowly
but it is very robust. The errors decrease monotonically.
- 5a. CGS (Conjugate Grad. Squared) smoothed, ilin(1)=5,
ilin(18)<>0
- The smoothed CGS [6,8]
is as robust as BICO. An advantage of CGS compared to
BICO is that CGS only uses matrix-vector multiplications
by MAT and not by the transposed matrix MAT^T.
Therefore it could run a little better on parallel computers. The
residuals decrease monotonically.
- 5b. CGS (Conjugate Gradient Squared) [6],
ilin(1)=5, ilin(18)=0
- CGS is as robust as BICO. An advantage of CGS compared to
BICO is that CGS only uses matrix-vector multiplications
by MAT and not by the transposed matrix MAT^T.
Therefore it could run a little better on parallel computers.
- 6. QMR simplified [7,9],
ilin(1)=6, ilin(18)<>0
- The Quasi-Minimal Residual (QMR) method behaves similarly
to BICO. In this implementation the look-ahead process
for avoiding breakdowns is omitted.
- 7. GMERR (General Minimal ERRor), ilin(1)=7
- GMERR is the generalisation of Fridman's algorithm to
arbitrary matrices. GMERR should be competitive with
GMRES for symmetric matrices.
- 9. CG-AT (CG with matrix A*A^T), ilin(1)=9
- LSOLPP solves the linear system (MAT)(MAT^T)x = b;
you only have to hand over the matrix MAT to
LSOLPP. The matrix MAT need not be symmetric.
The matrix (MAT)(MAT^T) is then
symmetric; thus LSOLPP chooses the CG method as solution
process.
- 10. Polyalgorithm [1]
- The Polyalgorithm tries to exploit the advantages of the
methods PRES20, BICO and AT-PRES. It starts with the most
efficient - but least robust - method PRES20. If PRES20
falls asleep, it switches to BICO. If BICO does not
converge fast enough, the Polyalgorithm changes to the
"emergency exit" AT-PRES. The Polyalgorithm
should be used if no detailed knowledge of an optimal
iterative method exists. In 95% of all cases the
Polyalgorithm is as fast as the best method in the set {PRES20,BICO,AT-PRES}.
If you use the above-mentioned methods, the matrix
must not be stored in the symmetric storage scheme.
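For orientation, the sketch below shows a plain, unpreconditioned and unsmoothed Bi-CGSTAB iteration in the sense of [3] on a dense matrix; the actual LSOLPP implementation works on the sparse storage scheme and adds normalisation, smoothing and further safeguards, so this is only a simplified model:

    ! Plain dense Bi-CGSTAB sketch (cf. [3]); no normalisation, no
    ! smoothing, no breakdown handling.
    subroutine bicgstab_sketch(n, A, b, x, tol, itmax)
      implicit none
      integer, intent(in)             :: n, itmax
      double precision, intent(in)    :: A(n,n), b(n), tol
      double precision, intent(inout) :: x(n)
      double precision :: r(n), r0(n), p(n), v(n), s(n), t(n)
      double precision :: rho, rho_old, alpha, beta, omega
      integer :: it
      r = b - matmul(A, x)
      r0 = r
      p = 0.0d0; v = 0.0d0
      rho_old = 1.0d0; alpha = 1.0d0; omega = 1.0d0
      do it = 1, itmax
        rho   = dot_product(r0, r)
        beta  = (rho/rho_old)*(alpha/omega)
        p     = r + beta*(p - omega*v)
        v     = matmul(A, p)
        alpha = rho / dot_product(r0, v)
        s     = r - alpha*v
        t     = matmul(A, s)
        omega = dot_product(t, s) / dot_product(t, t)
        x     = x + alpha*p + omega*s
        r     = s - omega*t
        if (sqrt(sum(r**2)) <= tol*sqrt(sum(b**2))) exit
        rho_old = rho
      end do
    end subroutine bicgstab_sketch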
Symmetric
The matrix MAT has to be given in the symmetric
storage scheme. This reduces both the amount of workspace and the
CPU time needed. Again you obtain two different methods, depending
on whether smoothing is chosen or not. For symmetric matrices
both CG methods are faster and more robust than the above-mentioned
methods.
- 123a. CG smoothed = GMRES, ilin(1)=123, ilin(18)<>0
- The smoothed Conjugate Gradient (CG) method is
mathematically equivalent to the GMRES method.
- 123b. Classical CG, ilin(1)=123, ilin(18)=0
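For reference, the following sketch shows the plain classical CG iteration for a dense symmetric positive definite matrix (no normalisation, no smoothing; all names are assumptions):

    ! Plain dense classical CG sketch for a symmetric positive definite
    ! matrix A; no normalisation, no smoothing.
    subroutine cg_sketch(n, A, b, x, tol, itmax)
      implicit none
      integer, intent(in)             :: n, itmax
      double precision, intent(in)    :: A(n,n), b(n), tol
      double precision, intent(inout) :: x(n)
      double precision :: r(n), p(n), q(n), rho, rho_old, alpha
      integer :: it
      r = b - matmul(A, x)
      p = r
      rho = sum(r**2)
      do it = 1, itmax
        if (sqrt(rho) <= tol*sqrt(sum(b**2))) exit
        q       = matmul(A, p)
        alpha   = rho / dot_product(p, q)
        x       = x + alpha*p
        r       = r - alpha*q
        rho_old = rho
        rho     = sum(r**2)
        p       = r + (rho/rho_old)*p
      end do
    end subroutine cg_sketch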
REFERENCES
- [1] W. Schoenauer, Scientific Computing
on Vector Computers, North-Holland, 1987, Amsterdam, New
York, Oxford, Tokyo.
- [2] R. Fletcher, Conjugate Gradient
Methods for Indefinite Systems, Proc. Dundee Biennial
Conf. on Num. Anal., 1975, ed. G.A. Watson, pp. 73-89,
Springer-Verlag.
- [3] H.A. van der Vorst, BI-CGSTAB: A
Fast and Smoothly Converging Variant of BI-CG for the
Solution of Non-symmetric Linear Systems, SIAM J. Sci.
Stat. Comp., 1992, Vol. 13, No. 2, pp. 631-644.
- [4] W. Schoenauer, M. Schlichte, R.
Weiss, Wrong Ways and Promising Ways towards Robust and
Efficient Iterative Linear Solvers, Advances in Computer
Methods for Partial Differential Equations VI, Publ.
IMACS, 1987, pp. 7-14.
- [5] E.J. Craig, The N-Step Iteration
Procedures, Math. Phys., 1955, Vol. 34, pp. 64-73.
- [6] P. Sonneveld, CGS, A fast Lanczos-type
Solver for non-symmetric linear Systems, SIAM J. Sci.
Stat. Comp., 1989, Vol. 10, No.1, pp. 36-52.
- [7] R.W. Freund and N.M. Nachtigal, QMR:
a Quasi-Minimal Residual Method for Non-Hermitian Linear
Systems, Numer. Math., 1991, Vol. 60, pp. 315-339.
- [8] R. Weiss, Properties of Generalised
Conjugate Gradient Methods, Num. Lin. Alg. with Appl.,
1994, Vol. 1, No. 1, pp. 45-63.
- [9] L. Zhou and H.F. Walker, Residual
Smoothing Techniques for Iterative Methods, SIAM J. Sci.
Comput., 1994, Vol. 15, No. 2, pp. 297-312.
- [10] R. Weiss, H. Haefner, W.
Schoenauer: LINSOL (LINear
SOLver) - Description and User's Guide for the
Parallelized Version (DRAFT Version 0.96); University
of Karlsruhe; Computing Center; Internal Report 61/95;
1995.
Authors
Program by L. Grosz, H. Haefner, C. Roll, P.
Sternecker, R. Weiss, 1989-96. Please contact L. Grosz.
By L. Grosz, Auckland, 4 June 2000.