This document is essential reading for every user of the NAG Fortran SMP Library implementation specified in the title. It provides implementation-specific detail that augments the information provided in the NAG Fortran SMP Library documentation. Wherever this documentation refers to the "Users' Note for your implementation", you should consult this note.
NAG recommends that you read the following minimum reference material which can be found in the documentation, together with this note, before calling any library routine:
(a) Introduction to the NAG Fortran SMP Library
(b) Essential Introduction to the NAG Fortran Library
(c) The appropriate Chapter Introduction
(d) The appropriate Routine Document
Assuming that libnagsmp.a has been installed in a directory in the search path of the linker, such as /usr/lib, then you may link to the NAG Fortran SMP Library in the following manner:
(a) Set the environment variable OMP_NUM_THREADS to the number of processors to be used, e.g.:
(for Korn and Bourne shell - ksh, bsh)
  OMP_NUM_THREADS=N
  export OMP_NUM_THREADS
(for C shell - csh)
  setenv OMP_NUM_THREADS N
where N is the number of processors to be used.
(b) Compile and link with the NAG Fortran SMP Library and the Intel Math Kernel Library (MKL), e.g. to use the static MKL libraries
  ifc -auto -O3 -tpp6 -xK -openmp -fpp driver.f \
      -lnagsmp -lguide /opt/intel/mkl/lib/32/libmkl_lapack.a \
      /opt/intel/mkl/lib/32/libmkl_p3.a
or, to use the shared MKL libraries
  ifc -auto -O3 -tpp6 -xK -openmp -fpp driver.f \
      -lnagsmp -lguide -L/opt/intel/mkl/lib/32 -lmkl_lapack -lmkl_p3
where driver.f is your application program, and the Intel MKL library has been installed in /opt/intel/mkl.
In the above example the -openmp compiler option is essential so that linking automatically includes the required OpenMP runtime library. It is also essential to link with the Guide library (-lguide) before linking to MKL 5.2. It may be necessary to increase the value of the environment variable KMP_STACKSIZE, which sets the thread stack size in bytes. Also note that the flag -static should not be used. Consult the Intel Fortran Compiler release notes for further information on these issues.
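The environment settings described above can also be prepared from a launcher script instead of the interactive shell. The following is a minimal Python sketch under stated assumptions: the helper name omp_environment and the executable ./driver are hypothetical and not part of the NAG distribution, and the stack size is passed as a plain byte count, as this note describes.

```python
import os


def omp_environment(num_threads, stack_bytes=None, base=None):
    """Return a copy of an environment with the OpenMP settings applied.

    num_threads -> OMP_NUM_THREADS (number of processors to be used)
    stack_bytes -> KMP_STACKSIZE (thread stack size in bytes), optional
    base        -> environment to copy; defaults to the current one
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)
    if stack_bytes is not None:
        if stack_bytes < 1:
            raise ValueError("stack_bytes must be positive")
        env["KMP_STACKSIZE"] = str(stack_bytes)
    return env
```

A program linked as shown above could then be started with, for example, subprocess.run(["./driver"], env=omp_environment(4)), where ./driver stands for your own executable.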
The example programs are most easily accessed by the command nagexample, which will provide you with a copy of an example program (and its data, if any), compile the program and link it with the library (showing you the compile command so that you can recompile your own version of the program). Finally, the executable program will be run, presenting its output to stdout. The example program concerned is specified by the argument to nagexample, e.g.
nagexample c06eaf
will copy the example program and its data into the files c06eafe.f and c06eafe.d in the current directory and process them to produce the example program results.
The example programs supplied to a site in machine-readable form have been modified as necessary so that they are suitable for immediate execution. In some instances they may differ from the example program supplied in the documentation. The distributed example programs should be used in preference wherever possible.
For this double precision implementation, the bold italicised terms used in the documentation should be interpreted as:
  real                 - DOUBLE PRECISION (REAL*8)
  basic precision      - double precision
  complex              - COMPLEX*16
  additional precision - quadruple precision (REAL*16)
  machine precision    - the machine precision, see the value returned by X02AJF in Section 3
Thus a parameter described as real should be declared as DOUBLE PRECISION in your program. If a routine accumulates an inner product in additional precision, it is using software to simulate quadruple precision.
In some routine documents additional bold italicised terms are used in the published example programs and they must be interpreted as follows:
  real as an intrinsic function name - DBLE
  imag                               - DIMAG
  cmplx                              - DCMPLX
  conjg                              - DCONJG
  e in constants, e.g. 1.0e-4        - D, e.g. 1.0D-4
  e in formats, e.g. e12.4           - D, e.g. D12.4
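When many published example programs have to be adapted, the substitutions listed above can be applied mechanically. The sketch below is illustrative only: the names INTRINSICS, double_precision_constant and double_precision_format are hypothetical, and the patterns handle just the simple forms shown in the table (a lone constant or a lone edit descriptor), not arbitrary Fortran source.

```python
import re

# Typeset term -> double precision intrinsic, as listed in this note.
INTRINSICS = {
    "real": "DBLE",
    "imag": "DIMAG",
    "cmplx": "DCMPLX",
    "conjg": "DCONJG",
}


def double_precision_constant(text):
    """Rewrite the exponent letter of a constant, e.g. 1.0e-4 -> 1.0D-4."""
    # The exponent letter must follow a digit or decimal point and must be
    # followed by an optionally signed digit.
    return re.sub(r"(?<=[0-9.])[eE](?=[+-]?[0-9])", "D", text)


def double_precision_format(descriptor):
    """Rewrite a single edit descriptor, e.g. e12.4 -> D12.4."""
    # Only a leading e/E directly followed by the field width is changed.
    return re.sub(r"^[eE](?=[0-9])", "D", descriptor)
```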
All references to routines in Chapter F07 - Linear Equations (LAPACK) and Chapter F08 - Least-squares and Eigenvalue Problems (LAPACK) use the LAPACK name, not the NAG F07/F08 name. The LAPACK name is precision dependent, and hence the name appears in a bold italicised typeface.
The typeset examples use the single precision form of the LAPACK name. To convert this name to its double precision form, change the first character either from S to D or C to Z as appropriate. For example:
  sgetrf refers to the LAPACK routine name - DGETRF
  cpotrs refers to the LAPACK routine name - ZPOTRS
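The renaming rule for LAPACK routines is a single-character substitution, which the following Python sketch spells out. The function name to_double_precision is hypothetical; it covers only the S to D and C to Z cases described in this note and rejects any other leading character.

```python
# Leading character of the single precision name -> double precision form.
PREFIX = {"S": "D", "C": "Z"}


def to_double_precision(name):
    """Convert a typeset single precision LAPACK name to its double form."""
    upper = name.strip().upper()
    if not upper or upper[0] not in PREFIX:
        raise ValueError("not a single precision LAPACK name: %r" % name)
    return PREFIX[upper[0]] + upper[1:]
```

For example, to_double_precision("sgetrf") gives DGETRF and to_double_precision("cpotrs") gives ZPOTRS, matching the two cases listed above.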
Certain routines produce explicit error messages and advisory messages via output units which either have default values or can be reset by using X04AAF for error messages and X04ABF for advisory messages. (The default values are given in Section 3.) The maximum record lengths of error messages and advisory messages (including carriage control characters) are 80 characters, except where otherwise specified.
The following machine-readable information files are provided in the doc directory:
un.html - Users' Note (this document)
blas_lapack_to_nag - BLAS/F06, LAPACK/F07 and LAPACK/F08 listing
nag_to_blas_lapack - F06/BLAS, F07/LAPACK and F08/LAPACK listing
See Section 4 for additional documentation available from NAG.
Any further information which applies to one or more routines in this implementation is listed below, chapter by chapter.
The example programs for D03RAF and D03RBF take much longer to run than other examples.
In this implementation calls to the Basic Linear Algebra Subprograms (BLAS) and linear algebra routines (LAPACK) are implemented by calls to the Intel Math Kernel Library (MKL), except for the following routines where the NAG equivalent is used:
DBDSQR DGEBAL DGEBRD DGEHRD DGEQRF DGETRF DGETRS DOPGTR DORGBR DORGHR DORGQR DORGTR DORMBR DORMHR DORMQR DORMTR DPOTRF DPOTRS DSBEVD DSPEVD DSTEQR DSTEVD DSYEVD DSYTRD ZBDSQR ZGEBAL ZGEBRD ZGEHRD ZGEQRF ZGETRF ZGETRS ZHBEVD ZHEEVD ZHETRD ZHPEVD ZPOTRF ZPOTRS ZSTEQR ZUNGBR ZUNGHR ZUNGQR ZUNGTR ZUNMBR ZUNMHR ZUNMQR ZUNMTR ZUPGTR
The value of ACC, the machine-dependent constant mentioned in several documents in the chapter, is 1.0D-13.
In this implementation the default mechanism used for generating random numbers is the parallelised set of Wichmann-Hill generators. This can also be selected manually by calling G05ZAF with its only parameter set to 'W' prior to any calls to G05 routines. Alternatively, the standard serial generator, as used in the NAG Fortran Library (Mark 19 or earlier), can be selected by calling G05ZAF with its parameter set to 'O' prior to any calls to G05 routines.
The default mechanism contains 273 generators. When OpenMP parallelism is requested by setting the environment variable OMP_NUM_THREADS to a value greater than 1, the generators are used to generate portions of a sequence of random numbers independently. The generator assigned to each portion cannot be predetermined; therefore reproducibility of results should not be expected when using these routines in parallel. If reproducibility of random sequences is required, then the standard serial mechanism should be selected using G05ZAF.
On hard failure, P01ABF writes the error message to the error message unit specified by X04AAF and then stops.
The constants referred to in the documentation have the following values in this implementation:
  S07AAF  F(1)   = 1.0D+13
          F(2)   = 1.0D-14
  S10AAF  E(1)   = 1.8500D+1
  S10ABF  E(1)   = 7.080D+2
  S10ACF  E(1)   = 7.080D+2
  S13AAF  x(hi)  = 7.083D+2
  S13ACF  x(hi)  = 1.0D+16
  S13ADF  x(hi)  = 1.0D+17
  S14AAF  IFAIL  = 1 if X > 1.70D+2
          IFAIL  = 2 if X < -1.70D+2
          IFAIL  = 3 if abs(X) < 2.23D-308
  S14ABF  IFAIL  = 2 if X > 2.55D+305
  S15ADF  x(hi)  = 2.66D+1
          x(low) = -6.25D+0
  S15AEF  x(hi)  = 6.25D+0
  S17ACF  IFAIL  = 1 if X > 1.0D+16
  S17ADF  IFAIL  = 1 if X > 1.0D+16
          IFAIL  = 3 if 0.0D+00 < X <= 2.23D-308
  S17AEF  IFAIL  = 1 if abs(X) > 1.0D+16
  S17AFF  IFAIL  = 1 if abs(X) > 1.0D+16
  S17AGF  IFAIL  = 1 if X > 1.038D+2
          IFAIL  = 2 if X < -5.6D+10
  S17AHF  IFAIL  = 1 if X > 1.041D+2
          IFAIL  = 2 if X < -5.6D+10
  S17AJF  IFAIL  = 1 if X > 1.041D+2
          IFAIL  = 2 if X < -1.8D+9
  S17AKF  IFAIL  = 1 if X > 1.041D+2
          IFAIL  = 2 if X < -1.8D+9
  S17DCF  IFAIL  = 2 if abs (Z) < 3.93D-305
          IFAIL  = 4 if abs (Z) or FNU+N-1 > 3.27D+4
          IFAIL  = 5 if abs (Z) or FNU+N-1 > 1.07D+9
  S17DEF  IFAIL  = 2 if imag (Z) > 7.00D+2
          IFAIL  = 3 if abs (Z) or FNU+N-1 > 3.27D+4
          IFAIL  = 4 if abs (Z) or FNU+N-1 > 1.07D+9
  S17DGF  IFAIL  = 3 if abs (Z) > 1.02D+3
          IFAIL  = 4 if abs (Z) > 1.04D+6
  S17DHF  IFAIL  = 3 if abs (Z) > 1.02D+3
          IFAIL  = 4 if abs (Z) > 1.04D+6
  S17DLF  IFAIL  = 2 if abs (Z) < 3.93D-305
          IFAIL  = 4 if abs (Z) or FNU+N-1 > 3.27D+4
          IFAIL  = 5 if abs (Z) or FNU+N-1 > 1.07D+9
  S18ADF  IFAIL  = 2 if 0.0D+00 < X <= 2.23D-308
  S18AEF  IFAIL  = 1 if abs(X) > 7.116D+2
  S18AFF  IFAIL  = 1 if abs(X) > 7.116D+2
  S18CDF  IFAIL  = 2 if 0.0D+00 < X <= 2.23D-308
  S18DCF  IFAIL  = 2 if abs (Z) < 3.93D-305
          IFAIL  = 4 if abs (Z) or FNU+N-1 > 3.27D+4
          IFAIL  = 5 if abs (Z) or FNU+N-1 > 1.07D+9
  S18DEF  IFAIL  = 2 if real (Z) > 7.00D+2
          IFAIL  = 3 if abs (Z) or FNU+N-1 > 3.27D+4
          IFAIL  = 4 if abs (Z) or FNU+N-1 > 1.07D+9
  S19AAF  IFAIL  = 1 if abs(x) >= 4.95000D+1
  S19ABF  IFAIL  = 1 if abs(x) >= 4.95000D+1
  S19ACF  IFAIL  = 1 if X > 9.9726D+2
  S19ADF  IFAIL  = 1 if X > 9.9726D+2
  S21BCF  IFAIL  = 3 if an argument < 1.579D-205
          IFAIL  = 4 if an argument >= 3.774D+202
  S21BDF  IFAIL  = 3 if an argument < 2.820D-103
          IFAIL  = 4 if an argument >= 1.404D+102
The values of the mathematical constants are:
  X01AAF (PI)    = 3.1415926535897932D+00
  X01ABF (GAMMA) = 0.5772156649015329D+00
The values of the machine constants are:
The basic parameters of the model
  X02BHF = 2
  X02BJF = 53
  X02BKF = -1021
  X02BLF = 1024
  X02DJF = .TRUE.
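The four integer model parameters above describe ordinary IEEE double precision arithmetic, so they can be cross-checked on any machine whose native double has that format. The Python sketch below is an illustration only; it assumes that X02BHF, X02BJF, X02BKF and X02BLF correspond to the base, the number of significant base digits, the minimum exponent and the maximum exponent respectively, and the names MODEL and host_model_parameters are hypothetical.

```python
import sys

# Values quoted in this note for the model of floating-point arithmetic.
MODEL = {"X02BHF": 2, "X02BJF": 53, "X02BKF": -1021, "X02BLF": 1024}


def host_model_parameters():
    """Return the same four parameters for the host's native double type."""
    info = sys.float_info
    return {
        "X02BHF": info.radix,     # base of the arithmetic
        "X02BJF": info.mant_dig,  # significant base-2 digits
        "X02BKF": info.min_exp,   # minimum exponent
        "X02BLF": info.max_exp,   # maximum exponent
    }


def matches_users_note():
    """True when the host's double type agrees with the quoted values."""
    return host_model_parameters() == MODEL
```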
Derived parameters of the floating-point arithmetic
  X02AJF = Z'3CA0040000000000' ( 1.11130722679765D-16 )
  X02AKF = Z'0010000000000000' ( 2.22507385850721D-308 )
  X02ALF = Z'7FEFFFFFFFFFFFFF' ( 1.79769313486231D+308 )
  X02AMF = Z'0010000000000000' ( 2.22507385850721D-308 )
  X02ANF = Z'0010000000000000' ( 2.22507385850721D-308 )
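The hexadecimal values above are the exact bit patterns; the decimal values in parentheses are rounded to 15 significant digits. As a rough check, the bit patterns can be decoded as big-endian IEEE doubles, as in the following Python sketch. The helper name decode_hex_double and the two tables are hypothetical, and the sketch assumes the Z'...' constants list the most significant byte first.

```python
import math
import struct


def decode_hex_double(hex_digits):
    """Interpret 16 hexadecimal digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_digits))[0]


# Bit patterns and printed decimal values as quoted in this note.
HEX_VALUES = {
    "X02AJF": "3CA0040000000000",
    "X02AKF": "0010000000000000",
    "X02ALF": "7FEFFFFFFFFFFFFF",
    "X02AMF": "0010000000000000",
    "X02ANF": "0010000000000000",
}
PRINTED = {
    "X02AJF": 1.11130722679765e-16,
    "X02AKF": 2.22507385850721e-308,
    "X02ALF": 1.79769313486231e+308,
    "X02AMF": 2.22507385850721e-308,
    "X02ANF": 2.22507385850721e-308,
}


def printed_values_agree(rel_tol=1e-13):
    """Check each decoded bit pattern against its printed decimal value."""
    return all(
        math.isclose(decode_hex_double(HEX_VALUES[k]), PRINTED[k], rel_tol=rel_tol)
        for k in HEX_VALUES
    )
```

Decoding shows, for instance, that X02AKF is exactly 2**-1022 and that X02AJF is 2**-53 multiplied by (1 + 2**-10).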
Parameters of other aspects of the computing environment
  X02AHF = Z'4950000000000000' ( 1.42724769270596D+45 )
  X02BBF = 2147483647
  X02BEF = 15
  X02DAF = .FALSE.
The default output units for error and advisory messages for those routines which can produce explicit output are both Fortran Unit 6.
The finest granularity of wall-clock time available on this system is one second, so the seventh element of the integer array passed as a parameter to X05AAF will always be returned with the value 0.
Each NAG Fortran SMP Library site is ordinarily provided with a single printed copy of all supporting documentation. If you require additional copies then please contact NAG.
On-line documentation is also provided, in PDF form, with this implementation. Please see the Readme file on the distribution medium for further information.
Queries concerning this document or the implementation generally should be directed initially to your local Advisory Service. If you have difficulty in making contact locally, you can contact NAG directly at one of the addresses given in the Appendix. Users subscribing to the support service are encouraged to contact one of the NAG Response Centres (see below).
The NAG Response Centres are available for general enquiries from all users and also for technical queries from sites with an annually licensed product or support service.
The Response Centres are open during office hours, but contact is possible by fax, email and phone (answering machine) at all times.
When contacting a Response Centre it helps us deal with your enquiry quickly if you can quote your NAG site reference and NAG product code (in this case FSLUX20DBL).
The NAG websites provide information about implementation availability, descriptions of products, downloadable software, product documentation and technical reports. The NAG websites can be accessed at
http://www.nag.co.uk/, http://www.nag.com/ (in North America) or http://www.nag-j.co.jp/ (in Japan)
If you would like to be kept up to date with news from NAG then please register to receive our free electronic newsletter, which will alert you to special offers, announcements about new products or product/service enhancements, customer stories and NAG's event diary. You can register via one of our websites, or by contacting us at nagnews@nag.co.uk.
Many factors influence the way NAG's products and services evolve and your ideas are invaluable in helping us to ensure that we meet your needs. If you would like to contribute to this process we would be delighted to receive your comments. Please contact your local NAG Response Centre (shown below).
NAG Ltd
Wilkinson House
Jordan Hill Road
OXFORD OX2 8DR
United Kingdom
Tel: +44 (0)1865 511245
Fax: +44 (0)1865 310139

NAG Ltd Response Centre
email: support@nag.co.uk
Tel: +44 (0)1865 311744
Fax: +44 (0)1865 310139

NAG Inc
1431 Opus Place, Suite 220
Downers Grove
IL 60515-1362
USA
Tel: +1 630 971 2337
Fax: +1 630 971 2706

NAG Inc Response Center
email: infodesk@nag.com
Tel: +1 630 971 2345
Fax: +1 630 971 2706

Nihon NAG KK
Hatchobori Frontier Building 2F
4-9-9 Hatchobori
Chuo-ku
Tokyo 104-0032
Japan
email: help@nag-j.co.jp
Tel: +81 (0)3 5542 6311
Fax: +81 (0)3 5542 6312