This document is essential reading for every user of the NAG Fortran SMP Library implementation specified in the title. It provides implementation-specific detail that augments the information provided in the NAG Fortran SMP Library documentation. Wherever this documentation refers to the "Users' Note for your implementation", you should consult this note.
NAG recommends that you read the following minimum reference material which can be found in the documentation, together with this note, before calling any library routine:
(a) Introduction to the NAG Fortran SMP Library
(b) Essential Introduction to the NAG Fortran Library
(c) The appropriate Chapter Introduction
(d) The appropriate Routine Document
Assuming that libnagsmp.a has been installed in a directory in the search path of the linker, such as /usr/lib/sparcv9, then you may link to the NAG Fortran SMP Library in the following manner:
(a) Set the environment variables STACKSIZE and OMP_NUM_THREADS to the amount of stack for each processor and the number of processors to be used, e.g.:
(for Korn and Bourne shell - ksh, sh)
    STACKSIZE=SIZE
    export STACKSIZE
    OMP_NUM_THREADS=N
    export OMP_NUM_THREADS
(for C shell - csh)
    setenv STACKSIZE SIZE
    setenv OMP_NUM_THREADS N
where N is the number of processors to be used and SIZE is the stack size required to run the program in parallel; each thread requires its own stack space. SIZE is in KB and must be at least 4000, i.e. 4 MB for each processor.
(b) Compile and link with the NAG Fortran SMP Library and the multithreaded Sun Performance Library e.g.
f90 -dalign -xtarget=ultra2 -xarch=v9a -explicitpar -mp=openmp -stackvar \
    -O3 driver.f -lnagsmp -xlic_lib=sunperf_mt -lsocket -lnsl
where driver.f is your application program.
In the above example the -explicitpar and -mp=openmp compiler options are essential so that linking will automatically include the required micro-tasking library and threadsafe Fortran run-time library. The minimum level of optimization that can be used is -O3. If a lower level of optimization is specified, or should the optimization level not be specified explicitly, the compiler will assume an optimization level of -O3 and produce an appropriate warning message. Please note that the -xarch compiler option must appear after -xtarget in all compilation and linking commands. In addition the -stackvar compiler option is also specified as this is recommended when using parallelism.
N.B. The -dalign flag MUST be used to compile all subprograms and files in the program unit, as it was used to compile all NAG Fortran SMP Library routines.
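As a sketch of steps (a) and (b) for the Bourne or Korn shell, the following script sets the two environment variables, checks the 4000 KB minimum and prints the compile command instead of running it. The values 8000 and 4, and the variable names TOTAL_KB, FFLAGS and LIBS, are illustrative assumptions and are not part of this implementation; the compiler options are those given in step (b).

```shell
#!/bin/sh
# Step (a): per-thread stack size in KB and number of processors.
# 8000 and 4 are example values only; choose values suited to your program.
STACKSIZE=8000
OMP_NUM_THREADS=4
export STACKSIZE OMP_NUM_THREADS

# Guard against a stack size below the documented minimum of 4000 KB.
if [ "$STACKSIZE" -lt 4000 ]; then
    echo "STACKSIZE must be at least 4000 (KB)" >&2
    exit 1
fi

# Each thread needs its own stack, so the total request is SIZE * N.
TOTAL_KB=$((STACKSIZE * OMP_NUM_THREADS))
echo "Total stack requested: ${TOTAL_KB} KB"

# Step (b): hold the options in one place so that every source file
# is compiled with the same flags, including -dalign.
FFLAGS="-dalign -xtarget=ultra2 -xarch=v9a -explicitpar -mp=openmp -stackvar -O3"
LIBS="-lnagsmp -xlic_lib=sunperf_mt -lsocket -lnsl"

# Print the command rather than run it; remove "echo" to compile.
echo f90 $FFLAGS driver.f $LIBS
```

Keeping the options in a single variable is one way to make sure that -dalign, and -xtarget before -xarch, are applied consistently to every file in the program.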
The example programs are most easily accessed by the command nagexample, which will provide you with a copy of an example program (and its data, if any), compile the program and link it with the library (showing you the compile command so that you can recompile your own version of the program). Finally, the executable program will be run, presenting its output to stdout. The example program concerned is specified by the argument to nagexample, e.g.
nagexample c06eaf
will copy the example program and its data into the files c06eafe.f and c06eafe.d in the current directory and process them to produce the example program results.
The example programs supplied to a site in machine-readable form have been modified as necessary so that they are suitable for immediate execution. In some instances they may differ from the example program supplied in the documentation. The distributed example programs should be used in preference wherever possible.
For this double precision implementation, the bold italicised terms used in the documentation should be interpreted as:
real - DOUBLE PRECISION (REAL*8)
basic precision - double precision
complex - COMPLEX*16
additional precision - quadruple precision (REAL*16)
machine precision - the machine precision, see the value returned by X02AJF in Section 3
Thus a parameter described as real should be declared as DOUBLE PRECISION in your program. If a routine accumulates an inner product in additional precision, it is using software to simulate quadruple precision.
In some routine documents additional bold italicised terms are used in the published example programs and they must be interpreted as follows:
real as an intrinsic function name - DBLE
imag - DIMAG
cmplx - DCMPLX
conjg - DCONJG
e in constants, e.g. 1.0e-4 - D, e.g. 1.0D-4
e in formats, e.g. e12.4 - D, e.g. D12.4
All references to routines in Chapter F07 - Linear Equations (LAPACK) and Chapter F08 - Least-squares and Eigenvalue Problems (LAPACK) use the LAPACK name, not the NAG F07/F08 name. The LAPACK name is precision dependent, and hence the name appears in a bold italicised typeface.
The typeset examples use the single precision form of the LAPACK name. To convert this name to its double precision form, change the first character either from S to D or from C to Z as appropriate. For example:
sgetrf refers to the LAPACK routine name - DGETRF
cpotrs - ZPOTRS
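The renaming rule can be sketched as a small shell function. The function name to_double is hypothetical and is not supplied with the library; it simply applies the S to D and C to Z substitution to the first character and leaves any other name unchanged.

```shell
#!/bin/sh
# to_double NAME: convert a single precision LAPACK name, as printed in
# the typeset examples, to its double precision form.
# (Hypothetical helper for illustration; not part of the NAG library.)
to_double() {
    # Upper-case the name, then split off its first character.
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    rest=${name#?}
    case $name in
        S*) printf 'D%s\n' "$rest" ;;   # real:    S -> D
        C*) printf 'Z%s\n' "$rest" ;;   # complex: C -> Z
        *)  printf '%s\n' "$name" ;;    # anything else is left unchanged
    esac
}

to_double sgetrf   # prints DGETRF
to_double cpotrs   # prints ZPOTRS
```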
Certain routines produce explicit error messages and advisory messages via output units which either have default values or can be reset by using X04AAF for error messages and X04ABF for advisory messages. (The default values are given in Section 3.) The maximum record lengths of error messages and advisory messages (including carriage control characters) are 80 characters, except where otherwise specified.
The following machine-readable information files are provided in the doc directory:
un.html - Users' Note (this document)
blas_lapack_to_nag - BLAS/F06, LAPACK/F07 and LAPACK/F08 listing
nag_to_blas_lapack - F06/BLAS, F07/LAPACK and F08/LAPACK listing
See Section 4 for additional documentation available from NAG.
Any further information which applies to one or more routines in this implementation is listed below, chapter by chapter.
The example programs for D03RAF and D03RBF take much longer to run than other examples.
In this implementation calls to the Basic Linear Algebra Subprograms (BLAS) and linear algebra routines (LAPACK) are implemented by calls to the Sun Performance Library except for the following routines where the NAG equivalent is used:
DBDSQR DGEBAL DGEBRD DGEHRD DGEQRF DGETRF
DGETRS DOPGTR DORGBR DORGHR DORGQR DORGTR
DORMBR DORMHR DORMQR DORMTR DPOTRF DPOTRS
DSBEVD DSPEVD DSTEQR DSTEVD DSYEVD DSYTRD
ZBDSQR ZGEBAL ZGEBRD ZGEHRD ZGEQRF ZGETRF
ZGETRS ZHBEVD ZHEEVD ZHETRD ZHPEVD ZPOTRF
ZPOTRS ZSTEQR ZUNGBR ZUNGHR ZUNGQR ZUNGTR
ZUNMBR ZUNMHR ZUNMQR ZUNMTR ZUPGTR
The value of ACC, the machine-dependent constant mentioned in several documents in the chapter, is 1.0D-13.
In this implementation the default mechanism used for generating random numbers is the parallelised set of Wichmann-Hill generators. This can also be selected manually by calling G05ZAF with its only parameter set to 'W' prior to any calls to G05 routines. Alternatively, the standard serial generator, as used in the NAG Fortran Library (Mark 19 or earlier), can be selected by calling G05ZAF with its parameter set to 'O' prior to any calls to G05 routines.
The default mechanism contains 273 generators. When OpenMP parallelism is requested by setting the environment variable OMP_NUM_THREADS to a value greater than 1, the generators are used to generate portions of a sequence of random numbers independently. The generator assigned to each portion cannot be predetermined; therefore reproducibility of results should not be expected when using these routines in parallel. If reproducibility of random sequences is required, then the standard serial mechanism should be selected using G05ZAF.
On hard failure, P01ABF writes the error message to the error message unit specified by X04AAF and then stops.
The constants referred to in the documentation have the following values in this implementation:
S07AAF  F(1) = 1.0D+13
        F(2) = 1.0D-14
S10AAF  E(1) = 18.50
S10ABF  E(1) = 708.0
S10ACF  E(1) = 708.0
S13AAF  x(hi) = 708.3
S13ACF  x(hi) = 5.6D+14
S13ADF  x(hi) = 5.6D+14
S14AAF  IFAIL = 1 if X > 170.0
        IFAIL = 2 if X < -170.0
        IFAIL = 3 if abs(X) < 2.23D-308
S14ABF  IFAIL = 2 if X > 2.55D+305
S15ADF  x(hi) = 26.6
        x(low) = -6.25
S15AEF  x(hi) = 6.25
S17ACF  IFAIL = 1 if X > 5.6D+14
S17ADF  IFAIL = 1 if X > 5.6D+14
        IFAIL = 3 if 0.0 < X <= 2.23D-308
S17AEF  IFAIL = 1 if abs(X) > 5.6D+14
S17AFF  IFAIL = 1 if abs(X) > 5.6D+14
S17AGF  IFAIL = 1 if X > 103.8
        IFAIL = 2 if X < -8.9D+9
S17AHF  IFAIL = 1 if X > 104.1
        IFAIL = 2 if X < -8.9D+9
S17AJF  IFAIL = 1 if X > 104.1
        IFAIL = 2 if X < -1.8D+9
S17AKF  IFAIL = 1 if X > 104.1
        IFAIL = 2 if X < -1.8D+9
S17DCF  IFAIL = 2 if abs(Z) < 3.93D-305
        IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27D+4
        IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07D+9
S17DEF  IFAIL = 2 if imag(Z) > 700.0
        IFAIL = 3 if abs(Z) or FNU+N-1 > 3.27D+4
        IFAIL = 4 if abs(Z) or FNU+N-1 > 1.07D+9
S17DGF  IFAIL = 3 if abs(Z) > 1.02D+3
        IFAIL = 4 if abs(Z) > 1.04D+6
S17DHF  IFAIL = 3 if abs(Z) > 1.02D+3
        IFAIL = 4 if abs(Z) > 1.04D+6
S17DLF  IFAIL = 2 if abs(Z) < 3.93D-305
        IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27D+4
        IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07D+9
S18ADF  IFAIL = 2 if 0.0 < X <= 2.23D-308
S18AEF  IFAIL = 1 if abs(X) > 711.6
S18AFF  IFAIL = 1 if abs(X) > 711.6
S18CDF  IFAIL = 2 if 0.0 < X <= 2.23D-308
S18DCF  IFAIL = 2 if abs(Z) < 3.93D-305
        IFAIL = 4 if abs(Z) or FNU+N-1 > 3.27D+4
        IFAIL = 5 if abs(Z) or FNU+N-1 > 1.07D+9
S18DEF  IFAIL = 2 if real(Z) > 700.0
        IFAIL = 3 if abs(Z) or FNU+N-1 > 3.27D+4
        IFAIL = 4 if abs(Z) or FNU+N-1 > 1.07D+9
S19AAF  IFAIL = 1 if abs(x) >= 49.50
S19ABF  IFAIL = 1 if abs(x) >= 49.50
S19ACF  IFAIL = 1 if X > 997.26
S19ADF  IFAIL = 1 if X > 997.26
S21BCF  IFAIL = 3 if an argument < 1.579D-205
        IFAIL = 4 if an argument >= 3.774D+202
S21BDF  IFAIL = 3 if an argument < 2.820D-103
        IFAIL = 4 if an argument >= 1.404D+102
The values of the mathematical constants are:
X01AAF (PI) = 3.1415926535897932
X01ABF (GAMMA) = 0.5772156649015329
The values of the machine constants are:
The basic parameters of the model
X02BHF = 2
X02BJF = 53
X02BKF = -1021
X02BLF = 1024
X02DJF = .TRUE.
Derived parameters of the floating-point arithmetic
X02AJF = Z'3CA0000000000001' ( 1.11022302462516D-16 )
X02AKF = Z'0010000000000000' ( 2.22507385850721D-308 )
X02ALF = Z'7FEFFFFFFFFFFFFF' ( 1.79769313486231D+308 )
X02AMF = Z'0010000000000000' ( 2.22507385850721D-308 )
X02ANF = Z'0020000000000000' ( 4.45014771701441D-308 )
Parameters of other aspects of the computing environment
X02AHF = Z'4300000000000000' ( 5.62949953421312D+14 )
X02BBF = 2147483647
X02BEF = 15
X02DAF = .FALSE.
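The hexadecimal values above can be related to their decimal equivalents by splitting each bit pattern into its exponent and fraction fields. The following sketch assumes that the patterns are IEEE 754 double precision (consistent with X02BHF = 2 and X02BJF = 53) and that the shell evaluates arithmetic with 64-bit integers; the function name decode_double is hypothetical and is not part of the library.

```shell
#!/bin/sh
# decode_double HEX: print the unbiased exponent and the 52-bit fraction
# field of a positive double given as 16 hexadecimal digits.
# (Illustrative helper; assumes 64-bit shell arithmetic.)
decode_double() {
    bits=$(( 0x$1 ))
    exp_field=$(( (bits >> 52) & 0x7FF ))      # 11-bit biased exponent
    fraction=$(( bits & 0xFFFFFFFFFFFFF ))     # low 52 bits
    echo "$1: exponent $(( exp_field - 1023 )), fraction $fraction"
}

decode_double 3CA0000000000001   # X02AJF -> exponent -53, fraction 1
decode_double 0010000000000000   # X02AKF -> exponent -1022, fraction 0
decode_double 4300000000000000   # X02AHF -> exponent 49, fraction 0
```

For a normalised number the value is (1 + fraction/2**52) * 2**exponent, so X02AHF is 2**49 = 562949953421312 and X02AKF is 2**(-1022), in agreement with the decimal values listed.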
The default output units for error and advisory messages for those routines which can produce explicit output are both Fortran Unit 6.
The finest granularity of wall-clock time available on this system is one second, so the seventh element of the integer array passed as a parameter to X05AAF will always be returned with the value 0.
Each NAG Fortran SMP Library site is ordinarily provided with a single printed copy of all supporting documentation. If you require additional copies then please contact NAG.
On-line documentation is also provided, in PDF form, with this implementation. Please see the Readme file on the distribution medium for further information.
Queries concerning this document or the implementation generally should be directed initially to your local Advisory Service. If you have difficulty in making contact locally, you can contact NAG directly at one of the addresses given in the Appendix.
The NAG Response Centres are available for general enquiries from all users and also for technical queries from sites with an annually licensed product or support service.
The Response Centres are open during office hours, but contact is possible by fax, email and phone (answering machine) at all times.
When contacting a Response Centre please quote your NAG site reference and NAG product code (in this case FSSO620DAL).
The NAG websites are an information service providing items of interest to users and prospective users of NAG products and services. The information is reviewed and updated regularly and includes implementation availability, descriptions of products, downloadable software, product documentation and technical reports. The NAG websites can be accessed at
or
http://www.nag.com/ (in North America)
or
http://www.nag-j.co.jp/ (in Japan)
If you would like to be kept up to date with news from NAG you may want to register to receive our electronic newsletter, which will alert you to special offers, announcements about new products or product/service enhancements, case studies and NAG's event diary. To register visit the NAG Ltd website or contact us at nagnews@nag.co.uk.
Many factors influence the way NAG's products and services evolve and your ideas are invaluable in helping us to ensure that we meet your needs. If you would like to contribute to this process we would be delighted to receive your comments. Please contact your local NAG Response Centre (shown below).
NAG Ltd
Wilkinson House
Jordan Hill Road
OXFORD OX2 8DR
United Kingdom
Tel: +44 (0)1865 511245
Fax: +44 (0)1865 310139

NAG Ltd Response Centre
email: support@nag.co.uk
Tel: +44 (0)1865 311744
Fax: +44 (0)1865 310139

NAG Inc
1431 Opus Place, Suite 220
Downers Grove
IL 60515-1362
USA
Tel: +1 630 971 2337
Fax: +1 630 971 2706

NAG Inc Response Center
email: infodesk@nag.com
Tel: +1 630 971 2345
Fax: +1 630 971 2706

Nihon NAG KK
Hatchobori Frontier Building 2F
4-9-9 Hatchobori
Chuo-ku
Tokyo 104-0032
Japan
email: help@nag-j.co.jp
Tel: +81 (0)3 5542 6311
Fax: +81 (0)3 5542 6312