Performance Tips for NAGWare f95
General Performance Tips
- Use -O3 or -O4 instead of just -O.
This will lengthen compile time (sometimes substantially with -O4),
but runtime performance is usually improved.
- If you use assumed-shape arrays and you know that the actual arguments are
always contiguous (i.e. you do not pass array slices using section notation),
use -Oassumed=always_contig.
With this option, a runtime error occurs if a non-contiguous actual argument
is detected (so it is also useful for discovering whether you use such array
sections).
If you are not certain, but believe that the actual arguments are contiguous
all or almost all of the time, use -Oassumed.
With this option, non-contiguous actual arguments are accepted, though
access to them will be slow.
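To make the distinction concrete, here is a minimal sketch of contiguous and non-contiguous actual arguments passed to an assumed-shape dummy. The module, function and variable names (vec_ops, total, a) are hypothetical, and the comments on option behaviour simply restate the description above rather than output observed from the compiler.

```fortran
! Sketch only: all names here are hypothetical illustrations.
MODULE vec_ops
  IMPLICIT NONE
CONTAINS
  ! Assumed-shape dummy argument: the caller may pass a whole array
  ! (contiguous) or an array section (possibly non-contiguous).
  FUNCTION total(v) RESULT(s)
    DOUBLE PRECISION, INTENT(IN) :: v(:)
    DOUBLE PRECISION :: s
    INTEGER :: i
    s = 0.0d0
    DO i = 1, SIZE(v)
      s = s + v(i)
    END DO
  END FUNCTION total
END MODULE vec_ops

PROGRAM contig_demo
  USE vec_ops
  IMPLICIT NONE
  DOUBLE PRECISION :: a(10)
  INTEGER :: i
  a = (/ (DBLE(i), i = 1, 10) /)
  ! Whole array: contiguous.
  PRINT *, total(a)
  ! Unit-stride section: still contiguous.
  PRINT *, total(a(1:5))
  ! Strided section: non-contiguous. As described above, this is the
  ! kind of call that -Oassumed=always_contig reports as a runtime
  ! error and that -Oassumed accepts with slower access.
  PRINT *, total(a(1:10:2))
END PROGRAM contig_demo
```

Compiling a program like this with -Oassumed=always_contig is one way to find out whether any call site passes a strided section.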
Specific Platforms
Performance tips for Intel Linux
- -Wc,-malign-double
This compiler option may provide a worthwhile speed-up on this platform.
However, it also has some pitfalls.
It may give incorrect results when either common blocks or derived types
have double precision entities following an odd number of single precision
entities. E.g.
COMMON/c/x(3),d
INTEGER x
DOUBLE PRECISION d
or
TYPE t
DOUBLE PRECISION value1
LOGICAL flag
DOUBLE PRECISION value2
END TYPE
You can often avoid these problems by ensuring that double precision entities
are at the beginning of common blocks and structures, e.g.
COMMON/c/d,x(3)
and
TYPE t
DOUBLE PRECISION value1,value2
LOGICAL flag
END TYPE
But if your code does not use common blocks or derived types laid out as in
the problem cases above, a good speed-up may be expected on many programs.
Performance tips for DEC Alpha running Unix
- -ieee=nonstd
This typically speeds up an application by a factor of three at the cost of
losing IEEE gradual underflow. Speed-ups of more than a factor of 100 (that is not a typo!) have
been seen in some cases.
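As a rough illustration of what losing gradual underflow means, the sketch below repeatedly halves the smallest normalised double precision number. The program name is hypothetical. With full IEEE arithmetic the results would be expected to be non-zero denormalised values; with -ieee=nonstd they would be expected to be zero instead. Treat that as an assumption drawn from the description above, since actual output depends on the platform and on whether the compiler evaluates the arithmetic at compile time.

```fortran
! Sketch only: the program name is a hypothetical illustration.
PROGRAM underflow_demo
  IMPLICIT NONE
  DOUBLE PRECISION :: x
  INTEGER :: i
  ! Start from the smallest positive normalised double precision number.
  x = TINY(1.0d0)
  ! Each halving takes x further below the normalised range.
  DO i = 1, 4
    x = x / 2.0d0
    PRINT *, i, x
  END DO
END PROGRAM underflow_demo
```

If your results depend on values this small, check them carefully before adopting -ieee=nonstd.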
- -Ounsafe
This option increases speed yet further over -ieee=nonstd.
However, some numerically unsafe optimisations are done, and floating-point
exceptions are sometimes reported later than expected.
Performance tips for IBM RISC System/6000
- -ieee=full
The floating-point hardware on the RS/6000 is much slower when floating-point
traps are enabled. By default, NAGWare f95 enables these traps because doing so
greatly eases debugging; with -ieee=full, which turns them off, floating-point
operations run several times faster.
Performance tips for Sun SPARC running Solaris
- -ieee=nonstd
If your application makes significant use of denormalised numbers, but does not
rely on them for accurate results, this option can improve performance
substantially. (This is not true of all SPARC processors; the switch is only
important if a significant fraction of execution time is "system" rather than
"user" time.)