pgf90 Compiler Optimization Options: Preventing Compiler Optimization

7 Command-line Options

7.1 Generic PGI Compiler Options

7.2 C and C++-specific Compiler Options



This chapter describes the syntax and operation of each compiler option. The options are arranged in alphabetical order. On a command-line, options need to be preceded by a hyphen (-). If the compiler does not recognize an option, it passes the option to the linker.

This chapter uses the following notation:

[item]

Square brackets indicate that the enclosed item is optional.

{item | item}

Braces indicate that you must select one and only one of the enclosed items. A vertical bar (|) separates the choices.

...

Horizontal ellipses indicate that zero or more instances of the preceding item are valid.

NOTE

Some options do not allow a space between the option and its argument or within an argument. This fact is noted in the syntax section of the respective option.

Table 7-1 Generic PGI Compiler Options

| Option | Description |
| --- | --- |
| -# | Display invocation information. |
| -### | Show but do not execute the driver commands (same as -dryrun). |
| -byteswapio | Swap bytes from big-endian to little-endian or vice versa on input/output of unformatted data. |
| -C | Perform array bounds checking. |
| -c | Stops after the assembly phase and saves the object code in filename.o. |
| -cyglibs | (Win32 only) Link against the Cygnus libraries and use the Cygnus include files. You must have the full Cygwin32 environment installed in order to use this switch. |
| -D<arg> | Defines a preprocessor macro. |
| -dryrun | Show but do not execute driver commands. |
| -E | Stops after the preprocessing phase and displays the preprocessed file on the standard output. |
| -F | Stops after the preprocessing phase and saves the preprocessed file in filename.f (this option is only valid for the PGI Fortran compilers). |
| -f | Ignored. |
| -fast | Generally optimal; equivalent to: -O -Munroll -Mnoframe |
| -flags | Display valid driver options. |
| -fpic | Generate position-independent code. |
| -fPIC | Equivalent to -fpic. |
| -G | (Solaris86 only) Passed to the linker. Instructs the linker to generate a shared object file. |
| -g | Includes debugging information in the object module. |
| -g77libs | Allow object files generated by g77 to be linked into PGI main programs. |
| -help | Display driver help message. |
| -I<arg> | Adds a directory to the search path for #include files. |
| -i | Passed to the linker. |
| -i2 | Treat INTEGER variables as 2 bytes. |
| -i4 | Treat INTEGER variables as 4 bytes. |
| -i8 | Treat INTEGER variables as 8 bytes and use 64 bits for INTEGER*8 operations. |
| -Kflag | Requests special compilation semantics with regard to conformance to IEEE 754. |
| -L | Specifies a library directory. |
| -l | Loads a library. |
| -Mpgflag | Selects variations for code generation and optimization. |
| -m | Displays a link map on the standard output. |
| -module <moduledir> | Save/search for module files in directory <moduledir> (only valid for the PGF90 and PGHPF compilers). |
| -mp | Interpret and process user-inserted shared-memory parallel programming directives (see Chapters 10 and 11). |
| -mslibs | (Win32 only) Use the Microsoft linker and include files, and link against the Microsoft Visual C++ libraries. Microsoft Visual C++ must be installed in order to use this switch. |
| -msvcrt | (Win32 only) Use Microsoft's msvcrt.dll at runtime rather than the default crtdll.dll. |
| -Olevel | Specifies code optimization level where level is 0, 1, or 2. |
| -o | Names the object file. |
| -P | Stops after the preprocessing phase and saves the preprocessed file in filename.i (only valid for the PGI C/C++ compilers). |
| -pc | Set precision for certain calculations. |
| -Q | Selects variations for compiler steps. |
| -R<directory> | (Linux and Solaris86 only) Passed to the linker. Hard code <directory> into the search path for shared object files. |
| -r | Creates a relocatable object file. |
| -r4 | Interpret DOUBLE PRECISION variables as REAL. |
| -r8 | Interpret REAL variables as DOUBLE PRECISION. |
| -rcfile | Specifies the name of the driver's startup file. |
| -S | Stops after the compiling phase and saves the assembly-language code in filename.s. |
| -s | Strips the symbol-table information from the object file. |
| -shared | (Linux only) Passed to the linker. Instructs the linker to generate a shared object file. |
| -show | Display driver's configuration parameters after startup. |
| -silent | Do not print warning messages. |
| -time | Print execution times for the various compilation steps. |
| -tp | Specify the type of the target processor: -tp p5 for Pentium processors, -tp p6 for Pentium Pro/II/III processors, and -tp px for blended p5/p6 code generation. |
| -Usymbol | Undefine a preprocessor macro. |
| -usymbol | Initializes the symbol table with symbol, which is undefined for the linker. An undefined symbol triggers loading of the first member of an archive library. |
| -V | Displays the version messages and other information. |
| -v | Displays the compiler, assembler, and linker phase invocations. |
| -W | Passes arguments to a specific phase. |
| -w | Do not print warning messages. |

Table 7-2 C and C++-specific Compiler Options

| Option | Description |
| --- | --- |
| -A | (pgCC only) Accept proposed ANSI C++. |
| --[no_]alternative_tokens | (pgCC only) Enable/disable recognition of alternative tokens. These are tokens that make it possible to write C++ without the use of the {, }, [, ], #, &, ^, and ~ characters. The alternative tokens include the operator keywords (e.g., and, bitand, etc.) and digraphs. The default is --no_alternative_tokens. |
| -b | (pgCC only) Compile with cfront 2.1 compatibility. This accepts constructs and a version of C++ that is not part of the language definition but is accepted by cfront. |
| -b3 | (pgCC only) Compile with cfront 3.0 compatibility. See -b above. |
| --[no_]bool | (pgCC only) Enable or disable recognition of bool. The default value is --bool. |
| --[no]builtin | Do/don't compile with math subroutine builtin support, which causes selected math library routines to be inlined. The default is --builtin. |
| --cfront_2.1 | (pgCC only) Enable compilation of C++ with compatibility with cfront version 2.1. |
| --cfront_3.0 | (pgCC only) Enable compilation of C++ with compatibility with cfront version 3.0. |
| --create_pch filename | (pgCC only) Create a precompiled header file with the name filename. |
| --dependencies | (pgCC only) Print makefile dependences to stdout. |
| --dependencies_to_file filename | (pgCC only) Print makefile dependences to file filename. |
| --diag_error tag | (pgCC only) Override the normal error severity of the specified diagnostic messages. |
| --diag_remark tag | (pgCC only) Override the normal error severity of the specified diagnostic messages. |
| --diag_suppress tag | (pgCC only) Override the normal error severity of the specified diagnostic messages. |
| --diag_warning tag | (pgCC only) Override the normal error severity of the specified diagnostic messages. |
| --display_error_number | (pgCC only) Display the error message number in any diagnostic messages that are generated. |
| --enumber | (pgCC only) Set the C++ front-end error limit to the specified number. |
| --[no_]exceptions | (pgCC only) Disable/enable exception handling support. The default is --exceptions. |
| --gnu_extensions | (pgCC only) Allow GNU extensions like "include next" which are required to compile Linux system header files. |
| --instantiation_dir directory | (pgCC only) If --one_instantiation_per_object is used, define directory as the instantiation directory. |
| --[no]llalign | (pgCC only) Do/don't align long long integers on integer boundaries. The default is --llalign. |
| -M | Generate make dependence lists. |
| -MD | Generate make dependence lists. |
| -MD,filename | (pgCC only) Generate make dependence lists and print them to file filename. |
| --one_instantiation_per_object | (pgCC only) Put out each template instantiation (function or static data member) in a separate object file. The primary object file contains everything else in the compilation. Allows users of libraries to pull only the instantiations that are needed. Necessary for template libraries that rely on other template libraries. |
| --optk_allow_dollar_in_id_chars | (pgCC only) Accept dollar signs in identifiers. |
| --pch | (pgCC only) Automatically use and/or create a precompiled header file. |
| --pch_dir directory | (pgCC only) The directory in which to search for and/or create a precompiled header file. |
| --[no_]pch_messages | (pgCC only) Enable/disable the display of a message indicating that a precompiled header file was created or used. |
| +p | (pgCC only) Disallow all anachronistic constructs. |
| -P | Stops after the preprocessing phase and saves the preprocessed file in filename.i. |
| --preinclude=<filename> | (pgCC only) Specify file to be included at the beginning of compilation; to set system-dependent macros, types, etc. |
| --prelink_objects | (pgCC only) If --one_instantiation_per_object is used, create template instantiations for a set of objects that are about to become a template library. Required for template libraries that reference other template libraries. |
| -t | Control instantiation of template functions. |
| --use_pch filename | (pgCC only) Use a precompiled header file of the specified name as part of the current compilation. |
| --[no_]using_std | (pgCC only) Enable/disable implicit use of the std namespace when standard header files are included. |
| -X | (pgCC only) Generate cross reference information and place output in specified file. |
| -Xm | (pgCC only) Allow $ in names. |
| -xh | (pgCC only) Enable exception handling. |
| -.suffix | (pgCC only) Use with -E, -F, or -P to save intermediate file in a file with the specified suffix. |



7.1 Generic PGI Compiler Options

-#

Use the -# option to display the invocations of the compiler, assembler and linker. These invocations are command-lines created by the driver from your command-line input and the default values.

Default: The compiler does not display individual phase invocations.

Usage: The following command-line requests verbose invocation information.

$ pgf90 -# prog.f

Cross-reference: -Minfo, -V, -v.

-###

Use the -### option to display the invocations of the compiler, assembler and linker but do not execute them. These invocations are command-lines created by the compilation driver from the PGIRC files and the command-line options.

Default: The compiler does not display individual phase invocations.

Usage: The following command-line requests verbose invocation information.

$ pgf90 -### myprog.f

Cross-reference: -Minfo, -V, -dryrun.

-byteswapio

Use the -byteswapio option to swap the byte-order of data in unformatted Fortran data files on input/output. When this option is used, the order of bytes is swapped in both the data and record control words (the latter occurs in unformatted sequential files). Specifically, this option can be used to convert big-endian format data files produced by most RISC workstations and high-end servers to the little-endian format used on Intel Architecture systems on-the-fly during file reads/writes. This option assumes that the record layouts of unformatted sequential access and direct access files are the same on the systems. Also, the assumption is that the IEEE representation is used for floating-point numbers. In particular, the format of unformatted data files produced by PGI Fortran compilers is known to be identical to the format used on Sun and SGI workstations, which allows you to read and write unformatted Fortran data files produced on those platforms from a program compiled for an IA-32 platform using the -byteswapio option.

Default: The compiler does not byte-swap data on input/output.

Usage: The following command-line requests byte-swapping be performed on input/output.

$ pgf90 -byteswapio myprog.f
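To make the effect of swapping both the data and the record control words concrete, the following Python sketch builds one unformatted sequential record in big-endian form and reads it back with the bytes swapped. The record layout used here (a 4-byte length word before and after a payload of 4-byte IEEE REALs) and the helper names are assumptions made for illustration; this guide does not specify them.

```python
import struct

def write_record_big_endian(values):
    """Build one unformatted sequential record as a big-endian system might.

    Assumed layout (not taken from this guide): 4-byte length word,
    payload of 4-byte IEEE floats, then the same 4-byte length word.
    """
    payload = struct.pack(">%df" % len(values), *values)
    marker = struct.pack(">i", len(payload))
    return marker + payload + marker

def read_record_swapped(record):
    """Read the record on a little-endian host, swapping control words and data."""
    # Swap the leading record control word to recover the payload length.
    (length,) = struct.unpack(">i", record[:4])
    payload = record[4:4 + length]
    (trailer,) = struct.unpack(">i", record[4 + length:8 + length])
    if trailer != length:
        raise ValueError("record control words do not match")
    # Swap each 4-byte element of the data itself.
    return list(struct.unpack(">%df" % (length // 4), payload))

record = write_record_big_endian([1.0, 2.5, -3.0])
values = read_record_swapped(record)
# Reading the length word without swapping gives a nonsensical value.
(unswapped_length,) = struct.unpack("<i", record[:4])
```

Without the swap, the 12-byte length word is misread as a very large number, which is why the control words have to be swapped along with the data.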

-C

Perform array bounds checking. If an array bounds violation occurs when a program is executed, an error message describing where the error occurred is printed and the program terminates.

Usage:

$ pgf90 -C myprog.f

The text of the error message includes the name of the array, the location where the error occurred (the source file and the line number in the source), and information about the subscript which is out of bounds (its value, its upper bound, and its dimension).

Cross-reference: -Mbounds, -Mnobounds.

-c

Stops after the assembling phase. Use the -c option to halt the compilation process after the assembling phase and write the object code to the file filename.o, where the input file is filename.f.

Default: The compiler produces an executable file (does not use the -c option).

Usage: In this example, the compiler produces the object file myprog.o in the current directory.

$ pgf90 -c myprog.f

Cross-reference: -E, -Mkeepasm, -o, and -S.

-cyglibs

(Win32 only) Link against the Cygnus libraries and use the Cygnus include files. You must have the full Cygwin32 environment installed in order to use this switch.

Default: The compiler does not link against the Cygnus libraries.

-D

Defines a preprocessor macro. Use the -D option to create a macro with a given value. The value must be either an integer or a character string. You can use the -D option more than once on a compiler command line. The number of active macro definitions is limited only by available memory.

You can use macros with conditional compilation to select source text during preprocessing. A macro defined in the compiler invocation remains in effect for each module on the command line, unless you remove the macro with an #undef preprocessor directive or with the -U option. The compiler processes all of the -U options in a command line after processing the -D options.

Syntax:

-Dname[=value]

Where name is the symbolic name, and value is either an integer value or a character string.

Default: If you define a macro name without specifying a value, the preprocessor assigns the string 1 to the macro name.

Usage: In the following example, the macro PATHLENGTH has the value 256 for this compilation. If the -D option is not used, PATHLENGTH's value is set to 128.

$ pgf90 -DPATHLENGTH=256 myprog.F

Where the source text is:

#ifndef PATHLENGTH
#define PATHLENGTH 128
#endif
      SUBROUTINE SUB
      CHARACTER*PATHLENGTH path
      ...
      END

Cross-reference: -U
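The rules above (a default value of 1, and all -U options processed after all -D options) can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not the driver's implementation; the function names `resolve_macros` and `pathlength` are hypothetical.

```python
def resolve_macros(args):
    """Return the macros left active by -D and -U options.

    Follows the rules described above: -Dname defaults to the string "1",
    and all -U options are processed after all -D options.
    """
    macros = {}
    for arg in args:
        if arg.startswith("-D"):
            name, sep, value = arg[2:].partition("=")
            macros[name] = value if sep else "1"
    for arg in args:
        if arg.startswith("-U"):
            macros.pop(arg[2:], None)
    return macros

def pathlength(macros):
    """Mimic the #ifndef PATHLENGTH / #define PATHLENGTH 128 fallback."""
    return int(macros.get("PATHLENGTH", "128"))
```

For example, `pathlength(resolve_macros(["-DPATHLENGTH=256"]))` yields 256, while an empty option list falls back to 128. Because -U is applied last, `-UFOO -DFOO=2` leaves FOO undefined regardless of the order on the command line.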

-dryrun

Use the -dryrun option to display the invocations of the compiler, assembler and linker but do not execute them. These invocations are command lines created by the compilation driver from the PGIRC file and the command-line supplied with -dryrun.

Default: The compiler does not display individual phase invocations.

Usage: The following command-line requests verbose invocation information.

$ pgf90 -dryrun myprog.f

Cross-reference: -Minfo, -V, -###

-E

Stops after the preprocessing phase. Use the -E option to halt the compilation process after the preprocessing phase and display the preprocessed output on the standard output.

Default: The compiler produces an executable file.

Usage: In the following example the compiler displays the preprocessed myprog.f on the standard output.

$ pgf90 -E myprog.f

Cross-reference: -C, -c, -Mkeepasm, -o, -F, -S.

-F

Stops compilation after the preprocessing phase. Use the -F option to halt the compilation process after preprocessing and write the preprocessed output to the file filename.f where the input file is filename.F.

Default: The compiler produces an executable file.

Usage: In the following example the compiler produces the preprocessed file myprog.f in the current directory.

$ pgf90 -F myprog.F

Cross-reference: -c, -E, -Mkeepasm, -o, -S

-fast

A generally optimal set of options is chosen depending on the target system. Always includes the options -O, -Munroll, and -Mnoframe on all IA-32 platforms. In addition, the -tp p5 or -tp p6 option is automatically included on Pentium and Pentium Pro/II/III platforms respectively. NOTE: auto-selection of the appropriate -tp option means that IA-32 programs built using the -fast option on a given system are not necessarily backward-compatible with older IA-32 systems.

Cross-reference: -O, -Munroll, -Mnoframe, -tp

-flags

Displays driver options on the standard output. Use this option with -v to list options that are recognized and ignored, as well as the valid options.

Cross-reference: -#, -###, -v

-fpic

(Linux and Solaris86 only) Generate position-independent code suitable for inclusion in shared object (dynamically linked library) files.

Cross-reference: -shared, -G, -R

-fPIC

(Linux and Solaris86 only) Equivalent to -fpic. Provided for compatibility with other compilers.

Cross-reference: -fpic, -shared, -G, -R

-G

Valid only on Solaris86. Passed to the linker. Instructs the linker to produce a shared object (dynamically linked library) file.

Cross-reference: -fpic, -shared, -R

-g

The -g option instructs the compiler to include symbolic debugging information in the object module. Debuggers, such as PGDBG, require symbolic debugging information in the object module to display and manipulate program variables and source code. Note that including symbolic debugging information increases the size of the object module.

If you specify the -g option on the command-line, the compiler sets the optimization level to -O0 (zero), unless you specify the -O option. For more information on the interaction between the -g and -O options, see the -O entry. Symbolic debugging may give confusing results if an optimization level other than zero is selected.

Default: The compiler does not put debugging information into the object module.

Usage: In the following example the object file a.out will contain symbolic debugging information.

$ pgf90 -g myprog.f

-g77libs

Use the -g77libs option on the link line if you are linking g77-compiled program units into a pgf90-compiled main program using the pgf90 driver. When this option is present, the pgf90 driver will search the necessary g77 support libraries to resolve references specific to g77-compiled program units. The g77 compiler must be installed on the system on which linking occurs in order for this option to function correctly.

Default: The compiler does not search g77 support libraries to resolve references at link time.

Usage: The following command-line requests that g77 support libraries be searched at link time:

$ pgf90 -g77libs myprog.f g77_object.o

-help

Displays options recognized by the driver on the standard output.

Cross-reference: -#, -###, -show, -V, -flags

-I

Adds a directory to the search path for files that are included using the INCLUDE statement or the preprocessor directive #include. Use the -I option to add a directory to the list of where to search for the included files. The compiler searches the directory specified by the -I option before the default directories.

Syntax:

-Idirectory

Where directory is the name of the directory added to the standard search path for include files.

Usage: The Fortran INCLUDE statement directs the compiler to begin reading from another file. The compiler uses two rules to locate the file:

(1) If the file name specified in the INCLUDE statement includes a path name, the compiler begins reading from the file it specifies.

(2) If no path name is provided in the INCLUDE statement, the compiler searches (in order):

    any directories specified using the -I option (in the order specified)

    the directory containing the source file

    the current directory

For example, the compiler applies rule (1) to the following statements:

INCLUDE '/bob/include/file1' (absolute path name)

INCLUDE '../../file1' (relative path name)

and rule (2) to this statement:

INCLUDE 'file1'

Cross-reference: -Mnostdinc
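The two INCLUDE lookup rules can be sketched as a small Python function that lists the paths tried, in order. The function name `include_candidates` and its parameters are hypothetical, and the sketch leaves open how a relative path under rule (1) is resolved, since the text above does not say.

```python
import posixpath

def include_candidates(name, include_dirs, source_file, cwd):
    """List the paths tried for an INCLUDE file, in search order.

    Rule 1: a name that carries a path is used as given.
    Rule 2: a bare name is looked up in the -I directories (in the order
    given), then the source file's directory, then the current directory.
    """
    if posixpath.dirname(name):
        return [name]
    search_dirs = list(include_dirs)
    search_dirs.append(posixpath.dirname(source_file) or ".")
    search_dirs.append(cwd)
    return [posixpath.join(d, name) for d in search_dirs]
```

Called with `'file1'`, two -I directories, and a source file in `src/`, it returns the -I candidates first, then `src/file1`, then the copy in the current directory.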

-i2, -i4 and -i8

Treat INTEGER variables as either two, four, or eight bytes. INTEGER*8 values not only occupy 8 bytes of storage, but operations use 64 bits, instead of 32 bits.

-Kflag

Requests that the compiler provide special compilation semantics.

Syntax:

-Kflag

Where flag is one of the following:

ieee

Perform floating-point operations in strict conformance with the IEEE 754 standard. Some optimizations are disabled, and on some systems a more accurate math library is linked if -Kieee is used during the link step.

noieee

Use the fastest available means to perform floating-point operations, link in faster non-IEEE libraries if available, and disable underflow traps.

PIC

(Linux and Solaris86 only) Generate position-independent code. Equivalent to -fpic. Provided for compatibility with other compilers.

pic

(Linux and Solaris86 only) Generate position-independent code. Equivalent to -fpic. Provided for compatibility with other compilers.

trap=option[,option]...

This new option controls the behavior of the IA-32 processor when IA-32 floating-point exceptions occur. Possible options include:

fp

align (ignored)

inv

denorm

divz

ovf

unf

inexact

-Ktrap is only processed by the compilers when compiling main functions/programs. The options inv, denorm, divz, ovf, unf, and inexact correspond to the IA-32 processor's exception mask bits invalid operation, denormalized operand, divide-by-zero, overflow, underflow, and precision, respectively. Normally, the IA-32 processor's exception mask bits are on (floating point exceptions are masked: the IA-32 processor recovers from the exceptions and continues). If a floating point exception occurs and its corresponding mask bit is off (or "unmasked"), execution terminates with an arithmetic exception (C's SIGFPE signal). -Ktrap=fp is equivalent to -Ktrap=inv,divz,ovf.

Default: The default is -Knoieee.
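The relationship between the -Ktrap options and the exception mask bits can be sketched in Python. The bit positions below follow the usual x87 control-word layout (invalid operation in the lowest bit, precision in bit 5); that layout, and the names `MASK_BITS`, `ALL_MASKED`, and `trap_mask`, are assumptions for illustration rather than something this guide states.

```python
# x87 control-word exception mask bits, lowest bit first (assumed layout).
MASK_BITS = {"inv": 0x01, "denorm": 0x02, "divz": 0x04,
             "ovf": 0x08, "unf": 0x10, "inexact": 0x20}
ALL_MASKED = 0x3F  # every exception masked: the processor recovers and continues

def trap_mask(option_string):
    """Return the exception mask bits left on for a -Ktrap=... argument."""
    mask = ALL_MASKED
    for opt in option_string.split(","):
        if opt == "fp":
            # -Ktrap=fp is equivalent to -Ktrap=inv,divz,ovf
            names = ["inv", "divz", "ovf"]
        elif opt == "align":
            names = []  # accepted but ignored
        else:
            names = [opt]
        for name in names:
            mask &= ~MASK_BITS[name]  # clear the bit to unmask the exception
    return mask
```

Under these assumptions, `trap_mask("fp")` and `trap_mask("inv,divz,ovf")` give the same value, and an unmasked (cleared) bit is what lets the corresponding exception terminate the program.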

-L

Specifies a directory to search for libraries. Use -L to add directories to the search path for library files. Multiple -L options are valid. However, the position of multiple -L options is important relative to -l options supplied.

Syntax:

-Ldirectory

Where directory is the name of the library directory.

Default: Search the standard library directory.

Usage: In the following example the library directory is /lib and the linker links in the standard libraries required by PGF90 from /lib.

$ pgf90 -L/lib myprog.f

In the following example the library directory /lib is searched for the library file libx.a and both the directories /lib and /libz are searched for liby.a.

$ pgf90 -L/lib -lx -L/libz -ly myprog.f
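The positional rule in this example can be modelled with a short Python sketch: each -l option sees only the -L directories that appear before it. The function name `library_search_plan` is hypothetical, the file name is formed with the lib prefix and .a extension described under -l, and the standard library directories that the linker also searches are left out.

```python
def library_search_plan(args):
    """Map each -l library to the -L directories that precede it.

    Models the positional rule illustrated above: a -l option sees only
    the -L directories given earlier on the command line.
    """
    dirs = []
    plan = []
    for arg in args:
        if arg.startswith("-L"):
            dirs.append(arg[2:])
        elif arg.startswith("-l"):
            # File name built from the "lib" prefix plus the ".a" extension.
            filename = "lib" + arg[2:] + ".a"
            plan.append((filename, list(dirs)))
    return plan
```

Applied to the command line above, the plan pairs libx.a with /lib only, and liby.a with both /lib and /libz.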

-l<library>

Loads a library. The linker searches <library> in addition to the standard libraries. Libraries specified with -l are searched in order of appearance and before the standard libraries.

Syntax:

-llibrary

Where library is the name of the library to search. The compiler prepends the characters lib to the library name and adds the .a extension following the library name.

Usage: In the following example if the standard library directory is /lib the linker loads the library /lib/libmylib.a, in addition to the standard libraries.

$ pgf90 myprog.f -lmylib

-Mpgflag

Selects options for code generation. The options are divided into the following categories:

Code generation

Environment

Inlining

Fortran Language Controls

C/C++ Language Controls

Optimization

Miscellaneous

Table 7-3 lists and briefly describes the options alphabetically and includes a field showing the category.

Table 7-3 -M Options Summary

| pgflag | Description | Category |
| --- | --- | --- |
| anno | annotate the assembly code with source code. | Miscellaneous |
| [no]asmkeyword | specifies whether the compiler allows the asm keyword in C/C++ source files (pgcc and pgCC only). | C/C++ Language |
| [no]backslash | determines how the backslash character is treated in quoted strings (pgf77, pgf90, and pghpf only). | Fortran Language |
| [no]bounds | specifies whether array bounds checking is enabled or disabled. | Miscellaneous |
| [no]builtin | Do/don't compile with math subroutine builtin support, which causes selected math library routines to be inlined (pgcc and pgCC only). | Optimization |
| byteswapio | Swap byte-order (big-endian to little-endian or vice versa) during I/O of Fortran unformatted data. | Miscellaneous |
| cache_align | where possible, align data objects of size greater than or equal to 16 bytes on cache-line boundaries. | Optimization |
| chkfpstk | check for internal consistency of the x86 FP stack in the prologue of a function and after returning from a function or subroutine call. | Miscellaneous |
| chkptr | check for NULL pointers (pgf90 and pghpf only). | Miscellaneous |
| chkstk | check the stack for available space upon entry to and before the start of a parallel region. Useful when many private variables are declared. | Miscellaneous |
| concur | enable auto-concurrentization of loops. Multiple processors will be used to execute parallelizable loops (only valid on shared memory multi-CPU systems). | Optimization |
| cray | Force Cray Fortran (CF77) compatibility (pgf77, pgf90, and pghpf only). | Optimization |
| [no]dclchk | determines whether all program variables must be declared (pgf77, pgf90, and pghpf only). | Fortran Language |
| [no]defaultunit | determines how the asterisk character ("*") is treated in relation to standard input and standard output (regardless of the status of I/O units 5 and 6, pgf77, pgf90, and pghpf only). | Fortran Language |
| [no]depchk | checks for potential data dependences. | Optimization |
| [no]dlines | determines whether the compiler treats lines containing the letter "D" in column one as executable statements (pgf77, pgf90, and pghpf only). | Fortran Language |
| dollar | specifies the character to which the compiler maps the dollar sign code (pgf77, pgf90, and pghpf only). | Fortran Language |
| extend | the compiler accepts 132-column source code; otherwise it accepts 72-column code (pgf77, pgf90, and pghpf only). | Fortran Language |
| extract | invokes the function extractor. | Inlining |
| fcon | instructs the compiler to treat floating-point constants as float data types (pgcc and pgCC only). | C/C++ Language |
| [no]i4 | determines how the compiler treats INTEGER variables (pgf77, pgf90, and pghpf only). | Optimization |
| info | print informational messages regarding optimization and code generation to standard output as compilation proceeds. | Miscellaneous |
| inform | specifies the minimum level of error severity that the compiler displays. | Miscellaneous |
| inline | invokes the function inliner. | Inlining |
| [no]iomutex | determines whether critical sections are generated around Fortran I/O calls (pgf77, pgf90, and pghpf only). | Fortran Language |
| keepasm | instructs the compiler to keep the assembly file. | Miscellaneous |
| [no]list | specifies whether the compiler creates a listing file. | Miscellaneous |
| neginfo | instructs the compiler to produce information on why certain optimizations are not performed. | Miscellaneous |
| noframe | eliminate operations that set up a true stack frame pointer for functions. | Optimization |
| nomain | when the link step is called, don't include the object file which calls the Fortran main program (pgf77, pgf90, and pghpf only). | Code Generation |
| noopenmp | when used in combination with the -mp option, causes the compiler to ignore OpenMP parallelization directives or pragmas, but still process SGI-style parallelization directives or pragmas. | Miscellaneous |
| nosgimp | when used in combination with the -mp option, causes the compiler to ignore SGI-style parallelization directives or pragmas, but still process OpenMP directives or pragmas. | Miscellaneous |
| nostartup | do not link in the standard startup routine (pgf77, pgf90, and pghpf only). | Environment |
| nostddef | instructs the compiler to not recognize the standard preprocessor macros. | Environment |
| nostdinc | instructs the compiler to not search the standard location for include files. | Environment |
| nostdlib | instructs the linker to not link in the standard libraries. | Environment |
| [no]onetrip | determines whether each DO loop executes at least once (pgf77, pgf90, and pghpf only). | Fortran Language |
| prof | set profile options; function-level and line-level profiling are supported. | Code Generation |
| [no]r8 | determines whether the compiler promotes REAL variables and constants to DOUBLE PRECISION (pgf77, pgf90, and pghpf only). | Optimization |
| [no]r8intrinsics | determines how the compiler treats the intrinsics CMPLX and REAL (pgf77, pgf90, and pghpf only). | Optimization |
| [no]recursive | allocate (do not allocate) local variables on the stack; this allows recursion. SAVEd, data-initialized, or namelist members are always allocated statically, regardless of the setting of this switch (pgf77, pgf90, and pghpf only). | Code Generation |
| [no]reentrant | specifies whether the compiler avoids optimizations that can prevent code from being reentrant. | Code Generation |
| [no]ref_externals | do (don't) force references to names appearing in EXTERNAL statements (pgf77, pgf90, and pghpf only). | Code Generation |
| safeptr | instructs the compiler to override data dependences between pointers and arrays (pgcc and pgCC only). | Optimization |
| safe_lastval | In the case where a scalar is used after a loop, but is not defined on every iteration of the loop, the compiler does not by default parallelize the loop. However, this option tells the compiler it is safe to parallelize the loop. For a given loop the last value computed for all scalars makes it safe to parallelize the loop. | Code Generation |
| [no]save | determines whether the compiler assumes that all local variables are subject to the SAVE statement (pgf77, pgf90, and pghpf only). | Fortran Language |
| schar | specifies signed char for characters (pgcc and pgCC only; also see uchar). | C/C++ Language |
| [no]second_underscore | do (don't) add the second underscore to the name of a Fortran global if its name already contains an underscore (pgf77, pgf90, and pghpf only). | Code Generation |
| [no]signextend | specifies whether the compiler extends the sign bit, if it is set. | Code Generation |
| [no]single | controls the conversion of float parameters to double parameters (pgcc and pgCC only). | C/C++ Language |
| standard | causes the compiler to flag source code that does not conform to the ANSI standard (pgf77, pgf90, and pghpf only). | Fortran Language |
| [no]stride0 | the compiler generates (does not generate) alternate code for a loop that contains an induction variable whose increment may be zero (pgf77, pgf90, and pghpf only). | Code Generation |
| uchar | specifies unsigned char for characters (pgcc and pgCC only; also see schar). | C/C++ Language |
| unix | use UNIX calling and naming conventions for Fortran subprograms (pgf77, pgf90, and pghpf for Win32 only). | Code Generation |
| [no]unixlogical | determines whether logical .TRUE. and .FALSE. are determined by non-zero (TRUE) and zero (FALSE) values for unixlogical. With nounixlogical, the default, -1 values are TRUE and 0 values are FALSE (pgf77, pgf90, and pghpf only). | Fortran Language |
| [no]unroll | controls loop unrolling. | Optimization |
| [no]upcase | determines whether the compiler allows uppercase letters in identifiers (pgf77, pgf90, and pghpf only). | Fortran Language |
| vect | invokes the code vectorizer. | Optimization |

-Mpgflag Code Generation Controls

Syntax:

-Mnomain

instructs the compiler not to include the object file which calls the Fortran main program as part of the link step. This option is useful for linking programs in which the main program is written in C/C++ and one or more subroutines are written in Fortran (pgf77, pgf90, and pghpf only).

-Mprof[=option[,option,...]]

Set profile options. option can be any of the following:

func

perform PGI-style function-level profiling.

line

perform PGI-style line-level profiling.

-Mrecursive

instructs the compiler to allow Fortran subprograms to be called recursively.

-Mnorecursive

Fortran subprograms may not be called recursively.

-Mref_externals

force references to names appearing in EXTERNAL statements (pgf77, pgf90, and pghpf only).

-Mnoref_externals

do not force references to names appearing in EXTERNAL statements (pgf77, pgf90, and pghpf only).

-Mreentrant

instructs the compiler to avoid optimizations that can prevent code from being reentrant.

-Mnoreentrant

instructs the compiler not to avoid optimizations that can prevent code from being reentrant.

-Msecond_underscore

instructs the compiler to add a second underscore to the name of a Fortran global symbol if its name already contains an underscore. This option is useful for maintaining compatibility with object code compiled using g77, which uses this convention by default (pgf77, pgf90, and pghpf only).

-Mnosecond_underscore

instructs the compiler not to add a second underscore to the name of a Fortran global symbol if its name already contains an underscore (pgf77, pgf90, and pghpf only).
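As an illustration of the naming rule, the Python sketch below derives a linker-level symbol from a Fortran global name. It assumes a base convention of lowercasing the name and appending a single underscore, which this guide does not spell out; only the second-underscore rule comes from the text above. The function name `fortran_symbol` is hypothetical.

```python
def fortran_symbol(name, second_underscore=False):
    """Return the linker-level name for a Fortran global under the assumed scheme.

    Assumption: the base convention lowercases the name and appends one
    underscore. With second_underscore=True, a name that already contains
    an underscore gets a second one, matching the g77 default described above.
    """
    symbol = name.lower() + "_"
    if second_underscore and "_" in name:
        symbol += "_"
    return symbol
```

Under these assumptions, MY_SUB becomes my_sub_ by default and my_sub__ with the second underscore enabled, while a name without an underscore is unaffected by the option.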

-Msignextend

instructs the compiler to extend the sign bit that is set as a result of converting an object of one data type to an object of a larger signed data type.

-Mnosignextend

instructs the compiler not to extend the sign bit that is set as the result of converting an object of one data type to an object of a larger data type.

-Msafe_lastval

In the case where a scalar is used after a loop, but is not defined on every iteration of the loop, the compiler does not by default parallelize the loop. However, this option tells the compiler that it is safe to parallelize the loop: for a given loop, the last values computed for all scalars make it safe to parallelize the loop.
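The loop shape that -Msafe_lastval refers to can be illustrated with a short C sketch. The function name last_positive is an assumption made for this example; whether a particular loop is actually safe to parallelize remains the programmer's judgment.

```c
/* Returns the last positive element of a[0..n-1], or -1 if there is none.
 * The scalar t is assigned only on some iterations and is read after the
 * loop, which is the pattern described for -Msafe_lastval. */
static int last_positive(const int *a, int n)
{
    int t = -1;
    for (int i = 0; i < n; i++) {
        if (a[i] > 0)
            t = a[i];   /* t is not defined on every iteration */
    }
    return t;           /* t is used after the loop */
}
```

A parallel version has to reproduce the value t would hold after the final sequential iteration, which is why the compiler is conservative about such loops by default.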

-Mstride0

instructs the compiler to inhibit certain optimizations and to allow for stride 0 array references. This option may degrade performance and should only be used if zero-stride induction variables are possible.

-Mnostride0

instructs the compiler to perform certain optimizations and to disallow stride 0 array references.

-Munix

use UNIX symbol and parameter passing conventions for Fortran subprograms (pgf77, pgf90, and pghpf for Win32 only).

Default: For arguments that you do not specify, the default code generation controls are as follows:

norecursive

nostride0

noreentrant

signextend

nosecond_underscore

noref_externals

-Mpgflag Environment Controls

Syntax:

-Mnostartup

instructs the linker not to link in the standard startup routine which contains the entry point (_start) for the program.

Note: If you use the -Mnostartup option and do not supply an entry point, the linker issues the following error message:

Warning: cannot find entry symbol _start

-Mnostdlib

instructs the linker not to link in the standard libraries libpgftnrtl.a, libm.a, libc.a and libpgc.a in the library directory lib within the standard directory. You can link in your own library with the -l option or specify a library directory with the -L option.

Default: For arguments that you do not specify, the default environment option depends on your configuration.

Cross-reference: -D, -I, -L, -l, -U

-Mpgflag Inlining Controls

This section describes the -Mpgflag options that control function inlining.

Syntax:

-Mextract[=option[,option,...]]

Extracts functions from the file indicated on the command line and creates or appends to the specified extract directory. option can be any of:

name:func

instructs the extractor to extract function func from the file.

size:number

instructs the extractor to extract functions with number or fewer statements from the file.

lib:dirname

Use directory dirname as the extract directory (required in order to save and re-use inline libraries).

If you specify both name and size, the compiler extracts functions that match func, or that have number or fewer statements. For examples of extracting functions, see Chapter 4, Function Inlining.

-Minline[=func | filename.ext | number | levels:number],...

This passes options to the function inliner where:

func

instructs the inliner to inline the function func. The func name should be a non-numeric string that does not contain a period. You can also use a name: prefix followed by the function name. If name: is specified, what follows is always the name of a function.

filename.ext

instructs the inliner to inline the functions within the library file filename.ext. The compiler assumes that a filename.ext option containing a period is a library file. Create the library file using the -Mextract option. You can also use a lib: prefix followed by the library name. If lib: is specified, no period is necessary in the library name. Functions from the specified library are inlined. If no library is specified, functions are extracted from a temporary library created during an extract prepass.

number

instructs the inliner to inline functions with number or fewer statements. You can also use a size: prefix followed by a number. If size: is specified, what follows is always taken as a number.

levels:number

instructs the inliner to perform number levels of inlining. The default number is 1.

If you specify both func and number, the compiler inlines functions that match the function name or have number or fewer statements. For examples of inlining functions, see Chapter 4, Function Inlining.

Usage: In the following example, the compiler extracts functions that have 500 or fewer statements from the source file myprog.f and saves them in the file extract.il.

$ pgf90 -Mextract=500 -oextract.il myprog.f

In the following example, the compiler inlines functions with fewer than approximately 100 statements in the source file myprog.f and writes the executable code in the default output file a.out.

$ pgf90 -Minline=size:100 myprog.f

Cross-reference: -o

-Mpgflag Fortran Language Controls

This section describes the -Mpgflag options that affect Fortran language interpretations by the PGI Fortran compilers. These options are only valid to the pgf77, pgf90, and pghpf compilation drivers.

Syntax:

-Mbackslash

the compiler treats the backslash as a normal character, and not as an escape character in quoted strings.

-Mnobackslash

the compiler recognizes a backslash as an escape character in quoted strings (in accordance with standard C usage).

-Mdclchk

the compiler requires that all program variables be declared.

-Mnodclchk

the compiler does not require that all program variables be declared.

-Mdefaultunit

the compiler treats "*" as a synonym for standard input for reading and standard output for writing.

-Mnodefaultunit

the compiler treats "*" as a synonym for unit 5 on input and unit 6 on output.

-Mdlines

the compiler treats lines containing "D" in column 1 as executable statements (ignoring the "D").

-Mnodlines

the compiler does not treat lines containing "D" in column 1 as executable statements (does not ignore the "D").

-Mdollar,char

char specifies the character to which the compiler maps the dollar sign. The compiler allows the dollar sign in names.

-Mextend

with -Mextend, the compiler accepts 132-column source code; otherwise it accepts 72-column code.

-Miomutex

the compiler generates critical section calls around Fortran I/O statements.

-Mnoiomutex

the compiler does not generate critical section calls around Fortran I/O statements.

-Monetrip

the compiler forces each DO loop to execute at least once. This option is useful for programs written for earlier versions of Fortran.

-Mnoonetrip

the compiler does not force each DO loop to execute at least once.
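The difference between the two settings can be sketched as an iteration-count calculation in C. The function do_trip_count and its onetrip parameter are names invented for this illustration; the formula is the Fortran 77 rule MAX(INT((last - first + step) / step), 0), with the minimum raised to 1 to model -Monetrip.

```c
/* Number of iterations of "DO i = first, last, step" (step must be
 * non-zero). With onetrip set, the body runs at least once, as
 * -Monetrip requests; otherwise a zero-trip loop is possible. */
static long do_trip_count(long first, long last, long step, int onetrip)
{
    /* C integer division truncates toward zero, like Fortran INT */
    long n = (last - first + step) / step;
    long minimum = onetrip ? 1 : 0;
    return n > minimum ? n : minimum;
}
```

For example, DO i = 5, 1 executes zero times under the default and once when onetrip is set, while DO i = 1, 10, 3 executes four times either way.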

-Msave

the compiler assumes that all local variables are subject to the SAVE statement. Note that this may allow older Fortran programs to run, but it can greatly reduce performance.

-Mnosave

the compiler does not assume that all local variables are subject to the SAVE statement.

-Mstandard

the compiler flags non-ANSI-conforming source code.

-Munixlogical

directs the compiler to treat logical values as true if the value is non-zero and false if the value is zero (UNIX F77 convention). When -Munixlogical is enabled, a logical value or test that is non-zero is .TRUE., and a value or test that is zero is .FALSE.. In addition, the value of a logical expression is guaranteed to be one (1) when the result is .TRUE..

-Mnounixlogical

directs the compiler to use the VMS convention for logical values for true and false. Odd values are true and even values are false.
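The two conventions can be compared with a small C sketch. The function names are invented for this example, and the VMS test shown here assumes that convention inspects only the low-order bit of the value, so that odd values are true.

```c
/* UNIX F77 convention (-Munixlogical): any non-zero value is true. */
static int is_true_unix(int value)
{
    return value != 0;
}

/* VMS convention (-Mnounixlogical), assumed here to test only the
 * low-order bit: odd values are true, even values are false. */
static int is_true_vms(int value)
{
    return ((unsigned)value & 1u) != 0;
}
```

The value 2 shows the difference: it is true under the UNIX F77 convention and false under the low-bit test, whereas 0, 1, and -1 are classified the same way by both.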

-Mupcase

the compiler allows uppercase letters in identifiers. With -Mupcase, the identifiers "X" and "x" are different, and keywords must be in lower case. This selection affects the linking process: if you compile and link the same source code using -Mupcase on one occasion and -Mnoupcase on another, you may get two different executables (depending on whether the source contains uppercase letters). The standard libraries are compiled using the default -Mnoupcase.

-Mnoupcase

the compiler converts all identifiers to lower case. This selection affects the linking process: if you compile and link the same source code using -Mupcase on one occasion and -Mnoupcase on another, you may get two different executables (depending on whether the source contains uppercase letters). The standard libraries are compiled using -Mnoupcase.

Default: For arguments that you do not specify, the defaults are as follows:

nobackslash

noiomutex

nodclchk

noonetrip

nodefaultunit

nosave

nodlines

nounixlogical

dollar,_

noupcase

-Mpgflag C/C++ Language Controls

This section describes the -Mpgflag options that affect C/C++ language interpretations by the PGI C and C++ compilers. These options are only valid to the pgcc and pgCC compilation drivers.

Syntax:

-Masmkeyword

instructs the compiler to allow the asm keyword in C source files. The syntax of the asm statement is as follows:

asm("statement");

Where statement is a legal assembly-language statement. The quote marks are required.

-Mnoasmkeyword

instructs the compiler not to allow the asm keyword in C source files. If you use this option and your program includes the asm keyword, unresolved references will be generated.

-Mdollar,char

char specifies the character to which the compiler maps the dollar sign ($). The PGCC compiler allows the dollar sign in names; ANSI C does not allow the dollar sign in names.

-Mfcon

instructs the compiler to treat floating-point constants as float data types, instead of double data types. This option can improve the performance of single-precision code.

-Mschar

specifies signed char characters. The compiler treats "plain" char declarations as signed char.

-Msingle

instructs the compiler not to convert float parameters to double parameters in non-prototyped functions. This option can result in faster code if your program uses only float parameters. However, since ANSI C specifies that routines must convert float parameters to double parameters in non-prototyped functions, this option results in non-ANSI conformant code.

-Mnosingle

instructs the compiler to convert float parameters to double parameters in non-prototyped functions.

-Muchar

instructs the compiler to treat "plain" char declarations as unsigned char.

Default: For arguments that you do not specify, the defaults are as follows:

noasmkeyword

nosingle

dollar,_

schar

Usage:

In this example, the compiler allows the asm keyword in the source file.

$ pgcc -Masmkeyword myprog.c

In the following example, the compiler maps the dollar sign to the dot character.

$ pgcc -Mdollar,. myprog.c

In the following example, the compiler treats floating-point constants as float values.

$ pgcc -Mfcon myprog.c

In the following example, the compiler does not convert float parameters to double parameters.

$ pgcc -Msingle myprog.c

Without -Muchar or with -Mschar, the variable ch is a signed character:

char ch;

signed char sch;

If -Muchar is specified on the command line:

$ pgcc -Muchar myprog.c

char ch above is equivalent to:

unsigned char ch;
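The practical effect of the plain char setting can be observed from C itself. The sketch below uses only standard C (CHAR_MIN from limits.h); the function names are assumptions made for this example, and the signed case assumes a two's-complement target.

```c
#include <limits.h>

/* 1 if plain char is signed in this compilation (the -Mschar behavior),
 * 0 if it is unsigned (the -Muchar behavior). */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The all-ones byte read through each explicitly qualified type. */
static int all_ones_as_signed_char(void)
{
    signed char sch = -1;      /* bit pattern 0xFF on two's-complement targets */
    return sch;                /* promotes to the int value -1 */
}

static int all_ones_as_unsigned_char(void)
{
    unsigned char uch = 0xFF;
    return uch;                /* promotes to the int value 255 */
}
```

Code that compares a plain char against a negative constant, or uses one as an array index, can therefore behave differently depending on which of the two options is in effect.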

-Mpgflag Optimization Controls

Syntax:

-Mcache_align

Align unconstrained objects of length greater than or equal to 16 bytes on cache-line boundaries. An unconstrained object is a data object that is not a member of an aggregate structure or common block. This option does not affect the alignment of allocatable or automatic arrays. NOTE: To effect cache-line alignment of stack-based local variables, the main program or function must be compiled with -Mcache_align.

-Mconcur[=option[,option,...]]

Instructs the compiler to enable auto-concurrentization of loops. If -Mconcur is specified, multiple processors will be used to execute loops which the compiler determines to be parallelizable. option is one of the following:

altcode:n

Instructs the parallelizer to generate alternate scalar code for parallelized loops. If altcode is specified without arguments, the parallelizer determines an appropriate cutoff length and generates scalar code to be executed whenever the loop count is less than or equal to that length. If altcode:n is specified, the scalar altcode is executed whenever the loop count is less than or equal to n.

noaltcode

If noaltcode is specified, the parallelized version of the loop is always executed regardless of the loop count.

dist:block

Parallelize with block distribution (this is the default). Contiguous blocks of iterations of a parallelizable loop are assigned to the available processors.

dist:cyclic

Parallelize with cyclic distribution. The outermost parallelizable loop in any loop nest is parallelized. If a parallelized loop is innermost, its iterations are allocated to processors cyclically. For example, if there are 3 processors executing a loop, processor 0 performs iterations 0, 3, 6, etc.; processor 1 performs iterations 1, 4, 7, etc.; and processor 2 performs iterations 2, 5, 8, etc.

cncall

Calls in parallel loops are safe to parallelize. Loops containing calls are candidates for parallelization. Also, no minimum loop count threshold must be satisfied before parallelization will occur, and last values of scalars are assumed to be safe.

noassoc

Disables parallelization of loops with reductions.

When linking, the -Mconcur switch must be specified or unresolved references will result. The NCPUS environment variable controls how many processors are used to execute parallelized loops.

Note: this option applies only on shared-memory multi-processor systems.
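The two distribution schemes for -Mconcur can be sketched as a mapping from iteration number to processor. Both function names are invented for this example, and the block version assumes equal-sized contiguous chunks with a possibly shorter last chunk; the exact partitioning used by the parallelizer is not specified here.

```c
/* dist:cyclic: iteration i goes to processor i mod p. */
static int owner_cyclic(int i, int p)
{
    return i % p;
}

/* dist:block: contiguous chunks of iterations. Assumption: chunks of
 * ceil(n / p) iterations, so the last processor may get fewer. */
static int owner_block(int i, int n, int p)
{
    int chunk = (n + p - 1) / p;
    return i / chunk;
}
```

With 3 processors and 9 iterations, the cyclic mapping gives processor 1 the iterations 1, 4, and 7, while the block mapping gives it iterations 3, 4, and 5.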

-Mcray[=option[,option,...]]

(pgf77 and pgf90 only) Force Cray Fortran (CF77) compatibility with respect to the listed options. Possible values of option include:

pointer

for purposes of optimization, it is assumed that pointer-based variables do not overlay the storage of any other variable.

-Mdepchk

instructs the compiler to assume unresolved data dependences actually conflict.

-Mnodepchk

instructs the compiler to assume potential data dependences do not conflict. However, if data dependences exist, this option can produce incorrect code.

-Mi4

(pgf77 and pgf90 only) the compiler treats INTEGER variables as INTEGER*4.

-Mnoi4

(pgf77 and pgf90 only) the compiler treats INTEGER variables as INTEGER*2.

-Mnoframe

Eliminates operations that set up a true stack frame pointer for every function. With this option enabled, you cannot perform a traceback on the generated code and you cannot access local variables.

-Mr8

(pgf77, pgf90 and pghpf only) the compiler promotes REAL variables and constants to DOUBLE PRECISION variables and constants, respectively. DOUBLE PRECISION elements are 8 bytes in length.

-Mnor8

(pgf77, pgf90 and pghpf only) the compiler does not promote REAL variables and constants to DOUBLE PRECISION. REAL variables will be single precision (4 bytes in length).

-Mr8intrinsics

(pgf77 and pgf90 only) the compiler treats the intrinsics CMPLX and REAL as DCMPLX and DBLE, respectively.

-Mnor8intrinsics

(pgf77 and pgf90 only) the compiler does not promote the intrinsics CMPLX and REAL to DCMPLX and DBLE, respectively.

-Msafeptr[=option[,option,...]]

(pgcc and pgCC only) instructs the C/C++ compiler to override data dependences between pointers of a given storage class. Possible values of option include:

arg

instructs the compiler that arrays and pointers are treated with the same copyin and copyout semantics as Fortran dummy arguments.

global

instructs the compiler that global or external pointers and arrays do not overlap or conflict with each other and are independent.

local/auto

instructs the compiler that local pointers and arrays do not overlap or conflict with each other and are independent.

static

instructs the compiler that static pointers and arrays do not overlap or conflict with each other and are independent.
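Why overriding pointer dependences can be unsafe is easiest to see with overlapping arguments. The following C sketch uses an invented function, add_one, and shows only standard C behavior; it does not reproduce what the PGI optimizer actually emits.

```c
/* dst[i] = src[i] + 1 for i in [0, n), executed strictly in order.
 * If dst and src overlap, each iteration may read a value written by
 * an earlier one: a real data dependence between the two pointers. */
static void add_one(int *dst, const int *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}
```

Calling add_one(a + 1, a, 3) on an array of zeros yields 0, 1, 2, 3 when executed in order. A compiler that has been told the pointers are independent is free to load all of src before storing, which would give 0, 1, 1, 1 instead, so options such as -Msafeptr and -Mnodepchk are appropriate only when no such overlap occurs.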

-Munroll[=option[,option...]]

invokes the loop unroller. This also sets the optimization level to 2 if the level is set to less than 2. The option is one of the following:

c:m

instructs the compiler to completely unroll loops with a constant loop count less than or equal to m, a supplied constant. If this value is not supplied, the m count is set to 4.

n:u

instructs the compiler to unroll u times, a loop which is not completely unrolled, or has a non-constant loop count. If u is not supplied, the unroller computes the number of times a candidate loop is unrolled.

-Mnounroll

instructs the compiler not to unroll loops.
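As a rough picture of what unrolling a loop with a non-constant count means, the C sketch below unrolls a summation loop four times by hand, roughly what n:4 asks the compiler to do automatically. The function name sum_unrolled4 is an assumption for this example, and the code the unroller really generates may differ.

```c
/* Sum n doubles with the loop body unrolled 4 times, followed by a
 * remainder loop for the last n mod 4 elements. */
static double sum_unrolled4(const double *a, int n)
{
    double s = 0.0;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)      /* leftover iterations */
        s += a[i];
    return s;
}
```

The additions happen in the same order as in the rolled loop, so the result is unchanged; the gain comes from executing fewer loop-control branches per element.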

-Mvect[=option[,option,...]]

invokes the code vectorizer, where option is one of the following:

altcode:n

Instructs the vectorizer to generate alternate scalar code for vectorized loops. If altcode is specified without arguments, the vectorizer determines an appropriate cutoff length and generates scalar code to be executed whenever the loop count is less than or equal to that length. If altcode:n is specified, the scalar altcode is executed whenever the loop count is less than or equal to n.

noaltcode

If noaltcode is specified, the vectorized version of the loop is always executed regardless of the loop count.

assoc

Instructs the vectorizer to enable certain associativity conversions that can change the results of a computation due to round-off error. A typical optimization is to change one arithmetic operation into another that is mathematically equivalent but can be computationally different, due to round-off error.

noassoc

Instructs the vectorizer to disable associativity conversions.

cachesize:n

Instructs the vectorizer, when performing cache tiling optimizations, to assume a cache size of n. The default is n = 262144.

smallvect[:n]

Instructs the vectorizer to assume that the maximum vector length is less than or equal to n. The vectorizer uses this information to eliminate generation of the stripmine loop for vectorized loops wherever possible. If the size n is omitted, the default is 100.

Note: no space is allowed on either side of the colon (:).

sse

Instructs the vectorizer to search for vectorizable loops and, where possible, make use of Pentium III SSE and prefetch instructions.

prefetch

Instructs the vectorizer to search for vectorizable loops and, where possible, make use of Pentium III or AMD Athlon prefetch instructions.

Default: For arguments that you do not specify, the default optimization control options are as follows:

depchk

nor8

i4

nor8intrinsics

If you do not supply an option to -Mvect, the compiler uses defaults that are dependent upon the target system.

Usage: In this example, the compiler invokes the vectorizer with idiom recognition for Pentium III SSE instructions enabled.

$ pgf90 -Mvect=sse -Mcache_align myprog.f

Cross-reference: -g, -O
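The round-off effect that the assoc option of -Mvect warns about can be reproduced in plain C, independent of any vectorizer. The two functions below are illustrative names only; they evaluate the same three-term sum with different groupings.

```c
/* (a + b) + c : the grouping written in the source. */
static double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* a + (b + c) : a mathematically equivalent regrouping. */
static double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20, and c = 1.0, the first grouping returns 1.0, while in the second the 1.0 is absorbed when added to -1e20 and the result is 0.0. Associativity conversions trade this kind of exactness for speed.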

-Mpgflag Miscellaneous Controls

Syntax:

-Manno

annotate the generated assembly code with source code.

-Mbounds

enables array bounds checking. If an array is an assumed size array, the bounds checking only applies to the lower bound. If an array bounds violation occurs during execution, an error message describing the error is printed and the program terminates. The text of the error message includes the name of the array, the location where the error occurred (the source file and the line number in the source), and information about the out of bounds subscript (its value, its lower and upper bounds, and its dimension). For example:

PGFTN-F-Subscript out of range for array a (a.f: 2) subscript=3, lower bound=1, upper bound=2, dimension=2

-Mnobounds

disables array bounds checking.

-Mbyteswapio

swap byte-order from big-endian to little-endian or vice versa upon input/output of Fortran unformatted data files.
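The byte-order conversion itself is simple to state in C. The sketch below reverses the four bytes of a 32-bit value; swap32 is a name chosen for this example, and the sketch says nothing about how the Fortran runtime handles record markers or other data sizes.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (big-endian to
 * little-endian or vice versa). */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Applying the swap twice returns the original value, which is why the same option serves for both reading and writing foreign-endian files.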

-Mchkfpstk

instructs the compiler to check for internal consistency of the x86 floating-point stack in the prologue of a function and after returning from a function or subroutine call. Floating-point stack corruption may occur in many ways, one of which is Fortran code calling floating-point functions as subroutines (i.e. with the CALL statement). If the PGI_CONTINUE environment variable is set upon execution of a program compiled with -Mchkfpstk, the stack will be automatically cleaned up and execution will continue. There is a performance penalty associated with the stack cleanup. If PGI_CONTINUE is set to verbose, the stack will be automatically cleaned up and execution will continue after printing of a warning message.

-Mchkptr

instructs the compiler to check for pointers that are de-referenced while initialized to NULL (pgf90 and pghpf only).

-Mchkstk

instructs the compiler to check the stack for available space in the prologue of a function and before the start of a parallel region. Prints a warning message and aborts the program gracefully if stack space is insufficient. Useful when many local and private variables are declared in an OpenMP program.

-Minfo[=option[,option,...]]

instructs the compiler to produce information on standard error, where option is one of the following:

all

instructs the compiler to produce all available -Minfo information.

inline

instructs the compiler to display information about extracted or inlined functions. This option is not useful without either the -Mextract or -Minline option.

loop

instructs the compiler to display information about loops, such as information on vectorization.

opt

instructs the compiler to display information about optimization.

time

instructs the compiler to display compilation statistics.

unroll

instructs the compiler to display information about loop unrolling.

-Mneginfo[=option[,option,...]]

instructs the compiler to produce information on standard error, where option is one of the following:

concur

instructs the compiler to produce all available information on why loops are not automatically parallelized. In particular, if a loop is not parallelized due to potential data dependence, the variable(s) that cause the potential dependence will be listed in the -Mneginfo messages.

loop

instructs the compiler to produce information on why memory hierarchy optimizations on loops are not performed.

-Minform,level

instructs the compiler to display error messages at the specified and higher levels, where level is one of the following:

fatal

instructs the compiler to display fatal error messages.

severe

instructs the compiler to display severe and fatal error messages.

warn

instructs the compiler to display warning, severe and fatal error messages.

inform

instructs the compiler to display all error messages (inform, warn, severe and fatal).

-Mkeepasm

instructs the compiler to keep the assembly file as compilation continues. Normally, the assembler deletes this file when it is finished. The assembly file has the same filename as the source file, but with a .s extension.

-Mlist

instructs the compiler to create a listing file. The listing file is filename.lst, where the name of the source file is filename.f.

-Mnolist

the compiler does not create a listing file. This is the default.

-Mnoopenmp

when used in combination with the -mp option, causes the compiler to ignore OpenMP parallelization directives or pragmas, but still process SGI-style parallelization directives or pragmas.

-Mnosgimp

when used in combination with the -mp option, causes the compiler to ignore SGI-style parallelization directives or pragmas, but still process OpenMP parallelization directives or pragmas.

Default: For arguments that you do not specify, the default miscellaneous options are as follows:

inform

warn

nolist

nobounds

Usage: In the following example the compiler includes Fortran source code with the assembly code.

$ pgf90 -Manno -S myprog.f

In the following example the compiler displays information about inlined functions with fewer than approximately 20 source lines in the source file myprog.f.

$ pgf90 -Minfo=inline -Minline=20 myprog.f

In the following example the assembler does not delete the assembly file myprog.s after the assembly pass.

$ pgf90 -Mkeepasm myprog.f

In the following example the compiler creates the listing file myprog.lst.

$ pgf90 -Mlist myprog.f

In the following example array bounds checking is enabled.

$ pgf90 -Mbounds myprog.f

Cross-reference: -m, -S, -V, -v

-module <moduledir>

Use the -module option to specify a particular directory in which generated intermediate .mod files should be placed. If the -module <moduledir> option is present, and USE statements are present in a compiled program unit, <moduledir> will be searched for .mod intermediate files prior to the search in the default (local) directory.

Default: The compiler places .mod files in the current working directory, and searches only in the current working directory for pre-compiled intermediate .mod files.

Usage: The following command line requests that any intermediate module file produced during compilation of myprog.f be placed in the directory mymods (in particular, the file ./mymods/myprog.mod will be used):

$ pgf90 -module mymods myprog.f

-mp

Use the -mp option to instruct the compiler to interpret user-inserted OpenMP shared-memory parallel programming directives and generate an executable file which will utilize multiple processors in a shared-memory parallel system. See Chapter 10, OpenMP Parallelization Directives for Fortran, and Chapter 11, OpenMP Parallelization Pragmas for C and C++, for a detailed description of this programming model and the associated directives and pragmas.

Default: The compiler ignores user-inserted shared-memory parallel programming directives and pragmas.

Usage: The following command line requests processing of any shared-memory directives present in myprog.f:

$ pgf90 -mp myprog.f

Cross-reference: -Mconcur and -Mvect

-mslibs

(Win32 only) Use the -mslibs option to instruct the compiler to use the Microsoft linker and include files, and link against the Microsoft Visual C++ libraries. Microsoft Visual C++ must be installed in order to use this switch. This switch can be used to link Visual C++-compiled program units into PGI main programs on Win32.

Default: The compiler uses the PGI-supplied linker and include files and links against PGI-supplied libraries.

Cross-reference: -msvcrt

-msvcrt

(Win32 only) Use the -msvcrt option to instruct the compiler to use Microsoft's msvcrt.dll at runtime rather than the default crtdll.dll. These files contain the Microsoft C runtime library and the default mingw32 C runtime library respectively. It is recommended that you use the -msvcrt option in combination with the -mslibs option.

Default: The compiler uses crtdll.dll at runtime.

Cross-reference: -mslibs

-O

Invokes code optimization at the specified level.

Syntax:

-O [level]

Where level is one of the following:

0

creates a basic block for each statement. Neither scheduling nor global optimization is done. To specify this level, supply a 0 (zero) argument to the -O option.

1

schedules within basic blocks and performs some register allocations, but does no global optimization.

2

performs all level-1 optimizations, and also performs global scalar optimizations such as induction variable elimination and loop invariant movement.
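Loop invariant movement, one of the level-2 optimizations named above, can be pictured by applying it by hand. In the C sketch below, both function names are invented for this example, and the second function is only an approximation of what an optimizer might produce from the first.

```c
/* As written: scale * offset is recomputed on every iteration even
 * though neither operand changes inside the loop. */
static void fill_naive(int *out, int n, int scale, int offset)
{
    for (int i = 0; i < n; i++)
        out[i] = i + scale * offset;
}

/* After hand-applied loop invariant movement: the product is computed
 * once, before the loop. */
static void fill_hoisted(int *out, int n, int scale, int offset)
{
    int k = scale * offset;   /* invariant moved out of the loop */
    for (int i = 0; i < n; i++)
        out[i] = i + k;
}
```

Both functions store the same values; the second simply performs one multiplication instead of n.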

Default: Table 7-3 shows the interaction between the -O option, -g option, and -Mvect options.

Table 7-3 Optimization and -O, -g, -Mvect, and -Mconcur Options

Optimize Option   Debug Option   -M Option   Optimization Level
none              none           none        1
none              none           -Mvect      2
none              none           -Mconcur    2
none              -g             none        0
-O                none or -g     none        2
-Olevel           none or -g     none        level
-Olevel < 2       none or -g     -Mvect      2
-Olevel < 2       none or -g     -Mconcur    2

Unoptimized code compiled using the option -O0 can be significantly slower than code generated at other optimization levels. Like the -Mvect option, the -Munroll option sets the optimization level to level-2 if no -O or -g options are supplied. For more information on optimization, see Chapters 2 and 3.

Usage: In the following example, since the -O option is specified without an optimization level, the compiler sets the optimization to level-2.

$ pgf90 -O myprog.f

Cross-reference: -g, -Mpgflag

-o

Names the executable file. Use the -o option to specify the filename of the compiler object file. The final output is the result of linking.

Syntax:

-o filename

Where filename is the name of the file for the compilation output. The filename must not have a .f extension.

Default: The compiler creates executable filenames as needed. If you do not specify the -o option, the default filename is the linker output file a.out.

Usage: In the following example, the executable file is myprog instead of the default a.out.

$ pgf90 myprog.f -o myprog

Cross-reference: -c, -E, -F, -S

-pc

Syntax:

-pc { 32 | 64 | 80 }

The IA-32 architecture implements a floating-point stack using eight 80-bit registers. Each register uses bits 0-63 as the significand, bits 64-78 for the exponent, and bit 79 is the sign bit. This 80-bit real format is the default format (called the extended format). When values are loaded into the floating-point stack they are automatically converted into extended real format. The precision of the floating-point stack can be controlled, however, by setting the precision control bits (bits 8 and 9) of the floating control word appropriately. In this way, the programmer can explicitly set the precision to standard IEEE double-precision using 64 bits, or to single precision using 32 bits.[*] The default precision is system dependent. To alter the precision in a given program unit, the main program must be compiled with the same -pc option. The command line option -pc val lets the programmer set the compiler's precision preference. Valid values for val are:

32

single precision

64

double precision

80

extended precision

Operations performed exclusively on the floating-point stack using extended precision, without storing into or loading from memory, can cause problems with accumulated values within the extra 16 bits of extended precision values. This can lead to answers, when rounded, that do not match expected results.

For example, if the argument to sin is the result of previous calculations performed on the floating-point stack, then an 80-bit value used instead of a 64-bit value can result in slight discrepancies. Results can even change sign due to the sin curve being too close to an x-intercept value when evaluated. To maintain consistency in this case, the programmer can assure that the compiler generates code that calls a function. According to the IA-32 ABI, a function call must push its arguments on the stack (in this way memory is guaranteed to be accessed, even if the argument is an actual constant). Thus, even if the called function simply performs the inline expansion, using the function call as a wrapper to sin has the effect of trimming the argument precision down to the expected size. Using the -Mnobuiltin option on the command line for C accomplishes this task by resolving all math routines in the library libm, performing a function call of necessity. The other method of generating a function call for math routines, but one which may still produce the inline instructions, is by using the -Kieee switch.

A second example illustrates the precision control problem using a section of code to determine machine precision:

      program find_precision
      w = 1.0
100   w = w + w
      y = w + 1
      z = y - w
      if (z .gt. 0) goto 100
C     now w is just big enough that |((w+1)-w)-1| >= 1 ...
      print *, w
      end

In this case, where the variables are implicitly real*4, operations are performed on the floating-point stack where optimization removed unnecessary loads and stores from memory. The general case of copy propagation being performed follows this pattern:

a = x

y = 2.0 + a

Instead of storing x into a, then loading a to perform the addition, the value of x can be left on the floating-point stack and added to 2.0. Thus, memory accesses in some cases can be avoided, leaving answers in the extended real format. If copy propagation is disabled, stores of all left-hand sides will be performed automatically and reloaded when needed. This will have the effect of rounding any results to their declared sizes.

For the above program, w has a value of 1.8446744E+19 when executed using default (extended) precision. If, however, -Kieee is set, the value becomes 1.6777216E+07 (single precision). This difference is due to the fact that -Kieee disables copy propagation, so all intermediate results are stored into memory, then reloaded when needed. Copy propagation is only disabled for floating-point operations, not integer. With this particular example, setting the -pc switch will also adjust the result.
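The same experiment can be written in C in a way that does not depend on compiler options. In the sketch below, which is an illustration rather than a translation the manual provides, the volatile qualifier forces every assignment to be stored at its declared size, mimicking the store-and-reload behavior attributed to -Kieee; the function names are invented for this example.

```c
/* Doubles w until (w + 1) - w is no longer greater than zero, with
 * every intermediate result rounded to single precision on store. */
static float find_precision_single(void)
{
    volatile float w = 1.0f, y, z;
    do {
        w = w + w;
        y = w + 1.0f;
        z = y - w;
    } while (z > 0.0f);
    return w;
}

/* The same loop with every intermediate rounded to double precision. */
static double find_precision_double(void)
{
    volatile double w = 1.0, y, z;
    do {
        w = w + w;
        y = w + 1.0;
        z = y - w;
    } while (z > 0.0);
    return w;
}
```

The single-precision version stops at 16777216 (2 to the 24th power), the 1.6777216E+07 quoted above, and the double-precision version stops at 2 to the 53rd power. The extended-precision figure of 1.8446744E+19 is 2 to the 64th power, consistent with the 64-bit significand of the 80-bit format.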

The switch -Kieee also has the effect of making function calls to perform all transcendental operations. Although the function still produces the IA-32 machine instruction for computation (unless in C the -Mnobuiltin switch is set), arguments are passed on the stack, which results in a memory store and load.

Finally, -Kieee also disables reciprocal division for constant divisors. That is, for a/b with unknown a and constant b, the expression is usually converted at compile time to a*(1/b), thus turning an expensive divide into a relatively fast scalar multiplication. However, numerical discrepancies can occur when this optimization is used.

Understanding and correctly using the -pc, -Mnobuiltin, and -Kieee switches should enable you to produce the desired and expected precision for calculations which utilize floating-point operations.

Usage:

$ pgf90 -pc 64 myprog.f

-Q

Selects variations for compilation. There are four uses for the -Q option.

Syntax:

-Qdir directory

The first variety, using the dir keyword, lets you supply a directory parameter that indicates the directory where the compiler driver is located.

-Qoption prog,opt

The second variety, using the option keyword, lets you supply the option opt to the program prog. The prog parameter can be one of pgftn, as, or ld.

-Qpath pathname

The third -Q variety, using the path keyword, lets you supply an additional pathname to the search path for the compiler's required .o files.

-Qproduce sourcetype

The fourth -Q variety, using the produce keyword, lets you choose a stop-after location for the compilation based on the supplied sourcetype parameter. Valid sourcetypes are: .i, .c, .s and .o. These indicate, respectively, stop-after preprocessing, compiling, assembling, or linking.

Usage: The following examples show the different -Q options.

$ pgf90 -Qproduce .s hello.f

$ pgf90 -Qoption ld,-s hello.f

$ pgf90 -Qpath /home/test hello.f

$ pgf90 -Qdir /home/comp/new hello.f

Cross-reference: -p

-R<directory>

Valid only on Linux and Solaris86. Passed to the linker. Instructs the linker to hard-code the pathname <directory> into the search path for generated shared object (dynamically linked library) files. Note that there cannot be a space between R and <directory>.

Cross-reference: -fpic, -shared, -G

-r4 and -r8

Interpret DOUBLE PRECISION variables as REAL (-r4) or REAL variables as DOUBLE PRECISION (-r8).

Usage:

$ pgf90 -r4 myprog.f

Cross-reference: -i2, -i4, -i8

-rc

Specifies the name of the driver startup configuration file. If the file or pathname supplied is not a full pathname, the path for the configuration file loaded is relative to the $DRIVER path (the path of the currently executing driver). If a full pathname is supplied, that file is used for the driver configuration file.

Syntax:

-rc [path] filename

Where path is either a relative pathname, relative to the value of $DRIVER, or a full pathname beginning with "/". Filename is the driver configuration file.

Default: The driver uses the configuration file .pgirc.

Usage: In the following example, the file .pgf90rctest, relative to /usr/pgi/linux86/bin, the value of $DRIVER, is the driver configuration file.

$ pgf90 -rc .pgf90rctest myprog.f

Cross-reference: -show

-S

Stops compilation after the compiling phase and writes the assembly-language output to the file filename.s, where the input file is filename.f.

Default: The compiler produces an executable file.

Usage: In this example, pgf90 produces the file myprog.s in the current directory.

$ pgf90 -S myprog.f

Cross-reference: -c, -E, -F, -Mkeepasm, -o

-shared

Valid only on Linux. Passed to the linker. Instructs the linker to produce a shared object (dynamically linked library) file.

Cross-reference: -fpic, -G, -R

-show

Produce driver help information describing the current driver configuration.

Usage: In the following example, the driver displays configuration information to the standard output after processing the driver configuration file.

$ pgf90 -show myprog.f

Cross-reference: -V, -v, -###, -help, -rc

-silent

Do not print warning messages.

Usage: In the following example, the driver does not display warning messages.

$ pgf90 -silent myprog.f

Cross-reference: -v, -V, -w

-time

Print execution times for various compilation steps.

Usage: In the following example pgf90 prints the execution times for the various compilation steps.

$ pgf90 -time myprog.f

Cross-reference: -#

-tp

Set the target architecture. By default, the PGI compilers produce code specifically targeted to the type of processor on which the compilation is performed. In particular, the default is to use so-called p6 instructions wherever possible when compiling on Pentium Pro/II/III or AMD Athlon systems. These executables may not be usable on older generation systems (Pentium, i486, etc). Pentium-specific and Pentium Pro/II/III-specific optimizations can be specified explicitly by using the -tp p5 and -tp p6 options respectively. AMD Athlon-specific code generation can be specified explicitly by using the -tp athlon option. A blended p5/p6 style of code generation can be specified using the -tp px option. Executables produced using -tp px will run on any x86 system.

Syntax:

-tp {p5 | p6 | px | athlon}

Usage: In the following example pgf90 sets the target architecture to Pentium Pro/II/III:

$ pgf90 -tp p6 myprog.f

Default: The default style of code generation is auto-selected depending on the type of processor on which compilation is performed.

-U

Undefines a preprocessor macro. Use the -U option or the #undef preprocessor directive to undefine macros.

Syntax:

-Usymbol

Where symbol is a symbolic name.

Usage: The following examples undefine the macro test.

$ pgf90 -Utest myprog.F

$ pgf90 -Dtest -Utest myprog.F

Cross-reference: -D,-Mnostdde.

-V

Displays additional information, including version messages.

Usage: The following command-line shows the output using the -V option.

$ pgf90 -V myprog.f

Cross-reference: -Minfo, -v

-v

Use the -v option to display the invocations of the compiler, assembler and linker. These invocations are command lines created by the compilation driver from the files and the -W options you specify on the compiler command-line.

Default: The compiler does not display individual phase invocations.

Cross-reference: -Minfo, -V

-W

Passes arguments to a specific phase. Use the -W option to specify options for the assembler, compiler or linker. Note: a given PGI compiler command invokes the compiler driver, which parses the command-line and generates the appropriate commands for the compiler, assembler and linker.

Syntax:

-W{0|a|l},option[,option...]

Where:

0

(the number zero) specifies the compiler.

a

specifies the assembler.

l

(lowercase letter l) specifies the linker.

option

is a string that is passed to and interpreted by the compiler, assembler or linker. Options separated by commas are passed as separate command line arguments.

NOTE: You cannot have a space between the -W and the single-letter pass identifier, between the identifier and the comma, or between the comma and the option.

Usage: In the following example the linker loads the text segment at address 0xffc00000 and the data segment at address 0xffe00000.

$ pgf90 -Wl,-k,-t,0xffc00000,-d,0xffe00000 myprog.f

-w

Do not print warning messages.

7.2 C and C++ -specific Compiler Options

The following options are specific to PGCC C and/or C++.

-A

(pgCC only) Using this option, the PGCC C++ compiler accepts code conforming to the proposed ANSI C++ standard and issues errors for non-conforming code.

Default: By default, the compiler accepts code conforming to the standard C++ Annotated Reference Manual.

Usage: The following command-line requests ANSI conforming C++.

$ pgCC -A hello.cc

Cross-references: -b and +p.

--[no_]alternative_tokens

(pgCC only) Enable or disable recognition of alternative tokens. These are tokens that make it possible to write C++ without the use of the {, }, [, ], #, &, |, ^, and ~ characters. The alternative tokens include the operator keywords (e.g., and, bitand, etc.) and digraphs. The default behavior is --no_alternative_tokens.

-b

(pgCC only) Enable compilation of C++ with cfront 2.1 compatibility. This causes the compiler to accept language constructs that, while not part of the C++ language definition, are accepted by the AT&T C++ Language System (cfront release 2.1). This option also enables acceptance of anachronisms.

Default: The compiler does not accept cfront language constructs that are not part of the C++ language definition.

Usage: In the following example the compiler accepts cfront constructs.

$ pgCC -b myprog.cc

Cross-references: --cfront_2.1, -b3, --cfront_3.0, +p, -A

-b3

(pgCC only) Enable compilation of C++ with cfront 3.0 compatibility. This causes the compiler to accept language constructs that, while not part of the C++ language definition, are accepted by the AT&T C++ Language System (cfront release 3.0). This option also enables acceptance of anachronisms.

Default: The compiler does not accept cfront language constructs that are not part of the C++ language definition.

Usage: In the following example the compiler accepts cfront constructs.

$ pgCC -b3 myprog.cc

Cross-references: --cfront_2.1, -b, --cfront_3.0, +p, -A

--[no_]bool

(pgCC only) Enable or disable recognition of bool. The default value is --bool.

--cfront_2.1

(pgCC only) Enable compilation of C++ with cfront 2.1 compatibility. This causes the compiler to accept language constructs that, while not part of the C++ language definition, are accepted by the AT&T C++ Language System (cfront release 2.1). This option also enables acceptance of anachronisms.

Default: The compiler does not accept cfront language constructs that are not part of the C++ language definition.

Usage: In the following example the compiler accepts cfront constructs.

$ pgCC --cfront_2.1 myprog.cc

Cross-references: -b, -b3, --cfront_3.0, +p, -A

--cfront_3.0

(pgCC only) Enable compilation of C++ with cfront 3.0 compatibility. This causes the compiler to accept language constructs that, while not part of the C++ language definition, are accepted by the AT&T C++ Language System (cfront release 3.0). This option also enables acceptance of anachronisms.

Default: The compiler does not accept cfront language constructs that are not part of the C++ language definition.

Usage: In the following example the compiler accepts cfront constructs.

$ pgCC --cfront_3.0 myprog.cc

Cross-references: --cfront_2.1, -b, -b3, +p, -A

--create_pch filename

(pgCC only) If other conditions are satisfied, create a precompiled header file with the specified name. If --pch (automatic PCH mode) appears on the command line following this option, its effect is erased.

--diag_suppress tag

(pgCC only) Override the normal error severity of the specified diagnostic messages. The message(s) may be specified using a mnemonic error tag or using an error number.

--diag_remark tag

(pgCC only) Override the normal error severity of the specified diagnostic messages. The message(s) may be specified using a mnemonic error tag or using an error number.

--diag_warning tag

(pgCC only) Override the normal error severity of the specified diagnostic messages. The message(s) may be specified using a mnemonic error tag or using an error number.

--diag_error tag

(pgCC only) Override the normal error severity of the specified diagnostic messages. The message(s) may be specified using a mnemonic error tag or using an error number.

--display_error_number

(pgCC only) Display the error message number in any diagnostic messages that are generated. The option may be used to determine the error number to be used when overriding the severity of a diagnostic message.

--[no_]exceptions

(pgCC only) Enable/disable exception handling support. The default is --exceptions.

--instantiation_dir <dirname>

(pgCC only) Defines <dirname> as the instantiation directory. The directory must exist. This switch must appear on both the compile line and the link line. The compiler will not delete objects from this directory.

--[no]llalign

(pgCC only) Do/don't align long long integers on long long boundaries. The default is --llalign.

-M

Generate a list of make dependences and print them to stdout. Compilation stops after the pre-processing phase.

-MD

Generate a list of make dependences and print them to the file <file>.d, where <file> is the name of the file under compilation.

--one_instantiation_per_object

(pgCC only) By default, templates are instantiated in objects that reference them, and multiple instantiations are avoided by calling pgprelnk to modify the template instantiation file (an intermediate file with a .ii extension). This does not work well for template libraries, where the .ii files are not available to the archived .o files. As a result, each file in a template library has to have its own local copy of each template it instantiates.

With the --one_instantiation_per_object implementation, each template instantiation becomes an object in the instantiation directory (default name: Template.dir). As a result, each template can be linked in independently. At link time, pgprelnk removes all the unnecessary instantiations in the local Template.dir objects, and passes the remaining objects to the linker. This is particularly useful in template libraries, where it results in only one instantiation of any template in the library.

The --one_instantiation_per_object flag must appear on both the compile line and the link line. It will create an additional .o in the instantiation directory for each instantiation. If --instantiation_dir is not used (see above), a temporary directory Template.dir is created and used as the instantiation directory, then deleted after linking of the executable.

If you use --one_instantiation_per_object to create your own template libraries, you must add the objects in the Template directory to the archive list. See the example below.

NOTE: Before using this switch for the first time, all .ii files should be removed. Old .ii files will cause unpredictable results. Note that the compiler also creates .ti files for use during instantiation.

--optk_allow_dollar_in_id_chars

(pgCC only) Accept dollar signs ($) in identifiers.

-P

Stops compilation after the preprocessing phase. Use the -P option to halt the compilation process after preprocessing and write the preprocessed output to the file filename.i, where the input file is filename.c or filename.cc.

Use the -.suffix option with this option to save the intermediate file in a file with the specified suffix (see the -.suffix description for details).

Default: The compiler produces an executable file.

Usage: In the following example, the compiler produces the preprocessed file myprog.i in the current directory.

$ pgCC -P myprog.cc

Cross-references: -C, -c, -E, -Mkeepasm, -o, -S

--pch

(pgCC only) Automatically use and/or create a precompiled header file. If --use_pch or --create_pch (manual PCH mode) appears on the command line following this option, its effect is erased.

--pch_dir directoryname

(pgCC only) The directory in which to search for and/or create a precompiled header file. This option may be used with automatic PCH mode (--pch) or manual PCH mode (--create_pch or --use_pch).

--[no_]pch_messages

(pgCC only) Enable or disable the display of a message indicating that a precompiled header file was created or used in the current compilation.

--preinclude=<filename>

(pgCC only) Specifies the name of a file to be included at the beginning of the compilation. This option can be used to set system-dependent macros and types, for example.

--prelink_objects

(pgCC only) Creates the necessary template instantiations for template libraries that reference other template libraries. In previous releases, for example, libraries that reference templates in the Rogue Wave STL would generate undefined template references. Now, when the user builds the library object files with the --one_instantiation_per_object flag, and pre-links the object files with the command:

% pgCC --one_instantiation_per_object --prelink_objects *.o

the pre-linker will instantiate the templates required by the library.

--use_pch filename

(pgCC only) Use a precompiled header file of the specified name as part of the current compilation. If --pch (automatic PCH mode) appears on the command line following this option, its effect is erased.

--[no_]using_std

(pgCC only) Enable or disable implicit use of the std namespace when standard header files are included.

Default: The default is --using_std.

Usage: The following command-line disables implicit use of the std namespace:

$ pgCC --no_using_std hello.cc

-t

(pgCC only) Control instantiation of template functions.

Syntax:

-t [arg]

where arg is one of the following:

all

Instantiates all functions whether or not they are used.

local

Instantiates only the functions that are used in this compilation, and forces those functions to be local to this compilation. Note: this may cause multiple copies of local static variables. If this occurs, the program may not execute correctly.

none

Instantiates no functions. (This is the default.)

used

Instantiates only the functions that are used in this compilation.

Usage: In the following example all templates are instantiated.

$ pgCC -tall myprog.cc



[*]

According to Intel documentation, this only affects the operations of add, subtract, multiply, divide, and square root.

  
