Intel® Software Network Knowledge Base Wiki


Use the Microsoft C++ Compiler for the Pentium® M Processor

Version 2, Changed by KYLE LEWIS on 1/3/2008
Created by: KYLEX.S.LEWIS@INTEL.COM

Challenge

Set up the Microsoft .NET* 2003 C++ Compiler to produce the most benefit for applications running on the Pentium® M processor, as well as the Pentium® 4 processor. The fastest and most efficient way to help achieve the best performance for your application is by using a current-generation compiler.

Solution

Set the switches that are likely to produce the most benefit from the compiler for your application. For applications that must run on both the Pentium 4 and Pentium M processors with minimum binary code size and a single code path, a compatible code strategy is best. Optimizing your application for the Intel® NetBurst™ microarchitecture is likely to deliver high performance, efficiency, and scalability on current and future generations of IA-32 processors, including the Pentium M processor.

The Microsoft .NET 2003 C++ Compiler includes important capabilities to improve performance on the latest generation of Intel processors. The most important features are the Pentium 4 and Pentium M tuned optimizations enabled by the /G7 flag, the /GL whole-program optimization switch (similar to the Intel® C++ compiler -Qipo switch), and the /arch:SSE2 flag, which can emit scalar SSE/SSE2 code.

By default, the Microsoft C++ compiler is set to the /O2 option for release code, which optimizes for the best performance. If code size is critical, the /O1 option may be used to help reduce any code size expansion due to optimization. It is best to experiment with both options and measure the results, because in some cases the /O1 option may produce faster code.
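One way to run that experiment is to time the same workload in a /O2 build and a /O1 build. The following is a minimal sketch, not a rigorous benchmark: the file name bench.cpp, the functions workload and time_workload, and the array-summing workload itself are all hypothetical stand-ins for your application's hot code.

```cpp
// bench.cpp (hypothetical file name)
// Build it twice and compare the times reported by each build, for example:
//   cl /O2 bench.cpp
//   cl /O1 bench.cpp
#include <cassert>
#include <cstddef>
#include <ctime>

// Hypothetical workload: repeatedly sums the squares of a small array.
double workload(std::size_t repeats)
{
    static const std::size_t N = 1024;
    double data[N];
    for (std::size_t i = 0; i < N; ++i)
        data[i] = static_cast<double>(i % 7);

    double total = 0.0;
    for (std::size_t r = 0; r < repeats; ++r)
        for (std::size_t i = 0; i < N; ++i)
            total += data[i] * data[i];
    return total;
}

// Runs the workload, stores its result in *result, and returns the
// elapsed processor time in seconds as measured by std::clock().
double time_workload(std::size_t repeats, double* result)
{
    assert(result != 0);
    std::clock_t start = std::clock();
    *result = workload(repeats);
    std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_workload from main with a large repeat count and printing both the time and the result keeps the optimizer from discarding the computation. Timings from std::clock() are coarse, so several runs per build give a more trustworthy comparison.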

The .NET 2003 C++ compiler includes switches to target specific processor implementations: the /G6 switch for Pentium® Pro, Pentium® II, and Pentium® III processors and the /G7 switch for Pentium 4 and later processors. By default, the .NET compiler uses the /G6 switch, but the /G7 switch is recommended, since it is more likely to yield significant performance benefits on the latest processors.

Like the analogous switch in the Intel compiler, the /G7 switch tells the compiler to optimize for the Intel NetBurst microarchitecture and Pentium M processor. With this switch, the compiler does not introduce any Pentium 4 processor-specific or Pentium M processor-specific instructions into the binary, so this switch is backward compatible for use across processor generations. The use of this switch helps ensure the best performance on Pentium 4 processors and Pentium M processors through optimizations similar to those in the Intel C++ compiler.

You should also use the compiler switches for interprocedural optimization. The /GL whole-program optimization switch enables the compiler to perform optimizations with information on all modules in the program. Whole-program optimization is off by default and must be explicitly enabled. With information on all modules, the compiler can optimize the use of registers across function boundaries and inline a function in a module, even when the function is defined in another module. If you compile your program with /GL, you should also use the /LTCG linker option to create the output file.
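The cross-module inlining case can be illustrated with a small sketch. The two source files, the functions scale and sum_scaled, and the file names in the command line are hypothetical; they are shown in one listing only for brevity. Without /GL, the call to scale in the loop is an ordinary call into another module; with /GL and /LTCG, the compiler has the information it needs to inline it, although whether it does so remains its decision.

```cpp
// Hypothetical build, assuming main.cpp supplies main():
//   cl /O2 /G7 /GL main.cpp scale.cpp sum.cpp /link /LTCG
#include <cassert>
#include <cstddef>

// ---- scale.cpp (hypothetical module 1) ----
double scale(double x, double factor)
{
    return x * factor;
}

// ---- sum.cpp (hypothetical module 2) ----
// Normally this declaration would come from a shared header.
double scale(double x, double factor);

// Calls scale() once per element; this call crosses a module boundary,
// so it is a candidate for inlining only under whole-program optimization.
double sum_scaled(const double* values, std::size_t count, double factor)
{
    assert(values != 0 || count == 0);
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        total += scale(values[i], factor);
    return total;
}
```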

Finally, it is important to use switches for SIMD optimizations. The .NET 2003 compiler supports generation of code using Streaming SIMD Extensions (SSE) and Streaming SIMD Extensions 2 (SSE2) instructions. For example, /arch:SSE allows the compiler to use SSE instructions, and /arch:SSE2 allows the compiler to use SSE2 instructions.

The optimizer chooses when and how to make use of the SSE and SSE2 instructions when /arch is specified. Currently, SSE and SSE2 instructions are used for some scalar floating-point computations, when it is determined that it is faster to use the SSE/SSE2 instructions and registers rather than the x87 floating-point register stack. As a result, your code will actually use a mixture of both x87 and SSE/SSE2 for floating-point computations. Additionally, with /arch:SSE2, SSE2 instructions may be used for some 64-bit integer operations.
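The kinds of code affected can be sketched as follows. The functions dot and sum_u64 are hypothetical examples of a scalar floating-point computation and a 64-bit integer operation, which are the two cases named above as candidates for SSE/SSE2 code generation. The helper arch_level relies on the _M_IX86_FP macro, which Microsoft compilers define on x86 to report the /arch setting; whether the .NET 2003 compiler already defines it is an assumption to verify, so the code falls back to -1 when the macro is absent.

```cpp
// Hypothetical build: cl /O2 /G7 /arch:SSE2 fp.cpp
#include <cassert>
#include <cstddef>

#if defined(_MSC_VER)
typedef unsigned __int64 u64;    // Microsoft-specific 64-bit integer type
#else
typedef unsigned long long u64;  // fallback for other compilers
#endif

// Returns the /arch level reported by the compiler (0 = none, 1 = SSE,
// 2 = SSE2), or -1 if the compiler does not define _M_IX86_FP.
int arch_level()
{
#if defined(_M_IX86_FP)
    return _M_IX86_FP;
#else
    return -1;
#endif
}

// Scalar single-precision arithmetic: with /arch set, the optimizer may
// use SSE registers here instead of the x87 stack.
float dot(const float* a, const float* b, std::size_t n)
{
    assert((a != 0 && b != 0) || n == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += a[i] * b[i];
    return total;
}

// 64-bit integer addition: with /arch:SSE2, the compiler may use SSE2
// instructions for some operations of this kind.
u64 sum_u64(const u64* values, std::size_t n)
{
    assert(values != 0 || n == 0);
    u64 total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += values[i];
    return total;
}
```

Because the optimizer decides case by case, inspecting the generated assembly (for example, with the /FA listing option) is the reliable way to see which instructions were actually chosen.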

In addition to making use of the SSE and SSE2 instructions, the compiler also makes use of other instructions that are present on the processor revisions that support SSE and SSE2, such as the CMOV instruction.
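As an illustration, a conditional update such as the one below is the kind of code a compiler can implement with CMOV instead of a branch. The function max_of is a hypothetical example, and nothing guarantees that CMOV will be chosen for it; the compiler remains free to emit a conditional jump.

```cpp
#include <cassert>
#include <cstddef>

// Returns the largest element of a non-empty array. The conditional
// update of 'best' is a candidate for a conditional move (CMOV).
int max_of(const int* values, std::size_t count)
{
    assert(values != 0 && count > 0);
    int best = values[0];
    for (std::size_t i = 1; i < count; ++i)
        best = (values[i] > best) ? values[i] : best;
    return best;
}
```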

Note that the .NET 2003 compiler does not vectorize for SSE or SSE2 instructions, which limits the overall performance benefit of the .NET 2003 compiler for floating-point-intensive applications, compared to the Intel C++ compiler.

Source

Optimizing Software for Intel® Centrino® Mobile Technology and Intel® NetBurst™ Microarchitecture

 


