The Intel® Performance Tuning Utility (PTU) is a cross-platform performance analysis tool set. Alongside traditional features such as identifying the hottest modules and functions of an application, tracking call sequences, and identifying performance-critical source code, Intel PTU adds more powerful capabilities for data collection, analysis, and visualization. For experienced tuners, Intel PTU exposes the processor hardware event counters for a detailed look into memory system performance, architectural tuning, and similar issues, and it can relate those issues back to the source code. If you are analyzing an application for which you don’t have the source, Intel PTU lets you view data at basic block granularity and provides a function control flow graph for navigating the disassembly. The Intel Performance Tuning Utility is available for both Windows and Linux.
The Intel® Performance Tuning Utility offers:
Statistical Call Graph: Profiles with low overhead to detect where time is spent in your application.
Event Based Sampling: Uses the processor’s onboard performance monitoring hardware to get a detailed look into performance issues.
Basic Block Analysis: Displays hotspots with basic block granularity and generates a control flow graph for advanced analysis of the application, even without the source code.
Events over IP Graph: Generates a histogram of performance events distributed over application code.
Loop Analysis: Identifies loops and recursion in your application to aid optimization.
Result Difference: Compares the results of multiple runs to measure changes in performance.
Data Access Profiling: Identifies memory hotspots and relates them to code hotspots.
Heap Profiler: Identifies dynamic memory usage by the application and can help identify memory leaks.
Instrumentation-based Call Graph, Call Count: Provides exact call graph and call count information for your application.
Version 3.1 Update 3 of the Intel Performance Tuning Utility introduces the following new features and enhancements:
Notes:
This section details the processor, memory, disk space, and operating system requirements for installing and using various components of the Intel® Performance Tuning Utility. The product was validated on platforms with the following parameters.
Processor Requirements
| Processor | IA-32 | Intel® 64 | IA-64 |
|---|---|---|---|
| Intel® Celeron® processor | + | | |
| Intel® Celeron® D processor | + | | |
| Intel® Pentium® 4 processor | + | + | |
| Intel® Pentium® D processor | + | + | |
| Intel® Pentium® 4 processor Extreme Edition | + | | |
| Intel® Xeon® processor | + | | |
| Intel® Xeon® DP processor | + | + | |
| Intel® Xeon® MP processor | + | + | |
| Intel® Pentium® M processor | + | | |
| Mobile Intel® Pentium® 4 processor | + | | |
| Mobile Intel® Celeron® processor | + | | |
| Intel® Celeron® M processor | + | | |
| Intel® Core™ Duo processor | + | | |
| Intel® Core™ 2 Duo processor | + | + | |
| Intel® Xeon® processor 50xx, 51xx, 7xxx series | + | + | |
| Intel® Itanium® 2 processor | | | + |
| Intel® Itanium® 2 processor series 9000 | | | + |
To view the full list of currently supported processors, enter:
>vtsarun -cl
Memory Requirements
The application you are tuning may consume a large amount of memory and disk space. If so, make sure you have sufficient memory and disk space to run both your application and the Intel Performance Tuning Utility.
| Interface | RAM | Swap space |
|---|---|---|
| Command line collector and viewer | > 256 MB | > 256 MB |
| Loop profiling enabled | > 700 MB | > 700 MB |
| Graphical User Interface | > 512 MB | > 512 MB |
Disk Space Requirements
| Component | Disk Space |
|---|---|
| Total (archive file, its extracted files, and all installed components) | 300-400 MB |
Operating System Requirements
The Intel Performance Tuning Utility was tested on the following Windows* and Linux* distributions:
| Operating System | IA-32 | Intel® 64 | IA-64 |
|---|---|---|---|
| Microsoft* Windows XP Professional Service Pack 2 | + | | |
| Microsoft* Windows XP Professional x64 Edition Service Pack 1 | | + | |
| Microsoft* Windows Server 2003 Enterprise Edition Service Pack 1 | + | | |
| Microsoft* Windows Server 2003 Enterprise x64 Edition Service Pack 1, 2 | | + | |
| Microsoft* Windows Server 2008 | | + | |
| Microsoft* Windows Vista* (Ultimate, Enterprise) | + | + | |
| Microsoft* Windows Vista* Service Pack 1 | | + | |
| Red Hat* Fedora* Core 5 (kernel 2.6.15) | | + | |
| Red Hat* Fedora* 7 (kernel 2.6.21-1.3194.fc7) | | + | |
| Red Flag Linux* 5.0 DC Server (kernel 2.6.9-11) | | | + |
| Red Hat* Enterprise Linux* Advanced Server 3.0 Update 6 (kernel 2.4.21-37) | + | + | + |
| Red Hat* Enterprise Linux* Advanced Server 4.0 Update 3, 4, 5 (kernel 2.6.9) | + | + | + |
| Red Hat* Enterprise Linux* Advanced Server 5.0 (kernel 2.6.18-8) | + | + | + |
| Red Hat* Enterprise Linux* Advanced Server 5.1 (kernel 2.6.18-53) | + | + | |
| SuSE* Linux* Enterprise Server 9 Service Pack 3 (kernel 2.6.5) | + | + | + |
| SuSE* Linux* Enterprise Server 10 (kernel 2.6.16.21-0.8) | + | + | + |
| SuSE* Linux* Enterprise Server 10 Service Pack 1 (kernel 2.6.16.46-0.12) | + | + | |
| Turbolinux* 10 (kernel 2.6.9-5.15) | | | + |
The Intel Performance Tuning Utility works with ALL compilers that follow industry standard object code formats. It was tested on applications built with the following compilers:
Java Environment Requirements
The graphical user interface (GUI) of the Intel Performance Tuning Utility requires Eclipse* 3.2, EMF* 2.2, and GEF* 3.2 to be installed. The Eclipse environment, in turn, requires a Java* Virtual Machine. Refer to <Eclipse_home>/readme/readme_eclipse.html (Running Eclipse chapter) for the list of JVMs supported by Eclipse. The Intel PTU package includes all the components listed above.
For Intel® Performance Tuning Utility installation details, refer to the Installation Guide (INSTALL.txt).
The documentation for the Intel Performance Tuning Utility is presented in the following formats:
Reference Guide provides reference information about instructions, events, and penalties for the supported processors. To access the Reference Guide, go to the Eclipse* Help menu > Help Contents and select the Intel(R) Performance Tuning Utility book from the table of contents.
Information on Intel® software development products is available at http://www.intel.com/cd/software/products/asmo-na/eng/index.htm. Visit the following product-related sites for additional information:
Statistical Call Graph Collection Problems and Limitations
You can resolve some of the stack walking issues described above by generating full unwind information. Use the -fasynchronous-unwind-tables option for GCC and the -fexceptions option for the Intel C compiler. To make sure that your executable (and shared libraries) contain this information, run the objdump -h <binary> command. The output should list an .eh_frame_hdr section. For C++ programs, exception handling tables are generated by default; however, if you switched off exception handling with the -fno-exceptions option, you need to force generation of exception handling tables or frame pointers. To do this in GCC, use the -fasynchronous-unwind-tables or -fp option; in ICC, you may use only the -fp option.
If this does not help, reduce the optimization level, if possible.
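The objdump check described above can be scripted when many binaries need to be verified. The following is a minimal sketch, not part of Intel PTU: the helper names has_section and binary_has_unwind_info are hypothetical, and the parser assumes the usual GNU objdump -h layout, in which each section row starts with a numeric index followed by the section name.

```python
import subprocess


def has_section(objdump_output: str, section: str = ".eh_frame_hdr") -> bool:
    """Return True if `section` is listed in the text printed by `objdump -h`.

    Assumption: section rows start with a numeric index followed by the name,
    as in GNU objdump output ("15 .eh_frame_hdr 0000003c ...").
    """
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == section:
            return True
    return False


def binary_has_unwind_info(path: str) -> bool:
    """Run `objdump -h <path>` and look for the .eh_frame_hdr section."""
    result = subprocess.run(
        ["objdump", "-h", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if objdump fails
    )
    return has_section(result.stdout)
```

Calling binary_has_unwind_info on your executable and on each shared library it loads shows which of them still need to be rebuilt with unwind tables. It requires objdump to be available on the PATH.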
Sampling Collection Problems and Limitations
Heap Profiler, Exact Call Graph, Call Count Problems and Limitations
Data Access Profiler Problems and Limitations
Data profiling is possible on the following systems:
| Processor | Windows IA-32 | Windows Intel® 64 | Windows IA-64 | Linux IA-32 | Linux Intel® 64 | Linux IA-64 |
|---|---|---|---|---|---|---|
| Intel® Pentium® 4 processor | + | + | | - | - | |
| Intel® Core™ 2 Duo processor | + | + | | - | + | |
| Intel® Itanium® 2 processor series 9000 | | | + | | | + |
On systems with the Intel® Core™ 2 Duo processor, some memory load instructions may use the same register as source and destination, for example mov [rax], ax. If samples fall on such instructions, the data profiling view ignores them because the data address of the load cannot be calculated in this case.
Your feedback is very important to us. To report an issue and receive a technical answer for the tools provided in this product, visit the web site where you got the package. That web page describes the available discussion forums. We do not provide technical support for the tools inside this product.
Diagnostic and Logging
While running, the Intel Performance Tuning Utility logs the experiment workflow. Log files are created in the current user's directory for temporary data. For example, to reach the log location, type:
Linux: cd /tmp/ptu-log-${USER}
Windows: cd %TEMP%/ptu-log-%USERNAME%, or type %TEMP%/ptu-log-%USERNAME% in the Explorer address bar and press Enter.
The ptu-log-<username> folder contains the history of all executed commands in the file history.txt, along with folders that hold command processing details. To give the response team information about a problem, archive the experiment and ptu-log-<username> directories and send them along with the problem report for further investigation.
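Locating and archiving the log directory can be automated before filing a report. The sketch below is an illustration only and is not shipped with Intel PTU. The helper names ptu_log_dir and archive_logs are hypothetical, and it assumes that Python's tempfile.gettempdir() resolves to the same temporary directory that the tool uses (/tmp on Linux, %TEMP% on Windows). The experiment directory is not covered and has to be archived separately.

```python
import getpass
import shutil
import tempfile
from pathlib import Path


def ptu_log_dir(temp_root=None, user=None) -> Path:
    """Build the ptu-log-<username> path under the temporary directory.

    Assumption: the tool's temporary directory matches tempfile.gettempdir().
    Pass temp_root or user explicitly to override either part.
    """
    root = Path(temp_root) if temp_root is not None else Path(tempfile.gettempdir())
    name = user if user is not None else getpass.getuser()
    return root / f"ptu-log-{name}"


def archive_logs(log_dir, dest_base) -> Path:
    """Pack log_dir into <dest_base>.zip and return the archive path."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        raise FileNotFoundError(f"log directory not found: {log_dir}")
    archive = shutil.make_archive(
        str(dest_base),
        "zip",
        root_dir=log_dir.parent,  # archive paths are relative to this directory
        base_dir=log_dir.name,    # only the ptu-log-<username> folder is packed
    )
    return Path(archive)
```

For example, archive_logs(ptu_log_dir(), "ptu-report") writes ptu-report.zip to the current working directory, with history.txt stored under the ptu-log-<username> folder inside the archive.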