Intel® Performance Tuning Utility 3.1 Update 3

Author: Intel® Software Network
Published On: Monday, November 12, 2007 | Last Modified On: Wednesday, July 09, 2008


 

Product Overview

The Intel® Performance Tuning Utility (Intel® PTU) is a powerful, cross-platform performance analysis tool set.  It offers a low-overhead statistical call graph, and other traditional features such as identification of the hottest modules and functions of an application, tracking of call sequences and identification of performance-critical source code.  Intel® PTU also has new, more powerful data collection, analysis, and visualization capabilities.  For more experienced performance tuners, Intel® PTU offers insight into processor hardware event counters for a more detailed look into the performance of areas such as the memory system and architecture. The tool can also map your issues back to the source code. If you are analyzing an application for which you don’t have the source, Intel® PTU allows you to represent data with basic block granularity and provides a function control flow graph to navigate the disassembly. The Intel® Performance Tuning Utility is available for both Windows* and Linux*.

The Intel® Performance Tuning Utility offers:

  • Statistical Call Graph - Profiles with low overhead to detect where time is spent in your application
  • Event Based Sampling - Uses the processor’s onboard performance monitoring hardware to get a detailed look into performance issues
  • Basic Block Analysis - Displays hotspots with basic block granularity and generates a control flow graph for advanced analysis of an application, even without the source code
  • Events over IP graph - Generates a histogram of performance events distributed over application code
  • Loop Analysis - Identifies loops and recursion in your application to aid optimization
  • Result difference - Compares the results of multiple runs to measure changes in performance
  • Data Access Profiling - Identifies memory hotspots and relates them to code hotspots
  • Heap Profiler - Identifies dynamic memory usage by the application and can help identify memory leaks
  • Instrumentation-based Call Graph, Call Count - Provides exact call graph and call count information for your application
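To make the sampling-based features above more concrete, here is a minimal sketch of how sampled instruction pointers (IPs) can be attributed to functions to find hotspots. It illustrates the general statistical technique only and is not how Intel® PTU is implemented. The symbol table, addresses, and sample values are hypothetical, and the function names are borrowed from the screenshot captions purely for illustration.

```python
import bisect
from collections import Counter


def attribute_samples(samples, symbols):
    """Count samples per function.

    samples: iterable of sampled instruction-pointer addresses.
    symbols: list of (start_address, name) tuples sorted by start_address.
    A sample belongs to the last function whose start address is <= the IP.
    """
    starts = [start for start, _ in symbols]
    counts = Counter()
    for ip in samples:
        idx = bisect.bisect_right(starts, ip) - 1
        name = symbols[idx][1] if idx >= 0 else "<unknown>"
        counts[name] += 1
    return counts


# Hypothetical symbol table and IP samples (assumed values, not real data).
symbols = [(0x1000, "outer_loop"), (0x1400, "iteration_loop"), (0x2000, "calc_noise")]
samples = [0x1004, 0x1410, 0x1420, 0x1FFF, 0x2008, 0x0800]

hotspots = attribute_samples(samples, symbols)
for name, count in hotspots.most_common():
    print(f"{name:16s} {count} samples")
```

The function with the most samples is the statistical hotspot. Bucketing the same samples by address range instead of by function gives a histogram in the spirit of the Events over IP graph.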

Version 3.1 Update 3 of the Intel Performance Tuning Utility introduces the following new features:

  • New CPU support - Recognizes new CPUs and performs basic sampling on them
  • Data Access Profiling - Filter memory data by instructions in Source View, Latency Histogram, Utility charts with Access Stride and Array-of-structures distribution, Working Set chart, global data objects granularity for Memory Hotspots, and other GUI improvements
  • Instrumentation-based Call Graph, Call Count, and Heap Profiling - Supports Windows* on Intel® 64 architecture
  • Profile descriptions - Edit embedded descriptions for user defined profiles
  • Multiple bug fixes and performance improvements

New in the recently posted v3.1 are predefined Intel® Core™2 Microarchitecture event profiles for Windows* and Linux*, accompanied by embedded descriptions. These profiles were developed by Intel's application engineers to help you make the best use of these events. They are based on the analysis methodologies for the Intel® Core™2 Microarchitecture described at http://www.devx.com/go-parallel/Door/32532#optimization and http://www.devx.com/go-parallel/Door/33294

The performance analysis capabilities of this utility are in many ways similar to those of the Intel® VTune™ Performance Analyzer. However, this technology includes features that may be of more value to those who are more experienced with performance tuning. This utility explores some new approaches to data collection and user interface techniques. One of our goals in releasing this utility to the public, in addition to providing our customers with powerful performance analysis tools, is to get feedback on what you do or don’t like. We appreciate your help, which will make future versions of this utility, or other Intel software products, even better.

As the capabilities in this utility are experimental, we cannot guarantee any level of support for them. Some of the features and interface designs may find their way into released and supported products, some may not.

The current version 3.1 of this utility is built as a plug-in to the Eclipse environment. The Intel® PTU distribution package includes an Eclipse environment with the utility already integrated into it.

Parallelization Made Easier with Intel® Performance Tuning Utility was published in The Intel Technology Journal. The paper explores how the Intel® Performance Tuning Utility significantly improves on the data collection and display features available and adds capabilities needed for enabling and analyzing parallel execution.


Technical Requirements

1. You must have a license for the Intel VTune™ Performance Analyzer product on your system. If you don't, you can acquire the commercial product or try an evaluation copy.


2. Please see the release notes for more details on technical requirements, including the list of supported processors and operating systems.

Screenshots

 

 The ‘outer_loop’ function in different call branches can be the first target for parallelization. The heavy loop is detected inside the function and its caller function.

 The Event over IP view (at the bottom) helps identify the most time-consuming code section within the bladeenc.exe module. The peak occurs at the ‘iteration_loop’ function.

 The heaviest source line (#597) is disassembled on the right and grouped into basic blocks. Analyze the execution flow from the Flow Graph at the bottom.

The performance of the ‘calc_noise’ function is improved. The difference between two collected experiments is 39.51 msec.

 

Frequently Asked Questions

Q - How do I get started using the Intel® Performance Tuning Utility?

A - There are two things that we recommend you do before starting to use this tool. First, make sure that you have the Intel VTune™ Performance Analyzer product with an unexpired support period on your system. If you don't, or aren’t sure, you can acquire the commercial product or try an evaluation copy. Next, make sure that you have reviewed the installation and usage guide. This guide provides a screenshot-by-screenshot walkthrough of all user interface interactions needed to invoke and effectively use Intel® PTU. This guide, along with the User Guide, helps you ensure that the installation of Intel® PTU does not inadvertently impact your usage of the VTune Performance Analyzer.

Q - Where can I get support for the use of this utility?

A - We encourage you to visit our support forums for support.

Q - What are the licensing terms that spell out how exactly I can use this utility?

A - The licensing terms are listed on the download page.

Q - Can you tell me a little more about the results difference feature of Intel® PTU?

A - The results difference feature allows you to see the performance difference made either by changing compiler switches with the same compiler or by changing the compiler used to generate the application.  Users of this capability have seen significant productivity gains from being able to see the performance impact of compile-time changes as soon as a new build is ready, which is all the more important during regression testing.
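As a rough illustration of the idea behind comparing two runs, the sketch below computes per-function time deltas between a baseline build and a rebuilt one. It is a minimal sketch under stated assumptions, not the utility's own method or file format: the dictionaries, the timings in milliseconds, and the `diff_profiles` helper are all hypothetical, and the function names are borrowed from the screenshot captions only as labels.

```python
def diff_profiles(before, after):
    """Return (function, delta) pairs, where delta = after - before.

    Functions missing from one run are treated as taking 0.0 time in it.
    Results are sorted so the largest improvement (most negative) comes first.
    """
    names = set(before) | set(after)
    deltas = {name: after.get(name, 0.0) - before.get(name, 0.0) for name in names}
    return sorted(deltas.items(), key=lambda item: (item[1], item[0]))


# Hypothetical per-function times in msec for two builds (assumed values).
baseline = {"calc_noise": 120.0, "iteration_loop": 300.0, "outer_loop": 80.0}
rebuilt = {"calc_noise": 80.0, "iteration_loop": 305.0, "outer_loop": 80.0}

report = diff_profiles(baseline, rebuilt)
for name, delta in report:
    print(f"{name:16s} {delta:+8.2f} msec")
```

A negative delta means the function got faster in the second run. Sorting by delta puts the biggest wins first and the regressions last, which is handy when checking a new build during regression testing.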

Q - Is this version backward compatible with v2.0 or v3.0? Can I view in v3.1 results collected with previous versions?

A - Unfortunately, no. You will probably need to run your analysis again with Intel PTU v3.1, although re-converting the results may sometimes help.

Q - What is new with Intel® Performance Tuning Utility 3.1?

A - See the appropriate section of the product release notes.


Primary Technology Contacts

Vasanth Tovinkere:
Vasanth Tovinkere is a Sr. Technical Consulting Engineer at the Intel Performance, Analysis and Threading Lab in the Developer Products Division. His current role involves supporting and defining new directions for the next generation of performance analyzers and Intel® Threading Tools, and consulting with strategic customers through the Tools Immersion Program, focusing on threading for multi-core architectures. He has also been involved in the development of automatic semantic detectors for digital sports technologies in Intel Labs. His research interests include data mining of threading performance data and fuzzy inference engines. Vasanth began his career at Intel in 1997 as an engineer, researching threading behavior and performance and working with early adopters on Wall Street to enable them for multi-processor architectures. Prior to joining Intel, he was involved in the development of automated fuzzy pattern recognition algorithms for NASA’s Mission to Planet Earth Program.

Alexei Alexandrov:
Alexei Alexandrov is a senior software developer in Intel’s Performance, Analysis and Threading Lab. He lives and works in Nizhny Novgorod, Russia. Prior to joining Intel in February 2004, Alexei worked for a local company in Saratov (a city about 350 miles south of Nizhny Novgorod) doing database-related work. Rewinding even earlier, Alexei worked as a software developer for a company which produced (and seems to still produce) computer-driven machines.

Anatoly Lubomirov:
Anatoly Lubomirov is a usability engineer at Intel’s Performance, Analysis and Threading Lab. He joined Intel in February 2002 and now focuses on usability and user interface design for a new generation of performance analysis tools.

If you have any questions, please contact our support team.