Intel® Software Network Knowledge Base Wiki




Use Software Data Prefetch on 32-Bit Intel® Architecture

Version 3, Changed by LINDA SWINK on 3/21/2008
Created by: KYLEX.S.LEWIS@INTEL.COM

Challenge

Use software data prefetch to hide the latency of data access in performance-critical sections of application code. A prefetch instruction fetches data in advance of its actual use. Prefetch instructions do not change the user-visible semantics of a program, although they may affect its performance: they merely provide a hint to the hardware and generally do not generate exceptions or faults.

The prefetch instructions load either non-temporal or temporal data into the specified cache level. Both the data-access type and the cache level are specified as hints. Depending on the implementation, the instruction fetches 32 or more aligned bytes, including the byte at the specified address, into the cache levels named by the instruction.
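
Because the fetched block is aligned, a prefetch of any address inside a block brings in the whole block that contains it. The following minimal sketch shows that arithmetic. The helper name prefetch_block_base is hypothetical, and it assumes the block size is a power of two (32 and 128 are the fetch sizes this article lists).

```c
#include <stdint.h>

/* Hypothetical helper: return the start address of the aligned block
 * that contains `addr`. Assumes `block_bytes` is a power of two,
 * for example 32 or 128. */
uintptr_t prefetch_block_base(uintptr_t addr, uintptr_t block_bytes)
{
    /* Clearing the low bits rounds the address down to the block start. */
    return addr & ~(block_bytes - 1);
}
```

For example, with 32-byte blocks the address 0x1005 falls in the block starting at 0x1000, so prefetching 0x1005 and prefetching 0x1010 would request the same block.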

Excessive use of prefetch instructions may waste memory bandwidth and result in a performance penalty due to resource constraints. Used selectively, however, the prefetch instructions can lessen the overhead of memory transactions by preventing cache pollution and by using the caches and memory efficiently. This is particularly important for applications that share critical system resources, such as the memory bus.
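
One way to avoid redundant prefetches is to issue a single hint per fetched block rather than one per element. The sketch below illustrates the idea with the _mm_prefetch intrinsic from xmmintrin.h. The function name, the one-block look-ahead, and the line_bytes parameter are assumptions made for illustration, and the code assumes the array starts on a block boundary. It returns the number of hints issued so the effect is easy to observe.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#else
#define PREFETCH_T0(p) ((void)(p))  /* no SSE: compile the hint away */
#endif

/* Hypothetical example: multiply each element by k, issuing at most one
 * prefetch per block of `line_bytes` bytes. Returns the number of
 * prefetch hints issued. */
size_t scale_with_line_prefetch(float *data, size_t n, float k,
                                size_t line_bytes)
{
    size_t per_line = line_bytes / sizeof(float);
    size_t issued = 0;

    if (per_line == 0)
        per_line = 1;

    for (size_t i = 0; i < n; i++) {
        /* Only the first element of each block triggers a hint,
         * and only if the next block is still inside the array. */
        if (i % per_line == 0 && i + per_line < n) {
            PREFETCH_T0(&data[i + per_line]);
            issued++;
        }
        data[i] *= k;
    }
    return issued;
}
```

With 256 floats and 128-byte blocks, the loop issues 7 hints instead of the 256 that a per-element prefetch would generate.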

Solution

Use the prefetch instructions for predictable memory-access patterns, in time-consuming innermost loops, and at locations where the execution pipeline may stall if data is not available. Prefetching is recommended only if the data does not fit in cache. The instructions are designed mainly to improve application performance by hiding memory latency in the background while other work proceeds. Segments of an application that access data in a predictable manner, for example arrays traversed with known strides, are good candidates for prefetching.
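
A strided array traversal of the kind described above might be prefetched as in the following sketch. It uses the _mm_prefetch intrinsic from xmmintrin.h. The function name and the PREFETCH_DISTANCE value are hypothetical; a suitable distance depends on the processor and the loop body and would need to be tuned by measurement.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#else
#define PREFETCH_T0(p) ((void)(p))  /* no SSE: compile the hint away */
#endif

/* Hypothetical tuning value: how many iterations ahead to prefetch. */
#define PREFETCH_DISTANCE 64

/* Sum every `stride`-th element of `data`, hinting at the element that
 * will be needed PREFETCH_DISTANCE iterations from now. */
double sum_strided(const double *data, size_t n, size_t stride)
{
    double total = 0.0;

    if (stride == 0)
        return total;  /* avoid an endless loop on a zero stride */

    for (size_t i = 0; i < n; i += stride) {
        size_t ahead = i + PREFETCH_DISTANCE * stride;
        if (ahead < n)
            PREFETCH_T0(&data[ahead]);  /* a hint only; the result is unchanged */
        total += data[i];
    }
    return total;
}
```

The bounds check keeps the hint inside the array. Since a prefetch is only a hint, removing it would leave the returned sum the same and change only the timing.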

Streaming SIMD Extensions include four prefetch instructions, covering two types of operation: one non-temporal instruction and three temporal instructions. The behavior of each prefetch instruction is implementation-specific, so applications need to be tuned to each processor implementation to maximize performance.

Note: At the time of prefetch, if the data is already found in a cache level that is closer to the processor than the cache level specified by the instruction, no data movement occurs.

The non-temporal instruction is prefetchnta. On the Pentium 4 processor it fetches the data into one way of the second-level cache, which minimizes cache pollution.

The temporal instructions are as follows:

  • prefetcht0 – fetches the data into all cache levels; on the Pentium® 4 processor, that means into the second-level cache.
  • prefetcht1 – identical to prefetcht0 on the Pentium 4 processor.
  • prefetcht2 – identical to prefetcht0 on the Pentium 4 processor.
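
In C, the four instructions are commonly reached through the _mm_prefetch intrinsic in xmmintrin.h, with the hint constants _MM_HINT_NTA, _MM_HINT_T0, _MM_HINT_T1, and _MM_HINT_T2. The sketch below wraps them in one function. The enum and function names are hypothetical and are not part of any Intel API. The hint must be a compile-time constant, which is why the wrapper uses a switch.

```c
#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */
#define HAVE_SSE_PREFETCH 1
#else
#define HAVE_SSE_PREFETCH 0
#endif

/* Hypothetical names for the four prefetch flavors. */
enum prefetch_kind { PF_NTA, PF_T0, PF_T1, PF_T2 };

/* Issue the prefetch hint selected by `kind` for the address `p`.
 * On builds without SSE support this does nothing. */
void prefetch_hint(const void *p, enum prefetch_kind kind)
{
#if HAVE_SSE_PREFETCH
    const char *addr = (const char *)p;
    switch (kind) {
    case PF_NTA: _mm_prefetch(addr, _MM_HINT_NTA); break; /* prefetchnta */
    case PF_T0:  _mm_prefetch(addr, _MM_HINT_T0);  break; /* prefetcht0  */
    case PF_T1:  _mm_prefetch(addr, _MM_HINT_T1);  break; /* prefetcht1  */
    case PF_T2:  _mm_prefetch(addr, _MM_HINT_T2);  break; /* prefetcht2  */
    }
#else
    (void)p;
    (void)kind;
#endif
}
```

Whichever hint is chosen, the data at the address is not modified; only its likely location in the cache hierarchy changes.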

The following table lists the prefetch implementation differences between the Pentium® III processor and Pentium 4 processor:

Prefetch Type            Pentium III Processor                  Pentium 4 Processor

PrefetchNTA              • Fetch 32 bytes                       • Fetch 128 bytes
                         • Fetch into 1st-level cache           • Do not fetch into 1st-level cache
                         • Do not fetch into 2nd-level cache    • Fetch into 1 way of 2nd-level cache

PrefetchT0               • Fetch 32 bytes                       • Fetch 128 bytes
                         • Fetch into 1st-level cache           • Do not fetch into 1st-level cache
                         • Fetch into 2nd-level cache           • Fetch into 2nd-level cache

PrefetchT1, PrefetchT2   • Fetch 32 bytes                       • Fetch 128 bytes
                         • Fetch into 2nd-level cache only      • Do not fetch into 1st-level cache
                         • Do not fetch into 1st-level cache    • Fetch into 2nd-level cache only
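
The difference in fetch size (32 bytes versus 128 bytes) changes how many prefetches are needed to cover a buffer. The small sketch below works through that arithmetic. The helper name is hypothetical, and it assumes the buffer starts on a fetch-block boundary.

```c
#include <stddef.h>

/* Hypothetical helper: number of prefetches needed to cover `bytes`
 * bytes when each prefetch brings in `fetch_bytes` bytes. Assumes the
 * buffer starts on a fetch-block boundary and fetch_bytes is nonzero. */
size_t prefetches_needed(size_t bytes, size_t fetch_bytes)
{
    /* Round up so that a partial final block is still covered. */
    return (bytes + fetch_bytes - 1) / fetch_bytes;
}
```

For a 4096-byte buffer this gives 128 prefetches at 32 bytes each and 32 prefetches at 128 bytes each, which is one reason a prefetch schedule tuned for one processor may issue redundant hints on the other.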


For more information, including a comparison of prefetch and load instructions, see the IA-32 Intel® Architecture Optimization Reference Manual.

Source

IA-32 Intel® Architecture Optimization Reference Manual

 


