Intel® Software Network Knowledge Base Wiki





Implement a Horizontal Add/Subtract with SSE3 Instructions

Version 2, Changed by KYLE LEWIS on 1/3/2008
Created by: KYLEX.S.LEWIS@INTEL.COM

Challenge

Implement a horizontal add/subtract using SSE3 instructions. Most SIMD instructions operate vertically: the data element in position k of the result is a function of the data elements in position k of the instruction's operands. Horizontal instructions operate differently: contiguous data elements from the same operand are combined to produce each element of the result.

Packed horizontal add instructions can be useful to evaluate dot products and matrix multiplications, and to facilitate some SIMD computations operating on vectors that are arranged in arrays of structures.

Solution

Use the HADDPS instruction, as shown in the code given here. HADDPS performs single-precision additions on pairs of contiguous data elements and writes the result to its first operand. The first data element of the result is obtained by adding the first and second elements of the first operand. The second element is obtained by adding the third and fourth elements of the first operand. The third element is obtained by adding the first and second elements of the second operand. The fourth element is obtained by adding the third and fourth elements of the second operand. For a horizontal subtract, HSUBPS pairs the elements in the same way but subtracts the second element of each pair from the first.
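The lane ordering described above can be traced with a small scalar model in C. This is only an illustrative sketch, not the instruction itself: the function names haddps_model, hsubps_model and haddps_demo are hypothetical, and each register is assumed to be represented as an array of four floats with the first (lowest) element at index 0. In real C code the same operation is available through the _mm_hadd_ps intrinsic declared in pmmintrin.h.

```c
#include <assert.h>

/* Scalar model of "HADDPS dst, src" (hypothetical helper, for illustration).
   Each array stands for one 128-bit register holding four floats. */
void haddps_model(float dst[4], const float src[4])
{
    float r[4];
    r[0] = dst[0] + dst[1];   /* first + second element of first operand  */
    r[1] = dst[2] + dst[3];   /* third + fourth element of first operand  */
    r[2] = src[0] + src[1];   /* first + second element of second operand */
    r[3] = src[2] + src[3];   /* third + fourth element of second operand */
    for (int i = 0; i < 4; ++i)
        dst[i] = r[i];        /* the first operand receives the result */
}

/* Scalar model of "HSUBPS dst, src": same pairing, but each pair is
   (lower element) - (next element). */
void hsubps_model(float dst[4], const float src[4])
{
    float r[4];
    r[0] = dst[0] - dst[1];
    r[1] = dst[2] - dst[3];
    r[2] = src[0] - src[1];
    r[3] = src[2] - src[3];
    for (int i = 0; i < 4; ++i)
        dst[i] = r[i];
}

/* Small usage example with values chosen so every sum is exact. */
void haddps_demo(void)
{
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    haddps_model(a, b);
    assert(a[0] == 3.0f && a[1] == 7.0f && a[2] == 30.0f && a[3] == 70.0f);
}
```

The temporary array r lets the model be called with the same array for both operands, which mirrors the "haddps xmm0, xmm0" form.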

The following example demonstrates computing the dot product of two four-component vectors; it can be adapted and extended to compute the product of 4x4 matrices:

// An example that computes a four-component dot product and
// broadcasts the result, which is stored in xmm0.
movaps xmm0, Vector1
movaps xmm1, Vector2
mulps xmm0, xmm1
haddps xmm0, xmm0
haddps xmm0, xmm0

// An example that computes two four-component
// dot products from four vectors.
movaps xmm0, Vector1
movaps xmm1, Vector2
movaps xmm2, Vector3
movaps xmm3, Vector4
mulps xmm0, xmm1
mulps xmm2, xmm3
haddps xmm0, xmm2
haddps xmm0, xmm0
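To see why two HADDPS steps are enough, the two sequences can be followed lane by lane with a scalar model in C. This is a sketch under stated assumptions rather than production code: registers are modeled as arrays of four floats, and the helper names mulps_model, haddps_model, dot4_broadcast and dot4_pair are hypothetical names introduced here for illustration.

```c
/* Scalar model of "MULPS dst, src": lane-by-lane (vertical) multiply. */
void mulps_model(float dst[4], const float src[4])
{
    for (int i = 0; i < 4; ++i)
        dst[i] *= src[i];
}

/* Scalar model of "HADDPS dst, src": sums of adjacent pairs, with the
   first operand's pairs in lanes 0-1 and the second operand's in lanes 2-3. */
void haddps_model(float dst[4], const float src[4])
{
    float r[4];
    r[0] = dst[0] + dst[1];
    r[1] = dst[2] + dst[3];
    r[2] = src[0] + src[1];
    r[3] = src[2] + src[3];
    for (int i = 0; i < 4; ++i)
        dst[i] = r[i];
}

/* First sequence: one dot product, broadcast to all four lanes. */
void dot4_broadcast(const float v1[4], const float v2[4], float xmm0[4])
{
    for (int i = 0; i < 4; ++i)
        xmm0[i] = v1[i];          /* movaps xmm0, Vector1 */
    mulps_model(xmm0, v2);        /* mulps  xmm0, xmm1: four products       */
    haddps_model(xmm0, xmm0);     /* haddps xmm0, xmm0: two partial sums    */
    haddps_model(xmm0, xmm0);     /* haddps xmm0, xmm0: full sum everywhere */
}

/* Second sequence: two dot products; the result lanes hold
   { v1.v2, v3.v4, v1.v2, v3.v4 }. */
void dot4_pair(const float v1[4], const float v2[4],
               const float v3[4], const float v4[4], float xmm0[4])
{
    float xmm2[4];
    for (int i = 0; i < 4; ++i) {
        xmm0[i] = v1[i];          /* movaps xmm0, Vector1 */
        xmm2[i] = v3[i];          /* movaps xmm2, Vector3 */
    }
    mulps_model(xmm0, v2);        /* mulps  xmm0, xmm1 */
    mulps_model(xmm2, v4);        /* mulps  xmm2, xmm3 */
    haddps_model(xmm0, xmm2);     /* partial sums of both products */
    haddps_model(xmm0, xmm0);     /* both dot products, repeated   */
}
```

Note that the horizontal form sums the products as (p0 + p1) + (p2 + p3), so for general floating-point inputs the result may differ in the last bits from a strictly left-to-right sum.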

 

Source

Next Generation Intel® Processor: Software Developers Guide.

 


