Hi,
I am attempting to measure the overhead of using rdtsc to measure ticks. To do this, I set up a loop that executes two successive rdtsc instructions and subtracts the first result from the second. I wrote a kernel module in C to run this loop:
#define ITERS 500
unsigned long arr[ITERS];
unsigned long el, sl;
int i;
for (i = 0; i < ITERS; i++) {
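asm volatile ( "mov %%cr0, %%edi;" /* intended as a serializing barrier */
"rdtsc;" /* first timestamp: low 32 bits land in eax */
"mov %%eax, %%esi;" /* save the first low word */
"mov %%cr0, %%edi;"
"rdtsc;" /* second timestamp */
"mov %%cr0, %%edi;"
"mov %%eax, %%edi;" /* save the second low word */
"mov %%esi, %0;"
"mov %%edi, %1;"
:
"=m" (sl),
"=m" (el)
: /* no inputs */
: "eax", "edx", "esi", "edi" /* registers the asm overwrites */
);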
arr[i] = el - sl;
}
I basically sandwich two rdtsc's as close together as possible. I add a few instructions to the mix:
- three moves from cr0, which I use to serialize and prevent reordering
- an instruction to save the result of the first rdtsc
The majority of the time I get a consistent value (72 ticks on a 1.6 GHz Core 2 Quad E5310). However, I occasionally get 66 or 78 ticks.
I have two questions:
1) The tick counts always seem to be multiples of 6 ticks. I am assuming this is because ticks are counted in front-side bus cycles (1066 MHz, quad pumped => 266 MHz) and multiplied by the bus-to-core frequency ratio (which is 6 for this processor). Is this correct?
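To make the arithmetic behind question 1 concrete, here is a small sketch. The constants come from the numbers above; the function names are made up for illustration and are not part of my module. If my assumption holds, 72, 66 and 78 ticks would correspond to 12, 11 and 13 bus cycles.

```c
#include <assert.h>

/* Values taken from the post; names are illustrative only. */
#define FSB_TRANSFER_MHZ 1066u /* quad-pumped transfer rate */
#define FSB_PUMP_FACTOR  4u    /* transfers per bus clock */
#define CORE_RATIO       6u    /* assumed bus-to-core multiplier */

/* Base bus clock in MHz (integer division: 1066 / 4 = 266). */
static unsigned bus_clock_mhz(void)
{
    return FSB_TRANSFER_MHZ / FSB_PUMP_FACTOR;
}

/*
 * Convert a TSC delta to whole bus cycles under the assumption that
 * the counter advances by `ratio` per bus cycle.
 * Returns -1 if the delta is not a multiple of `ratio`.
 */
static long ticks_to_bus_cycles(unsigned long ticks, unsigned ratio)
{
    assert(ratio != 0);
    if (ticks % ratio != 0)
        return -1;
    return (long)(ticks / ratio);
}
```

Calling ticks_to_bus_cycles(arr[i], CORE_RATIO) on each sample would flag any delta that is not a multiple of 6, which would contradict the assumption.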
2) It seems for this simple loop that there should be no variations. However, there are frequently outliers in these measurements. Are there non-deterministic factors I am missing here?
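To put a number on "frequently" in question 2, something like the following could be run over the collected deltas. This is only a sketch: mode_of and count_outliers are hypothetical helpers, not something already in my module, and an "outlier" here simply means any sample that differs from the most common value.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the most frequent value in samples[0..n-1].
 * Quadratic, which is fine for a few hundred samples.
 * On a tie, the value that appears first in the array wins.
 */
static unsigned long mode_of(const unsigned long *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    unsigned long best = samples[0];
    size_t best_count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t count = 0;
        for (size_t j = 0; j < n; j++)
            if (samples[j] == samples[i])
                count++;
        if (count > best_count) {
            best_count = count;
            best = samples[i];
        }
    }
    return best;
}

/* Count the samples that differ from the most frequent value. */
static size_t count_outliers(const unsigned long *samples, size_t n)
{
    unsigned long m = mode_of(samples, n);
    size_t outliers = 0;
    for (size_t i = 0; i < n; i++)
        if (samples[i] != m)
            outliers++;
    return outliers;
}
```

With the loop above, count_outliers(arr, ITERS) would give the number of iterations that did not produce the common value.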
Thanks for any help you can offer,
Andrew