Hi F0dder, thank you for your post. That is what I have come up with so far. I run the code multiple times, take the most frequently occurring tick counts (the fastest ones), and then average those out. 90% of the time the result is within a few clock cycles (that is all I need). The only drawback is that the code needs to be run several times.
BTW, here is what I do in a nutshell…
Call the code 4 or 5 times (to make sure it's in cache)
Loop X times
    call sleep(0) (to get to the beginning of a timeslice)
    start timer
    call the code
    end timer
Find the most often (and fastest) occurring clocks
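
For anyone who wants to try the steps above, here is a rough sketch in Python. It is not my actual code: it uses time.perf_counter_ns() (nanoseconds) instead of a cycle counter, time.sleep(0) stands in for the sleep(0) call, and the names measure, warmup, runs and top are made up for the example. "Most often occurring" is read here as "the few most frequent readings, averaged by how often each showed up", which is just one way to interpret it.

```python
import time
from collections import Counter


def measure(func, warmup=5, runs=1000, top=3):
    """Time func() repeatedly.

    Returns (average of the `top` most frequent readings, list of all readings).
    All names and defaults here are illustrative assumptions.
    """
    # Warm-up calls so the code and its data are likely to be in cache.
    for _ in range(warmup):
        func()

    samples = []
    for _ in range(runs):
        time.sleep(0)                    # yield, hoping to start on a fresh timeslice
        start = time.perf_counter_ns()   # start timer
        func()                           # call the code
        end = time.perf_counter_ns()     # end timer
        samples.append(end - start)

    # Pick the most frequently seen readings and average them,
    # weighting each reading by how many times it occurred.
    common = Counter(samples).most_common(top)
    total = sum(value * count for value, count in common)
    occurrences = sum(count for _, count in common)
    return total / occurrences, samples


if __name__ == "__main__":
    avg, samples = measure(lambda: sum(range(1000)))
    print("typical:", avg, "ns   fastest:", min(samples), "ns")
```

With a nanosecond timer the readings repeat less often than raw tick counts do, so it may help to compare the result against min(samples) as a sanity check.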
I also set the system to “background processes” so that the OS creates longer timeslices (it helps a little bit).
Thanks for the VTune tip… I'll check it out to see if there is anything that can be used without re-inventing the wheel.