Grant Johnson, Tim Dudra, anyone else criticising the Windows scheduler: you're wrong. Windows does indeed run threads by priority. It is fully pre-emptive, but that word means something different from what you think it does. Pre-emption means that when something happens that makes a thread runnable with a higher priority than a thread currently executing on the current, or any, processor, the running thread's turn is cut short (the scheduler notes how much of the quantum has been used, and the thread gets the rest of its quantum next time) and the higher-priority thread begins running immediately, rather than waiting for the next timer tick or the end of the current thread's quantum. The pre-empted thread is pushed onto the front of the queue at its priority level, so it will be the next thread to run once all higher-priority threads have blocked (see next paragraph).
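The quantum accounting can be sketched as a toy model (illustrative only - the quantum length and thread names are made up, and the real dispatcher tracks far more state):

```python
from collections import deque

QUANTUM = 6  # ticks; an arbitrary illustrative value

def preempt(current, used_ticks, ready_queue):
    """On pre-emption the running thread keeps its unused quantum
    and goes to the FRONT of its priority level's queue."""
    current["remaining"] -= used_ticks
    ready_queue.appendleft(current)

# Thread 'b' is waiting at this priority level; 'a' is running.
queue_at_8 = deque([{"name": "b", "remaining": QUANTUM}])
running = {"name": "a", "remaining": QUANTUM}

# A higher-priority thread wakes after 'a' has used 2 ticks:
preempt(running, used_ticks=2, ready_queue=queue_at_8)

print([(t["name"], t["remaining"]) for t in queue_at_8])
# [('a', 4), ('b', 6)] -> 'a' runs next at this level, with 4 ticks left
```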
Windows' scheduler is a round-robin priority-based scheduler. What that means is that it will take into account the priority of all runnable threads on the system. It runs all threads at the highest priority level at which there are runnable threads in a round-robin fashion - running each thread in turn. It will only start to consider lower-priority threads when those higher-priority threads are no longer runnable, which basically means when they're blocked waiting for something to happen. The exception is when it has more CPUs than high-priority threads - in this case it will find work for the otherwise-spare CPUs to do.
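That selection rule is simple enough to model in a few lines. This is a toy single-CPU sketch, not the real dispatcher - the priorities and thread names are invented for the example:

```python
from collections import deque

class ToyScheduler:
    """Round-robin priority scheduler: always run threads at the
    highest priority level that has any runnable threads."""
    def __init__(self):
        self.queues = {}  # priority level -> deque of runnable threads

    def make_ready(self, thread, priority):
        self.queues.setdefault(priority, deque()).append(thread)

    def block(self, thread, priority):
        self.queues[priority].remove(thread)
        if not self.queues[priority]:
            del self.queues[priority]  # level empty: stop considering it

    def pick_next(self):
        if not self.queues:
            return None  # nothing runnable: idle
        top = max(self.queues)               # highest priority with work
        thread = self.queues[top].popleft()  # round-robin within the level
        self.queues[top].append(thread)      # back of its own queue
        return thread

sched = ToyScheduler()
sched.make_ready("worker", 8)
sched.make_ready("ui_a", 10)
sched.make_ready("ui_b", 10)
print([sched.pick_next() for _ in range(4)])  # ['ui_a', 'ui_b', 'ui_a', 'ui_b']
sched.block("ui_a", 10)
sched.block("ui_b", 10)
print(sched.pick_next())  # 'worker' - runs only once level 10 is empty
```

Note that `worker` at priority 8 never runs while the two priority-10 threads stay runnable - exactly the starvation behaviour the paragraph describes.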
Multi-CPU (or logical core) scheduling is more tricky, since Windows tracks three things: an affinity mask, which governs which cores a thread is allowed to run on (and which is also misused!); an 'ideal' core for each thread (set when the thread is created, rotating across the cores within a process, so thread 0 might be on CPU 0, thread 1 on CPU 1, thread 2 on CPU 2, then thread 3 on CPU 0 again, and so on); and the last core that a thread ran on. These last two settings are hints to try to improve cache locality - with luck the thread's code or data may still be in the core/processor's cache. You might find that the two highest-priority threads on a two-core system wouldn't actually both run concurrently if they have the same ideal core and there's a runnable lower-priority thread which has the other core as its ideal core.
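The rotating ideal-core assignment can be modelled as simple modular arithmetic. This is a simplification - in reality the starting point is a per-process seed, and `first_core` here is just an assumed value for illustration:

```python
def ideal_core(first_core, thread_index, num_cores):
    """Model of ideal-core assignment: each new thread in a process
    gets the core after the previous thread's, wrapping around."""
    return (first_core + thread_index) % num_cores

# A four-thread process on a three-core machine, starting at core 0:
assignments = [ideal_core(0, i, 3) for i in range(4)]
print(assignments)  # [0, 1, 2, 0] - thread 3 wraps back to CPU 0
```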
Windows stores two priority values: a base priority, and a dynamic priority. The scheduling is done based on the dynamic priority - the base priority is a low water mark below which a thread will never go. When a thread that was blocked becomes runnable, its dynamic priority is boosted - the amount it is boosted by depends on the reason it becomes unblocked. For disk I/O it's boosted by 1, for network or serial port activity by 2, for keyboard and mouse input by 6, and if the sound card needs more data, by 8. If it was waiting for an event or a semaphore, it's normally by 1. There are special functions which can cause a higher boost for events - the event used in a critical section boosts the unblocking thread to the priority of the thread which left the critical section, plus one. If a thread unblocks due to GUI activity its priority is boosted by 2.
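Putting the boost values above into a rough model (the decay of one level per expired quantum, back down to the base priority, is how dynamic boosts wear off; the class and its methods are my own sketch, not a Windows API):

```python
# Boost values from the text, keyed by the reason the thread woke.
BOOSTS = {
    "disk": 1,
    "network": 2,
    "serial": 2,
    "keyboard_mouse": 6,
    "sound": 8,
    "event_or_semaphore": 1,
    "gui": 2,
}

class ModelThread:
    def __init__(self, base_priority):
        self.base = base_priority      # low water mark
        self.dynamic = base_priority   # what the scheduler actually uses

    def unblock(self, reason):
        # Boost is applied relative to the base; boosts don't stack.
        self.dynamic = max(self.dynamic, self.base + BOOSTS[reason])

    def quantum_expired(self):
        # The boost decays one level per quantum until back at base.
        self.dynamic = max(self.base, self.dynamic - 1)

t = ModelThread(base_priority=8)
t.unblock("keyboard_mouse")
print(t.dynamic)  # 14 - keyboard input boosts by 6
t.quantum_expired()
print(t.dynamic)  # 13 - decaying back toward the base of 8
```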
Joe Duffy mentions the balance set manager thread, which wakes up once per second for various management duties. One of those is to look for threads that are runnable but haven't run for 4 seconds. Such a thread temporarily gets boosted to priority 15 (the highest non-real-time priority), then, after its quantum expires, drops back to its base priority. The balance set manager only examines 16 threads on the ready queue at a time, and will only boost 10 threads in any pass, so if there are a lot of runnable threads, it may take some time for any given thread to get its boost.
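One pass of that starvation check, with the scan and boost limits just described, might be modelled like this (a sketch only - the thread records and timestamps are invented):

```python
from collections import deque

STARVATION_SECONDS = 4
SCAN_LIMIT = 16    # ready threads examined per pass
BOOST_LIMIT = 10   # threads boosted per pass
BOOST_PRIORITY = 15

def anti_starvation_pass(ready, now):
    """Scan up to 16 ready threads; boost at most 10 of those
    that have been waiting 4 seconds or more."""
    boosted = []
    for thread in list(ready)[:SCAN_LIMIT]:
        if len(boosted) == BOOST_LIMIT:
            break
        if now - thread["last_ran"] >= STARVATION_SECONDS:
            thread["priority"] = BOOST_PRIORITY
            boosted.append(thread["name"])
    return boosted

# Twenty starved threads, but only the first ten get boosted this pass:
ready = deque({"name": f"t{i}", "priority": 4, "last_ran": 0.0}
              for i in range(20))
print(anti_starvation_pass(ready, now=5.0))  # ['t0', ..., 't9']
```

A thread near the back of a long ready queue therefore has to wait for later once-per-second passes before its turn comes.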
Full details are in "Windows Internals, Fourth Edition" by Mark Russinovich and David Solomon, which I recommend to any programmer. If abstractions leak, you'd best understand the leaks.
Windows hasn't had the concept of 'yield' since Windows 3.1. 16-bit applications do still act this way with each other on Windows NT-family systems, but the whole 16-bit environment can be pre-empted by 32-bit applications.
The main problems are threads which poll rather than blocking, and applications which are thread-happy. Polling is brain-dead. The operating system offers numerous ways to wait for changes to occur, and they should be used. Polling wastes laptop batteries, because the CPU cannot be put into a low-power state, and often causes disks to be kept spinning because the polling thread's code and data may be pulled back into memory if trimmed from the working set.
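The difference is easy to show with a portable Python sketch - the same principle applies to blocking on a Win32 event with WaitForSingleObject versus a Sleep-and-check loop:

```python
import threading
import time

data_ready = threading.Event()

def producer():
    time.sleep(0.1)   # simulate work arriving later
    data_ready.set()  # signal: wakes the blocked waiter immediately

t = threading.Thread(target=producer)
t.start()

# Good: block until signalled. The waiting thread is not runnable,
# so the scheduler can run other work or idle the processor.
got_it = data_ready.wait(timeout=5.0)
t.join()
print(got_it)  # True

# Bad (don't do this): a polling loop keeps the thread runnable,
# burning CPU and battery to reach the same result:
#   while not data_ready.is_set():
#       time.sleep(0.01)
```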
A thread-happy application is one which creates, and keeps runnable, more threads than the system's processors can actually service. The system spends its time thrashing the processors' caches rather than doing useful work. Put one of these at high priority and suddenly very little else will get done.
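The usual cure is to size the worker pool to the machine rather than to the work. A minimal sketch (the squaring task is just a stand-in for real CPU-bound work):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# For CPU-bound tasks, more runnable threads than cores just adds
# cache-thrashing context switches. (I/O-bound pools, whose threads
# spend most of their time blocked, can reasonably be larger.)
workers = os.cpu_count() or 1

with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(lambda n: n * n, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```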