SMPLOCKS

At any given point, the cpu can be thought of as being at a 'level'. The level is used to prevent another cpu or I/O device from interfering with what a cpu is doing. These are referred to in the source as smp lock levels. Here is a list of the levels that the kernel uses (in order from least interruptible to most interruptible):

Notes about the levels: I wrote my smplock acquire/release routines to enforce the above rules so I can't do something stupid. If I try to break a rule, the acquire/release routine crashes on the spot, even in the uniprocessor version.
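To make the idea of self-checking acquire/release routines concrete, here is a minimal sketch in C. It is not the kernel's actual code: the names (demo_smplock, demo_cpu_level, demo_smplock_acquire and so on) are made up for illustration, and it assumes just one rule, namely that a CPU may only acquire a lock whose level is strictly above its current level, and must release locks in the reverse order. The single demo_cpu_level variable stands in for what would really be per-CPU state.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical lock type; the real kernel structure is not shown in this text. */
typedef struct {
    uint8_t level;       /* fixed level assigned to this lock */
    volatile int held;   /* stand-in for the real spin flag */
} demo_smplock;

/* Current level of "this" CPU. A real kernel would keep one per CPU. */
static uint8_t demo_cpu_level = 0;

/* Assumed rule: a lock may be taken only if its level is above the CPU's level. */
static int demo_acquire_allowed(uint8_t cpu_level, uint8_t lock_level)
{
    return lock_level > cpu_level;
}

/* Crash on the spot, reporting which rule was broken. */
static void demo_crash(const char *why, uint8_t cpu_level, uint8_t lock_level)
{
    fprintf(stderr, "smplock rule broken: %s (cpu level %02X, lock level %02X)\n",
            why, (unsigned)cpu_level, (unsigned)lock_level);
    abort();
}

/* Acquire the lock and return the previous CPU level for the matching release. */
static uint8_t demo_smplock_acquire(demo_smplock *lock)
{
    uint8_t prev = demo_cpu_level;
    if (!demo_acquire_allowed(prev, lock->level))
        demo_crash("acquire out of order", prev, lock->level);
    /* A multiprocessor build would spin here until the lock is free.
       The level check above runs even in a uniprocessor build. */
    lock->held = 1;
    demo_cpu_level = lock->level;
    return prev;
}

/* Release the lock and drop the CPU back to the level it had before the acquire. */
static void demo_smplock_release(demo_smplock *lock, uint8_t prev)
{
    if (demo_cpu_level != lock->level)
        demo_crash("release of a lock that is not the current level",
                   demo_cpu_level, lock->level);
    if (prev >= lock->level)
        demo_crash("release would not lower the level", prev, lock->level);
    lock->held = 0;
    demo_cpu_level = prev;
}
```

Under those assumptions, taking a level 0x48 lock and then a level 0x68 lock is fine, while trying it the other way around ends in demo_crash instead of a deadlock that only shows up later.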

Theoretically, one could simply use one smplock for everything. Whenever you enter kernel mode, acquire the smplock and release it before returning to user mode. The problem with that is that there will be lots of unnecessary contention for the smplock. For example, the thread executing on CPU-1 might be doing some TCP/IP stuff while the thread on CPU-2 might be doing some disk I/O. The two have nothing to do with each other, yet each would have to wait for the other to leave the kernel. This would lead to disastrous performance.

So now I needed to determine just how many smplocks were enough. Too few smplocks will work; it is just bad for performance. So is there such a thing as too many smplocks? Well, I think there is. I know I have too many smplocks when either of these conditions holds:

Initially, I came up with a list of what I thought I would need, and it was a fairly short list. Then I got the system working with those after adding a couple that I hadn't thought of originally. Then I put counters on the smplocks to find which ones had the most contention. Using the above rules, I split up the most contended smplocks as far as I could see how to. Now all the contention counts are low (less than 1% cpu time) when I pound on it, so I think I have a minimal set or very close to it.
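As an illustration of putting counters on a lock, here is a minimal sketch using C11 atomics. It is not the kernel's smplock code: counted_lock and its functions are hypothetical names, and the metric it reports (the fraction of acquisitions that had to spin at least once) is a simpler stand-in for the cpu-time percentage mentioned above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical spinlock that counts how often it was found busy. */
typedef struct {
    atomic_flag busy;                 /* set while some CPU holds the lock */
    atomic_uint_fast64_t acquires;    /* total successful acquisitions */
    atomic_uint_fast64_t contended;   /* acquisitions that had to spin at least once */
    atomic_uint_fast64_t spins;       /* total failed attempts while spinning */
} counted_lock;

#define COUNTED_LOCK_INIT { ATOMIC_FLAG_INIT, 0, 0, 0 }

static void counted_lock_acquire(counted_lock *l)
{
    uint_fast64_t spins = 0;

    /* Spin until the flag was previously clear, counting the failed attempts. */
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        spins++;

    atomic_fetch_add_explicit(&l->acquires, 1, memory_order_relaxed);
    if (spins != 0) {
        atomic_fetch_add_explicit(&l->contended, 1, memory_order_relaxed);
        atomic_fetch_add_explicit(&l->spins, spins, memory_order_relaxed);
    }
}

static void counted_lock_release(counted_lock *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Fraction of acquisitions that found the lock busy; 0.0 if it was never taken. */
static double counted_lock_contention_ratio(counted_lock *l)
{
    uint_fast64_t a = atomic_load_explicit(&l->acquires, memory_order_relaxed);
    uint_fast64_t c = atomic_load_explicit(&l->contended, memory_order_relaxed);
    return a ? (double)c / (double)a : 0.0;
}
```

Dumping these counters for every lock after a stress run gives a ranking of which locks are worth splitting first. The counters themselves cost a few relaxed atomic adds per acquisition, so they can reasonably be left in while tuning.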

The worst offender originally was the non-paged pool allocation smplock. For that one, I rewrote the 'malloc' routine so that it wouldn't even use a smplock for the common cases. It really sped things up quite a bit. The next bad one was the single smplock I had for all thread-state-related stuff. Now it is split up into PS, TF, EV, TP and TC, and I squeezed a couple of percentage points of performance out of that. The most contended smplock I have now is for the state of physical memory (PM). I don't think I can split it up, but even if I could, the best performance gain would be less than 1%.

Update. I had nothing to do once and I found out that the contention on the PM lock was due to the disk cache routines using it for their own internal tables. So I split it up into PM and CP, where CP is used for per-cache block lists. Now there is very little contention on either of those locks! So a fairly painless change got rid of a little nag. Not a noticeable improvement, but getting rid of lock contention can only help. Now there is no one lock that really stands out for contention.

So here is a list of the actual smplock levels I ended up with:

	#define OZ_SMPLOCK_LEVEL_SH 0x10		/* shutdown handler table */
	#define OZ_SMPLOCK_LEVEL_HT 0x18		/* per-process handle tables */
	#define OZ_SMPLOCK_LEVEL_VL 0x20		/* per-volume filesystem volume locks */
	#define OZ_SMPLOCK_LEVEL_DV 0x28		/* devices */
	#define OZ_SMPLOCK_LEVEL_PT 0x30		/* per-process page table */
	#define OZ_SMPLOCK_LEVEL_GP 0x40		/* global sections page table */
	#define OZ_SMPLOCK_LEVEL_PM 0x48		/* physical memory tables */
	#define OZ_SMPLOCK_LEVEL_CP 0x49		/* cache private lock */
	#define OZ_SMPLOCK_LEVEL_PS 0x4A		/* per-process state */
	#define OZ_SMPLOCK_LEVEL_TF 0x4C		/* thread family list */
	#define OZ_SMPLOCK_LEVEL_EV 0x4E		/* individual event flag state */
	#define OZ_SMPLOCK_LEVEL_TP 0x50		/* individual thread private lock */
	#define OZ_SMPLOCK_LEVEL_TC 0x52		/* thread COM queue lock */
	#define OZ_SMPLOCK_LEVEL_PR 0x54		/* process lists */
	#define OZ_SMPLOCK_LEVEL_SE 0x60		/* security structs */
	#define OZ_SMPLOCK_LEVEL_NP 0x68		/* non-paged pool */
	#define OZ_SMPLOCK_LEVEL_QU 0x70		/* quota */

	#define OZ_SMPLOCK_LEVEL_IRQS 0xE0		/* irq's use 0xE0..0xEF */
							/* the lowest priority, IRQ 7, uses level 0xE0 */
							/* the highest priority, IRQ 0, uses level 0xEF */

	#define OZ_SMPLOCK_LEVEL_KT 0xF8		/* kthread routines */
	#define OZ_SMPLOCK_LEVEL_HI 0xFC		/* lowipl routines */

Notice that the non-paged pool lock, NP, is below the IRQ locks. This is what prevents OZ_KNL_NPPMALLOC from being called from an interrupt routine. If it ever becomes really necessary, the NP lock could be moved above the IRQ levels, but then IRQ delivery would be inhibited during any smplocked non-paged pool allocation or deallocation. Fortunately, most pool operations are performed atomically and don't need the smplock, so it might not actually be that bad, but I haven't needed it yet.
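The NP restriction can be spelled out with the actual numbers. The sketch below copies three of the level values from the list and assumes, as the paragraph implies, that a lock can only be taken from a level below its own. The helper demo_np_lock_takeable is a hypothetical name, and it only models the slow path where the pool routines really need the NP smplock.

```c
#include <stdint.h>

/* Values copied from the level list. */
#define OZ_SMPLOCK_LEVEL_PM   0x48   /* physical memory tables */
#define OZ_SMPLOCK_LEVEL_NP   0x68   /* non-paged pool */
#define OZ_SMPLOCK_LEVEL_IRQS 0xE0   /* lowest IRQ level */

/* The layout the paragraph relies on: NP sits below every IRQ level. */
_Static_assert(OZ_SMPLOCK_LEVEL_NP < OZ_SMPLOCK_LEVEL_IRQS,
               "NP must be below the IRQ levels");

/* Hypothetical helper: can code running at cpu_level still take the NP lock?
   Assumes a lock may only be acquired from a strictly lower level. */
static int demo_np_lock_takeable(uint8_t cpu_level)
{
    return cpu_level < OZ_SMPLOCK_LEVEL_NP;
}
```

With these values, code holding the PM lock (0x48) can still allocate from the pool, while an interrupt routine running at 0xE0 or above cannot take the NP lock.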