Below is a patch that redefines the kmem_cache_create `align' argument:
- align not zero: use the specified alignment. I think values smaller than
sizeof(void*) will work, even on archs with strict alignment requirements
(or at least slab shouldn't crash; obviously the user must then handle the
alignment properly).
- align zero:
* debug on: align to sizeof(void*)
* debug off, SLAB_HWCACHE_ALIGN clear: align to sizeof(void*)
* debug off, SLAB_HWCACHE_ALIGN set: align to the smaller of
- cache_line_size()
- the object size, rounded up to the next power of two.
Slab has never honored cache alignment for tiny objects: otherwise the
32-byte kmalloc cache would consume 128 bytes per object on CPUs with
128-byte cache lines.
There is one additional point: right now slab uses ints for the bufctls.
Using shorts would save two bytes per object. Initially I had used shorts,
but davem objected, IIRC because some archs do not handle shorts efficiently.
Should I allow arch overrides for the bufctls? On i386, saving two bytes
might allow a few additional anon_vma objects in each page.