mm: create a separate slab for page->ptl allocation
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64
is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab,
so we loose 24 on each. An average system can easily allocate few tens
thousands of page->ptl and overhead is significant.
Let's create a separate slab for page->ptl allocation to solve this.
To make sure that it really works this time, some numbers from my test
machine (just booted, no load):
Before:
# grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
kmalloc-96 31987 32190 128 30 1 : tunables 120 60 8 : slabdata 1073 1073 92
After:
# grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
page->ptl 27516 28143 72 53 1 : tunables 120 60 8 : slabdata 531 531 9
kmalloc-96 3853 5280 128 30 1 : tunables 120 60 8 : slabdata 176 176 0
Note that the patch is useful not only for debug case, but also for
PREEMPT_RT, where spinlock_t is always bloated.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index e9c5504..86487df 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4275,11 +4275,20 @@
#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */
#if USE_SPLIT_PTE_PTLOCKS && ALLOC_SPLIT_PTLOCKS
+
+static struct kmem_cache *page_ptl_cachep;
+
+void __init ptlock_cache_init(void)
+{
+ page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(spinlock_t), 0,
+ SLAB_PANIC, NULL);
+}
+
bool ptlock_alloc(struct page *page)
{
spinlock_t *ptl;
- ptl = kmalloc(sizeof(spinlock_t), GFP_KERNEL);
+ ptl = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
if (!ptl)
return false;
page->ptl = ptl;
@@ -4288,6 +4297,6 @@
void ptlock_free(struct page *page)
{
- kfree(page->ptl);
+ kmem_cache_free(page_ptl_cachep, page->ptl);
}
#endif