path: root/lib/debugobjects.c
2024-10-15  debugobjects: Track object usage to avoid premature freeing of objects  [Thomas Gleixner, 1 file, -5/+40]

The freelist is freed at a constant rate independent of the actual usage requirements. That's bad in scenarios where usage comes in bursts: the end of a burst puts the objects on the free list, and freeing proceeds even when the next burst which requires objects has already started.

Keep track of the usage with an exponentially weighted moving average and take that into account in the worker function which frees objects from the free list.

This further reduces the kmem_cache allocation/free rate for a full kernel compile:

                kmem_cache_alloc()  kmem_cache_free()
    Baseline:   225k                173k
    Usage:      170k                117k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/87bjznhme2.ffs@tglx
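A minimal C sketch of the averaging idea this changelog describes; the names, fixed-point scale and 3/4 weighting are illustrative assumptions, not the kernel's actual implementation:

    /* Fixed-point EWMA: avg = 3/4 * avg + 1/4 * sample, scaled by 256 */
    static unsigned int obj_usage_avg;   /* hypothetical counter, scaled by 256 */

    static void update_usage_avg(unsigned int objs_in_use)
    {
        obj_usage_avg = (3 * obj_usage_avg + (objs_in_use << 8)) / 4;
    }

    /*
     * The freeing worker would then keep roughly this many objects around
     * instead of draining the free list at a constant rate.
     */
    static unsigned int usage_avg_objects(void)
    {
        return obj_usage_avg >> 8;
    }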
2024-10-15  debugobjects: Refill per CPU pool more aggressively  [Thomas Gleixner, 1 file, -0/+18]

Right now the per CPU pools are only refilled when they become empty. That's suboptimal, especially when there are still non-freed objects on the to-free list. Check whether an allocation from the per CPU pool emptied a batch and try to allocate from the free pool if that still has objects available.

                kmem_cache_alloc()  kmem_cache_free()
    Baseline:   295k                245k
    Refill:     225k                173k

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.439053085@linutronix.de
2024-10-15  debugobjects: Double the per CPU slots  [Thomas Gleixner, 1 file, -1/+1]

In situations where objects are rapidly allocated from the pool and handed back, the size of the per CPU pool turns out to be too small. Double the size of the per CPU pool. This reduces the kmem cache allocation and free operations during a kernel compile:

                   alloc   free
    Baseline:      380k    330k
    Double size:   295k    245k

Especially the reduction of allocations is important because that happens in the hot path when objects are initialized. The maximum increase in per CPU pool memory consumption is about 2.5K per online CPU, which is acceptable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.378676302@linutronix.de
2024-10-15  debugobjects: Move pool statistics into global_pool struct  [Thomas Gleixner, 1 file, -35/+52]

Keep the statistics along with the pool, as that's a hot cache line anyway, and it makes the code more comprehensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.318776207@linutronix.de
2024-10-15  debugobjects: Implement batch processing  [Thomas Gleixner, 1 file, -15/+46]

Adding and removing single objects in a loop is bad in terms of lock contention and cache line accesses.

To implement batching, record the last object in a batch in the object itself. This is trivially possible as hlists are strictly stacks. At a batch boundary, when the first object is added to the list, the object stores a pointer to itself in debug_obj::batch_last. When the next object is added to the list, the batch_last pointer is retrieved from the first object in the list and stored in the one to be added.

That means for batch processing the first object always has a pointer to the last object in a batch, which allows moving batches in a cache line efficient way and reduces the lock hold time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.258995000@linutronix.de
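A sketch of the batch_last linkage described above; the field name follows the changelog, the helper and its parameters are illustrative assumptions (kernel C, requires <linux/list.h>):

    struct debug_obj {
        struct hlist_node   node;
        struct hlist_node   *batch_last;  /* valid in the first object of a batch */
        /* ... */
    };

    /* cnt is the number of objects already on the pool list */
    static void pool_push(struct hlist_head *head, struct debug_obj *obj,
                          unsigned int cnt, unsigned int batch_size)
    {
        if (!(cnt % batch_size)) {
            /* Batch boundary: the new first object points to itself */
            obj->batch_last = &obj->node;
        } else {
            /* Inherit the last-node pointer from the current first object */
            struct debug_obj *first = hlist_entry(head->first, struct debug_obj, node);

            obj->batch_last = first->batch_last;
        }
        hlist_add_head(&obj->node, head);
    }

Moving a whole batch then only needs the first object's node and its batch_last pointer, so only the list heads and two objects are touched.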
2024-10-15  debugobjects: Prepare kmem_cache allocations for batching  [Thomas Gleixner, 1 file, -31/+49]

Allocate a batch and then push it into the pool. Utilize the debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164914.198647184@linutronix.de
2024-10-15  debugobjects: Prepare for batching  [Thomas Gleixner, 1 file, -3/+7]

Move the debug_obj::object pointer into a union and add a pointer to the last node in a batch. That allows batch processing to be implemented efficiently by utilizing the stack property of hlist: when the first object of a batch is added to the list, the batch pointer is set to the hlist node of the object itself. Any subsequent add retrieves the pointer to the last node from the first object in the list and uses that for storing the last node pointer in the newly added object.

Add the pointer to the data structure and ensure that all relevant pool sizes are strictly batch sized. The actual batching implementation follows in subsequent changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.139204961@linutronix.de
2024-10-15  debugobjects: Use static key for boot pool selection  [Thomas Gleixner, 1 file, -8/+11]

Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.077247071@linutronix.de
2024-10-15  debugobjects: Rework free_object_work()  [Thomas Gleixner, 1 file, -43/+39]

Convert it to batch processing with intermediate helper functions. This reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164914.015906394@linutronix.de
2024-10-15  debugobjects: Rework object freeing  [Thomas Gleixner, 1 file, -75/+24]

__free_object() is incomprehensibly complex. The same can be achieved by:

 1) Adding the object to the per CPU pool

 2) If that pool is full, moving a batch of objects into the global pool, or if the global pool is full, into the to-free pool

This also prepares for batch processing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.955542307@linutronix.de
2024-10-15  debugobjects: Rework object allocation  [Thomas Gleixner, 1 file, -75/+69]

The current allocation scheme tries to allocate from the per CPU pool first. If that fails, it allocates one object from the global pool and then refills the per CPU pool from the global pool.

That gets in the way of switching the pool management to batch mode, as the global pool needs to be a strict stack of batches, which does not allow allocating single objects.

Rework the code to refill the per CPU pool first and then allocate the object from the refilled batch. Also try to allocate from the to-free pool first to avoid freeing and reallocating objects.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.893554162@linutronix.de
2024-10-15  debugobjects: Move min/max count into pool struct  [Thomas Gleixner, 1 file, -24/+31]

Having the accounting in the data structure is better in terms of cache lines and allows more optimizations later on.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.831908427@linutronix.de
2024-10-15  debugobjects: Rename and tidy up per CPU pools  [Thomas Gleixner, 1 file, -26/+17]

No point in having a separate data structure. Reuse struct obj_pool and tidy up the code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.770595795@linutronix.de
2024-10-15  debugobjects: Use separate list head for boot pool  [Thomas Gleixner, 1 file, -12/+16]

There is no point in handling the statically allocated objects during early boot in the actual pool list. This phase does not require accounting, so all of the related complexity can be avoided.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.708939081@linutronix.de
2024-10-15  debugobjects: Move pools into a datastructure  [Thomas Gleixner, 1 file, -62/+78]

The contention on the global pool lock can be reduced by strict batch processing, where batches of objects are moved from one list head to another instead of moving them object by object. This also reduces the cache footprint because it avoids the list walk and dirties at most three cache lines instead of potentially up to eighteen.

To prepare for that, move the hlist head and related counters into a struct. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.646171170@linutronix.de
2024-10-15  debugobjects: Reduce parallel pool fill attempts  [Zhen Lei, 1 file, -25/+59]

The contention on the global pool_lock can be massive when the global pool needs to be refilled and many CPUs try to handle this. Address this by:

 - Splitting the refill from the free list and the allocation. Refill from the free list has no constraints vs. the context on RT, so it can be tried outside of the RT specific preemptible() guard

 - Letting only one CPU handle the free list

 - Letting only one CPU do allocations, unless the pool level is below half of the minimum fill level

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-4-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.582118421@linutronix.de
2024-10-15  debugobjects: Make debug_objects_enabled bool  [Thomas Gleixner, 1 file, -9/+8]

Make it what it is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.518175013@linutronix.de
2024-10-15  debugobjects: Provide and use free_object_list()  [Thomas Gleixner, 1 file, -6/+16]

Move the loop to free a list of objects into a helper function so it can be reused later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.453912357@linutronix.de
2024-10-15  debugobjects: Remove pointless debug printk  [Thomas Gleixner, 1 file, -4/+1]

It has zero value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.390511021@linutronix.de
2024-10-15  debugobjects: Reuse put_objects() on OOM  [Thomas Gleixner, 1 file, -18/+6]

Reuse the helper function instead of having an open-coded copy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.326834268@linutronix.de
2024-10-15  debugobjects: Don't free objects directly on CPU hotplug  [Thomas Gleixner, 1 file, -13/+14]

Freeing the per CPU pool of the unplugged CPU directly is suboptimal, as the objects can be reused in the real pool if there is room. Apart from that, it gets the accounting wrong. Use the regular free path, which allows reuse and keeps the accounting correct.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.263960570@linutronix.de
2024-10-15  debugobjects: Remove pointless hlist initialization  [Thomas Gleixner, 1 file, -10/+1]

It's in BSS and therefore zero initialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/all/20241007164913.200379308@linutronix.de
2024-10-15  debugobjects: Don't destroy kmem cache in init()  [Thomas Gleixner, 1 file, -33/+35]

debug_objects_mem_init() is invoked from mm_core_init() before work queues are available. If debug_objects_mem_init() destroys the kmem cache in the error path it causes an Oops in __queue_work():

  Oops: Oops: 0000 [#1] PREEMPT SMP PTI
  RIP: 0010:__queue_work+0x35/0x6a0
   queue_work_on+0x66/0x70
   flush_all_cpus_locked+0xdf/0x1a0
   __kmem_cache_shutdown+0x2f/0x340
   kmem_cache_destroy+0x4e/0x150
   mm_core_init+0x9e/0x120
   start_kernel+0x298/0x800
   x86_64_start_reservations+0x18/0x30
   x86_64_start_kernel+0xc5/0xe0
   common_startup_64+0x12c/0x138

Further, the object cache pointer is used in various places to check for early boot operation. It is exposed before the replacements for the static boot time objects are allocated, and the self test operates on it.

This can be avoided by:

 1) Running the self test with the static boot objects

 2) Exposing it only after the replacement objects have been added to the pool

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241007164913.137021337@linutronix.de
2024-10-15  debugobjects: Collect newly allocated objects in a list to reduce lock contention  [Zhen Lei, 1 file, -8/+10]

Collect the newly allocated debug objects in a list outside the lock, so that the lock hold time and the potential lock contention are reduced.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-3-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.073653668@linutronix.de
2024-10-15  debugobjects: Delete a piece of redundant code  [Zhen Lei, 1 file, -4/+4]

The statically allocated objects are all located in obj_static_pool[], and the whole memory of obj_static_pool[] will be reclaimed later. Therefore there is no need to split the remaining statically allocated nodes in list obj_pool into isolated ones; no one will use them anymore. Writing INIT_HLIST_HEAD(&obj_pool) is enough. Since hlist_move_list() directly discards the old list, even this can be omitted.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240911083521.2257-2-thunder.leizhen@huawei.com
Link: https://lore.kernel.org/all/20241007164913.009849239@linutronix.de
2024-09-09  debugobjects: Remove redundant checks in fill_pool()  [Zhen Lei, 1 file, -7/+5]

fill_pool() checks locklessly at the beginning whether the pool has to be refilled. After that it checks locklessly in a loop whether the free list contains objects and repeats the refill check. If both conditions are true, it acquires the pool lock and tries to move objects from the free list to the pool, repeating the same checks again.

There are two redundancies in that:

 1) The repeated check for the fill condition

 2) The loop processing

The repeated check is pointless, as it was just established that a fill is required. The condition has to be re-evaluated under the lock anyway. The loop processing is not required either, because there is practically zero chance that a repeated attempt will succeed if the checks under the lock terminate the moving of objects.

Remove the redundant check and replace the loop with a simple if condition.

[ tglx: Massaged change log ]

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-4-thunder.leizhen@huawei.com
2024-09-09  debugobjects: Fix conditions in fill_pool()  [Zhen Lei, 1 file, -2/+3]

fill_pool() uses 'obj_pool_min_free' to decide whether objects should be handed back to the kmem cache. But 'obj_pool_min_free' records the lowest historical value of the number of objects in the object pool and not the minimum number of objects which should be kept in the pool.

Use 'debug_objects_pool_min_level' instead, which holds the minimum number which was scaled to the number of CPUs at boot time.

[ tglx: Massage change log ]

Fixes: d26bf5056fc0 ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()")
Fixes: 36c4ead6f6df ("debugobjects: Add global free list and the counter")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com
2024-09-09  debugobjects: Fix the compilation attributes of some global variables  [Zhen Lei, 1 file, -7/+7]

1. Both debug_objects_pool_min_level and debug_objects_pool_size are read-only after initialization. Change the attribute '__read_mostly' to '__ro_after_init' and remove '__data_racy'.

2. Many global variables are read in the debug_stats_show() function but are not annotated to suppress KCSAN's detection. Add '__data_racy' to them.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-2-thunder.leizhen@huawei.com
2024-06-24  debugobjects: Annotate racy debug variables  [Breno Leitao, 1 file, -10/+11]

KCSAN has identified a potential data race in debugobjects, where the global variable debug_objects_maxchain is accessed for both reading and writing simultaneously in separate and parallel data paths. This results in the following splat printed by KCSAN:

  BUG: KCSAN: data-race in debug_check_no_obj_freed / debug_object_activate

  write to 0xffffffff847ccfc8 of 4 bytes by task 734 on cpu 41:
   debug_object_activate (lib/debugobjects.c:199 lib/debugobjects.c:564 lib/debugobjects.c:710)
   call_rcu (kernel/rcu/rcu.h:227 kernel/rcu/tree.c:2719 kernel/rcu/tree.c:2838)
   security_inode_free (security/security.c:1626)
   __destroy_inode (./include/linux/fsnotify.h:222 fs/inode.c:287)
   ...

  read to 0xffffffff847ccfc8 of 4 bytes by task 384 on cpu 31:
   debug_check_no_obj_freed (lib/debugobjects.c:1000 lib/debugobjects.c:1019)
   kfree (mm/slub.c:2081 mm/slub.c:4280 mm/slub.c:4390)
   percpu_ref_exit (lib/percpu-refcount.c:147)
   css_free_rwork_fn (kernel/cgroup/cgroup.c:5357)
   ...

  value changed: 0x00000070 -> 0x00000071

The data race is actually harmless, as this is just used for debugfs statistics, as are all the other debug variables. Annotate all debug variables as racy explicitly, since these variables are known to be racy and harmless.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240611091813.1189860-1-leitao@debian.org
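The annotation itself is a one-line declaration change per variable. A sketch for the counter named in the splat, using __data_racy, the kernel's marker for variables whose unsynchronized accesses are intentional:

    /* Racy by design: written and read locklessly, only used for statistics */
    static int __data_racy debug_objects_maxchain;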
2023-11-22  debugobjects: Stop accessing objects after releasing hash bucket lock  [Andrzej Hajda, 1 file, -122/+78]

After release of the hash bucket lock, the tracking object can be modified or freed by a concurrent thread. Using it in such a case is error prone, even for printing the object state:

 1. T1 tries to deactivate a destroyed object; debugobjects detects it and the hash bucket lock is released.

 2. T2 preempts T1 and frees the tracking object.

 3. The freed tracking object is allocated and initialized for a different to-be-tracked kernel object.

 4. T1 resumes and reports an error for the wrong kernel object.

Create a local copy of the tracking object before releasing the hash bucket lock and use the local copy for reporting and fixups to prevent this.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231025-debugobjects_fix-v3-1-2bc3bf7084c2@intel.com
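A minimal sketch of the copy-before-unlock pattern; the function shape and names are simplified assumptions, not the verbatim kernel code:

    static void report_object(struct debug_bucket *db, struct debug_obj *obj,
                              unsigned long flags)
    {
        /* Snapshot the tracking object while the bucket lock is still held */
        struct debug_obj o = *obj;

        raw_spin_unlock_irqrestore(&db->lock, flags);
        /* *obj may now be freed or reused concurrently; only the copy is used */
        debug_print_object(&o, "deactivate");
    }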
2023-10-18  treewide: mark stuff as __ro_after_init  [Alexey Dobriyan, 1 file, -1/+1]

__read_mostly predates __ro_after_init. Many variables which are marked __read_mostly should have been __ro_after_init from day 1. Also, mark some stuff as "const" and "__init" while I'm at it.

[akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning]
[akpm@linux-foundation.org: coding-style cleanups]

Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-07  debugobjects: Recheck debug_objects_enabled before reporting  [Tetsuo Handa, 1 file, -0/+9]

syzbot is reporting a false positive ODEBUG message immediately after ODEBUG was disabled due to OOM:

  [ 1062.309646][T22911] ODEBUG: Out of memory. ODEBUG disabled
  [ 1062.886755][ T5171] ------------[ cut here ]------------
  [ 1062.892770][ T5171] ODEBUG: assert_init not available (active state 0) object: ffffc900056afb20 object type: timer_list hint: process_timeout+0x0/0x40

  CPU 0 [ T5171]                            CPU 1 [T22911]
  --------------                            --------------
  debug_object_assert_init() {
    if (!debug_objects_enabled)
      return;
    db = get_bucket(addr);
    lookup_object_or_alloc() {
                                            debug_objects_enabled = 0;
                                            return NULL;
                                            }
                                            debug_objects_oom() {
                                              pr_warn("Out of memory. ODEBUG disabled\n");
                                              // all buckets get emptied here, and
                                            }
    lookup_object_or_alloc(addr, db, descr, false, true) {
      // this bucket is already empty.
      return ERR_PTR(-ENOENT);
    }
    // Emits false positive warning.
    debug_print_object(&o, "assert_init");
  }

Recheck debug_objects_enabled in debug_print_object() to avoid that.

Reported-by: syzbot <syzbot+7937ba6a50bdd00fffdf@syzkaller.appspotmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/492fe2ae-5141-d548-ebd5-62f5fe2e57f7@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=7937ba6a50bdd00fffdf
2023-05-22  debugobjects: Don't wake up kswapd from fill_pool()  [Tetsuo Handa, 1 file, -1/+1]

syzbot is reporting a lockdep warning in fill_pool() because the allocation from debugobjects is using GFP_ATOMIC, which is (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) and therefore tries to wake up kswapd, which acquires kswapd_wait::lock. Since fill_pool() might be called with arbitrary locks held, fill_pool() should not assume that acquiring kswapd_wait::lock is safe.

Use __GFP_HIGH instead, and remove __GFP_NORETRY as it is pointless for !__GFP_DIRECT_RECLAIM allocation.

Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot <syzbot+fe0c72f0ccbb93786380@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/6577e1fa-b6ee-f2be-2414-a2b51b1c5e30@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=fe0c72f0ccbb93786380
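A diff-style sketch of the flag change described above; the exact shape of the surrounding line in fill_pool() is an assumption:

    -	gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
    +	gfp_t gfp = __GFP_HIGH | __GFP_NOWARN;

__GFP_HIGH keeps access to the atomic reserves but, unlike GFP_ATOMIC, does not include __GFP_KSWAPD_RECLAIM, so fill_pool() no longer wakes kswapd and never touches kswapd_wait::lock.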
2023-05-02  debugobjects,locking: Annotate debug_object_fill_pool() wait type violation  [Peter Zijlstra, 1 file, -2/+13]

There is an explicit wait-type violation in debug_object_fill_pool() for PREEMPT_RT=n kernels, which allows them to more easily fill the object pool and reduce the chance of allocation failures. Lockdep's wait-type checks are designed to check the PREEMPT_RT locking rules even for PREEMPT_RT=n kernels and object to this, so create a lockdep annotation to allow this to stand.

Specifically, create a 'lock' type that overrides the inner wait-type while it is held -- allowing one to temporarily raise it, such that the violation is hidden.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Qi Zheng <zhengqi.arch@bytedance.com>
Link: https://lkml.kernel.org/r/20230429100614.GA1489784@hirez.programming.kicks-ass.net
2023-05-02  debugobject: Ensure pool refill (again)  [Thomas Gleixner, 1 file, -6/+15]

The recent fix to ensure atomicity of lookup and allocation inadvertently broke the pool refill mechanism.

Prior to that change, debug_object_activate() and debug_object_assert_init() invoked debug_object_init() to set up the tracking object for statically initialized objects. That's no longer the case and debug_object_init() is now the only place which does pool refills.

Depending on the number of statically initialized objects this can be enough to actually deplete the pool, which was observed by Ido via a debugobjects OOM warning.

Restore the old behaviour by adding explicit refill opportunities to debug_object_activate() and debug_object_assert_init().

Fixes: 63a759694eed ("debugobject: Prevent init race with static objects")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/871qk05a9d.ffs@tglx
2023-04-15  debugobject: Prevent init race with static objects  [Thomas Gleixner, 1 file, -59/+66]

Statically initialized objects are usually not initialized via the init() function of the subsystem. They are special cased and the subsystem provides a function to validate whether an object which is not yet tracked by debugobjects is statically initialized. This means the object is started to be tracked on first use, e.g. activation.

This works perfectly fine, unless there are two concurrent operations on that object. Schspa decoded the problem:

  T0                                        T1
  debug_object_assert_init(addr)
    lock_hash_bucket()
    obj = lookup_object(addr);
    if (!obj) {
      unlock_hash_bucket();
      -> preemption
                                            lock_subsystem_object(addr);
                                              activate_object(addr)
                                                lock_hash_bucket();
                                                obj = lookup_object(addr);
                                                if (!obj) {
                                                  unlock_hash_bucket();
                                                  if (is_static_object(addr))
                                                    init_and_track(addr);
                                                  lock_hash_bucket();
                                                  obj = lookup_object(addr);
                                                  obj->state = ACTIVATED;
                                                }
                                                unlock_hash_bucket();

                                            subsys function modifies content
                                            of addr, so static object
                                            detection no longer works.

                                            unlock_subsystem_object(addr);

      if (is_static_object(addr))   <- Fails

      debugobject emits a warning and invokes the fixup function which
      reinitializes the already active object in the worst case.

This race exists forever, but was never observed until mod_timer() got a debug_object_assert_init() added which is outside of the timer base lock held section right at the beginning of the function to cover the lockless early exit points too.

Rework the code so that the lookup, the static object check and the tracking object association happen atomically under the hash bucket lock. This prevents the issue completely, as all callers are serialized on the hash bucket lock and therefore cannot observe inconsistent state.

Fixes: 3ac7fe5a4aab ("infrastructure to debug (dynamic) objects")
Reported-by: syzbot+5093ba19745994288b53@syzkaller.appspotmail.com
Debugged-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://syzkaller.appspot.com/bug?id=22c8a5938eab640d1c6bcc0e3dc7be519d878462
Link: https://lore.kernel.org/lkml/20230303161906.831686-1-schspa@gmail.com
Link: https://lore.kernel.org/r/87zg7dzgao.ffs@tglx
2022-12-12  Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm  [Linus Torvalds, 1 file, -0/+10]

Pull non-MM updates from Andrew Morton:

 - A ptrace API cleanup series from Sergey Shtylyov

 - Fixes and cleanups for kexec from ye xingchen

 - nilfs2 updates from Ryusuke Konishi

 - squashfs feature work from Xiaoming Ni: permit configuration of the filesystem's compression concurrency from the mount command line

 - A series from Akinobu Mita which addresses bound checking errors when writing to debugfs files

 - A series from Yang Yingliang to address rapidio memory leaks

 - A series from Zheng Yejian to address possible overflow errors in encode_comp_t()

 - And a whole shower of singleton patches all over the place

* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
  ipc: fix memory leak in init_mqueue_fs()
  hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
  rapidio: devices: fix missing put_device in mport_cdev_open
  kcov: fix spelling typos in comments
  hfs: Fix OOB Write in hfs_asc2mac
  hfs: fix OOB Read in __hfs_brec_find
  relay: fix type mismatch when allocating memory in relay_create_buf()
  ocfs2: always read both high and low parts of dinode link count
  io-mapping: move some code within the include guarded section
  kernel: kcsan: kcsan_test: build without structleak plugin
  mailmap: update email for Iskren Chernev
  eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
  rapidio: fix possible UAF when kfifo_alloc() fails
  relay: use strscpy() is more robust and safer
  cpumask: limit visibility of FORCE_NR_CPUS
  acct: fix potential integer overflow in encode_comp_t()
  acct: fix accuracy loss for input value of encode_comp_t()
  linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
  rapidio: rio: fix possible name leak in rio_register_mport()
  rapidio: fix possible name leaks when rio_add_device() fails
  ...
2022-12-02  debugobjects: Print object pointer in debug_print_object()  [Stephen Boyd, 1 file, -2/+2]

Delayed kobject debugging (CONFIG_DEBUG_KOBJECT_RELEASE) prints the kobject pointer that's being released in kobject_release() before scheduling a randomly delayed work to do the actual release work. If the caller of kobject_put() frees the kobject upon return, then this will typically emit a debugobject warning about freeing an active timer. Usually the release function is the function that does the kfree() of the struct containing the kobject. For example the following print is seen

  kobject: 'queue' (ffff888114236190): kobject_release, parent 0000000000000000 (delayed 1000)
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: kobject_delayed_cleanup+0x0/0x390

but the kobject printk cannot be matched with the debug object printk, because it could be any number of kobjects that was released around that time. The random delay for the work doesn't help either.

Print the address of the object being tracked to help figure out which kobject is the problem here. Note that this does not use %px here, to match the other %p usage in debugobject debugging. Due to %p usage it is required to disable pointer hashing to correlate the two pointer printks.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220519202201.2348343-1-swboyd@chromium.org
2022-11-15  lib/debugobjects: fix stat count and optimize debug_objects_mem_init  [wuchi, 1 file, -0/+10]

1. The variable debug_objects_allocated tracks valid kmem_cache_alloc() calls, so track it in debug_objects_replace_static_objects() as well. Do similar things in object_cpu_offline().

2. In debug_objects_mem_init(), there is no need to call cpuhp_setup_state_nocalls() when debug_objects_enabled = 0 (out of memory).

Link: https://lkml.kernel.org/r/20220611130634.99741-1-wuchi.zero@gmail.com
Fixes: 634d61f45d6f ("debugobjects: Percpu pool lookahead freeing/allocation")
Fixes: c4b73aabd098 ("debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done")
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13  debugobjects: Convert to SPDX license identifier  [Thomas Gleixner, 1 file, -4/+1]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/87v8udpy3u.ffs@tglx
2021-08-13  debugobjects: Make them PREEMPT_RT aware  [Thomas Gleixner, 1 file, -1/+6]

On PREEMPT_RT enabled kernels it is not possible to refill the object pool from atomic context (preemption or interrupts disabled) as the allocator might acquire 'sleeping' spinlocks. Guard the invocation of fill_pool() accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/87sfzehdnl.ffs@tglx
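A sketch of the guard pattern the changelog describes; the wrapper name is illustrative:

    /*
     * fill_pool() may allocate, and allocation can sleep on PREEMPT_RT
     * because the allocator uses 'sleeping' spinlocks there. Only refill
     * from preemptible context on RT kernels; PREEMPT_RT=n keeps the old
     * unconditional behaviour.
     */
    static void debug_objects_try_fill_pool(void)
    {
        if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible())
            fill_pool();
    }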
2020-10-01  debugobjects: Free per CPU pool after CPU unplug  [Zqiang, 1 file, -0/+25]

If a CPU is offlined, the debug objects per CPU pool is not cleaned up. If the CPU is never onlined again, then the objects in the pool are wasted.

Add a CPU hotplug callback which is invoked after the CPU is dead to free the pool.

[ tglx: Massaged changelog and added comment about remote access safety ]

Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20200908062709.11441-1-qiang.zhang@windriver.com
2020-09-24  debugobjects: Allow debug_obj_descr to be const  [Stephen Boyd, 1 file, -15/+15]

The debugobject core could be slightly harder to corrupt if the debug_obj_descr would be a pointer to const memory. Depending on the architecture, const data structures are placed into read-only memory and thus are harder to corrupt or hijack.

This descriptor is used to fix up stuff like timers and workqueues when core kernel data structures are busted, so moving the descriptors to read-only memory will make debugobjects more resilient to something going wrong and then corrupting the function pointers inside struct debug_obj_descr.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200815004027.2046113-2-swboyd@chromium.org
2020-07-17  debugobjects: Convert to DEFINE_SHOW_ATTRIBUTE  [Qinglang Miao, 1 file, -12/+1]

Use the DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

[ tglx: Disentangled it from the mess in -next ]

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hch@lst.de
Link: https://lkml.kernel.org/r/20200716084747.8034-1-miaoqinglang@huawei.com
2020-01-17  debugobjects: Fix various data races  [Marco Elver, 1 file, -21/+25]

The counters obj_pool_free and obj_nr_tofree, and the flag obj_freeing, are read locklessly outside the pool_lock critical sections. If read with plain accesses, this would result in data races. This is addressed as follows:

 * reads outside critical sections become READ_ONCE()s (pairing with WRITE_ONCE()s added);

 * writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since writes happen inside critical sections, only the write and not the read of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is sufficient.

The data races were reported by KCSAN:

  BUG: KCSAN: data-race in __free_object / fill_pool

  write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1:
   __free_object+0x1ee/0x8e0 lib/debugobjects.c:404
   __debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969
   debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994
   slab_free_hook mm/slub.c:1422 [inline]

  read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2:
   fill_pool+0x3d/0x520 lib/debugobjects.c:135
   __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
   debug_object_init lib/debugobjects.c:591 [inline]
   debug_object_activate+0x228/0x320 lib/debugobjects.c:677
   debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]

  BUG: KCSAN: data-race in __debug_object_init / fill_pool

  read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6:
   fill_pool+0x3d/0x520 lib/debugobjects.c:135
   __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
   debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606
   init_timer_on_stack_key kernel/time/timer.c:742 [inline]

  write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3:
   alloc_object lib/debugobjects.c:258 [inline]
   __debug_object_init+0x717/0x810 lib/debugobjects.c:544
   debug_object_init lib/debugobjects.c:591 [inline]
   debug_object_activate+0x228/0x320 lib/debugobjects.c:677
   debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]

  BUG: KCSAN: data-race in free_obj_work / free_object

  read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6:
   free_object+0x4b/0xd0 lib/debugobjects.c:426
   debug_object_free+0x190/0x210 lib/debugobjects.c:824
   destroy_timer_on_stack kernel/time/timer.c:749 [inline]

  write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1:
   free_obj_work+0x24f/0x480 lib/debugobjects.c:313
   process_one_work+0x454/0x8d0 kernel/workqueue.c:2264
   worker_thread+0x9a/0x780 kernel/workqueue.c:2410

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com
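A sketch of the marked-access pattern the changelog describes, using counters named in the report; the function shapes are simplified assumptions, not the full kernel functions:

    static void obj_pool_free_inc(void)
    {
        unsigned long flags;

        raw_spin_lock_irqsave(&pool_lock, flags);
        /*
         * Writes happen under pool_lock: only the store of the RMW needs
         * to be marked, the plain read is serialized by the lock.
         */
        WRITE_ONCE(obj_pool_free, obj_pool_free + 1);
        raw_spin_unlock_irqrestore(&pool_lock, flags);
    }

    static bool pool_should_refill(void)
    {
        /* Lockless read outside the critical section pairs with the
         * WRITE_ONCE() above.
         */
        return READ_ONCE(obj_pool_free) < debug_objects_pool_min_level;
    }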
2019-06-14  debugobjects: Move printk out of db->lock critical sections  [Waiman Long, 1 file, -19/+39]

The db->lock is a raw spinlock, so the lock hold time is supposed to be short. This will not be the case when printk() is involved in some of the critical sections. In order to avoid the long hold time, in case some messages need to be printed, the debug_object_is_on_stack() and debug_print_object() calls are now moved out of those critical sections.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-6-longman@redhat.com
2019-06-14  debugobjects: Less aggressive freeing of excess debug objects  [Waiman Long, 1 file, -12/+49]

After a system bootup and 3 parallel kernel builds, a partial output of the debug objects stats file was:

  pool_free     :5101
  pool_pcp_free :4181
  pool_min_free :220
  pool_used     :104172
  pool_max_used :171920
  on_free_list  :0
  objs_allocated:39268280
  objs_freed    :39160031

More than 39 million debug objects had been allocated and then freed, while pool_max_used was only about 172k. So this is a lot of extra overhead in freeing and allocating objects from slabs. It may also cause the slabs to be more fragmented and harder to reclaim.

Make the freeing of excess debug objects less aggressive by freeing them at a maximum frequency of 10Hz and about 1k objects at each round of freeing.

With that change applied, the partial output of the debug objects stats file after similar actions became:

  pool_free     :5901
  pool_pcp_free :3742
  pool_min_free :1022
  pool_used     :104805
  pool_max_used :168081
  on_free_list  :0
  objs_allocated:5796864
  objs_freed    :5687182

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-5-longman@redhat.com
2019-06-14  debugobjects: Reduce number of pool_lock acquisitions in fill_pool()  [Waiman Long, 1 file, -8/+16]

In fill_pool(), the pool_lock is acquired and then released once per debug object. If many objects are to be filled, the constant lock and unlock operations are extra overhead. To reduce the overhead, batch them up and do an allocation of 4 objects per lock/unlock sequence.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-4-longman@redhat.com
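A sketch of the batched variant; the batch size matches the changelog, but the macro name, allocation flags and surrounding details are simplified assumptions (counter updates omitted):

    #define ODEBUG_FILL_BATCH    4    /* illustrative; 4 objects per changelog */

    static void fill_pool_one_batch(void)
    {
        struct debug_obj *new[ODEBUG_FILL_BATCH];
        unsigned long flags;
        int cnt;

        /* Allocate up to a batch worth of objects outside the lock */
        for (cnt = 0; cnt < ODEBUG_FILL_BATCH; cnt++) {
            new[cnt] = kmem_cache_zalloc(obj_cache, GFP_ATOMIC | __GFP_NOWARN);
            if (!new[cnt])
                break;
        }
        if (!cnt)
            return;

        /* One lock/unlock sequence per batch instead of per object */
        raw_spin_lock_irqsave(&pool_lock, flags);
        while (cnt > 0)
            hlist_add_head(&new[--cnt]->node, &obj_pool);
        raw_spin_unlock_irqrestore(&pool_lock, flags);
    }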
2019-06-14  debugobjects: Percpu pool lookahead freeing/allocation  [Waiman Long, 1 file, -6/+69]

Most workloads will allocate a bunch of memory objects, work on them and then free all or most of them. So just having a percpu free pool may not reduce the pool_lock contention significantly if a large number of objects are being used.

To help those situations, do lookahead allocation and freeing of the debug objects into and out of the percpu free pool. This will hopefully reduce the number of times the pool_lock needs to be taken and hence its contention level.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-3-longman@redhat.com
2019-06-14  debugobjects: Add percpu free pools  [Waiman Long, 1 file, -24/+91]

When a multi-threaded workload does a lot of small memory object allocations and deallocations, it may cause the allocation and freeing of many debug objects. This will make the global pool_lock a bottleneck in the performance of the workload. Since interrupts are disabled when acquiring the pool_lock, it may even cause hard lockups to happen.

To reduce contention of the global pool_lock, add a percpu debug object free pool that can be used to buffer some of the debug object allocation and freeing requests without acquiring the pool_lock. Each CPU will now have a percpu free pool that can hold up to a maximum of 64 debug objects. Allocation and freeing requests will go to the percpu free pool first. If that fails, the pool_lock will be taken and the global free pool will be used.

The presence or absence of obj_cache is used as a marker to see if the percpu cache should be used.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-2-longman@redhat.com
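A sketch of the per-CPU fast path for freeing; the structure shape follows the changelog (64 objects per CPU), but the helper and the surrounding accounting are simplified assumptions:

    struct debug_percpu_free {
        struct hlist_head   free_objs;
        int                 obj_free;
    };

    static DEFINE_PER_CPU(struct debug_percpu_free, percpu_obj_pool);

    /* Called with interrupts disabled; the local pool needs no pool_lock */
    static bool __free_to_percpu_pool(struct debug_obj *obj)
    {
        struct debug_percpu_free *percpu_pool = this_cpu_ptr(&percpu_obj_pool);

        if (percpu_pool->obj_free >= 64)
            return false;    /* caller falls back to pool_lock + global pool */

        hlist_add_head(&obj->node, &percpu_pool->free_objs);
        percpu_pool->obj_free++;
        return true;
    }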
2019-06-14  debugobjects: No need to check return value of debugfs_create()  [Greg Kroah-Hartman, 1 file, -12/+2]

When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qian Cai <cai@gmx.us>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Waiman Long <longman@redhat.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190612153513.GA21082@kroah.com
2018-12-28  debugobjects: call debug_objects_mem_init earlier  [Qian Cai, 1 file, -5/+3]

The current value of the early boot static pool size, 1024, is not big enough for systems with a large number of CPUs with timer or/and workqueue objects selected. As a result, systems with 60+ CPUs and both timer and workqueue objects enabled could trigger "ODEBUG: Out of memory. ODEBUG disabled".

Some debug objects are allocated during the early boot. Enabling some options like timers or workqueue objects may increase the size required significantly with a large number of CPUs. For example:

  CONFIG_DEBUG_OBJECTS_TIMERS:

  No. CPUs x 2 (worker pool) objects:
    start_kernel
      workqueue_init_early
        init_worker_pool
          init_timer_key
            debug_object_init

  plus No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
    sched_init
      hrtick_rq_init
        hrtimer_init

  CONFIG_DEBUG_OBJECTS_WORK:

  No. CPUs objects:
    vmalloc_init
      __init_work

  plus No. CPUs x 6 (workqueue) objects:
    workqueue_init_early
      alloc_workqueue
        __alloc_workqueue_key
          alloc_and_link_pwqs
            init_pwq

  Also, plus No. CPUs objects:
    perf_event_init
      __init_srcu_struct
        init_srcu_struct_fields
          init_srcu_struct_nodes
            __init_work

However, none of these things are actually used or required before debug_objects_mem_init() is invoked, so just move the call right before vmalloc_init().

According to tglx, "the reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today."

Link: http://lkml.kernel.org/r/20181126102407.1836-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-30  debugobjects: avoid recursive calls with kmemleak  [Qian Cai, 1 file, -3/+2]

CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls:

  fill_pool
    kmemleak_ignore
      make_black_object
        put_object
          __call_rcu (kernel/rcu/tree.c)
            debug_rcu_head_queue
              debug_object_activate
                debug_object_init
                  fill_pool
                    kmemleak_ignore
                      make_black_object
                        ...

So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly allocated debug objects at all.

Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02  debugobjects: Remove redundant NULL pointer check  [Zhong Jiang, 1 file, -2/+1]

kmem_cache_destroy() has a built-in NULL pointer check, so the check at the call site can be removed.

Signed-off-by: Zhong Jiang <zhongjiang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <longman@redhat.com>
Cc: <arnd@arndb.de>
Cc: <yang.shi@linux.alibaba.com>
Link: https://lkml.kernel.org/r/1533054298-35824-1-git-send-email-zhongjiang@huawei.com
2018-07-30  debugobjects: Make stack check warning more informative  [Joel Fernandes (Google), 1 file, -2/+5]

While debugging an issue, debugobject tracking warned about an annotation issue of an object on stack. It turned out that the issue was due to the object in question being on a different stack, which was due to another issue.

Thomas suggested to print the pointers and the location of the stack for the currently running task. This helped to figure out that the object was on the wrong stack. As this is generally useful information for debugging similar issues, make the error message more informative by printing the pointers.

[ tglx: Massaged changelog ]

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: kernel-team@android.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: astrachan@google.com
Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org
2018-03-14  debugobjects: Avoid another unused variable warning  [Arnd Bergmann, 1 file, -1/+1]

debug_objects_maxchecked is only updated in __debug_check_no_obj_freed() and only read in debug_stats_show(). Unfortunately both of these are optional and depend on different Kconfig symbols. When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled, this warning is emitted:

  lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable]

Rather than trying to add more complex #ifdef protections, mark the variable as __maybe_unused so it can be silently dropped when unused.

Fixes: bd9dcd046509 ("debugobjects: Export max loops counter")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
2018-02-22  debugobjects: Fix debug_objects_freed accounting  [Arnd Bergmann, 1 file, -0/+1]

The removal of the batched object freeing has caused debug_objects_freed to become read-only, and the reading is inside an ifdef, so gcc warns that it is completely unused without CONFIG_DEBUG_FS:

  lib/debugobjects.c:71:14: error: 'debug_objects_freed' defined but not used [-Werror=unused-variable]

Assuming we are still interested in this number, this adds back code to keep track of the freed objects.

Fixes: 636e1970fd7d ("debugobjects: Use global free list in free_object()")
Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180222155335.1647466-1-arnd@arndb.de
2018-02-13  debugobjects: Use global free list in __debug_check_no_obj_freed()  [Yang Shi, 1 file, -9/+7]

__debug_check_no_obj_freed() iterates over the to-be-freed memory region in chunks and iterates over the corresponding hash bucket list for each chunk. This can accumulate to hundreds of thousands of checked objects. In the worst case this can trigger the soft lockup detector:

  NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s!
  CPU: 15 PID: 110342 Comm: stress-ng-getde
  Call Trace:
   [<ffffffff8141177e>] debug_check_no_obj_freed+0x13e/0x220
   [<ffffffff811f8751>] __free_pages_ok+0x1f1/0x5c0
   [<ffffffff811fa785>] __free_pages+0x25/0x40
   [<ffffffff812638db>] __free_slab+0x19b/0x270
   [<ffffffff812639e9>] discard_slab+0x39/0x50
   [<ffffffff812679f7>] __slab_free+0x207/0x270
   [<ffffffff81269966>] ___cache_free+0xa6/0xb0
   [<ffffffff8126c267>] qlist_free_all+0x47/0x80
   [<ffffffff8126c5a9>] quarantine_reduce+0x159/0x190
   [<ffffffff8126b3bf>] kasan_kmalloc+0xaf/0xc0
   [<ffffffff8126b8a2>] kasan_slab_alloc+0x12/0x20
   [<ffffffff81265e8a>] kmem_cache_alloc+0xfa/0x360
   [<ffffffff812abc8f>] ? getname_flags+0x4f/0x1f0
   [<ffffffff812abc8f>] getname_flags+0x4f/0x1f0
   [<ffffffff812abe42>] getname+0x12/0x20
   [<ffffffff81298da9>] do_sys_open+0xf9/0x210
   [<ffffffff81298ede>] SyS_open+0x1e/0x20
   [<ffffffff817d6e01>] entry_SYSCALL_64_fastpath+0x1f/0xc2

The code path might be called in either atomic or non-atomic context, but in_atomic() can't tell if the current context is atomic or not on a PREEMPT=n kernel, so cond_resched() can't be used to prevent the softlockup.

Utilize the global free list to shorten the loop execution time.

[ tglx: Massaged changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com
2018-02-13  debugobjects: Use global free list in free_object()  [Yang Shi, 1 file, -41/+22]

The newly added global free list allows avoiding lengthy pool_list iterations in free_obj_work() by putting objects either into the pool list, when the fill level of the pool is below the maximum, or by putting them on the global free list immediately.

As the pool is now guaranteed to never exceed the maximum fill level, this allows removing the batch removal from the pool list in free_obj_work().

Split free_object() into two parts, so the actual queueing function can be reused without invoking schedule_work() on every invocation.

[ tglx: Remove the batch removal from pool list and massage changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com
2018-02-13  debugobjects: Add global free list and the counter  [Yang Shi, 1 file, -1/+57]

free_object() adds objects to the pool list and schedules work when the pool list is larger than the pool size. The worker handles the actual kfree() of the object by iterating the pool list until the pool size is below the maximum pool size again.

To iterate the pool list, pool_lock has to be held and the objects which should be freed need to be put into temporary storage so pool_lock can be dropped for the actual kmem_cache_free() invocation. That's a pointless and expensive exercise if there is a large number of objects to free.

In such a case it's better to evaluate the fill level of the pool in free_objects() and queue the object to free either in the pool list or, if it's full, on a separate global free list.

The worker can then do the following simpler operation:

 - Move objects back from the global free list to the pool list if the pool list is no longer full.

 - Remove the remaining objects in a single list move operation from the global free list and do the kmem_cache_free() operation lockless from the temporary list head.

In fill_pool() the global free list is checked as well to avoid real allocations from the kmem cache.

Add the necessary list head and a counter for the number of objects on the global free list and export that counter via sysfs:

  max_chain     :79
  max_loops     :8147
  warnings      :0
  fixups        :0
  pool_free     :1697
  pool_min_free :346
  pool_used     :15356
  pool_max_used :23933
  on_free_list  :39
  objs_allocated:32617
  objs_freed    :16588

Nothing queues objects on the global free list yet. This happens in a follow up change.

[ tglx: Simplified implementation and massaged changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com
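A condensed sketch of the worker operation the two bullets above describe; the global names follow the changelog and lib/debugobjects.c, but the body is a simplified assumption, not the verbatim kernel function:

    static void free_obj_work(struct work_struct *work)
    {
        struct hlist_node *tmp;
        struct debug_obj *obj;
        unsigned long flags;
        HLIST_HEAD(tofree);

        raw_spin_lock_irqsave(&pool_lock, flags);
        /* 1) Move objects back to the pool while it is below the maximum */
        while (obj_nr_tofree && obj_pool_free < debug_objects_pool_size) {
            obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
            hlist_del(&obj->node);
            hlist_add_head(&obj->node, &obj_pool);
            obj_nr_tofree--;
            obj_pool_free++;
        }
        /* 2) Take the remainder in a single list-move operation */
        hlist_move_list(&obj_to_free, &tofree);
        obj_nr_tofree = 0;
        raw_spin_unlock_irqrestore(&pool_lock, flags);

        /* 3) kmem_cache_free() without holding pool_lock */
        hlist_for_each_entry_safe(obj, tmp, &tofree, node) {
            hlist_del(&obj->node);
            kmem_cache_free(obj_cache, obj);
        }
    }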
2018-02-13  debugobjects: Export max loops counter  [Yang Shi, 1 file, -1/+8]

__debug_check_no_obj_freed() can be an expensive operation depending on the size of the memory freed. It already exports the maximum chain walk length via debugfs, but this only records the maximum of a single memory chunk. There is no information about the total number of objects inspected for a __debug_check_no_obj_freed() operation, which might be significantly larger when a huge memory region is freed.

Aggregate the number of objects inspected for a single invocation of __debug_check_no_obj_freed() and export it via sysfs. The resulting output of /sys/kernel/debug/debug_objects/stats looks like:

  max_chain     :121
  max_checked   :543267
  warnings      :0
  fixups        :0
  pool_free     :1764
  pool_min_free :341
  pool_used     :86438
  pool_max_used :268887
  objs_allocated:6068254
  objs_freed    :5981076

[ tglx: Renamed the variable to max_checked and adjusted changelog ]

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com
2017-08-14  debugobjects: Make kmemleak ignore debug objects  [Waiman Long, 1 file, -0/+3]

The allocated debug objects are either on the free list or in the hashed bucket lists, so they won't get lost. However, if both debug objects and kmemleak are enabled and kmemleak scanning is done while some of the debug objects are transitioning from one list to the other, false positive reporting of memory leaks may happen for those objects. For example:

  [38687.275678] kmemleak: 12 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

  unreferenced object 0xffff92e98aabeb68 (size 40):
    comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff  ................
      01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff  ........86.a....
    backtrace:
      [<ffffffff8fa5378a>] kmemleak_alloc+0x4a/0xa0
      [<ffffffff8f47c019>] kmem_cache_alloc+0xe9/0x320
      [<ffffffff8f62ed96>] __debug_object_init+0x3e6/0x400
      [<ffffffff8f62ef01>] debug_object_activate+0x131/0x210
      [<ffffffff8f330d9f>] __call_rcu+0x3f/0x400
      [<ffffffff8f33117d>] call_rcu_sched+0x1d/0x20
      [<ffffffff8f4a183c>] put_object+0x2c/0x40
      [<ffffffff8f4a188c>] __delete_object+0x3c/0x50
      [<ffffffff8f4a18bd>] delete_object_full+0x1d/0x20
      [<ffffffff8fa535c2>] kmemleak_free+0x32/0x80
      [<ffffffff8f47af07>] kmem_cache_free+0x77/0x350
      [<ffffffff8f453912>] unlink_anon_vmas+0x82/0x1e0
      [<ffffffff8f440341>] free_pgtables+0xa1/0x110
      [<ffffffff8f44af91>] exit_mmap+0xc1/0x170
      [<ffffffff8f29db60>] mmput+0x80/0x150
      [<ffffffff8f2a7609>] do_exit+0x2a9/0xd20

The references in the debug objects may also hide a real memory leak.

As there is no point in having kmemleak track debug object allocations, kmemleak checking is now disabled for debug objects.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com
2017-03-02  sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h>  [Ingo Molnar, 1 file, -0/+1]

We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task_stack.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-10  debugobjects: Improve variable naming  [Waiman Long, 1 file, -5/+5]

As suggested by Ingo, the debug_objects_alloc counter is renamed to debug_objects_allocated, with a minor twist in the comment and debug output.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1486503630-1501-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-05debugobjects: Reduce contention on the global pool_lockWaiman Long1-9/+23
On a large SMP system with many CPUs, the global pool_lock may become a performance bottleneck as all the CPUs that need to allocate or free debug objects have to take the lock. That can sometimes cause soft lockups like: NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21] ... RIP: 0010:[<ffffffff817c216b>] [<ffffffff817c216b>] _raw_spin_unlock_irqrestore+0x3b/0x60 ... Call Trace: [<ffffffff813f40d1>] free_object+0x81/0xb0 [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff81284996>] ? file_free_rcu+0x36/0x60 [<ffffffff81251712>] kmem_cache_free+0xd2/0x380 [<ffffffff81284960>] ? fput+0x90/0x90 [<ffffffff81284996>] file_free_rcu+0x36/0x60 [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550 [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550 [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50 [<ffffffff810c59d1>] kthread+0x101/0x120 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff817c2d32>] ret_from_fork+0x22/0x50 To reduce the contention on the pool_lock, the actual kmem_cache_free() of the debug objects is delayed if the pool_lock is busy. This temporarily increases the number of free objects available in the free pool when the system is busy. As a result, the number of kmem_cache allocations and frees is reduced. To further reduce lock operations, free debug objects in batches of four, as sketched below. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
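For illustration, a minimal sketch of the trylock-based batch worker; the pool_lock, obj_pool, obj_cache and counter names are taken from lib/debugobjects.c, but this is not the verbatim patch:

	#define ODEBUG_FREE_BATCH	4

	static void free_obj_work(struct work_struct *work)
	{
		struct debug_obj *objs[ODEBUG_FREE_BATCH];
		unsigned long flags;
		int i;

		/* Back off instead of contending with hot allocation paths. */
		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
			return;
		while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
			/* Detach a batch of four objects from the pool ... */
			for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
				objs[i] = hlist_entry(obj_pool.first,
						      typeof(*objs[0]), node);
				hlist_del(&objs[i]->node);
			}
			obj_pool_free -= ODEBUG_FREE_BATCH;
			debug_objects_freed += ODEBUG_FREE_BATCH;
			/* ... and hand them to the slab cache with the lock dropped. */
			raw_spin_unlock_irqrestore(&pool_lock, flags);
			for (i = 0; i < ODEBUG_FREE_BATCH; i++)
				kmem_cache_free(obj_cache, objs[i]);
			if (!raw_spin_trylock_irqsave(&pool_lock, flags))
				return;
		}
		raw_spin_unlock_irqrestore(&pool_lock, flags);
	}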
2017-02-04debugobjects: Scale thresholds with # of CPUsWaiman Long1-5/+15
On large SMP systems with hundreds of CPUs, the current thresholds for allocating and freeing debug objects (256 and 1024 respectively) may not work well. This can cause a lot of needless calls to kmem_cache_alloc() and kmem_cache_free() on those systems. To alleviate this thrashing problem, the object freeing threshold is now increased to "1024 + # of CPUs * 32", and the object allocation threshold to "256 + # of CPUs * 4" (see the sketch below). That should make the debug objects subsystem scale better with the number of CPUs available in the system. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
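A hedged sketch of the scaling, assuming the watermarks are adjusted in debug_objects_mem_init() (variable names per lib/debugobjects.c):

	static int debug_objects_pool_size __read_mostly = ODEBUG_POOL_SIZE;
	static int debug_objects_pool_min_level __read_mostly = ODEBUG_POOL_MIN_LEVEL;

	void __init debug_objects_mem_init(void)
	{
		/* Scale the free/alloc watermarks with the possible CPU count. */
		debug_objects_pool_size += num_possible_cpus() * 32;
		debug_objects_pool_min_level += num_possible_cpus() * 4;
	}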
2017-02-04debugobjects: Track number of kmem_cache_alloc/kmem_cache_free doneWaiman Long1-0/+10
New debugfs stat counters are added to track the numbers of kmem_cache_alloc() and kmem_cache_free() function calls to get a sense of how the internal debug objects cache management is performing. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-13Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-1/+1
Pull workqueue updates from Tejun Heo: "Mostly patches to initialize the workqueue subsystem earlier and get rid of keventd_up(). The patches were headed for the last merge cycle but got delayed due to a bug found at the last minute, which is fixed now. Also, to help debugging, destroy_workqueue() is more chatty now on a sanity check failure." * 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: move wq_numa_init() to workqueue_init() workqueue: remove keventd_up() debugobj, workqueue: remove keventd_up() usage slab, workqueue: remove keventd_up() usage power, workqueue: remove keventd_up() usage tty, workqueue: remove keventd_up() usage mce, workqueue: remove keventd_up() usage workqueue: make workqueue available early during boot workqueue: dump workqueue state on sanity check failures in destroy_workqueue()
2016-11-30lib/debugobjects: export for use in modulesChris Wilson1-0/+8
Drivers, or other modules, that use a mixture of objects (especially objects embedded within other objects) would like to take advantage of the debugobjects facilities to help catch misuse. Currently, the debugobjects interface is only available to builtin drivers and requires a set of EXPORT_SYMBOL_GPL for use by modules. I am using the debugobjects in i915.ko to try and catch some invalid operations on embedded objects. The problem currently only presents itself across module unload so forcing i915 to be builtin is not an option. Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Du, Changbin" <changbin.du@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
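Judging by the 1-0/+8 diffstat, the exported set presumably covers the public debugobjects API, along the lines of:

	EXPORT_SYMBOL_GPL(debug_object_init);
	EXPORT_SYMBOL_GPL(debug_object_init_on_stack);
	EXPORT_SYMBOL_GPL(debug_object_activate);
	EXPORT_SYMBOL_GPL(debug_object_deactivate);
	EXPORT_SYMBOL_GPL(debug_object_destroy);
	EXPORT_SYMBOL_GPL(debug_object_free);
	EXPORT_SYMBOL_GPL(debug_object_assert_init);
	EXPORT_SYMBOL_GPL(debug_object_active_state);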
2016-09-17debugobj, workqueue: remove keventd_up() usageTejun Heo1-1/+1
Now that workqueue can handle work item queueing from very early during boot, there is no need to gate schedule_work() while !keventd_up(). Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org>
2016-05-19debugobjects: insulate non-fixup logic related to static obj from fixup ↵Du, Changbin1-17/+32
callbacks When activating a static object we need make sure that the object is tracked in the object tracker. If it is a non-static object then the activation is illegal. In previous implementation, each subsystem need take care of this in their fixup callbacks. Actually we can put it into debugobjects core. Thus we can save duplicated code, and have *pure* fixup callbacks. To achieve this, a new callback "is_static_object" is introduced to let the type specific code decide whether a object is static or not. If yes, we take it into object tracker, otherwise give warning and invoke fixup callback. This change has paassed debugobjects selftest, and I also do some test with all debugobjects supports enabled. At last, I have a concern about the fixups that can it change the object which is in incorrect state on fixup? Because the 'addr' may not point to any valid object if a non-static object is not tracked. Then Change such object can overwrite someone's memory and cause unexpected behaviour. For example, the timer_fixup_activate bind timer to function stub_timer. Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com [changbin.du@intel.com: improve code comments where invoke the new is_static_object callback] Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19debugobjects: correct the usage of fixup call resultsDu, Changbin1-1/+1
debug_object_fixup() returns non-zero when the problem has been fixed. But the code got it backwards: it took 0 as a successful fixup. So fix it. Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19debugobjects: make fixup functions return bool instead of intDu, Changbin1-22/+21
I am going to introduce the debugobjects infrastructure to the USB subsystem. But before this, I found that the debugobjects code could be improved. This patchset makes the fixup functions return bool instead of int, because a fixup only needs to report success or failure; boolean is the 'real' type. This patch (of 7): The object debugging infrastructure core provides some fixup callbacks for the subsystems which use it. These callbacks are called from the debug code whenever a problem in debug_object_init is detected. The debugobjects core expects them to return 1 when the fixup was successful, otherwise 0. So the return type is boolean. A bad thing is that debug_object_fixup() uses the return value for arithmetic operations. It confused me as to what the real return type is. Reading over the whole code, I found some places using the return value incorrectly (see next patch). So why not use bool instead? A sketch of a converted callback is shown below. Signed-off-by: Du, Changbin <changbin.du@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Triplett <josh@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
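A hedged example of a new-style bool fixup, modeled on the timer callbacks (illustrative, not the exact patch):

	static bool timer_fixup_init(void *addr, enum debug_obj_state state)
	{
		struct timer_list *timer = addr;

		switch (state) {
		case ODEBUG_STATE_ACTIVE:
			/* Stop the timer, then redo the init. */
			del_timer_sync(timer);
			debug_object_init(timer, &timer_debug_descr);
			return true;	/* fixup performed */
		default:
			return false;	/* nothing to fix */
		}
	}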
2016-01-27debugobjects: Allow bigger number of early boot objectsChristian Borntraeger1-1/+1
On my bigger s390 systems I always get "Out of memory. ODEBUG disabled". Since the number of objects is needed at compile time, we cannot change the size dynamically before the caches etc. are available. Doubling the size seems to do the trick. Since it is init data it will be freed anyway, this should be ok. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: http://lkml.kernel.org/r/1453905478-13409-1-git-send-email-borntraeger@de.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-06-04lib/debugobjects.c: convert printk(KERN_DEBUG to pr_debugFabian Frederick1-2/+2
Direct conversion of one KERN_DEBUG message without a DEBUG definition (suggested by Josh Triplett). That message will now be disabled by default. (see Documentation/CodingStyle Chapter 13) Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04lib/debugobjects.c: add pr_fmt to loggingFabian Frederick1-5/+8
Add ODEBUG: prefix to pr_fmt Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
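The added prefix boils down to one define ahead of the includes:

	/* Must come before any #include that pulls in printk users. */
	#define pr_fmt(fmt) "ODEBUG: " fmt

With that in place, e.g. pr_warn("Out of memory. ODEBUG disabled\n") is emitted as "ODEBUG: Out of memory. ODEBUG disabled".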
2014-06-04lib/debugobjects.c: convert printk to pr_foo()Fabian Frederick1-7/+5
Convert all printk to pr_foo() except KERN_DEBUG (see Documentation/CodingStyle Chapter 13) Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13lib/debugobjects.c: remove unnecessary work pending testXie XiuQi1-1/+1
Remove the unnecessary work pending test before calling schedule_work(). It has already been tested in queue_work_on(). No functional change. Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com> Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-18debugobjects: Make debug_object_activate() return statusPaul E. McKenney1-6/+14
In order to better respond to things like duplicate invocations of call_rcu(), RCU needs to see the status of a call to debug_object_activate(). This would allow RCU to leak the callback in order to avoid adding freelist-reuse mischief to the duplicate invocations. This commit therefore makes debug_object_activate() return status, zero for success and -EINVAL for failure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
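A caller-side sketch of how the status can be used; the helper name is hypothetical and the actual RCU plumbing differs in detail:

	void call_rcu_checked(struct rcu_head *head, void (*func)(struct rcu_head *))
	{
		if (debug_object_activate(head, &rcuhead_debug_descr)) {
			/* Probable double call_rcu(): leak the callback. */
			WARN_ONCE(1, "double call_rcu() or callback corruption");
			return;
		}
		head->func = func;
		/* ... enqueue the callback as usual ... */
	}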
2013-02-27hlist: drop the node parameter from iteratorsSasha Levin1-11/+10
I'm not sure why, but the hlist for-each-entry iterators were conceived differently from the list one: list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only do they not really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small number of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foundation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
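The net effect on a typical debugobjects lookup, as a sketch (lookup_object() per lib/debugobjects.c):

	static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
	{
		struct debug_obj *obj;

		/*
		 * Pre-2013 this needed a scratch 'struct hlist_node *node':
		 *	hlist_for_each_entry(obj, node, &b->list, node)
		 * Now it reads just like list_for_each_entry():
		 */
		hlist_for_each_entry(obj, &b->list, node) {
			if (obj->object == addr)
				return obj;
		}
		return NULL;
	}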
2012-04-18debugobjects: Fill_pool() returns void nowDan Carpenter1-1/+1
There was a return missed in 1fda107d44 "debugobjects: Remove unused return value from fill_pool()". It makes gcc complain: lib/debugobjects.c: In function ‘fill_pool’: lib/debugobjects.c:98:4: warning: ‘return’ with a value, in function returning void [enabled by default] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: http://lkml.kernel.org/r/20120418112810.GA2669@elgon.mountain Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-11debugobjects: printk with irqs enabledThomas Gleixner1-1/+1
No point in keeping interrupts disabled here. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-11debugobjects: Remove unused return value from fill_pool()Thomas Gleixner1-4/+3
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-05debugobjects: Fix selftest for static warningsStephen Boyd1-11/+3
debugobjects is now printing a warning when a fixup for a NOTAVAILABLE object is run. This causes the selftest to fail like: ODEBUG: selftest warnings failed 4 != 5 We could just increase the number of warnings that the selftest is expecting to see because that is actually what has changed. But, it turns out that fixup_activate() was written with inverted logic and thus a fixup for a static object returned 1 indicating the object had been fixed, and 0 otherwise. Fix the logic to be correct and update the counts to reflect that nothing needed fixing for a static object. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-23debugobjects: Extend to assert that an object is initializedChristine Chan1-0/+38
Calling del_timer_sync() on an uninitialized timer leads to a never ending loop in lock_timer_base() that spins checking for a non-NULL timer base. Add an assertion to debugobjects to catch usage of uninitialized objects so that we can initialize timers in the del_timer_sync() path before it calls lock_timer_base(). [ sboyd@codeaurora.org: Clarify commit message ] Signed-off-by: Christine Chan <cschan@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1320724108-20788-3-git-send-email-sboyd@codeaurora.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
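A sketch of the intended use in the timer code, assuming a timer_debug_descr descriptor as in kernel/timer.c:

	static inline void debug_assert_init(struct timer_list *timer)
	{
		/*
		 * Warn (and let the fixup initialize the timer) instead of
		 * spinning forever in lock_timer_base() on an uninitialized
		 * timer.
		 */
		debug_object_assert_init(timer, &timer_debug_descr);
	}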
2011-11-23debugobjects: Be smarter about static objectsStephen Boyd1-4/+12
Make debugobjects use the return code from the fixup function. That allows us better diagnostics in the activate check than relying on a WARN_ON() in the object specific code. [ tglx@linutronix.de: Split out the debugobjects vs. the timer change ] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Christine Chan <cschan@codeaurora.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1320724108-20788-2-git-send-email-sboyd@codeaurora.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-20debugobjects: Fix boot crash when kmemleak and debugobjects enabledMarcin Slusarz1-1/+1
The order of initialization looks like this: ... debugobjects kmemleak ...(lots of other subsystems)... workqueues (through early initcall) ... debugobjects uses schedule_work for batch freeing of its data and kmemleak heavily uses debugobjects, so when freeing happens while workqueues are not initialized yet, the kernel crashes: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a [<ffffffff81085910>] queue_work_on+0x16/0x1d [<ffffffff81085abc>] queue_work+0x29/0x55 [<ffffffff81085afb>] schedule_work+0x13/0x15 [<ffffffff81242de1>] free_object+0x90/0x95 [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3 [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9 ... because system_wq is NULL. Fix it by checking whether the workqueue subsystem was initialized before using it. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
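The guard presumably amounts to one condition in free_object(), which is sketched in full under the 2009-03-17 "delay free" entry below (keventd_up() existed at the time; it was removed later, see the 2016-09-17 entry above):

	/* Defer via the workqueue only once keventd is up. */
	if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
		sched = keventd_up() && !work_pending(&debug_obj_work);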
2011-03-08debugobjects: Add hint for better object identificationStanislaw Gruszka1-3/+6
In complex subsystems like mac80211, structures can contain several timers and work structs, so identifying a specific instance from the call trace and object type output of debugobjects can be hard. Allow the subsystems which support debugobjects to provide a hint function. This function returns a pointer to a kernel address (preferably the object's callback function) which is printed along with the debugobjects type. Add hint methods for timer_list, work_struct and hrtimer. [ tglx: Massaged changelog, made it compile ] Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> LKML-Reference: <20110307085809.GA9334@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
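For timers the hint can simply be the callback, sketched as:

	static void *timer_debug_hint(void *addr)
	{
		/* Printed next to the object type in the warning. */
		return ((struct timer_list *) addr)->function;
	}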
2010-05-18Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-3/+56
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits) rcu: remove all rcu head initializations, except on_stack initializations rcu head introduce rcu head init on stack Debugobjects transition check rcu: fix build bug in RCU_FAST_NO_HZ builds rcu: RCU_FAST_NO_HZ must check RCU dyntick state rcu: make SRCU usable in modules rcu: improve the RCU CPU-stall warning documentation rcu: reduce the number of spurious RCU_SOFTIRQ invocations rcu: permit discontiguous cpu_possible_mask CPU numbering rcu: improve RCU CPU stall-warning messages rcu: print boot-time console messages if RCU configs out of ordinary rcu: disable CPU stall warnings upon panic rcu: enable CPU_STALL_VERBOSE by default rcu: slim down rcutiny by removing rcu_scheduler_active and friends rcu: refactor RCU's context-switch handling rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk rcu: shrink rcutiny by making synchronize_rcu_bh() be inline rcu: fix now-bogus rcu_scheduler_active comments. rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality. rcu: ignore offline CPUs in last non-dyntick-idle CPU check ...
2010-05-18Merge branch 'core-debugobjects-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debugobjects: Section mismatch cleanup
2010-05-10Debugobjects transition checkMathieu Desnoyers1-3/+56
Implement a basic state machine checker in the debugobjects. This state machine checker detects races and inconsistencies within the "active" life of a debugobject. The checker only keeps track of the current state; all the state machine logic is kept at the object instance level. The checker works by adding a supplementary "unsigned int astate" field to the debug_obj structure. It keeps track of the current "active state" of the object. The only constraints that are imposed on the states by the debugobjects system are that: - activation of an object sets the current active state to 0, - deactivation of an object expects the current active state to be 0. For the rest of the states, the state mapping is determined by the specific object instance. Therefore, the logic keeping track of the state machine is within the specialized instance, without any need to know about it at the debugobject level. The current object active state is changed by calling: debug_object_active_state(addr, descr, expect, next) where "expect" is the expected state and "next" is the next state to move to if the expected state is found. A warning is generated if the expected state is not found. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David S. Miller <davem@davemloft.net> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: akpm@linux-foundation.org CC: mingo@elte.hu CC: laijs@cn.fujitsu.com CC: dipankar@in.ibm.com CC: josh@joshtriplett.org CC: dvhltc@us.ibm.com CC: niv@us.ibm.com CC: peterz@infradead.org CC: rostedt@goodmis.org CC: Valdis.Kletnieks@vt.edu CC: dhowells@redhat.com CC: eric.dumazet@gmail.com CC: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
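A usage sketch based on the RCU head states (STATE_RCU_HEAD_READY/STATE_RCU_HEAD_QUEUED as later used in the kernel's RCU code; shown here only as an example):

	static inline void debug_rcu_head_queue(struct rcu_head *head)
	{
		/* Expect READY, move to QUEUED; warns on a double enqueue. */
		debug_object_active_state(head, &rcuhead_debug_descr,
					  STATE_RCU_HEAD_READY,
					  STATE_RCU_HEAD_QUEUED);
	}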
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. 
If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-26debugobjects: Section mismatch cleanupHenrik Kretzschmar1-2/+2
This patch marks two functions, which only get called at initialization, as __init. It is also interesting that modpost doesn't report the right function name here. WARNING: lib/built-in.o(.text+0x585f): Section mismatch in reference from the function T.506() to the variable .init.data:obj The function T.506() references the variable __initdata obj. This is often because T.506 lacks a __initdata annotation or the annotation of obj is wrong. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> LKML-Reference: <1269632315-19403-1-git-send-email-henne@nachtwindheim.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-12-14debugobjects: Convert to raw_spinlocksThomas Gleixner1-37/+37
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-10-11headers: remove sched.h from interrupt.hAlexey Dobriyan1-0/+1
After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-17debugobjects: delay free of internal objectsThomas Gleixner1-12/+41
Impact: avoid recursive kfree calls, less slab activity on heavy load debugobjects checks on kfree whether tracked objects are freed. When a tracked object is freed, debugobjects frees the internal reference object as well. The debug object slab cache is marked to not recurse into debugobjects when a slab object is freed, but the recursive call can be problematic versus locking in the memory allocator. Defer the freeing of debug slab objects via schedule_work, as sketched below. The reasons not to use RCU are: 1) rcu makes the data structure larger 2) there is no real need for rcu as nothing references the obj after we freed it 3) under heavy load it is easier to reuse the to-be-freed objects instead of allocating new objects from the slab. This lowered the slab activity significantly in a heavy load networking test where lots of timers are created/destroyed. The workqueue based delayed free allows us to simply put the to-be-freed objects back into the object pool and reuse them right away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
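A minimal sketch of the deferred-free path of this era (names per lib/debugobjects.c; not the verbatim patch):

	static void free_object(struct debug_obj *obj)
	{
		unsigned long flags;
		int sched = 0;

		spin_lock_irqsave(&pool_lock, flags);
		/*
		 * Schedule the worker only when the pool is over its limit
		 * and the kmem cache is initialized, so kfree() is never
		 * called recursively from this context.
		 */
		if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
			sched = !work_pending(&debug_obj_work);
		hlist_add_head(&obj->node, &obj_pool);
		obj_pool_free++;
		obj_pool_used--;
		spin_unlock_irqrestore(&pool_lock, flags);
		if (sched)
			schedule_work(&debug_obj_work);
	}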
2009-03-17debugobjects: replace static objects when slab cache becomes availableThomas Gleixner1-3/+63
Impact: refactor/consolidate object management, prepare for delayed free debugobjects allocates static reference objects to track objects which are initialized or activated before the slab cache becomes available. These static reference objects have to be handled separately in free_object(). The handling of these objects is in the way of implementing a delayed free functionality. The delayed free is required to avoid callbacks into the mm code from debug_check_no_obj_freed(). Replace the static object references with dynamic ones after the slab cache has been initialized. The static objects are now marked initdata. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <200903162049.58058.nickpiggin@yahoo.com.au>
2009-03-02debug_objects: add boot-parameter toggle to turn object debugging off againKyle McMartin1-0/+8
While trying to debug why my Atom netbook is falling over booting rawhide debug-enabled kernels, I stumbled across the fact that we've been enabling object debugging by default. However, once you default it to on, you've got no way to turn it back off again at runtime. Add a boolean toggle to turn it off. I would just make it an int module_param, however people may already expect the boolean enable behaviour, so just add an analogue for disabling. Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
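The toggle is a classic early_param; a sketch matching the described behaviour:

	static int __init disable_object_debug(char *str)
	{
		debug_objects_enabled = 0;
		return 0;
	}
	early_param("no_debug_objects", disable_object_debug);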
2008-11-26debugobjects: add boot parameter default valueIngo Molnar1-1/+3
Impact: add .config driven boot parameter default value Right now debugobjects can only be activated if the debug_objects boot parameter is passed in via the boot command line. Make this more convenient (and randomizable) by also providing a .config method. Enable it by default. (DEBUG_OBJECTS itself is default-off) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-01debugobjects: fix lockdep warningVegard Nossum1-8/+23
Daniel J. Blueman reported: > ======================================================= > [ INFO: possible circular locking dependency detected ] > 2.6.27-rc4-224c #1 > ------------------------------------------------------- > hald/4680 is trying to acquire lock: > (&n->list_lock){++..}, at: [<ffffffff802bfa26>] add_partial+0x26/0x80 > > but task is already holding lock: > (&obj_hash[i].lock){++..}, at: [<ffffffff8041cfdc>] > debug_object_free+0x5c/0x120 We fix it by moving the actual freeing to outside the lock (the lock now only protects the list). The pool lock is also promoted to irq-safe (suggested by Dan). It's necessary because free_pool is now called outside the irq disabled region. So we need to protect against an interrupt handler which calls debug_object_init(). [tglx@linutronix.de: added hlist_move_list helper to avoid looping through the list twice] Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-26Use WARN() in lib/Arjan van de Ven1-10/+5
Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. In addition, one of the if() clauses collapses into the WARN() entirely now. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24add a helper function to test if an object is on the stackFUJITA Tomonori1-3/+1
lib/debugobjects.c has a function to test if an object is on the stack. The block layer and ide need it (they need to avoid DMA from/to stack buffers). This patch moves the function to include/linux/sched.h so that everyone can use it. lib/debugobjects.c uses current->stack but this patch uses the task_stack_page() accessor, which is a preferable way to access the stack. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
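The moved helper, as a sketch (task_stack_page() and THREAD_SIZE are the standard accessors):

	static inline int object_is_on_stack(void *obj)
	{
		void *stack = task_stack_page(current);

		return (obj >= stack) && (obj < (stack + THREAD_SIZE));
	}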
2008-06-18debugobjects: fix lockdep warningVegard Nossum1-9/+6
Daniel J Blueman reported: | ======================================================= | [ INFO: possible circular locking dependency detected ] | 2.6.26-rc5-201c #1 | ------------------------------------------------------- | nscd/3669 is trying to acquire lock: | (&n->list_lock){.+..}, at: [<ffffffff802bab03>] deactivate_slab+0x173/0x1e0 | | but task is already holding lock: | (&obj_hash[i].lock){++..}, at: [<ffffffff803fa56f>] | __debug_object_init+0x2f/0x350 | | which lock already depends on the new lock. There are two locks involved here; the first is a SLUB-local lock, and the second is a debugobjects-local lock. They are basically taken in two different orders: 1. SLUB { debugobjects { ... } } 2. debugobjects { SLUB { ... } } This patch changes pattern #2 by trying to fill the memory pool (e.g. the call into SLUB/kmalloc()) outside the debugobjects lock, so now the two patterns look like this: 1. SLUB { debugobjects { ... } } 2. SLUB { } debugobjects { ... } [ daniel.blueman@gmail.com: pool_lock needs to be taken irq safe in fill_pool ] Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30infrastructure to debug (dynamic) objectsThomas Gleixner1-0/+890
We can see an ever repeating problem pattern with objects of any kind in the kernel: 1) freeing of active objects 2) reinitialization of active objects Both problems can be hard to debug because the crash happens at a point where we have no chance to decode the root cause anymore. One problem spot is kernel timers, where the detection of the problem often happens in interrupt context and usually causes the machine to panic. While working on a timer related bug report I had to hack specialized code into the timer subsystem to get a reasonable hint for the root cause. This debug hack was fine for temporary use, but far from a mergeable solution due to the intrusiveness into the timer code. The code further lacked the ability to detect and report the root cause instantly and keep the system operational. Keeping the system operational is important to get hold of the debug information without special debugging aids like serial consoles and special knowledge of the bug reporter. The problems described above are not restricted to timers, but timers tend to expose it, usually in a full system crash. Other objects are less explosive, but the symptoms caused by such mistakes can be even harder to debug. Instead of creating specialized debugging code for the timer subsystem a generic infrastructure is created which allows developers to verify their code and provides an easy to enable debug facility for users in case of trouble. The debugobjects core code keeps track of operations on static and dynamic objects by inserting them into a hashed list and sanity checking them on object operations and provides additional checks whenever kernel memory is freed. The tracked object operations are: - initializing an object - adding an object to a subsystem list - deleting an object from a subsystem list Each operation is sanity checked before the operation is executed and the subsystem specific code can provide a fixup function which allows the subsystem to prevent the damage of the operation. When a sanity check triggers, a warning message and a stack trace are printed. The list of operations can be extended if the need arises. For now it's limited to the requirements of the first user (timers). The core code enqueues the objects into hash buckets. The hash index is generated from the address of the object to simplify the lookup for the check on kfree/vfree. Each bucket has its own spinlock to avoid contention on a global lock. The debug code can be compiled in without being active. The runtime overhead is minimal and could be optimized by asm alternatives. A kernel command line option enables the debugging code. Thanks to Ingo Molnar for review, suggestions and cleanup patches. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Greg KH <greg@kroah.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
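The subsystem hook described above is a descriptor of callbacks; the original int-returning form (later converted to bool, see the 2016-05-19 entries above) looked roughly like:

	struct debug_obj_descr {
		const char *name;

		/*
		 * Called from the debug core when a problem is detected;
		 * return non-zero if the fixup was successful.
		 */
		int (*fixup_init)(void *addr, enum debug_obj_state state);
		int (*fixup_activate)(void *addr, enum debug_obj_state state);
		int (*fixup_destroy)(void *addr, enum debug_obj_state state);
		int (*fixup_free)(void *addr, enum debug_obj_state state);
	};

A subsystem then brackets its object lifetime with calls such as debug_object_init(obj, &descr) and debug_object_activate(obj, &descr).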