mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git
synced 2025-12-24 02:22:34 +03:00
gpu: nvgpu: Guard profiler_objects list operations with a lock
Both the profiler and debugger device nodes access and update the list
g->profiler_objects. List operations were not guarded by a lock,
which led to synchronisation issues. The stress-ng test repeatedly
opens and closes sessions at random on all the device nodes
exposed by the GPU. This results in a kernel panic at random stages of the test.
Failure signature: the profiler node receives a release call and, as part
of it, nvgpu_profiler_free attempts to delete the prof_obj_entry and
free the prof memory. Simultaneously, the debugger node also receives a
release call and, as part of gk20a_dbg_gpu_dev_release, nvgpu attempts
to access g->profiler_objects to check for any profiling sessions
associated with the debugger node. The two paths race on the list, which
results in a kernel panic at address 0x8 because nvgpu tries to access
prof_obj->session_id, which is at offset 0x8.
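
One reading of the failure signature above is that the object pointer itself was NULL (or zeroed) when session_id was read, so the faulting address equals the member offset. The following is a minimal user-space sketch of that arithmetic. The struct layout and the names prof_obj_sketch, session_id_fault_addr and demo_null_case are assumptions made for illustration; only the member name session_id and the offset 0x8 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: only session_id is named in the commit message. */
struct prof_obj_sketch {
	void *owner;         /* placeholder for whatever occupies the first bytes */
	uint32_t session_id; /* lands at offset 0x8 when pointers are 8 bytes */
};

/* Address a read of obj->session_id would touch, computed without dereferencing. */
uintptr_t session_id_fault_addr(const struct prof_obj_sketch *obj)
{
	return (uintptr_t)obj + offsetof(struct prof_obj_sketch, session_id);
}

/* With a NULL object, the address collapses to the member offset. */
void demo_null_case(void)
{
	uintptr_t addr = session_id_fault_addr(NULL);

	assert(addr == offsetof(struct prof_obj_sketch, session_id));
	printf("fault address for NULL object: 0x%lx\n", (unsigned long)addr);
}
```

On a typical 64-bit build, demo_null_case reports 0x8, matching the address in the panic.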
As part of this change, access to and updates of the g->profiler_objects
list are guarded by a mutex, g->prof_obj_lock.
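
The locking pattern described above can be sketched in user space as follows. This is not the nvgpu code: it uses a pthread mutex in place of nvgpu_mutex, a hand-rolled singly linked list in place of the kernel list helpers, and the names gpu_sketch, prof_node, prof_add, prof_free and prof_session_exists are all hypothetical. It only illustrates that the free path and the lookup path take the same lock around every list operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a profiler object; not the nvgpu definition. */
struct prof_node {
	struct prof_node *next;
	uint32_t session_id;
};

/* Hypothetical stand-in for the relevant part of struct gk20a. */
struct gpu_sketch {
	pthread_mutex_t prof_obj_lock;      /* plays the role of g->prof_obj_lock */
	struct prof_node *profiler_objects; /* plays the role of g->profiler_objects */
};

void gpu_sketch_init(struct gpu_sketch *g)
{
	int rc = pthread_mutex_init(&g->prof_obj_lock, NULL);

	assert(rc == 0);
	(void)rc;
	g->profiler_objects = NULL;
}

/* Add an entry under the lock; returns false on allocation failure. */
bool prof_add(struct gpu_sketch *g, uint32_t session_id)
{
	struct prof_node *n = malloc(sizeof(*n));

	if (n == NULL)
		return false;
	n->session_id = session_id;

	pthread_mutex_lock(&g->prof_obj_lock);
	n->next = g->profiler_objects;
	g->profiler_objects = n;
	pthread_mutex_unlock(&g->prof_obj_lock);
	return true;
}

/* Profiler-release path: unlink and free the entry under the lock. */
bool prof_free(struct gpu_sketch *g, uint32_t session_id)
{
	bool found = false;

	pthread_mutex_lock(&g->prof_obj_lock);
	for (struct prof_node **pp = &g->profiler_objects; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->session_id == session_id) {
			struct prof_node *victim = *pp;

			*pp = victim->next;
			free(victim);
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&g->prof_obj_lock);
	return found;
}

/* Debugger-release path: walk the list under the same lock. */
bool prof_session_exists(struct gpu_sketch *g, uint32_t session_id)
{
	bool found = false;

	pthread_mutex_lock(&g->prof_obj_lock);
	for (struct prof_node *n = g->profiler_objects; n != NULL; n = n->next) {
		if (n->session_id == session_id) {
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&g->prof_obj_lock);
	return found;
}
```

Because prof_session_exists cannot run while prof_free is unlinking and freeing a node, the walker never follows a pointer into memory that the other path has already released.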
Bug 4858627
Change-Id: I1e2cf8d27d195bbc9c012cf511029de9eaadb038
Signed-off-by: Kishan Palankar <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3239897
(cherry picked from commit 2eabcdb8a4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3262771
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Tested-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
committed by Amulya Yarlagadda
parent e2d19ad097
commit 8a0a534570
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2016-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2016-2024, NVIDIA CORPORATION. All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -165,6 +165,9 @@ static void nvgpu_init_vars(struct gk20a *g)
 #ifdef CONFIG_NVGPU_TSG_SHARING
 	nvgpu_mutex_init(&g->ctrl_dev_id_lock);
 #endif
+#ifdef CONFIG_NVGPU_PROFILER
+	nvgpu_mutex_init(&g->prof_obj_lock);
+#endif
 
 	/* Init the clock req count to 0 */
 	nvgpu_atomic_set(&g->clk_arb_global_nr, 0);