linux-nvgpu/drivers/gpu/nvgpu/os/linux
Kishan Palankar 2eabcdb8a4 gpu: nvgpu: Guard profiler_objects list operations with a lock
Both the profiler and debugger device nodes access and update the list
g->profiler_objects. List operations were not guarded by a lock, which
led to synchronisation issues. The stress-ng test repeatedly opens and
closes sessions at random on all the device nodes exposed by the GPU.
This results in a kernel panic at random stages of the test.

Failure signature: the profiler node receives a release call and, as
part of it, nvgpu_profiler_free attempts to delete the prof_obj_entry
and free the prof memory. Simultaneously, the debugger node also
receives a release call and, as part of gk20a_dbg_gpu_dev_release,
nvgpu attempts to access g->profiler_objects to check for any profiling
sessions associated with the debugger node. The two paths race on the
list, which results in a kernel panic at address 0x8 because nvgpu tries
to access prof_obj->session_id, which is at offset 0x8.
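
One plausible reading of the 0x8 fault address is that the racing path
ended up with a NULL prof_obj base and then read the field at offset
0x8. The sketch below only illustrates that arithmetic. The struct
layout and all names in it are hypothetical; only the 0x8 offset of
session_id comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real nvgpu structure. The only property
 * taken from the commit description is that session_id sits at offset 0x8.
 */
struct sketch_prof_object {
	uint64_t placeholder;   /* assumed 8 bytes of preceding fields */
	uint32_t session_id;    /* lands at offset 0x8 */
};

/*
 * Address that a read of ->session_id would touch for a given base
 * pointer value. With a NULL (0) base this is just the field offset.
 */
static uintptr_t sketch_session_id_addr(uintptr_t base)
{
	return base + offsetof(struct sketch_prof_object, session_id);
}
```

With a base of 0, sketch_session_id_addr() returns 0x8, which matches
the address reported in the panic.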

As part of this change, g->profiler_objects list access/update is
guarded with a mutex lock.
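
The locking pattern can be sketched in user space as follows. This is
not the nvgpu code: it uses pthread mutexes and a hand-rolled singly
linked list instead of the kernel primitives, and every name in it
(sketch_gpu, sketch_prof_add, and so on) is hypothetical. It only shows
the idea that every path that walks or modifies the shared list takes
the same mutex first.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a profiler object. */
struct sketch_prof_obj {
	struct sketch_prof_obj *next;
	uint32_t session_id;
};

/* Hypothetical stand-in for the per-GPU state that owns the list. */
struct sketch_gpu {
	pthread_mutex_t lock;            /* guards head and every node */
	struct sketch_prof_obj *head;
};

static void sketch_gpu_init(struct sketch_gpu *g)
{
	assert(g != NULL);
	pthread_mutex_init(&g->lock, NULL);
	g->head = NULL;
}

/* Profiler open path: link a new object under the lock. */
static int sketch_prof_add(struct sketch_gpu *g, uint32_t session_id)
{
	struct sketch_prof_obj *p = malloc(sizeof(*p));

	if (p == NULL)
		return -1;
	p->session_id = session_id;

	pthread_mutex_lock(&g->lock);
	p->next = g->head;
	g->head = p;
	pthread_mutex_unlock(&g->lock);
	return 0;
}

/* Profiler release path: unlink under the lock, free afterwards. */
static bool sketch_prof_free(struct sketch_gpu *g, uint32_t session_id)
{
	struct sketch_prof_obj **pp;
	struct sketch_prof_obj *victim = NULL;
	bool found;

	pthread_mutex_lock(&g->lock);
	for (pp = &g->head; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->session_id == session_id) {
			victim = *pp;
			*pp = victim->next;   /* no walker can reach it now */
			break;
		}
	}
	pthread_mutex_unlock(&g->lock);

	found = (victim != NULL);
	free(victim);                     /* free(NULL) is a no-op */
	return found;
}

/* Debugger release path: walk the list only while holding the lock. */
static bool sketch_dbg_has_session(struct sketch_gpu *g, uint32_t session_id)
{
	struct sketch_prof_obj *p;
	bool found = false;

	pthread_mutex_lock(&g->lock);
	for (p = g->head; p != NULL; p = p->next) {
		if (p->session_id == session_id) {
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&g->lock);
	return found;
}
```

Because the release path unlinks the object before dropping the lock,
a concurrent walker either sees the object while it is still valid or
does not see it at all.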

Bug 4858627

Change-Id: I1e2cf8d27d195bbc9c012cf511029de9eaadb038
Signed-off-by: Kishan Palankar <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3239897
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
2024-11-07 08:53:58 -08:00