mirror of
git://nv-tegra.nvidia.com/linux-nvgpu.git
synced 2025-12-23 09:57:08 +03:00
In gk20a_gr_isr(), we handle various errors including GPC/TPC errors. And then if BPT errors are pending we call gk20a_gr_post_bpt_events() at the end and pass channel pointer to it gk20a_gr_post_bpt_events() extracts TSG pointer based on ch->tsgid But in some race conditions it is possible that we clear the error and trigger recovery and as a result channel is unbounded from TSG and closed by user space before calling gk20a_gr_post_bpt_events() And in that case the code above results in getting incorrect TSG pointer and hence crashes as below Unable to handle kernel paging request at virtual address ffffff8012000c08 ... [<ffffff8008081f84>] el1_da+0x24/0xb4 [<ffffff80086e72e0>] gk20a_tsg_get_event_data_from_id+0x30/0xb0 [<ffffff80086e7560>] gk20a_tsg_event_id_post_event+0x50/0xc8 [<ffffff800872922c>] gk20a_gr_isr+0x27c/0x12e0 To fix this extract the TSG pointer before handling all the errors and pass this pointer to gk20a_gr_post_bpt_events() will post the events if they are enabled and if TSG is still open Bug 200404720 Change-Id: I4861c72e338a2cec96f31cb9488af665c5f2be39 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1735415 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
244 KiB
244 KiB