Files
linux-nvgpu/drivers/gpu/nvgpu/gk20a
Deepak Nibade 4252e00aa6 gpu: nvgpu: fix crash due to accessing incorrect TSG pointer
In gk20a_gr_isr(), we handle various errors including GPC/TPC errors.
And then if BPT errors are pending we call gk20a_gr_post_bpt_events() at the
end and pass channel pointer to it

gk20a_gr_post_bpt_events() extracts TSG pointer based on ch->tsgid

But in some race conditions it is possible that we clear the error and trigger
recovery and as a result channel is unbounded from TSG and closed by user space
before calling gk20a_gr_post_bpt_events()

And in that case the code above results in getting incorrect TSG pointer and
hence crashes as below

Unable to handle kernel paging request at virtual address ffffff8012000c08
...
[<ffffff8008081f84>] el1_da+0x24/0xb4
[<ffffff80086e72e0>] gk20a_tsg_get_event_data_from_id+0x30/0xb0
[<ffffff80086e7560>] gk20a_tsg_event_id_post_event+0x50/0xc8
[<ffffff800872922c>] gk20a_gr_isr+0x27c/0x12e0

To fix this extract the TSG pointer before handling all the errors and pass
this pointer to gk20a_gr_post_bpt_events() will post the events if they are
enabled and if TSG is still open

Bug 200404720

Change-Id: I4861c72e338a2cec96f31cb9488af665c5f2be39
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735415
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-06-14 06:44:06 -07:00
..
2018-04-23 12:12:52 -07:00
2018-05-25 10:15:40 -07:00
2018-06-14 06:44:06 -07:00
2018-05-30 11:56:42 -07:00
2018-05-25 10:15:40 -07:00