gpu: nvgpu: retry tsg unbind if NEXT is set

The NEXT bit can remain set for a channel if its timeslice expires
before the scheduler clears it. When that happens, nvgpu fails the TSG
unbind and nvrm_gpu in turn fails the channel close. Checking the
channel hw state again after some time gives the scheduler a chance to
clear the NEXT bit.

Re-enable the TSG and return -EAGAIN to nvrm_gpu so that it can retry.
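
The retry contract described above can be sketched as follows. This is a
minimal, self-contained illustration and not nvgpu or nvrm_gpu code: every
name here (fake_channel, fake_next_is_set, fake_tsg_unbind,
fake_unbind_with_retry) is hypothetical, and the NEXT bit is simulated with
a counter rather than read from hardware.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for channel state; not nvgpu's real structure. */
struct fake_channel {
	int polls_until_next_clears; /* simulated delay before NEXT clears */
	bool bound;                  /* true while still bound to the TSG */
};

/* Simulated read of the NEXT bit: reports set until the counter runs out. */
static bool fake_next_is_set(struct fake_channel *ch)
{
	if (ch->polls_until_next_clears > 0) {
		ch->polls_until_next_clears--;
		return true;
	}
	return false;
}

/*
 * Driver side (assumed shape): refuse to unbind while NEXT is set and ask
 * the caller to try again. Per the commit message, the real driver also
 * re-enables the TSG before returning -EAGAIN.
 */
static int fake_tsg_unbind(struct fake_channel *ch)
{
	if (fake_next_is_set(ch)) {
		return -EAGAIN;
	}
	ch->bound = false;
	return 0;
}

/* Caller side (assumed shape): bounded retry while the driver says -EAGAIN. */
static int fake_unbind_with_retry(struct fake_channel *ch, int max_attempts)
{
	int err = -EAGAIN;
	int attempt;

	for (attempt = 0; attempt < max_attempts && err == -EAGAIN; attempt++) {
		err = fake_tsg_unbind(ch);
	}
	return err;
}
```

A real caller would likely wait between attempts so the scheduler has time
to clear the bit; the bound on attempts keeps a stuck NEXT bit from turning
into an endless loop.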

Bug 3144960
Bug 200520811

Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391
(cherry picked from commit cf287a4ef5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2479106
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Sagar Kamble
2021-02-02 22:02:23 +05:30
committed by mobile promotions
parent 9170f2b77c
commit 13fc430775
7 changed files with 23 additions and 15 deletions

@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014-2020, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2014-2021, NVIDIA CORPORATION. All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -134,7 +134,10 @@ static int gk20a_tsg_unbind_channel_fd(struct tsg_gk20a *tsg, int ch_fd)
 		goto out;
 	}
-	err = gk20a_tsg_unbind_channel(ch);
+	err = gk20a_tsg_unbind_channel(ch, false);
+	if (err == -EAGAIN) {
+		goto out;
+	}
 	/*
 	 * Mark the channel timedout since channel unbound from TSG