gpu: nvgpu: ga10b: Correct VAB implementation

This patch makes the following improvements to VAB:
1) Avoid an infinite loop when collecting VAB information.
   Previously, nvgpu incorrectly assumed that the valid bit for the
   checker would eventually be set while polling. It may never be set
   if a VAB-related fault has occurred.
2) Handle the VAB_ERROR MMU fault, which can have various causes:
   invalid VAB buffer address, tracking in protected mode, etc. The
   recovery sequence sets the VAB buffer size to 0 and then back to
   the original size, which clears the VAB_ERROR bit. After the
   reset, the recovery code restores the old register values.
3) Use the correct number of VAB buffers. There is only one VAB buffer
   on ga10b, not two.
4) Simplify the logic.
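
Items 1) and 2) above can be illustrated with a small, self-contained sketch.
It is not the nvgpu code: struct fake_vab, its fields, and every function name
below are hypothetical stand-ins for the hardware registers, and the assumption
that resetting the size also drops the tracking configuration exists only to
make the restore step visible. The sketch shows a poll bounded by a retry
count, so a fault cannot hang the caller, and a recovery that toggles the
buffer size through 0 and then restores the saved values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of VAB state; not the real ga10b register layout. */
struct fake_vab {
	uint32_t buffer_size;       /* models the VAB buffer size field */
	uint32_t range_cfg;         /* models a tracking configuration register */
	bool error;                 /* models the VAB_ERROR bit */
	unsigned polls_until_valid; /* polls needed before the valid bit is seen */
};

/* Assumption: writing size 0 clears the error and resets the configuration. */
static void fake_vab_set_size(struct fake_vab *v, uint32_t size)
{
	if (size == 0U) {
		v->error = false;
		v->range_cfg = 0U;
	}
	v->buffer_size = size;
}

/* One poll of the valid bit. After a fault the bit is never reported as set. */
static bool fake_vab_poll_valid(struct fake_vab *v)
{
	if (v->error) {
		return false;
	}
	if (v->polls_until_valid > 0U) {
		v->polls_until_valid--;
		return false;
	}
	return true;
}

/* Bounded poll: returns 0 once the valid bit is seen, -1 on timeout. */
static int vab_wait_valid(struct fake_vab *v, unsigned max_polls)
{
	for (unsigned i = 0U; i < max_polls; i++) {
		if (fake_vab_poll_valid(v)) {
			return 0;
		}
	}
	return -1;
}

/* Recovery: size to 0, back to the original size, then restore saved state. */
static void vab_recover(struct fake_vab *v)
{
	uint32_t saved_size = v->buffer_size;
	uint32_t saved_cfg = v->range_cfg;

	fake_vab_set_size(v, 0U);
	fake_vab_set_size(v, saved_size);
	v->range_cfg = saved_cfg;
}
```

In this model, a caller that gets -1 from vab_wait_valid() would run
vab_recover() and retry, instead of spinning forever on a bit that a fault
prevents from being set.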

Bug 3374805
Bug 3465734
Bug 3473147

Change-Id: I716f460ef37cb848ddc56a64c6f83024c4bb9811
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621290
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Martin Radev
2021-10-27 16:12:12 +03:00
committed by mobile promotions
parent b2db6a8453
commit b67a3cd053
13 changed files with 268 additions and 155 deletions

@@ -776,6 +776,7 @@ static const struct gops_gr ga10b_ops_gr = {
.gr_suspend = nvgpu_gr_suspend,
#ifdef CONFIG_NVGPU_HAL_NON_FUSA
.vab_init = ga10b_gr_vab_init,
.vab_recover = ga10b_gr_vab_recover,
.vab_release = ga10b_gr_vab_release,
#endif
#ifdef CONFIG_NVGPU_DEBUGGER
@@ -878,6 +879,7 @@ static const struct gops_fb_vab ga10b_ops_fb_vab = {
.dump_and_clear = ga10b_fb_vab_dump_and_clear,
.release = ga10b_fb_vab_release,
.teardown = ga10b_fb_vab_teardown,
.recover = ga10b_fb_vab_recover,
};
#endif