gpu: nvgpu: Clear nvlink error persistent state

Error logging bits within nvlink blocks such as TLC and MIF persist
through reset so that they can be polled after a reset event. As a
result, they are in an unknown state at cold reset and may contain
stale error state after a warm reset. Software is expected to clear
them, either by writing ones to the status bits or by writing to the
DEBUG_RESET register at the IOCTRL top level, before enabling error
reporting.

JIRA NVGPU-4352

Change-Id: Iab4e96388fd827c0d694eada61b20f24bbddd1ff
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2317683
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
tkudav
2020-03-24 17:38:48 +05:30
committed by Alex Waterman
parent 5af8cedf05
commit 3856381b43

@@ -162,13 +162,16 @@ static int gv100_nvlink_enable_links_pre_top(struct gk20a *g,
 	IOCTRL_REG_WR32(g, ioctrl_reset_r(), reg);
 	nvgpu_udelay(delay);
+	/* Clear warm reset persistent state */
 	reg = IOCTRL_REG_RD32(g, ioctrl_debug_reset_r());
-	reg &= ~ioctrl_debug_reset_link_f(BIT32(link_id));
+	reg &= ~(ioctrl_debug_reset_link_f(1U) |
+			ioctrl_debug_reset_common_f(1U));
 	IOCTRL_REG_WR32(g, ioctrl_debug_reset_r(), reg);
 	nvgpu_udelay(delay);
-	reg |= ioctrl_debug_reset_link_f(BIT32(link_id));
+	reg |= (ioctrl_debug_reset_link_f(1U) |
+			ioctrl_debug_reset_common_f(1U));
 	IOCTRL_REG_WR32(g, ioctrl_debug_reset_r(), reg);
 	nvgpu_udelay(delay);
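
The read-modify-write sequence in this hunk can be exercised outside the driver with a small mock. The sketch below is illustrative only and makes several assumptions that the change does not confirm: the bit positions are invented, the reset fields are treated as active-low (clearing a bit holds the corresponding logic in reset), and the `mock_ioctrl` structure with its two error words is a hypothetical stand-in for the TLC/MIF error log state. It is not the nvgpu implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions; the real layout lives in the generated
 * ioctrl hardware headers and is not shown in this change. */
#define DEBUG_RESET_LINK_BIT   (1U << 0)
#define DEBUG_RESET_COMMON_BIT (1U << 31)

/* Mock IOCTRL block: one DEBUG_RESET register plus two words of error
 * log state that are assumed to survive an ordinary reset. */
struct mock_ioctrl {
	uint32_t debug_reset; /* reset controls, assumed active-low */
	uint32_t link_err;    /* stand-in for per-link error log bits */
	uint32_t common_err;  /* stand-in for common-logic error log bits */
};

/* Model a DEBUG_RESET write: a field written as 0 clears the persistent
 * state it is assumed to control. */
static void mock_debug_reset_write(struct mock_ioctrl *io, uint32_t val)
{
	assert(io != NULL);
	io->debug_reset = val;
	if ((val & DEBUG_RESET_LINK_BIT) == 0U) {
		io->link_err = 0U;
	}
	if ((val & DEBUG_RESET_COMMON_BIT) == 0U) {
		io->common_err = 0U;
	}
}

/* Same shape as the sequence in the hunk: clear both fields, write,
 * then set both fields and write again. */
static void clear_persistent_state(struct mock_ioctrl *io)
{
	const uint32_t mask = DEBUG_RESET_LINK_BIT | DEBUG_RESET_COMMON_BIT;
	uint32_t reg = io->debug_reset;

	reg &= ~mask;                    /* hold link and common in reset */
	mock_debug_reset_write(io, reg);
	/* the driver waits here with nvgpu_udelay(delay) */
	reg |= mask;                     /* release both resets */
	mock_debug_reset_write(io, reg);
}
```

In this model, toggling only the link field would leave `common_err` untouched, which is one plausible reading of why the change adds `ioctrl_debug_reset_common_f(1U)` to both masks.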