gpu: nvgpu: engine preempt timeout in safety

Preempt TSG occurs in non-mission mode, when unbinding channel
from TSG, or aborting TSG. Should a preempt not complete on
engine, we expect other HW safety mechanisms such as FECS
watchdog to detect issues that prevented saving current context.
Add BUG_ON when attempting to recover from preempt timeout,
to make sure we got such error, and sw_quiesce has been
requested.

Jira NVGPU-4230

Change-Id: Ia26a61e703f74eb28d29e72e75664ca4ec97a586
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265082
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This commit is contained in:
Thomas Fleury
2019-12-18 17:22:37 -05:00
committed by Alex Waterman
parent 07a0fe707f
commit 6b62e0f79a

View File

@@ -168,7 +168,11 @@ void nvgpu_rc_preempt_timeout(struct gk20a *g, struct nvgpu_tsg *tsg)
nvgpu_tsg_set_error_notifier(g, tsg,
NVGPU_ERR_NOTIFIER_FIFO_ERROR_IDLE_TIMEOUT);
#ifdef CONFIG_NVGPU_RECOVERY
nvgpu_rc_tsg_and_related_engines(g, tsg, true, RC_TYPE_PREEMPT_TIMEOUT);
#else
BUG_ON(!g->sw_quiesce_pending);
#endif
}
void nvgpu_rc_gr_fault(struct gk20a *g, struct nvgpu_tsg *tsg,