gpu: nvgpu: support per-channel wdt timeouts

Replace the padding in nvgpu_channel_wdt_args with a timeout value in
milliseconds, and add NVGPU_IOCTL_CHANNEL_WDT_FLAG_SET_TIMEOUT to
signify the existence of this new field. When the new flag is included
in the value of wdt_status, the field is used to set a per-channel
timeout to override the per-GPU default.

Add NVGPU_IOCTL_CHANNEL_WDT_FLAG_DISABLE_DUMP to disable the long debug
dump when a timed out channel gets recovered by the watchdog. Printing
the dump to serial console takes easily several seconds. (Note that
there is NVGPU_TIMEOUT_FLAG_DISABLE_DUMP about ctxsw timeout separately
for NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX as well.)

The behaviour of NVGPU_IOCTL_CHANNEL_WDT is changed so that either
NVGPU_IOCTL_CHANNEL_ENABLE_WDT or NVGPU_IOCTL_CHANNEL_DISABLE_WDT has to
be set. The old behaviour was that other values were silently ignored.

The usage of the global default debugfs-controlled ch_wdt_timeout_ms is
changed so that its value takes effect only for newly opened channels
instead of in realtime. Also, zero value no longer means that the
watchdog is disabled; there is a separate flag for that after all.

gk20a_fifo_recover_tsg used to ignore the value of "verbose" when no
engines were found. Correct this.

Bug 1982826
Bug 1985845
Jira NVGPU-73

Change-Id: Iea6213a646a66cb7c631ed7d7c91d8c2ba8a92a4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1510898
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This commit is contained in:
Konsta Holtta
2018-02-21 16:42:37 +02:00
committed by mobile promotions
parent 4f9368522e
commit cb6ed949e2
7 changed files with 50 additions and 30 deletions

View File

@@ -319,10 +319,21 @@ static int gk20a_channel_cycle_stats_snapshot(struct channel_gk20a *ch,
static int gk20a_channel_set_wdt_status(struct channel_gk20a *ch,
struct nvgpu_channel_wdt_args *args)
{
if (args->wdt_status == NVGPU_IOCTL_CHANNEL_DISABLE_WDT)
ch->wdt_enabled = false;
else if (args->wdt_status == NVGPU_IOCTL_CHANNEL_ENABLE_WDT)
ch->wdt_enabled = true;
u32 status = args->wdt_status & (NVGPU_IOCTL_CHANNEL_DISABLE_WDT |
NVGPU_IOCTL_CHANNEL_ENABLE_WDT);
if (status == NVGPU_IOCTL_CHANNEL_DISABLE_WDT)
ch->timeout.enabled = false;
else if (status == NVGPU_IOCTL_CHANNEL_ENABLE_WDT)
ch->timeout.enabled = true;
else
return -EINVAL;
if (args->wdt_status & NVGPU_IOCTL_CHANNEL_WDT_FLAG_SET_TIMEOUT)
ch->timeout.limit_ms = args->timeout_ms;
ch->timeout.debug_dump = (args->wdt_status &
NVGPU_IOCTL_CHANNEL_WDT_FLAG_DISABLE_DUMP) == 0;
return 0;
}