For safety build, nvgpu driver should enter SW quiesce state
in case an uncorrectable error has occurred. In this state, any
activity on the GPU should be prevented, without powering off the GPU.
Also, a minimal set of operations should be used to enter SW quiesce
state.
Entering SW quiesce state does the following:
- set sw_quiesce_pending: when this flag is set, interrupt
handlers exit after masking interrupts. This should help mitigate
an interrupt storm.
- wake up thread to complete quiescing.
The thread performs the following:
- set NVGPU_DRIVER_IS_DYING to prevent allocation of new resources
- disable interrupts
- disable fifo scheduling
- preempt all runlists
- set error notifier for all active channels
Note: for channels with usermode submit enabled, userspace can
still ring doorbell, but this will not trigger any work on
engines since fifo scheduling is disabled.
Jira NVGPU-3493
Change-Id: I639a32da754d8833f54dcec1fa23135721d8d89a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2172391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
1) Update header path of gk20a.h files present in os/
to <nvgpu/gk20a.h>
2) os_fence_android_sema.c indirectly was dependent on gk20a.h via
semaphore.h. So, added #include <nvgpu/gk20a.h> in
os_fence_android_sema.c and replaced the header with forward
declaration of struct gk20a in semaphore.h
Jira NVGPU-597
Change-Id: I96e23befeb80713f3a399071eb5498f6f580211d
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1842868
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move implementation of MC HAL to common/mc. Also bump gk20a
implementation to gm20b.
gk20a_mc_boot_0 was used via a HAL, but we have only one possible
implementation. It also has to be anyway called directly to detect
which HALs to assign, so make it a true common function.
mc_gk20a_handle_intr_nonstall was also used only in os/linux/intr.c
so move it there.
JIRA NVGPU-954
Change-Id: I79aedc9158f90d578db0edc17b714617b52690ac
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1813519
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>