gpu: nvgpu: correct usage for gk20a_busy_noresume

Background: In the deferred suspend implemented by gk20a_idle(), the
device waits for a delay before suspending and invoking the
power-gating callbacks. This helps minimize resume latency for any
resume calls (gk20a_busy()) that arrive before the delay expires.

Several APIs spread across the driver require that they proceed with
register writes only if the device is powered on, and return early if
it is powered off. Examples of such APIs include l2_flush, fb_flush
and even nvs_thread. We have relied on hacks to keep the device
powered on and prevent such a delayed suspend from proceeding.
However, this still raced for some calls such as the l2_flush ioctl,
so gk20a_busy() was added there (refer to commit Id
dd341e7ecbaf65843cb8059f9d57a8be58952f63).
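
For illustration only, the racy check-then-write pattern described
above looks roughly like this (example_l2_flush() and the MMIO helper
are made-up names, not the actual nvgpu functions):

  static int example_l2_flush(struct gk20a *g)
  {
  	/* The power-state check and the register writes are not atomic. */
  	if (nvgpu_is_powered_off(g))
  		return 0;	/* device already off, nothing to flush */

  	/*
  	 * A deferred suspend can complete right here, after the check
  	 * but before the MMIO writes below -- hence the race.
  	 */
  	example_l2_flush_mmio_writes(g);	/* hypothetical helper */
  	return 0;
  }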

The upstream Linux kernel has introduced the API
pm_runtime_get_if_active() specifically to handle this corner case of
pinning the runtime PM state while a deferred suspend is pending.

According to the Linux kernel documentation, invoking the API with
the ign_usage_count parameter set to true blocks an incoming suspend,
provided the device has not already suspended.
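
As a sketch of how this is consumed (example_pin_active() is an
illustrative helper; the two-argument form of the API is the one used
by this change):

  #include <linux/pm_runtime.h>

  /*
   * Returns true iff a usage-count reference was taken, i.e. the
   * device was still RPM_ACTIVE and a pending deferred suspend is
   * now blocked.
   */
  static bool example_pin_active(struct device *dev)
  {
  	int ret = pm_runtime_get_if_active(dev, true);

  	/*
  	 * ret != 1 means no reference was taken: the device has
  	 * already suspended, or runtime PM is disabled for it.
  	 */
  	return ret == 1;
  }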

With this, there is no longer a need to check nvgpu_is_powered_off().
Changed the behavior of gk20a_busy_noresume() to return bool: it
returns true iff it managed to prevent an imminent suspend, and false
otherwise. For cases where runtime PM is disabled, the code follows
the existing implementation.

Added missing gk20a_busy_noresume() calls to tlb_invalidate.
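
A caller-side sketch of the new contract (illustrative names; the real
callers are the l2_flush/fb_flush/tlb_invalidate paths above), balanced
with gk20a_idle_nosuspend() as in the existing busy/idle pattern:

  static int example_tlb_invalidate(struct gk20a *g)
  {
  	if (!gk20a_busy_noresume(g)) {
  		/* Device is suspended or suspending: nothing to invalidate. */
  		return 0;
  	}

  	/* Suspend is now blocked; register access is safe. */
  	example_tlb_invalidate_mmio_writes(g);	/* hypothetical helper */

  	gk20a_idle_nosuspend(g);	/* drop the reference taken above */
  	return 0;
  }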

Also, moved gk20a_pm_deinit() to immediately after nvgpu_quiesce() in
the module removal path. This prevents register accesses after the
registers are locked out at the end of nvgpu_quiesce(): some free
functions called post quiesce might still have l2_flush or fb_flush
deep inside their call stack, hence gk20a_pm_deinit() is invoked to
disable runtime PM right after quiesce.

Kept the legacy implementation the same for VGPU and older kernels.

Jira NVGPU-8487

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I972f9afe577b670c44fc09e3177a5ce8a44ca338
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2715654
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit

@@ -168,9 +168,63 @@ struct device_node *nvgpu_get_node(struct gk20a *g)
 	return dev->of_node;
 }
 
-void gk20a_busy_noresume(struct gk20a *g)
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 8, 0)
+static bool gk20a_pm_runtime_get_if_in_use(struct gk20a *g)
 {
-	pm_runtime_get_noresume(dev_from_gk20a(g));
+	int ret = pm_runtime_get_if_active(dev_from_gk20a(g), true);
+
+	if (ret == 1) {
+		return true;
+	} else {
+		return false;
+	}
+}
+#endif
+
+static bool gk20a_busy_noresume_legacy(struct gk20a *g)
+{
+	struct device *dev = dev_from_gk20a(g);
+
+	pm_runtime_get_noresume(dev);
+	if (nvgpu_is_powered_off(g)) {
+		pm_runtime_put_noidle(dev);
+		return false;
+	} else {
+		return true;
+	}
+}
+
+/*
+ * Cases:
+ * 1) For older than Kernel 5.8, use legacy,
+ *    i.e. gk20a_busy_noresume_legacy().
+ * 2) Else if pm_runtime is disabled (e.g. VGPU, DGPU),
+ *    use legacy.
+ * 3) Else use gk20a_pm_runtime_get_if_in_use().
+ */
+bool gk20a_busy_noresume(struct gk20a *g)
+{
+	struct device *dev;
+
+	if (!g)
+		return false;
+
+	dev = dev_from_gk20a(g);
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 8, 0)
+	if (pm_runtime_enabled(dev)) {
+		if (gk20a_pm_runtime_get_if_in_use(g)) {
+			atomic_inc(&g->usage_count.atomic_var);
+			return true;
+		} else {
+			return false;
+		}
+	} else {
+		/* VGPU, DGPU */
+		return gk20a_busy_noresume_legacy(g);
+	}
+#else
+	return gk20a_busy_noresume_legacy(g);
+#endif
 }
 
 int gk20a_busy(struct gk20a *g)
int gk20a_busy(struct gk20a *g)
@@ -222,7 +276,17 @@ fail:
 void gk20a_idle_nosuspend(struct gk20a *g)
 {
-	pm_runtime_put_noidle(dev_from_gk20a(g));
+	struct device *dev = dev_from_gk20a(g);
+
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(5, 8, 0)
+	if (pm_runtime_enabled(dev)) {
+		gk20a_idle(g);
+	} else {
+		pm_runtime_put_noidle(dev);
+	}
+#else
+	pm_runtime_put_noidle(dev);
+#endif
 }
 
 void gk20a_idle(struct gk20a *g)
@@ -1941,6 +2005,14 @@ int nvgpu_remove(struct device *dev)
 	err = nvgpu_quiesce(g);
 	WARN(err, "gpu failed to idle during driver removal");
 
+	/*
+	 * nvgpu_quiesce() has already been invoked, so disable runtime PM.
+	 * This informs the PM domain that it is safe to power down the h/w
+	 * now. Anything after this is just software deinit; any cache/TLB
+	 * flush/invalidate must have already happened before this point.
+	 */
+	gk20a_pm_deinit(dev);
+
 	if (nvgpu_mem_is_valid(&g->syncpt_mem))
 		nvgpu_dma_free(g, &g->syncpt_mem);
@@ -2001,8 +2073,6 @@ static int __exit gk20a_remove(struct platform_device *pdev)
 	nvgpu_put(g);
-	gk20a_pm_deinit(dev);
 	return err;
 }