gpu: nvgpu: hold power ref for deterministic channels

To support deterministic channels even with platforms where railgating is supported, have each deterministic-marked channel hold a power reference during their lifetime, and skip taking power refs for jobs in submit path for those. Previously, railgating blocked deterministic submits in general because of gk20a_busy()/gk20a_idle() calls in submit path possibly taking time and more significantly because the gpu may need turning on which takes a nondeterministic and long amount of time.

As an exception, gk20a_do_idle() can still block deterministic submits until gk20a_do_unidle() is called. Add a rwsem to guard this. VPR resize needs do_idle, which conflicts with deterministic channels' requirement to keep the GPU on. This is documented in the ioctl header now.

Make NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_NO_JOBTRACKING always set in the gpu characteristics now that it's supported. The only thing left now blocking NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_FULL is the sync framework.

Make the channel debug dump show which channels are deterministic.

Bug 200291300
Jira NVGPU-70

Change-Id: I47b6f3a8517cd6e4255f6ca2855e3dd912e4f5f3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1483038
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2025-12-22 17:36:20 +03:00 · 2017-05-16 13:47:58 +03:00
parent 3c3c39dfe0
commit 7680fd689e
9 changed files with 189 additions and 29 deletions
--- a/drivers/gpu/nvgpu/common/linux/module.c
+++ b/drivers/gpu/nvgpu/common/linux/module.c
@@ -298,6 +298,12 @@ int __gk20a_do_idle(struct device *dev, bool force_reset)
 	bool is_railgated;
 	int err = 0;

+	/*
+	 * Hold back deterministic submits and changes to deterministic
+	 * channels - this must be outside the power busy locks.
+	 */
+	gk20a_channel_deterministic_idle(g);
+
 	/* acquire busy lock to block other busy() calls */
 	down_write(&g->busy_lock);

@@ -403,6 +409,7 @@ fail_drop_usage_count:
 fail_timeout:
 	nvgpu_mutex_release(&platform->railgate_lock);
 	up_write(&g->busy_lock);
+	gk20a_channel_deterministic_unidle(g);
 	return -EBUSY;
 }

@@ -456,6 +463,8 @@ int __gk20a_do_unidle(struct device *dev)
 	nvgpu_mutex_release(&platform->railgate_lock);
 	up_write(&g->busy_lock);

+	gk20a_channel_deterministic_unidle(g);
+
 	return 0;
 }