There are three issues with shadow domain submission:
1. The runlist mem is not swapped with mem_hw for the shadow domain when
a non-shadow domain is bound to the TSG, which prevents the runlist from
containing all the TSGs. To fix this, nvgpu_runlist_swap_mem() is now
called for the shadow domain as well.
2. The TSG num_active_channels is set as part of the non-shadow domain,
which is processed after the shadow domain. As a result, the runlist TSG
length is not set during runlist reconstruction, leaving the TSG length
0 for the last TSG. To fix this, num_active_channels is now always set
for the shadow domain, since it is configured first (see the sketch
below).
3. NV_BUILD_CONFIGURATION_VARIANT_IS_EMBEDDED does not serve to
differentiate L4T and embedded_linux builds, so NV_BUILD_SYSTEM_TYPE is
used in its place to determine the build type.
L4T uses round-robin scheduling and this issue occurs with manual mode
scheduling, so the fix is added only for manual mode scheduling support.
Bug 3884011
Change-Id: Ic55da8f75294eb32c8df6e35fb1fa47df78db8f8
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2833880
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The memory leak was due to nvgpu_profiler_unbind_context() calling
nvgpu_profiler_pm_resource_release() for all resources, which clears the
flag required by nvgpu_profiler_free_pma_stream() to release the memory
for the perf_buf instance block.
Fix this by splitting nvgpu_profiler_unbind_context() so that all the
PM resources are released separately at a later time.
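A minimal sketch of the split; the name of the new release helper is
illustrative, only the functions above come from this change:

    /* Unbind no longer drops PM resources, so the flag consulted by
     * nvgpu_profiler_free_pma_stream() survives until the perf_buf
     * instance block is freed. */
    void nvgpu_profiler_unbind_context(struct nvgpu_profiler_object *prof)
    {
        /* detach from context only; PM resources stay held */
    }

    void nvgpu_profiler_release_pm_resources(struct nvgpu_profiler_object *prof)
    {
        /* called later in the teardown path, after
         * nvgpu_profiler_free_pma_stream() has run */
    }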
Bug 3510455
Change-Id: Ibab8d071693e600c46f7e7f16575e36e6f62af3c
Signed-off-by: atanand <atanand@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2825013
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Use CONFIG_KMD_SCHEDULING_WORKER_THREAD instead of
CONFIG_NVS_KMD_BACKEND to remove confusion about the CPU based
KMD scheduling worker thread.
The KMD based scheduling worker thread caters to both the Manual Mode
CPU based scheduler and the Automatic Round Robin CPU based
scheduler.
For the traditional submit path, add correct handling of
CONFIG_NVS_PRESENT. The CPU based worker thread should be part of
CONFIG_NVS_PRESENT. Eventually, when CONFIG_KMD_SCHEDULING_WORKER_THREAD
is removed, the application must switch to GSP.
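The resulting guard nesting, sketched under the assumption that the
worker thread config is a sub-option of CONFIG_NVS_PRESENT:

    #ifdef CONFIG_NVS_PRESENT
    #ifdef CONFIG_KMD_SCHEDULING_WORKER_THREAD
        /* CPU based KMD scheduling worker thread: serves both the
         * manual mode and the automatic round-robin CPU schedulers */
    #endif
    #endif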
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0886ef3b2e0124b6fe22c2bf0bf7d1fa98039d00
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2810217
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Get the device id from the runlist PRI base instead of reading it from
the runlist structure, which can fail if the device node is not present
inside the runlist struct.
This change calls the get-device-id HAL to derive it from the rlend_id
and the runlist PRI base.
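The call shape, sketched; the HAL's location in gops and its signature
are assumptions, while rlend_id and the PRI base come from this change:

    /* Before: dev_id was read from the runlist struct and was invalid
     * whenever the device node was absent there. After: */
    u32 dev_id = g->ops.runlist.get_device_id(g, rlend_id,
                                              runlist_pri_base);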
NVGPU-8531
Change-Id: Ia81189a6c2281ed09ee52eb461f0cd87164c5fc4
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2791605
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement the ioctls NVGPU_TSG_IOCTL_CREATE_SUBCONTEXT and
NVGPU_TSG_IOCTL_DELETE_SUBCONTEXT. These will allocate and
free the VEID numbers.
The address space association with the VEIDs is verified to
ensure that the channels' association with VEIDs and address
spaces remains consistent.
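Expected userspace usage, sketched with an assumed argument struct; only
the two ioctl names come from this change:

    struct nvgpu_tsg_create_subcontext_args args = { 0 };
    /* allocates a VEID; the number is returned through the args */
    ioctl(tsg_fd, NVGPU_TSG_IOCTL_CREATE_SUBCONTEXT, &args);
    /* ...bind channels that share this VEID and address space... */
    ioctl(tsg_fd, NVGPU_TSG_IOCTL_DELETE_SUBCONTEXT, &args); /* frees it */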
Bug 3677982
JIRA NVGPU-8681
Change-Id: I2d913baf61a6bdeec412c58270c0024b80ca15c6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766765
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Use CONFIG_NVS_KMD_BACKEND to enclose all NVS KMD based scheduling
code.
The current configuration keeps all the scheduling code within
CONFIG_NVS_PRESENT. Eventually, scheduling code shall use only GSP.
Hence, isolate the KMD based scheduling code under a config,
CONFIG_NVS_KMD_BACKEND. This shall make it easier to remove this code
later.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I9dc668e0fa3e7706c111fda7a5e2415e1fc0dd03
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2769465
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
If the NEXT bit remains set for a channel being unbound, it can lead to
an MMU fault of type "unbound inst block". When userspace is closing the
channel and the NEXT bit is set, userspace retries.
When force-killing the channel, nvgpu can retry for a few iterations to
ensure the channel is truly idle and unbound. If the channel is
really stuck, the unbind will fail and the TSG will be aborted.
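A sketch of the kernel-side retry, with assumed helper names and retry
budget:

    /* Poll the NEXT bit a bounded number of times before giving up. */
    for (i = 0; i < MAX_UNBIND_RETRIES; i++) {
        if (!channel_next_set(g, ch))
            break;            /* idle: safe to unbind */
        nvgpu_msleep(10);     /* let the engine switch the channel out */
    }
    if (channel_next_set(g, ch)) {
        err = -EAGAIN;        /* really stuck: fail the unbind... */
        nvgpu_tsg_abort(g, tsg, true);  /* ...and abort the TSG */
    }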
Bug 3800844
Change-Id: I8fb024630ff2dd272245ae27116f3db6d6e0f788
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2787533
(cherry picked from commit 99e39f4b387743a93b05ba4b097c33b23fbbcf68)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2786479
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The GR context associated with a channel is updated in various driver
paths. The sequence for doing so is: disable the TSG, preempt the TSG,
update the GR context or instance block, and then enable the TSG.
These operations and the channel's runlist updates have to be done under
the TSG-specific ctx_init_lock to avoid races.
suspend_contexts and resume_contexts need special handling, which is
not covered in this patch.
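The serialized sequence, sketched with helper names that approximate
the driver's; only ctx_init_lock is named by this change:

    nvgpu_mutex_acquire(&tsg->ctx_init_lock);
    g->ops.tsg.disable(tsg);        /* stop new work on the TSG  */
    nvgpu_preempt_tsg(g, tsg);      /* drain it off the engine   */
    /* ...update GR context and/or instance block here... */
    g->ops.tsg.enable(tsg);         /* let the TSG run again     */
    nvgpu_mutex_release(&tsg->ctx_init_lock);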
Bug 3677982
Change-Id: I837257fe9d9ef3eb6f69f5d7e0707e0bb6d4ea72
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2720222
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Only certain combinations of channels of GFX/Compute object classes can
be assigned to a particular PBDMA and/or VEID. CILP can be enabled only
in certain configs. Implement checks for the configurations, verified
during alloc_obj_ctx and/or when setting the preemption mode.
Bug 3677982
Change-Id: Ie7026cbb240819c1727b3736ed34044d7138d3cd
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2719995
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The subcontext PDBs and valid mask in the instance blocks of the
channels in the various subcontexts have to be updated when a new
subcontext is created or an existing one is removed.
The replayable fault state is cached in the channel structure. The
replayable fault state for a subcontext is set based on the first
channel's bind parameter; it was earlier programmed in
channel_setup_ramfc.
init_inst_block_core is updated to set up the TSG-level PDB map and
mask.
Added a new HAL, gv11b_channel_bind, to enable the subcontext on
channel bind.
Bug 3677982
Change-Id: I58156c5b3ab6309b6a4b8e72b0e798d6a39c1bee
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2719994
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
This patch introduces the following relationships among various nvgpu
objects to support multiple address spaces with subcontexts.
The IOCTLs establishing each relationship are shown in parentheses.
nvgpu_tsg 1<---->n nvgpu_tsg_subctx (TSG_BIND_CHANNEL_EX)
nvgpu_tsg 1<---->n nvgpu_gr_ctx_mappings (ALLOC_OBJ_CTX)
nvgpu_tsg_subctx 1<---->1 nvgpu_gr_subctx (ALLOC_OBJ_CTX)
nvgpu_tsg_subctx 1<---->n nvgpu_channel (TSG_BIND_CHANNEL_EX)
nvgpu_gr_ctx_mappings 1<---->n nvgpu_gr_subctx (ALLOC_OBJ_CTX)
nvgpu_gr_ctx_mappings 1<---->1 vm_gk20a (ALLOC_OBJ_CTX)
On unbinding the channel, objects are deleted according
to dependencies.
Without subcontexts, gr_ctx buffer mappings are maintained in the
struct nvgpu_gr_ctx. For subcontexts, they are maintained in the
struct nvgpu_gr_subctx.
The preemption buffer (index NVGPU_GR_CTX_PREEMPT_CTXSW) and the PM
buffer (index NVGPU_GR_CTX_PM_CTX) are to be mapped in all
subcontexts when they are programmed from the respective ioctls.
Global GR context buffers are to be programmed only for VEID0.
Based on the channel object class, the state is patched into the
patch buffer on every ALLOC_OBJ_CTX call, unlike before where it
was set only for the first channel.
PM and preemption buffer programming is protected under the TSG
ctx_init_lock.
tsg->vm is now removed. The VM reference for gr_ctx buffer mappings
is managed through the gr_ctx or gr_subctx mappings object.
For vgpu, gr_subctx and mappings objects are created to reference
the VMs for the gr_ctx lifetime.
The functions nvgpu_tsg_subctx_alloc_gr_subctx and
nvgpu_tsg_subctx_setup_subctx_header set up the subcontext header
for the native driver.
The function nvgpu_tsg_subctx_alloc_gr_subctx is called from
vgpu to manage the gr ctx mapping references.
free_subctx is now done when unbinding a channel, taking into account
references to the subcontext held by other channels. It will unmap
the buffers in the native driver case, and just release the
VM reference in the vgpu case.
Note that the TEGRA_VGPU_CMD_FREE_CTX_HEADER ioctl is no longer called
by vgpu, as this is taken care of by the native driver.
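The association, sketched as minimal structs; the member names are
assumptions, the relationships are those listed above:

    struct nvgpu_tsg_subctx {
        u32 veid;
        struct nvgpu_gr_subctx *gr_subctx;  /* 1:1 per subcontext */
        struct nvgpu_list_node channels;    /* 1:n bound channels */
    };

    struct nvgpu_gr_ctx_mappings {
        struct vm_gk20a *vm;                /* 1:1 address space   */
        struct nvgpu_list_node subctxs;     /* 1:n gr_subctx objs  */
    };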
Bug 3677982
Change-Id: Ia439b251ff452a49f8514498832e24d04db86d2f
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2718760
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
With subcontext support added, nvgpu has to allocate the VEID0 channel
itself to initialize the golden context image. Allocate the channel
and initialize the golden context image at the beginning of the
alloc_obj_ctx call for the first user channel.
It can't be initialized at the end of probe, as the TPC PG settings
need to be updated before the golden context image is initialized.
Bug 3677982
Change-Id: Ia82f6ad6e088c2bc1578a6bd32b7c7a707a17224
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2756289
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The wait_pending HAL is now modified to simply
check the pending status of a given runlist;
the while loop is removed from this HAL.
A new function, nvgpu_runlist_wait_pending_legacy(), is
added that emulates the older wait_pending() HAL.
nvgpu_runlist_tick() is modified to accept a 64-bit
"preempt_grace_ns" value.
These changes prepare for the upcoming control-fifo parser
changes.
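A sketch of the split, assuming nvgpu's timeout helpers and a stand-in
poll budget; the legacy wrapper keeps the old looping behavior:

    int nvgpu_runlist_wait_pending_legacy(struct gk20a *g,
                                          struct nvgpu_runlist *rl)
    {
        struct nvgpu_timeout to;

        nvgpu_timeout_init_cpu_timer(g, &to, POLL_TIMEOUT_MS);
        do {
            /* the HAL now returns the instantaneous pending status */
            if (g->ops.runlist.wait_pending(g, rl) == 0)
                return 0;
            nvgpu_usleep_range(10, 20);
        } while (nvgpu_timeout_expired(&to) == 0);
        return -ETIMEDOUT;
    }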
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: If3f288eb6f2181743c53b657219b3b30d56d26bc
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766100
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following changes are added here to simplify the overall
sequence.
1) Remove the deferred update for runlists. The NVS worker thread
shall submit the updated runlist.
2) Move the runlist mem swap inside the update itself. Protect
the swap() and hw_submit() path with a spinlock. This
is temporary until GSP.
3) Enable control-fifo mode from the nvgpu driver.
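The locking around the critical pair, sketched with an assumed lock
name:

    /* Temporary until GSP owns submission: keep the mem swap and the
     * hardware submit atomic with respect to each other. */
    nvgpu_spinlock_acquire(&rl->submit_lock);
    nvgpu_runlist_swap_mem(g, domain);  /* publish the rebuilt list */
    g->ops.runlist.hw_submit(g, rl);    /* kick it to hardware      */
    nvgpu_spinlock_release(&rl->submit_lock);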
Jira NVGPU-8609
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Icc52e5d8ccec9d3653c9bc1cf40400fc01a08fde
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2757406
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
A previous commit (44b6bfbc1) added a hack to prevent
the worker thread from calling nvgpu_runlist_tick()
in post_process if the next domain matches the previous one.
This could still face issues with multiple domains
in the future.
A better way is to synchronize the thread's suspend/resume
along with the device's common OS-agnostic suspend/resume
operations. This emulates the GSP as well.
It also takes care of the power constraints,
i.e. the worker thread can be expected to always run
with power enabled, so we can get rid of the complex
gk20a_busy() lock here for good.
Implemented a state-machine based approach for suspending/
resuming the NVS worker thread from the existing callbacks.
Remove support for NVS worker thread creation for VGPU.
The hw_submit method is currently set to NULL for VGPU; VGPU
instead submits its updates via the runlist.reload() method.
Jira NVGPU-8609
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I51a20669e02bf6328dfe5baa122d5bfb75862ea2
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750403
Reviewed-by: Prateek Sethi <prsethi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
On tsg.unbind_channel HAL failure, channel teardown was being done
again even though it had already been done as part of
nvgpu_tsg_unbind_channel_common.
Just abort the TSG and return the error in that case. Also, decrement
the TSG ch_count in the failure path.
Bug 3677982
Change-Id: I466f2b2c693d43ed64dc531b08bf740bf00f28a6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2749970
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
The KMD needs to send the domain id and the GPU_VA corresponding
to the struct runlist_domains to GSP. In the current
implementation, struct nvgpu_runlist_domain contains
the domain name instead of the domain id. This requires
an additional search by name every time an update
needs to be submitted to the GSP.
Modify struct nvgpu_runlist_domain to store the domain id
instead of the domain name. This simplifies the flow and avoids
the unnecessary search.
Removed the conditional check for the existence of the shadow domain,
as it is dead code: the shadow domain is not searchable in the list
of domains inside struct nvgpu_runlist.
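The struct change, sketched with assumed member names:

    struct nvgpu_runlist_domain {
        u64 domain_id;  /* was a name string; the id plus the domain's
                         * GPU_VA can now go to GSP without a lookup */
        /* ... runlist mem, mem_hw, active channel bookkeeping ... */
    };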
Jira NVGPU-8610
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0d67cfa93d89186240290e933aa750702b14f4f0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2744890
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some of the functions with no traceability to unit tests are already
covered by callee API functions. Skip these functions in SWVR by
skipping doxygen for them.
Some of the functions are non-FuSa, like those in profile.h and
bsearch.h. Those were included because the headers were included in the
Doxygen sources. Mark them non-safe.
Some of the nvgpu functions were not added to the Targets entries for
their respective tests. Fix those.
JIRA NVGPU-7211
Change-Id: Iacf22dccdd9340100cf93814566d3979734c455d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2612982
(cherry picked from commit a40f62654747102cc8ef53ddbd9f953c21c2b745)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2737672
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In order to maintain separate mappings of GR TSG and global context
buffers for different subcontexts, we need to separate the memory
struct and the mapping struct for the buffers. This patch moves
the mappings of all GR ctx buffers to a new structure,
nvgpu_gr_ctx_mappings.
This will be instantiated per subcontext in upcoming patches.
Summary of changes:
1. Various context buffers were allocated and mapped separately.
All TSG context buffers are now stored in the gr_ctx->mem[] array,
since allocation and mapping are unified for them.
2. Mapping/unmapping and querying the GPU VA of the context
buffers is now handled in the ctx_mappings unit. The structure
nvgpu_gr_ctx_mappings in nvgpu_gr_ctx holds the maps.
This struct is instantiated on ALLOC_OBJ_CTX and deleted
on free_gr_ctx.
3. Introduce mapping flags for TSG and global context buffers,
to map different buffers with different caching
attributes. Map all buffers as cacheable except the
PRIV_ACCESS_MAP, RTV_CIRCULAR_BUFFER, FECS_TRACE, GR CTX
and PATCH ctx buffers. Map all buffers as privileged.
4. Wherever a VM or GPU VA is passed into the obj_ctx allocation
functions, it is now replaced by nvgpu_gr_ctx_mappings.
5. The free_gr_ctx API need not accept the VM, as the mappings
struct holds the VM. The mappings struct is kept in gr_ctx.
6. Move the preemption buffer allocation logic out of
nvgpu_gr_obj_ctx_set_graphics_preemption_mode.
7. The set_preemption_mode and gr_gk20a_update_hwpm_ctxsw_mode
functions need updates to ensure buffers are allocated
and mapped.
8. Keep the unit tests and documentation updated.
With these changes there is a clear segregation of allocation and
mapping of GR context buffers. This will simplify the further change
to add multiple address space support. With multiple address
spaces in a TSG, subcontexts created after the first one
just need to map the buffers.
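The resulting split, sketched with assumed member names:

    /* memory lives with the TSG's GR context... */
    struct nvgpu_gr_ctx {
        struct nvgpu_mem mem[NVGPU_GR_CTX_COUNT];  /* allocation only */
    };

    /* ...while per-address-space GPU VAs live in the mappings object */
    struct nvgpu_gr_ctx_mappings {
        struct vm_gk20a *vm;
        u64 gpu_va[NVGPU_GR_CTX_COUNT];
    };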
Bug 3677982
Change-Id: I3cd5f1311dd85aad1cf547da8fa45293fb7a7cb3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2712222
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch primarily separates runlist modification from
runlist submission.
Instead of submitting the runlist (domain) immediately after
modification, a worker thread interface is now used to
synchronously schedule runlist submits. If the runlist being
scheduled is currently active, the submit happens instantly;
otherwise, it happens in the next iteration when the NVS
thread schedules the domain. This external interface uses
a condition variable to wait for the completion of the
synchronous submits.
A pending_update variable is used to synchronize domain memory
swaps just before they are submitted.
To facilitate faster scheduling via the NVS thread, nvgpu_dom
itself contains an array of rl_domain pointers. This can then
be used to select the appropriate rl_domain directly for scheduling,
as opposed to the earlier approach of keeping nvs domains and rl
domains in sync every time.
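The external interface's wait, sketched around nvgpu's condition
variable helpers; the field and cond names are assumptions, while
pending_update comes from this change:

    /* Caller: flag the domain for submission, wake the NVS thread,
     * then wait until the thread has swapped and submitted it. */
    domain->pending_update = true;
    nvgpu_cond_signal(&sched->worker_cond);
    NVGPU_COND_WAIT(&domain->submit_cond,
                    !domain->pending_update, 0U /* no timeout */);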
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I1725c7cf56407cca2e3d2589833d1c0b66a7ad7b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2739795
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
While creating a new channel, the ioctls are called in the below
sequence:
1. GPU_IOCTL_OPEN_CHANNEL
2. AS_IOCTL_BIND_CHANNEL
3. TSG_IOCTL_BIND_CHANNEL_EX
4. CHANNEL_ALLOC_GPFIFO_EX
5. CHANNEL_ALLOC_OBJ_CTX
The subctx PDBs and valid mask are programmed into the channel instance
block in the channel ioctls AS_IOCTL_BIND_CHANNEL and
CHANNEL_ALLOC_GPFIFO_EX.
Programming them in the ioctl AS_IOCTL_BIND_CHANNEL is redundant.
Remove the related HAL g->ops.mm.init_inst_block_for_subctxs.
The HAL init_inst_block programs the context PDB and big page size.
The HAL init_inst_block_core programs the context PDB, big page size
and the subctx 0 PDB. It is used by h/w units (fecs, pmu, hwpm, bar1,
bar2, sec2, gsp, perfbuf etc.).
For user channels, subctx PDBs are programmed as part of RAMFC setup.
Bug 3677982
Change-Id: I6656b002d513404c1fd7c3d349933e80cca7e604
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2680907
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Background: In the case of a deferred suspend implemented by gk20a_idle,
the device waits for a delay before suspending and invoking the
power gating callbacks. This helps minimize resume latency for any
resume calls (gk20a_busy) that occur before the delay expires.
Now, some APIs spread across the driver require that if the device
is powered on, they can proceed with register writes, but if it is
powered off, they must return. Examples of such APIs include
l2_flush, fb_flush and even the nvs_thread. We have relied on
some hacks to ensure the device is kept powered on to prevent any such
delayed suspension from proceeding. However, this still raced for some
calls like the l2_flush ioctl, so gk20a_busy() was added (refer to
commit dd341e7ecbaf65843cb8059f9d57a8be58952f63).
The upstream Linux kernel has introduced the API
pm_runtime_get_if_active specifically to handle the corner case of
locking the state during the event of a deferred suspend.
According to the Linux kernel docs, invoking the API with the
ign_usage_count parameter set to true prevents an incoming suspend
if the device has not already suspended.
With this, there is no longer a need to check
nvgpu_is_powered_off(). Changed the behavior of gk20a_busy_noresume()
to return bool: it returns true iff it managed to prevent
an imminent suspend, and false otherwise. For cases where
PM runtime is disabled, the code follows the existing implementation.
Added the missing gk20a_busy_noresume() calls to tlb_invalidate.
Also, moved gk20a_pm_deinit to after nvgpu_quiesce() in
the module removal path. This is done to prevent register access
after the registers are locked out at the end of nvgpu_quiesce. This
can happen as some free function calls post-quiesce might still
have l2_flush or fb_flush deep inside their stack; hence invoke
gk20a_pm_deinit to disable pm_runtime immediately after quiesce.
Kept the legacy implementation the same for VGPU and
older kernels.
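The new gk20a_busy_noresume() behavior, sketched around the upstream
API named above; the runtime-PM-disabled branch is simplified here:

    #include <linux/pm_runtime.h>

    bool gk20a_busy_noresume(struct device *dev)
    {
        if (!pm_runtime_enabled(dev))
            return true;  /* legacy path: no deferred suspend to race */
        /*
         * Takes a usage-count reference only if the device is still
         * active, which blocks an imminent deferred suspend without
         * ever resuming the device.
         */
        return pm_runtime_get_if_active(dev, true) > 0;
    }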
Jira NVGPU-8487
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I972f9afe577b670c44fc09e3177a5ce8a44ca338
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2715654
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
In its current form, the default domain acts like any schedulable
domain: TSGs are bound to it and it can be enumerated via the
public interfaces.
The new expectation is for the default domain to change
from the current form to a pseudo domain that cannot act like
an ordinary domain in other ways, i.e. it must not be reachable
by the domain management API in particular, it can't be removed,
it does not show up in lists, and TSGs cannot be explicitly bound to
it. It won't participate in round-robin domain scheduling.
It is not really a domain, and acts like one only when activated in
manual mode.
The following changes are made overall to support the above change in
definition.
1) Domain creation and attaching the domain to the scheduler are now
split into two separate functions. The new default domain (having ID
= UINT64_MAX) is created separately from a static function without
linking it with the other domains in the scheduler.
2) struct nvgpu_nvs_scheduler explicitly stores the default domain
to support direct lookups.
3) TSGs are initially not bound to the default domain/rl_domain.
Jira NVGPU-8165
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I916d11f4eea5124d8d64176dc77f3806c6139695
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697477
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In order to support the concept of the default domain, a new
rl domain is created that shadows all the other domains, i.e.
all channels of all TSGs are replicated there. This is scheduled
by default during GPU boot.
1) The shadow rl_domain is constructed during the poweron sequence via
nvgpu_runlist_alloc_shadow_rl_domain(). struct nvgpu_runlist
is extended to store this separately as 'shadow_rl_domain'.
This is scheduled in the background as long as no user-created
rl domains exist.
2) 'shadow_rl_domain' is scheduled out once user-created rl domains
exist. At this point, any updates in the user-created rl domains
are synchronized with the 'shadow_rl_domain', i.e. 'shadow_rl_domain'
is also reconstructed to contain the active channels and TSGs from the
rl domain.
3) 'shadow_rl_domain' is scheduled back in when the last user-created
rl domain is removed.
4) In the future, for manual mode, the driver shall support explicitly
switching to 'shadow_rl_domain'. Also, we will move to an
implementation where 'shadow_rl_domain' is switched out only when
other domains are actively scheduled. These changes will be
implemented later.
Jira NVGPU-8165
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Ia6a07d6bfe90e7f6c9e04a867f58c01b9243c3b0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704702
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
BVEC changes for nvgpu_rc_pbdma_fault and nvgpu_rc_mmu_fault
started reporting the below MISRA issues.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:321:
1. misra_c_2012_rule_5_1_violation: Declaration with identifier
"nvgpu_tsg_unbind_channel_check_hw_state", which is ambiguous.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:349:
2. other_declaration: The first 31 characters of identifiers
"nvgpu_tsg_unbind_channel_check_ctx_reload" and
"nvgpu_tsg_unbind_channel_check_hw_state" are identical.
Do the below renames to fix the issue; both are done for consistency.
s/nvgpu_tsg_unbind_channel_check_hw_state/nvgpu_tsg_unbind_channel_hw_state_check
s/nvgpu_tsg_unbind_channel_check_ctx_reload/nvgpu_tsg_unbind_channel_ctx_reload_check
JIRA NVGPU-6772
Change-Id: Ib92cabe11c486621351bf15ddb86e20d16d514c4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584152
(cherry picked from commit a619f259c6a4ffccb05550767212989af60c2a90)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2706551
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
- MISRA Directive 4.7
Calling function "nvgpu_tsg_unbind_channel(tsg, ch, true)" which returns
error information without testing the error information.
- MISRA Rule 10.3
Implicit conversion from essential type "unsigned 64-bit int" to different
or narrower essential type "unsigned 32-bit int"
- MISRA Rule 5.7
A tag name shall be a unique identifier
JIRA NVGPU-5955
Change-Id: I109e0c01848c76a0947848e91cc6bb17d4cf7d24
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572776
(cherry picked from commit 073daafe8a11e86806be966711271be51d99c18e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678681
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Include the scheduling domain name in the channel debug dump. The domain
name of a channel is the domain name of its parent TSG, if any. Copy
just the name into the dump info to avoid refcounting concerns.
While at it, reword the deterministic flag for less ambiguity.
Jira NVGPU-6791
Change-Id: I06041277f938e20f23de9aa419cfffbaa028035e
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2673101
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replace id-based lookup with fd-based lookup when binding a TSG to a
domain. The device node based domain interface naturally provides access
control; this way userspace tools can limit which uid/gid can access
each domain.
Also, explicitly disallow binding channels to a TSG that has no runlist
domain yet. Normally a TSG is in the default domain if nothing else has
been specified, but the default domain can be deleted.
Jira NVGPU-6788
Change-Id: I2af96dfc002367d894eaf0c175006332f790c27f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651165
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
A reference to the default scheduling domain is taken when a TSG is
opened. Although the explicit bind is designed to support only one bind,
the TSG is bound to the default one implicitly at that point. Release
the reference to avoid leaking it.
The domain might be null at that point if the default domain has been
removed; in that case there's just no domain to put back.
Change-Id: I7db43f7bbb2a8c86c391280eb7fa32431c8982da
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2663420
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
The nvgpu timeout API has an internal override for presilicon mode by
default: in presi simulation environments the timeouts never trigger.
This behaviour is intended for the original usecase of the timer unit,
hardware polling loops. In pure software logic, though, the timer
must trigger after the specified timeout even in presi mode, so add a
new init function to produce a timer for software logic. Use this new
kind of timer in the channel and scheduling worker threads.
The channel worker currently times out just for the purpose of the
channel watchdog timer, which has its own internal timer. Although
that's just software, the general expectation is that the watchdog does
not trigger in presilicon tests that run slower than usual. The internal
watchdog timer thus keeps the non-sw mode.
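The intended distinction, sketched; the new init function's name is an
assumption:

    struct nvgpu_timeout to;

    /* HW-poll timer: per the default policy, never expires on presi */
    nvgpu_timeout_init(g, &to, timeout_ms, NVGPU_TIMER_CPU_TIMER);

    /* SW-logic timer: expires after timeout_ms even on presi */
    nvgpu_timeout_init_cpu_timer_sw(g, &to, timeout_ms);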
Bug 3521828
Change-Id: I48ae8522c7ce2346a930e766528d8b64195f81d8
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662541
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
- Make the domain scheduler timeslice type nanoseconds to future-proof
the interface
- Return -ENOSYS from ioctls if the nvs code is not initialized
- Return the number of domains also when user supplied array is present
- Use domain id instead of name for TSG binding
- Improve documentation in the uapi headers
- Verify that reserved fields are zeroed
- Extend some internal logging
- Release the sched mutex on alloc error
- Add file mode checks in the nvs ioctls. The create and remove ioctls
require writable file permissions, while the query does not; this
allows filesystem based access control on domain management on the
single dev node (see the sketch after this list).
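The mode check, sketched inside the ioctl dispatcher; the ioctl names
are assumptions:

    switch (cmd) {
    case NVGPU_NVS_IOCTL_CREATE_DOMAIN:
    case NVGPU_NVS_IOCTL_REMOVE_DOMAIN:
        if ((filp->f_mode & FMODE_WRITE) == 0U)
            return -EPERM;   /* mutating ops need a writable fd */
        break;
    default:
        break;               /* queries are allowed read-only   */
    }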
Jira NVGPU-6788
Change-Id: I668eb5972a0ed1073e84a4ae30e3069bf0b59e16
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2639017
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit