linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Debarshi Dutta	9328f057a7	gpu: nvgpu: fix use-after-free use case of CE APP. The following issue is reported when running sudo modprobe -r nvgpu [ 134.066392] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [ 134.066428] Mem abort info: [ 134.066431] ESR = 0x96000004 [ 134.066434] EC = 0x25: DABT (current EL), IL = 32 bit [ 134.066450] [0000000000000058] pgd=0000000000000000, p4d=0000000000000000 [ 134.066459] Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP [ 134.066639] pc : nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu] [ 134.066847] lr : nvgpu_cic_rm_wait_for_stall_interrupts+0x74/0xd0 [nvgpu] [ 134.067043] sp : ffff80001971ba80 [ 134.067046] x29: ffff80001971ba80 x28: ffff000093b0da00 [ 134.067054] x27: 0000000000000000 x26: ffff80001c28b990 [ 134.067061] x25: ffff00008cd01000 x24: 0000000000000bb8 [ 134.067067] x23: 0000000000000000 x22: ffff0000915b0000 [ 134.067073] x21: ffff000093b0da00 x20: ffff0000915b0000 [ 134.067079] x19: ffff0000915b0000 x18: 0000000000000036 [ 134.067085] x17: 0000000000000000 x16: 0000000000000000 [ 134.067091] x15: ffff8000126b5fd8 x14: 7373616c633d4d45 [ 134.067097] x13: ffff8000098abef0 x12: 0000000000000000 [ 134.067102] x11: ffff8000098ab5a0 x10: ffff8000098abef8 [ 134.067108] x9 : ffff80001010e844 x8 : ffff80001971ba48 [ 134.067115] x7 : 2222222222222222 x6 : ffff000093b0da00 [ 134.067122] x5 : ffff8000098b1fd8 x4 : 0000000000000000 [ 134.067127] x3 : 0000000000000000 x2 : 0000000000000000 [ 134.067133] x1 : 0000000000000000 x0 : 0000000000000000 [ 134.067138] Call trace: [ 134.067140] nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu] [ 134.067328] nvgpu_cic_rm_wait_for_deferred_interrupts+0x20/0xb0 [nvgpu] [ 134.067517] nvgpu_channel_deferred_reset_engines+0x29c/0x920 [nvgpu] [ 134.067714] nvgpu_channel_close+0x18/0x20 [nvgpu] [ 134.067904] nvgpu_init_pramin+0x2ac/0x350 [nvgpu] [ 134.068092] nvgpu_ce_app_destroy+0x94/0xe0 [nvgpu] [ 134.068279] nvgpu_put+0x90/0x120 
[nvgpu] [ 134.068465] nvgpu_pci_shutdown+0x29c/0x18a0 [nvgpu] [ 134.068655] pci_device_remove+0x44/0xe0 [ 134.068665] device_release_driver_internal+0x114/0x1f0 [ 134.068701] driver_detach+0x54/0xe0 [ 134.068709] bus_remove_driver+0x70/0x120 [ 134.068733] driver_unregister+0x34/0x60 The above issue occurs due to freeing of CIC resources earlier than dependent users of interrupts e.g. CDE, CE etc. As a solution, move CIC deinit sequence to end of nvgpu_put. This handles deinit properly for VGPU/IGPU/DGPU. Bug 200763510 Change-Id: I696e31d5e03a9468cccfe710048000dbf7cf0269 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592063 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-16 21:45:43 -07:00
Sahil Mukund Patki	794d1edbe4	gpu: nvgpu: Fix debugfs compilation errors The function "nvgpu_ce_debugfs_init" is declared in "debug_ce.h". This file is only compiled when CONFIG_DEBUG_FS is enabled. So any accesses to this function result in compilation errors when CONFIG_DEBUG_FS is disabled. This patch fixes the errors by guarding all accesses to the above mentioned function by CONFIG_DEBUG_FS. Bug 200755555 Change-Id: Ie566413913c4a72b10b87c3285d1263d1c811074 Signed-off-by: Sahil Mukund Patki <spatki@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591304 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:16:22 -07:00
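The guarding pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's code: `DEMO_CONFIG_DEBUG_FS`, `demo_ce_debugfs_init` and `demo_finalize_poweron` are hypothetical stand-ins for `CONFIG_DEBUG_FS`, `nvgpu_ce_debugfs_init` and its caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's CONFIG_DEBUG_FS option.
 * Left undefined here to model a build without debugfs support.
 */
/* #define DEMO_CONFIG_DEBUG_FS 1 */

#ifdef DEMO_CONFIG_DEBUG_FS
/* Only exists when debugfs support is compiled in. */
static int demo_ce_debugfs_init(void)
{
	return 1;	/* pretend one debugfs node was created */
}
#endif

/*
 * The caller wraps every use of the debugfs helper in the same guard,
 * so the build does not break when the helper is compiled out.
 */
int demo_finalize_poweron(int *nodes_created)
{
	assert(nodes_created != NULL);

	*nodes_created = 0;
#ifdef DEMO_CONFIG_DEBUG_FS
	*nodes_created = demo_ce_debugfs_init();
#endif
	return 0;
}
```

With the macro undefined, the call site compiles to nothing and no debugfs node is reported.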
prsethi	dd94573e55	gpu: nvgpu: Update KMDI mapping interface Finding GPU VA mappings inside a given range is a two-step process: the first step queries the number of mappings, and the second step queries all the contiguous mapping ranges for the given GPU VA range. The mapping interface should count and return the number of mappings if the input count is 0, instead of failing. This patch implements the two-step process: the first step returns only the count, and the second step returns the contiguous memory ranges. The patch also replaces nvgpu_zalloc with nvgpu_big_zalloc to handle larger allocations. Bug 200722275 Change-Id: I56428deafa560ac8471c78f102bb1f9dbe20cabc Signed-off-by: prsethi <prsethi@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591043 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:16:06 -07:00
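A count-then-fetch query like the one in the commit above might look roughly like this. The sketch assumes a flat array of mappings; `demo_va_range` and `demo_find_mappings` are hypothetical names and do not reflect the real KMDI interface or its data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type; not the driver's real structure. */
struct demo_va_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Returns the number of mappings overlapping [start, end).
 * If capacity is 0, nothing is written and only the count is returned.
 * Otherwise up to `capacity` overlapping ranges are copied into `out`.
 */
size_t demo_find_mappings(const struct demo_va_range *maps, size_t nr_maps,
			  uint64_t start, uint64_t end,
			  struct demo_va_range *out, size_t capacity)
{
	size_t found = 0;

	assert(capacity == 0 || out != NULL);

	for (size_t i = 0; i < nr_maps; i++) {
		uint64_t map_end = maps[i].base + maps[i].size;

		if (maps[i].base >= end || map_end <= start)
			continue;	/* no overlap with the queried range */

		if (found < capacity)
			out[found] = maps[i];
		found++;
	}
	return found;
}
```

A caller would first pass a capacity of 0 to learn how many entries to allocate, then call again with a buffer of that size.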
Debarshi Dutta	79ab0ba6c4	gpu: nvgpu: remove sudo restrictions on gpu nodes. When SMC modes are enabled, devices are created with sudo-only access permissions. Those permissions are relaxed to allow non-sudo processes to submit jobs. Also, allow only root users to power off explicitly via the device power node. Bug 3374078 Change-Id: Ieb869399c3ada3588708cf2bc99a580414023cb7 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590584 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:15:49 -07:00
Antony Clince Alex	f3164a4672	gpu: nvgpu: fix tpc_fs_mask sysfs output The tpc_fs_mask sysfs entry outputs the TPC masks in logical order, which contradicts gpc_fs_mask, which is in physical order. So for consistency, update tpc_fs_mask to provide output in physical order. Bug 3364907 Change-Id: I2cc7b66dac2bea215024ef95944cde4b46d51c9a Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593803 Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-14 16:14:33 -07:00
Debarshi Dutta	791dc18666	gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields Add Setter and Getter methods for accessing tsg->sm_error_states. Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state. This renders it unnecessary to add BVEC for above fields for the struct in multiple locations. The current design ensures that only a constant pointer is obtained from the owner unit i.e. FIFO. The following new methods are added. Both unit tests and BVEC tests are added for them as well. nvgpu_tsg_store_sm_error_state nvgpu_tsg_get_sm_error_state Jira NVGPU-6947 Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-13 20:57:09 -07:00
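The setter/getter design from the commit above, where readers only ever receive a constant pointer and the range check lives in one place, can be illustrated with a small sketch. All names are hypothetical (`demo_` prefix), the struct fields are invented for illustration, and the real `nvgpu_tsg_store_sm_error_state` / `nvgpu_tsg_get_sm_error_state` signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_SM 4U	/* assumed SM count for the sketch */

/* Hypothetical, simplified error-state record. */
struct demo_sm_error_state {
	uint32_t global_esr;
	uint32_t warp_esr;
};

struct demo_tsg {
	struct demo_sm_error_state sm_error_states[DEMO_NUM_SM];
};

/* Setter: the only path that modifies the state. Returns 0, or -1 on a bad index. */
int demo_tsg_store_sm_error_state(struct demo_tsg *tsg, uint32_t sm_id,
				  uint32_t global_esr, uint32_t warp_esr)
{
	assert(tsg != NULL);

	if (sm_id >= DEMO_NUM_SM)
		return -1;

	tsg->sm_error_states[sm_id].global_esr = global_esr;
	tsg->sm_error_states[sm_id].warp_esr = warp_esr;
	return 0;
}

/* Getter: hands out a read-only view, or NULL when sm_id is out of range. */
const struct demo_sm_error_state *
demo_tsg_get_sm_error_state(const struct demo_tsg *tsg, uint32_t sm_id)
{
	assert(tsg != NULL);

	if (sm_id >= DEMO_NUM_SM)
		return NULL;

	return &tsg->sm_error_states[sm_id];
}
```

Because callers cannot write through the returned pointer, boundary-value tests only need to target these two functions rather than every user of the fields.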
ajeshkv	118f8c1280	gpu: nvgpu: add support for gsp stress test Add debugfs entries to support GSP stress test and other functionalities to enable the test. JIRA CORERM-3382 Change-Id: Iab20fcfe78807e76e91c64716502a2f036ed4d18 Signed-off-by: ajeshkv <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589390 Reviewed-by: Amit Pabalkar <apabalkar@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-10 16:02:43 -07:00
Antony Clince Alex	ab4aa0afba	gpu: nvgpu: remove incorrect usage of CONFIG_NVGPU_NEXT Remove incorrect usage of CONFIG_NVGPU_NEXT introduced in patch: https://git-master.nvidia.com/r/#/c/linux-nvgpu/+/2499571/ JIRA NVGPU-6574 Change-Id: I9bf0f0ee5d9762b79dd7913402678b0dd87f21ee Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2567353 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-08 06:50:49 -07:00
Sagar Kadamati	dd9b4364aa	gpu: nvgpu: add nvgpu-next infrastructure * As of now, working on multiple chip bringup in the nvgpu-next repo has an issue: we end up losing control of the source code (it is hard to find which part of the code belongs to which chip) and its valuable history, which affects chip migration on release. * To support multiple chip bringup simultaneously, we need new guidelines to avoid losing control of the source code and make migration easier. This change adds links to the nvgpu-next repo. * Updated return code to ENODEV for consistency * Updated ACR unittest to work with ENODEV return code NOTE: These are the initial set of infrastructure changes; guidelines will evolve, and source code will get updated accordingly. Which part of the source code falls under the nvgpu-next repo is decided based on future chip features. JIRA NVGPU-6574 Change-Id: I81827e35d189c55554df00e255b527a4473e0338 Signed-off-by: Sagar Kadamati <skadamati@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556793 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-08 06:50:38 -07:00
dt	9355345610	gpu: nvgpu: Add IPA-PA cache to increase the performance When the GPU needs to be programmed with a PA (physical address), the given IPA needs to be converted to a PA by querying the hypervisor. As this is an IPC between OSes, the call reduces performance badly. So this adds an IPA-PA cache to improve performance. This will be more helpful in the passthr config. Bug 3277194 Change-Id: I6a3230d858977313a0ed0f33068055a3b516330a Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571814 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-07 10:28:58 -07:00
Antony Clince Alex	2afd601a40	gpu: nvgpu: update FS mask sysfs entries to RDONLY Repurpose (gpc,fbp,tpc)_fs_mask sysfs nodes to only report active physical chiplets after floorsweeping. StaticPG'ing of chiplets will be handled by (gpc,fbp,tpc)_pg_mask sysfs nodes. The user will be able to write valid PG masks for the respective chiplets prior to poweron, which can then be verified using the (gpc,fbp,tpc)_fs_mask nodes. Bug 3364907 Change-Id: Ia4132f9c1939b2cb4a8f55f9d99a2b0a5b02184c Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587926 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Chris Dragan <kdragan@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-07 05:35:09 -07:00
Debarshi Dutta	33740b41b6	gpu: nvgpu: free memory during module removal Following pointers(allocated via Kmalloc/DMA) aren't freed during module removal. struct nvgpu_gr_config -> gpc_tpc_mask_physical struct nvgpu_netlist_vars -> ctxsw_regs.etpc.l struct mm_gk20a -> sysmem_flush struct nvgpu_pmu_pg -> pg_buf SGTable corresponding to VPR secure buffer. Added appropriate free calls. Bug 3364181 Change-Id: I2105c1f3256b1910f0f514d98f0ee3ae2e34aff7 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586244 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 15:43:07 -07:00
Sagar Kamble	79fb97100d	gpu: nvgpu: implement GET_BUFFER_INFO ioctl Userspace applications will need to query buffer information such as size, comptags allocation status, user associated metadata etc. for enabling newer IPC mechanisms. Add support for this new ioctl. Bug 200586313 Change-Id: I87607eb306afa0cce1bec7a1fb2925ec3bc33e50 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480763 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 11:42:13 -07:00
Sagar Kamble	ed16377983	gpu: nvgpu: allocate comptags and store metadata in REGISTER_BUFFER ioctl To enable userspace query about comptags allocation status of a buffer, comptags are to be allocated only during buffer registration done by nvrm_gpu. Earlier, they were allocated during map. nvrm_gpu will be sending metadata blob to be associated with the buffer. This will have to be stored in the dmabuf privdata for all the buffers registered by nvrm_gpu. This patch moves the privdata allocation to buffer registration ioctl. Remove g->mm.priv_lock as it is not needed now. This lock was added to protect dmabuf private data setup. That private data is now handled through dmabuf->ops and setup of dmabuf->ops is done under dmabuf->lock. To support legacy userspace, this patch still allocates comptags on demand on map calls for unregistered buffers. Bug 200586313 Change-Id: I88b2ca04c733dd02a84bcbf05060bddc00147790 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480761 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-02 11:42:08 -07:00
Jon Hunter	8a4b72a4aa	gpu: nvgpu: Fix crash when reading CE_APP debugfs The CE_APP debugfs nodes are created when the NVGPU driver is probed, however, the 'ce_app' structure which contains the variables exposed via the debugfs, is not allocated until nvgpu_finalize_poweron() is called. Therefore, if the user attempts to access the CE_APP debugfs nodes before the NVGPU has been powered on, for example, right after Linux has booted, then this results in a NULL pointer dereference crash. Fix this by moving the creation of the CE_APP debugfs nodes to nvgpu_finalize_poweron_linux() which is called after nvgpu_finalize_poweron(). Bug 200747304 Change-Id: Icd28952112f86887a1d6b6f8beb382f5189461a9 Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572106 (cherry picked from commit 35a0c18d93e97265611c3bbfae41b39d9cd183e3) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587367 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 07:23:53 -07:00
Sagar Kamble	7410784b0b	gpu: nvgpu: fix clk_arb completion file private data access race clk_arb completion file descriptor can get closed immediately after poll finishes in the work item gp10b_clk_arb_run_arbiter_cb. In that case, the refcount for nvgpu_clk_dev can become zero in the work item and can lead to invalid access while removing nvgpu_clk_dev from the lists. Remove nvgpu_clk_dev from the list before dropping the reference to it. Also, delete the nvgpu_clk_dev in completion file release handler within the session and requests spinlocks to avoid race with gp10b_clk_arb_run_arbiter_cb using it. bug 200757277 Change-Id: I054eee547f2a6fa633d7ef55df216ec36647a826 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2569522 (cherry picked from commit `ce8548ec05`) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587070 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 09:50:11 -07:00
Debarshi Dutta	6fc27766ed	gpu: nvgpu: fix issues due to a previous patch `608decf` gpu: nvgpu: add support for powering off gpu The above commit accidentally removed nvgpu_quiesce from nvgpu_pci_remove path. Add that back. Bug 3365659 Change-Id: I287972c426738a950ace2907610e02b774ab1eff Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586240 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 01:37:17 -07:00
Ramesh Mylavarapu	88293ee42d	gpu: nvgpu: read temperature from therm_i2cs_sensor_00_r Currently, reading the temperature value depends on therm pstate board objects. In the absence of pstate, reading the temperature from therm get status fails, which causes a GVS failure in the NvRmGpuTest_Device_GetTemperature test. This change adds support for reading the temperature from the therm sensor_00 register, with the following limitations: - NV_THERM_I2CS_SENSOR_00 doesn't support fractional precision. - It doesn't support negative temperatures. BUG-200736830 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I25e577dac9029fcd787a6f71957dbeefd6fe43dd Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584269 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-28 06:56:24 -07:00
Ramesh Mylavarapu	a96c04d097	gpu: nvgpu: disable pstate support for tu104 Disabling pstates on TU104 which is no more a POR. BUG-200736830 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I36a0d5fac5d1294802e5150dcebd5dcb54ad5f2e Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584268 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-28 06:56:19 -07:00
Seshendra Gadagottu	135e056e9e	gpu: nvgpu: ga10b: set can_slcg/blcg/elcg to true Add capability to enable/disable clock gating power features by setting can_xxcg capabilities to true. The cg features are disabled on tot and will be enabled once verification is done. Jira NVGPU-7033 Bug 200766930 Change-Id: I2d2aa25b7c84f3c4de0b12fd6d845a8f792bfd2d Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584540 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-27 20:45:58 -07:00
Debarshi Dutta	608decf1e6	gpu: nvgpu: add support for powering off gpu Add support for powering off IGPU for switching between legacy to SMC mode/vice-versa or changing SMC configuration. The power off can be issued as follows echo 0 > /dev/nvgpu/igpu0/power The following steps are done during a poweroff. 1) Deterministic channel idle 2) Acquire write_lock on l->busy semaphore. 3) Wait till power_usage decrements to indicate 0 active jobs. 4) Invoke pm_runtime_put_sync_suspend() 5) Invoke nvgpu_gr_remove_support() to clear existing GR memory. 6) Release write_lock on l->busy 7) Deterministic channel unidle. Part of the sequence matches that of the gk20a_do_idle code. The common parts are extracted into new functions gk20a_block_new_jobs_and_idle() and gk20a_unblock_jobs() For the joint-rail case, the current implementation does a railgate and then sets pm_runtime_set_autosuspend_delay(-1) to disable regular runtime resume/suspend. Remove clearing of NVGPU_SUPPORT_MIG status during state change as it leads to inconsistencies. Jira NVGPU-6920 Change-Id: I0b3eb3278176122ac061c1e8a94ebfb3c17c3925 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578501 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Antony Clince Alex <aalex@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-23 05:27:50 -07:00
Debarshi Dutta	2e3c3aada6	gpu: nvgpu: fix deinit of GR Existing implementation of GR de-init doesn't account for multiple instances of struct nvgpu_gr. As a fix, below changes are added. 1) nvgpu_gr_free is unified for VGPU as well as native. 2) All the GR instances are freed. 3) Appropriate NULL checks are added when freeing GR memories. 4) 2D, 3D, I2M and ZBC etc are explicitly disabled when MIG is set. 5) In ioctl_ctrl, checks are added to not return error when zbc is NULL for VGPU as requests are rerouted to RMserver. Jira NVGPU-6920 Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Antony Clince Alex <aalex@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-23 05:27:45 -07:00
Seshendra Gadagottu	a743596697	gpu: nvgpu: ga10b: handle floor-swept gpc clock gracefully If a GPC is floor-swept, then gpcclk enable for that GPC will return error. For gpu booting, ignore this error and continue with other clocks enable. More robust mechanism with floor-sweeping check before enabling clocks will be added in follow-up patches. Bug 3362403 Change-Id: I0b64c94918a1c00086a146408e6c4913788249ec Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579569 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-20 14:56:30 -07:00
Seshendra Gadagottu	342da8158a	gpu: nvgpu: ga10b: disable frequency scaling To disable frequency scaling for gpc clocks, set both devfreq_governor and qos_notify to NULL in platform data. Jira NVGPU-7059 Change-Id: I2142195d89758d21f2c6e070196645aca9bc0a24 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573476 Tested-by: Mark Mendez <mmendez@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-12 18:16:01 -07:00
Sagar Kamble	40064ef1ec	gpu: nvgpu: fix ecc counter free ECC counter structures are freed without removing the node from the stats_list. This can lead to invalid access due to dangling pointers. Update the ecc counter free logic to set them to NULL upon free, to remove them from stats_list and free them by validation. Also updated some of the ecc init paths where error was not propagated to callers and full ecc counters deallocation was not done. Now, calling unit ecc_free from any context (with counters allocated or not) is harmless as requisite checks are in place. bug 3326612 bug 3345977 Change-Id: I05eb6ed226cff9197ad37776912da9dcb7e0716d Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2565264 Tested-by: Ashish Mhetre <amhetre@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-11 01:55:08 -07:00
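The free logic described in the commit above (unlink from the stats list, free, then set the pointer to NULL so a repeated free is harmless) can be sketched as follows. This uses a plain singly linked list and hypothetical `demo_` names; the driver's actual list type and function names are not shown in the log and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter node kept on a stats list. */
struct demo_ecc_counter {
	struct demo_ecc_counter *next;
	unsigned int value;
};

struct demo_ecc_stats {
	struct demo_ecc_counter *head;
	size_t count;
};

/* Allocate a counter and link it at the head of the stats list. */
int demo_ecc_counter_alloc(struct demo_ecc_stats *stats,
			   struct demo_ecc_counter **out)
{
	struct demo_ecc_counter *c;

	assert(stats != NULL && out != NULL);

	c = calloc(1, sizeof(*c));
	if (c == NULL)
		return -1;

	c->next = stats->head;
	stats->head = c;
	stats->count++;
	*out = c;
	return 0;
}

/*
 * Unlink the counter from the list before freeing it, then clear the
 * caller's pointer so no dangling reference remains and a second call
 * becomes a no-op.
 */
void demo_ecc_counter_free(struct demo_ecc_stats *stats,
			   struct demo_ecc_counter **counter)
{
	struct demo_ecc_counter **link;

	assert(stats != NULL && counter != NULL);

	if (*counter == NULL)
		return;		/* never allocated or already freed */

	for (link = &stats->head; *link != NULL; link = &(*link)->next) {
		if (*link == *counter) {
			*link = (*counter)->next;
			stats->count--;
			break;
		}
	}
	free(*counter);
	*counter = NULL;
}
```

Passing the counter by address is what lets the free routine clear the owner's pointer, which is the property that makes calling it from any context safe.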
Seshendra Gadagottu	00e67e0798	gpu: nvgpu: ga10b: disable elpg Engine Level Power Gating(ELPG) for ga10b is enabled on tot for silicon. elpg needs to be enabled only after verification on silicon and after stress testing the feature. To avoid issues during ga10b bring-up with unverified ELPG feature, disable it by setting both can_elpg_init and elpg_enable to false in ga10b platform data. Jira NVGPU-7033 Change-Id: I664d6e031339aa912b78769bd58a4e6d77dca1d0 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2564197 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Krishna Reddy <vdumpa@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-28 12:03:48 -07:00
Seshendra Gadagottu	4b1a080cbf	gpu: nvgpu: ga10b: make sysclk rate same as gpc clocks As per HW guidance, keep gpc0, gpc1 and sysclk at same clock rate. Bug 3315239 Change-Id: I038d27c53e8c59a19f8150163ce1e1f216564e9a Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2562611 Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-21 17:44:32 -07:00
Tejal Kudav	b33079d47e	gpu: nvgpu: Move intr data members from MC to CIC Move interrupt specific data-members from common.mc to common.cic. Some of these data members, like sw_irq_stall_last_handled_cond, need to be initialized much earlier, during the OS specific init/probe stage. Also, some more members from struct nvgpu_interrupts (like stall_size, stall_lines[]), which will soon be moved to CIC, will also need to be initialized early during the OS specific probe stage. However, the chip specific LUT can only be initialized after the hal_init stage where the HALs are all initialized. Split the CIC init to accommodate the above initialization requirements. JIRA NVGPU-6899 Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-19 18:06:28 -07:00
Antony Clince Alex	f80dccb543	gpu: nvgpu: report gpc_tpc_mask in physical order At present, there is an inconsistency in the order in which gpc_tpc masks are reported to the userspace. Both gpc and tpc masks are reported using physical-ids. However, the gpc_tpc_masks array is ordered by logical gpc-ids and not physical-ids. This creates a mismatch between the gpc reported as enabled in the gpc_mask and its corresponding gpc_tpc_mask. Introduce field "gpc_tpc_mask_physical" which stores the gpc_tpc_masks in physical order and update NVGPU_GPU_IOCTL_GET_TPC_MASKS to return this field. Bug 200665942 Change-Id: I63aa83414a59676b7e7d36b6deb527e2f3c04cff Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2531114 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-19 16:04:01 -07:00
Divya Singhatwaria	842bef7124	gpu: nvgpu: Support GPC and FBP Floorsweeping - Add gops_fbp_fs and gops_gpc_pg struct - Add HALs to write to NV_FUSE_CTRL_OPT_FBP and NV_FUSE_CTRL_OPT_GPC fuses needed for floorsweeping - Add set_fbp_mask and set_gpc_mask to probe FBP and GPC mask respectively during gpu probe - Add sysfs node: fbp_fs_mask and gpc_fs_mask to store FBP and GPC floorsweeping mask sent from userspace - Move the floorsweeping programming early in NVGPU’s GPU init function and then issue a PRI init. JIRA NVGPU-6433 Change-Id: I84764d625c69914c107e1e8c7f29c476c2f64f78 Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2499571 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-19 06:17:25 -07:00
Divya Singhatwaria	9f30609550	gpu: nvgpu: Rename TPC powergating mutex Rename tpc_pg_lock to static_pg_lock and have_tpc_pg_lock to have_static_pg_lock as it is used for tpc/gpc/fbp power gating. JIRA NVGPU-6433 Change-Id: I4c56b9710e303ad9e872bad4b5ed9a167acb9dd6 Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537489 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-18 02:46:25 -07:00
Ramesh Mylavarapu	d328bff79e	gpu: nvgpu: gsp NVRISCV load and bootstrap Changes: - This change will only init gsp software state, nvgpu_gsp_bootstrap need to be called. - CONFIG_NVGPU_GSP_SCHEDULER flag is created to compile out the gsp scheduler code when needed. - Created GSP engine reset which is needed when ACR completed execution and need to load gsp fw. NVGPU-6783 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I2ce43e512b01df59443559eab621ed39868ad158 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554267 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-15 17:21:03 -07:00
Debarshi Dutta	493df6cb6e	gpu: nvgpu: resolve CE debugfs NULL access issues CE_APP is created only when CONFIG_NVGPU_DGPU is enabled. Consequently, create CE debugfs entries only when CONFIG_NVGPU_DGPU is enabled to avoid NULL access failures. Bug 200747304 Change-Id: Idf0829927b6578da4007f3c5c5ca5ae8f0ed11db Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2558712 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-07-15 10:13:48 -07:00
Vedashree Vidwans	43980bfe06	gpu: nvgpu: remove nvgpu_is_bpmp_running usage BPMP driver doesn't support any API to check whether bpmp is running. Remove use of nvgpu_is_bpmp_running. Bug 200720732 Change-Id: Id266e65d4af598dd056cbdbaa219d0d53b7b3fb3 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556448 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-15 10:06:42 -07:00
Deepak Nibade	4edf952e3e	gpu: nvgpu: fix rule 5.1 misra violations in common.gr Fix rule 5.1 misra violations in common.gr by renaming below functions : nvgpu_gr_config_get_gpc_tpc_mask_base -> nvgpu_gr_config_get_base_mask_gpc_tpc nvgpu_gr_config_get_gpc_tpc_count_base -> nvgpu_gr_config_get_base_count_gpc_tpc gm20b_ctxsw_prog_set_priv_access_map_config_mode -> gm20b_ctxsw_prog_set_config_mode_priv_access_map gm20b_ctxsw_prog_set_priv_access_map_addr -> gm20b_ctxsw_prog_set_addr_priv_access_map gm20b_gr_falcon_read_fecs_ctxsw_mailbox -> gm20b_gr_falcon_read_mailbox_fecs_ctxsw gm20b_gr_falcon_read_fecs_ctxsw_status0 -> gm20b_gr_falcon_read_status0_fecs_ctxsw gm20b_gr_falcon_read_fecs_ctxsw_status1 -> gm20b_gr_falcon_read_status1_fecs_ctxsw gv11b_gr_intr_get_sm_hww_warp_esr_pc -> gv11b_gr_intr_get_warp_esr_pc_sm_hww gv11b_gr_intr_get_sm_hww_warp_esr -> gv11b_gr_intr_get_warp_esr_sm_hww Jira NVGPU-6779 Change-Id: Icbe23a7b022373785968fc417ee247e2d80cfcc6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554521 (cherry picked from commit 1432650774506f2a7e45f70b084f498736d0d0c5) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555330 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-13 09:20:41 -07:00
Jon Hunter	e0ffd1a217	gpu: nvgpu: Fix debugfs_create_bool usage for Linux v5.14 Upstream Linux commit 393b06383fb7 ("debugfs: remove return value of debugfs_create_bool()") updated the function debugfs_create_bool() to remove the return value because it was not needed and users do not need to check the return value. This breaks building the NVGPU driver against the current upstream Linux kernel and the following error messages are seen ... nvgpu/drivers/gpu/nvgpu/os/linux/debug.c: In function ‘gk20a_debug_init’: nvgpu/drivers/gpu/nvgpu/os/linux/debug.c:469:25: error: void value not ignored as it ought to be l->debugfs_ltc_enabled = ^ nvgpu/drivers/gpu/nvgpu/os/linux/debug.c:507:32: error: void value not ignored as it ought to be l->debugfs_runlist_interleave = ^ Fix this by not saving the value returned from debugfs_create_bool() and remove the variables debugfs_ltc_enabled and debugfs_runlist_interleave from the nvgpu_os_linux structure. Note that these variables are not used anywhere in the driver and currently we don't check the return value from debugfs_create_bool() and so there is no impact from this change for older kernel versions. JIRA LS-114 Change-Id: I539388c8645f2026292d8b9f33f55921dfda648f Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555299 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-07-07 16:11:12 -07:00
Antony Clince Alex	f51a43b579	gpu: nvgpu: ga10b: fix fetching of FBP_L2 FS mask On all chips except ga10b, the numbers of ROP and L2 units per FBP were in sync, hence, their FS masks could be represented by a single fuse register NV_FUSE_STATUS_OPT_ROP_L2_FBP. However, on ga10b, the ROP unit was moved out from FBP to GPC and it no longer matches the number of L2 units, so the previous fuse register was broken into two - NV_FUSE_CTRL_OPT_LTC_FBP, NV_FUSE_CTRL_OPT_ROP_GPC. At present, the driver reads the NV_FUSE_CTRL_OPT_ROP_GPC register and reports an incorrect L2 mask. Introduce HAL function ga10b_fuse_status_opt_l2_fbp to fix this. In addition, rename fields and functions to exclusively fetch L2 masks, this should help accommodate ga10b and future chips in which L2 and ROP units are not in same. As part of this, the following functions and fields have been renamed. - nvgpu_fbp_get_rop_l2_en_mask => nvgpu_fbp_get_l2_en_mask - fuse.fuse_status_opt_rop_l2_fbp => fuse.fuse_status_opt_l2_fbp - nvgpu_fbp.fbp_rop_l2_en_mask => nvgpu_fbp.fbp_l2_en_mask The HAL ga10b_fuse_status_opt_rop_gpc is removed as rop mask is not used anywhere in the driver nor exposed to userspace. Bug 200737717 Bug 200747149 Change-Id: If40fe7ecd1f47c23f7683369a60d8dd686590ca4 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551998 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-07 05:48:56 -07:00
scottl	cd3ad1ccc7	gpu: nvgpu: fix REMAP android build failure Rework nvgpu_vm_remap_os_buf structure initialization to avoid android/clang build issues with the use of a single pair of {} to initialize certain structures. The os-dependent nvgpu_vm_remap_os_buf_get() routine now does a memset of the structure prior to initializing its contents. Jira NVGPU-6804 Change-Id: I08682c6ab7b8324a605a56ed660dea5bea11d16b Signed-off-by: scottl <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2553193 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-07-03 02:05:25 -07:00
Lakshmanan M	e9872a0d91	gpu: nvgpu: Skip graphics unit access when MIG is enabled This CL covers the following modifications, 1) Added logic to skip the graphics unit specific sw context load register write during context creation when MIG is enabled. 2) Added logic to skip the graphics unit specific sw method register write when MIG is enabled. 3) Added logic to skip the graphics unit specific slcg and blcg gr register write when MIG is enabled. 4) Fixed some priv errors observed during MIG boot. 5) Added MIG Physical support for GPU count < 1. 6) Host clk register access is not allowed for GA100. So skipped to access host clk register. 7) Added utility api - nvgpu_gr_exec_with_ret_for_all_instances() 8) Added gr_pri_mme_shadow_ram_index_nvclass_v() reg field to identify the sw method class number. Bug 200649233 Change-Id: Ie434226f007ee5df75a506fedeeb10c3d6e227a3 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2549811 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-02 16:41:51 -07:00
tkudav	0526e7eaa9	gpu: nvgpu: Create CIC-mon and CIC-rm subunits common.cic unit is divided into common.cic.mon and common.cic.rm based on rm and mon process split. CIC-mon subunit includes the code which is utilized in critical interrupt handling path like initialization, error detection and error reporting path. CIC-rm subunit includes the code corresponding to rest of interrupt handling(like collecting error debug data from registers) and ISR status management (status of deferred interrupts). Split the CIC APIs and data-members into above two subunits. JIRA NVGPU-6899 Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-02 09:57:56 -07:00
Seeta Rama Raju	6948fa6f4a	gpu: nvgpu: remove Dynamic TPC-PG code - Dynamic TPC-PG feature is not fully implemented and these variables are not used anywhere, so remove this code. JIRA NVGPU-5849 Change-Id: I4949e991a62e06f4aff10c3fbe7516546e49f55e Signed-off-by: Seeta Rama Raju <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2544789 (cherry picked from commit bb28c7d8bfd873283b24e8a1812e23c554cc6c18) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551208 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-29 15:21:38 -07:00
scottl	3cd256b344	gpu: nvgpu: add linux REMAP support Add REMAP ioctl and accompanying support to the linux nvgpu driver. REMAP support provides per-page control over sparse VM areas using the concept of a virtual memory pool. The REMAP ioctl accepts a list of operations (each a map or unmap) that modify the VM area pages tracked by the virtual memory pool. Inclusion of REMAP support in the nvgpu build is controlled by the new CONFIG_NVGPU_REMAP flag. This flag is enabled by default for linux builds. A new NVGPU_GPU_FLAGS_SUPPORT_REMAP characteristics flag is added for use in detecting when REMAP support is available. When a VM allocation tagged with NVGPU_VM_AREA_ALLOC_SPARSE is made the base virtual memory pool resources are allocated. Per-page resources are later allocated when the NVGPU_AS_IOCTL_REMAP ioctl is issued. All REMAP resources are released when the corresponding VM area is freed. Jira NVGPU-6804 Change-Id: I1f2cdc0c06c1698a62640c1c6fbcb2f9db24a0bc Signed-off-by: scottl <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542178 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-28 22:39:06 -07:00
Richard Zhao	77f0ab6583	gpu: nvgpu: remove gpu_va update_hwpm_ctxsw_mode Since the gpu server can now allocate va itself, update_hwpm_ctxsw_mode no longer needs to fixed-map the pm ctx. Jira GVSCI-10977 Change-Id: If592c8a2eb6dbfd7d922c79c87871162e9d8d8a4 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546192 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-28 18:10:18 -07:00
Richard Zhao	2845f2b66e	gpu: nvgpu: unify nvgpu_has_syncpoints - move nvgpu_has_syncpoints to common code and only check the flag NVGPU_HAS_SYNCPOINTS - the debugfs node disable_syncpoints also enables/disables the flag NVGPU_HAS_SYNCPOINTS Jira GVSCI-10881 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I8dc5dd17ad404238203a048abf49ff2b434fce11 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542738 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-06-28 18:09:14 -07:00
Antony Clince Alex	68e11c8bd3	gpu: nvgpu: remove nvgpu_next_gpuid.h Replace all usages of NVGPU_NEXT_GPUID and NVGPU_NEXT_DGPU_GPUID with NVGPU_GPUID_GA10B and NVGPU_GPUID_GA100. Remove nvgpu_next_gpuid.h and update yaml. Jira NVGPU-4771 Change-Id: I3baf0de4eb5266b79aabd5c6ddf8442bf8f73419 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547735 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-27 05:03:09 -07:00
Antony Clince Alex	d2919409e9	gpu: nvgpu: rename/collapse nvgpu_next functions and structs Replace all nvgpu_next functions/structs either by 1) collapsing them into nvgpu legacy functions/structs 2) renaming them as follows: - nvgpu_next_() => nvgpu_(ga10b/ga100)_() - nvgpu_next_() => (ga10b/ga100)_() - nvgpu_next_() => nvgpu_() [only if this doesn't cause collision] - nvgpu_next_() => nvgpu__extra() Create hal.sim unit and move Ampere+ SIM code into it. Jira NVGPU-4771 Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-27 05:02:58 -07:00
Antony Clince Alex	f9cac0c64d	gpu: nvgpu: remove nvgpu_next files Remove all nvgpu_next files and move the code into corresponding nvgpu files. Merge nvgpu-next-*.yaml into nvgpu-.yaml files. Jira NVGPU-4771 Change-Id: I595311be3c7bbb4f6314811e68712ff01763801e Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547557 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-27 05:02:53 -07:00
Antony Clince Alex	c7d43f5292	gpu: nvgpu: remove usage of CONFIG_NVGPU_NEXT The CONFIG_NVGPU_NEXT config is no longer required now that ga10b and ga100 sources have been collapsed. However, the ga100, ga10b sources are not safety certified, so mark them as NON_FUSA by replacing CONFIG_NVGPU_NEXT with CONFIG_NVGPU_NON_FUSA. Move CONFIG_NVGPU_MIG to Makefile.linux.config and enable MIG support by default on standard build. Jira NVGPU-4771 Change-Id: Idc5861fe71d9d510766cf242c6858e2faf97d7d0 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547092 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-27 05:02:47 -07:00
Richard Zhao	ff75647d59	gpu: nvgpu: unify power state management code The management code of g->power_on_state on different OSes is almost the same, so move the code to a common place. Jira GVSCI-10882 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I890015867b7bbdf3f749ab275ffd085ef76dfec2 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542846 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-06-23 09:26:49 -07:00
Konsta Hölttä	e44ece25ba	gpu: nvgpu: keep usermode region flags on railgate When the gpu is railgated, the usermode region mappings must be cleared. This is already done with zap_vma_ptes() but as an extra measure the vm flags are also zeroed. That is an oversight, so delete that code; in particular the VM_DONTCOPY flag is important so that the mapping does not follow fork, as the design does not allow that. Bug 200726443 Change-Id: I84ed4e38b7de1f0c8cbf4cca6276abfa2409ac3b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538481 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-22 19:30:00 -07:00

918 Commits