linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 18:16:01 +03:00

Author	SHA1	Message	Date
Mayur Poojary	fe7368f8f4	gpu: nvgpu: ga10b: Support emulate mode Add sysfs node to enable gpu emulate_mode and pass the value to acr through acr descriptor struct. Bug 3279344 Change-Id: I936b1dda84d7f4f3688237308223c019798bdce3 Signed-off-by: Mayur Poojary <mpoojary@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591377 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-20 16:40:34 -07:00
Divya Singhatwaria	95c954cf9f	gpu: nvgpu: ga10b: enable can_elpg_init flag Set can_elpg_init flag to true. This will allow enabling ELPG via sysfs node. Bug 200766930 Change-Id: I7e16d3d233a212ec01728eae119d6a60fcf6390e Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590733 Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-17 04:03:20 -07:00
Debarshi Dutta	60aab0a1da	gpu: nvgpu: add null check before calling function pointer nvgpu_gsp_isr_support is called from the common code and results in a null pointer exception when calling g->ops.gsp.enable_irq when it is not defined for some chips. Fix that. Bug 200763510 Change-Id: Ifef0d31ac4a8d06120bcebc17daf4a5b6559e3c3 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593355 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-16 21:45:49 -07:00
Debarshi Dutta	9328f057a7	gpu: nvgpu: fix use-after-free use case of CE APP. The following issue is reported when running sudo modprobe -r nvgpu [ 134.066392] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [ 134.066428] Mem abort info: [ 134.066431] ESR = 0x96000004 [ 134.066434] EC = 0x25: DABT (current EL), IL = 32 bit [ 134.066450] [0000000000000058] pgd=0000000000000000, p4d=0000000000000000 [ 134.066459] Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP [ 134.066639] pc : nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu] [ 134.066847] lr : nvgpu_cic_rm_wait_for_stall_interrupts+0x74/0xd0 [nvgpu] [ 134.067043] sp : ffff80001971ba80 [ 134.067046] x29: ffff80001971ba80 x28: ffff000093b0da00 [ 134.067054] x27: 0000000000000000 x26: ffff80001c28b990 [ 134.067061] x25: ffff00008cd01000 x24: 0000000000000bb8 [ 134.067067] x23: 0000000000000000 x22: ffff0000915b0000 [ 134.067073] x21: ffff000093b0da00 x20: ffff0000915b0000 [ 134.067079] x19: ffff0000915b0000 x18: 0000000000000036 [ 134.067085] x17: 0000000000000000 x16: 0000000000000000 [ 134.067091] x15: ffff8000126b5fd8 x14: 7373616c633d4d45 [ 134.067097] x13: ffff8000098abef0 x12: 0000000000000000 [ 134.067102] x11: ffff8000098ab5a0 x10: ffff8000098abef8 [ 134.067108] x9 : ffff80001010e844 x8 : ffff80001971ba48 [ 134.067115] x7 : 2222222222222222 x6 : ffff000093b0da00 [ 134.067122] x5 : ffff8000098b1fd8 x4 : 0000000000000000 [ 134.067127] x3 : 0000000000000000 x2 : 0000000000000000 [ 134.067133] x1 : 0000000000000000 x0 : 0000000000000000 [ 134.067138] Call trace: [ 134.067140] nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu] [ 134.067328] nvgpu_cic_rm_wait_for_deferred_interrupts+0x20/0xb0 [nvgpu] [ 134.067517] nvgpu_channel_deferred_reset_engines+0x29c/0x920 [nvgpu] [ 134.067714] nvgpu_channel_close+0x18/0x20 [nvgpu] [ 134.067904] nvgpu_init_pramin+0x2ac/0x350 [nvgpu] [ 134.068092] nvgpu_ce_app_destroy+0x94/0xe0 [nvgpu] [ 134.068279] nvgpu_put+0x90/0x120 
[nvgpu] [ 134.068465] nvgpu_pci_shutdown+0x29c/0x18a0 [nvgpu] [ 134.068655] pci_device_remove+0x44/0xe0 [ 134.068665] device_release_driver_internal+0x114/0x1f0 [ 134.068701] driver_detach+0x54/0xe0 [ 134.068709] bus_remove_driver+0x70/0x120 [ 134.068733] driver_unregister+0x34/0x60 The above issue occurs due to freeing of CIC resources earlier than dependent users of interrupts e.g. CDE, CE etc. As a solution, move CIC deinit sequence to end of nvgpu_put. This handles deinit properly for VGPU/IGPU/DGPU. Bug 200763510 Change-Id: I696e31d5e03a9468cccfe710048000dbf7cf0269 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592063 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-16 21:45:43 -07:00
ajesh	aa08389240	gpu: nvgpu: update doxygen for posix Update the documentation as per SWUD feedback for posix unit. JIRA NVGPU-6963 Change-Id: I29ed84ea21957b4593684ab62a798fc477fc279f Signed-off-by: ajesh <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581414 (cherry picked from commit 89b560deebf6485356afbfddd508104e95136508) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587428 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-16 21:44:22 -07:00
Tejal Kudav	9b5274593c	gpu: nvgpu: Update common.ptimer documentation Enhance doxygen comments for below common.ptimer APIs: 1. nvgpu_scale_ptimer() 2. gops_ptimer.isr() Remove assert calls from nvgpu_scale_ptimer() as it now has a means to return error. Reorder the Ptimer ISR code for better logical flow. JIRA NVGPU-6989 Change-Id: I5adf4d665d3b90d3e9b11557a15fcb91e485f353 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2583667 (cherry picked from commit 502ab9ee2dc3f3b7b1da7ac59f13fddce4ead616) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592057 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-16 05:59:13 -07:00
Tejal Kudav	5a94007725	gpu: nvgpu: Remove redundant HAL from common.fbp common.fbp has two interfaces to initialize FBP: 1. Public API nvgpu_fbp_init_support 2. HAL fbp.fbp_init_support nvgpu_fbp_init_support() is only used to initialize HAL fbp.fbp_init_support. Remove the HAL and use the API directly. JIRA NVGPU-6644 Change-Id: I2c455e09dbcf5e4fb1dc370b284e4f0d5c678b40 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592047 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-16 05:59:00 -07:00
Sahil Mukund Patki	794d1edbe4	gpu: nvgpu: Fix debugfs compilation errors The function "nvgpu_ce_debugfs_init" is declared in "debug_ce.h". This file is only compiled when CONFIG_DEBUG_FS is enabled. So any accesses to this function result in compilation errors when CONFIG_DEBUG_FS is disabled. This patch fixes the errors by guarding all accesses to the above mentioned function by CONFIG_DEBUG_FS. Bug 200755555 Change-Id: Ie566413913c4a72b10b87c3285d1263d1c811074 Signed-off-by: Sahil Mukund Patki <spatki@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591304 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:16:22 -07:00
prsethi	dd94573e55	gpu: nvgpu: Update KMDI mapping interface Finding GPU VA mappings inside a given range is a two-step process: the first step queries the number of mappings, and the second step queries all the continuous mapping ranges for the given GPU VA range. The mapping interface should count and return the number of mappings if the input count is 0 instead of failing. This patch implements the two-step process: the first step returns only the count, and the second step returns the continuous memory ranges. The patch also replaces nvgpu_zalloc with nvgpu_big_zalloc to handle larger allocations. Bug 200722275 Change-Id: I56428deafa560ac8471c78f102bb1f9dbe20cabc Signed-off-by: prsethi <prsethi@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591043 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:16:06 -07:00
Debarshi Dutta	79ab0ba6c4	gpu: nvgpu: remove sudo restrictions on gpu nodes. When SMC modes are enabled, devices are created with sudo-only access permissions. Those permissions are relaxed to allow job submission by non-sudo processes. Also, allow only root users to poweroff explicitly via the device power node. Bug 3374078 Change-Id: Ieb869399c3ada3588708cf2bc99a580414023cb7 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590584 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-15 09:15:49 -07:00
Antony Clince Alex	f3164a4672	gpu: nvgpu: fix tpc_fs_mask sysfs output The tpc_fs_mask sysfs entry outputs the TPC masks in logical order; however, this contradicts the gpc_fs_mask, which is in physical order. So for consistency, update tpc_fs_mask to provide output in physical order. Bug 3364907 Change-Id: I2cc7b66dac2bea215024ef95944cde4b46d51c9a Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593803 Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-14 16:14:33 -07:00
Seshendra Gadagottu	5f62534127	Revert "gpu: nvgpu: ga10b: add errata for disable CBU ECC" This reverts commit `78d7a7fdde`. Reason for revert: fix is available, so no errata required Bug 200759575 Change-Id: Id46dd3e8ecde1e56fd0e0bca2746dc9c35e07728 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584855 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-14 16:09:48 -07:00
Vedashree Vidwans	ecaafaf75e	gpu: nvgpu: ga10b: correct mmu fault static arrays Correct index and value of static descriptor arrays used to parse faulted hub and gpc clients. Bug 3373998 Change-Id: Ia0476b272aa110b6172c69b3d5ef2b76a683a856 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593631 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-14 01:59:59 -07:00
Debarshi Dutta	791dc18666	gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields Add Setter and Getter methods for accessing tsg->sm_error_states. Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state. This renders it unnecessary to add BVEC for above fields for the struct in multiple locations. The current design ensures that only a constant pointer is obtained from the owner unit i.e. FIFO. The following new methods are added. Both unit tests and BVEC tests are added for them as well. nvgpu_tsg_store_sm_error_state nvgpu_tsg_get_sm_error_state Jira NVGPU-6947 Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-13 20:57:09 -07:00
Debarshi Dutta	6361653633	gpu: nvgpu: update swud for priv_ring Update documentation for priv_ring unit based on updated swud guidelines. This patch contains a combination of two commits. Documentation is added for the HAL methods enable_priv_ring and isr decode_error_code enum_ltc get_fbp_count get_gpc_count set_ppriv_timeout_settings Jira NVGPU-6986 Change-Id: Ifa401dab0f29330ab7db2dcc888edf46a402cc83 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587227 GVS: Gerrit_Virtual_Submit (cherry picked from commit 0bdcf425ca58e6d04dceaedbb48f3adef43a870a) (cherry picked from commit ca44c09df60791db2ea6a6a80bc807f6c7eba494) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590992 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-13 20:57:03 -07:00
ajeshkv	118f8c1280	gpu: nvgpu: add support for gsp stress test Add debugfs entries to support GSP stress test and other functionalities to enable the test. JIRA CORERM-3382 Change-Id: Iab20fcfe78807e76e91c64716502a2f036ed4d18 Signed-off-by: ajeshkv <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589390 Reviewed-by: Amit Pabalkar <apabalkar@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-10 16:02:43 -07:00
Vedashree Vidwans	a3e2283cf2	gpu: nvgpu: ga10b: Use active ltcs count for cbc init This patch fixes a bug in the cbc initialization code for ga10b, where it was erroneously assumed that a fixed ltc count of only one should be used for historical reasons. For volta and later, the full ltc count should be used in cbc-related computation. Ensure - CBC base address is 64K aligned - CBC start address lies within CBC allocated memory Check CBC is marked safe only for silicon platform. Bug 3353418 Change-Id: I5edee2a78dc9e8c149e111a9f088a57e0154f5c2 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585778 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-10 16:00:25 -07:00
deepak goyal	cc7b048641	gpu: nvgpu: non-zero blob size for rail-gating. Ucode blob size 0 is passed currently for rail-gating. Ucode blob size 0 is not supported by ACR yet. ACR will copy UCODE blob again to SYSMEM for GPU Rail-gating cycles. Bug 3361416 Change-Id: I1fdb3993cda7e5d62507d83f9c0a8645dc5f7fc7 Signed-off-by: deepak goyal <dgoyal@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2588207 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-09 09:16:37 -07:00
David Li	b27524916a	gpu: nvgpu: ga10b fix zcull sm_num_rcp_conservative -calculate sm_num_rcp_conservative correctly using TPC total from all GPCs -register manual says use SM count but it's actually TPC count bug 3370219 Change-Id: I4422fb09d3a59879394e0e1abc5513efc6355b5b Signed-off-by: David Li <davli@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586399 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Gangzheng Tong <gtong@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Gangzheng Tong <gtong@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-09 09:13:31 -07:00
Debarshi Dutta	a53ebf02d1	gpu: nvgpu: update error message to info. These errors are now actually expected from code that counts number of sys/gpc/fbp perfmons after first context creation. Nvgpu tries to count them by register offset lookup in context image and counts perfmons until invalid offset is found. nvgpu_gr_hwmp_map_find_priv_offset no longer prints an error message. The correct error condition is moved to gr_exec_reg_ops Bug 200755537 Change-Id: Ib5c6ccd39275b2b06e3f8bce4878a3234478a780 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586228 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-09 09:13:03 -07:00
Antony Clince Alex	ab4aa0afba	gpu: nvgpu: remove incorrect usage of CONFIG_NVGPU_NEXT Remove incorrect usage of CONFIG_NVGPU_NEXT introduced in patch: https://git-master.nvidia.com/r/#/c/linux-nvgpu/+/2499571/ JIRA NVGPU-6574 Change-Id: I9bf0f0ee5d9762b79dd7913402678b0dd87f21ee Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2567353 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-08 06:50:49 -07:00
Sagar Kadamati	dd9b4364aa	gpu: nvgpu: add nvgpu-next infrastructure * As of now, working on multiple chip bringup in nvgpu-next repo has an issue because we end up losing control of the source code (hard to find which part of the code belongs to which chip) and its valuable history; this affects chip migration on release. * To support multiple chip bringup simultaneously, we need new guidelines to avoid losing control of the source code and make migration easier. This change adds links to nvgpu-next repo. * Updated return code to ENODEV for consistency * Updated ACR unittest to work with ENODEV return code NOTE: These are the initial set of infrastructure changes, guidelines will evolve, and source code will get updated accordingly. Which part of the source code falls under nvgpu-next repo is decided based on future chip features. JIRA NVGPU-6574 Change-Id: I81827e35d189c55554df00e255b527a4473e0338 Signed-off-by: Sagar Kadamati <skadamati@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556793 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-08 06:50:38 -07:00
Konsta Hölttä	9ffcb0fade	gpu: nvgpu: log submit error reasons For each common error that may happen in the submit path, log the failure reason at info level if not already logged. Various mistakes may cause -EINVAL, and getting to know what is wrong is helpful when writing tests. Change-Id: I8ac2a40441e0bf3d8afdb40526b607537eb5105c Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587360 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-07 16:00:50 -07:00
Divya Singhatwaria	b6ab227016	gpu: nvgpu: Enable pmu interrupt - For secure RISCV boot, enable pmu interrupt during pmu_rtos_init - As interrupts are enabled, PMU intr can be received before driver has changed the pmu firmware state. This can cause the RISCV boot to fail. - To resolve this, first change the pmu firmware state from off to PMU_FW_STATE_STARTING and then wait for pmu priv lockdown release. Change-Id: Ib2e8b033fec6320bf9ccff02696192a48172464b Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586325 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-07 16:00:05 -07:00
dt	152d7c9edd	gpu: nvgpu: Fix for pes_tpc_mask programming This was identified after enabling the CONFIG_UBSAN kernel compilation flag, which reports whether any shift causes an overflow. The register "gr_fe_tpc_fs_r(gpc_index)" is read only after Volta. The gops where we are computing the index is not needed. Bug 200727116 Change-Id: Ib2306103389ba9df77fd59d012ec70e775104989 Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573296 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-07 15:59:48 -07:00
dt	4034de5756	gpu: nvgpu: Fix for smid programming As the number of available tpc/gpc is more than 4 in the new dgpu, this fix is needed for correct sm_id config programming. This was identified after enabling the CONFIG_UBSAN kernel compilation flag, which reports whether any shift causes an overflow: the shift overflows u32 when the number of available TPCs is more than four. Bug 200727116 Change-Id: I9169a00614e4a648afe4a2d2f8e76c178e8c19eb Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571823 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-07 10:29:03 -07:00
dt	9355345610	gpu: nvgpu: Add IPA-PA cache to increase the performance When the GPU needs to be programmed with a PA (physical address), the given IPA needs to be converted to a PA by querying the Hypervisor. As this is an IPC between OSes, the call badly reduces performance. So this adds an IPA-PA cache to improve performance. This will be more helpful in passthr config. Bug 3277194 Change-Id: I6a3230d858977313a0ed0f33068055a3b516330a Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571814 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-07 10:28:58 -07:00
Ramesh Mylavarapu	ffd0d3962f	gpu: nvgpu: gsp: gsp isr and debug trace support - Created GSP NVRISCV interrupt handle and respective functions and register reads. - Created Debug trace support for GSP firmware. NVGPU-7084 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I2728150c4db00403aa6e3c043bc19c51677dd9cf Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589430 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-07 05:37:51 -07:00
Antony Clince Alex	2afd601a40	gpu: nvgpu: update FS mask sysfs entries to RDONLY Repurpose (gpc,fbp,tpc)_fs_mask sysfs nodes to only report active physical chiplets after floorsweeping. StaticPG'ing of chiplets will be handled by (gpc,fbp,tpc)_pg_mask sysfs nodes. The user will be able to write valid PG masks for respective chiplets prior to poweron, which can then be verified using (gpc,fbp,tpc)_fs_mask nodes. Bug 3364907 Change-Id: Ia4132f9c1939b2cb4a8f55f9d99a2b0a5b02184c Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587926 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Chris Dragan <kdragan@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-07 05:35:09 -07:00
Rajesh Devaraj	4d46e9e07a	gpu: nvgpu: update doxygen for SDL This patch updates doxygen for the following functions in SDL: - nvgpu_report_ctxsw_err() - nvgpu_report_ecc_err() - nvgpu_report_host_err() - nvgpu_report_pmu_err() - nvgpu_report_pri_err() - gr_intr_report_ctxsw_err() - nvgpu_report_mmu_err() - nvgpu_report_gr_err() JIRA NVGPU-7001 Change-Id: Ie21908cacaf4add1143d68d9f9a4d2d1315dfdd8 Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com> (cherry picked from commit c1dc3e7c35d585faed8ed3b9c61f6afe044f7263) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2588991 Reviewed-by: V M S Seeta Rama Raju Mudundi <srajum@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Ankur Kishore <ankkishore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-03 14:40:32 -07:00
Debarshi Dutta	33740b41b6	gpu: nvgpu: free memory during module removal Following pointers(allocated via Kmalloc/DMA) aren't freed during module removal. struct nvgpu_gr_config -> gpc_tpc_mask_physical struct nvgpu_netlist_vars -> ctxsw_regs.etpc.l struct mm_gk20a -> sysmem_flush struct nvgpu_pmu_pg -> pg_buf SGTable corresponding to VPR secure buffer. Added appropriate free calls. Bug 3364181 Change-Id: I2105c1f3256b1910f0f514d98f0ee3ae2e34aff7 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586244 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 15:43:07 -07:00
Sagar Kamble	79fb97100d	gpu: nvgpu: implement GET_BUFFER_INFO ioctl Userspace applications will need to query buffer information such as size, comptags allocation status, user associated metadata etc. for enabling newer IPC mechanisms. Add support for this new ioctl. Bug 200586313 Change-Id: I87607eb306afa0cce1bec7a1fb2925ec3bc33e50 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480763 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 11:42:13 -07:00
Sagar Kamble	ed16377983	gpu: nvgpu: allocate comptags and store metadata in REGISTER_BUFFER ioctl To enable userspace query about comptags allocation status of a buffer, comptags are to be allocated only during buffer registration done by nvrm_gpu. Earlier, they were allocated during map. nvrm_gpu will be sending metadata blob to be associated with the buffer. This will have to be stored in the dmabuf privdata for all the buffers registered by nvrm_gpu. This patch moves the privdata allocation to buffer registration ioctl. Remove g->mm.priv_lock as it is not needed now. This lock was added to protect dmabuf private data setup. That private data is now handled through dmabuf->ops and setup of dmabuf->ops is done under dmabuf->lock. To support legacy userspace, this patch still allocates comptags on demand on map calls for unregistered buffers. Bug 200586313 Change-Id: I88b2ca04c733dd02a84bcbf05060bddc00147790 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480761 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-02 11:42:08 -07:00
Jon Hunter	8a4b72a4aa	gpu: nvgpu: Fix crash when reading CE_APP debugfs The CE_APP debugfs nodes are created when the NVGPU driver is probed, however, the 'ce_app' structure which contains the variables exposed via the debugfs, is not allocated until nvgpu_finalize_poweron() is called. Therefore, if the user attempts to access the CE_APP debugfs nodes before the NVGPU has been powered on, for example, right after Linux has booted, then this results in a NULL pointer dereference crash. Fix this by moving the creation of the CE_APP debugfs nodes to nvgpu_finalize_poweron_linux() which is called after nvgpu_finalize_poweron(). Bug 200747304 Change-Id: Icd28952112f86887a1d6b6f8beb382f5189461a9 Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572106 (cherry picked from commit 35a0c18d93e97265611c3bbfae41b39d9cd183e3) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587367 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-02 07:23:53 -07:00
Jon Hunter	d1b34e50e2	gpu: nvgpu: Fix build for Linux v5.14-rc2 Upstream Linux kernel commits b7eb335e26a9 ("Makefile: Enable -Wimplicit-fallthrough for Clang") and d936eb238744 ('Revert "Makefile: Enable -Wimplicit-fallthrough for Clang"') have the net effect of updating the compiler flag -Wimplicit-fallthrough from -Wimplicit-fallthrough= to -Wimplicit-fallthrough=5. This causes the following build error to be seen ... nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.c:1042:15: error: this statement may fall through [-Werror=implicit-fallthrough=] 1042 \| step_count = (freq_step_size_mhz == 0U) ? 0U : \| ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1043 \| (u8)(p1xmaster->super.freq_max_mhz - \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1044 \| *pfreqmaxlastmhz - 1U) / \| ~~~~~~~~~~~~~~~~~~~~~~~~ 1045 \| freq_step_size_mhz; \| ~~~~~~~~~~~~~~~~~~ nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.c:1048:3: note: here 1048 \| case CTRL_CLK_PROG_1X_SOURCE_ONE_SOURCE: \| ^~~~ cc1: all warnings being treated as errors scripts/Makefile.build:271: recipe for target 'nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.o' failed Per commit d936eb238744 ('Revert "Makefile: Enable -Wimplicit-fallthrough for Clang"'), by setting -Wimplicit-fallthrough=5 [0], the explicit 'fall-through' comments in the code are not recognised by the compiler and cause the above error to be seen. This could be fixed by simply replacing the 'fall-through' comment with the 'fallthrough;' statement. However, this requires newer versions of GCC that support it. The simplest way to fix this error is by ensuring that -Wimplicit-fallthrough=3 for NVGPU so that fallthrough comments are recognised by the compiler. Note that we still need to check that GCC supports this option because older versions do not. It should be noted that -Wimplicit-fallthrough=3 is the default set by -Wextra. See the GCC warnings options document [0] for more details. 
Bug 3340525 Link: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html [0] Change-Id: Ia56e4343143185460a37f8a7b0dd229f005acbb9 Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2567440 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582509 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Rohit Khanna <rokhanna@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Tested-by: Rohit Khanna <rokhanna@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-01 18:49:59 -07:00
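The difference between the fall-through comment and an explicit fall-through statement described in the commit above can be sketched as follows. This is only an illustration: `EXAMPLE_FALLTHROUGH` and `example_count_steps()` are hypothetical names, not taken from the nvgpu sources. Per the GCC documentation, `-Wimplicit-fallthrough=3` accepts marker comments such as `/* fall through */`, while `-Wimplicit-fallthrough=5` ignores all comments and honors only the attribute.

```c
/*
 * Hypothetical macro for illustration only: expands to the GCC/Clang
 * fallthrough attribute when available, otherwise to an empty statement.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define EXAMPLE_FALLTHROUGH __attribute__((__fallthrough__))
# endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
# define EXAMPLE_FALLTHROUGH do { } while (0)
#endif

/* Hypothetical function: each case adds to the total and falls into the next. */
int example_count_steps(int source)
{
	int steps = 0;

	switch (source) {
	case 2:
		steps += 2;
		/* fall through */
	case 1:
		/* The comment above is accepted at level 3 but not at level 5. */
		steps += 1;
		EXAMPLE_FALLTHROUGH;
	case 0:
		/* The attribute above is accepted at every warning level. */
		steps += 1;
		break;
	default:
		steps = -1;
		break;
	}

	return steps;
}
```

Building this with `-Werror -Wimplicit-fallthrough=5` would be expected to reject the comment-only fall-through at `case 2`, which matches the build failure quoted in the commit message.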
Sagar Kamble	7410784b0b	gpu: nvgpu: fix clk_arb completion file private data access race clk_arb completion file descriptor can get closed immediately after poll finishes in the work item gp10b_clk_arb_run_arbiter_cb. In that case, the refcount for nvgpu_clk_dev can become zero in the work item and can lead to invalid access while removing nvgpu_clk_dev from the lists. Remove nvgpu_clk_dev from the list before dropping the reference to it. Also, delete the nvgpu_clk_dev in completion file release handler within the session and requests spinlocks to avoid race with gp10b_clk_arb_run_arbiter_cb using it. bug 200757277 Change-Id: I054eee547f2a6fa633d7ef55df216ec36647a826 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2569522 (cherry picked from commit `ce8548ec05`) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587070 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 09:50:11 -07:00
ajesh	7155ae865c	gpu: nvgpu: update queue unit tests Update queue unit tests for code coverage. JIRA NVGPU-6904 Change-Id: I49ed6980f2d610cf8359c375a1236e8866ea6795 Signed-off-by: ajesh <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555333 (cherry picked from commit f2311f2710cab83b82ed7f5d51c54fa897051686) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560216 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 05:57:54 -07:00
ajesh	3c70d56ddb	gpu: nvgpu: update posix thread unit tests Update the unit tests for posix thread unit to increase coverage. JIRA NVGPU-6904 Change-Id: Ib103de1ee37fb4986aa36900772b78b990ccb02a Signed-off-by: ajesh <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555772 (cherry picked from commit cd45d1cd2d095c77d738fdf7746fd258bc58353b) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560213 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 05:57:49 -07:00
Debarshi Dutta	6fc27766ed	gpu: nvgpu: fix issues due to a previous patch `608decf` gpu: nvgpu: add support for powering off gpu The above commit accidentally removed nvgpu_quiesce from nvgpu_pci_remove path. Add that back. Bug 3365659 Change-Id: I287972c426738a950ace2907610e02b774ab1eff Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586240 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-09-01 01:37:17 -07:00
deepak goyal	77d1e765f5	gpu: nvgpu: ga10b: Fix logic for BROM pass status Current code assumes riscv brom passed if it does not time out. This patch explicitly checks for brom pass/fail or timeout. Bug 3361416 Change-Id: I399a6cf9d32be92b24990532f81892642513ba54 Signed-off-by: deepak goyal <dgoyal@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585786 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-31 08:54:35 -07:00
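The idea of distinguishing pass, fail, and timeout, rather than treating "did not time out" as success, can be sketched as below. All names here (`ex_wait_for_brom`, the `EX_BROM_*` status codes, the scripted reader) are hypothetical stand-ins; the actual register layout and polling helpers in the driver are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the real hardware encoding is an assumption. */
#define EX_BROM_STATUS_PENDING 0u
#define EX_BROM_STATUS_PASS    1u
#define EX_BROM_STATUS_FAIL    2u

enum ex_brom_result { EX_BROM_PASS, EX_BROM_FAIL, EX_BROM_TIMEOUT };

/* Callback that returns the current status value. */
typedef uint32_t (*ex_read_status_fn)(void *ctx);

/* Poll up to max_polls times and report which of the three outcomes occurred. */
enum ex_brom_result ex_wait_for_brom(ex_read_status_fn read_status,
				     void *ctx, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		uint32_t status = read_status(ctx);

		if (status == EX_BROM_STATUS_PASS)
			return EX_BROM_PASS;
		if (status == EX_BROM_STATUS_FAIL)
			return EX_BROM_FAIL; /* explicit failure, not a timeout */
	}

	return EX_BROM_TIMEOUT;
}

/* Scripted status source so the sketch can be exercised without hardware. */
struct ex_script {
	const uint32_t *values;
	size_t len;
	size_t pos;
};

uint32_t ex_script_read(void *ctx)
{
	struct ex_script *s = ctx;

	if (s->pos < s->len)
		return s->values[s->pos++];
	return EX_BROM_STATUS_PENDING;
}
```

A caller would treat only `EX_BROM_PASS` as success and handle `EX_BROM_FAIL` and `EX_BROM_TIMEOUT` as separate error paths.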
Seshendra Gadagottu	d255c64f50	gpu: nvgpu: ga10x: update pdiv_duration for thermal To keep pdiv_duration at 15usec between steps at 102MHz utilsclk, update stepping duration value from 0xBF4 to 0x5FA for ga10x. Bug 200757274 Change-Id: I333a5b0b35307402a734a7eafc4ab13d20316cd1 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584539 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-30 19:35:54 -07:00
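The arithmetic behind the new value can be checked with a small sketch, assuming the stepping duration field counts utilsclk cycles (an assumption; the commit message does not state the unit). Under that assumption, 15 usec at 102 MHz is 102 * 15 = 1530 cycles, which is 0x5FA, while the old value 0xBF4 (3060) would correspond to 30 usec at the same clock. The function names are hypothetical.

```c
#include <stdint.h>

/* Cycles needed for a given duration: MHz * usec = cycles. */
uint32_t ex_usec_to_cycles(uint32_t clk_mhz, uint32_t usec)
{
	return clk_mhz * usec;
}

/* Duration, in whole microseconds, represented by a cycle count. */
uint32_t ex_cycles_to_usec(uint32_t clk_mhz, uint32_t cycles)
{
	return cycles / clk_mhz;
}
```

For example, `ex_usec_to_cycles(102u, 15u)` yields 0x5FA, and `ex_cycles_to_usec(102u, 0xBF4u)` yields 30.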
Ramesh Mylavarapu	88293ee42d	gpu: nvgpu: read temperature from therm_i2cs_sensor_00_r Currently reading the temperature value depends on therm pstate board objects. In the absence of pstate, reading temperature from therm get status fails, which causes a GVS failure in the NvRmGpuTest_Device_GetTemperature test. This change adds support to read temperature from the therm sensor_00 register, with the following limitations: - NV_THERM_I2CS_SENSOR_00 doesn't support fractional precision. - It doesn't support negative temperatures. BUG-200736830 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I25e577dac9029fcd787a6f71957dbeefd6fe43dd Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584269 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-28 06:56:24 -07:00
Ramesh Mylavarapu	a96c04d097	gpu: nvgpu: disable pstate support for tu104 Disabling pstates on TU104, which is no longer a POR. BUG-200736830 Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Change-Id: I36a0d5fac5d1294802e5150dcebd5dcb54ad5f2e Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584268 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-28 06:56:19 -07:00
Seshendra Gadagottu	135e056e9e	gpu: nvgpu: ga10b: set can_slcg/blcg/elcg to true Add capability to enable/disable clock gating power features by setting can_xxcg capabilities to true. The cg features are disabled on tot and will be enabled once verification is done. Jira NVGPU-7033 Bug 200766930 Change-Id: I2d2aa25b7c84f3c4de0b12fd6d845a8f792bfd2d Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584540 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-27 20:45:58 -07:00
Antony Clince Alex	bb5bffe571	gpu: nvgpu: enhance CE error reporting documentation Update documentation for function nvgpu_report_ce_err to include fine-grained implementation details. In addition, remove redundant descriptions from error reporting functions. Jira NVGPU-6948 Change-Id: Ie1675b0260809bfbc6fdeab6748c48347b5f3d7d Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554573 (cherry picked from commit a5f84edde5943358549534b8f736ee931a28c1ad) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555909 Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Ankur Kishore <ankkishore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Rajesh Devaraj <rdevaraj@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-27 04:17:15 -07:00
Deepak Nibade	3c97f3b932	gpu: nvgpu: disallow binding more channels than MAX channels supported per TSG There is HW specific limit on number of channel entries that can be added for each TSG entry in runlist. Right now there is no checking to enforce this from SW and hence if User binds more than supported channels to same TSG, invalid TSG formation error interrupts are generated. Fix this by adding appropriate checks in below steps : - Add new field ch_count to struct nvgpu_tsg to keep track of channels bound to TSG. - Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve HW specific maximum channel count per TSG. - Implement the HAL for gk20a and gv11b chips, and assign new HALs for all chips appropriately. - Increment ch_count while binding the channel to TSG and decrement it while unbinding. - While binding channel to TSG, Check if current channel count is already equal to max channel count. If yes, print an error and bail out. Bug 200763991 Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-25 09:47:47 -07:00
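The bind-time check listed in the commit above can be sketched in isolation as follows. This is a simplified illustration, not the driver code: `struct ex_tsg` and both functions are hypothetical, the `max_channels` field stands in for the value the commit says is obtained through `gops.runlist.get_max_channels_per_tsg()`, and the `-EINVAL` return code is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the TSG bookkeeping. */
struct ex_tsg {
	uint32_t ch_count;     /* channels currently bound to this TSG */
	uint32_t max_channels; /* HW-specific limit per TSG */
};

/* Refuse the bind once the limit is reached; otherwise count the channel. */
int ex_tsg_bind_channel(struct ex_tsg *tsg)
{
	if (tsg->ch_count >= tsg->max_channels)
		return -EINVAL; /* error code chosen for illustration */

	tsg->ch_count++;
	return 0;
}

/* Drop one channel from the count, guarding against underflow. */
void ex_tsg_unbind_channel(struct ex_tsg *tsg)
{
	if (tsg->ch_count > 0u)
		tsg->ch_count--;
}
```

The sketch uses `>=` rather than the equality test named in the commit message, so that a count that somehow exceeded the limit would still be rejected.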
Debarshi Dutta	608decf1e6	gpu: nvgpu: add support for powering off gpu Add support for powering off IGPU for switching between legacy to SMC mode/vice-versa or changing SMC configuration. The power off can be issued as follows echo 0 > /dev/nvgpu/igpu0/power The following steps are done during a poweroff. 1) Deterministic channel idle 2) Acquire write_lock on l->busy semaphore. 3) Wait till power_usage decrements to indicate 0 active jobs. 4) Invoke pm_runtime_put_sync_suspend() 5) Invoke nvgpu_gr_remove_support() to clear existing GR memory. 6) Release write_lock on l->busy 7) Deterministic channel unidle. Part of the sequence matches that of the gk20a_do_idle code. The common parts are extracted into new functions gk20a_block_new_jobs_and_idle() and gk20a_unblock_jobs() For joint-rail case, the current implementation does a railgate and then sets pm_runtime_set_autosuspend_delay(-1) to disable regular runtime resume/suspend. Remove clearing of NVGPU_SUPPORT_MIG status during state change as it leads to inconsistencies. Jira NVGPU-6920 Change-Id: I0b3eb3278176122ac061c1e8a94ebfb3c17c3925 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578501 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Antony Clince Alex <aalex@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-23 05:27:50 -07:00
Debarshi Dutta	2e3c3aada6	gpu: nvgpu: fix deinit of GR Existing implementation of GR de-init doesn't account for multiple instances of struct nvgpu_gr. As a fix, below changes are added. 1) nvgpu_gr_free is unified for VGPU as well as native. 2) All the GR instances are freed. 3) Appropriate NULL checks are added when freeing GR memories. 4) 2D, 3D, I2M and ZBC etc are explicitly disabled when MIG is set. 5) In ioctl_ctrl, checks are added to not return error when zbc is NULL for VGPU as requests are rerouted to RMserver. Jira NVGPU-6920 Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: Antony Clince Alex <aalex@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-23 05:27:45 -07:00
Mahantesh Kumbar	b9696ee643	gpu: nvgpu: ga10b: update NVRISCV LSPMU - Set NVRISCV LSPMU app version to 0. - Setting app version to 0 helps to load and boot multiple LSPMU ucode's without modifying the NVGPU driver. - Add support for PMU NVRISCV prod and dbg bin's. - This is corresponding change to LSPMU MPSK CL https://git-master.nvidia.com/r/c/tegra/kernel-firmware-t18x/+/2576049 JIRA NVGPU-7061 Change-Id: I800953ca97af3badde1983aa99e09b4fe7453203 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2575341 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-08-22 11:05:03 -07:00
Seshendra Gadagottu	a743596697	gpu: nvgpu: ga10b: handle floor-swept gpc clock gracefully If a GPC is floor-swept, then gpcclk enable for that GPC will return error. For gpu booting, ignore this error and continue with other clocks enable. More robust mechanism with floor-sweeping check before enabling clocks will be added in follow-up patches. Bug 3362403 Change-Id: I0b64c94918a1c00086a146408e6c4913788249ec Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579569 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-20 14:56:30 -07:00

9138 Commits