linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 18:16:01 +03:00

Author	SHA1	Message	Date
Deepak Nibade	7466369a58	gpu: nvgpu: update hwpm/smpc ctxsw mode API to accept TSG Below APIs to update hwpm/smpc ctxsw mode take a channel pointer as a parameter. APIs then extract corresponding TSG from channel and perform various operations on context stored in TSG. g->ops.gr.update_smpc_ctxsw_mode() g->ops.gr.update_hwpm_ctxsw_mode() Update both above APIs to accept TSG pointer instead of a channel. This is a refactor work to support new profiler design where a profiler object is bound to TSG and keeps track of TSG only. Bug 2510974 Jira NVGPU-5360 Change-Id: Ia4cefda503d8420f2bd32d07c57534924f0f557a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366122 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	59230fe64a	gpu: nvgpu: disable PMU PSTATE support if LS PMU disabled disable PMU PSTATE support if LS PMU support is disabled. JIRA NVGPU-5474 Change-Id: Idc2d9d802473f0b3d898fddbdf4aead9d68285c9 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365591 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
mkumbar	4b206055ae	gpu: nvgpu: Move SEC2 RTOS ucode to last in the WPR blob -This change is required to have reduced access of WPR1 region for ACRLIB hosting falcon. -By doing the above we allow only L3 Read access for ACRLIB hosting falcon, enforcing better security. -Fixed freeing of ACR resource at exit upon failure. JIRA NVGPU-5459 Change-Id: I9c32a1fe723570cf3768f7e741a7a2e9d96cc1bf Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365589 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
mkumbar	7aa8447ef2	gpu: nvgpu: sec2 LS falcon bootstrap update Add LS Falcon instance and index mask param update to bootstrap selected instance using nv_sec2_acr_cmd_bootstrap_falcon interface. JIRA NVGPU-5468 Change-Id: Ief55755e69c82697a52fb1c50381c50313aa72e7 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365588 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Deepak Nibade	dd875bb8d1	gpu: nvgpu: add custom log prints for profiler Define new flag gpu_dbg_prof for profiler specific debug prints. Add debug prints to existing profiler specific functions. Bug 2510974 Change-Id: Ifee6af2b6efe7b29f1337b6d8c89fd2156e1e2ca Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365676 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	d869040d7a	gpu: nvgpu: rename profiler object structure Rename profiler object structure from struct dbg_profiler_object_data to struct nvgpu_profiler_object. Annotate the structure members appropriately. Bug 2510974 Change-Id: I9454388f8ad143b39daca6bbc2b12511ffa3fd95 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365675 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	e4e6be85ea	gpu: nvgpu: move profiler alloc/free APIs to separate file Move profiler object allocation/free APIs to separate profiler specific file common/profiler.c. Store struct gk20a pointer in struct dbg_profiler_object_data for convenience of accessing global struct pointer. Update profiler object to store TSG pointer instead of channel pointer, since the expectation is to have one profiler object per context/TSG. nvgpu_profiler_reserve_acquire() has a case to check if resource reservation is acquired by some other channel in TSG. But now since we keep track of TSG itself, this case becomes redundant and can be removed. All the support is compiled out of safety build with compile flag CONFIG_NVGPU_PROFILER. Linux will always compile the support. Bug 2510974 Change-Id: I197bbd67a9cdd1fbea42f1effd1b74b15a6068e5 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365674 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	1ff79b1d2c	gpu: nvgpu: remove support for quad reg_op quad type reg_ops were only needed on Kepler, and not for any other chip beginning with Maxwell. HAL g->ops.gr.access_smpc_reg() was incorrectly set for Volta and Turing whereas it was only applicable to Kepler. Delete it. There is no register in the quad type whitelist since the type itself is not supported anymore. Remove the empty whitelists for all chips and also delete below HALs: g->ops.regops.get_qctl_whitelist() g->ops.regops.get_qctl_whitelist_count() hal/regops/regops_gv100.* files are not used anymore. Delete the files instead of just deleting quad HALs in these files. Bug 200628391 Change-Id: I4dcc04bef5c24eb4d63d913f492a8c00543163a2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366035 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	e7d6d36a16	gpu: nvgpu: update bios version for PG189 600QS Update BIOS version for PG189 600QS parts to 0x9004A200 as per new recommendation. Bug 2939979 Change-Id: I4600df89f5d824c20a95ed37c35578231abc3b3f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2362273 (cherry picked from commit 6de18442a2daade3dc14593cd4fefa404605b77e) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365475 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Seeta Rama Raju	6cb82a92b1	gpu: nvgpu: fix for certc violations JIRA NVGPU-5694 Change-Id: If1a2fd5c7f54878294ca0659dd37cf8c77f699d4 Signed-off-by: Seeta Rama Raju <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2363792 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Andrey Jivsov <ajivsov@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Richard Zhao	7b8a08af7a	gpu: nvgpu: check ch->wdt on wdt restart all channels ch->wdt is not always initialized. For example it's not initialized on gpu server, since the channel wdt is managed on client side. Bug 2833924 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Idb06f7de6a15e093bbb08be16454777b9d7582b9 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361978 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	98264f7505	gpu: nvgpu: call gops.tsg.unbind_channel on fail path When the current context is busy, nvgpu_tsg_unbind_channel_common may fail because preemption failed. In such a case, the .unbind_channel HAL still needs to be called to notify vserver that the channel will be removed from the TSG in the teardown path. Bug 2833924 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I9996202485429b4d9cba0c2f985f8e55fcdd3f29 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361977 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Antony Clince Alex	077a07ff9f	gpu: nvgpu: add gr gops to enable/handle zrop/crop/rrh hww Add the following gr gops functions: - enable_gpc_crop_hww - enable_gpc_zrop_hww - handle_gpc_crop_hww - handle_gpc_zrop_hww - handle_gpc_rrh_hww These gr gops will be used in nvgpu-next. Add function: nvgpu_gr_rop_offset to compute rop pri offsets. Jira: NVGPU-5237 Change-Id: I9e2437c1d2893238b16ec7a134543e20c81b49f7 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335687 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	fbb6a5bc1c	gpu: nvgpu: Remove fifo->pbdma_map The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists. This array allows other software to query what PBDMA(s) serves a given runlist. The PBDMA map is read verbatim from an array of host registers. These registers are stored in a kmalloc()'ed array. This causes a problem for the device management code. The device management initialization executes well before the rest of the FIFO PBDMA initialization occurs. Thus, if the device management code queries the PBDMA mapping for a given device/runlist, the mapping has yet to be populated. In the next patches in this series the engine management code is subsumed into the device management code. In other words the device struct is reused by the engine management and all host SW does is pull pointers to the host managed devices from the device manager. This means that all engine initialization that used to be done on top of the device management needs to move to the device code. So, long story short, the PBDMA map needs to be read from the registers directly, instead of an array that gets allocated long after the device code has run. This patch removes the pbdma map array, deletes two HALs that managed that, and instead provides a new HAL to query this map directly from the registers so that the device code can use it. JIRA NVGPU-5421 Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	223d8522a1	gpu: nvgpu: clarify fence api assumptions Adjust documentation and validity checks in the fence functions for simplicity. Now that the cde code is using user fences cleanly, the do-nothing-on-null action can cause unintended behaviour in new code using nvgpu_fence_get and nvgpu_fence_put. It does not make sense to call these with a null fence, so delete the checks. Extend the documentation in nvgpu_fence_extract_user() for the os fence lifetime to give a reason for the dup call. Make nvgpu_fence_from_semaphore() and nvgpu_fence_from_syncpt() return void. These fill a previously allocated object; the only failure would have been a null object, but that never happens and is not acceptable behaviour for callers so delete these null checks and fix types. Jira NVGPU-5248 Change-Id: I9f82365d50ab5600374c8f7dd513691eac14a2f1 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359624 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	39d1af0f65	gpu: nvgpu: use user fences in cde buffer state The stored fence in struct gk20a_buffer_state is a post fence of a previous cde preparation job, if any. This stored fence is passed to userspace via NVGPU_GPU_IOCTL_PREPARE_COMPRESSIBLE_READ in case a preparation job was necessary to fulfill the request. As nothing else is needed from the fence, make it just a struct nvgpu_user_fence. Add nvgpu_user_fence_clone() for copying this user fence because it's stored internally and returned to userspace. The refcounted os fence needs special care. Now that the API is not so trivial anymore, add some documentation. Jira NVGPU-5248 Jira NVGPU-5493 Change-Id: I8bc4d52eaab7c7cbc5573b331e72e1d853f9f057 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359065 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	194fac7f3c	gpu: nvgpu: Remove clutter in engine code Remove the get_mask_on_id() HAL and replace its usage with the global nvgpu_engine_get_mask_on_id() function. There's no need to have this function as a HAL. JIRA NVGPU-5420 Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	8c5972ac7f	gpu: nvgpu: Move device de-init call Move the device de-init call to when the gk20a struct is being freed; the device list can live for as long as the gk20a struct does. This will be a problem later, since the current location causes the device structs to get freed and allocated over and over. That'll cause gross corruption in the FIFO code when the engine_info struct is replaced with pointers to the device structs. JIRA NVGPU-5421 Change-Id: If4e08ea88dbcae7acd599e3fad29f72ece63b8e0 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361269 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	ca1f93bdd7	gpu: nvgpu: add user fence type Decouple the fence information needed for providing submit postfences to userspace by adding a separate type for that and using it to pass fence data to ioctls. The data in struct nvgpu_fence_type is used in various places: - job tracking needs to know when a post fence is expired - job submitters within the driver (vidmem clears) need to be able to wait for these fences - userspace needs the fence as an id, value pair or as a file descriptor created from an os fence To keep object lifetimes strict, start decoupling the os fence data out of struct nvgpu_fence_type: delete nvgpu_fence_install_fd() and add nvgpu_fence_extract_user() to return a struct nvgpu_user_fence that contains only the necessary information. Storing the os fence in job tracking metadata is legacy code and not useful. Passing the os fence from where it's created through the whole submit path inside this combined fence type has been convenient, though. The internally stored cde job fence in dmabuf compression metadata is still nvgpu_fence_type to keep this patch simple. Jira NVGPU-5248 Change-Id: I75b7da676fb6aa083828f888c55571bbf7645ef3 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359064 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Peter Daifuku	9a93dc3c83	gpu: nvgpu: mv struct nvgpu_device_list to header Move the struct nvgpu_device_list from device.c to device.h, so that unit tests have access to the struct. Bug 2984984 Change-Id: Ie6ec6c8dd8f40c28b98964f832b92fdbae73169f Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2360308 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	f807ad932c	gpu: nvgpu: fix uninitialized variable error Enabling Kcov and KASAN causes below compilation failure : common/mm/vm_area.c:255:3: error: ‘vma’ may be used uninitialized in this function [-Werror=maybe-uninitialized] Fix this by correcting failure cases in function nvgpu_vm_area_alloc() Bug 2155608 Change-Id: Id4070157f2a8bd7043b0c49effb6f61cce5eecc2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359496 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2d24298af0	gpu: nvgpu: update nvgpu_pte_dbg_print function Currently, nvgpu_pte_dbg_print() overwrites ctag string "ctag=" and only prints ctag number. For example, nvgpu_pte_dbg_print:104 [DBG] vm=3 PTE: i=0 size=8 | GPU 0x1efc000000 phys 0x115a50000 pgsz: 4kb perm=RW kind=0x8 APT=SYSTEM C--V- 1 [0x08000010, 0x115a5007] Update nvgpu_pte_dbg_print function to include ctag string. nvgpu_pte_dbg_print:104 [DBG] vm=3 PTE: i=0 size=8 | GPU 0x1efc000000 phys 0x115a50000 pgsz: 4kb perm=RW kind=0x8 APT=SYSTEM C--V- ctag=1 [0x08000010, 0x115a5007] Jira NVGPU-5489 Change-Id: I2f84f89da685ad6a84534c0bb51e3ca1244b3497 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354182 Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Seema Khowala <seemaj@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	160669a7bb	gpu: nvgpu: return device from nvgpu_device_get() Instead of copying the device contents into the passed pointer have nvgpu_device_get() return a device pointer. This will let the engines.c code move towards using the nvgpu_device type directly, instead of maintaining its own version of an essentially identical struct. JIRA NVGPU-5421 Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	229ea2dd59	gpu: nvgpu: add gmmu_attrs comptagline_mode flag Add cbc_comptagline_mode flag as a member of nvgpu_gmmu_attrs. This flag indicates if cbc follows comptagline policy. Add fb.is_comptagline_mode_enabled() to check if comptagline mode is enabled. JIRA NVGPU-4666 Change-Id: I77fb31cb54dd014c2fd35586a3751c757b2543e2 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2353348 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	71ab9800cd	gpu: nvgpu: Add raw data dump for profiler Add the ability to dump raw data from the profiler. The kernel driver can provide some simple analysis, but ultimately a userspace tool such as python, R, matlab/octave, or the like, is far better suited for data analysis and visualization. JIRA NVGPU-5606 Change-Id: I94a63eadba726b66a78cf51ea4674745038390a1 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2358381 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	70ce67df2d	gpu: nvgpu: Add a generic profiler Add a generic profiler based on the channel kickoff profiler. This aims to provide a mechanism to allow engineers to (more) easily profile arbitrary software paths within nvgpu. Usage of this profiler is still primarily through debugfs. Next up is a generic debugfs interface for this profiler in the Linux code. The end goal for this is to profile the recovery code and generate interesting statistics. JIRA NVGPU-5606 Signed-off-by: Alex Waterman <alexw@nvidia.com> Change-Id: I99783ec7e5143855845bde4e98760ff43350456d Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355319 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	319520ff57	gpu: nvgpu: Add a new device manager unit This adds a new device management unit in the common code responsible for facilitating the parsing of the GPU top device list and providing that info to other units in nvgpu. The basic idea is to read this list once from HW and store it in a set of lists corresponding to each device type (graphics, LCE, etc). Many of the HALs in top can be deleted and instead implemented using common code parsing the SW representation. Every time the driver queries the device list it does so using a device type and instance ID. This is common code. The HAL is responsible for populating the device list in such a way that the driver can query it in a chip agnostic manner. Also delete some of the unit tests for functions that no longer exist. This code will require new unit tests in time; those should be quite simple to write once unit testing is needed. JIRA NVGPU-5421 Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	4dcfbc19de	gpu: nvgpu: Trigger quiesce on spurious FBPA intr In Bug 200588835, the spurious FBPA interrupts are seen on a couple of boards. These interrupts were found to be EDC (Error detection and Correction) interrupts which are triggered due to ECC errors. The EDC registers are not exposed to the driver, so the interrupt status register cannot be cleared, resulting in an interrupt storm. Also, it was concluded that only bad HW can cause this failure scenario. So, in the ISR for FBPA interrupts, get the GPU into quiesce state as we don't expect the GPU to be in usable state post such unrecoverable errors. Adapt the quiesce code for Linux build too. 1. On Linux, we cannot exit the nvgpu process after quiesce like we do on QNX. So, add nvgpu_disable_irqs() call to quiesce implementation which is done as part of process exit handler on QNX. Masking interrupts which is already done as part of quiesce would be sufficient in most cases, but to be fail-safe disable_irqs too. 3. Also, the IOCTL code looks at g->sw_ready, hence add nvgpu_start_gpu_idle() to set g->sw_ready to false along with setting NVGPU_DRIVER_IS_DYING = true. We expect the nvgpu_sw_quiesce() call to finish before quiesce thread wakes up from 50ms sleep. Hence, critical step like nvgpu_start_gpu_idle() is added to nvgpu_sw_quiesce(), whereas the somewhat redundant disable IRQs call is added to quiesce thread. nvgpu_fifo_quiesce() was called twice by mistake; remove one of them. Bug 2919899 Bug 200588835 Change-Id: I9beec688c2e1c0d8dfc1327ddf122684576f8684 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354537 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6778fc9eb6	gpu: nvgpu: remove fence validity checks The valid flag in struct nvgpu_fence_type is not very useful. It's set when a fence is created on an allocated object and read in these three scenarios: - nvgpu_fence_install_fd() after a submit, if the submit was successful. A successful submit implies that a post fence exists. - nvgpu_fence_wait() for a copyengine job when synchronizing the ce ringbuffer or when waiting for vidmem clears. In these cases the fence is also clearly always valid. - nvgpu_fence_is_expired() when testing whether a tracked job has completed. Such jobs cannot exist without post fences that are mandatory for tracking, so the fence must exist. Remove the valid flag. Remove also the other init checks from the above functions; they're equally unused and confusing implying that such calls would be acceptable, causing sloppy code at best. Jira NVGPU-5248 Jira NVGPU-5493 Change-Id: I52c5be1569b343024d2626bd9577f87b46064fba Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357828 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	2dfa74c831	gpu: nvgpu: ACR interface update FALCON_ID_END is used in ACR lsf_ucode_desc interface to allocate space for dependency map but now more number of FALCON’s supported which will cause wrong allocation for dependency map, so required to have its definition. JIRA NVGPU-5462 Change-Id: Idaaa24ea1d2767a0b4ef44b1376239f945e39912 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357747 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	4e241d5974	gpu: nvgpu: adapt to generic syncpt api Use the nvhost sync fence APIs that do not require knowledge about the sync fence version. Nvhost exports an opaque nvhost_fence type with a common interface for both legacy and stable sync fences. Delete the syncfd-specific nvhost wrappers. They exist only on Linux, so having them in the nvhost wrapper layer is just a hassle. The os fence interface is already one wrapper. Jira NVGPU-5386 Change-Id: I3849db3684c7be8f37cf53971347f26247a52d6c Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355650 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	3a11bd69e7	Revert "gpu: nvgpu: modify nvgpu_writel check and loop" This reverts commit c100ac23455d450a7046c62915014111a0aa2e70. Bug 3009270 Change-Id: I1db1acac63c841b5383d75ec674fdc2160a0c84d Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356076 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com>	2020-12-15 14:13:28 -06:00
Dinesh	290911618a	gpu: nvgpu: Check for vidmem failure This is added to check the vidmem init failure during gpu initialization. JIRA NVGPU-5389 Change-Id: I0111f302058e171031407c88804ba30c2509fabc Signed-off-by: Dinesh <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352916 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6cbc174fc2	gpu: nvgpu: avoid channel wdt ifdefs Implement empty stubs of the channel watchdog functions for when the watchdog is disabled in the build. Add some forward declarations that were missing. Now most call sites don't need #ifdefs for the build flag. Add error checks for the wdt alloc failure. Jira NVGPU-5494 Jira NVGPU-5493 Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2ad015f7a5	gpu: nvgpu: modify nvgpu_writel check and loop Currently, nvgpu_writel_loop() writes to a register and immediately checks if register value is updated. It might take some time for hardware registers to get updated with value written by software. Modify nvgpu_writel_loop() to accept number of retries to check if register value is updated and assert with nvgpu_assert(). Also, move nvgpu_writel_loop() to common code and use generic nvgpu_readl() and nvgpu_writel() APIs. JIRA NVGPU-5490 Change-Id: Iaaf24203a91eee3d05de7d0c7dea18113367de5f Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348628 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	c43e3e4aeb	gpu: nvgpu: acr: add fecs/gpccs sig files read for next dgpu add fecs/gpccs sig file read for next dgpu. JIRA NVGPU-5461 Change-Id: Ib135dab8961c53d62fb7a95e378eba4c81d729a2 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354622 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	36a488392f	gpu: nvgpu: add PDI reporting for vgpu Read the PDI from vgpu constants. Bug 2957580 Bug 2992739 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: Ief2edeaaa26e284707792f13d218c511fef073af Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351214 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	fc5b45ea83	gpu: nvgpu: move init_ltc_support sequence Currently, ltc fs_state is initialized during ltc init support. However, ltc cbc_param and cbc_param2 registers do not seem to be providing correct data if ltc.init_fs_state is called before fb.init_fs_state. - Create fb.init_fb_support hal to initialize fb. - Trigger init_fb_support before init_ltc_support. Bug 2969956 Bug 2957808 JIRA NVGPU-4666 Change-Id: I54d697d27b9d9c6318c4ef459d215b6f82cd5571 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345673 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	2a3bb9107f	gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h> top.h is a description of "devices" available on the GPU. As such rename this header to device.h. device.h will ultimately be a unit of actual C code that will rely on the top HAL to fill a device list. JIRA NVGPU-5421 Change-Id: If6e4a537d2209e429a678761a34713723da7a00a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
tkudav	957b19092f	gpu: nvgpu: Enable Quiesce on all builds Make recovery and quiesce co-exist to support the quiesce state on unrecoverable errors. Currently, the quiesce code is wrapped under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from the recovery config, thereby enabling it on all builds. On Linux, the hung_task checker (check_hung_uninterruptible_tasks() in kernel/hung_task.c) complains that the quiesce thread is stuck for more than 120 seconds. INFO: task sw-quiesce:1068 blocked for more than 120 seconds. The wait time of more than 120 seconds is expected, as the quiesce thread will wait until a quiesce call is triggered on fatal unrecoverable errors. However, the INFO print upsets the kernel_warning_test (KWT) on Linux builds. To fix the failing KWT, change the quiesce task to interruptible instead of uninterruptible, as the checker only looks at uninterruptible tasks. Bug 2919899 JIRA NVGPU-5479 Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030 Signed-off-by: tkudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Seshendra Gadagottu	8778aa531d	gpu: nvgpu: netlist: correct info for generic regions There is an issue with reading u32 data for generic regions. The u8 pointer dereference copies only u8 data instead of u32 data. Legacy code does not use this data, so the issue was not caught earlier. Now use nvgpu_memcpy to copy all bytes of the u32 data. Bug 2986531 Change-Id: Ib23c76cd1ce77e3a2f882940b11703391a11f99d Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348593 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	16fb7654a5	gpu: nvgpu: isolate channel watchdog unit Move the definition of struct nvgpu_channel_wdt to watchdog.c. Adjust users of it to access it via an unified interface instead of poking directly at the channel internals. Jira NVGPU-5494 Change-Id: Ie11826e6732a8b98e72c4f81dd06bd7e49848121 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345935 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	21e02878f4	gpu: nvgpu: move wdt code out of channel.c Cut and paste the existing channel watchdog functions to another file for better isolation of units. Jira NVGPU-5494 Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2d94863cae	gpu: nvgpu: move is_tpc_addr and get_tpc_num to common gr.is_tpc_addr() and gr.get_tpc_num() are chip agnostic hals. Move these hals to common code. Jira NVGPU-5504 Change-Id: I50fa7ac876c8667de42df1830bd412b412538508 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349272 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	23cda4f4a9	gpu: nvgpu: add PDI for TU104 (Linux) Add reporting for the per-device identifier (PDI) in the Linux GPU characteristics. Implement PDI read for TU104. Bug 2957580 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: I6ac0e4f74378564d82955b431d4c1fd6c0daeb13 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346933 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	6b1302f23c	gpu: nvgpu: Reduce linux debug log spew Currently when nvgpu prints debug information for something like an MMU fault the result includes a lot of useless boilerplate logging spew. In some cases this can be helpful in identifying where the log message came from in the nvgpu code base. However, for debug spews from faults, the viewer of that info does not care which function printed the log (for example). Instead having a fast and readable debug dump is more valuable. So to that end, add a special debug dump printing function that does not use the normal log format. Instead, it prints only a brief prefix to use as a grep search query. The new print out is listed below. Since often the kernel logs are impressively long and obtuse, having a clear debug search string can be helpful. With this log format, one can simply do: $ grep __$CHIP__ kernel.log And find any debug logs for the desired chip. New log format - collected on a gv11b under L4T running `nvgpu_submit_mmu_fault': [ 32.005793] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:311 [ERR] [MMU FAULT] mmu engine id: 32, ch id: 511, fault addr: 0x1000, fault addr aperture: 0, fault type: invalid pde, access type: virt read, [ 32.006137] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:320 [ERR] [MMU FAULT] protected mode: 0, client type: hub, client id: host, gpc id if client type is gpc: 0, [ 32.006417] nvgpu: 17000000.gv11b nvgpu_rc_mmu_fault:296 [ERR] mmu fault id=0 id_type=1 act_eng_bitmask=00000000 [ 32.007125] __gv11b__ Channel Status - chip gv11b [ 32.007128] __gv11b__ --------------------------- [ 32.007241] __gv11b__ 511-gv11b, TSG: 0, pid 955, refs: 2, deterministic: [ 32.007364] __gv11b__ channel status: in use pending busy [ 32.007509] __gv11b__ RAMFC : TOP: 8000000000001000 PUT: 0000000000001030 GET: 0000000000001000 FETCH: 0000600000001000HEADER: 60400000 COUNT: 00000000SEMAPHORE: addr 0000000000000000payload 0000000000000000 execute 00000000 [ 32.007601] __gv11b__ [ 
32.008696] __gv11b__ [ 32.008700] __gv11b__ PBDMA Status - chip gv11b [ 32.008894] __gv11b__ ------------------------- [ 32.013477] __gv11b__ pbdma 0: [ 32.017840] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] \| status: invalid [ 32.020992] __gv11b__ PBDMA_PUT 0000000000001030 PBDMA_GET 0000000000001000 [ 32.029037] __gv11b__ GP_PUT 00000001 GP_GET 00000001 FETCH 00000001 HEADER 60400000 [ 32.036386] __gv11b__ HDR 00000000 SHADOW0 00001000 SHADOW1 80003000 [ 32.044787] __gv11b__ pbdma 1: [ 32.051964] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] \| status: invalid [ 32.055099] __gv11b__ PBDMA_PUT 0000000042003200 PBDMA_GET 00000050728bc914 [ 32.062997] __gv11b__ GP_PUT 00000000 GP_GET 2080a000 FETCH 00000000 HEADER e1850010 [ 32.070424] __gv11b__ HDR 00110000 SHADOW0 02000000 SHADOW1 10000004 [ 32.078652] __gv11b__ pbdma 2: [ 32.085913] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] \| status: invalid [ 32.088973] __gv11b__ PBDMA_PUT 00000021040c0004 PBDMA_GET 0000000140020000 [ 32.096502] __gv11b__ GP_PUT 00000000 GP_GET 8080a440 FETCH 00000000 HEADER 61400040 [ 32.103679] __gv11b__ HDR 14000010 SHADOW0 00000000 SHADOW1 00000400 [ 32.112336] __gv11b__ [ 32.119860] __gv11b__ gv11b eng 0: [ 32.122119] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.125807] __gv11b__ [ 32.135954] __gv11b__ gv11b eng 1: [ 32.135958] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.139457] __gv11b__ [ 32.149945] __gv11b__ gv11b eng 2: [ 32.149950] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.153543] __gv11b__ [ 32.163598] __gv11b__ gv11b eng 3: [ 32.163601] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.167278] __gv11b__ [ 32.177076] __gv11b__ [ 32.186145] nvgpu: 17000000.gv11b nvgpu_tsg_set_ctx_mmu_error:492 [ERR] TSG 0 generated a mmu fault [ 32.189443] nvgpu: 17000000.gv11b nvgpu_set_err_notifier_locked:140 [ERR] error notifier set to 31 for ch 511 
JIRA NVGPU-5541 Change-Id: Iad60adfab5198ee11dd2ec595f2422ea541b7a2a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349166 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	5d06a59bc5	gpu: nvgpu: Cleanup uart and debugfs debug prints The gk20a_debug_dump() function implicitly adds a newline since it uses nvgpu_err() under the hood (for uart destined prints). For the seq_file destined writes it does not so there is an annoying inconsistency. Remove the newline that many of the gk20a_debug_dump() calls add and add the newline to the (now) seq_printf() call. This reduces the length of debug dump logs and speeds them up - UART is _very_ slow after all. Also cleanup some formatting issues in the various debug prints I happened to notice. JIRA NVGPU-5541 Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Antony Clince Alex	50dcfe1637	gpu: nvgpu: update fb unit ecc init, handling The ecc init, handling for the fb unit is refactored to improve reusability for nvgpu-next. The following changes have been done: - fb.ecc: This is a new subunit within fb and contains the following functions: - init: Moved from fb.fb_ecc_init. - free: Moved from fb.fb_ecc_free. - l2tlb_error_mask: Fetch bit mask for corrected, uncorrected errors supported by the unit. - fb.intr: This unit has been updated to include the following ecc interrupt, error handlers: - handle_ecc: Top level interrupt handler for fb ecc errors. - handle_ecc_l2tlb: Handle errors within l2tlb memory. - handle_ecc_hubtlb: Handle errors within hubtlb memory. - handle_ecc_fillunit: Handle errors within fillunit memory Jira: NVGPU-5032 Change-Id: I1a26c1823eb992e0e0175250b969f1186dff6e62 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333271 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Abdul Salam	d339d9ed33	gpu: nvgpu: segregate clk_mon from clk unit. As a part of refactoring, this CL removes the clk_mon unit from the clk unit. Clk_mon is used for monitoring of clk and it is an independent unit. This patch does the following: - Move the clk_mon struct from clk.h to clk_mon_tu104.h. - Create a new clk_mon gpu_ops and assign clk_mon-specific ops there. - Move all the functions to clk_mon_tu104.c. - Update the yaml file. NVGPU-4689 Change-Id: Ia72bf28a93ce9a7936c277076f365c4b6593b032 Signed-off-by: Abdul Salam <absalam@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2336230 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	91af7efd23	gpu: nvgpu: enable ACR support for NEXT dGPU -Enabled ACR support for NEXT dGPU -Blob creation & boot strap of LSPMU support skipped by ACR by checking flag "support_ls_pmu", lspmu support is not required until PSTATE support is enabled. JIRA NVGPU-5461 Change-Id: I5a4c688926ca1c55aeb4cbbb9668c55bb35f9119 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2344582 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Abdul Salam <absalam@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00


3348 Commits