linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 18:16:01 +03:00

Author	SHA1	Message	Date
Alex Waterman	d0fa6a15c1	gpu: nvgpu: Fix logging for pre-4.14 kernels It seems that on Tegra kernels older than 4.14 the pr_err() function does not automatically add a '\n' if you don't supply it. For older kernels, with the new nvgpu_dbg_dump_impl() function, add this extra newline so that logs are not hopelessly scrambled. Change-Id: Ife8fe03ace248a1d8ece7850b609c343cc1d27ac Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359752 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	f807ad932c	gpu: nvgpu: fix uninitialized variable error Enabling Kcov and KASAN causes below compilation failure : common/mm/vm_area.c:255:3: error: ‘vma’ may be used uninitialized in this function [-Werror=maybe-uninitialized] Fix this by correcting failure cases in function nvgpu_vm_area_alloc() Bug 2155608 Change-Id: Id4070157f2a8bd7043b0c49effb6f61cce5eecc2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359496 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2d24298af0	gpu: nvgpu: update nvgpu_pte_dbg_print function Currently, nvgpu_pte_dbg_print() overwrites ctag string "ctag=" and only prints ctag number. For example, nvgpu_pte_dbg_print:104 [DBG] vm=3 PTE: i=0 size=8 | GPU 0x1efc000000 phys 0x115a50000 pgsz: 4kb perm=RW kind=0x8 APT=SYSTEM C--V- 1 [0x08000010, 0x115a5007] Update nvgpu_pte_dbg_print function to include ctag string. nvgpu_pte_dbg_print:104 [DBG] vm=3 PTE: i=0 size=8 | GPU 0x1efc000000 phys 0x115a50000 pgsz: 4kb perm=RW kind=0x8 APT=SYSTEM C--V- ctag=1 [0x08000010, 0x115a5007] Jira NVGPU-5489 Change-Id: I2f84f89da685ad6a84534c0bb51e3ca1244b3497 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354182 Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Seema Khowala <seemaj@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	160669a7bb	gpu: nvgpu: return device from nvgpu_device_get() Instead of copying the device contents into the passed pointer have nvgpu_device_get() return a device pointer. This will let the engines.c code move towards using the nvgpu_device type directly, instead of maintaining its own version of an essentially identical struct. JIRA NVGPU-5421 Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	229ea2dd59	gpu: nvgpu: add gmmu_attrs comptagline_mode flag Add cbc_comptagline_mode flag as a member of nvgpu_gmmu_attrs. This flag indicates if cbc follows comptagline policy. Add fb.is_comptagline_mode_enabled() to check if comptagline mode is enabled. JIRA NVGPU-4666 Change-Id: I77fb31cb54dd014c2fd35586a3751c757b2543e2 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2353348 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	f1dc3fd2fb	gpu: nvgpu: Add debugfs wrapper for exposing profilers The current debugfs code is completely specific to FIFO's kickoff profiler. But exposing these debugfs nodes is really a perfectly generic operation to any given profiler. Therefore add a generic debugfs interface for exposing profilers. Any code that implements a profiler can now use a single function call to export a profiler to the GPU debugfs area. JIRA NVGPU-5606 Change-Id: I67a5bd9998fcfac94678e465442b9a38ab7e7612 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2358382 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	71ab9800cd	gpu: nvgpu: Add raw data dump for profiler Add the ability to dump raw data from the profiler. The kernel driver can provide some simple analysis, but ultimately a userspace tool such as python, R, matlab/octave, or the like, is far better suited for data analysis and visualization. JIRA NVGPU-5606 Change-Id: I94a63eadba726b66a78cf51ea4674745038390a1 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2358381 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	70ce67df2d	gpu: nvgpu: Add a generic profiler Add a generic profiler based on the channel kickoff profiler. This aims to provide a mechanism to allow engineers to (more) easily profile arbitrary software paths within nvgpu. Usage of this profiler is still primarily through debugfs. Next up is a generic debugfs interface for this profiler in the Linux code. The end goal for this is to profile the recovery code and generate interesting statistics. JIRA NVGPU-5606 Signed-off-by: Alex Waterman <alexw@nvidia.com> Change-Id: I99783ec7e5143855845bde4e98760ff43350456d Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355319 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	319520ff57	gpu: nvgpu: Add a new device manager unit This adds a new device management unit in the common code responsible for facilitating the parsing of the GPU top device list and providing that info to other units in nvgpu. The basic idea is to read this list once from HW and store it in a set of lists corresponding to each device type (graphics, LCE, etc). Many of the HALs in top can be deleted and instead implemented using common code parsing the SW representation. Every time the driver queries the device list it does so using a device type and instance ID. This is common code. The HAL is responsible for populating the device list in such a way that the driver can query it in a chip agnostic manner. Also delete some of the unit tests for functions that no longer exist. This code will require new unit tests in time; those should be quite simple to write once unit testing is needed. JIRA NVGPU-5421 Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Debarshi Dutta	f6298157bc	gpu: nvgpu: update HAL function name get_ecc_override_val is renamed to gp10b_gr_get_ecc_override_val to follow the naming convention of the HAL functions correctly. Change-Id: I80e495e8bec3483651e325b792708382e5327de0 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357925 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	4dcfbc19de	gpu: nvgpu: Trigger quiesce on spurious FBPA intr In Bug 200588835, the spurious FBPA interrupts are seen on a couple of boards. These interrupts were found to be EDC (Error detection and Correction) interrupts which are triggered due to ECC errors. The EDC registers are not exposed to the driver, so the interrupt status register cannot be cleared, resulting in an interrupt storm. Also, it was concluded that only bad HW can cause this failure scenario. So, in the ISR for FBPA interrupts, get the GPU into quiesce state as we don't expect the GPU to be in usable state post such unrecoverable errors. Adapt the quiesce code for Linux build too. 1. On Linux, we cannot exit the nvgpu process after quiesce like we do on QNX. So, add nvgpu_disable_irqs() call to quiesce implementation which is done as part of process exit handler on QNX. Masking interrupts which is already done as part of quiesce would be sufficient in most cases, but to be fail-safe disable_irqs too. 2. Also, the IOCTL code looks at g->sw_ready, hence add nvgpu_start_gpu_idle() to set g->sw_ready to false along with setting NVGPU_DRIVER_IS_DYING = true. We expect the nvgpu_sw_quiesce() call to finish before quiesce thread wakes up from 50ms sleep. Hence, critical step like nvgpu_start_gpu_idle() is added to nvgpu_sw_quiesce(), whereas the somewhat redundant disable IRQs call is added to quiesce thread. nvgpu_fifo_quiesce() was called twice by mistake; remove one of them.
Bug 2919899 Bug 200588835 Change-Id: I9beec688c2e1c0d8dfc1327ddf122684576f8684 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354537 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6778fc9eb6	gpu: nvgpu: remove fence validity checks The valid flag in struct nvgpu_fence_type is not very useful. It's set when a fence is created on an allocated object and read in these three scenarios: - nvgpu_fence_install_fd() after a submit, if the submit was successful. A successful submit implies that a post fence exists. - nvgpu_fence_wait() for a copyengine job when synchronizing the ce ringbuffer or when waiting for vidmem clears. In these cases the fence is also clearly always valid. - nvgpu_fence_is_expired() when testing whether a tracked job has completed. Such jobs cannot exist without post fences that are mandatory for tracking, so the fence must exist. Remove the valid flag. Remove also the other init checks from the above functions; they're equally unused and confusing implying that such calls would be acceptable, causing sloppy code at best. Jira NVGPU-5248 Jira NVGPU-5493 Change-Id: I52c5be1569b343024d2626bd9577f87b46064fba Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357828 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Rajesh Devaraj	f0a455eab4	gpu: nvgpu: update write to ltc ecc status reg This patch changes "nvgpu_writel_check" to "nvgpu_writel" for the write operation on ltc_ltc0_lts0_l2_cache_ecc_status_r() register with ltc_ltc0_lts0_l2_cache_ecc_status_reset_task_f(). This is required because the read operation will always return 0 for the register field ltc_ltc0_lts0_l2_cache_ecc_status_reset_task_f() according to the ref manual dev_ltc.ref. Bug 2975438 JIRA NVGPU-5635 Change-Id: I898bdcb1ca15af62279c43426aca8ec68ed5ba05 Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357792 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
mkumbar	2dfa74c831	gpu: nvgpu: ACR interface update FALCON_ID_END is used in the ACR lsf_ucode_desc interface to allocate space for the dependency map, but more FALCONs are now supported, which will cause a wrong allocation for the dependency map, so it is required to have its definition. JIRA NVGPU-5462 Change-Id: Idaaa24ea1d2767a0b4ef44b1376239f945e39912 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357747 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	98335c29b2	gpu: nvgpu: make os_fence_android_syncpt common The differences between sync_fence ("android sync") and dma_fence are abstracted away by nvhost in the nvhost_fence interface. There is no need to have separate android and dma os fences for syncpoints; unify the general implementation so that it's always used when requested for the build. Jira NVGPU-5386 Change-Id: Ia829e93e18d03064ff46ab1271547de2d1fb1cae Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356158 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	4e241d5974	gpu: nvgpu: adapt to generic syncpt api Use the nvhost sync fence APIs that do not require knowledge about the sync fence version. Nvhost exports an opaque nvhost_fence type with a common interface for both legacy and stable sync fences. Delete the syncfd-specific nvhost wrappers. They exist only on Linux, so having them in the nvhost wrapper layer is just a hassle. The os fence interface is already one wrapper. Jira NVGPU-5386 Change-Id: I3849db3684c7be8f37cf53971347f26247a52d6c Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355650 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	6f30584a76	gpu: nvgpu: add PDI reporting for GV11B Enable reporting the per-device identifier on GV11B. Bug 2957580 Bug 2992739 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: I3bee107cc08519942bdc3f2930820fa1cf91adcd Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346934 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	3a11bd69e7	Revert "gpu: nvgpu: modify nvgpu_writel check and loop" This reverts commit c100ac23455d450a7046c62915014111a0aa2e70. Bug 3009270 Change-Id: I1db1acac63c841b5383d75ec674fdc2160a0c84d Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356076 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com>	2020-12-15 14:13:28 -06:00
Dinesh	3156f32c08	gpu: nvgpu: Add missing hal for pramin-DGPU-next This is adding missing hal-op for pramin. JIRA NVGPU-5615 Change-Id: I79df1da81354d4a5ad53360d33e79180ba82aba6 Signed-off-by: Dinesh <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355310 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Dinesh	290911618a	gpu: nvgpu: Check for vidmem failure This is added to check the vidmem init failure during gpu initialization. JIRA NVGPU-5389 Change-Id: I0111f302058e171031407c88804ba30c2509fabc Signed-off-by: Dinesh <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352916 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6cbc174fc2	gpu: nvgpu: avoid channel wdt ifdefs Implement empty stubs of the channel watchdog functions for when watchdog is disabled from build. Add some forward declarations that were missing. Now most call sites don't need #ifdefs for the build flag. Add error checks for the wdt alloc failure. Jira NVGPU-5494 Jira NVGPU-5493 Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2ad015f7a5	gpu: nvgpu: modify nvgpu_writel check and loop Currently, nvgpu_writel_loop() writes to a register and immediately checks if register value is updated. It might take some time for hardware registers to get updated with value written by software. Modify nvgpu_writel_loop() to accept number of retries to check if register value is updated and assert with nvgpu_assert(). Also, move nvgpu_writel_loop() to common code and use generic nvgpu_readl() and nvgpu_writel() APIs. JIRA NVGPU-5490 Change-Id: Iaaf24203a91eee3d05de7d0c7dea18113367de5f Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348628 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Debarshi Dutta	86b31c4f7c	gpu: nvgpu: alternative implementation of dma_buf_get/set_data Historically, nvgpu has supported a struct gk20a_dmabuf_priv and associated it with a dmabuf instance. This was aided by Nvmap's dma_buf_set_drv_data() and dma_buf_get_drvdata() APIs. gk20a_dmabuf_priv is used to store Comptag IDs i.e. (1 per 64 kb) as well as can store the dmabuf attachments to avoid multiple attach/detach calls. dma_buf_set_drv_data() allows Nvgpu to associate an instance of struct gk20a_dmabuf_priv with the instance of the dmabuf and also provide a release callback to delete the instance when the last reference to the dmabuf is put. Nvmap accomplishes this by modifying the struct dma_buf_ops definition to include the set_drv_data and get_drv_data callbacks in the kernel code. The above approach won't work for upstream Kstable and Nvmap plans to remove these APIs for upcoming newer downstream kernels as well. In order to implement the same functionality without depending on Nvmap, Nvgpu will implement a release chaining mechanism. Dmabuf's 'ops' pointer points to a constant struct and hence a whole copy of the ops is made followed by altering the new copy's release pointer. struct gk20a_dmabuf_priv stores the new copy and the dmabuf's 'ops' is changed to point to this. This allows Nvgpu to retrieve the corresponding gk20a_dmabuf_priv instance using container_of. Nvgpu's custom release callback will invoke the original release callback of the dmabuf's producer as a last step, thus completing the full circle. In case, the driver is removed, Nvgpu restores the dmabuf's 'ops' back to the original state. In order to accomplish this, every instance of a struct nvgpu_os_linux maintains a linkedlist of the gk20a_dma_buf instances. During the driver removal, this linkedlist is traversed and the corresponding dmabuf's 'ops' pointer is put back to its original state followed by freeing of this instance. 
Nvgpu is a producer of dmabufs for vidmem and needs a way to check whether the given dmabuf belongs to itself. It's no longer reliable to depend on a comparison of the 'ops' pointer. Instead dmabuf_export_info() allows a name to be set by the exporter and this can be used to compare with a memory location that belongs to Nvgpu. Similarly for sysmem dmabufs, Nvmap makes a similar change in the way it identifies whether a dmabuf belongs to itself. Removed NVGPU_DMABUF_HAS_DRVDATA and moved to a unified mechanism for both downstream as well as upstream kernel. Some of the other changes in this file include the following. 1) Deletion of dmabuf.c and moving its contents over to dmabuf_priv.c 2) Replacing gk20a_mm_pin_has_drvdata with nvgpu_mm_pin_privdata and vice-versa for unpin. Bug 2878569 Change-Id: Icf8e79b05a25ad5a85f478c3ee0fc1eb7747e22d Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2341001 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Puneet Saxena <puneets@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	c43e3e4aeb	gpu: nvgpu: acr: add fecs/gpccs sig files read for next dgpu add fecs/gpccs sig file read for next dgpu. JIRA NVGPU-5461 Change-Id: Ib135dab8961c53d62fb7a95e378eba4c81d729a2 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354622 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	36a488392f	gpu: nvgpu: add PDI reporting for vgpu Read the PDI from vgpu constants. Bug 2957580 Bug 2992739 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: Ief2edeaaa26e284707792f13d218c511fef073af Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351214 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	d44960d424	gpu: nvgpu: add PDI reporting for GP10B (Linux) Read the T186 SoC PDI fuse registers to retrieve the per-device identifier for GP10B. Bug 2957580 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: Ie5031a005ca251636614d27c2dc77bddfce0ea21 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350930 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Debarshi Dutta	872c043ad6	gpu: nvgpu: disable default enable of CDE Kstable kernel won't support adding additional CONFIGS for now. Instead as a WAR, this CONFIG is disabled by default and patches enabling it for K4.14 and K4.9 are added. Bug 2878569 Change-Id: Ib5f2adf477b554a1966de1d61db1e89264ae5d23 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350398 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	fc5b45ea83	gpu: nvgpu: move init_ltc_support sequence Currently, ltc fs_state is initialized during ltc init support. However, ltc cbc_param and cbc_param2 registers do not seem to be providing correct data if ltc.init_fs_state is called before fb.init_fs_state. - Create fb.init_fb_support hal to initialize fb. - Trigger init_fb_support before init_ltc_support. Bug 2969956 Bug 2957808 JIRA NVGPU-4666 Change-Id: I54d697d27b9d9c6318c4ef459d215b6f82cd5571 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345673 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	32bdf8cc2d	gpu: nvgpu: add NVGPU_SUPPORT_PLC flag Add NVGPU_SUPPORT_PLC to indicate if compression PLC is supported in nvgpu. Add corresponding GPU characteristics flag and IOCTL mapping to sync compression support status with nvrm_gpu. JIRA NVGPU-4666 Change-Id: I63307b99ceac7dc2e6af143ca13cdac63e253ed3 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340242 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	2a3bb9107f	gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h> top.h is a description of "devices" available on the GPU. As such rename this header to device.h. device.h will ultimately be a unit of actual C code that will rely on the top HAL to fill a device list. JIRA NVGPU-5421 Change-Id: If6e4a537d2209e429a678761a34713723da7a00a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
tkudav	957b19092f	gpu: nvgpu: Enable Quiesce on all builds Make Recovery and quiesce co-exist to support quiesce state on unrecoverable errors. Currently, the quiesce code is wrapped under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from recovery config, thereby enabling it on all builds. On Linux, the hung_task checker(check_hung_uninterruptible_tasks() in kernel/hung_task.c) complains that quiesce thread is stuck for more than 120 seconds. INFO: task sw-quiesce:1068 blocked for more than 120 seconds. The wait time of more than 120 seconds is expected as quiesce thread will wait until quiesce call is triggered on fatal unrecoverable errors. However, the INFO print upsets the kernel_warning_test(KWT) on Linux builds. To fix the failing KWT, change the quiesce task to interruptible instead of uninterruptible as checker only looks at uninterruptible tasks. Bug 2919899 JIRA NVGPU-5479 Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030 Signed-off-by: tkudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	1f28443889	gpu: nvgpu: Disable platform debug spew by default Disable the somewhat non-useful syncpoint debug spew in the nvgpu debug spew. The GPU has its own snapshot view of syncpoints so visibility into other syncpoint data is often not very helpful. However, there are plausibly times where this would be necessary. For example debugging a sync issue between the GPU and some other SoC engine. Therefore, the syncpoint debug spew can be enabled again at runtime if necessary. JIRA NVGPU-5541 Change-Id: I7028e2d6027e41835b2fed4f2bbb366c16b99967 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349185 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Seshendra Gadagottu	8778aa531d	gpu: nvgpu: netlist: correct info for generic regions There is an issue with reading u32 data for generic regions. The u8 pointer dereference copies only u8 data instead of u32 data. Legacy code does not use this data, so the issue was not caught earlier. Now use nvgpu_memcpy to copy all bytes of the u32 data. Bug 2986531 Change-Id: Ib23c76cd1ce77e3a2f882940b11703391a11f99d Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348593 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	f0de6fa54a	gpu: nvgpu: sec2: update sec2 interfaces update sec2 rtos interfaces to support next dgpu sec2 ucode. JIRA NVGPU-5468 Change-Id: I534a6eded8a9525dc09e5f57e46bef36f1a4e81b Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352103 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Seema Khowala	1c40ebe9b1	gpu: nvgpu: handle pbus and priv intr first Handle pbus and priv intr before handling other stall interrupts. These should be treated as high priority interrupts. JIRA NVGPU-25 Bug 200603566 Change-Id: I707119c8751a5621958777ffb64300db28426dfb Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350773 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
ajesh	ce4e7a6859	gpu: nvgpu: remove redundant error prints Remove OS API return error prints from posix files as the BUG function prints the function and line number which causes the error. Jira NVGPU-4987 Change-Id: Ie6d6f781241ac5e837f2732fbb1cc1ddc4d971d4 Signed-off-by: ajesh <akv@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2337390 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: Ankur Kishore <ankkishore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	16fb7654a5	gpu: nvgpu: isolate channel watchdog unit Move the definition of struct nvgpu_channel_wdt to watchdog.c. Adjust users of it to access it via an unified interface instead of poking directly at the channel internals. Jira NVGPU-5494 Change-Id: Ie11826e6732a8b98e72c4f81dd06bd7e49848121 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345935 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	21e02878f4	gpu: nvgpu: move wdt code out of channel.c Cut and paste the existing channel watchdog functions to another file for better isolation of units. Jira NVGPU-5494 Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	22987182a3	gpu: nvgpu: make ltc ecc intr handle func global LTC interrupt handle functions are reused in nvgpu-next. So, make listed ltc intr functions global. - gv11b_ltc_intr_init_counters - gv11b_ltc_intr_handle_rstg_ecc_interrupts - gv11b_ltc_intr_handle_tstg_ecc_interrupts - gv11b_ltc_intr_handle_dstg_ecc_interrupts Jira NVGPU-5094 Change-Id: I33a21ef7585314e31398dd165e1c7399ed27a7c3 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2337896 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2d94863cae	gpu: nvgpu: move is_tpc_addr and get_tpc_num to common gr.is_tpc_addr() and gr.get_tpc_num() are chip agnostic hals. Move these hals to common code. Jira NVGPU-5504 Change-Id: I50fa7ac876c8667de42df1830bd412b412538508 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349272 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Sami Kiminki	23cda4f4a9	gpu: nvgpu: add PDI for TU104 (Linux) Add reporting for the per-device identifier (PDI) in the Linux GPU characteristics. Implement PDI read for TU104. Bug 2957580 Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Change-Id: I6ac0e4f74378564d82955b431d4c1fd6c0daeb13 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346933 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	8d68e687f0	gpu: nvgpu: linux: check whether hal initialized for gr_default_attrib_cb_size On access debugfs node gr_default_attrib_cb_size, the hal might not have been initialized. Bug 2848790 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I0a70f1377d2001802092a8eccec5ec144a58c79b Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349299 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	6d922dd9b7	gpu: nvgpu: vgpu: remove debugfs node dump_ctxsw_stats_on_channel_close It could cause kernel debug since vgpu cannot dump gr_ctx content. Also set .dump_ctxsw_stats null in vgpu hal. Bug 2848790 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Ia9ec99d464be72e2be26df25c572e671e10c18a5 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349295 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	cef1780e05	gpu: nvgpu: vgpu: remove ce_app support Kernel oops on dump ce_app debugfs nodes. ce_app is only used by dGPU which vgpu does not support currently. This patch removes hal setup and debugfs setup for ce_app. Bug 2848790 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Ia60a06a27b2d2ceda96ca567cda9e9a01e023c4b Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349294 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Richard Zhao	246b5fcf4d	gpu: nvgpu: debugfs: only create railgate_residency if not is_virtual Dump railgate_residency causes kernel crash since vgpu does not control railgate_residency. So create railgate_residency only on native driver. Bug 2848790 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I08d65e1c1c5bf813f0c47d5bffad5a01ea62adf8 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349293 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	6b1302f23c	gpu: nvgpu: Reduce linux debug log spew Currently when nvgpu prints debug information for something like an MMU fault the result includes a lot of useless boilerplate logging spew. In some cases this can be helpful in identifying where the log message came from in the nvgpu code base. However, for debug spews from faults, the viewer of that info does not care which function printed the log (for example). Instead having a fast and readable debug dump is more valuable. So to that end, add a special debug dump printing function that does not use the normal log format. Instead, it prints only a brief prefix to use as a grep search query. The new print out is listed below. Since often the kernel logs are impressively long and obtuse, having a clear debug search string can be helpful. With this log format, one can simply do: $ grep __$CHIP__ kernel.log And find any debug logs for the desired chip. New log format - collected on a gv11b under L4T running `nvgpu_submit_mmu_fault': [ 32.005793] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:311 [ERR] [MMU FAULT] mmu engine id: 32, ch id: 511, fault addr: 0x1000, fault addr aperture: 0, fault type: invalid pde, access type: virt read, [ 32.006137] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:320 [ERR] [MMU FAULT] protected mode: 0, client type: hub, client id: host, gpc id if client type is gpc: 0, [ 32.006417] nvgpu: 17000000.gv11b nvgpu_rc_mmu_fault:296 [ERR] mmu fault id=0 id_type=1 act_eng_bitmask=00000000 [ 32.007125] __gv11b__ Channel Status - chip gv11b [ 32.007128] __gv11b__ --------------------------- [ 32.007241] __gv11b__ 511-gv11b, TSG: 0, pid 955, refs: 2, deterministic: [ 32.007364] __gv11b__ channel status: in use pending busy [ 32.007509] __gv11b__ RAMFC : TOP: 8000000000001000 PUT: 0000000000001030 GET: 0000000000001000 FETCH: 0000600000001000HEADER: 60400000 COUNT: 00000000SEMAPHORE: addr 0000000000000000payload 0000000000000000 execute 00000000 [ 32.007601] __gv11b__ [ 32.008696] __gv11b__ [ 32.008700] __gv11b__ PBDMA Status - chip gv11b [ 32.008894] __gv11b__ ------------------------- [ 32.013477] __gv11b__ pbdma 0: [ 32.017840] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid [ 32.020992] __gv11b__ PBDMA_PUT 0000000000001030 PBDMA_GET 0000000000001000 [ 32.029037] __gv11b__ GP_PUT 00000001 GP_GET 00000001 FETCH 00000001 HEADER 60400000 [ 32.036386] __gv11b__ HDR 00000000 SHADOW0 00001000 SHADOW1 80003000 [ 32.044787] __gv11b__ pbdma 1: [ 32.051964] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid [ 32.055099] __gv11b__ PBDMA_PUT 0000000042003200 PBDMA_GET 00000050728bc914 [ 32.062997] __gv11b__ GP_PUT 00000000 GP_GET 2080a000 FETCH 00000000 HEADER e1850010 [ 32.070424] __gv11b__ HDR 00110000 SHADOW0 02000000 SHADOW1 10000004 [ 32.078652] __gv11b__ pbdma 2: [ 32.085913] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid [ 32.088973] __gv11b__ PBDMA_PUT 00000021040c0004 PBDMA_GET 0000000140020000 [ 32.096502] __gv11b__ GP_PUT 00000000 GP_GET 8080a440 FETCH 00000000 HEADER 61400040 [ 32.103679] __gv11b__ HDR 14000010 SHADOW0 00000000 SHADOW1 00000400 [ 32.112336] __gv11b__ [ 32.119860] __gv11b__ gv11b eng 0: [ 32.122119] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.125807] __gv11b__ [ 32.135954] __gv11b__ gv11b eng 1: [ 32.135958] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.139457] __gv11b__ [ 32.149945] __gv11b__ gv11b eng 2: [ 32.149950] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.153543] __gv11b__ [ 32.163598] __gv11b__ gv11b eng 3: [ 32.163601] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid [ 32.167278] __gv11b__ [ 32.177076] __gv11b__ [ 32.186145] nvgpu: 17000000.gv11b nvgpu_tsg_set_ctx_mmu_error:492 [ERR] TSG 0 generated a mmu fault [ 32.189443] nvgpu: 17000000.gv11b nvgpu_set_err_notifier_locked:140 [ERR] error notifier set to 31 for ch 511 JIRA NVGPU-5541 Change-Id: Iad60adfab5198ee11dd2ec595f2422ea541b7a2a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349166 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	5d06a59bc5	gpu: nvgpu: Cleanup uart and debugfs debug prints The gk20a_debug_dump() function implicitly adds a newline since it uses nvgpu_err() under the hood (for uart destined prints). For the seq_file destined writes it does not so there is an annoying inconsistency. Remove the newline that many of the gk20a_debug_dump() calls add and add the newline to the (now) seq_printf() call. This reduces the length of debug dump logs and speeds them up - UART is _very_ slow after all. Also cleanup some formatting issues in the various debug prints I happened to notice. JIRA NVGPU-5541 Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Richard Zhao	f2d424d452	gpu: nvgpu: vgpu: init rwsem deterministic_busy Uninitialized rwsem raised warnings on enabling spinlock debug. Bug 2880934 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I74828b291c518f1fd987806682118041af41e080 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346408 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aparna Das <aparnad@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	b3766f352c	gpu: nvgpu: call hal callback when set fecs_trace default filter vgpu depends on the hal callback to notify server the filter changes. Bug 200469911 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Ibc9221de853ebe813609f897b46584f5cf88cbce Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2343613 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Divya Singhatwaria	bc4cef7a43	gpu: nvgpu: offset for exterraddr and exterrstat reg Compute the offsets for falcon_falcon_exterraddr_r() and falcon_falcon_exterrstat_r() registers by applying the mask 0xFFF JIRA NVGPU-4834 Change-Id: I7cef6f82e7802bea9133f3c95c891de22ef10d07 Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347674 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
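
Note on commit 8778aa531d (netlist: correct info for generic regions): the message describes a u8 pointer dereference that copied one byte where a u32 was expected, fixed by copying all bytes with nvgpu_memcpy. The following is a minimal sketch of that class of bug, not the actual nvgpu code. It uses the standard C memcpy as a stand-in for nvgpu_memcpy, and the function names read_u32_truncated, read_u32_full, and check_region_read are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Buggy pattern (hypothetical name): dereferencing a u8 pointer yields
 * only the first byte, which is then widened to 32 bits.
 */
static uint32_t read_u32_truncated(const uint8_t *src)
{
	return *src;
}

/*
 * Fixed pattern (hypothetical name): copy all sizeof(uint32_t) bytes.
 * memcpy also avoids unaligned access through a cast pointer.
 */
static uint32_t read_u32_full(const uint8_t *src)
{
	uint32_t value;

	memcpy(&value, src, sizeof(value));
	return value;
}

/* Self-check: the full copy round-trips, the u8 dereference does not. */
static void check_region_read(void)
{
	const uint32_t expected = 0x12345678u;
	uint8_t buf[sizeof(expected)];

	memcpy(buf, &expected, sizeof(expected));

	assert(read_u32_full(buf) == expected);
	assert(read_u32_truncated(buf) != expected);
}
```

Calling check_region_read() from a test harness exercises both paths. The truncated read returns a single byte of the value (which byte depends on host endianness), so it cannot equal the full 32-bit value in either case.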


8354 Commits