Commit Graph

8354 Commits

Author SHA1 Message Date
Alex Waterman
d0fa6a15c1 gpu: nvgpu: Fix logging for pre-4.14 kernels
It seems that on Tegra kernels older than 4.14 the pre_err() function does
not automatically add a '\n' if you don't supply it.

For older kernels, with the new nvgpu_dbg_dump_impl() function, add this extra
newline so that logs are not hopelessly scrambled.

Change-Id: Ife8fe03ace248a1d8ece7850b609c343cc1d27ac
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359752
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
f807ad932c gpu: nvgpu: fix uninitialized variable error
Enabling Kcov and KASAN causes below compilation failure :
common/mm/vm_area.c:255:3: error: ‘vma’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]

Fix this by correcting failure cases in function nvgpu_vm_area_alloc()

Bug 2155608

Change-Id: Id4070157f2a8bd7043b0c49effb6f61cce5eecc2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359496
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2d24298af0 gpu: nvgpu: update nvgpu_pte_dbg_print function
Currently, nvgpu_pte_dbg_print() overwrites ctag string "ctag=" and only
prints ctag number.
For example,
nvgpu_pte_dbg_print:104  [DBG]  vm=3 PTE: i=0    size=8  |
GPU 0x1efc000000  phys 0x115a50000  pgsz:   4kb perm=RW kind=0x8 APT=SYSTEM C--V- 1
[0x08000010, 0x115a5007]

Update nvgpu_pte_dbg_print function to include ctag string.
nvgpu_pte_dbg_print:104  [DBG]  vm=3 PTE: i=0    size=8  |
GPU 0x1efc000000  phys 0x115a50000  pgsz:   4kb perm=RW kind=0x8 APT=SYSTEM C--V- ctag=1
[0x08000010, 0x115a5007]

Jira NVGPU-5489

Change-Id: I2f84f89da685ad6a84534c0bb51e3ca1244b3497
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354182
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
160669a7bb gpu: nvgpu: return device from nvgpu_device_get()
Instead of copying the device contents into the passed pointer have
nvgpu_device_get() return a device pointer. This will let the engines.c
code move towards using the nvgpu_device type directly, instead of
maintaining its own version of an essentially identical struct.

JIRA NVGPU-5421

Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
229ea2dd59 gpu: nvgpu: add gmmu_attrs comptagline_mode flag
Add cbc_comptagline_mode flag as a member of nvgpu_gmmu_attrs. This flag
indicates if cbc follows comptagline policy.
Add fb.is_comptagline_mode_enabled() to check if comptagline mode is
enabled.

JIRA NVGPU-4666

Change-Id: I77fb31cb54dd014c2fd35586a3751c757b2543e2
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2353348
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
f1dc3fd2fb gpu: nvgpu: Add debugfs wrapper for exposing profilers
The current debugfs code is completely specific to FIFO's kickoff
profiler. But exposing these debugfs nodes is really a perfectly
generic operation to any given profiler.

Therefore add a generic debugfs interface for exposing profilers.
Any code that implements a profiler can now use a single function
call to export a profiler to the GPU debugfs area.

JIRA NVGPU-5606

Change-Id: I67a5bd9998fcfac94678e465442b9a38ab7e7612
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2358382
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
71ab9800cd gpu: nvgpu: Add raw data dump for profiler
Add the ability to dump raw data from the profiler. The kernel driver
can provide some simple analysis, but ultimately a userspace tool
such as python, R, matlab/octave, or the like, is far better suited
for data analysis and visualization.

JIRA NVGPU-5606

Change-Id: I94a63eadba726b66a78cf51ea4674745038390a1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2358381
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
70ce67df2d gpu: nvgpu: Add a generic profiler
Add a generic profiler based on the channel kickoff profiler. This
aims to provide a mechanism to allow engineers to (more) easily profile
arbitrary software paths within nvgpu.

Usage of this profiler is still primarily through debugfs. Next up is
a generic debugfs interface for this profiler in the Linux code.

The end goal for this is to profile the recovery code and generate
interesting statistics.

JIRA NVGPU-5606

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: I99783ec7e5143855845bde4e98760ff43350456d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355319
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
319520ff57 gpu: nvgpu: Add a new device manager unit
This adds a new device management unit in the common code responsible
for facilitating the parsing of the GPU top device list and providing
that info to other units in nvgpu.

The basic idea is to read this list once from HW and store it in a
set of lists corresponding to each device type (graphics, LCE, etc).
Many of the HALs in top can be deleted and instead implemented using
common code parsing the SW representation.

Every time the driver queries the device list it does so using a
device type and instance ID. This is common code. The HAL is responsible
for populating the device list in such a way that the driver can
query it in a chip agnostic manner.

Also delete some of the unit tests for functions that no longer
exist. This code will require new unit tests in time; those should be
quite simple to write once unit testing is needed.

JIRA NVGPU-5421

Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
f6298157bc gpu: nvgpu: update HAL function name
get_ecc_override_val is renamed to gp10b_gr_get_ecc_override_val to
follow the naming convention of the HAL functions correctly.

Change-Id: I80e495e8bec3483651e325b792708382e5327de0
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357925
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
4dcfbc19de gpu: nvgpu: Trigger quiesce on spurious FBPA intr
In Bug 200588835, the spurious FBPA interrupts are seen on couple
of boards. These interrupts were found to be EDC (Error detection
and Correction) interrupts which are triggered due to ECC errors.
The EDC registers are not exposed to the driver, so the interrupt
status register cannot be cleared; resulting in interrupt storm.
Also, it was concluded that only bad HW can cause this failure
scenario. So, in the ISR for FBPA interrupts, get the GPU into
quiesce state as we don't expect the GPU to be in usable state post
such unrecoverable errors.

Adapt the quiesce code for Linux build too.
1. On Linux, we cannot exit the nvgpu process after quiesce like we
   do on QNX. So, add nvgpu_disable_irqs() call to quiesce
   implementation which is done as part of process exit handler on
   QNX. Masking interrupts which is already done as part of quiesce
   would be sufficient in most cases, but to be fail-safe
   disable_irqs too.
3. Also, the IOCTL code looks at g->sw_ready, hence add
   nvgpu_start_gpu_idle() to set g->sw_ready to false along with
   setting NVGPU_DRIVER_IS_DYING = true.

We expect the nvgpu_sw_quiesce() call to finish before quiesce thread
wakes up from 50ms sleep. Hence, critical step like
nvgpu_start_gpu_idle() is added to nvgpu_sw_quiesce(), whereas the
somewhat redundant disable IRQs call is added to quiesce thread.

nvgpu_fifo_quiesce() was called twice by mistake; remove one of the
them.

Bug 2919899
Bug 200588835

Change-Id: I9beec688c2e1c0d8dfc1327ddf122684576f8684
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354537
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6778fc9eb6 gpu: nvgpu: remove fence validity checks
The valid flag in struct nvgpu_fence_type is not very useful. It's set
when a fence is created on an allocated object and read in these three
scenarios:

- nvgpu_fence_install_fd() after a submit, if the submit was successful.
  A successful submit implies that a post fence exists.
- nvgpu_fence_wait() for a copyengine job when synchronizing the ce
  ringbuffer or when waiting for vidmem clears. In these cases the fence
  is also clearly always valid.
- nvgpu_fence_is_expired() when testing whether a tracked job has
  completed. Such jobs cannot exist without post fences that are
  mandatory for tracking, so the fence must exist.

Remove the valid flag. Remove also the other init checks from the above
functions; they're equally unused and confusing implying that such calls
would be acceptable, causing sloppy code at best.

Jira NVGPU-5248
Jira NVGPU-5493

Change-Id: I52c5be1569b343024d2626bd9577f87b46064fba
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357828
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Rajesh Devaraj
f0a455eab4 gpu: nvgpu: update write to ltc ecc status reg
This patch changes "nvgpu_writel_check" to "nvgpu_writel" for the
write operation on ltc_ltc0_lts0_l2_cache_ecc_status_r() register
with ltc_ltc0_lts0_l2_cache_ecc_status_reset_task_f(). This is
required because the read operation will always return 0 for the
register field ltc_ltc0_lts0_l2_cache_ecc_status_reset_task_f()
according to the ref manual dev_ltc.ref.

Bug 2975438
JIRA NVGPU-5635

Change-Id: I898bdcb1ca15af62279c43426aca8ec68ed5ba05
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357792
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
mkumbar
2dfa74c831 gpu: nvgpu: ACR interface update
FALCON_ID_END is used in ACR lsf_ucode_desc interface to allocate
space for dependency map but now more number of  FALCON’s supported
which will cause wrong allocation for dependency map, so required to
have its definition.

JIRA NVGPU-5462

Change-Id: Idaaa24ea1d2767a0b4ef44b1376239f945e39912
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357747
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
98335c29b2 gpu: nvgpu: make os_fence_android_syncpt common
The differences between sync_fence ("android sync") and dma_fence are
abstracted away by nvhost in the nvhost_fence interface. There is no
need to have separate android and dma os fences for syncpoints; unify
the general implementation so that it's always used when requested for
the build.

Jira NVGPU-5386

Change-Id: Ia829e93e18d03064ff46ab1271547de2d1fb1cae
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356158
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
4e241d5974 gpu: nvgpu: adapt to generic syncpt api
Use the nvhost sync fence APIs that do not require knowledge about the
sync fence version. Nvhost exports an opaque nvhost_fence type with a
common interface for both legacy and stable sync fences.

Delete the syncfd-specific nvhost wrappers. They exist only on Linux, so
having them in the nvhost wrapper layer is just a hassle. The os fence
interface is already one wrapper.

Jira NVGPU-5386

Change-Id: I3849db3684c7be8f37cf53971347f26247a52d6c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355650
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sami Kiminki
6f30584a76 gpu: nvgpu: add PDI reporting for GV11B
Enable reporting the per-device identifier on GV11B.

Bug 2957580
Bug 2992739

Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Change-Id: I3bee107cc08519942bdc3f2930820fa1cf91adcd
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346934
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
3a11bd69e7 Revert "gpu: nvgpu: modify nvgpu_writel check and loop"
This reverts commit c100ac23455d450a7046c62915014111a0aa2e70.

Bug 3009270

Change-Id: I1db1acac63c841b5383d75ec674fdc2160a0c84d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356076
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
2020-12-15 14:13:28 -06:00
Dinesh
3156f32c08 gpu: nvgpu: Add missing hal for pramin-DGPU-next
This is adding missing hal-op for pramin.

JIRA NVGPU-5615

Change-Id: I79df1da81354d4a5ad53360d33e79180ba82aba6
Signed-off-by: Dinesh <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355310
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Dinesh
290911618a gpu: nvgpu: Check for vidmem failure
This is added to check the vidmem init failure during gpu
initialization.

JIRA NVGPU-5389

Change-Id: I0111f302058e171031407c88804ba30c2509fabc
Signed-off-by: Dinesh <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352916
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6cbc174fc2 gpu: nvgpu: avoid channel wdt ifdefs
Implement empty stubs of the channel watchdog functions for when
watchdog is disabled from build. Add some forward declarations that were
missing. Now most call sites don't need #idefs for the build flag.

Add error checks for the wdt alloc failure.

Jira NVGPU-5494
Jira NVGPU-5493

Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2ad015f7a5 gpu: nvgpu: modify nvgpu_writel check and loop
Currently, nvgpu_writel_loop() writes to a register and immediately
checks if register value is updated. It might take some time for
hardware registers to get updated with value written by software.
Modify nvgpu_writel_loop() to accept number of retries to check if
register value is updated and assert with nvgpu_assert().
Also, move nvgpu_writel_loop() to common code and use generic
nvgpu_readl() and nvgpu_writel() APIs.

JIRA NVGPU-5490

Change-Id: Iaaf24203a91eee3d05de7d0c7dea18113367de5f
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348628
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
86b31c4f7c gpu: nvgpu: alternative implementation of dma_buf_get/set_data
Historically, nvgpu has supported a struct gk20a_dmabuf_priv and
associated it with a dmabuf instance. This was aided by Nvmap's
dma_buf_set_drv_data() and dma_buf_get_drvdata() APIs. gk20a_dmabuf_priv
is used to store Comptag IDs i.e. (1 per 64 kb) as well as can store the
dmabuf attachments to avoid multiple attach/detach calls. dma_buf_set_drv_data()
allows Nvgpu to associate an instance of struct gk20a_dmabuf_priv with the instance
of the dmabuf and also provide a release callback to delete the
instance when the last reference to the dmabuf is put. Nvmap accomplishes
this by modifying the struct dma_buf_ops definition to include the
set_drv_data and get_drv_data callbacks in the kernel code.

The above approach won't work for upstream Kstable and Nvmap
plans to remove these APIs for upcoming newer downstream kernels as
well.

In order to implement the same functionality without depending on Nvmap,
Nvgpu will implement a release chaining mechanism. Dmabuf's 'ops' pointer
points to a constant struct and hence a whole copy of the ops is made
followed by altering the new copy's release pointer.

struct gk20a_dmabuf_priv stores the new copy and the dmabuf's 'ops' is
changed to point to this. This allows Nvgpu to retrieve
the corresponding gk20a_dmabuf_priv instance using container_of.

Nvgpu's custom release callback will invoke the original release
callback of the dmabuf's producer as a last step, thus completing the
full circle. In case, the driver is removed, Nvgpu restores the
dmabuf's 'ops' back to the original state. In order to accomplish this,
every instance of a struct nvgpu_os_linux maintains a linkedlist of the
gk20a_dma_buf instances. During the driver removal, this linkedlist is
traversed and the corresponding dmabuf's 'ops' pointer is put back to
its original state followed by freeing of this instance.

Nvgpu is a producer of dmabuf's for vidmem and needs
a way to check whether the given dmabuf belongs to itself.
Its no longer reliable to depend on a comparision of
the 'ops' pointer. Instead dmabuf_export_info() allows a name to be set by the
exporter and this can be used to compare with a memory location
that belongs to Nvgpu. Similarly for sysmem dmabufs, Nvmap makes a
similar change in the way it identifies whether a dmabuf belongs to
itself.

Removed NVGPU_DMABUF_HAS_DRVDATA and moved to a unified mechanism for
both downstream as well as upstream kernel.

Some of the other changes in this file include the following.
1) Deletion of dmabuf.c and moving its contents over to dmabuf_priv.c
2) Replacing gk20a_mm_pin_has_drvdata with nvgpu_mm_pin_privdata and
vice-versa for unpin.

Bug 2878569

Change-Id: Icf8e79b05a25ad5a85f478c3ee0fc1eb7747e22d
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2341001
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Puneet Saxena <puneets@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
mkumbar
c43e3e4aeb gpu: nvgpu: acr: add fecs/gpccs sig files read for next dgpu
add fecs/gpccs sig file read for next dgpu.

JIRA NVGPU-5461

Change-Id: Ib135dab8961c53d62fb7a95e378eba4c81d729a2
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354622
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sami Kiminki
36a488392f gpu: nvgpu: add PDI reporting for vgpu
Read the PDI from vgpu constants.

Bug 2957580
Bug 2992739

Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Change-Id: Ief2edeaaa26e284707792f13d218c511fef073af
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351214
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sami Kiminki
d44960d424 gpu: nvgpu: add PDI reporting for GP10B (Linux)
Read the T186 SoC PDI fuse registers to retrieve the per-device
identifier for GP10B.

Bug 2957580

Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Change-Id: Ie5031a005ca251636614d27c2dc77bddfce0ea21
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350930
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
872c043ad6 gpu: nvgpu: disable default enable of CDE
Kstable kernel won't support adding additional CONFIGS for now.
Instead as a WAR, this CONFIG is disabled by default and patches
enabling it for K4.14 and K4.9 are added.

Bug 2878569

Change-Id: Ib5f2adf477b554a1966de1d61db1e89264ae5d23
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350398
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
fc5b45ea83 gpu: nvgpu: move init_ltc_support sequence
Currently, ltc fs_state is initialized during ltc init support. However,
ltc cbc_param and cbc_param2 registers do not seem to be providing
correct data if ltc.init_fs_state is called before fb.init_fs_state.
- Create fb.init_fb_support hal to initialize fb.
- Trigger init_fb_support before init_ltc_support.

Bug 2969956
Bug 2957808
JIRA NVGPU-4666

Change-Id: I54d697d27b9d9c6318c4ef459d215b6f82cd5571
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345673
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
32bdf8cc2d gpu: nvgpu: add NVGPU_SUPPORT_PLC flag
Add NVGPU_SUPPORT_PLC to indicate if compression PLC is supported in
nvgpu.
Add corresponding GPU characteristics flag and IOCTL mapping to sync
compression support status with nvrm_gpu.

JIRA NVGPU-4666

Change-Id: I63307b99ceac7dc2e6af143ca13cdac63e253ed3
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340242
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
2a3bb9107f gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h>
top.h is a description of "devices" available on the GPU. As such
rename this header to device.h.

device.h will ultimately be a unit of actual C code that will rely
on the top HAL to fill a device list.

JIRA NVGPU-5421

Change-Id: If6e4a537d2209e429a678761a34713723da7a00a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
tkudav
957b19092f gpu: nvgpu: Enable Quiesce on all builds
Make Recovery and quiesce co-exist to support quiesce state
on unrecoverrable errors. Currently, the quiesce code is wrapped
under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from
recovery config, thereby enabling it on all builds.

On Linux, the hung_task checker(check_hung_uninterruptible_tasks()
in kernel/hung_task.c) complains that quiesce thread is stuck for
more than 120 seconds.

INFO: task sw-quiesce:1068 blocked for more than 120 seconds.

The wait time of more than 120 seconds is expected as quiesce
thread will wait until quiesce call is triggered on fatal
unrecoverable errors. However, the INFO print upsets the
kernel_warning_test(KWT) on Linux builds. To fix the failing
KWT, change the quiesce task to interruptible instead of
uninterruptible as checker only looks at uninterruptible tasks.

Bug 2919899
JIRA NVGPU-5479

Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
1f28443889 gpu: nvgpu: Disable platform debug spew by default
Disable the somewhat non-useful syncpoint debug spew in the nvgpu
debug spew. The GPU has it's own snapshot view of syncpoints so
visibility into other syncpoint data is often not very helpful.

However, there are plausibly times where this would be necessary.
For example debugging a sync issue between the GPU and some other
SoC engine. Therefore, the syncpoint debug spew can be enabled
again at runtime if necessary.

JIRA NVGPU-5541

Change-Id: I7028e2d6027e41835b2fed4f2bbb366c16b99967
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349185
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
8778aa531d gpu: nvgpu: netlist: correct info for generic regions
There is an issue with reading u32 data for generic regions.
The u8 pointer dereference copying only u8 data instead of
u32 data. Legacy code is not using this data, so the issue
is not caught earlier. Now using nvgpu_memcpy to copy all
bytes of u32 data.

Bug 2986531

Change-Id: Ib23c76cd1ce77e3a2f882940b11703391a11f99d
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348593
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
mkumbar
f0de6fa54a gpu: nvgpu: sec2: update sec2 interfaces
update sec2 rtos interfaces to support next dgpu sec2 ucode.

JIRA NVGPU-5468

Change-Id: I534a6eded8a9525dc09e5f57e46bef36f1a4e81b
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352103
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seema Khowala
1c40ebe9b1 gpu: nvgpu: handle pbus and priv intr first
Handle pbus and priv intr before handling other stall
interrupts. These should be treated as high priority
interrupts.

JIRA NVGPU-25
Bug 200603566

Change-Id: I707119c8751a5621958777ffb64300db28426dfb
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350773
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
ajesh
ce4e7a6859 gpu: nvgpu: remove redundant error prints
Remove OS API return error prints from posix files as the BUG
function prints the function and line number which causes the
error.

Jira NVGPU-4987

Change-Id: Ie6d6f781241ac5e837f2732fbb1cc1ddc4d971d4
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2337390
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
16fb7654a5 gpu: nvgpu: isolate channel watchdog unit
Move the definition of struct nvgpu_channel_wdt to watchdog.c. Adjust
users of it to access it via an unified interface instead of poking
directly at the channel internals.

Jira NVGPU-5494

Change-Id: Ie11826e6732a8b98e72c4f81dd06bd7e49848121
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345935
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
21e02878f4 gpu: nvgpu: move wdt code out of channel.c
Cut and paste the existing channel watchdog functions to another file
for better isolation of units.

Jira NVGPU-5494

Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
22987182a3 gpu: nvgpu: make ltc ecc intr handle func global
LTC interrupt handle functions are reused in nvgpu-next. So, make listed
ltc intr functions global.
- gv11b_ltc_intr_init_counters
- gv11b_ltc_intr_handle_rstg_ecc_interrupts
- gv11b_ltc_intr_handle_tstg_ecc_interrupts
- gv11b_ltc_intr_handle_dstg_ecc_interrupts

Jira NVGPU-5094

Change-Id: I33a21ef7585314e31398dd165e1c7399ed27a7c3
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2337896
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2d94863cae gpu: nvgpu: move is_tpc_addr and get_tpc_num to common
gr.is_tpc_addr() and gr.get_tpc_num() are chip agnostic hals. Move these
hals to common code.

Jira NVGPU-5504

Change-Id: I50fa7ac876c8667de42df1830bd412b412538508
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349272
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sami Kiminki
23cda4f4a9 gpu: nvgpu: add PDI for TU104 (Linux)
Add reporting for the per-device identifier (PDI) in the Linux GPU
characteristics. Implement PDI read for TU104.

Bug 2957580

Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Change-Id: I6ac0e4f74378564d82955b431d4c1fd6c0daeb13
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346933
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
8d68e687f0 gpu: nvgpu: linux: check whether hal initialized for gr_default_attrib_cb_size
On access debugfs node gr_default_attrib_cb_size, the hal might not have
been initialized.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I0a70f1377d2001802092a8eccec5ec144a58c79b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349299
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
6d922dd9b7 gpu: nvgpu: vgpu: remove debugfs node dump_ctxsw_stats_on_channel_close
It could cause kernel debug since vgpu cannot dump gr_ctx content.
Also set .dump_ctxsw_stats null in vgpu hal.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia9ec99d464be72e2be26df25c572e671e10c18a5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349295
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
cef1780e05 gpu: nvgpu: vgpu: remove ce_app support
Kernel oops on dump ce_app debugfs nodes. ce_app is only used by dGPU
which vgpu does not support currently. This patch removes hal setup and
debugfs setup for ce_app.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia60a06a27b2d2ceda96ca567cda9e9a01e023c4b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349294
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Richard Zhao
246b5fcf4d gpu: nvgpu: debugfs: only create railgate_residency if not is_virtual
Dump railgate_residency causes kernel crash since vgpu does not control
railgate_residency. So create railgate_residency only on native driver.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I08d65e1c1c5bf813f0c47d5bffad5a01ea62adf8
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349293
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
6b1302f23c gpu: nvgpu: Reduce linux debug log spew
Currently when nvgpu prints debug information for something like an MMU
fault the result includes a lot of usless boiler plate logging spew. In
some cases this can be helpful in identifying where the log message came
from in the nvgpu code base. However, for debug spews from faults, the
viewer of that info does not care which function printed the log (for
example).

Instead having a fast and readable debug dump is more valuable. So to
that end, add a special debug dump printing function that does not use
the normal log format. Instead, it prints only a breif prefix to use as
a grep search query. The new print out is listed below.

Since often the kernel logs are impressively long and obtuse, having a
clear debug search string can be helpful. With this log format, one can
simply do:

  $ grep __$CHIP__ kernel.log

And find any debug logs for the desired chip.

New log format - collected on a gv11b under L4T running
`nvgpu_submit_mmu_fault':

[   32.005793] nvgpu: 17000000.gv11b      gv11b_fb_mmu_fault_info_dump:311  [ERR]  [MMU FAULT] mmu engine id:  32, ch id:  511, fault addr: 0x1000, fault addr aperture: 0, fault type: invalid pde, access type: virt read,
[   32.006137] nvgpu: 17000000.gv11b      gv11b_fb_mmu_fault_info_dump:320  [ERR]  [MMU FAULT] protected mode: 0, client type: hub, client id:  host, gpc id if client type is gpc: 0,
[   32.006417] nvgpu: 17000000.gv11b                nvgpu_rc_mmu_fault:296  [ERR]  mmu fault id=0 id_type=1 act_eng_bitmask=00000000
[   32.007125] __gv11b__ Channel Status - chip gv11b
[   32.007128] __gv11b__ ---------------------------
[   32.007241] __gv11b__ 511-gv11b, TSG: 0, pid 955, refs: 2, deterministic:
[   32.007364] __gv11b__ channel status:  in use pending busy
[   32.007509] __gv11b__ RAMFC : TOP: 8000000000001000 PUT: 0000000000001030 GET: 0000000000001000 FETCH: 0000600000001000HEADER: 60400000 COUNT: 00000000SEMAPHORE: addr 0000000000000000payload 0000000000000000 execute 00000000
[   32.007601] __gv11b__
[   32.008696] __gv11b__
[   32.008700] __gv11b__ PBDMA Status - chip gv11b
[   32.008894] __gv11b__ -------------------------
[   32.013477] __gv11b__ pbdma 0:
[   32.017840] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.020992] __gv11b__   PBDMA_PUT 0000000000001030 PBDMA_GET 0000000000001000
[   32.029037] __gv11b__   GP_PUT    00000001  GP_GET  00000001  FETCH   00000001 HEADER 60400000
[   32.036386] __gv11b__   HDR       00000000  SHADOW0 00001000  SHADOW1 80003000
[   32.044787] __gv11b__ pbdma 1:
[   32.051964] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.055099] __gv11b__   PBDMA_PUT 0000000042003200 PBDMA_GET 00000050728bc914
[   32.062997] __gv11b__   GP_PUT    00000000  GP_GET  2080a000  FETCH   00000000 HEADER e1850010
[   32.070424] __gv11b__   HDR       00110000  SHADOW0 02000000  SHADOW1 10000004
[   32.078652] __gv11b__ pbdma 2:
[   32.085913] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.088973] __gv11b__   PBDMA_PUT 00000021040c0004 PBDMA_GET 0000000140020000
[   32.096502] __gv11b__   GP_PUT    00000000  GP_GET  8080a440  FETCH   00000000 HEADER 61400040
[   32.103679] __gv11b__   HDR       14000010  SHADOW0 00000000  SHADOW1 00000400
[   32.112336] __gv11b__
[   32.119860] __gv11b__ gv11b eng 0:
[   32.122119] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.125807] __gv11b__
[   32.135954] __gv11b__ gv11b eng 1:
[   32.135958] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.139457] __gv11b__
[   32.149945] __gv11b__ gv11b eng 2:
[   32.149950] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.153543] __gv11b__
[   32.163598] __gv11b__ gv11b eng 3:
[   32.163601] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.167278] __gv11b__
[   32.177076] __gv11b__
[   32.186145] nvgpu: 17000000.gv11b       nvgpu_tsg_set_ctx_mmu_error:492  [ERR]  TSG 0 generated a mmu fault
[   32.189443] nvgpu: 17000000.gv11b     nvgpu_set_err_notifier_locked:140  [ERR]  error notifier set to 31 for ch 511

JIRA NVGPU-5541

Change-Id: Iad60adfab5198ee11dd2ec595f2422ea541b7a2a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349166
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
5d06a59bc5 gpu: nvgpu: Cleanup uart and debugfs debug prints
The gk20a_debug_dump() function implicitly adds a newline since it
uses nvgpu_err() under the hood (for uart destined prints). For the
seq_file destined writes it does not so there is an annoying inconsistency.

Remove the newline that many of the gk20a_debug_dump() calls add and add
the newline to the (now) seq_printf() call. This reduces the length of
debug dump logs and speeds them up - UART is _very_ slow after all.

Also cleanup some formatting issues in the various debug prints I
happened to notice.

JIRA NVGPU-5541

Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Richard Zhao
f2d424d452 gpu: nvgpu: vgpu: init rwsem deterministic_busy
Uninitialized rwsem raised warnings on enabling spinlock debug.

Bug 2880934

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I74828b291c518f1fd987806682118041af41e080
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346408
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
b3766f352c gpu: nvgpu: call hal callback when set fecs_trace default filter
vgpu depends on the hal callback to notify server the filter changes.

Bug 200469911

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ibc9221de853ebe813609f897b46584f5cf88cbce
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2343613
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Divya Singhatwaria
bc4cef7a43 gpu: nvgpu: offset for exterraddr and exterrstat reg
Compute the offsets for falcon_falcon_exterraddr_r()
and falcon_falcon_exterrstat_r() registers by applying
the mask 0xFFF

JIRA NVGPU-4834

Change-Id: I7cef6f82e7802bea9133f3c95c891de22ef10d07
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347674
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00