Commit Graph

7757 Commits

Author SHA1 Message Date
Divya Singhatwaria
c060e754fc gpu: nvgpu: ELPG dump stats at shutdown
ELPG_DISALLOW command fails during gk20a shutdown.
It was due to nvgpu_can_busy() which was returning
0 before without acknowledging the ELPG_DISALLOW
command.

Since the system is shutting down, fix this issue
by setting the ACK for the disallow command without
waiting for the actual ACK from the PMU.
This also keeps the state machine consistent, so
the driver does not dump failure stats.

BUG 200588696

Change-Id: I943d8e6108fa0f9c418ccb1a7f061307823f1ec6
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2308557
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
9bee2fe660 gpu: nvgpu: prealloc priv cmdbuf metadata
Move preallocation of priv cmdbuf metadata structs to the priv cmdbuf
level and do it always, not only on deterministic channels. This makes
job tracking simpler and loosens dependencies from jobs to cmdbuf
internals. The underlying dma memory for the cmdbuf data has always been
preallocated.

Rename the priv cmdbuf functions to have a consistent prefix.

Refactor the channel sync wait and incr ops to free any priv cmdbufs
they allocate. They have been depending on the caller to free their
resources even on error conditions, requiring the caller to know how
they work.

The error paths that could occur after a priv cmdbuf has been allocated
have likely been wrong for a long time. Usually the cmdbuf queue allows
allocating only from one end and freeing from only the other end, as
that's natural with the hardware job queue. However, in error conditions
the just recently allocated entries need to be put back. Improve the
interface for this.

[not part of the cherry-pick:] Delete the error prints about not enough
priv cmd buffer space. That is not an error. When obeying the
user-provided job sizes more strictly, momentarily running out of job
tracking resources is possible when the job cleanup thread does not
catch up quickly enough. In such a case the number of inflight jobs on
the hardware could be less than the maximum, but the inflight job count
that nvgpu sees via the consumed resources could reach the maximum.
Also remove the wrong translation to -EINVAL from err from one call to
nvgpu_priv_cmdbuf_alloc() - the -EAGAIN from the failed allocation is
important.

[not part of the cherry-pick: a bunch of MISRA mitigations.]

Jira NVGPU-4548

Change-Id: I09d02bd44d50a5451500d09605f906d74009a8a4
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329657
(cherry picked from commit 25412412f31436688c6b45684886f7552075da83)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332506
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
bc4f74d854 gpu: nvgpu: add pg209 sku device id
Jira NVGPU-5375

Change-Id: I745832b3bd1865abaca24b4b96fd174097542427
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333424
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Thomas Fleury
85b9c98eba gpu: nvgpu: init hal for nvgpu-next dgpu
Add hooks for nvgpu-next dgpu init hal.

Jira NVGPU-5382

Change-Id: I5395a32ceda21b43b186756ba6dd5937251c3548
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332956
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seema Khowala
68caee196a gpu: nvgpu: add mm.mmu_fault.parse_mmu_fault_info gops
Add mm.mmu_fault.parse_mmu_fault_info gops. This is required
for nvgpu-next.
Also add mmu_engine_id type in mmu_fault structure. This variable
will be set in parse_mmu_fault_info hal so that
gv11b_mm_mmu_fault_handle_other_fault_notify does not depend
upon any chip specific h/w header. This is needed because
BAR2 mmu engine id has changed in nvgpu-next.

JIRA NVGPU-5032

Change-Id: I0c5e9ef607aff5b105f59582013cbfb31396290a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330693
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Lakshmanan M <lm@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Thomas Fleury
91401cc849 gpu: nvgpu: build flag for dGPU in safety
CONFIG_NVGPU_DGPU already exists to enable dGPU support.
Replace NVGPU_FORCE_DGPU_SAFETY_PROFILE with CONFIG_NVGPU_DGPU.

Jira NVGPU-5277

Change-Id: Ia1617a42269b18c1a443d91f9ca2ba38afd4a6f9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322899
(cherry picked from commit 9882b44709ef472b9113e3cd43974fe177eeeb24)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328532
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6fc1e41150 gpu: nvgpu: split submit on deterministic
Avoid repetitive branching on the c->deterministic flag and on build
time flags by breaking the submit function on the runtime flag into two
functions of which one gets called.

In deterministic mode the job tracking conditions are simpler, there are
a few extra prechecks to guarantee deterministic latency and the
railgate corner case, and deferred cleanup is never done.

In nondeterministic mode job tracking has more conditions, a power
reference is taken for the job lifetime, and deferred cleanup is
assumed.

These two paths still share some common code. Split it to two more
functions to act as easy building blocks so that the main logic is
apparent.

Jira NVGPU-4548

Change-Id: I64f91dcf09acb16f409dc04a12ad1e144d0cce56
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333728
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
b077c6787d gpu: nvgpu: split sync and gpfifo work in submit
Make the big submit function somewhat shorter by splitting out the work
to do job allocation, sync command buffer creation and gpfifo writing
out to another function. To emphasize the difference between tracked and
fast submits, add two separate functions for those two cases.

Jira NVGPU-4548

Change-Id: I97432a3d70dd408dc5d7c520f2eb5aa9c76d5e41
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333727
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Antony Clince Alex
96bea78d55 gpu: nvgpu: add init_hw, intr_enable hals to ce gops
Add the following two HALs to ce gops:
- init_hw:
  Build a list of non-stall interrupt vectors and register them
  with struct nvgpu_mc.
- intr_enable:
  Enable ce engine stall, non-stall interrupts.

Jira: NVGPU-5034

Change-Id: Ibdc768c2bce778237233803ebbbd5190362b4578
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329166
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
ajesh
1500ce829a gpu: nvgpu: assert for OS API errors in non fusa
Handle the OS API errors with assert in non FUSA builds also.

Jira NVGPU-4987

Change-Id: I90428e845ae9f934b0d4bce08ab93f13f3fde2f8
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332544
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Abdul Salam
af3311ddea gpu: nvgpu: Refactor clock_domain unit
As a part of refactoring move nvgpu_clk_domain struct from public
to private.
This will help to have arch consistency across all units.
Use public functions to fetch the data across other units.
The following functions are added to access data in clk_domain unit.
- nvgpu_pmu_clk_domain_get_f_points():
  To get freq points.
- nvgpu_pmu_clk_domain_update_clk_info():
  To update change seq script with clock domain data.

NVGPU-4689

Change-Id: Idc85e3cf5bbe1b80766ce6c9f07b3305ef04cbdc
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332185
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
aff5497907 gpu: nvgpu: add intr_unit_bitmask i/p param for fb.intr.isr
From tu104 onwards, fb interrupt status/enable/disable moved from
fb_niso_intr_* reg to fb_*vector* registers.
At the top level, fb interrupt status/enable/disable is done
using hub_intr bit in mc_intr registers.

Starting with nvgpu-next, this has changed.

JIRA NVGPU-5032

Change-Id: Ib54170b055b83e2696312c811c2e3ba678749359
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330867
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
091e4b9396 gpu: nvgpu: detect enabled ecc units in hal init
ECC scrubbing can start before GPU characteristics
are initialized. Detect enabled ECC units in HAL
init functions so that scrubbing is started properly.

Bug 2919887

Change-Id: Ic20b4223504a947eed78418779531e26c2116d41
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330101
(cherry picked from commit e8d380d7bba91b895033ebb5ab0d281be6d3db30)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331612
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seeta Rama Raju
e0cb334f4e gpu: nvgpu: Add fault injection variable for devctl_channel unit
JIRA NVGPU-5232

Change-Id: Ia0dde3390d0dc8c34f02e7a99419c0907213a2ed
Signed-off-by: Seeta Rama Raju <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331323
(cherry-picked from commit bdd370e80375527566cd6148812456d4369725b2)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325923
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
25461c7621 gpu: nvgpu: Move nvlink HAL code to /hal
Remove the nvlink register read/write code from /common.
Move the register handling code to /hal and add
HALs to expose this functionality to common code.

JIRA NVGPU-2964

Change-Id: Iafba9f4e29cc0f1130dbf5dd14fbbf8b6b5bb8ec
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329195
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
3378fbbb49 gpu: nvgpu: remove old ALLOC_GPFIFO
NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO has not been used in years. Delete it.
The SUBMIT_BIND (and ALLOC_GPFIFO_EX before it) ioctl shall be used
instead.

Jira NVGPU-4548

Change-Id: If707c1b131386d3662815518cd3689b596db5330
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325788
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kamble
9f8d5acfbb gpu: nvgpu: fix the return value from gk20a_mm_pin
The return value in case of failure of dma_buf_attach and
dma_buf_map_attachment was ignored and NULL was returned.
This would lead to the following NULL pointer access. Fix it.

[  293.622880] Unable to handle kernel NULL pointer dereference
               at virtual address 0000000000000000
...
[  293.711860] Hardware name: quill (DT)
[  293.720393] pc : nvgpu_linux_sgt_create+0x14/0xa8 [nvgpu]
[  293.725871] lr : nvgpu_vm_map_linux+0x104/0x1c8 [nvgpu]
...
[  293.813934] Call trace:
[  293.816455]  nvgpu_linux_sgt_create+0x14/0xa8 [nvgpu]
[  293.821573]  nvgpu_vm_map_linux+0x104/0x1c8 [nvgpu]
[  293.826515]  nvgpu_vm_map_buffer+0x120/0x290 [nvgpu]
[  293.831542]  gk20a_as_dev_ioctl+0x364/0xfb8 [nvgpu]
[  293.836416]  ksys_ioctl+0x17c/0xba8
[  293.839899]  __arm64_sys_ioctl+0x18/0x28
[  293.843817]  do_el0_svc+0xf8/0x1b8
[  293.847214]  el0_sync_handler+0x11c/0x28c
[  293.851217]  el0_sync+0x140/0x180
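
One common way to avoid this class of crash is to return an encoded error pointer instead of NULL and have the caller check it before dereferencing. The sketch below is a userspace imitation of that idiom and only an assumption about the shape of the fix: the message does not say how gk20a_mm_pin reports the error. All names (`sketch_err_ptr()`, `pin_buffer()`, `map_buffer()`, `struct sg_table_sketch`) are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sg_table_sketch {
	int nents;
};

static struct sg_table_sketch table = { 1 };

/* Hypothetical pin step: propagate the failure instead of returning NULL. */
static struct sg_table_sketch *pin_buffer(int attach_ok)
{
	if (!attach_ok)
		return sketch_err_ptr(-ENOMEM);
	return &table;
}

/* Hypothetical caller: check for an error before touching the table. */
static int map_buffer(int attach_ok)
{
	struct sg_table_sketch *sgt = pin_buffer(attach_ok);

	if (sketch_is_err(sgt))
		return (int)sketch_ptr_err(sgt);
	assert(sgt != NULL);
	return sgt->nents;
}
```

With this shape, a failed attach surfaces as a negative return value from the mapping call rather than as a dereference of address 0.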

Bug 2834141

Change-Id: I0d9e863d0326946c8091bfb1b907b62b055f7272
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332204
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Debarshi Dutta
8e9d837fd5 gpu: nvgpu: remove unused include file
#include "../../../arch/arm/mach-tegra/iomap.h" should be removed.

Bug 2887230

Change-Id: I3402dbae5a61845475cff4a0a9a36c60f41b45cd
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332091
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
bb93223a21 gpu: nvgpu: add check_warp_esr_error hal
Set check_warp_esr_error hal pointer to
gv11b_gr_check_warp_esr_error hal function.

Jira NVGPU-4867

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: Ib014c5ff2456836af2fe89f849f37991fe52844e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331804
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
d013e42f60 gpu: nvgpu: rename INTR_* defines
Rename INTR_* to MC_INTR_* defines.

JIRA NVGPU-5032

Change-Id: Iee291e2003171e3cf02b6452f1567747093e5773
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331742
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
ajesh
b6c7b9976e Revert "gpu: nvgpu: modify the prints for return values"
This reverts commit 48946d43fd9b6028900b594d329cd7f51f196ba4.

Jira NVGPU-4987

Change-Id: I2fbaa3ff0b5eec7fbf051353a6bd80576e818f2e
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329597
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
8f3a0f4486 gpu: nvgpu: add sm rams ecc enabled flag
Add sm rams ecc enabled flag.

Move ecc scrubbing timeout defines to
gr_init_gv11b.h

Jira NVGPU-4871

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: Ie43f5947c53be697d0b2fd064d308612856d823a
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328871
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Rajesh Devaraj
bfa1736351 gpu: nvgpu: update bug() to print function name
This patch updates BUG() to print the name of the function that
triggered it. In addition, it also prints the line number in which
BUG() is present in the function that triggered SW quiesce. This
will aid in finding the function due to which SW quiesce has been
triggered.
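
A minimal sketch of how a BUG()-style macro can capture its call site, using the standard `__func__` identifier and `__LINE__` macro. `SKETCH_BUG()`, `record_bug()` and `last_bug` are hypothetical names, and the sketch only records and prints the location; it does not trigger SW quiesce or abort the way a real BUG() would.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record of the most recent BUG site. */
struct bug_site {
	const char *func;
	int line;
};

static struct bug_site last_bug;

/* Store and print the function name and line that hit the macro. */
static void record_bug(const char *func, int line)
{
	assert(func != NULL);
	last_bug.func = func;
	last_bug.line = line;
	fprintf(stderr, "BUG detected in function %s at line %d\n", func, line);
}

/* Expands at the call site, so __func__ and __LINE__ name the caller. */
#define SKETCH_BUG() record_bug(__func__, __LINE__)

/* Example function that hits the macro. */
static void faulty_path(void)
{
	SKETCH_BUG();
}
```

Because the macro expands where it is used, the printed name is that of the function containing the check, not of the reporting helper.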

Bug 2919887

Change-Id: Ie63d9e5f1ba128da54ddc18bd259659d634b60cb
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329796
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
dd2fb50a1a gpu: nvgpu: require deferred cleanup for aggressive sync destroy
Aggressive sync destroy is used on some platforms where the amount of
syncpoints is limited. It can cause sync objects to get allocated and
freed in the submit path and when jobs are cleaned up, so require
deferred cleanup. Allocations do not belong to job tracking in a
deterministic submit path.

Although this has been technically allowed before, deterministic
channels have likely not been a priority on those old platforms with
aggressive sync destroy set.

Update virtualized gp10b platform data to match on a gp10b-vgpu compat
string instead of gk20a-vgpu. gk20a (Tegra T124) hasn't been supported
for a long time. Delete the aggressive sync destroy field from this
platform. It's got enough syncpoints to not dynamically allocate them;
having this property set for gp10b-vgpu has likely been a mistake.

This is not a completely pure cherry-pick: also extend the gpu
characteristics to not advertise full deterministic submit support when
aggressive sync destroy is off. This platform flag cannot be adjusted by
the user unlike many other flags.

Jira NVGPU-4548

Change-Id: I283f546d48b79ac94b943d88e5dce55710858330
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322042
(cherry picked from commit b1ba2b997b2174e365bcb0782ef3e67260ff9e57)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328411
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
2001b8ec97 gpu: nvgpu: remove aggressive sync init from platform
Remove the boolean aggressive_sync_destroy flag from struct
gk20a_platform; only the threshold to set the channel limit is useful in
the platform data. The boolean flag is a runtime condition and it always
starts as false.

Jira NVGPU-4548

Change-Id: I1a4b9903978ab239581857ff791a7983f59fdc13
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331357
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
0b70fff5db gpu: nvgpu: fix job count calculation for non-pow2
The CIRC_SPACE and CIRC_CNT macros work as expected when the buffer size
is a power of two. The userspace-supplied number of inflight jobs is not
necessarily so. Compare the get and put pointers manually.
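
The difference can be sketched as follows. `MASKED_CNT()` repeats the arithmetic of the Linux `CIRC_CNT()` macro, which masks with `size - 1`; `ring_count()` and `ring_space()` are hypothetical helpers that compare the pointers manually and are not taken from the nvgpu change itself.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as CIRC_CNT(): only valid when size is a power of two. */
#define MASKED_CNT(head, tail, size) (((head) - (tail)) & ((size) - 1U))

/* Used slots in a ring of any size; put and get must be in [0, size). */
static uint32_t ring_count(uint32_t put, uint32_t get, uint32_t size)
{
	assert(size > 0U && put < size && get < size);
	return (put >= get) ? (put - get) : (size - get + put);
}

/* Free slots, keeping one slot empty to tell a full ring from an empty one. */
static uint32_t ring_space(uint32_t put, uint32_t get, uint32_t size)
{
	return size - 1U - ring_count(put, get, size);
}
```

For a ring of 6 slots with put at 1 and get at 4, three slots are in use, but the masked form yields 5; with 8 slots both forms agree on 5.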

Jira NVGPU-4548

Change-Id: Ifa7bd6d78f82ec8efcac21fcca391053a2f6f311
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328572
(cherry picked from commit 33dffa1cfb142eea0f28474566c31b632eee04f5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331340
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
47c3d4582c gpu: nvgpu: hide priv cmdbuf gva and size
Add an accessor function in the priv cmdbuf object for gva and size to
be written in a gpfifo entry once the cmdbuf build is finished. This
helps in eventually hiding the struct priv_cmd_entry as an
implementation detail.

Add a sanity check to verify that the buffer has been filled exactly to
the requested size. The cmdbufs are used to hold wait and increment
commands for syncpoints or gpu semaphores. A prefence buffer can hold a
number of wait commands of equal size, and the postfence buffer holds
exactly one increment.

Jira NVGPU-4548

Change-Id: I83132bf6de52794ecc419e033e9f4599e488fd68
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325102
(cherry picked from commit d1831463a487666017c4c80fab0292a0b85c7d83)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331339
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Dinesh
1c1da3d6b4 gpu: nvgpu: Syncpoint invalid value to ~0.
As qnx syncpoint's invalid value is ~0, change the code
to handle this.

Bug 200603716

Change-Id: I5ec79688cd9e60066725781f1effe57692ec0c27
Signed-off-by: Dinesh <dt@nvidia.com>
(cherry picked from commit 705260565a75bc90683841c4c08e4c857bda39f0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331208
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
c1521a7bba gpu: nvgpu: change system suspend's implementation
Currently, for platforms with canRailgate device characteristics disabled,
suspend can block as deterministic channels hold busy references. This
patch makes the change to first hold off any new jobs for deterministic
channels and then reverts back the busy references taken by those
channels. Following this, suspend also waits for the device to get idle
by waiting (with timeout) for the nvgpu's internal usage counter to
become zero. This ensures there are no further jobs in progress and
allows the system to go into a suspend state.

Bug 200598228
Bug 2930266

Change-Id: Id02b4d41a9c2dd64303b2e2449dbed48c12aea4c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328489
(cherry picked from commit 9d1e07ca18)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330159
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
e9747d5477 gpu: nvgpu: remove wait_fence_fd from incr_user
The wait_fence_fd parameter in nvgpu_channel_sync_incr_user() has not
been used since commit 1a4647272f ("gpu: nvgpu: remove fence
dependency tracking") where it was used to save a dependency fd to
sema-based post fences. The commit probably should have removed this
param; it has no purpose in the current design.

Jira NVGPU-4548

Change-Id: Id7e68b24f8e9ba0e43ff01b7af946434580b166e
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2326604
(cherry picked from commit f8031142270fb87ac41597ae70a80505078ae6d5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328423
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
aa1322f975 gpu: nvgpu: move syncpt priv cmd allocation
channel_sync_syncpt_gen_wait_cmd() is rather simple now and is called
from two places where one has the buf preallocated and the other
doesn't. Remove the preallocated flag from the function, moving the
allocation to the single place where it is needed.

Jira NVGPU-4548

Change-Id: I48083f4f6f1093d64b67c63582291392a3481932
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325101
(cherry picked from commit afb566721e2b4c15349ff79d51f5eddc49b66014)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331338
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
4acf78dff3 gpu: nvgpu: guard sync cmd hals properly
Make the syncpt and sema wait and incr command HAL ops consistent. Add
CONFIG_NVGPU_SW_SEMAPHORE guards for the semaphore ops. The syncpoint
ops already have CONFIG_TEGRA_GK20A_NVHOST around them.

Delete the dummy syncpt ops. They are not used; the ops are only needed
when the real versions exist.

Jira NVGPU-4548

Change-Id: I30315a67169b31b1d63a0a1a0a4492688db4a2bc
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325100
(cherry picked from commit ed13b286c5fbdbc008ec59172d98ac79e9f2e733)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331337
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
39844fb27c gpu: nvgpu: hide priv cmdbuf mem writes
Add an API to append data to a priv cmdbuf entry. Hold the write pointer
offset internally in the entry instead of having the user keep track of
where those words are written to.

This helps in eventually hiding struct priv_cmd_entry from users and
provides a more consistent interface in general. The wait and incr
commands are now slightly easier to read as well when they're just
arrays of data.

A syncfd-backed prefence may be composed of several individual fences.
Some of those (or even a fence backed by just one) may be already
expired, and currently the syncfd export design releases and nulls
semaphores when expired (see gk20a_sync_pt_has_signaled()) so for those
the wait cmdbuf is appended with zeros; the specific function is for
this purpose.
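
The append interface described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: struct cmd_entry, cmd_entry_append() and cmd_entry_append_zeros() are hypothetical names, and the entry is modelled as a plain array of 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry: backing storage plus a write offset it owns itself. */
struct cmd_entry {
	uint32_t *words;   /* backing storage for this entry */
	uint32_t size;     /* capacity in 32-bit words */
	uint32_t fill_off; /* next word to be written */
};

/* Append len words; return 0 on success, -1 if they do not fit. */
int cmd_entry_append(struct cmd_entry *e, const uint32_t *data, uint32_t len)
{
	if (len > e->size - e->fill_off) {
		return -1;
	}
	memcpy(&e->words[e->fill_off], data, (size_t)len * sizeof(uint32_t));
	e->fill_off += len;
	return 0;
}

/* Append len zero words, e.g. in place of a wait that is no longer needed. */
int cmd_entry_append_zeros(struct cmd_entry *e, uint32_t len)
{
	if (len > e->size - e->fill_off) {
		return -1;
	}
	memset(&e->words[e->fill_off], 0, (size_t)len * sizeof(uint32_t));
	e->fill_off += len;
	return 0;
}
```

Callers only hand over arrays of words; where they land inside the entry is no longer their concern.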

Jira NVGPU-4548

Change-Id: I1057f98c1b5b407460aa6e1dcba917da9c9aa9c9
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325099
(cherry picked from commit 6a00a65a86d8249cfeb06a05682abb4771949f19)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331336
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
0c9f589f3f gpu: nvgpu: Remove TLC error regs from dev_reginit
The TLC error registers will be programmed as part of
interrupt and error initialization code. This will help move
all common.nvlink_turing_intr unit related code together.

JIRA NVGPU-4350

Change-Id: I1c291f346eee890ee973889473b44227306d0400
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2327621
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Petlozu Pravareshwar <petlozup@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
tkudav
3856381b43 gpu: nvgpu: Clear nvlink error persistent state
Error logging bits within the nvlink blocks like TLC and MIF are
persistent through reset, to enable them to be polled following
a reset event.  That means that they are in an unknown state at
cold reset, and may contain error state after a warm reset event.
Software is expected to reset them, either by writing ones to the
status bits or by writing to the DEBUG_RESET register at the IOCTRL
top level, to clear the state out before enabling error reporting.
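
As a rough model of the "write ones to the status bits" option, the sketch below simulates a write-one-to-clear register in software. The struct w1c_reg type and the helper names are hypothetical; real hardware access would go through the driver's register read/write helpers instead.

```c
#include <stdint.h>

/* Simulated status register with write-one-to-clear (W1C) semantics. */
struct w1c_reg {
	uint32_t val;
};

uint32_t w1c_read(const struct w1c_reg *r)
{
	return r->val;
}

/* Each 1 bit written clears the matching status bit; 0 bits are ignored. */
void w1c_write(struct w1c_reg *r, uint32_t v)
{
	r->val &= ~v;
}

/*
 * Clear whatever state survived the reset before error reporting is
 * enabled: read the status and write the same value back.
 * Returns the stale bits that were found.
 */
uint32_t clear_stale_errors(struct w1c_reg *status)
{
	uint32_t stale = w1c_read(status);

	w1c_write(status, stale);
	return stale;
}
```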

JIRA NVGPU-4352

Change-Id: Iab4e96388fd827c0d694eada61b20f24bbddd1ff
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2317683
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
5af8cedf05 gpu: nvgpu: Nvlink interrupt handling
Enable logging and error reporting for MIF, DLPL, and TLC blocks.
Configure the NVLIPT and IOCTRL interrupt registers to roll up
the MIF and TLC errors on the link-specific fatal line and the
DLPL interrupts on the link-specific intr_a (fatal) line. Both
link_err_fatal and link_intr_a are rolled up to the stall interrupt line.
In the handling ISR, clear the interrupt status registers and print
an error.
Move the interrupt handling HAL code to /common/hal.

JIRA NVGPU-4350
JIRA NVGPU-4351
JIRA NVGPU-5231
JIRA NVGPU-4354
JIRA NVGPU-4355
JIRA NVGPU-4356

Change-Id: I14812499caf506592f3ae84d6681d857730d31ff
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2313221
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
d58d6ff321 gpu: nvgpu: use job count for priv cmdbuf size
Reduce the priv cmdbuf allocation size to match the actual space needed
in the worst case when num_in_flight is not specified. Although
synchronization may indeed take up to 2/3 of the gpfifo entries, the
number of jobs is what matters and it will be the remaining 1/3.

Each job uses up at most one wait and incr command from the pre and post
fences, so half of the 2/3 will be only wait commands and the other half
will be only incr commands.
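
The arithmetic can be written out as a small sketch. The function names are hypothetical, and the 8 and 10 word command sizes used in the usage note are only example values for older chips.

```c
#include <stdint.h>

/*
 * Worst case when the caller gives no num_in_flight: every job takes one
 * wait entry, one user entry and one incr entry in the gpfifo, so at most
 * a third of the gpfifo entries are jobs.
 */
uint32_t worst_case_jobs(uint32_t gpfifo_entries)
{
	return gpfifo_entries / 3U;
}

/* Priv cmdbuf words needed: one wait and one incr command per job. */
uint32_t priv_cmdbuf_words(uint32_t jobs, uint32_t wait_words,
			   uint32_t incr_words)
{
	return jobs * (wait_words + incr_words);
}
```

For a 1024-entry gpfifo this gives 341 jobs, and with 8-word waits and 10-word incrs 341 * 18 = 6138 words.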

Jira NVGPU-4548

Change-Id: Ib3566a76b97d8f65538d961efb97408ef23ec281
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325233
(cherry picked from commit 515deae4f58fedc7d004988f0f85470a7a894ddf)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328413
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
116c385089 gpu: nvgpu: alloc priv cmdbuf based on chip
The semaphore wait and incr sizes are not 8 and 10 for gv11b onwards.
Use the specific HAL API to retrieve their sizes and compute the priv
cmdbuf queue based on them instead of the up-to-gp10b values.

We likely haven't run out of space so far for several reasons:

1) userspace may not need both pre and post fences for absolutely each
   submitted job
2) submitted jobs may consist of more than one gpfifo entry, reducing
   the relative required sync capacity
3) the queue size is rounded up to the next power of two which leaves
   some margin for error in this calculation
4) the gpfifo size based num-in-flight guess has been twice as big as it
   needs to be (fixed in a later patch)
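
A sketch of sizing the queue from per-chip command sizes, including the power-of-two rounding from point 3. The ops table and all function names here are hypothetical; the 8/10 values are the up-to-gp10b sizes mentioned above, while the 10/12 values are placeholders chosen only to show the effect and are not taken from any chip documentation.

```c
#include <stdint.h>

/* Hypothetical per-chip ops reporting command sizes in 32-bit words. */
struct sync_size_ops {
	uint32_t (*wait_cmd_words)(void);
	uint32_t (*incr_cmd_words)(void);
};

uint32_t legacy_wait_words(void) { return 8U; }
uint32_t legacy_incr_words(void) { return 10U; }
/* Placeholder sizes for a "newer" chip; illustration only. */
uint32_t placeholder_wait_words(void) { return 10U; }
uint32_t placeholder_incr_words(void) { return 12U; }

const struct sync_size_ops legacy_ops = {
	legacy_wait_words, legacy_incr_words
};
const struct sync_size_ops placeholder_ops = {
	placeholder_wait_words, placeholder_incr_words
};

/* Round up to the next power of two; 0 if not representable in 32 bits. */
uint32_t roundup_pow2_u32(uint32_t v)
{
	uint32_t p = 1U;

	if (v > 0x80000000U) {
		return 0U;
	}
	while (p < v) {
		p <<= 1;
	}
	return p;
}

/* Queue size in words: one wait and one incr per job, then rounded up. */
uint32_t priv_cmdbuf_queue_words(const struct sync_size_ops *ops,
				 uint32_t jobs)
{
	uint32_t per_job = ops->wait_cmd_words() + ops->incr_cmd_words();

	return roundup_pow2_u32(jobs * per_job);
}
```

With 400 jobs, the 8/10 sizes need 7200 words and round up to 8192, while the placeholder 10/12 sizes need 8800 words and round up to 16384. Hardcoding the smaller sizes would only have worked thanks to the margins listed above.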

Jira NVGPU-4548

Change-Id: I172b5c0d8bb7d2231cc45cbed5e1e8b60ce7c707
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323148
(cherry picked from commit 03fb194d105242c3eb20a9857a22743f5f64b9b9)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328412
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
00203b42f2 gpu: nvgpu: split add_sema_cmd to wait and incr
The internal add_sema_cmd() used when making cmd buf entries has so many
branches it makes sense to split it at the bool acquire flag into two
functions. The wait part doesn't even need the wfi flag, and the incr
part doesn't need offset.

Jira NVGPU-4548

Change-Id: Iab26b9bc14564e2958935ab7ffda03aa873dd9b1
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323320
(cherry picked from commit 9fe2830aa9ee2b0b165edc959defa74dfb49c6ba)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328410
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6202ead057 gpu: nvgpu: split sema sync hal to wait and incr
Instead of one HAL op with a boolean flag to decide whether to do one
thing or another entirely different thing, use two separate HAL ops for
filling priv cmd bufs with semaphore wait and semaphore increment
commands. It's already two ops for syncpoints, and explicit commands are
more readable than boolean flags.

Change offset into cmdbuf in sem wait HAL to be relative to the cmdbuf,
so the HAL adds the cmdbuf internal offset to it.
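
The shape of the split can be sketched like this. Everything here is hypothetical: the struct names, the signatures and the placeholder opcodes are for illustration and do not reflect the real HAL prototypes or GPU methods.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command buffer: off is where this entry starts in mem. */
struct cmdbuf {
	uint32_t *mem;
	uint32_t off;
};

/* Placeholder opcodes for illustration only; not real GPU methods. */
enum { OP_WAIT = 1, OP_INCR = 2, OP_WFI = 3 };

struct sema_ops {
	/* off is relative to the start of cmd; a wait needs no wfi flag */
	void (*add_wait_cmd)(struct cmdbuf *cmd, uint32_t off,
			     uint32_t sema_id);
	/* written at the start of cmd; an incr needs no offset */
	void (*add_incr_cmd)(struct cmdbuf *cmd, uint32_t sema_id, bool wfi);
};

void example_add_wait_cmd(struct cmdbuf *cmd, uint32_t off, uint32_t sema_id)
{
	/* The op adds the cmdbuf-internal offset to the relative one. */
	uint32_t *w = &cmd->mem[cmd->off + off];

	w[0] = OP_WAIT;
	w[1] = sema_id;
}

void example_add_incr_cmd(struct cmdbuf *cmd, uint32_t sema_id, bool wfi)
{
	uint32_t *w = &cmd->mem[cmd->off];

	w[0] = OP_INCR;
	w[1] = sema_id;
	w[2] = wfi ? OP_WFI : 0U;
}

const struct sema_ops example_sema_ops = {
	.add_wait_cmd = example_add_wait_cmd,
	.add_incr_cmd = example_add_incr_cmd,
};
```

Each op takes only the parameters it actually uses, so call sites read as "add a wait" or "add an incr" rather than passing a boolean.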

While at it, modify the syncpoint cmdbuf HAL ops' prototypes to be
consistent.

Jira NVGPU-4548

Change-Id: Ibac1fc5fe2ef113e4e16b56358ecfa8904464c82
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323319
(cherry picked from commit 08c1fa38c0fe4effe6ff7a992af55f46e03e77d0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328409
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
6a7bf6cdc0 gpu: nvgpu: update sm ecc_status_error handling
Reuse the gv11b_gr_intr_handle_tpc_sm_ecc_exception function
for future chips to avoid code duplication.

Add a sm_ecc_status_errors HAL op to read
the ecc_status_errors.

Jira NVGPU-5033

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: I4a25837d9b833a48307b9353b82ff6597f985e41
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325537
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kamble
72d01afd0c gpu: nvgpu: replace dma_buf_kmap with dma_buf_vmap
dma_buf_kmap was introduced a decade ago to map a dma_buf partially
by the input number of pages, when 32-bit was fairly common. It was
added to not exhaust vmalloc space. Starting from kernel 5.6, it is
deprecated as vmap calls should succeed with larger available
vmalloc space.

Use dma_buf_vmap/vunmap instead of dma_buf_kmap/kunmap for handling
mapping of notifier memory in gk20a_channel_wait_semaphore.

Also update the debug prints and add a speculation barrier to the
start of gk20a_channel_wait.

Bug 2925664

Change-Id: I49078fa81f050a57a5b66a793e62006dd66e3ba3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2326513
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Abdul Salam
b029f3b2b0 gpu: nvgpu: Refactor clk_fll unit
As part of the refactor, move struct nvgpu_avfsfllobjs from the public
header to a private header.
This helps keep the architecture consistent across all units.
Other units use public functions to fetch the data.

NVGPU-4690

Change-Id: I73a750695c2ae7d3e46d1d692d10e40f13ec3cb3
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/#/c/linux-nvgpu/+/2326675/
(cherry picked from commit 41e374461da5dc9e2b4ac67a0855fd8dd20e1450)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328538
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Sagar Kamble
81b14ef5b1 gpu: nvgpu: fix dbg log and comment in nvgpu_vm_find_mapping
The following commit updated the debug message in the function
nvgpu_vm_find_mapping regarding reuse of a mapping.

  commit 2f00d9adfc4fc91a6b84b14cc513f9b855d39cad
  Author: Sagar Kamble <skamble@nvidia.com>
  gpu: nvgpu: fix null pointer access in nvgpu_vm_find_mapping

That reuse log is about the mapping and not the SGT. Fix the log
and add details to the comment about the different handling of the
SGT in the dmabuf drvdata cases.

Bug 2834141

Change-Id: I3630de1c45a2bf55ff18bdb426f0597efe83f72c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328427
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
1dcd4957f0 gpu: nvgpu: extract job from channel.c
Start moving job and job list related functionality out of the big
channel.c file. The lowest level job list stuff is moved, as is resource
preallocation which is tied to the job list. Adding and cleaning jobs
still stays in channel.c for now.

The joblist is still owned by the channel as a direct struct field.

Jira NVGPU-4548

Change-Id: I2733484d8ce6bd7b1fe0c32a867139c682616dfd
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323149
(cherry picked from commit cbd20803ee10058da9d258e9e8cb91b34d2278d5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328408
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
72151c579f gpu: nvgpu: hide priv cmd queue type
Move struct priv_cmd_queue to priv_cmdbuf.c so that its definition does
not need to be visible to all users of channel.h. This also forces it to
be separately allocated (during channel init time).

While at it, rename the functions to allocate and free priv cmdbuf
queues now that they're not in channel.c anymore. A private command
buffer queue is a piece of dma memory from which entries for incr and
wait command lists are suballocated. As the name implies, it's a queue;
allocations and frees of the bufs must happen in a certain order.
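
A minimal sketch of both ideas: a queue type that users see only as an incomplete struct, and suballocation that allocates at one end and frees from the other. All names are hypothetical; the sketch tracks word offsets only, has no backing dma memory, and ignores the need for entries to be contiguous across the wrap point.

```c
#include <stdint.h>
#include <stdlib.h>

/* "Header" view: an incomplete type and functions, no struct layout. */
struct cmd_queue;
struct cmd_queue *cmd_queue_create(uint32_t words);
void cmd_queue_destroy(struct cmd_queue *q);
int cmd_queue_alloc(struct cmd_queue *q, uint32_t words, uint32_t *start);
void cmd_queue_free_oldest(struct cmd_queue *q, uint32_t words);

/* "Source file" view: the layout is private, so it is heap allocated. */
struct cmd_queue {
	uint32_t size; /* total capacity in words */
	uint32_t put;  /* where the next allocation starts */
	uint32_t get;  /* where the oldest live allocation starts */
	uint32_t used; /* words currently allocated */
};

struct cmd_queue *cmd_queue_create(uint32_t words)
{
	struct cmd_queue *q = calloc(1, sizeof(*q));

	if (q != NULL) {
		q->size = words;
	}
	return q;
}

void cmd_queue_destroy(struct cmd_queue *q)
{
	free(q);
}

/* Allocate at the put end; returns 0 and the start offset, or -1 if full. */
int cmd_queue_alloc(struct cmd_queue *q, uint32_t words, uint32_t *start)
{
	if (words > q->size - q->used) {
		return -1;
	}
	*start = q->put;
	q->put = (q->put + words) % q->size;
	q->used += words;
	return 0;
}

/* Free from the get end; entries must be freed in allocation order. */
void cmd_queue_free_oldest(struct cmd_queue *q, uint32_t words)
{
	q->get = (q->get + words) % q->size;
	q->used -= words;
}
```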

Jira NVGPU-4548

Change-Id: I1b47029f3a478e1942f24292918b7b59a5d91528
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323147
(cherry picked from commit 1fcf9b04275f44638059c0147dc16c1dc6956510)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328407
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
b3d16b23d5 gpu: nvgpu: extract priv cmdbuf from channel.c
Move private command buffer related functionality to priv_cmdbuf.c. This
is used only for kernel mode submits, so it makes sense to group it out,
and the priv cmdbuf stuff is used also by things that don't care about
channels.

Jira NVGPU-4548

Change-Id: Idbb42e3ed3984e16c654bb9aa2b7564b780048a4
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323146
(cherry picked from commit bb67bfc7ab8e87236f31bc4f6c80dab042609f21)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328406
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
ajesh
7fc3c3822d gpu: nvgpu: reduce the ccm for thread unit
Reduce the code complexity of the function nvgpu_thread_create_priority
in the thread unit.

Jira NVGPU-4987

Change-Id: I85da527c3d8dbbe37c5428e5bded9ed19b299613
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2327865
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
52835c39ae gpu: nvgpu: do not skip completed syncpt prefences
A corner case has existed since ancient times for syncpoint-backed
prefences to not cause a gpu wait if the fence is found to be completed
in the submit path. This adds some unnecessary complexity, so don't
check for completion in software. Let the gpu "wait" for these
known-to-be-trivial waits too. Necessary priv cmdbuf space has been
allocated anyway.

Originally nvhost had 16-bit fences which would wrap around relatively
quickly, so waiting for an old fence could have looked like waiting for
a fence that will expire long in the future. With 32-bit thresholds,
this hasn't been the case for several Tegra generations anymore, and
nvhost doesn't ignore waits like this either.
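
The wraparound problem can be illustrated with one common way of comparing wrapping counters, the signed difference. This is not nvhost's actual code; expired16() and expired32() are hypothetical helpers, and the casts assume the usual two's complement conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* A threshold counts as reached if it is at most half the range behind. */
bool expired16(uint16_t current, uint16_t threshold)
{
	return (int16_t)(uint16_t)(current - threshold) >= 0;
}

bool expired32(uint32_t current, uint32_t threshold)
{
	return (int32_t)(current - threshold) >= 0;
}
```

With a threshold of 100 and the counter at 40100, the fence is 40000 increments old. That is more than half of the 16-bit range, so expired16() reports it as still pending, while expired32() correctly reports it as expired.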

The wait priv cmdbuf in submit path can still be missing even with a
prefence supplied because the Android sync framework supports sync fds
that contain zero fences inside; this can happen at least when merging
fences that have all been expired. In such conditions the wait cmdbuf
wouldn't even get allocated.

[this is squashed with commit 8b3b0cb12d118 (gpu: nvgpu: allow no wait
cmd with valid input fence) from
https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325677]

Jira NVGPU-4548

Change-Id: Ie81fd8735c2614d0fedb7242dc9869d0961610eb
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321762
(cherry picked from commit 8f3dac44934eb727b1bf4fb853f019cf4c15a5cd)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324254
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Nitin Kumbhar
85949d39e2 gpu: nvgpu: disable GC-OFF feature for all dGPUs
Set the can_pci_gc_off platform flag of all dGPUs to false
to disable powering the dGPU on and off using the GC-OFF feature.

Bug 2917054

Change-Id: Iffacd134cf52a137bb9c121d69bd0fd0a096c6ff
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2327968
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00