Commit Graph

3269 Commits

Author SHA1 Message Date
Vedashree Vidwans
b24f577a5c gpu: nvgpu: reduce traffic on dbg_fn or dbg_info
Reduce debug logs printed when gpu_dbg_info or gpu_dbg_fn is set.
- Add gpu_dbg_verbose flag for more verbose debug prints. Update prints
in to ga10b_gr_init_wait_idle(), gm20b_gr_init_wait_fe_idle(),
gv11b_gr_init_write_bundle_veid_state() and
gv11b_gr_init_load_sw_veid_bundle().
- Add gpu_dbg_hwpm flag for hwpm specific debug prints. Update print in
nvgpu_gr_hwpm_map_create().
- Add gpu_dbg_mm for MM specific debug prints. Update prints in
gm20b_fb_tlb_invalidate(), gk20a_mm_fb_flush(),
gk20a_mm_l2_invalidate_locked(), gk20a_mm_l2_flush() and
gv11b_mm_l2_flush().
- Remove gpu_dbg_fn mask print in gr_ga10b_create_priv_addr_table(),
gr_gk20a_get_pm_ctx_buffer_offsets(), gr_gv11b_decode_priv_addr() and
gr_gv11b_create_priv_addr_table().

Jira NVGPU-7183

Change-Id: I9842d567047cb95a42e23b5907ae324214eed606
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2602797
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-09 15:05:21 -07:00
Seshendra Gadagottu
4333bc7faf gpu: nvgpu: ga10b: patch ctx with rops_crop_debug1_crd_cond_read_disable
For ga10b emulate_mode, patch context with rops_crop_debug1_crd_cond_read_disable
for required perf setting.

Bug 200768322
JIRA NVGPU-6433

Change-Id: Ib1f977ed28e3b18184bce7ac695a0b6a2bae979d
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2602268
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-06 18:15:40 -07:00
dt
e628e23d59 gpu: nvgpu: nvgpu-next: Fixup for false ltc tag tracking
This is clearing the write-through behavior of CE and ROP writes.

Bug 200601972

Change-Id: I269d2b994be13f5e15090c520c129d36489df3c1
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561967
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-10-06 18:11:34 -07:00
Deepak Nibade
d1f3f81553 gpu: nvgpu: remove SW methods from safety build
Improved SDL heartbeat mechanism detects the interrupts triggered by
SW method and treats them as errors. Hence remove the SW method support
completely from safety build. Registers set by SW methods are now set
by default for all the contexts.

Implement new HAL gops.gr.init.set_default_compute_regs() to set the
registers in patch context. Call this HAL while creating each context.

Update gv11b_gr_intr_handle_sw_method() to treat all compute SW methods
as invalid.

Update unit test test_gr_intr_sw_exceptions() so that it now expects
failure for any method/data.

Bug 200748548

Change-Id: I614f6411bbe7000c22f1891bbaf06982e8bd7f0b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527249
(cherry picked from commit bb6e0f9aa1404f79bcfbdd308b8c174a4fc83250)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2602638
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-04 18:03:55 -07:00
smadhavan
19fa7004aa gpu: nvgpu: Fix memory leaks in common.acr
The SEC2 ucode allocation code does not free the struct nvgpu_firmware
data structures used while requesting firmwares - sec2_fw, sec2_desc
and sec2_sig.
The lsfm_free_nonpmu_ucode_img_res() API only frees the 'data' field
of struct nvgpu_firmware, but not the entire struct.
Fix these memory leaks by calling nvgpu_release_firmware() API
after the intended use of allocated struct is achieved.

Bug 200690283

Change-Id: I1ed2e1603455bce65af897a40aa31ccc82fda4b0
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2488219
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-04 13:18:27 -07:00
Konsta Hölttä
1b1d183b9c gpu: nvgpu: simplify gmmu map calls
Introduce nvgpu_gmmu_map_partial() to map a specific size of a buffer
represented by nvgpu_mem, or what nvgpu_gmmu_map() used to do. Delete
the size parameter from nvgpu_gmmu_map() such that it now maps the
entire buffer. The separate size parameter is a historical artifact from
when nvgpu_mem did not exist yet; the typical use is to map the entire
buffer.

Mapping at a certain address with nvgpu_gmmu_map_fixed() still takes the
size parameter.

The returned address still has to be stored somewhere, typically to
mem.gpu_va by the caller so that the matching unmap variant finds the
right address.

Change-Id: I7d67a0b15d741c6bcee1aecff1678e3216cc28d2
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601788
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 21:38:43 -07:00
Ramesh Mylavarapu
d2d59d6206 gpu: nvgpu: add gsp ops to support cmd/msg
Added all dependent gsp dependent ops. This include
read/write from/into EMEM, get Queue head/tail, engine
dependent ops and aperture settings.

NVGPU-6784

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: Ic780bfdcd2de593bf2e8f292756e3d1700610ad2
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590940
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 09:29:55 -07:00
Ramesh Mylavarapu
35796f70c6 gpu: nvgpu: add msg handling support
Add message handling support to read the response from
GSP nvrisc.

NVGPU-6784

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I0d301dfc34560f7b18e075cf11f7afbe7d1b6e06
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590769
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 09:29:42 -07:00
Ramesh Mylavarapu
3c980954c4 gpu: nvgpu: add cmd post support
Add command post support to send commands to GSP nvriscv.

NVGPU-6784

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: Ib7fde3712c24a5b4f0f58d7788e67d29a1e351a2
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590763
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 09:29:31 -07:00
Ramesh Mylavarapu
085f94bf89 gpu: nvgpu: add queue support for gsp cmd/msg
implemented queue support which is needed for cmd/msg for managing
CMDQ/MSGQ. In ga10b GSP, totally 4 CMDQ and 4 MSGQ supported.
in current implementation we use only one CMDQ and one MSGQ.

NVGPU-6784

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: Ib40ff9df6580e15824131dd6f54bfb85dce8e594
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590678
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 09:29:20 -07:00
Ramesh Mylavarapu
8c455dff18 gpu: nvgpu: add sequence support for gsp cmd/msg
implemented sequence support which is needed for cmd/msg for sequencing
all the commands sent from NVGPU to gsp and also to handle cmd responses
with respect to correspondind assigned sequences.

NVGPU-6784

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I7d0bb015227c11512ec3c7a5ef7117e149704206
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590607
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 09:29:09 -07:00
Mahantesh Kumbar
00c36c9b48 gpu: nvgpu: t234: Falcon debug update
-Add new log bit for falcon debug under gpu_dbg_*
-BIT(40) assigned to gpu_dbg_falcon
-Replaced nvgpu_info with nvgpu_falcon_dbg() in Falcon unit

Bug 200780546

Change-Id: Icd88bb940014d501142952b399ce76f4d8d5ff92
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2603212
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-10-01 00:51:49 -07:00
Konsta Hölttä
44422db851 gpu: nvgpu: simplify gmmu unmap calls
Introduce nvgpu_gmmu_unmap_addr() to unmap a nvgpu_mem that was mapped
at some other address than mem.gpu_va, which can be the case for buffers
that are shared across different address spaces. Delete the address
parameter from nvgpu_gmmu_unmap(), as the common case is to store the
address to mem.gpu_va when mapping the buffer.

Modify some instances of consecutive unmap + free calls to call just
nvgpu_dma_unmap_free().

Change-Id: Iecd7c9aa41d04e9f48e055f6bc0c9227cd759c69
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601787
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-30 16:29:41 -07:00
deepak goyal
9bdb8f1a10 gpu: nvgpu: Correct LS ucode data alignment.
Currently LS UCODE data is aligned to PAGE_SIZE
which is dependent on kernel config.

This causes "data_size" variable to change due to
padding difference which causes LS sig authentication
to fail.

This patch corrects alignment and align it to
LSF_UCODE_DATA_ddALIGNMENT instead of PAGE_SIZE.

Bug 200773365

Change-Id: I5f2fe1152053ed6135c01ae3eb94e8cf6eecde5f
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2602083
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-30 01:39:13 -07:00
Deepak Nibade
af989f6212 gpu: nvgpu: fix misra rule 13.2 violations in common.gr unit
Fix MISRA rule 13.2 violations of below type from common.gr unit:

nvgpu/drivers/gpu/nvgpu/common/gr/gr_intr.c:108
  Type: MISRA C-2012 Side Effects (MISRA C-2012 Rule 13.2, Required)

nvgpu/drivers/gpu/nvgpu/common/gr/gr_intr.c:108:
  1. misra_c_2012_rule_13_2_violation:
  In "nvgpu_safe_add_u32(nvgpu_gr_gpc_offset(g, gpc), nvgpu_gr_tpc_offset(g, tpc))",
  there are 2 function calls in the arguments for which the order of
  evaluation is undefined.

Jira NVGPU-7127

Change-Id: Ie867fb62098eed3a45ec01b941eda93b94220b4b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2598696
(cherry picked from commit 15483df6ca1017e5b9d6f2dff35f7e57094a2b4d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601976
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: V M S Seeta Rama Raju Mudundi <srajum@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-29 15:14:34 -07:00
Divya
ae2d561c48 gpu: nvgpu: add platform support for Static PG
- Add support for taking static PG config values
  from DT nodes
- Check those values against valid set of values
  for GPC, TPC and FBP
- Store valid values in g->gpc_pg_mask, g->fbp_pg_mask
  and g->tpc_pg_mask[] array.

Bug 200768322
JIRA NVGPU-6433

Change-Id: Ifc87e7d369034b1daa13866bc16a970602514bf6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2594802
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-25 15:47:25 -07:00
Mahantesh Kumbar
82526439dc gpu:nvgpu: Support to bootstrap ctxsw in MIG mode
-Update PMU_RPC_STRUCT_ACR_BOOTSTRAP_FALCON to
 accpet the FECS/GPCCS instance bootstrap request.
-Update the ACR ucode interface to take MIG mode
 param to config FECS/GPCCS SCTL PLM for LSPMU access.

JIRA NVGPU-6562

Change-Id: I460ef4e965009b3a77aeb4350f2191235f52c6f7
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587033
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-23 20:21:43 -07:00
Divya
9266da636b gpu: nvgpu: update static pg support for pre-si
- On pre-silicon platform, static pg will be
  done by nvgpu driver. For this, retain structs
  and HALs of static pg.
- Add the static pg support under pre-silicon code.
- On silicon, the static pg will be done by BPMP.
- Rename variables used in static pg for better
  readability and consistency

Bug 200768322
JIRA NVGPU-6433

Change-Id: Ib31c0f83b751c2b1563a36bd51af78a0bd12a117
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2594801
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 13:40:11 -07:00
Sagar Kamble
72c3bce602 gpu: nvgpu: compile out non-safe ctxsw_prog hals
Following two hals are non-safe. Compile them under
CONFIG_NVGPU_HAL_NON_FUSA:
1. init_ctxsw_hdr_data
2. disable_verif_features

JIRA NVGPU-5358

Change-Id: I751c4655dc628f7ab66ed3a779268a6a88f9a1e3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581361
(cherry picked from commit abf16c6a01109d174879609c10354f06739fb6dc)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581842
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:12 -07:00
Sagar Kamble
62b04331de gpu: nvgpu: compile out priv_access_map config/addr hals
These hals are non-safe. Compile them out with
CONFIG_NVGPU_SET_FALCON_ACCESS_MAP.

JIRA NVGPU-5358

Change-Id: I75b46e201fa132e09fee15679a402d24bbf9b2ab
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581360
(cherry picked from commit d048333ef391019b2618abf7d09c8fe2042f8ee0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581841
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:00 -07:00
Mayur Poojary
fe7368f8f4 gpu: nvgpu: ga10b: Support emulate mode
Add sysfs node to enable gpu emulate_mode and
pass the value to acr through acr descriptor struct.

Bug 3279344

Change-Id: I936b1dda84d7f4f3688237308223c019798bdce3
Signed-off-by: Mayur Poojary <mpoojary@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591377
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-20 16:40:34 -07:00
Debarshi Dutta
60aab0a1da gpu: nvgpu: add null check before calling function pointer
nvgpu_gsp_isr_support is called from the common code and results in
a null pointer exception when calling g->ops.gsp.enable_irq when its
not defined for some chips. Fix that.

Bug 200763510

Change-Id: Ifef0d31ac4a8d06120bcebc17daf4a5b6559e3c3
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593355
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-16 21:45:49 -07:00
Debarshi Dutta
9328f057a7 gpu: nvgpu: fix use-after-free use case of CE APP.
The following issue is reported when running sudo modprobe -r nvgpu

[  134.066392] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
[  134.066428] Mem abort info:
[  134.066431]   ESR = 0x96000004
[  134.066434]   EC = 0x25: DABT (current EL), IL = 32 bit
[  134.066450] [0000000000000058] pgd=0000000000000000, p4d=0000000000000000
[  134.066459] Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP

[  134.066639] pc : nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu]
[  134.066847] lr : nvgpu_cic_rm_wait_for_stall_interrupts+0x74/0xd0 [nvgpu]
[  134.067043] sp : ffff80001971ba80
[  134.067046] x29: ffff80001971ba80 x28: ffff000093b0da00
[  134.067054] x27: 0000000000000000 x26: ffff80001c28b990
[  134.067061] x25: ffff00008cd01000 x24: 0000000000000bb8
[  134.067067] x23: 0000000000000000 x22: ffff0000915b0000
[  134.067073] x21: ffff000093b0da00 x20: ffff0000915b0000
[  134.067079] x19: ffff0000915b0000 x18: 0000000000000036
[  134.067085] x17: 0000000000000000 x16: 0000000000000000
[  134.067091] x15: ffff8000126b5fd8 x14: 7373616c633d4d45
[  134.067097] x13: ffff8000098abef0 x12: 0000000000000000
[  134.067102] x11: ffff8000098ab5a0 x10: ffff8000098abef8
[  134.067108] x9 : ffff80001010e844 x8 : ffff80001971ba48
[  134.067115] x7 : 2222222222222222 x6 : ffff000093b0da00
[  134.067122] x5 : ffff8000098b1fd8 x4 : 0000000000000000
[  134.067127] x3 : 0000000000000000 x2 : 0000000000000000
[  134.067133] x1 : 0000000000000000 x0 : 0000000000000000
[  134.067138] Call trace:
[  134.067140]  nvgpu_cic_rm_wait_for_stall_interrupts+0x78/0xd0 [nvgpu]
[  134.067328]  nvgpu_cic_rm_wait_for_deferred_interrupts+0x20/0xb0 [nvgpu]
[  134.067517]  nvgpu_channel_deferred_reset_engines+0x29c/0x920 [nvgpu]
[  134.067714]  nvgpu_channel_close+0x18/0x20 [nvgpu]
[  134.067904]  nvgpu_init_pramin+0x2ac/0x350 [nvgpu]
[  134.068092]  nvgpu_ce_app_destroy+0x94/0xe0 [nvgpu]
[  134.068279]  nvgpu_put+0x90/0x120 [nvgpu]
[  134.068465]  nvgpu_pci_shutdown+0x29c/0x18a0 [nvgpu]
[  134.068655]  pci_device_remove+0x44/0xe0
[  134.068665]  device_release_driver_internal+0x114/0x1f0
[  134.068701]  driver_detach+0x54/0xe0
[  134.068709]  bus_remove_driver+0x70/0x120
[  134.068733]  driver_unregister+0x34/0x60

The above issue occurs due to freeing of CIC resources earlier than
dependent users of interrupts e.g. CDE, CE etc.

As a solution, move CIC deinit sequence to end of nvgpu_put.
This handles deinit properly for VGPU/IGPU/DGPU.

Bug 200763510

Change-Id: I696e31d5e03a9468cccfe710048000dbf7cf0269
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592063
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 21:45:43 -07:00
Tejal Kudav
9b5274593c gpu: nvgpu: Update common.ptimer documentation
Enhance doxygen comments for below common.ptimer APIs:
1. nvgpu_scale_ptimer()
2. gops_ptimer.isr()

Remove assert calls from nvgpu_scale_ptimer() as it now
has a means to return error.
Reorder the Ptimer ISR code for better logical flow.

JIRA NVGPU-6989

Change-Id: I5adf4d665d3b90d3e9b11557a15fcb91e485f353
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2583667
(cherry picked from commit 502ab9ee2dc3f3b7b1da7ac59f13fddce4ead616)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592057
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 05:59:13 -07:00
Tejal Kudav
5a94007725 gpu: nvgpu: Remove redundant HAL from common.fbp
common.fbp has two interfaces to initialize FBP:
1. Public API nvgpu_fbp_init_support
2. HAL fbp.fbp_init_support

nvgpu_fbp_init_support() is only used to initialize HAL
fbp.fbp_init_support. Remove the HAL and use the API directly.

JIRA NVGPU-6644

Change-Id: I2c455e09dbcf5e4fb1dc370b284e4f0d5c678b40
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592047
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 05:59:00 -07:00
Debarshi Dutta
791dc18666 gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields
Add Setter and Getter methods for accessing tsg->sm_error_states.
Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state.
This renders it unnecessary to add BVEC for above fields for the struct
in multiple locations. The current design ensures that only a constant
pointer is obtained from the owner unit i.e. FIFO.

The following new methods are added. Both unit tests and BVEC tests
are added for them as well.

nvgpu_tsg_store_sm_error_state
nvgpu_tsg_get_sm_error_state

Jira NVGPU-6947

Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-13 20:57:09 -07:00
ajeshkv
118f8c1280 gpu: nvgpu: add support for gsp stress test
Add debugfs entries to support GSP stress test and other
functionalities to enable the test.

JIRA CORERM-3382

Change-Id: Iab20fcfe78807e76e91c64716502a2f036ed4d18
Signed-off-by: ajeshkv <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589390
Reviewed-by: Amit Pabalkar <apabalkar@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-10 16:02:43 -07:00
deepak goyal
cc7b048641 gpu: nvgpu: non-zero blob size for rail-gating.
Ucode blob size 0 is passed currently for rail-gating.
Ucode blob size 0 is not supported by ACR yet.
ACR will copy UCODE blob again
to SYSMEM for GPU Rail-gating cycles.

Bug 3361416

Change-Id: I1fdb3993cda7e5d62507d83f9c0a8645dc5f7fc7
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2588207
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-09 09:16:37 -07:00
Debarshi Dutta
a53ebf02d1 gpu: nvgpu: update error message to info.
These errors are now actually expected from code that counts number of
sys/gpc/fbp perfmons after first context creation. Nvgpu tries to count
them by register offset lookup in context image and counts perfmons until
invalid offset is found.

nvgpu_gr_hwmp_map_find_priv_offset no longer prints an error message.
The correct error condition is moved to gr_exec_reg_ops

Bug 200755537

Change-Id: Ib5c6ccd39275b2b06e3f8bce4878a3234478a780
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586228
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-09 09:13:03 -07:00
Sagar Kadamati
dd9b4364aa gpu: nvgpu: add nvgpu-next infrastructure
* As of now, working on multiple chip bringup in nvgpu-next repo has
   an issue because we end with losing control on source code (hard to
   find which part of the code belongs to which chip) and it's valuable
   history this affects chip migration on release.

 * To support multiple chip bringup simultaneously, we need new
   guidelines to avoid losing control on source code and make migration
   easier. This change adds links to nvgpu-next repo.

 * Updated return code to ENODEV for consistency
 * Updated ACR unittest to work with ENODEV return code

NOTE:
     These are the initial set of infrastructure changes, guidelines
     will evolve, and source code will get updated accordingly.

     Based on future chip features, Which part of the source code falls
     under nvgpu-next repo is decided.

JIRA NVGPU-6574

Change-Id: I81827e35d189c55554df00e255b527a4473e0338
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556793
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-08 06:50:38 -07:00
Konsta Hölttä
9ffcb0fade gpu: nvgpu: log submit error reasons
For each common error that may happen in the submit path, log the
failure reason at info level if not already logged. Various mistakes may
cause -EINVAL, and getting to know what is wrong is helpful when writing
tests.

Change-Id: I8ac2a40441e0bf3d8afdb40526b607537eb5105c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587360
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 16:00:50 -07:00
Divya Singhatwaria
b6ab227016 gpu: nvgpu: Enable pmu interrupt
- For secure RISCV boot, enable pmu interrupt
  during pmu_rtos_init
- As interrupts are enabled, PMU intr can be received
  before driver has changed the pmu firmware state. This
  can cause the RISCV boot to fail.
- To resolve this, first change the pmu firmware state
  from off to PMU_FW_STATE_STARTING and then wait
  for pmu priv lockdown release.

Change-Id: Ib2e8b033fec6320bf9ccff02696192a48172464b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586325
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-07 16:00:05 -07:00
dt
152d7c9edd gpu: nvgpu: Fix for pes_tpc_mask programming
After CONFIG_UBSAN kernel compilation flag to know any shifting
cause overflow or not enablement ,this is identified.
The register "gr_fe_tpc_fs_r(gpc_index)" is read only after
Volta. The gops where we are computing the index is not needed.

Bug 200727116

Change-Id: Ib2306103389ba9df77fd59d012ec70e775104989
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573296
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 15:59:48 -07:00
dt
9355345610 gpu: nvgpu: Add IPA-PA cache to increase the performance
When GPU need to programmed with PA(physical address),
given IPA need to be converted to PA by querying Hypervisor.
As this is an IPC between OSes, the call will reduce the
performance badly. So this is adding a IPA-PA cache to improve
the performance. This will be more helpful in passthr config.

Bug 3277194

Change-Id: I6a3230d858977313a0ed0f33068055a3b516330a
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571814
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 10:28:58 -07:00
Ramesh Mylavarapu
ffd0d3962f gpu: nvgpu: gsp: gsp isr and debug trace support
- Created GSP NVRISCV interrupt handle and
  respective functions and register reads.
- Created Debug trace support for GSP firmware.

NVGPU-7084

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I2728150c4db00403aa6e3c043bc19c51677dd9cf
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589430
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 05:37:51 -07:00
Debarshi Dutta
33740b41b6 gpu: nvgpu: free memory during module removal
Following pointers(allocated via Kmalloc/DMA) aren't freed during
module removal.

struct nvgpu_gr_config -> gpc_tpc_mask_physical
struct nvgpu_netlist_vars -> ctxsw_regs.etpc.l
struct mm_gk20a -> sysmem_flush
struct nvgpu_pmu_pg -> pg_buf
SGTable corresponding to VPR secure buffer.

Added appropriate free calls.

Bug 3364181

Change-Id: I2105c1f3256b1910f0f514d98f0ee3ae2e34aff7
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586244
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-02 15:43:07 -07:00
Sagar Kamble
ed16377983 gpu: nvgpu: allocate comptags and store metadata in REGISTER_BUFFER ioctl
To enable userspace query about comptags allocation status of a buffer,
comptags are to be allocated only during buffer registration done by
nvrm_gpu. Earlier, they were allocated during map.

nvrm_gpu will be sending metadata blob to be associated with the buffer.
This will have to be stored in the dmabuf privdata for all the buffers
registered by nvrm_gpu.

This patch moves the privdata allocation to buffer registration ioctl.

Remove g->mm.priv_lock as it is not needed now. This lock was added
to protect dmabuf private data setup. That private data is now
handled through dmabuf->ops and setup of dmabuf->ops is done
under dmabuf->lock.

To support legacy userspace, this patch still allocates comptags on
demand on map calls for unregistered buffers.

Bug 200586313

Change-Id: I88b2ca04c733dd02a84bcbf05060bddc00147790
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480761
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-02 11:42:08 -07:00
Sagar Kamble
7410784b0b gpu: nvgpu: fix clk_arb completion file private data access race
clk_arb completion file descriptor can get closed immediately after
poll finishes in the work item gp10b_clk_arb_run_arbiter_cb. In
that case, the refcount for nvgpu_clk_dev can become zero in
the work item and can lead to invalid access while removing
nvgpu_clk_dev from the lists.

Remove nvgpu_clk_dev from the list before dropping the reference to
it.

Also, delete the nvgpu_clk_dev in completion file release handler
within the session and requests spinlocks to avoid race with
gp10b_clk_arb_run_arbiter_cb using it.

bug 200757277

Change-Id: I054eee547f2a6fa633d7ef55df216ec36647a826
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2569522
(cherry picked from commit ce8548ec05)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587070
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 09:50:11 -07:00
deepak goyal
77d1e765f5 gpu: nvgpu: ga10b: Fix logic for BROM pass status
Current code assumes riscv brom passed if it does not times out.
This patch explicitly checks for brom pass/fail or timeout.

Bug 3361416

Change-Id: I399a6cf9d32be92b24990532f81892642513ba54
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585786
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-31 08:54:35 -07:00
Deepak Nibade
3c97f3b932 gpu: nvgpu: disallow binding more channels than MAX channels supported per TSG
There is HW specific limit on number of channel entries that can be
added for each TSG entry in runlist. Right now there is no checking
to enforce this from SW and hence if User binds more than supported
channels to same TSG, invalid TSG formation error interrupts are
generated.

Fix this by adding appropriate checks in below steps :

- Add new field ch_count to struct nvgpu_tsg to keep track of
  channels bound to TSG.
- Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve
  HW specific maximum channel count per TSG.
- Implement the HAL for gk20a and gv11b chips, and assign new HALs for
  all chips appropriately.
- Increment ch_count while binding the channel to TSG and decrement it
  while unbinding.
- While binding channel to TSG, Check if current channel count is
  already equal to max channel count. If yes, print an error and bail
  out.

Bug 200763991

Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-25 09:47:47 -07:00
Debarshi Dutta
608decf1e6 gpu: nvgpu: add support for powering off gpu
Add support for powering off IGPU for switching between
legacy to SMC mode/vice-versa or changing SMC configuration.
The power off can be issued as follows

echo 0 > /dev/nvgpu/igpu0/power

The following steps are done during a poweroff.
1) Deterministic channel idle
2) Acquire write_lock on l->busy semaphore.
3) Wait till power_usage decrements to indicate 0 active jobs.
4) Invoke pm_runtime_put_sync_suspend()
5) Invoke nvgpu_gr_remove_support() to clear existing GR memory.
6) Release write_lock on l->busy
7) Deterministic channel unidle.

Part of the sequence matches that of the gk20a_do_idle code.
The common parts are extracted into new functions
gk20a_block_new_jobs_and_idle() and gk20a_unblock_jobs()

For joint-rail case, the current implementation, does a railgate
and then sets pm_runtime_set_autosuspend_delay(-1) to disable
regular runtime resume/suspend.

Remove clearing of NVGPU_SUPPORT_MIG status during state change
ias it leads to inconsistencies.

Jira NVGPU-6920

Change-Id: I0b3eb3278176122ac061c1e8a94ebfb3c17c3925
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578501
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:50 -07:00
Debarshi Dutta
2e3c3aada6 gpu: nvgpu: fix deinit of GR
Existing implementation of GR de-init doesn't account for multiple
instances of struct nvgpu_gr. As a fix, below changes are added.

1) nvgpu_gr_free is unified for VGPU as well as native.
2) All the GR instances are freed.
3) Appropriate NULL checks are added when freeing GR memories.
4) 2D, 3D, I2M and ZBC etc are explicitely disabled when MIG is set.
5) In ioctl_ctrl, checks are added to not return error when zbc is NULL
   for VGPU as requests are rerouted to RMserver.

Jira NVGPU-6920

Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:45 -07:00
Mahantesh Kumbar
b9696ee643 gpu: nvgpu: ga10b: update NVRISCV LSPMU
- Set NVRISCV LSPMU app version to 0.
- Setting app version to 0 helps to load and boot
  multiple LSPMU ucode's without modifying the
  NVGPU driver.
- Add support for PMU NVRISCV prod and dbg bin's.
- This is corresponding change to LSPMU MPSK CL
  https://git-master.nvidia.com/r/c/tegra/kernel-firmware-t18x/+/2576049

JIRA NVGPU-7061

Change-Id: I800953ca97af3badde1983aa99e09b4fe7453203
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2575341
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-22 11:05:03 -07:00
Richard Zhao
d8e847c90d gpu: nvgpu: vgpu: fix force preemption from debugfs
check whether there's any force_preemption_gfxp or force_preemption_cilp
set in debugfs when alloc obj_ctx.

Jira GVSCI-4658

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I87fc7e195c9b0f7ed29ec6c37c8f46b456625fea
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579218
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-19 14:06:44 -07:00
Vedashree Vidwans
e13ab1f9ea gpu: nvgpu: pmu: remove hw access from remove_pmu_support
GPU HW registers are locked before remove_pmu_support.
Remove functions accessing HW registers.

Bug 3357477

Change-Id: I34a1923bfdb3afacd462f2646e2821569573a81a
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2577627
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-17 09:45:42 -07:00
Sagar Kamble
f4571194b0 gpu: nvgpu: stop ELPG init thread during unload
ELPG initialization thread creation can fail when the process is killed.
That leads to driver resume failure.

That thread was stopped on suspend and re-created on resume. To avoid
the issue above, don't stop the ELPG thread in suspend and let the
first created thread handle the ELPG state transitions always.
And stop the ELPG thread during unload.

Also fix couple of instances of config flag as:

s/CONFIG_PMU_POWER_PG/CONFIG_NVGPU_POWER_PG

bug 3345977

Change-Id: I8952edf8d1664ed258f238e265002e716d1bf5c2
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573763
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:46 -07:00
Sagar Kamble
40064ef1ec gpu: nvgpu: fix ecc counter free
ECC counter structures are freed without removing the node from the
stats_list. This can lead to invalid access due to dangling pointers.

Update the ecc counter free logic to set them to NULL upon free, to
remove them from stats_list and free them by validation.

Also updated some of the ecc init paths where error was not propa-
gated to callers and full ecc counters deallocation was not done.

Now, calling unit ecc_free from any context (with counters alloc-
ated or not) is harmless as requisite checks are in place.

bug 3326612
bug 3345977

Change-Id: I05eb6ed226cff9197ad37776912da9dcb7e0716d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2565264
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:08 -07:00
mkumbar
de267c034c gpu: nvgpu: ga10b: Enable PKC support
-Enable PKC support in ACR and LS-PMU
-Update the PMU f/w version.
-Enable PMU support by default.

Change-Id: I42bbe1b64ddc6ead9641c97d1ed27a9f4020510a
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2568609
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Goyal <dgoyal@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-08 14:23:36 -07:00
Sagar Kamble
0f59efb2cd gpu: nvgpu: return tpc exceptions error properly
It is observed that recovery on receiving the ESR MMU NACK exception
does not get triggered as the error returned by tpc level handler
is masked.

NACK is marked handled but recovery is not done and subsequent fb
intr handler does not trigger recovery since NACK is handled.

This leaves the HW engines in bad state.

Fix the tpc error return logic to trigger recovery during ESR MMU
NACK exception.

Bug 3318939

Change-Id: I475826f734e4366e853607e1e0338290ee28249b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2564764
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-26 05:13:41 -07:00
Tejal Kudav
b33079d47e gpu: nvgpu: Move intr data members from MC to CIC
Move interrupt specific data-members from common.mc to common.cic
Some of these data members like sw_irq_stall_last_handled_cond need
To be initialized much earlier during the OS specific init/probe stage.
Also, some more members from struct nvgpu_interrupts(like stall_size,
stall_lines[]), which will soon be moved to CIC will also need to be
initialized early during the OS specific probe stage.
However, the chip specific LUT can only be initialized after the
hal_init stage where the HALs are all initialized.
Split the CIC init to accommodate the above initialization requirements.

JIRA NVGPU-6899

Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 18:06:28 -07:00