Commit Graph

1025 Commits

Author SHA1 Message Date
Jinesh Parakh
02b108d26d gpu: nvgpu: Fix Unchecked Return Value bugs
Propagate errors from previously unchecked function calls.
This fixes the following Coverity Defects:

nvlink.c : Unchecked return value
sysfs.c : Unchecked return value
nvlink_probe.c : Unchecked return value
ioctl_nvs.c : Unchecked return value

CID 9847567
CID 9848580
CID 10127940
CID 10129447

Bug 3460991
Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I930bf34a451d6d941359ad76c84cf1fef2df1351
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689111
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-14 10:23:36 -07:00
Sagar Kamble
120a653dd1 gpu: nvgpu: fix untrusted loop bound in clk_set_info ioctl
In gk20a_ctrl_dev_ioctl clk_set_info: An unscrutinized value num_entries
is used as a loop bound. An attacker could control the number of times
the loop iterates.

Loop iterator is signed int which can lead to unpredictable results,
Hence change it to u32. And sanitize the num_entries parameter.

CID 1993996
Bug 3460991

Change-Id: Ib644cf19f016ab80a3f2d66f156ca863f8e138e1
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693942
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-13 14:06:50 -07:00
Jon Hunter
86c0a696ed gpu: nvgpu: Fix build for Linux v5.18
Upstream commit 7938f4218168 ("dma-buf-map: Rename to iosys-map")
renames 'struct dma_buf_map' to 'struct iosys_map' and breaks building
the NVGPU driver with Linux v5.18-rc1. In the NVGPU driver there are
many places where 'dma_buf_map' is used and so to clean-up the code and
minimise the impact of this change, add a gk20a_dmabuf_vmap() and a
gk20a_dmabuf_vunmap() helper function. These new functions support all
kernel versions and eliminate a lot the KERNEL_VERSION ifdefs.

Bug 3598986

Change-Id: Id0f904ec0662f20f3d699b74efd9542d12344228
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693970
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-12 16:34:10 -07:00
Antony Clince Alex
19a8adeae1 gpu: nvgpu: prof: add new resource type
Add new profiler resource type NVGPU_PROFILER_PM_RESOURCE_TYPE_PC_SAMPLER.
Introduce regops HAL get_hwpm_pc_sampler_register_ranges to get
allowlist for PC_SAMPLER resources. Re-generate allowlist files to include
register ranges for PC_SAMPLER resources.

Update uapi header to advertise new resource type
NVGPU_PROFILER_PM_RESOURCE_ARG_PC_SAMPLER.

Bug 3408536

Change-Id: I7009ef822665771eed727da48ef1e89dcc6b9c4b
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689057
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-12 16:30:52 -07:00
Vedashree Vidwans
a112c5d9dd gpu: nvgpu: modify channel wdt for non-si
Increase channel watchdog value for non-si platforms.

Bug 3553564

Change-Id: I42277255599afb09b11f8321ca9b2f124f502933
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2420872
Tested-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-07 23:56:53 -07:00
Divya
0946df9865 gpu: nvgpu: add aelpg flag check in sysfs node
- When read from aelpg_param_read sysfs node and
  write to aelpg_param_store sysfs node is done it
  leads to system crash.
- This issue is seen on safety build as power features
  are not enabled.
- To avoid this crash, add aelpg platform flag check in
  aelpg_param_read and aelpg_param_store sysfs nodes.
- Also, as AELPG depends on ELPG add can_elpg check before
  enabling/disabling aelpg through sysfs node.

Bug 3582946

Change-Id: Iaf709db2b5dc0340390767f4b06a0ac06962ed77
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2690548
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-07 12:39:36 -07:00
Sagar Kamble
ad85b60bb0 gpu: nvgpu: use nvmem API to read fuses
Replace the usage of tegra_fuse_readl with nvmem_cell_read_u32 for the
below fuse registers added as nvmem cells on v5.10+ kernels.

Older nvidia kernels do not have these tegra nvmem cell support.

1. FUSE_GCPLEX_CONFIG_FUSE_0
2. FUSE_RESERVED_CALIB0_0
3. FUSE_PDI0
4. FUSE_PDI1

bug 200633045

Change-Id: I187400720929233fcbc1970c9bbed34347b0a9a7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2670828
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-07 12:35:22 -07:00
Jinesh Parakh
bbaf01590c gpu: nvgpu: Fix Logically dead code Coverity bugs
Fixed following Coverity Defects:

ioctl_clk_arb.c : Logically dead code
gr_gp10b.c : Logically dead code
vfe_var.c : Logically dead code
grmgr_ga10b.c : Logically dead code
vm_remap.c : Logically dead code
falcon_debug.c : Logically dead code

CID 1994001
CID 3008644
CID 9870823
CID 10062537
CID 10127915
CID 10128008

Bug 3460991
Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I711d2ccb480328d8f0a4ba49e877612669f3d41f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2686362
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-28 07:36:44 -07:00
Jinesh Parakh
d4cb2eb3c0 gpu: nvgpu: Fix Dereference Coverity issues
Fixed following Coverity Defects:

fw.c : Dereference after null check
channel.c : Dereference before null check
log.c : Dereference before null check

CID 10064128
CID 10056456
CID 10127934

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I9c075f5c38c2254d5c656af58bb002714bd53396
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2685320
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-28 07:36:10 -07:00
Konsta Hölttä
e9d453806c gpu: nvgpu: move duplicate timer api to common
The high level API for the timer unit is the same across all OSs, so
get rid of the slight code duplication by moving the timer init
functions under a new file in common code:

- nvgpu_timeout_init_cpu_timer
- nvgpu_timeout_init_cpu_timer_sw
- nvgpu_timeout_init_retry

Much of the timer logic is also duplicated, but it is mixed between OS
specific current time retrieval. With some refactoring and addition of
an OS independent time keeping layer, that logic could also be made
shared.

Change-Id: I75d02ceb0d32022b0ba7f3bcd9fdb13d47039dbc
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2669510
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-25 21:33:21 -07:00
Divya Singhatwaria
7ff977063b gpu: nvgpu: add elpg protection for tpc_enabled_exceptions
- DeviceGetTpcExceptionEnMask test calls ioctl
  NVGPU_GPU_IOCTL_GET_TPC_EXCEPTION_EN_STATUS which reads
  register gr_gpc0_tpc0_tpccs_tpc_exception_en_r(). This causes
  IDLE_SNAP and further disengages ELPG.
- Add elpg protected call for the tpc_enabled_exceptions HAL.

Bug 3522086

Change-Id: I137ac2c643c693b596b6ce3e879da9c786ee3a85
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2674509
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-23 20:59:58 -07:00
Rajesh Devaraj
a06b5ff2d2 gpu: nvgpu: update error reporting in av+l
MISC_EC is not supported in platforms like L4T. In such case, -ENODEV
error code will be returned by Safety_Services. This patch updates
error reporting to consider this scenario.

JIRA NVGPU-8094
Bug 3366818
Bug 3491596

Change-Id: I0316571fe44418e738ae784da0584cf1040cb6cb
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2684283
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-23 14:02:00 -07:00
Rajesh Devaraj
4652f96a6f gpu: nvgpu: add polling for back-to-back error reporting in av+l
When an error is reported to Safety_Services, it will be cleared
at FSI and reported to SEH (System Error Handler). Since MISC_EC
interface provides only one register for error reporting, there
is a need to poll the status of previously reported error before
reporting the next error. For this purpose, this patch adds logic
to perform polling using epl_get_misc_ec_err_status(), in AV+L.

JIRA NVGPU-8094
Bug 200729736

Change-Id: Ia01a2fc42a7ce586b7965a82c90027a9a2dd252b
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2684141
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-21 10:51:39 -07:00
Rajesh Devaraj
86be5112b2 gpu: nvgpu: plug-in misc-ec interface for av+l
This patch adds misc-ec interface into NvGPU driver to report GPU HW
errors to Safety_Services, in AV+L. For this purpose, it introduces
a new flag "CONFIG_NVGPU_ENABLE_MISC_EC".

JIRA NVGPU-8094

Change-Id: Id8fff69487cad9ed4eb082a7d8615a1e15867ffa
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678394
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
2022-03-18 07:54:53 -07:00
Antony Clince Alex
40231858a5 gpu: nvgpu: add enable flag for gpu emulate mode
Introduce enable flag NVGPU_SUPPORT_EMULATE_MODE, and bring emulate mode
feature under this flag. At present, gpu emulate mode is only support on
ga10b.

Jira NVGPU-8120

Change-Id: I85269992926c3cf8f2d1dd70882979e1c4656984
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2681613
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-17 19:48:03 -07:00
prsethi
3651d1150d gpu: nvgpu: update kmdi interfaces
Patch udpates/fixes following issues.
- Updates nvgpu_dbg_gpu_get_mappings_entry.size to u64 to address >4G
limitations.
- Removes offset from original cpuva and unmaps only original mapped
address.
- Call nvgpu_vm_find_mapped_buf_range() in place of
nvgpu_vm_find_mapped_buf() to find the addresses which are not
page aligned.
- Update logic to parse the gpuva while trying to find gpu mappings so
that gpuva which are more than the mapped buffer base address can also
be considered.

Bug 200722275

Change-Id: If33d85db37a9f03a662984c212544a8b2ade471c
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2612129
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-17 10:15:04 -07:00
Seshendra Gadagottu
ad884ffa53 gpu: nvgpu: ga10b: enable rail gating
Bug 3514055

Change-Id: Ie7c268e7555bab6f7f0872c2774be39893d6459e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590816
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Antony Clince Alex <aalex@nvidia.com>
2022-03-16 14:48:10 -07:00
Sagar Kamble
373167883e gpu: nvgpu: add write barrier after setting notifier info32
While checking the GPU error status, userspace polls on the error
notifier 'status' and then verifies 'status' and 'info32'. nvgpu
sets 'info32' before 'status', so put a write barrier between
those two writes for the consistency between userspace and
kernel view of the error notifier state.

JIRA NVGPU-7538
Bug 200717195
Bug 3250920

Change-Id: I92ac0589283fee823f3366ac594d03b8f27f3590
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2680320
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2022-03-16 08:19:50 -07:00
Konsta Hölttä
4b4ee3802a gpu: nvgpu: coverity-clean nvs sprintf
Cast the return code of sprintf away when creating a scheduling domain
device name. While at it, use snprintf just in case.

CID 497476
CID 497479

Bug 3512545

Change-Id: I4e26c29b889de4b709d582ec3fdde28c50fca5b9
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2681274
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-15 04:28:09 -07:00
Martin Radev
30c434efc1 gpu: nvgpu: check for extra mapping flags
User space may erroneously provide extra flags for buffer
mappings which always get silently ignored without any
error. It would be better to return error to catch cases
of ABI mismatches.

This patch checks if any extra flags were provided and
returns error if this is the case.

Jira NVGPU-6640
Bug 3489827

Change-Id: Ib226f049f81bef48bf00656259ed97ba0a3eb47c
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2676684
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-09 21:07:59 -08:00
Martin Radev
8cba20ff42 gpu: nvgpu: sanitize TEGRA_RAW attribute
TEGRA_RAW mappings are only supposed to be used in special
circumstances by user space software, and the request cannot
be treated as a hint and be silently ignored if the feature
is not supported.

This patch updates the logic to return error if the feature is
not supported.

Jira NVGPU-6640
Bug 3489827

Change-Id: Ia2ce71df0202ab0c8676b815cf887cc7300aa07f
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2676168
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-09 21:07:52 -08:00
Debarshi Dutta
cb70e86ac1 gpu: nvgpu: Allow SC7 suspend/resume
Allow SC7 suspend/resume for platforms even if runtime pm
is disabled.

Currently, nvgpu can disable runtime pm by setting railgate_init
field to false for platform_{gk20a/gv11b/ga10b) files. This is done by
taking extra reference count in the PM Framework.

However, device suspend would still fail. Fix this by checking
for NVGPU_CAN_RAILGATE and removing the additional reference count
taken as mentioned above. Take the extra refcount back at
the end of the resume path.

Bug 3458643

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I413e09e2f9f380d78c0ce30196591e9c5b7544f3
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668567
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-09 21:04:59 -08:00
Antony Clince Alex
c0f4723339 gpu: nvgpu: perbuf: update PMA buffer mapping
The PMA unit can only access GPU VAs within a 4GB window, hence both
the user allocated PMA buffer and the kernel allocated bytes available
buffer should lie in the same 4GB window. This is accomplished by
carving out and reserving a 4GB VA space in perbuf.vm and using fixed
GPU VAs to ensure that both buffers are bound within the same 4GB window.

In addition, update ALLOC_PMA_STREAM to use pma_buffer_offset,
pma_buffer_map_size fields correctly.

Bug 3503708

Change-Id: Ic5297a22c2db42b18ff5e676d565d3be3c1cd780
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671637
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-07 15:17:35 -08:00
srajum
8e56c73eab gpu: nvgpu: fixing MISRA Rule 21.2 violation
- "va_start", "time" a reserved identifiers or macro names described
  in Section 7, "Library", of the C standard, shall not be declared.

JIRA NVGPU-6536

Change-Id: I868362819dd7178eb7b165f243fb6d36322d8372
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582291
(cherry picked from commit 29c2c55b184cf16aee51614da895747750217885)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2674867
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 06:08:00 -08:00
Debarshi Dutta
5c0dc7e39d gpu: nvgpu: add support for disabling l3 via DT
On volta the GPU determines whether to do L3 allocation for a mapping by
checking bit 36 of the physical address. So if a mapping should allocate
lines in the L3 this bit must be set.

However, when the physical addresses for 64GB of RAM uses the 36th bit
resulting in a conflict. Thus, add support for disabling l3 support
for SKUs having 64GB of physical memory.

Bug 3486025
Bug 3469094

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Ic540e754274cf1d9e6625493962699d21509e540
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2661548
(cherry picked from commit 46b43d2b24)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2661542
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: Brad Griffis <bgriffis@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 06:05:58 -08:00
Konsta Hölttä
2a8914619d gpu: nvgpu: bind sched domains as fds
Replace id-based lookup with fd-based lookup when binding a TSG to a
domain. The device node based domain interface naturally provides access
control; this way userspace tools can limit which uid/gid can access
each domain.

Also, explicitly disallow binding channels to a TSG that has no runlist
domain yet. Normally a TSG is in the default domain if nothing else has
been specified, but the default domain can be deleted.

Jira NVGPU-6788

Change-Id: I2af96dfc002367d894eaf0c175006332f790c27f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651165
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 00:08:55 -08:00
Konsta Hölttä
3a64fdefc4 gpu: nvgpu: domains as files for access control
Create device nodes for user-created scheduling domains. This helps
leverage filesystem based access control: domains can be chosen to be
available for a limited set of users on a system.

The device nodes are dynamic: they can be removed while the driver is
running normally. This is a bit different from the nodes that exist
until the driver is unloaded, so the devno/domain mapping is stored in a
separate list. The usual container_of pattern would suffer from an
unavoidable race condition if a domain file was opened while the same
domain would get removed.

As usual, domain refcounting prevents a domain from being removed. Now
the open device files hold refs and thus any open domain files prevent a
domain from getting removed, in addition to the userspace-invisible ref
that is taken when a TSG is bound to a domain.

While at it, make the query ioctl guarded by the sched domain mutex, as
domains might technically get added or removed during the querying code.

Jira NVGPU-6788

Change-Id: Ief2a09a442c4e70f1f2be8a32359341071d74659
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651164
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 00:08:49 -08:00
Konsta Hölttä
beed6d3c2b gpu: nvgpu: add nvgpu_get_v2_user_class()
Add a function to find the nvgpu_class of the v2 user device nodes. This
is the last entry in the class list, as the devices are created in that
order.

The v2 user class is not defined when MIG is enabled because there are
multiple logical devices; bigger changes would be needed for this.

Jira NVGPU-6788

Change-Id: I2177c1e5b4d0bbec77a4e258391859242b4f20d6
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2674427
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 00:08:43 -08:00
Konsta Hölttä
f11ca4c300 gpu: nvgpu: expose device creation
Allow gk20a_create_device() to happen outside the main ioctl logic and
rename it to have the modern nvgpu_ prefix. Add a separate function to
do cdev allocation and refactor the existing two callers slightly to
avoid repetition on the cdev struct initialization.

As a side effect, this modification fixes the error path that used to
not return an error if adding a device fails and also leaked the
allocated cdev memory.

Jira NVGPU-6788

Change-Id: Ia1f018b88d78fafdfcf4e95f6aa66e2368e58974
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2674426
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 00:08:37 -08:00
Konsta Hölttä
82df5b0219 gpu: nvgpu: track cdev minor numbers
The existing Linux character device nodes are statically configured
once. For other dynamically created devices, track the next minor number
in nvgpu_os_linux as a rudimentary allocator.

Only a small number of increments are expected at this time; in the
future, a bitmap might be more appropriate for tracking out-of-order
deallocations too.

Jira NVGPU-6788

Change-Id: I016ee8471313086620f9ab371583d6763848b0e2
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651163
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-01 00:08:31 -08:00
Konsta Hölttä
086909ddd0 gpu: nvgpu: use correct err from device_create
When device_create fails, take PTR_ERR from the subdev that was
returned. Commit e8bac374c0 ("gpu: nvgpu: Use device instead of
platform_device") refactored this code but forgot to rename the error
retrieval.

Change-Id: Id01adac431da77a71c8e71e1b01a065826f5ebcf
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2673712
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-28 10:53:30 -08:00
Dinesh T
ef2a2be44f gpu: nvgpu: Add compression support with added contig memory pool
This is adding compression support for Ampere gpus by
the given contig memory pool.

Bug 3426194

Change-Id: I1c2400094296eb5448fe18f76d021a10c33ef861
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2673581
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-27 18:10:41 -08:00
Divya
05a1f927f8 gpu: nvgpu: add golden image check for tpc_pg_mask
- Setting different tpc_pg_mask value leads to GPU crash.
- It is observed that with GPU railgating disabled, if
  tpc_pg_mask is set, "the gpu is powered on" error is
  reported and it won't allow to set the tpc_pg_mask, which
  is expected.
- With GPU railgating enabled, the different tpc_pg_mask
  value is set and the GPU is crashed.
- So, add check for golden image initialized before setting the
  TPC, GPC and FBP PG mask.
- This check won't allow to update TPC, GPC and FBP mask after
  golden image initialization and thus no GPU crash happens.

Bug 3544499

Change-Id: Ia003beaaec9dead22da74ea5862a81986780966b
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2672202
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Ninad Malwade <nmalwade@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: Ninad Malwade <nmalwade@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-22 05:57:37 -08:00
Shashank Singh
19a3b86f06 gpu: nvgpu: remove unused code from common.nvgpu on safety build
- remove unused code from common.nvgpu unit on safety build. Also,
remove the code which uses them in other places.
- document use of compiler intrinsics as mandated in code inspection
  checklist.

Jira NVGPU-6876

Change-Id: Ifd16dd197d297f56a517ca155da4ed145015204c
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561584
(cherry picked from commit 900391071e9a7d0448cbc1bb6ed57677459712a4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561583
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-17 04:58:32 -08:00
Rajesh Devaraj
0699220b85 gpu: nvgpu: compile-out unused apis from safety build
This patch does the following changes:
- Compiles-out unused error reporting APIs and the related
  data structures from safety build. For this purpose, it
  introduces the new flag: CONFIG_NVGPU_INTR_DEBUG
- Updates nvgpu_report_err_to_sdl() API with one more argument,
  hw_unit_id. This aids in finding whether an error to be reported
  is corrected or uncorrected from LUT.
- Triggers SW quiesce, if an uncorrected error is reported to
  Safety_Services, in safety build.
- Renames files in cic folder by replacing gv11b with ga10b,
  since error reporting for gv11b is not supported in dev-main.

JIRA NVGPU-8002

Change-Id: Ic01e73b0208252abba1f615a2c98d770cdf41ca4
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668466
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-14 22:00:33 -08:00
Konsta Hölttä
81c220b95b gpu: nvgpu: use %pS for function pointers
%pF is obsolete. Use %pS when debug printing function symbols. (One
print in kmem was already using this.)

Bug 3532466

Change-Id: Id3994abbcb0dc2495e69f3c872149c6ea5e3b5cb
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2667999
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-11 18:27:39 -08:00
Debarshi Dutta
3d01b89e68 gpu: nvgpu: expose physical masks for GPCS/FBPs for MIG
Following changes are added
1) nvgpu_gr_config->gpc_tpc_mask_physical is now indexed by physical
gpc id instead of logical id.
2) Removed the conversion of logical fbp ids and replace them with
physical ids.
3) nvgpu_gpu_instance->fbp_en_mask now contains the mask of physical fbp ids.
4) gk20a_ctrl_ioctl_gpu_characteristics returns gpu.gpc_mask returns mask
of physical ids.

Bug 200712091

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0e066df76e07203ff4a5be5bfff2cef8566b425d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2648831
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-11 13:28:50 -08:00
srajum
852717ccc1 gpu: nvgpu: add GPLv2 license to OS-specific code for linux
Bug 3384871

Change-Id: Ibc7be6d0a8985a87f70b352f2d9e5c233015c2a2
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632438
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-09 20:50:21 -08:00
Rajesh Devaraj
7dc013d242 gpu: nvgpu: merge error reporting apis
In DRIVE 6.0, NvGPU is allowed to report only 32-bit metadata to
Safety_Services. So, there is no need to have distinct APIs for
reporting errors from units like GR, MM, FIFO to SDL unit. All
these error reporting APIs will be replaced with a single API. To
meet this objective, this patch does the following changes:
- Replaces nvgpu_report_*_err with nvgpu_report_err_to_sdl.
- Removes the reporting of error messages.
- Replaces nvgpu_log() with nvgpu_err(), for error reporting.
- Removes error reporting to Safety_Services from nvgpu_report_*_err.

However, nvgpu_report_*_err APIs and their related files are not
removed. During the creation of nvgpu-mon, they will be moved under
nvgpu-rm, in debug builds.

Note:
- There will be a follow-up patch to fix error IDs.
- As discussed in https://nvbugs/3491596 (comment #12), the high
level expectation is to report only errors.

JIRA NVGPU-7450

Change-Id: I428f2a9043086462754ac36a15edf6094985316f
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662590
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-09 00:41:18 -08:00
Ramesh Mylavarapu
9302b2efee gpu: nvgpu: gsp units separation
Separated gsp unit into three unit:
- GSP unit which holds the core functionality of GSP RISCV core,
  bootstrap, interrupt, etc.
- GSP Scheduler to hold the cmd/msg management, IPC, etc.
- GSP Test to hold stress test ucode specific support.

NVGPU-7492

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I12340dc776d610502f28c8574843afc7481c0871
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2660619
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-09 00:38:21 -08:00
Chris Johnson
14ed75e857 gpu: nvgpu: fix REMAP to support small/big pages
Initially, REMAP only worked with big pages but in some cases
only small pages are supported where REMAP functionality is
also needed.

This cleans up some page size assumptions. In particular, on a
remap request, the nvgpu_vm_area is found from the passed in VA,
but can only be done from virt_offset_in_pages if we're also
told the page size.

This now occurs from _PAGESIZE_ flags which are required by
both map and unmap operations.

Jira NVGPU-6804

Change-Id: I311980a1b5e0e5e1840bdc1123479350a5c9d469
Signed-off-by: Chris Johnson <cwj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2566087
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-09 00:37:33 -08:00
Konsta Hölttä
8736c0d467 gpu: nvgpu: add and use sw-only timers
The nvgpu timeout API has an internal override for presilicon mode by
default: in presi simulation environments the timeouts never trigger.
This behaviour is intended in the original usecase of the timer unit
with hardware polling loops. In pure software logic though, the timer
must trigger after the specified timeout even in presi mode so add a new
init function to produce a timer for software logic. Use this new kind
of timer in channel and scheduling worker threads.

The channel worker currently times out for just the purpose of the
channel watchdog timer which has its own internal timer. Although that's
just software, the general expectation is that the watchdog does not
trigger in presilicon tests that run slower than usual. The internal
watchdog timer thus keeps the non-sw mode.

Bug 3521828

Change-Id: I48ae8522c7ce2346a930e766528d8b64195f81d8
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662541
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-04 22:02:33 -08:00
Antony Clince Alex
e96746cfcd gpu: nvgpu: profiler: update PMA stream free policy
Update PMA stream free policy to implicitly unbind any resources already
bound to the profiler object.

Bug 3480919

Change-Id: I71ed4b73be295a86046a1384800e7ed0f2430f64
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662361
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-02 21:47:33 -08:00
Dinesh T
e33bdceb8b gpu: nvgpu: Unify ivm mempool
CBC contig allocation requires mempool node in DT and the
node can be used for contig allocations. The code duplication
can be avoided by unifying the code from vgpu.

Change-Id: I6eaa1d0c9db47b158602bf0ba68ce4e09cf487a7
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650459
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-01 09:50:45 -08:00
Sagar Kamble
29a0a146ac gpu: nvgpu: fix coverity defects
Fix following coverity defects:
  ioctl_prof.c resource leak
  ioctl_dbg.c logically dead code
  global_ctx.c identical code for branches
  therm_dev.c resource leak
  pmu_pstate.c unused value
  nvgpu_mem.c dead default in switch
  tsg.c Dereference before null check
  nvlink_gv100.c logically dead code
  nvlink.c Out-of-bounds write
  fifo_vgpu.c Dereference null return value
  pmu_pg.c Dereference before null check
  fw_ver_ops.c Identical code for different branches
  boardobjgrp.c Dereference after null check
  boardobjgrp.c Dereference before null check
  boardobjgrp.c Dereference after null check
  engines.c Dereference before null check
  nvgpu_init.c Unused value

CID 10127875
CID 10127820
CID 10063535
CID 10059311
CID 10127863
CID 9875900
CID 9865875
CID 9858045
CID 9852644
CID 9852635
CID 9852232
CID 9847593
CID 9847051
CID 9846056
CID 9846055
CID 9846054
CID 9842821

Bug 3460991

Change-Id: I91c215a545d07eb0e5b236849d5a8440ed6fe18d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2657444
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-28 04:50:12 -08:00
Tejal Kudav
4f41ce7696 gpu: nvgpu: Disable frequency scaling for AV+L
NVGPU does not support frequency scaling on hypervisor
based embedded environments.Disable frequency scaling on AV+L
using the nvgpu_is_hypervisor_mode().

JIRA NVGPU-7283

Change-Id: If8fbcc0c5e2f11b9e8895825bb3b3022e7bd3005
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2654969
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Kasinadha Dendukuri <kdendukuri@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-17 05:37:23 -08:00
Sagar Kadamati
a3ed73a57c gpu: nvgpu: add tegra_raw support
* This change adds NVGPU_AS_MAP_BUFFER_FLAGS_TEGRA_RAW flag
   to control buffer format
 * Add NVGPU_SUPPORT_TEGRA_RAW enabled flag to indicate if feature
   is enabled for a given chip.
 * Update gv11b_gpu_phys_addr function to set TEGRA_RAW bit

Jira NVGPU-6640
Bug 3489827

Change-Id: I959c22bef906bb9c6dcdc8d5f5e9951ad9937a60
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2545128
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-13 12:35:36 -08:00
Debarshi Dutta
1c053a75af gpu: nvgpu: remove unnecessary warning.
Here, the freq_counter is set to track the count of number of
frequencies enumerated and capped by GP10B_MAX_SUPPORTED_FREQS.
There is an early terminating condition when new_rate equals max_rate.

The line following this is set to
WARN_ON(freq_counter == GP10B_MAX_SUPPORTED_FREQS);

This line is probably incorrect and contradicts the above loop as in
there is definite probability of freq_counter equaling
GP10B_MAX_SUPPORTED_FREQS. Probably the original intention might have
been to catch an off-by-1 error where freq_counter equals
GP10B_MAX_SUPPORTED_FREQS + 1.

Even then instead of printing a warning message, a better idea is to
handle the possible bug in the code itself.

Bug 3407276

Change-Id: I7f2a9d5c41be62227d08045e959e16c4228fbff4
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623380
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-12 07:40:37 -08:00
Deepak Nibade
7f839d6098 gpu: nvgpu: take power refcount for pma stream update get/put IOCTL
Add gk20a_busy()/idle() protection for pma stream update get/put IOCTL
NVGPU_PROFILER_IOCTL_PMA_STREAM_UPDATE_GET_PUT

Bug 2510974

Change-Id: Iade198f68e72f6fbc49be8ee55e4b44a4c332451
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650588
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-07 06:30:24 -08:00
Sagar Kamble
535a27411a gpu: nvgpu: fix allocator debugfs deinit
Allocator (bitmap, buddy, page) debugfs files are not cleaned up when
the allocators are destroyed. This leads to warning logs from nvgpu
like below:

[21073.493000] debugfs: File 'gk20a_as_17' in directory 'allocators' already present!
[21073.493026] debugfs: File 'gk20a_as_17-sys' in directory 'allocators' already present!

Remove the per-allocator debugfs node when destroying an allocator in
runtime.

While at this, add missing nvgpu_allocator locking to the function
nvgpu_bitmap_alloc_destroy. And create nop functions for the
functions nvgpu_init_alloc_debug and nvgpu_fini_alloc_debug
when CONFIG_DEBUG_FS is not defined to avoid adding the
CONFIG checks at multiple places.

Move gk20a_debug_deinit to the end of gk20a_free_cb called in nvgpu_put
as that tears down all debugfs entries. Allocator destroy happens as
part of nvgpu_put call and it can lead to invalid debugfs dentry
access if gk20a_debug_deinit is called before it.

Bug 3481097

Change-Id: I8a66bcf6ade7e5707f9207c78a54d12d7bd94c02
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2648012
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-07 06:28:53 -08:00