Commit Graph

33 Commits

Author SHA1 Message Date
Ramalingam C
4a95d2b7ea gpu: nvgpu: allow powergating during bind_context ioctl
Do not check for powergate disablement for bind and unbind of the context
ioctls. As bind and unbind of context is just a sw statemachine related
ops we can ignore even if the power gating is still on. No need to print
the error message.

Bug 4397568

Change-Id: Idc71e12a4188cbc1c94c38032a2f1ed435d278ce
Signed-off-by: Ramalingam C <ramalingamc@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3034385
(cherry picked from commit 5f1dfe20366637f59b04b8968a527dbc3fac0d9f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3035549
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Tested-by: Tushar Kashalikar <tkashalikar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-12-26 22:49:40 -08:00
Antony Clince Alex
0b0b68be72 gpu: nvgpu: prof: print error if profiling with PG enabled
Prior to starting any profiling session, GPU power features should be
disabled, not doing this can result in unexpected behavior, print a
error message under such situations.

Jira NVGPU-7326
Bug 3579926

Change-Id: Ia9716d73ed945a588792a7d843aed64bf3b7f83b
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2701501
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-25 09:02:29 -07:00
Sagar Kamble
b58363d36f gpu: nvgpu: fix resource leak in nvgpu_prof_ioctl_vab_flush
Allocated user_data is not freed on error from gk20a_busy. Free it.

CID 10129907
Bug 3460991

Change-Id: Ic30b3cee1028a4829360fb591b90f29f74e6bdce
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693941
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-14 17:03:09 -07:00
Jon Hunter
86c0a696ed gpu: nvgpu: Fix build for Linux v5.18
Upstream commit 7938f4218168 ("dma-buf-map: Rename to iosys-map")
renames 'struct dma_buf_map' to 'struct iosys_map' and breaks building
the NVGPU driver with Linux v5.18-rc1. In the NVGPU driver there are
many places where 'dma_buf_map' is used and so to clean-up the code and
minimise the impact of this change, add a gk20a_dmabuf_vmap() and a
gk20a_dmabuf_vunmap() helper function. These new functions support all
kernel versions and eliminate a lot the KERNEL_VERSION ifdefs.

Bug 3598986

Change-Id: Id0f904ec0662f20f3d699b74efd9542d12344228
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693970
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-12 16:34:10 -07:00
Antony Clince Alex
19a8adeae1 gpu: nvgpu: prof: add new resource type
Add new profiler resource type NVGPU_PROFILER_PM_RESOURCE_TYPE_PC_SAMPLER.
Introduce regops HAL get_hwpm_pc_sampler_register_ranges to get
allowlist for PC_SAMPLER resources. Re-generate allowlist files to include
register ranges for PC_SAMPLER resources.

Update uapi header to advertise new resource type
NVGPU_PROFILER_PM_RESOURCE_ARG_PC_SAMPLER.

Bug 3408536

Change-Id: I7009ef822665771eed727da48ef1e89dcc6b9c4b
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689057
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-12 16:30:52 -07:00
Antony Clince Alex
c0f4723339 gpu: nvgpu: perbuf: update PMA buffer mapping
The PMA unit can only access GPU VAs within a 4GB window, hence both
the user allocated PMA buffer and the kernel allocated bytes available
buffer should lie in the same 4GB window. This is accomplished by
carving out and reserving a 4GB VA space in perbuf.vm and using fixed
GPU VAs to ensure that both buffers are bound within the same 4GB window.

In addition, update ALLOC_PMA_STREAM to use pma_buffer_offset,
pma_buffer_map_size fields correctly.

Bug 3503708

Change-Id: Ic5297a22c2db42b18ff5e676d565d3be3c1cd780
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671637
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-07 15:17:35 -08:00
Antony Clince Alex
e96746cfcd gpu: nvgpu: profiler: update PMA stream free policy
Update PMA stream free policy to implicitly unbind any resources already
bound to the profiler object.

Bug 3480919

Change-Id: I71ed4b73be295a86046a1384800e7ed0f2430f64
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662361
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-02 21:47:33 -08:00
Sagar Kamble
29a0a146ac gpu: nvgpu: fix coverity defects
Fix following coverity defects:
  ioctl_prof.c resource leak
  ioctl_dbg.c logically dead code
  global_ctx.c identical code for branches
  therm_dev.c resource leak
  pmu_pstate.c unused value
  nvgpu_mem.c dead default in switch
  tsg.c Dereference before null check
  nvlink_gv100.c logically dead code
  nvlink.c Out-of-bounds write
  fifo_vgpu.c Dereference null return value
  pmu_pg.c Dereference before null check
  fw_ver_ops.c Identical code for different branches
  boardobjgrp.c Dereference after null check
  boardobjgrp.c Dereference before null check
  boardobjgrp.c Dereference after null check
  engines.c Dereference before null check
  nvgpu_init.c Unused value

CID 10127875
CID 10127820
CID 10063535
CID 10059311
CID 10127863
CID 9875900
CID 9865875
CID 9858045
CID 9852644
CID 9852635
CID 9852232
CID 9847593
CID 9847051
CID 9846056
CID 9846055
CID 9846054
CID 9842821

Bug 3460991

Change-Id: I91c215a545d07eb0e5b236849d5a8440ed6fe18d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2657444
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-28 04:50:12 -08:00
Deepak Nibade
7f839d6098 gpu: nvgpu: take power refcount for pma stream update get/put IOCTL
Add gk20a_busy()/idle() protection for pma stream update get/put IOCTL
NVGPU_PROFILER_IOCTL_PMA_STREAM_UPDATE_GET_PUT

Bug 2510974

Change-Id: Iade198f68e72f6fbc49be8ee55e4b44a4c332451
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650588
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-07 06:30:24 -08:00
Martin Radev
b67a3cd053 gpu: nvgpu: ga10b: Correct VAB implementation
This patch performs the following improvements for VAB:
1) It avoids an infinite loop when collecting VAB information.
   Previously, nvgpu incorrectly assumed that the valid bit would
   be eventually set for the checker when polling. It may not be set
   if a VAB-related fault has occurred.
2) It handles the VAB_ERROR mmu fault which may be caused for various
   reasons: invalid vab buffer address, tracking in protected mode,
   etc. The recovery sequence is to set the vab buffer size to 0 and
   then to the original size. This clears the VAB_ERROR bit. After
   reseting, the old register values are again set in the recovery
   code sequence.
3) Use correct number of VAB buffers. There's only one VAB buffer on
   ga10b, not two.
4) Simplify logic.

Bug 3374805
Bug 3465734
Bug 3473147

Change-Id: I716f460ef37cb848ddc56a64c6f83024c4bb9811
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621290
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-22 08:22:13 -08:00
Seshendra Gadagottu
820d2b4a2d gpu: nvgpu: keep gpu busy during vab reserve/flush ioctls
VAB IOCTL for reserve and flush can take more than railgate_delay(500msec).
To avoid gpu entering into railgate, set gpu busy hint during entry and set to
idle at the end of ioctl.

Bug 3468562

Change-Id: I9219e65004ad42028062ce09a315d9fde029a86c
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2643418
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-21 12:20:48 -08:00
Antony Clince Alex
f9cac0c64d gpu: nvgpu: remove nvgpu_next files
Remove all nvgpu_next files and move the code into corresponding
nvgpu files.

Merge nvgpu-next-*.yaml into nvgpu-.yaml files.

Jira NVGPU-4771

Change-Id: I595311be3c7bbb4f6314811e68712ff01763801e
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547557
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:53 -07:00
Antony Clince Alex
c7d43f5292 gpu: nvgpu: remove usage of CONFIG_NVGPU_NEXT
The CONFIG_NVGPU_NEXT config is no longer required now that ga10b and
ga100 sources have been collapsed. However, the ga100, ga10b sources
are not safety certified, so mark them as NON_FUSA by replacing
CONFIG_NVGPU_NEXT with CONFIG_NVGPU_NON_FUSA.

Move CONFIG_NVGPU_MIG to Makefile.linux.config and enable MIG support
by default on standard build.

Jira NVGPU-4771

Change-Id: Idc5861fe71d9d510766cf242c6858e2faf97d7d0
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547092
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:47 -07:00
Richard Zhao
9b66fca165 gpu: nvgpu: move .exec_regops to only execute regops
HAL .exec_regops used to first validate regops then execute it, now
moving it to only execute the regops.

- It helps B0CC on HV. On server side it does not track profiler object,
but regops validation uses the profiler, so moving validation to client
side.
- The change also remove ctx_buffer_offset checking in
validate_reg_op_offset. The offset already checked again whitelists
which have be verified when update whitelist. Also vgpu does not have
information of ctx and golden image.
- Added function nvgpu_regops_exec to cover both regops validation and
execution.

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I434e027290e263a8a64a25a55500f7294038c9c4
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534252
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:40 -07:00
Lakshmanan M
df87591b7d gpu: nvgpu: Add multi gr handling for debugger and profiler
1) Added multi gr handling for dbg_ioctl apis.
2) Added nvgpu_assert() in gr_instances.h (for legacy mode).
3) Added multi gr handling for prof_ioctl apis.
4) Added multi gr handling for profiler.
5) Added multi gr handling for ctxsw enable/disable apis.
6) Updated update_hwpm_ctxsw_mode() HAL for multi gr handling.

JIRA NVGPU-5656

Change-Id: I3024d5e6d39bba7a1ae54c5e88c061ce9133e710
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538761
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-04 18:07:47 -07:00
Martin Radev
8834275906 gpu: nvgpu: Validate PMA buffer size
The original code would only truncate the size to 32
bits and later write the value to a hw register. Let's
check that the user-provided buffer is large enough.

Bug 2510974

Change-Id: I8b14a07a46d30c0b8c7ea63e5bdef53fbd19ec6f
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527148
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:30:35 -07:00
Sami Kiminki
3aceed2db1 gpu: nvgpu: add changes for nvgpu-next
- Add new UAPI IOCTLs.
- Add nvgpu-next gops in fb and gr.
- Initialize and teardown vab during mm_support

Bug 2999621

Change-Id: Icc241f1a234bfee3fd20dc69b42c92e0af6d445c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2447064
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-22 07:35:34 -07:00
Jon Hunter
ddf8f12197 gpu: nvgpu: Add support for Linux v5.11
For Linux v5.11, commit 6619ccf1bb1d ("dma-buf: Use struct dma_buf_map
in dma_buf_vmap() interfaces") changes to the dma_buf_vmap() and
dma_buf_vunmap() APIs to pass a new parameter of type
'struct dma_buf_map'. Update the NVGPU to support these updated APIs
for Linux v5.11+.

Finally, the legacy dma_buf_vmap() API returns NULL on error and not an
error code and so correct the test of the return value in the function
gk20a_cde_convert().

Bug 200687525

Change-Id: Ie20f101e965fa0f2c650d9b30ff4558ce1256c12
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2469555
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-01-13 22:36:14 -08:00
Deepak Nibade
a0fb91846d gpu: nvgpu: set regop type based on per-resource ctxsw flag
New profiler APIs set regop type based on whether context is bound or
not in nvgpu_prof_get_regops_staging_data(). But it is possible that
ctxsw is not enabled for some particular HWPM resource even if context
is bound to profiler object.

Fix this by extracting regop type based on per-resource ctxsw flag
instead of bound context.

Add reg_op_type[] array in profiler object to track regop type for each
HWPM resource. Initialize the array based on resource ctxsw flag in
nvgpu_profiler_pm_resource_reserve().

Update profiler_obj_validate_reg_op_offset() to get regop type from
nvgpu_profiler_validate_regops_allowlist() and use this type and
prof->reg_op_type[] to get actual type that should be used for that
regop.

Update validate_reg_ops() to validate the offset first since regop
type is now determined in offset validation. Set ops[i].status to 0
for each validation iteration, and if op is valid set it to
REGOP(STATUS_SUCCESS) at the end of iteration.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ib1f75d840d04d288789473adabda02cdc807eea0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460003
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-01-05 12:38:17 -08:00
Deepak Nibade
869735cda4 gpu: nvgpu: add dynamic allowlist support
Add gv11b and tu104 HALs to get allowed  HWPM resource register ranges,
offsets, and stride meta data.

Add new enum nvgpu_pm_resource_hwpm_register_type for HWPM register
type. Add new struct nvgpu_pm_resource_register_range_map to store all
the register ranges for HWPM resources. Add pointer of map in struct
nvgpu_profiler_object along with map entry count.

Add new API nvgpu_profiler_build_regops_allowlist() to build the regops
allowlist dynamically while binding the resources. Map entry count is
received with get_pm_resource_register_range_map_entry_count() and only
those resource ranges are added for which resource is reserved by
profiler object.

Add nvgpu_profiler_destroy_regops_allowlist() to destroy the allowlist
while unbinding the resources.

Add static functions allowlist_range_search() to search a register
offset in HWPM resource ranges. Add another static function
allowlist_offset_search() to search the offset in per-resource offset
list.

Add nvgpu_profiler_validate_regops_allowlist() that accepts an offset
value, checks if it is in allowed ranges using allowlist_range_search()
and then checks if offset is in allowlist using allowlist_offset_search().

Update gops.regops.exec_regops() to receive profiler object pointer as
a parameter.

Invoke nvgpu_profiler_validate_regops_allowlist() from
validate_reg_ops() if prof pointer is not-null. This will be true only
for new profiler stack and not legacy profilers.

In gr_exec_ctx_ops(), skip regops execution if offset is invalid.

Bug 2510974
Jira NVGPU-5360

Change-Id: I40acb91cc37508629c83106ea15b062250bba473
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460001
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-01-05 12:38:06 -08:00
Deepak Nibade
be9271d721 gpu: nvgpu: add API to extract gk20a pointer from cdev
Add new API nvgpu_get_gk20a_from_cdev() that extracts gk20a pointer
from cdev pointer. This helps in keeping cdev related implementation
details in ioctl.c and away from other device ioctl files.

Also move struct nvgpu_cdev, nvgpu_class, and nvgpu_cdev_class_priv_data
from os_linux.h to ioctl.h since all of these structures are more IOCTL
related and better to keep them in ioctl specific header.

Jira NVGPU-5648

Change-Id: Ifad8454fd727ae2389ccf3d1ba492551ef1613ac
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2435466
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
a3e39c685d gpu: nvgpu: track dev nodes using dynamic linked list
Remove static dev node meta data from struct nvgpu_os_linux and replace
it by a dynamic list. Struct nvgpu_os_linux will only keep track of list
head and number of entries.

Add new structure nvgpu_cdev to store meta data of each dev node and
create/setup it dynamically in gk20a_user_init(). Once done, add the new
node under list head maintained in nvgpu_os_linux.

Add a static list dev_node_list[] that contains list of dev node names
and file operations. This static list is used to create nvgpu_cdev data
structures and to register new device nodes.

Update all dev node open file operations (e.g. gk20a_as_dev_open()) to
extract struct gk20a pointer from device pointer of dev node.
gk20a device is the parent of dev node device.

Jira NVGPU-5648

Change-Id: If070c3428afd6215e45b4919335d9f43e04c36f9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2428500
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
c6aae8c049 gpu: nvgpu: use fixed address mapping for pma byte buffer
Use fixed address mapping for pma byte buffer so that the address of
this buffer always fits in 32 bits.

This also requires to move unmap sequence to OS specific function since
different unmap API is now needed for linux and QNX.

Also call nvgpu_prof_free_pma_stream_priv_data() before
nvgpu_profiler_free_pma_stream() since former uses mm->perfbuf which
is released in later.

Bug 2510974
Jira NVGPU-5360

Change-Id: I398b0ca4f96527d6e09c9aacacb4b43c90f5bfc9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2424691
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
9e94e118fe gpu: nvgpu: ensure pma byte buffer address fits in 32 bits
Right now PMA byte buffer address is allocated in the range of
0x1ffc010000. The register that stores this address is only 32-bit and
there is no corresponding _hi() register, so the address must fit in
32 bits.

Update nvgpu_vm_init() parameters in nvgpu_perfbuf_init_vm() so that a
low_hole of only 4K is used. This allows the address to be allocated
in the range of 0x4000000.

Also map byte buffer before PMA stream buffer so that byte buffer always
gets lower address.

There is only one PMA stream buffer allowed to be mapped right now so
this works for now. But in future multiple buffers can be mapped and this
solution needs to be reworked.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ief1a9ee54d554e3bc13c7a9567934dcbeaefbcc6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2418520
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
f9f82561cf gpu: nvgpu: check PMA stream reservation in get put API
Check if PMA stream resource is reserved in
nvgpu_prof_ioctl_pma_stream_update_get_put() before accessing PMA stream
data.

Bug 2510974
Jira NVGPU-5360

Change-Id: Id57cc74e1cd37a2eb12a36671011a18693af1219
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2418521
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
221475f753 gpu: nvgpu: add profiler apis to manage PMA stream
Support new IOCTL to manage PMA stream meta data by adding below API
nvgpu_prof_ioctl_pma_stream_update_get_put()

Add nvgpu_perfbuf_update_get_put() to handle all the updates coming
from userspace and to pass all required information.

Add gops.perf.update_get_put() to handle all HW accesses required in
perf HW unit.

Add gops.perf.bind_mem_bytes_buffer_addr() to bind the available bytes
buffer while binding HWPM streamout.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ibacc2299b845e47776babc081759dfc4afde34fe
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406484
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
5844151a93 gpu: nvgpu: add profiler apis to alloc/free pma stream
Add two new IOCTL APIs to allocate/free pma stream. Add two new
functions to handle this :
nvgpu_prof_ioctl_alloc_pma_stream()
nvgpu_prof_ioctl_free_pma_stream()

Allocation of pma stream includes below steps :
- Initializing perfbuf VM
- Mapping PMA buffer into perfbuf VM
- Mapping PMA byte buffer into perfbuf VM
- Mapping PMA byte buffer to CPU virtual address space

Store all of above data in struct nvgpu_profiler_object for
reference. OS specific data is stored in struct
nvgpu_profiler_object_priv

Update HWPM streamout bind/unbind sequence to enable/disable perfbuf
respectively.

Also take care of releasing the pma stream resources in profiler object
close path if they are not explicitly released by user space by IOCTL
call.

Bug 2510974
Jira NVGPU-5360

Change-Id: I126633746cabc4e293c7ad7c49806866a897949d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406483
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
69fe763b04 gpu: nvgpu: poweron GPU for regops execution
Call gk20a_busy() for regops execution in nvgpu_prof_ioctl_exec_reg_ops
since for resident contexts it will directly access the HW.

Bug 2510974
Jira NVGPU-5360

Change-Id: I272cf997f0c8a2edd71f88ab6d48471114a32a87
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406796
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
2012a6b558 gpu: nvgpu: add profiler api to execute regops
Implement new API nvgpu_prof_ioctl_exec_reg_ops() to support regops on
new profiler objects.

Add two new staging buffers to hold regops copied from userspace, and
to convert and execute regops in common code.
Buffers are allocated and released along with the profiler object.

New API will implements this :
-  copy regops data in chunks of 4K from userspace
- store them in staging buffer
- convert the new regop struct into common regop struct and also
  copy the content into second staging buffer
- trigger gops.regops.exec_regops() with second staging buffer as
  operation pointer
- convert common regop struct back into new regop struct and copy
  back to userspace

Export bunch of helper functions from ioctl_dbg.h. e.g.
nvgpu_get_regops_op_values_common()

Update regop execution code to skip regop execution if regop status
is not valid. This is only possible when userspace requests for
CONTINUE_ON_ERROR mode.

Add more documentation to some of the fields in UAPI header.

Note that maximum atomic operations reported by new API are same
as legacy API and are incorrect. This will be fixed up in upcoming
patches.

Bug 2510974
Jira NVGPU-5360

Change-Id: I9f82052b22143aec33f6e778c0784386744b699e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394208
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
5311132781 gpu: nvgpu: add profiler apis to bind/unbind PM resources
Add new APIs to bind/unbind PM resources to/from profiler objects:
nvgpu_profiler_bind_pm_resources()
nvgpu_profiler_unbind_pm_resources()

Implement support to bind/unbind SMPC/HWPM/HWPM_STREAMOUT in various
functions in common/profiler/profiler.c.

Unbind all the PM resources explicitly in
nvgpu_profiler_unbind_context() while closing the profiler object.

If resources are bound during a resource reservation request,
unbind the resources explicitly before reserving new resource.
It is responsibility of application to bind the PM resources again.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ib2a0e017eaa23d0d376438771e8bf4e340865f03
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389655
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
330cc7d0e5 gpu: nvgpu: add profiler apis for resource reservation
Add two new functions to reserve/release PM resources :
nvgpu_prof_ioctl_reserve_pm_resource()
nvgpu_prof_ioctl_release_pm_resource()

Add ctxsw field to struct nvgpu_profiler_object to store per-resource
context switch enable flag.

Force resource reservation release while unbinding the context from
profiler object or while closing the profiler object. Add this code
in nvgpu_profiler_unbind_context() since both above paths will call
this function.

Bug 2510974
Jira NVGPU-5360

Change-Id: If334148e8df86360fba4162d1611187f3f04d01b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389654
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
ccba2e850b gpu: nvgpu: add mutex to serialize profiler ioctl calls
Add new mutex prof->ioctl_lock to serialize all IOCTL calls on profiler
object. Running concurrent IOCTL calls could lead to races and
corrupted state.

Bug 2510974
Jira NVGPU-5360

Change-Id: I66a8d9078c35475a13442ccd34b61aca5b9c1d2b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389652
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
969b901999 gpu: nvgpu: create device/context profiler dev nodes
Create new dev nodes for device and context profilers. Example of dev
nodes on iGPU
/dev/nvhost-prof-dev-gpu - device scope profiler
/dev/nvhost-prof-ctx-gpu - context scope profiler

Add below APIs to open/close above dev nodes :
nvgpu_prof_dev_fops_open()
nvgpu_prof_ctx_fops_open()
nvgpu_prof_fops_release()

Add common API nvgpu_prof_fops_ioctl() to handle IOCTL call on these
dev nodes. Add IOCTL NVGPU_PROFILER_IOCTL_BIND_CONTEXT to bind the TSG
to profiler objects.

Add nvgpu_tsg_get_from_file() to retrieve TSG struct pointer from
file descriptor. Also store profiler object pointer into TSG struct.

Enable NVGPU_SUPPORT_PROFILER_V2_DEVICE capability on gv11b and tu104.
Note that this is not yet enabled for vGPU.
Keep NVGPU_SUPPORT_PROFILER_V2_CONTEXT capabiity disabled since this
will take longer to support.

Add new IOCTL NVGPU_PROFILER_IOCTL_UNBIND_CONTEXT so that userspace can
explicitly unbind the context and release the resources before closing
the profiler descriptor.

Add context_init flag to profiler object for book keeping.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ie07e0cfd5a9da9d80008f79c955c7ef93b4bc60f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2384354
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00