Commit Graph

1115 Commits

Author SHA1 Message Date
Lakshmanan M
ee2aaef308 gpu: nvgpu: Report non zero num_sub_partition_per_fbpa value only for dGPU
All Tegra iGPUs don't have real FBPA/FBSP units at all.
So num_sub_partition_per_fbpa should be 0 for iGPUs.

JIRA NVGPU-5656

Change-Id: I30050caf8f9f6b5185404a64dbbbe02f67046093
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2545978
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-16 15:06:30 -07:00
dt
12a0e3fe61 gpu: nvgpu: Add support to print mig config lists
This is adding support to show available mig configs when MIG
is disabled for nvgpu-next.

JIRA NVGPU-6721

Change-Id: I8ba742b7850902c1eea4728655c75d795e0bb3a2
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543472
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 13:25:46 -07:00
Sagar Kamble
e0e337fb83 gpu: nvgpu: set nvgpu power state to POWERED_OFF on poweron fail
When force closing the app, poweron needed in channel close path will
fail as pg_task kthread creation fails with -EINTR (process is
SIGKILL'd so threads don't get created).

Upon poweron failure, device nodes are removed and the nvgpu power
state is not reset to NVGPU_STATE_POWERED_OFF. Hence on further
gk20a_busy attempts, poweron is not attempted and gpu remains
unusable from thereon.

Change the state to POWERED_OFF from POWERING_ON on poweron fail.

Bug 3308828

Change-Id: I2360f11a4937dfe93eb7933b30c13748fb570898
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543797
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 04:58:28 -07:00
Debarshi Dutta
8f9ac1dea9 gpu: nvgpu: split away power node removal
Presently, gk20a_user_deinit is used to remove all device nodes
including "power" node as well.

Split removal of power node into a separate function
gk20a_power_node_deinit to enable other device removal during the
normal runtime_suspend path to facilitate the fast path for MIG
reconfiguration. Powernode can be removed only during a call to
Rmmod. This also enables separately powering off the device nodes
in the unlikely case of a poweron failure.

Bug 3308828
Jira NVGPU-6920

Change-Id: Ib045a09a992a63c468492a837b273cca41e20f15
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543014
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 04:57:57 -07:00
Lakshmanan M
4a3a9d46e3 gpu: nvgpu: Use gr_instance specific api to query the num of sm
Replaced get_no_of_sm() with gr_instance specific api
nvgpu_gr_config_get_no_of_sm()

JIRA NVGPU-5656

Change-Id: I01b786402dde857e7cc30d5370429d02ebe3f428
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543245
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-11 18:05:07 -07:00
Lakshmanan M
7d473f4dcc gpu: nvgpu: Expose logical mask for MIG
1) Expose logical mask instead of physical mask when MIG is enabled.
   For legacy, NvGpu expose physical mask.
2) Added fb related info in struct nvgpu_gpu_instance().
4) Added utility api to get the logical id for a given local id
   nvgpu_grmgr_get_gr_gpc_logical_id()
5) Added grmgr api to get max_gpc_count
   nvgpu_grmgr_get_max_gpc_count().
5) Added grmgr's fbp api to get num_fbps and its enable masks.
   nvgpu_grmgr_get_num_fbps()
   nvgpu_grmgr_get_fbp_en_mask()
   nvgpu_grmgr_get_fbp_rop_l2_en_mask()
6) Used grmgr's fbp apis in ioctl_ctrl.c
7) Moved fbp_init_support() in nvgpu_early_init()
8) Added nvgpu_assert handling in grmgr.c
9) Added vgpu hal for get_max_gpc_count().

JIRA NVGPU-5656

Change-Id: I90ac2ad99be608001e7d5d754f6242ad26c70cdb
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538508
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-10 03:05:21 -07:00
Richard Zhao
e2d8bdc38d gpu: nvgpu: unify nvgpu_get_gpfifo_entry_size
moved nvgpu_get_gpfifo_entry_size implementation to common code.

Jira GVSCI-10880

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia6ccee5e26836662f7c2196ff41658ff41e3a570
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541575
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-09 19:27:25 -07:00
Seshendra Gadagottu
5ec1e0cc21 gpu: nvgpu: make gp10b_tegra_acquire_platform_clocks public
Made gp10b_tegra_acquire_platform_clocks as public function
so that each gpu architecture can supply different number of
clock list.

Jira NVGPU-6707

Change-Id: Iad2156a63e00913374ce5fa4274c95e7488fdb31
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2511795
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sivaram Nair <sivaramn@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-08 21:54:23 -07:00
Richard Zhao
9b66fca165 gpu: nvgpu: move .exec_regops to only execute regops
HAL .exec_regops used to first validate regops then execute it, now
moving it to only execute the regops.

- It helps B0CC on HV. On server side it does not track profiler object,
but regops validation uses the profiler, so moving validation to client
side.
- The change also remove ctx_buffer_offset checking in
validate_reg_op_offset. The offset already checked again whitelists
which have be verified when update whitelist. Also vgpu does not have
information of ctx and golden image.
- Added function nvgpu_regops_exec to cover both regops validation and
execution.

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I434e027290e263a8a64a25a55500f7294038c9c4
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534252
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:40 -07:00
Lakshmanan M
08cd42093d gpu: nvgpu: Add multi gr l2_evict support
1) Added l2_evict support for multi gr
2) Added multi gr handling for the following apis,
   nvgpu_gr_get_cilp_preempt_pending_chid
   nvgpu_gr_clear_cilp_preempt_pending_chid

JIRA NVGPU-5656

Change-Id: Iee6142a49b9a569f2b440077762164af8aee9fb3
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2539734
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-07 13:46:40 -07:00
Lakshmanan M
df87591b7d gpu: nvgpu: Add multi gr handling for debugger and profiler
1) Added multi gr handling for dbg_ioctl apis.
2) Added nvgpu_assert() in gr_instances.h (for legacy mode).
3) Added multi gr handling for prof_ioctl apis.
4) Added multi gr handling for profiler.
5) Added multi gr handling for ctxsw enable/disable apis.
6) Updated update_hwpm_ctxsw_mode() HAL for multi gr handling.

JIRA NVGPU-5656

Change-Id: I3024d5e6d39bba7a1ae54c5e88c061ce9133e710
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538761
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-04 18:07:47 -07:00
Sagar Kamble
1dd3e0761c gpu: nvgpu: fix the usermode mappings deadlock during railgate and munmap
Following locking sequence leads to deadlock:

1. gk20a_pm_prepare_poweroff (alter_usermode_mappings):
   ctrl_privs_lock -> mmap_lock
2. __do_munmap (usermode_vma_close):
   mmap_lock -> ctrl_privs_lock

This lock contention can be resolved by retrying the usermode mapping
alteration after a while releasing the ctrl_priv_lock for munmap to
proceed.

Below is the kernel panic log with deadlock.

[] INFO: task kworker/1:1:116 blocked for more than 120 seconds.
[]       Tainted: G        W         5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:kworker/1:1     state:D stack:    0 pid:  116 ppid:     2 flags:0x00000028
[] Workqueue: pm pm_runtime_work
[] Call trace:
[]  __switch_to+0x104/0x160
[]  __schedule+0x3d4/0x900
[]  schedule+0x74/0x100
[]  rwsem_down_write_slowpath+0x250/0x4b0
[]  down_write+0x6c/0x80
[]  alter_usermode_mappings+0xb4/0x160 [nvgpu]
[]  nvgpu_hide_usermode_for_poweroff+0x24/0x30 [nvgpu]
[]  gk20a_pm_prepare_poweroff+0xe8/0x140 [nvgpu]
[]  gk20a_pm_runtime_suspend+0x78/0xf0 [nvgpu]
[]  pm_generic_runtime_suspend+0x3c/0x60
[]  genpd_runtime_suspend+0xb0/0x2c0
[]  __rpm_callback+0x90/0x150
[]  rpm_callback+0x34/0xa0
[]  rpm_suspend+0xe0/0x5e0
[]  pm_runtime_work+0xbc/0xc0
[]  process_one_work+0x1c0/0x4a0
[]  worker_thread+0x11c/0x430
[]  kthread+0x148/0x170
[]  ret_from_fork+0x10/0x18

[] INFO: task nvrm_gpu_tests:1273 blocked for more than 121 seconds.
[]       Tainted: G        W         5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:nvrm_gpu_tests  state:D stack:    0 pid: 1273 ppid:  1245 flags:0x00000000
[] Call trace:
[]  __switch_to+0x104/0x160
[]  __schedule+0x3d4/0x900
[]  schedule+0x74/0x100
[]  schedule_preempt_disabled+0x28/0x40
[]  __mutex_lock.isra.0+0x184/0x5c0
[]  __mutex_lock_slowpath+0x24/0x30
[]  mutex_lock+0x5c/0x70
[]  usermode_vma_close+0x30/0x50 [nvgpu]
[]  remove_vma+0x34/0x60
[]  __do_munmap+0x1f4/0x4a0
[]  __vm_munmap+0x74/0xd0
[]  __arm64_sys_munmap+0x3c/0x50
[]  el0_svc_common.constprop.0+0x7c/0x1a0
[]  do_el0_svc+0x34/0xa0
[]  el0_svc+0x1c/0x30
[]  el0_sync_handler+0xa8/0xb0
[]  el0_sync+0x160/0x180
[] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---

Bug 200703921

Change-Id: Ie7f017c92f20061d3bf891079f7fc7fe390f7cf7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2533853
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-04 18:06:11 -07:00
dt
5e82717c96 gpu: nvgpu: Add powernode support to vgpu
As the normal gpu is powered on by writing one to
power-node, the patch is adding power node for vgpu.

Change-Id: I08fbbe8694e02c826a0d5692f5a4c0f4efd396ff
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537053
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-02 19:40:39 -07:00
Tejal Kudav
9f43914933 gpu: nvgpu: Move Intr handling common code to CIC
CIC (Central Interrupt controller) will be responsible for the
interrupt handling. common.cic unit is the placeholder for all
interrupt related code. Move interrupt related defines and
Public APIs present in common.mc to common.cic.
Note: The common.mc interrupts related struct definitions are
not moved as part of this patch.

Adapt the code to use interrupt handling related defines and public
APIs migrated from common.mc to common.cic

JIRA NVGPU-6899

Change-Id: I747e2b556c0dd66d58d74ee5bb36768b9370d276
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535618
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-31 19:37:31 -07:00
dt
c1b302652e gpu: nvgpu: Add fix for dev_node leak
This is adding fix for dev_node leak when user_deinit
called.
The dev_nodes in linux are created in two phases. In first
phase the power dev_nodes(one for legacy and other for v2)
are created. The second phase other dev_nodes are created.
While creating the dev_nodes the power cdev_region overwritten
by cdev_region. This is fixed by introducing new cdev_region and
updating respective nodes.

JIRA NVGPU-6721

Change-Id: Iec78db8e5fe40cc0b14fb3fecc35b8881dff716f
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535265
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-28 11:39:58 -07:00
Sami Kiminki
5f6ff29aea gpu: nvgpu: report number of syncpoints in nvgpu_as_get_sync_ro_map_arg
Add reporting for the number of syncpoints when mapping the RO
shim. This allows the userspace to perform boundary condition checks
when computing the GPU VA for a syncpoint.

JIRA GCSS-1579

Change-Id: Ia6c9eee917d2c1e08f9905701e03f2b09e01ba60
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2533981
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-27 21:19:38 -07:00
Martin Radev
8834275906 gpu: nvgpu: Validate PMA buffer size
The original code would only truncate the size to 32
bits and later write the value to a hw register. Let's
check that the user-provided buffer is large enough.

Bug 2510974

Change-Id: I8b14a07a46d30c0b8c7ea63e5bdef53fbd19ec6f
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527148
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:30:35 -07:00
Martin Radev
04ce9faf04 gpu: nvgpu: Minor fixes in ioctl handling
Fixes:
1) gk20a_sched_dev_ioctl allocates a buffer with size
*CTXSW_IOCTL_MAX_ARG_SIZE* but then sanitizes IOC_SIZE
against *SCHED_IOCTL_MAX_ARG_SIZE*. No big deal here
since both are of size 0x20 but may lead to issues in
the future.
2) nvgpu_clk_arb_ioctl_event_dev would BUG_ON if IOC_SIZE
is larger than expected. Let's instead sanitize and return
error.

Jira VFND-1586
Jira VQRM-3741

Change-Id: I9e00796a2b2f4a83c3a04194c34eb4c006b937d3
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2525753
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:30:30 -07:00
Tejal Kudav
e0a1fcf5f5 gpu: nvgpu: Add Central Intr Controller unit
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.

This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.

New APIs are exposed by CIC unit to access its internal data like:
  1. Struct err_desc - the static err handling /injection data per
                       error id
  2. Num_hw_modules  - the number of error reporting HW units
                       supported by CIC

Init and deinit of CIC unit:
  1. CIC unit should be initialized earlyon during boot so that it
     is available for any interrupt handling.
  2. Initialize CIC just before the interrupts are enabled during
     boot.
  3. Similarly, CIC is disabled late during deinit cycle; right
     after the interrupts are masked.

LUT:
  1. LUT is currently used only for reporting error to safety
     services in gv11b QNX safety build.
  2. This error handling policy LUT currently has only two levels
     of handing - correctable and quiecse.
  3. Once, the error handling policy decision is moved from leaf
     unit nodes to CIC, LUT will be updated to have additional levels
     like fast recovery and full recovery.
  4. Also, then a separate LUT will be added for each platform/build.
  5. In current framework, the LUT is set to NULL for all
     configurations except gv11b.

report_err() ops is added to report error to safety services.
This ops is only effective for gv11b qnx build; and set to NULL for
other configurations.

NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754

Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-25 14:28:04 -07:00
Martin Radev
d1983f5cfa gpu: nvgpu: Decrement CSS dmabuf ref cnt before ret
The function gk20a_channel_cycle_stats does not decrement the
dmabuf refcnt if vmapping it fails. This patch fixes it by
decrementing the ref cnt before returning.

NVGPU-397
NVGPU-415

Change-Id: Iae01ada710adb04fd4e4ba0371eccec5f8765254
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527190
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-18 18:18:25 -07:00
Ramesh Mylavarapu
7d0bd72fde gpu: nvgpu: add clk arbiter check
Check for NVGPU_CLK_ARB_ENABLED flag before
initiating clk crbiter session which shouldn't
be initiated in absence of clk arbiter.

Bug 3236519

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I945203164063cec35fbab2256b3c7cb983e520ea
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2528551
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-13 06:32:01 -07:00
dt
a741347ead gpu: nvgpu: Compute the proper gr_config before read any information
This is added to compute proper gr_config to get the
correct information like number of sm etc.

This is added to fix the failure when running
"NvRmGpuTest_TSG_ReadSmErrorState_Exists" on MIG instance.

JIRA NVGPU-6833

Change-Id: I274720e31cde3636b3282fec586b161f884bc73d
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2526911
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-11 08:26:16 -07:00
srajum
573f02e68d gpu: nvgpu: Fixing MISRA 21.1 violation.
- "misra_c_2012_rule_21_1_violation"
  Defining or undefining a reserved name "__NVGPU_SAVE_KALLOC_STACK_TRACES",
  which is an identifier or macro name beginning with an underscore.

Change-Id: If89ce68fb6dc76e5ffcdd2dc436dddcbe9ba96ee
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2525631
(cherry picked from commit a84c9e0d6987b22e24d777c5ac632c4072cbbb58)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2526776
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-10 10:08:13 -07:00
srajum
74deaae0bf gpu: nvgpu: use GPLV2 license for files in os/linux
JIRA NVGPU-6452

Change-Id: Iac22c3bf52c541a9fd3ba7e59cf4e78ce92ecd71
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2526346
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-10 02:53:39 -07:00
dt
be507aea50 gpu: nvgpu: MIG mode selection at runtime
This is adding code to select MIG mode and boot
the GPU with selected mig config.

For testing MIG, after system boots

1. write  mig_mode_config by
     echo  x > /sys/devices/gpu.0/mig_mode_config for igpu
     echo x > /sys/devices/./platform/14100000.pcie/pci0001:00/0001:00:00.0/0001:01:00.0/ for dgpu

2. Then run any nvgpu* tests or nvrm_gpu_info.
If the mig_mode need to be changed , note down the supported
configs by "cat mig_mode_config_list" and reboot the system

3. Follow steps 1 and 2.

example output:

"cat mig_mode_config" 2

"cat mig_mode_config_list"

+++++++++ Config list Start ++++++++++

 CONFIG_ID : 0 for CONFIG NAME : 2 GPU instances each with 4 GPCs

 CONFIG_ID : 1 for CONFIG NAME : 4 GPU instances each with 2 GPCs

 CONFIG_ID : 2 for CONFIG NAME : 7 GPU instances - 1 GPU instance with 2
GPCs + 6 GPU instances each with 1 GPC

 CONFIG_ID : 3 for CONFIG NAME : 5 GPU instances - 1 GPU instance with 4
GPCs + 4 GPU instances each with 1 GPC

 CONFIG_ID : 4 for CONFIG NAME : 4 GPU instances - 1 GPU instance with 2
GPCs + 2 GPU instances each with 1 GPC + 1 GPU instance with 4 GPCs

 CONFIG_ID : 5 for CONFIG NAME : 6 GPU instances - 2 GPU instances each
with 2 GPCs + 4 GPU instances each with 1 GPC

 CONFIG_ID : 6 for CONFIG NAME : 5 GPU instances -  1 GPU instance with
2 GPCs + 2 GPU instances each with 1 GPC + 2 GPU instances with 2 GPCs

 CONFIG_ID : 7 for CONFIG NAME : 5 GPU instances - 2 GPU instances each
with 2 GPCs + 1 GPC instance with 2 GPCs + 2 GPU instances with 1 GPC

 CONFIG_ID : 8 for CONFIG NAME : 5 GPU instances - 1 GPC instance with 2
GPCs + 2 GPU instances each with 1 GPC + 2 GPU instances each with 2
GPCs

 CONFIG_ID : 9 for CONFIG NAME : 1 GPU instance with 8 GPCs

++++++++++ Config list End +++++++++++

JIRA NVGPU-6633

Change-Id: I3e56f8c836e1ced8753a60f328da63916faa7696
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2522821
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-06 06:09:21 -07:00
dt
a6a3bde1b5 gpu: nvgpu: Fix for MIG boot issue
- The power device node is created at bootime and the power node
is used to power-on the GPU. Power node is commmon for MIG and
non-MIG platforms. As the same API is used for power and
other MIG/non-MIG nodes, we need to distinguish between them.
Otherwise the same nodes creation will give boot issue.
- As we are supporting mig_mode setting for non-mig platforms
like GV11B, the condition need to be added to create MIG-modes
or not. If any mig-mode is set on gv11b/tu104 then graphics pipeline
will be disabled.

JIRA NVGPU-6633

Signed-off-by: dt <dt@nvidia.com>
Change-Id: I3c641e50c39180543efff04a9cf8b721dbf7f648
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521732
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-04 01:24:11 -07:00
Deepak Nibade
c78efae5e7 gpu: nvgpu: set file private data before installing fd
Make sure file->private_data is set before installing file into file
descriptor with fd_install().

Bug 200724607

Change-Id: I03e79a3f8981f959ab5f75f442911253d166aa87
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2520465
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-29 10:54:07 -07:00
Richard Zhao
ab6d4fa543 gpu: nvgpu: create common sim reg accessors
sim reg accessors is common after it moved to use os abstract layer reg
accessors.

Bug 2999617

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I1c0ff7ca1724cde09dd845c077763709ea2ef915
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2517383
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-28 19:15:31 -07:00
Vedashree Vidwans
86cb03d2f1 gpu: nvgpu: Replace WAR keyword with "fix"
Replace/remove "WAR" keyword in the comments in nvgpu driver with "fix".
Rename below functions and corresponding gops to replace "war" word with
"errata" word:
- g.pdb_cache_war_mem
- ramin.init_pdb_cache_war
- ramin.deinit_pdb_cache_war
- tu104_ramin_init_pdb_cache_war
- tu104_ramin_deinit_pdb_cache_war
- fb.apply_pdb_cache_war
- tu104_fb_apply_pdb_cache_war
- nvgpu_init_mm_pdb_cache_war
- nvlink.set_sw_war
- gv100_nvlink_set_sw_war

Jira NVGPU-6680

Change-Id: Ieaad2441fac87e4544eddbca3624b82076b2ee73
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2515700
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-04-28 19:14:49 -07:00
Vedashree Vidwans
aba26fa082 gpu: nvgpu: handle chip specific erratas
Currently, there are few chip specific erratas present in nvgpu code.
For better traceability of the erratas and corresponding fixes,
introduce flags to indicate existing erratas on a chip. These flags
decide if a corresponding solution is applied to the chip(s).

This patch introduces below functions to handle errata flags:
- nvgpu_init_errata_flags
- nvgpu_set_errata
- nvgpu_is_errata_present
- nvgpu_print_errata_flags
- nvgpu_free_errata_flags

nvgpu_print_errata_flags: print below details of erratas present in chip
1. errata flag name
2. chip where the errata was first discovered
3. short description of the errata

Flags corresponding to erratas present in a chip are set during chip hal
init sequence.

JIRA NVGPU-6510

Change-Id: Id5a8fb627222ac0a585aba071af052950f4de965
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2498095
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-04-28 19:14:44 -07:00
Debarshi Dutta
6222ebeaea gpu: nvgpu: address NULL access during boot.
The function nvgpu_pci_probe invokes nvgpu_kzalloc(g) with a pointer
to struct gk20a before setting the device pointer in struct
nvgpu_os_linux. This may result in NULL Pointer access in the function
nvgpu_log_name when some logs are enabled during boot.

As a solution, the implementation of nvgpu_log_name is updated to first
check for a valid pointer to a struct device before calling dev_name

Jira NVGPU-6770

Change-Id: I98a9746550e43f3b7a143f5b7c7141ff6c67f758
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2520355
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-28 11:19:19 -07:00
Lakshmanan M
3f8c562004 gpu: nvgpu: Add nvgpu_early_poweron() support
1) NvGpu dev node needs to be created in gpu power on
early stage to avoid latency introduced by udevd.
For creating dev node, device and grmgr init
needs to move to early stage of GPU power on.
After grmgr init, NvGpu can identify the number of MIG
instance required for each physical GPU.
For that, added a new API nvgpu_early_poweron() to handle
early init which is required for before dev node creation.

2) Removed fifo dependency in nvgpu_init_gr_manager()

3) Used get_max_subctx_count() directly to query
the veid/subctx count.

JIRA NVGPU-6633

Change-Id: Ib9d7c3e184c71237b0da9305515ccd8ceda1d5ad
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2517173
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-22 15:00:54 -07:00
Sami Kiminki
3aceed2db1 gpu: nvgpu: add changes for nvgpu-next
- Add new UAPI IOCTLs.
- Add nvgpu-next gops in fb and gr.
- Initialize and teardown vab during mm_support

Bug 2999621

Change-Id: Icc241f1a234bfee3fd20dc69b42c92e0af6d445c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2447064
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-22 07:35:34 -07:00
dt
f2b69c8704 gpu: nvgpu: mig: Add sysfs nodes for mig mode selection
This is adding two sysfs nodes
1. mig_mode_config: to select the mig_mode
2. mig_mode_config_list: to list the available mig configs.

Added logic to skip gpu dev node creation only for
real MIG physical device.
Added logic to skip the gpu characteristics flags only for
real MIG physical device.

JIRA NVGPU-6633

Change-Id: I4a450b6d658f76e79d89f863c00dffad4558c70f
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2499284
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-21 14:49:57 -07:00
dt
639ca4edfb gpu: nvgpu: mig: Defer dev_nodes creation and create new power node to support MIG
- This is deferring the dev_nodes creation after power_on to
select the MIG config and to create the dev_nodes as per the
selected MIG config.

- The patch is adding a device node to issue power on. The
nodes are:
 for igpu :/dev/nvgpu/igpu0/power
 for dgpu:/dev/nvgpu/dgpu-0001:01:00.0/power

To issue power on :
echo "1" > /dev/nvgpu/igpu0/power
echo "1" > /dev/nvgpu/dgpu-0001:01:00.0/power

JIRA NVGPU-6633

Change-Id: Ic4f1f3e42724cc788dcfaf0e881d188fd3bd1ce1
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2512647
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-21 10:15:20 -07:00
Richard Zhao
cfc1281223 gpu: nvgpu: vgpu: remove gp10b support
gp10b vgpu won't be supported on future releases.

- removed gp10b vgpu hal code
- removed vgpu bar1 related code
- removed gp10b vgpu linux platform code

Jira GVSCI-10202

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ic1bfeb12c854df3808a0c7e67f5c52bc1e80ab2d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2517273
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-04-21 06:06:22 -07:00
Richard Zhao
643eb158a3 gpu: nvgpu: move mapped regs to gk20a
- moved reg fields to gk20a
- added os abstract register accessor in nvgpu/io.h
- defined linux register access abstract implementation
- hook up with posix. posix implementation of the register accessor uses
  the high 4 bit of address to identify register apertures then call the
  according callbacks.

It helps to unify code across OSes.

Bug 2999617

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ifcb737e4b4d5b1d8bae310ae50b1ce0aa04f750c
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2497937
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-19 19:45:24 -07:00
Antony Clince Alex
95bfa039f5 gpu: nvgpu: tu104: implement l2 sector promotion
Introduce new HAL gops_ltc.set_l2_sector_promotion to configure L2
sector promotion policy. The follow three promotion settings are support:
- NVGPU_GPU_IOCTL_TSG_L2_SECTOR_PROMOTE_FLAG_NONE
- NVGPU_GPU_IOCTL_TSG_L2_SECTOR_PROMOTE_FLAG_64B
- NVGPU_GPU_IOCTL_TSG_L2_SECTOR_PROMOTE_FLAG_128B

Add ioctl "NVGPU_TSG_IOCTL_SET_L2_SECTOR_PROMOTION" to the gpu tsg node
to support l2 sector promotion. On chips which do not support sector
promotion, the ioctl returns 0.

Bug 200656177

Change-Id: Iad835a5c954d3b10da436cfafb388aaaa04f44c7
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460553
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-04-16 03:35:57 -07:00
Prateek sethi
d6d1b03496 gpu: nvgpu: implement ioctls to access GPU VA ranges
Patch adds below two ioctls to access GPU VA.
- NVGPU_DBG_GPU_IOCTL_GET_MAPPINGS
- NVGPU_DBG_GPU_IOCTL_ACCESS_GPU_VA

Bug 2108651
Bug 2543387

Change-Id: Iebcfa777c1a623eda070a866aed069ca9b3ec49d
Signed-off-by: Prateek sethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2383317
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-04-10 13:43:40 -07:00
Mayur Poojary
6277d57936 gpu: nvgpu: Add new api for setting longer timeslice on dbg node
Add new ioctl api for setting longer timeslice and get timeslice
inside 'dbg' dev node.
Update ioctl gpu_get_characteristic to pass the max timeslice value
Add debugfs to access and change the max timeslice value

Bug 1842244

Change-Id: I7e80f59162cf5d90496f9752fc128f5fa8dcc7d2
Signed-off-by: Mayur Poojary <mpoojary@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2471569
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-06 04:37:38 -07:00
Martin Radev
38e6c9ae98 gpu: nvgpu: Check for int overflow in MAPPING_MODIFY path
The check `buffer_offset + buffer_size > mapped_buffer->size` can
be bypassed with a large `buffer_size`, and that may lead to some
corruption. This patch combines the bounds checks into a more
robust one.

Jira NVGPU-6374

Change-Id: I55c8664134e763c66715bf3492867bc73686b694
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2504890
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-26 14:20:18 -07:00
Richard Zhao
a56d93aa2f gpu: nvgpu: linux: remove definition of ecc_sysfs_stats_htable
No one uses it anymore.

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I0ea7f62e4e4e53d8da66bc00dcbe08a1f94e19a8
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2497936
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-25 14:08:18 -07:00
Vedashree Vidwans
e445b57b04 gpu: nvgpu: Move interrupt ISR code to common
This is one of the steps in restructuring of interrupt code.
- Move ISR logic to common code. This will allow us to add mixed ASIL
error handling levels.
- Modify nonstall ISR to use threaded interrupts. Bottom half of
nonstall ISR will run nonstall operations instead of adding work to
workqueues.
- Remove nonstall workqueue implementation.

JIRA NVGPU-6351

Change-Id: I5f891b0de4b0c34f6ac05522a5da08dc36221aa6
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2467713
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-25 02:34:57 -07:00
Sagar Kamble
ff8fbf1004 Revert "gpu: nvgpu: disable DGPU_THERMAL_ALERT for k5.9 temporarily"
Bug 200669739

This reverts commit d3f5905a0c.

Change-Id: I76a4ca4ec5be316c24410860a722a13383a3cea3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2501427
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-19 14:37:00 -07:00
shashank singh
46cdc4d5ca gpu: nvgpu: add field in characteristics struct for max gpfifo entries
Expose max gpfifo entries supported by nvgpu-rm. This limit will then be
propagated to application by nvrm_gpu.

Jira NVGPU-5846

Change-Id: Ibbbed9e1929c3bcc4eaaec9636d76e9e115e0c0c
Signed-off-by: shashank singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2482936
(cherry picked from commit b099a700aa055a5864ddb65cb546c9294c02b2b9)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2497486
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-19 10:09:44 -07:00
Divya Singhatwaria
cc34df76f9 gpu: nvgpu: Add support for ELPG_MS feature
- To enable ELPG_MS feature, add identifier for
  MS_LTC engine.
- The identifier is then passed
  as pg_engine_id to enable the MS_LTC engine.
- Add enable flag NVGPU_ELPG_MS_ENABLED for
  enabling/disabling ELPG_MS feature at init.

JIRA NVGPU-6430

Change-Id: Ie1f477918332d85ec98b3bd4d05b8e773d74eab8
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480750
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-18 15:29:06 -07:00
scottl
75d98f55d7 gpu: nvgpu: add SUPPORT_MAPPING_MODIFY flags
Add new NVGPU_SUPPORT_MAPPING_MODIFY enable flag that is used to
control the value of the exported NVGPU_GPU_FLAGS_SUPPORT_MAPPING_MODIFY
flag.

These flags are currently only enabled on linux in non-virtualized
environments.

Jira NVGPU-6374

Change-Id: Ia85c353b767b4f7d0aebc04838f44996bc38c61f
Signed-off-by: scottl <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2490986
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-16 14:07:21 -07:00
Antony Clince Alex
f41e5975d8 gpu: nvgpu: add ioctl to configure l2 max_ways_evict_last
Add ioctl support to configure and read the max number of lines/ways
in a L2 cache set that can be marked as EVICT_LAST. This is accomplished
through two new ltc hals: set_l2_max_ways_evict_last,
get_l2_max_ways_evict_last. These hals will only be set for nvgpu-next
chips. Incase of legacy chips, the IOCTLs will return error -ENOSYS.

Generate following litter constants to get the number of sets in a l2
slice and the number of ways in each set:
- GPU_LIT_NUM_LTC_LTS_SETS
- GPU_LIT_NUM_LTC_LTS_WAYS

Add gpu characteritics flag: NVGPU_L2_MAX_WAYS_EVICT_LAST_ENABLED to
allow userspace driver to determine if L2_MAX_WAYS_EVICT_LAST ioctl is
supported.

Bug 200605474

Change-Id: Id3180f891399f5e128500f3835d762aee59953e0
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2445884
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-12 04:36:22 -08:00
Thomas Steinle
1b5a9b28ea gpu: nvgpu: Add gr.ops NULL-ptr check
This fix add NULL-ptr checks for some of the user-accessible
ioctl.

Bug 3240771
Bug 200696704

Change-Id: Ibe7f75b31b2521a530883253a93ba832f010dc80
Signed-off-by: Thomas Steinle <tsteinle@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2483635
(cherry picked from commit cc717e3145)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2490126
Tested-by: Dinesh T <dt@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-03-04 00:36:14 -08:00
Deepak Nibade
4ab8c87974 gpu: nvgpu: expose V2 device node hierarchy
Nvgpu will move to new V2 device node hierarchy by default to be
consistent with MIG mode hierarchy. As part of this, expose V2 device
node hierarchy with this patch.

Both legacy and V2 hierarchies will co-exist for some duration until
V2 hierarchy is well tested out, and then legacy hierarchy will be
removed.

With V2 hierarchy, below is the directory structure for dev nodes :
igpu: /dev/nvgpu/igpu0/<dev_node>
dgpu: /dev/nvgpu/dgpu-0001\:01\:00.0/<dev_node>

Jira NVGPU-5648

Change-Id: I1c7a1c41e6d59fb248bc98a145c5ecebd06e8bad
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2443712
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-02 19:30:59 -08:00