Commit Graph

3031 Commits

Author SHA1 Message Date
deepak goyal
77d1e765f5 gpu: nvgpu: ga10b: Fix logic for BROM pass status
Current code assumes riscv brom passed if it does not times out.
This patch explicitly checks for brom pass/fail or timeout.

Bug 3361416

Change-Id: I399a6cf9d32be92b24990532f81892642513ba54
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585786
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-31 08:54:35 -07:00
Deepak Nibade
3c97f3b932 gpu: nvgpu: disallow binding more channels than MAX channels supported per TSG
There is HW specific limit on number of channel entries that can be
added for each TSG entry in runlist. Right now there is no checking
to enforce this from SW and hence if User binds more than supported
channels to same TSG, invalid TSG formation error interrupts are
generated.

Fix this by adding appropriate checks in below steps :

- Add new field ch_count to struct nvgpu_tsg to keep track of
  channels bound to TSG.
- Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve
  HW specific maximum channel count per TSG.
- Implement the HAL for gk20a and gv11b chips, and assign new HALs for
  all chips appropriately.
- Increment ch_count while binding the channel to TSG and decrement it
  while unbinding.
- While binding channel to TSG, Check if current channel count is
  already equal to max channel count. If yes, print an error and bail
  out.

Bug 200763991

Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-25 09:47:47 -07:00
Debarshi Dutta
608decf1e6 gpu: nvgpu: add support for powering off gpu
Add support for powering off IGPU for switching between
legacy to SMC mode/vice-versa or changing SMC configuration.
The power off can be issued as follows

echo 0 > /dev/nvgpu/igpu0/power

The following steps are done during a poweroff.
1) Deterministic channel idle
2) Acquire write_lock on l->busy semaphore.
3) Wait till power_usage decrements to indicate 0 active jobs.
4) Invoke pm_runtime_put_sync_suspend()
5) Invoke nvgpu_gr_remove_support() to clear existing GR memory.
6) Release write_lock on l->busy
7) Deterministic channel unidle.

Part of the sequence matches that of the gk20a_do_idle code.
The common parts are extracted into new functions
gk20a_block_new_jobs_and_idle() and gk20a_unblock_jobs()

For joint-rail case, the current implementation, does a railgate
and then sets pm_runtime_set_autosuspend_delay(-1) to disable
regular runtime resume/suspend.

Remove clearing of NVGPU_SUPPORT_MIG status during state change
ias it leads to inconsistencies.

Jira NVGPU-6920

Change-Id: I0b3eb3278176122ac061c1e8a94ebfb3c17c3925
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578501
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:50 -07:00
Debarshi Dutta
2e3c3aada6 gpu: nvgpu: fix deinit of GR
Existing implementation of GR de-init doesn't account for multiple
instances of struct nvgpu_gr. As a fix, below changes are added.

1) nvgpu_gr_free is unified for VGPU as well as native.
2) All the GR instances are freed.
3) Appropriate NULL checks are added when freeing GR memories.
4) 2D, 3D, I2M and ZBC etc are explicitely disabled when MIG is set.
5) In ioctl_ctrl, checks are added to not return error when zbc is NULL
   for VGPU as requests are rerouted to RMserver.

Jira NVGPU-6920

Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:45 -07:00
Mahantesh Kumbar
b9696ee643 gpu: nvgpu: ga10b: update NVRISCV LSPMU
- Set NVRISCV LSPMU app version to 0.
- Setting app version to 0 helps to load and boot
  multiple LSPMU ucode's without modifying the
  NVGPU driver.
- Add support for PMU NVRISCV prod and dbg bin's.
- This is corresponding change to LSPMU MPSK CL
  https://git-master.nvidia.com/r/c/tegra/kernel-firmware-t18x/+/2576049

JIRA NVGPU-7061

Change-Id: I800953ca97af3badde1983aa99e09b4fe7453203
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2575341
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-22 11:05:03 -07:00
Richard Zhao
d8e847c90d gpu: nvgpu: vgpu: fix force preemption from debugfs
check whether there's any force_preemption_gfxp or force_preemption_cilp
set in debugfs when alloc obj_ctx.

Jira GVSCI-4658

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I87fc7e195c9b0f7ed29ec6c37c8f46b456625fea
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579218
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-19 14:06:44 -07:00
Vedashree Vidwans
e13ab1f9ea gpu: nvgpu: pmu: remove hw access from remove_pmu_support
GPU HW registers are locked before remove_pmu_support.
Remove functions accessing HW registers.

Bug 3357477

Change-Id: I34a1923bfdb3afacd462f2646e2821569573a81a
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2577627
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-17 09:45:42 -07:00
Sagar Kamble
f4571194b0 gpu: nvgpu: stop ELPG init thread during unload
ELPG initialization thread creation can fail when the process is killed.
That leads to driver resume failure.

That thread was stopped on suspend and re-created on resume. To avoid
the issue above, don't stop the ELPG thread in suspend and let the
first created thread handle the ELPG state transitions always.
And stop the ELPG thread during unload.

Also fix couple of instances of config flag as:

s/CONFIG_PMU_POWER_PG/CONFIG_NVGPU_POWER_PG

bug 3345977

Change-Id: I8952edf8d1664ed258f238e265002e716d1bf5c2
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573763
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:46 -07:00
Sagar Kamble
40064ef1ec gpu: nvgpu: fix ecc counter free
ECC counter structures are freed without removing the node from the
stats_list. This can lead to invalid access due to dangling pointers.

Update the ecc counter free logic to set them to NULL upon free, to
remove them from stats_list and free them by validation.

Also updated some of the ecc init paths where error was not propa-
gated to callers and full ecc counters deallocation was not done.

Now, calling unit ecc_free from any context (with counters alloc-
ated or not) is harmless as requisite checks are in place.

bug 3326612
bug 3345977

Change-Id: I05eb6ed226cff9197ad37776912da9dcb7e0716d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2565264
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:08 -07:00
mkumbar
de267c034c gpu: nvgpu: ga10b: Enable PKC support
-Enable PKC support in ACR and LS-PMU
-Update the PMU f/w version.
-Enable PMU support by default.

Change-Id: I42bbe1b64ddc6ead9641c97d1ed27a9f4020510a
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2568609
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Goyal <dgoyal@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-08 14:23:36 -07:00
Sagar Kamble
0f59efb2cd gpu: nvgpu: return tpc exceptions error properly
It is observed that recovery on receiving the ESR MMU NACK exception
does not get triggered as the error returned by tpc level handler
is masked.

NACK is marked handled but recovery is not done and subsequent fb
intr handler does not trigger recovery since NACK is handled.

This leaves the HW engines in bad state.

Fix the tpc error return logic to trigger recovery during ESR MMU
NACK exception.

Bug 3318939

Change-Id: I475826f734e4366e853607e1e0338290ee28249b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2564764
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-26 05:13:41 -07:00
Tejal Kudav
b33079d47e gpu: nvgpu: Move intr data members from MC to CIC
Move interrupt specific data-members from common.mc to common.cic
Some of these data members like sw_irq_stall_last_handled_cond need
To be initialized much earlier during the OS specific init/probe stage.
Also, some more members from struct nvgpu_interrupts(like stall_size,
stall_lines[]), which will soon be moved to CIC will also need to be
initialized early during the OS specific probe stage.
However, the chip specific LUT can only be initialized after the
hal_init stage where the HALs are all initialized.
Split the CIC init to accommodate the above initialization requirements.

JIRA NVGPU-6899

Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 18:06:28 -07:00
Richard Zhao
a884bd3537 gpu: nvgpu: vgpu: add L2 sector promotion support
- added new IVC command for setting L2 sector promotion policy.
- init according HAL for ga10b VGPU.

Jira GVSCI-10901

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ibd206d26cbe72dd37f541eb0a8fb177c195567ab
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560575
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 16:13:34 -07:00
smadhavan
d9add2db52 gpu: nvgpu: pkc signature verification support
This change adds lsf_ucode_desc_wrapper to hold the pkc signature
header and corresponding lsf_lsb_header_v2. During blob preparation
based on the flag is_sig_pkc, the new header defines will be
packed into ls blob and passed to acr.
The flag NVGPU_PKC_LS_SIG_ENABLED is also added, which will be set
based on the acr core selection.

JIRA NVGPU-6365

Change-Id: I74e25d7c0f69d4007893e46006f97f2a607fd11f
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2506136
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 16:04:25 -07:00
Antony Clince Alex
f80dccb543 gpu: nvgpu: report gpc_tpc_mask in physical order
At present, there is an inconsistency in the order in which
gpc_tpc masks are reported to the userspace. Both gpc and
tpc masks are reported using physical-ids. However, the
gpc_tpc_masks array is ordered by logical gpc-ids and
not physical-ids. This creates a mismatch between the gpc
reported as enabled in the gpc_mask and its corresponding
gpc_tpc_mask.

Introduce field "gpc_tpc_mask_physical" which stores the
gpc_tpc_masks in physical order and update
NVGPU_GPU_IOCTL_GET_TPC_MASKS to return this field.

Bug 200665942

Change-Id: I63aa83414a59676b7e7d36b6deb527e2f3c04cff
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2531114
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 16:04:01 -07:00
Divya Singhatwaria
842bef7124 gpu: nvgpu: Support GPC and FBP Floorsweeping
- Add gops_fbp_fs and gops_gpc_pg struct
- Add HALs to write to NV_FUSE_CTRL_OPT_FBP and
  NV_FUSE_CTRL_OPT_GPC fuses needed for floorsweeping
- Add set_fbp_mask and set_gpc_mask to probe FBP and GPC mask
  respectively during gpu probe
- Add sysfs node: fbp_fs_mask and gpc_fs_mask to store
  FBP and GPC floorsweeping mask sent from userspace
- Move the floorsweeping programming early in NVGPU’s GPU init
  function and then issue a PRI init.

JIRA NVGPU-6433

Change-Id: I84764d625c69914c107e1e8c7f29c476c2f64f78
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2499571
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 06:17:25 -07:00
Divya Singhatwaria
9f30609550 gpu: nvgpu: Rename TPC powergating mutex
Rename tpc_pg_lock to static_pg_lock and
have_tpc_pg_lock to have_static_pg_lock as it
is used for tpc/gpc/fbp power gating.

JIRA NVGPU-6433

Change-Id: I4c56b9710e303ad9e872bad4b5ed9a167acb9dd6
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537489
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-18 02:46:25 -07:00
mkumbar
860027dc8c gpu: nvgpu: ga10b nvriscv pmu ucode update
-PMU ucode from gpmu/ga10b branch.
-Perfmon, PG and ACR features are enabled with bin.
-update APP_VERSION_NVGPU_NEXT_CORE PMU app version
 to 30147895
-Add dummy bytes to pmu boot params.
-P4 CL #30187066

JIRA NVGPU-6955

Change-Id: I17c51edaa2d4f8cd34e0e43044d62aae52b8ef2a
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2559075
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-18 00:05:04 -07:00
mkumbar
87984ea344 gpu: nvgpu: support nvriscv debug feature
Enable nvriscv debug buffer feature in NVGPU.
Debug buffer is a feature to print the debug log from ucode onto console
in real time.
Debug buffer feature uses the DMEM, queue and SWGEN1 interrupt to share
ucode debug data with NVGPU.
Ucode writes debug message to DMEM and updates offset in queue to trigger
interrupt to NVGPU.
NVGPU copies the debug message from DMEM to local buffer to process and
print onto console.

Debug buffer feature is added under falcon unit and required engine
can utilize the feature by providing required param through public
functions.

Currently GA10B NVRISCV NS/LS PMU ucode has support for this feature
and enabled support on NVGPU side by adding required changes, with this
feature enabled, it is now possible to see prints in real time.

JIRA NVGPU-6959

Change-Id: I9d46020470285b490b6bc876204f62698055b1ec
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548951
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-17 12:45:00 -07:00
Richard Zhao
7ce01d3d1d gpu: nvgpu: vgpu: add size and pgsz_idx when unmap buffer
Since the server won't manage mapped_buffer anymore, the client needs to
pass size and pgsz_idx to unmap buffers.

Jira GVSCI-10901

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Iff076e2cd86d0be71565b43d3993704e51978abe
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2557063
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-17 06:26:11 -07:00
mkumbar
fcf31d7063 gpu: nvgpu: ga10b: fix GSP/PMU priv error
- Fix GSP/PMU registers priv errors which are seen as part of boot sequence.
- Couple of GSP/PMU Falcon/NVRISCV registers are allowed to access
  upon NVRISCV bootrom completion but these registers were needed
  to configure on legacy chips to bootstrap/configure Falcon.
- Add is_falcon2_enabled or NVGPU_PMU_NEXT_CORE_ENABLED check
  to skip these registers.

JIRA NVGPU-7025

Change-Id: I087a477ade6736398dea113f89894a0ff73ae647
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2553127
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-16 16:44:08 -07:00
Sagar Kadamati
aabc161151 gpu: nvgpu: vgpu: added VAB support for HV
Added below IVC commands to support VAB on HV.

 * TEGRA_VGPU_CMD_FB_VAB_RESERVE - Enable & Configure VAB tracking
 * TEGRA_VGPU_CMD_FB_VAB_FLUSH_STATE - Dump VAB to user buffer
 * TEGRA_VGPU_CMD_FB_VAB_RELEASE - Disable VAB tracking

Also set HAL and enable VAB for ga10b vgpu.

Jira GVSCI-4619

Change-Id: Id7564611c24740ab8613e4baa420ee58fb52759a
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2507268
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-16 16:40:47 -07:00
Ramesh Mylavarapu
d328bff79e gpu: nvgpu: gsp NVRISCV load and bootstrap
Changes:
- This change will only init gsp software
  state, nvgpu_gsp_bootstrap need to be called.
- CONFIG_NVGPU_GSP_SCHEDULER flag is created to
  compile out the gsp scheduler code when needed.
- Created GSP engine reset which is needed when
  ACR completed execution and need to load gsp fw.

NVGPU-6783

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I2ce43e512b01df59443559eab621ed39868ad158
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554267
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-15 17:21:03 -07:00
Vedashree Vidwans
43980bfe06 gpu: nvgpu: remove nvgpu_is_bpmp_running usage
BPMP driver doesn't support any API to check whether bpmp is running.
Remove use of nvgpu_is_bpmp_running.

Bug 200720732

Change-Id: Id266e65d4af598dd056cbdbaa219d0d53b7b3fb3
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556448
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-15 10:06:42 -07:00
Divya Singhatwaria
77e3a8c5e4 gpu: nvgpu: ga10b: Add request_idle ce ops
Issue observed:
- In GA10B, it was observed that after recovery happens
  ELPG does not engage.
- It was because, after CE reset, when nvgpu_submit_twod test
  was run to engage ELPG, IDLE_FLIPPED_PWR_OFF signal was asserted.
- This means that when ELPG was engaged (engine is in PWR_OFF),
  some idle signal flips (becomes non-idle) and this causes
  IDLE_SNAP. After IDLE_SNAP is hit, ELPG will not engage further.
- After debugging from WAVES, it was observed that:
  LCE0/LCE1 are not done with the reset sequence.
- The state of these LCE is RESET0. A pri request (pri read
  to NV_CE_PCE_MAP register in CE) is seen that kicks it out of
  RESET0. After this state, it goes through few states to update
  some internal states (states RESET1/RESET2/PCE_MAP etc) and then
  eventually settles down to IDLE state.

Solution:
- Read ce_pce_map_r register in recovery sequence (after ce reset).
- It is observed that when this read is added recovery is complete
  and post that when nvgpu_submit_two test is executed, ELPG is engaging.
- This means that a pri read is needed after CE reset so that it settles
  to idle state properly and post that ELPG can engage properly.

Bug 200734258

Change-Id: I5bb84921ca62a740fde81ffe6c29ccde4ebb341b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554493
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-15 10:05:02 -07:00
Deepak Nibade
2237221a57 gpu: nvgpu: fix CERT EXP34-C errors in common.gr
nvgpu_gr_config_get_sm_info() returns NULL if invalid SM id is provided
to the API. Since it is possible return NULL, a NULL check is required
at all callers.

Also, nvgpu_gr_config_get_sm_info() is always called in a loop from 0
to (sm_count - 1) and hence adding an nvgpu_assert() should be
sufficient.

Change-Id: I0fd92ac354447796c4c7d7237e7bd3b6e5c2682c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552409
(cherry picked from commit 4f3789d6563bbfe1be3e25c522ca1eac0d5d2d13)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2558271
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: V M S Seeta Rama Raju Mudundi <srajum@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-13 13:52:24 -07:00
Deepak Nibade
4edf952e3e gpu: nvgpu: fix rule 5.1 misra violations in common.gr
Fix rule 5.1 misra violations in common.gr by renaming below functions :

nvgpu_gr_config_get_gpc_tpc_mask_base ->
  nvgpu_gr_config_get_base_mask_gpc_tpc

nvgpu_gr_config_get_gpc_tpc_count_base ->
  nvgpu_gr_config_get_base_count_gpc_tpc

gm20b_ctxsw_prog_set_priv_access_map_config_mode ->
  gm20b_ctxsw_prog_set_config_mode_priv_access_map

gm20b_ctxsw_prog_set_priv_access_map_addr ->
  gm20b_ctxsw_prog_set_addr_priv_access_map

gm20b_gr_falcon_read_fecs_ctxsw_mailbox ->
  gm20b_gr_falcon_read_mailbox_fecs_ctxsw

gm20b_gr_falcon_read_fecs_ctxsw_status0 ->
  gm20b_gr_falcon_read_status0_fecs_ctxsw

gm20b_gr_falcon_read_fecs_ctxsw_status1 ->
  gm20b_gr_falcon_read_status1_fecs_ctxsw

gv11b_gr_intr_get_sm_hww_warp_esr_pc ->
  gv11b_gr_intr_get_warp_esr_pc_sm_hww

gv11b_gr_intr_get_sm_hww_warp_esr ->
  gv11b_gr_intr_get_warp_esr_sm_hww

Jira NVGPU-6779

Change-Id: Icbe23a7b022373785968fc417ee247e2d80cfcc6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554521
(cherry picked from commit 1432650774506f2a7e45f70b084f498736d0d0c5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555330
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-13 09:20:41 -07:00
Debarshi Dutta
200777b854 gpu: nvgpu: bvec for channel and tsg
Below changes are added.

1) Added checks in
    nvgpu_channel_from_id__func, nvgpu_tsg_check_and_get_from_id
2) Added BVEC tests for
    nvgpu_channel_open_new, nvgpu_channel_from_id,
    nvgpu_tsg_check_and_get_from_id, nvgpu_tsg_set_error_notifier
3) Added common function get_random_u32.

Jira NVGPU-6905

Change-Id: I374d6f5503dc05e3224213d772a1752d82cbdc91
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548304
(cherry picked from commit 39b2529b3e96cfd3cbd3bb020f32ee2cca0ea363)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554021
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-07 12:25:50 -07:00
Lakshmanan M
46457ea536 gpu: nvgpu: Fix priv error when MIG+Profiling is alive
1) Currently only one profiler object should be allowed.
   Enable/Disable/Reset CAU is using whole GR space for both
   MIG and legacy mode. Need to convert broadcast address to
   GR specific unicast programming when NvGpu supports
   more than one profiler object at a time.

2) Used nvgpu_gr_exec_with_err_for_instance() for
   update_smpc_global_mode().

JIRA NVGPU-5656

Change-Id: If9c2af1459458c031c7cc269e1a89f527b972d7c
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554590
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-07 08:47:08 -07:00
Antony Clince Alex
f51a43b579 gpu: nvgpu: ga10b: fix fetching of FBP_L2 FS mask
On all chips except ga10b, the number of ROP, L2 units per FBP
were in sync, hence, their FS masks could be represented by a single
fuse register NV_FUSE_STATUS_OPT_ROP_L2_FBP. However, on ga10b, the ROP
unit was moved out from FBP to GPC and it no longer matches the number
of L2 units, so the previous fuse register was broken into two -
NV_FUSE_CTRL_OPT_LTC_FBP, NV_FUSE_CTRL_OPT_ROP_GPC.

At present, the driver reads the NV_FUSE_CTRL_OPT_ROP_GPC register
and reports incorrect L2 mask. Introduce HAL function
ga10b_fuse_status_opt_l2_fbp to fix this.

In addition, rename fields and functions to exclusively fetch L2 masks,
this should help accommadate ga10b and future chips in which L2 and ROP units
are not in same. As part of this, the following functions and
fields have been renamed.
- nvgpu_fbp_get_rop_l2_en_mask => nvgpu_fbp_get_l2_en_mask
- fuse.fuse_status_opt_rop_l2_fbp => fuse.fuse_status_opt_l2_fbp
- nvgpu_fbp.fbp_rop_l2_en_mask => nvgpu_fbp.fbp_l2_en_mask

The HAL ga10b_fuse_status_opt_rop_gpc is removed as rop mask is not
used anywhere in the driver nor exposed to userspace.

Bug 200737717
Bug 200747149

Change-Id: If40fe7ecd1f47c23f7683369a60d8dd686590ca4
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551998
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-07 05:48:56 -07:00
Pekka Jylhä-Ollila
8a72068508 Revert "gpu: nvgpu: gsp NVRISCV load and bootstrap"
This reverts commit aef4b80acb.

Change-Id: I47e02bf97e6a3aaa9acdd7f5eec41518b31ee5dc
Signed-off-by: Pekka Jylhä-Ollila <pjylhaollila@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554105
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
2021-07-05 06:01:52 -07:00
Ramesh Mylavarapu
aef4b80acb gpu: nvgpu: gsp NVRISCV load and bootstrap
Changes:
- This change will only init gsp software
  state, nvgpu_gsp_bootstrap need to be called.
- CONFIG_NVGPU_GSP_SCHEDULER flag is created to
  compile out the gsp scheduler code when needed.
- Created GSP engine reset which is needed when
  ACR completed execution and need to load gsp fw.

NVGPU-6783

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I26263ee5bae07de056f676ed0fddc1193b5af82d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2530438
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-04 13:34:51 -07:00
scottl
cd3ad1ccc7 gpu: nvgpu: fix REMAP android build failure
Rework nvgpu_vm_remap_os_buf structure initialization to
avoid android/clang build issues with the use of a single pair
of {} to initialize certain structures.

The os-dependent nvgpu_vm_remap_os_buf_get() routine now does
a memset of the structure prior to initializing its contents.

Jira NVGPU-6804

Change-Id: I08682c6ab7b8324a605a56ed660dea5bea11d16b
Signed-off-by: scottl <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2553193
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-07-03 02:05:25 -07:00
Lakshmanan M
e9872a0d91 gpu: nvgpu: Skip graphics unit access when MIG is enabled
This CL covers the following modifications,
1) Added logic to skip the graphics unit specific sw context load
   register write during context creation when MIG is enabled.
2) Added logic to skip the graphics unit specific sw method
   register write when MIG is enabled.
3) Added logic to skip the graphics unit specific slcg and blcg gr
   register write when MIG is enabled.
4) Fixed some priv errors observed during MIG boot.
5) Added MIG Physical support for GPU count < 1.
6) Host clk register access is not allowed for GA100.
   So skipped to access host clk register.
7) Added utiliy api - nvgpu_gr_exec_with_ret_for_all_instances()
8) Added gr_pri_mme_shadow_ram_index_nvclass_v() reg field
   to identify the sw method class number.

Bug 200649233

Change-Id: Ie434226f007ee5df75a506fedeeb10c3d6e227a3
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2549811
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 16:41:51 -07:00
tkudav
0526e7eaa9 gpu: nvgpu: Create CIC-mon and CIC-rm subunits
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.

CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).

Split the CIC APIs and data-members into above two subunits.

JIRA NVGPU-6899

Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 09:57:56 -07:00
Deepak Nibade
8ccf9820ba gpu: nvgpu: check for valid sm_id in nvgpu_gr_config_get_sm_info
Check if requested sm_id is valid in nvgpu_gr_config_get_sm_info()
function. Also update doxygen documentation for same.

Also, ensure SM count is set using nvgpu_gr_config_set_sm_info() before
usig nvgpu_gr_config_get_sm_info() to retrieve it.

Update unit test test_gr_config_set_get to set valid SM count instead of
random number. With random number it is possible that SM count is set
higher than size of SM info struct. This could result into test process
crash.

Change-Id: I4292977b7e880752c65001cbd594e0617fe135f5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2549882
(cherry picked from commit ee9767cac1a27ffbc99f707c1aa158b8216d757f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551983
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-01 06:51:05 -07:00
Deepak Nibade
02943a63b4 gpu: nvgpu: rework ptimer scale APIs
common.ptimer unit right now exposes two APIs -
scale_ptimer() to scale the timer
ptimer_scalingfactor10x() to get the scaling factor

receiving scaling factor is not really necessary for user of
common.ptimer since it can be internally calculated in scale_ptimer()
function itself.

Hence make ptimer_scalingfactor10x() static and rename public API
scale_ptimer() to nvgpu_ptimer_scale()

nvgpu_ptimer_scale() will not accept timeout value as parameter
and return scaled timeout value in another pointer parameter.
Error code is returned if timeout value is invalid.

Jira NVGPU-6394

Change-Id: Ib882d99f6096c3af5f96eef298d713fb5e36dd87
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546970
(cherry picked from commit 2da7c918efe91046818c83481664312e194ead8e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551334
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-01 06:48:20 -07:00
Ramesh Mylavarapu
b38b8a794d gpu: nvgpu-next: Update pg pre and post init structs
SECURITY_HARDENING feature in pmu ucode leads to failure
in pg pre and post init rpcs due to mismatch is interface
struct size. This change will update pg pre and post init
nvgpu-pmu interface structs as per pmu ucode.

NVGPU-6421

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: Ied9179b3a7ee1923dba56e792979115f3a19f7e5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551026
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-30 10:22:45 -07:00
ajesh
83d1ae9c0a gpu: nvgpu: add bvec tests for utils
Add boundary value tests for common utils unit.

JIRA NVGPU-6395

Change-Id: I4442f339c0238e7ee8a44277ca5f53db9c71f367
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542636
(cherry picked from commit 125d73582d57b673b155ada6ce7430401d56dbc3)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548579
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-29 06:58:24 -07:00
scottl
3cd256b344 gpu: nvgpu: add linux REMAP support
Add REMAP ioctl and accompanying support to the linux nvgpu driver.

REMAP support provides per-page control over sparse VM areas using the
concept of a virtual memory pool.

The REMAP ioctl accepts a list of operations (each a map or unmap) that
modify the VM area pages tracked by the virtual mmemory pool.

Inclusion of REMAP support in the nvgpu build is controlled by the new
CONFIG_NVGPU_REMAP flag.  This flag is enabled by default for linux builds.
A new NVGPU_GPU_FLAGS_SUPPORT_REMAP characteristics flag is added for use
in detecting when REMAP support is available.

When a VM allocation tagged with NVGPU_VM_AREA_ALLOC_SPARSE is made the
base virtual memory pool resources are allocated.  Per-page resources are
later allocated when the NVGPU_AS_IOCTL_REMAP ioctl is issued.  All REMAP
resources are released when the corresponding VM area is freed.

Jira NVGPU-6804

Change-Id: I1f2cdc0c06c1698a62640c1c6fbcb2f9db24a0bc
Signed-off-by: scottl <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542178
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-28 22:39:06 -07:00
Richard Zhao
cecb0666f6 gpu: nvgpu: pd_cache: always zero partial on free
After a partial cache is freed, it is possible to be re-used next time.
So always zero the partial on free.

Jira GVSCI-10977

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I7779169793e32d396db187d6e6e072d6f194e91e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548471
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-28 18:10:28 -07:00
Richard Zhao
61173ed198 gpu: nvgpu: vgpu: add new cmd for preemption mode support
- added new cmd for set preemption mode, all buffers will be allocated
and mapped on server side
- removed the old cmd bind ctxsw buffers.

gr_ctx and its associated buffers have all moved to server side,
including memory allocation, va allocation, gpu mapping and commit.

Jira GVSCI-10977

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I28f0e20bf414f51a842a33d0c12bfe9ff5e34a4d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546856
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-28 18:10:23 -07:00
Richard Zhao
77f0ab6583 gpu: nvgpu: remove gpu_va update_hwpm_ctxsw_mode
Since gpu server can noew allocate va itself, update_hwpm_ctxsw_mode
does not need to fixed map pm ctx anymore.

Jira GVSCI-10977

Change-Id: If592c8a2eb6dbfd7d922c79c87871162e9d8d8a4
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546192
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-28 18:10:18 -07:00
Richard Zhao
f9ae5c6424 gpu: nvgpu: vgpu: merge ivc commands for .alloc_obj_ctx
- added new ivc cmd for .alloc_obj_ctx
- removed functions which were used to implement .alloc_obj_ctx

Jira GVSCI-10977

Change-Id: Iec868d601d2844957aa1ccd40626787d388546d0
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546191
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-28 18:10:06 -07:00
Richard Zhao
4e08649b7f gpu: nvgpu: move mem checking of gr_ctx to .alloc_obj_ctx
Preparing for adding vgpu cmd .add_obj_ctx and memory will be allocated
on server side. Outside of implementation of .alloc_obj_ctx, code should
not check whether gr_ctx is valid by check gr_ctx mem.

Jirs GVSCI-10977

Change-Id: I6b3d826e930fdfaaae517d204186642e49f5c2d7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546190
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-28 18:10:01 -07:00
Richard Zhao
ec1175123e gpu: nvgpu: vgpu: add client va_start and va_limit to gmmu map cmd
The server cannot construct same VA range with only client VA size. So
pass va_start and va_limit to server. The server will take the client
VA range as user region.

Jira GVSCI-10900

Change-Id: Ib5ab65f17a1b410d65155d39defc088e02efa3f2
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548470
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-28 18:09:50 -07:00
Richard Zhao
671dbbb145 gpu: nvgpu: remove vm->guest_managed
gpu server now moved to use kernel vma range too, so guest_managed is
not used anymore.

Jira GVSCI-10900

Change-Id: I838cad24194faf72fe5ef53053e5dacc9f6588c1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546189
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-28 18:09:44 -07:00
Antony Clince Alex
68e11c8bd3 gpu: nvgpu: remove nvgpu_next_gpuid.h
Replace all usages of NVGPU_NEXT_GPUID and NVGPU_NEXT_DGPU_GPUID
with NVGPU_GPUID_GA10B and NVGPU_GPUID_GA100.

Remove nvgpu_next_gpuid.h and update yaml.

Jira NVGPU-4771

Change-Id: I3baf0de4eb5266b79aabd5c6ddf8442bf8f73419
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547735
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:03:09 -07:00
Antony Clince Alex
d2919409e9 gpu: nvgpu: rename/collpase nvgpu_next functions and structs
Replace all nvgpu_next functions/structs either by 1) collapsing them
into nvgpu legacy functions/structs 2) renaming them as follows:
- nvgpu_next_*() => nvgpu_(ga10b/ga100)_*()
- nvgpu_next_*() => (ga10b/ga100)_*()
- nvgpu_next_*() => nvgpu_*() [only if this doesn't cause collision]
- nvgpu_next_*() = > nvgpu_*_extra()

Create hal.sim unit and move Ampere+ SIM code into it.

Jira NVGPU-4771

Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:58 -07:00
Antony Clince Alex
f9cac0c64d gpu: nvgpu: remove nvgpu_next files
Remove all nvgpu_next files and move the code into corresponding
nvgpu files.

Merge nvgpu-next-*.yaml into nvgpu-.yaml files.

Jira NVGPU-4771

Change-Id: I595311be3c7bbb4f6314811e68712ff01763801e
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547557
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:53 -07:00