Commit Graph

9734 Commits

Author SHA1 Message Date
ajesh
aa08389240 gpu: nvgpu: update doxygen for posix
Update the documentation as per SWUD feedback for posix unit.

JIRA NVGPU-6963

Change-Id: I29ed84ea21957b4593684ab62a798fc477fc279f
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581414
(cherry picked from commit 89b560deebf6485356afbfddd508104e95136508)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587428
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 21:44:22 -07:00
Tejal Kudav
9b5274593c gpu: nvgpu: Update common.ptimer documentation
Enhance doxygen comments for below common.ptimer APIs:
1. nvgpu_scale_ptimer()
2. gops_ptimer.isr()

Remove assert calls from nvgpu_scale_ptimer() as it now
has a means to return error.
Reorder the Ptimer ISR code for better logical flow.

JIRA NVGPU-6989

Change-Id: I5adf4d665d3b90d3e9b11557a15fcb91e485f353
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2583667
(cherry picked from commit 502ab9ee2dc3f3b7b1da7ac59f13fddce4ead616)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592057
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 05:59:13 -07:00
Tejal Kudav
5a94007725 gpu: nvgpu: Remove redundant HAL from common.fbp
common.fbp has two interfaces to initialize FBP:
1. Public API nvgpu_fbp_init_support
2. HAL fbp.fbp_init_support

nvgpu_fbp_init_support() is only used to initialize HAL
fbp.fbp_init_support. Remove the HAL and use the API directly.

JIRA NVGPU-6644

Change-Id: I2c455e09dbcf5e4fb1dc370b284e4f0d5c678b40
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592047
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 05:59:00 -07:00
Sahil Mukund Patki
794d1edbe4 gpu: nvgpu: Fix debugfs compilation errors
The function "nvgpu_ce_debugfs_init" is declared in "debug_ce.h".
This file is only compiled when CONFIG_DEBUG_FS is enabled. So
any accesses to this function result in compilation errors when
CONFIG_DEBUG_FS is disabled.

This patch fixes the errors by guarding all accesses to the above
mentioned function by CONFIG_DEBUG_FS.

Bug 200755555

Change-Id: Ie566413913c4a72b10b87c3285d1263d1c811074
Signed-off-by: Sahil Mukund Patki <spatki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591304
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-15 09:16:22 -07:00
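Note on 794d1edbe4: the guarding pattern described in that commit can be sketched as follows. This is a minimal illustration, not the driver's code; `struct gpu`, `ce_debugfs_init()` and `gpu_init_debug()` are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-GPU state. */
struct gpu {
	int debugfs_nodes; /* number of debugfs nodes created so far */
};

#ifdef CONFIG_DEBUG_FS
/* Only exists when debugfs support is compiled in. */
void ce_debugfs_init(struct gpu *g)
{
	g->debugfs_nodes++;
}
#endif

/*
 * The caller guards the access with the same option, so the symbol is
 * never referenced when CONFIG_DEBUG_FS is not defined.
 */
int gpu_init_debug(struct gpu *g)
{
	if (g == NULL)
		return -1;
#ifdef CONFIG_DEBUG_FS
	ce_debugfs_init(g);
#endif
	return 0;
}
```

Building the sketch with and without `-DCONFIG_DEBUG_FS` compiles in both cases; without the guard in `gpu_init_debug()`, the second build would fail on the missing function.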
prsethi
dd94573e55 gpu: nvgpu: Update KMDI mapping interface
Finding GPU VA mappings inside a given range is a two-step process: the
first step queries the number of mappings, and the second step queries
all the contiguous mapping ranges for the given GPU VA range. The mapping
interface should count and return the number of mappings if the input
count is 0, instead of failing.
This patch implements the two-step process: the first step returns only
the count, and the second step returns the contiguous memory ranges.

This patch also replaces nvgpu_zalloc with nvgpu_big_zalloc to handle
larger allocations.

Bug 200722275

Change-Id: I56428deafa560ac8471c78f102bb1f9dbe20cabc
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591043
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-15 09:16:06 -07:00
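Note on dd94573e55: the count-then-fetch convention described in that commit might look like the sketch below. The structure and function names (`struct va_range`, `find_mappings()`) are assumptions made for illustration and do not reflect the actual KMDI interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one mapped GPU VA range. */
struct va_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Report mappings that lie fully inside [lo, hi).
 * Step 1: if *count is 0 on entry, only count the matches.
 * Step 2: otherwise copy at most *count matches into out.
 * On return *count holds the number counted or copied.
 * Returns 0 on success, -1 on bad arguments.
 */
int find_mappings(const struct va_range *maps, size_t nmaps,
		  uint64_t lo, uint64_t hi,
		  struct va_range *out, uint32_t *count)
{
	uint32_t found = 0U;
	size_t i;

	if (maps == NULL || count == NULL || (*count != 0U && out == NULL))
		return -1;

	for (i = 0; i < nmaps; i++) {
		uint64_t end = maps[i].start + maps[i].size;

		if (maps[i].start < lo || end > hi)
			continue;
		if (*count != 0U) {
			if (found == *count)
				break; /* caller's buffer is full */
			out[found] = maps[i];
		}
		found++;
	}
	*count = found;
	return 0;
}
```

A caller would first pass a zero count to learn how large a buffer to allocate, then call again with that count and the buffer.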
Debarshi Dutta
79ab0ba6c4 gpu: nvgpu: remove sudo restrictions on gpu nodes.
When SMC modes are enabled, devices are created with sudo-only
access permissions. Those permissions are relaxed to allow non-sudo
processes to submit jobs.

Also, allow only root users to power off explicitly via the device
power node.

Bug 3374078

Change-Id: Ieb869399c3ada3588708cf2bc99a580414023cb7
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590584
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-15 09:15:49 -07:00
Antony Clince Alex
f3164a4672 gpu: nvgpu: fix tpc_fs_mask sysfs output
The tpc_fs_mask sysfs entry outputs the TPC masks in logical order,
which is inconsistent with gpc_fs_mask, which is in physical order.
For consistency, update tpc_fs_mask to provide output in physical
order.

Bug 3364907

Change-Id: I2cc7b66dac2bea215024ef95944cde4b46d51c9a
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593803
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-14 16:14:33 -07:00
Seshendra Gadagottu
5f62534127 Revert "gpu: nvgpu: ga10b: add errata for disable CBU ECC"
This reverts commit 78d7a7fdde.

Reason for revert: fix is available, so no errata required

Bug 200759575

Change-Id: Id46dd3e8ecde1e56fd0e0bca2746dc9c35e07728
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584855
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-14 16:09:48 -07:00
Vedashree Vidwans
ecaafaf75e gpu: nvgpu: ga10b: correct mmu fault static arrays
Correct index and value of static descriptor arrays used to parse
faulted hub and gpc clients.

Bug 3373998

Change-Id: Ia0476b272aa110b6172c69b3d5ef2b76a683a856
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2593631
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-14 01:59:59 -07:00
Debarshi Dutta
791dc18666 gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields
Add Setter and Getter methods for accessing tsg->sm_error_states.
Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state.
This renders it unnecessary to add BVEC for above fields for the struct
in multiple locations. The current design ensures that only a constant
pointer is obtained from the owner unit i.e. FIFO.

The following new methods are added. Both unit tests and BVEC tests
are added for them as well.

nvgpu_tsg_store_sm_error_state
nvgpu_tsg_get_sm_error_state

Jira NVGPU-6947

Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-13 20:57:09 -07:00
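Note on 791dc18666: the setter/getter design, where callers only ever receive a constant pointer from the owning unit, can be sketched roughly as below. The types, fields and function signatures here are hypothetical and are not the real nvgpu_tsg_store_sm_error_state / nvgpu_tsg_get_sm_error_state prototypes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of the per-SM error record. */
struct sm_error_state {
	uint32_t global_esr;
	uint32_t warp_esr;
};

/* Hypothetical owner object; it alone may modify the array. */
struct tsg {
	struct sm_error_state *sm_error_states;
	uint32_t num_sm;
};

/* Setter: the single place where the range check on sm_id is needed. */
int tsg_store_sm_error_state(struct tsg *t, uint32_t sm_id,
			     uint32_t global_esr, uint32_t warp_esr)
{
	if (t == NULL || t->sm_error_states == NULL || sm_id >= t->num_sm)
		return -1;
	t->sm_error_states[sm_id].global_esr = global_esr;
	t->sm_error_states[sm_id].warp_esr = warp_esr;
	return 0;
}

/* Getter: hands out a read-only view, or NULL if sm_id is out of range. */
const struct sm_error_state *tsg_get_sm_error_state(const struct tsg *t,
						    uint32_t sm_id)
{
	if (t == NULL || t->sm_error_states == NULL || sm_id >= t->num_sm)
		return NULL;
	return &t->sm_error_states[sm_id];
}
```

Because writes are funnelled through one function, boundary checks on the fields only need to live there rather than at every call site.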
Debarshi Dutta
6361653633 gpu: nvgpu: update swud for priv_ring
Update documentation for priv_ring unit based on updated swud
guidelines. This patch contains a combination of two commits.

Documentation is added for the HAL methods

enable_priv_ring and isr
decode_error_code
enum_ltc
get_fbp_count
get_gpc_count
set_ppriv_timeout_settings

Jira NVGPU-6986

Change-Id: Ifa401dab0f29330ab7db2dcc888edf46a402cc83
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587227
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 0bdcf425ca58e6d04dceaedbb48f3adef43a870a)
(cherry picked from commit ca44c09df60791db2ea6a6a80bc807f6c7eba494)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2590992
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-13 20:57:03 -07:00
ajeshkv
118f8c1280 gpu: nvgpu: add support for gsp stress test
Add debugfs entries to support GSP stress test and other
functionalities to enable the test.

JIRA CORERM-3382

Change-Id: Iab20fcfe78807e76e91c64716502a2f036ed4d18
Signed-off-by: ajeshkv <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589390
Reviewed-by: Amit Pabalkar <apabalkar@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-10 16:02:43 -07:00
Vedashree Vidwans
a3e2283cf2 gpu: nvgpu: ga10b: Use active ltcs count for cbc init
This patch fixes a bug in the cbc initialization code for ga10b,
where it was erroneously assumed that a fixed ltc count of only one
should be used for historical reasons. For volta and later, the full
ltc count should be used in cbc-related computation.
Ensure
- CBC base address is 64K aligned
- CBC start address lies within CBC allocated memory

Check CBC is marked safe only for silicon platform.

Bug 3353418

Change-Id: I5edee2a78dc9e8c149e111a9f088a57e0154f5c2
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585778
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-10 16:00:25 -07:00
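Note on a3e2283cf2: the two conditions listed in that commit (64K-aligned base, start address inside the allocated memory) can be expressed as simple predicates. This is a sketch with hypothetical function names, not the checks as implemented in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define CBC_BASE_ALIGN (64ULL * 1024ULL) /* 64K alignment from the commit text */

/* True if the CBC base address is a multiple of 64K. */
bool cbc_base_is_aligned(uint64_t base)
{
	return (base & (CBC_BASE_ALIGN - 1ULL)) == 0ULL;
}

/* True if start lies inside [alloc_base, alloc_base + alloc_size). */
bool cbc_start_in_range(uint64_t start, uint64_t alloc_base,
			uint64_t alloc_size)
{
	return start >= alloc_base && (start - alloc_base) < alloc_size;
}
```

The range check subtracts before comparing so that `alloc_base + alloc_size` cannot overflow.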
deepak goyal
cc7b048641 gpu: nvgpu: non-zero blob size for rail-gating.
A ucode blob size of 0 is currently passed for rail-gating, but
ACR does not support a blob size of 0 yet.
ACR will copy the ucode blob to SYSMEM again for GPU rail-gating cycles.

Bug 3361416

Change-Id: I1fdb3993cda7e5d62507d83f9c0a8645dc5f7fc7
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2588207
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-09 09:16:37 -07:00
David Li
b27524916a gpu: nvgpu: ga10b fix zcull sm_num_rcp_conservative
-calculate sm_num_rcp_conservative correctly using TPC total from all GPCs
-register manual says use SM count but it's actually TPC count

bug 3370219

Change-Id: I4422fb09d3a59879394e0e1abc5513efc6355b5b
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586399
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Gangzheng Tong <gtong@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Gangzheng Tong <gtong@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-09 09:13:31 -07:00
Debarshi Dutta
a53ebf02d1 gpu: nvgpu: update error message to info.
These errors are now actually expected from code that counts number of
sys/gpc/fbp perfmons after first context creation. Nvgpu tries to count
them by register offset lookup in context image and counts perfmons until
invalid offset is found.

nvgpu_gr_hwmp_map_find_priv_offset no longer prints an error message.
The correct error condition is moved to gr_exec_reg_ops

Bug 200755537

Change-Id: Ib5c6ccd39275b2b06e3f8bce4878a3234478a780
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586228
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-09 09:13:03 -07:00
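Note on a53ebf02d1: counting perfmons "until an invalid offset is found" means that one failed lookup is the normal loop exit, which is why it should not be logged as an error. The sketch below illustrates the idea with a hypothetical lookup table; `struct offset_map`, `find_priv_offset()` and `count_perfmons()` are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of register addresses present in the context image. */
struct offset_map {
	const uint32_t *addrs;
	size_t count;
};

/* Returns 0 and the offset if addr is present, -1 otherwise (no logging). */
int find_priv_offset(const struct offset_map *map, uint32_t addr,
		     uint32_t *offset)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->addrs[i] == addr) {
			*offset = (uint32_t)(i * 4U);
			return 0;
		}
	}
	return -1;
}

/* Count consecutive perfmons; the first failed lookup ends the count. */
uint32_t count_perfmons(const struct offset_map *map, uint32_t base,
			uint32_t stride)
{
	uint32_t n = 0U;
	uint32_t offset;

	if (stride == 0U)
		return 0U;
	while (find_priv_offset(map, base + n * stride, &offset) == 0)
		n++;
	return n;
}
```

Only a caller that really requires the register to exist should treat the `-1` as an error and report it.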
Antony Clince Alex
ab4aa0afba gpu: nvgpu: remove incorrect usage of CONFIG_NVGPU_NEXT
Remove incorrect usage of CONFIG_NVGPU_NEXT introduced in patch:
https://git-master.nvidia.com/r/#/c/linux-nvgpu/+/2499571/

JIRA NVGPU-6574

Change-Id: I9bf0f0ee5d9762b79dd7913402678b0dd87f21ee
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2567353
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-08 06:50:49 -07:00
Sagar Kadamati
dd9b4364aa gpu: nvgpu: add nvgpu-next infrastructure
 * As of now, working on multiple chip bringups in the nvgpu-next repo
   is problematic because we end up losing control of the source code
   (it is hard to find which part of the code belongs to which chip) and
   its valuable history. This affects chip migration on release.

 * To support multiple chip bringups simultaneously, we need new
   guidelines to avoid losing control of the source code and to make
   migration easier. This change adds links to the nvgpu-next repo.

 * Updated return code to ENODEV for consistency
 * Updated ACR unittest to work with ENODEV return code

NOTE:
     These are the initial set of infrastructure changes, guidelines
     will evolve, and source code will get updated accordingly.

     Which parts of the source code fall under the nvgpu-next repo
     will be decided based on future chip features.

JIRA NVGPU-6574

Change-Id: I81827e35d189c55554df00e255b527a4473e0338
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556793
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-08 06:50:38 -07:00
Konsta Hölttä
9ffcb0fade gpu: nvgpu: log submit error reasons
For each common error that may happen in the submit path, log the
failure reason at info level if not already logged. Various mistakes may
cause -EINVAL, and getting to know what is wrong is helpful when writing
tests.

Change-Id: I8ac2a40441e0bf3d8afdb40526b607537eb5105c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587360
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 16:00:50 -07:00
Divya Singhatwaria
b6ab227016 gpu: nvgpu: Enable pmu interrupt
- For secure RISCV boot, enable pmu interrupt
  during pmu_rtos_init
- As interrupts are enabled, PMU intr can be received
  before driver has changed the pmu firmware state. This
  can cause the RISCV boot to fail.
- To resolve this, first change the pmu firmware state
  from off to PMU_FW_STATE_STARTING and then wait
  for pmu priv lockdown release.

Change-Id: Ib2e8b033fec6320bf9ccff02696192a48172464b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586325
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-07 16:00:05 -07:00
dt
152d7c9edd gpu: nvgpu: Fix for pes_tpc_mask programming
This was identified after enabling the CONFIG_UBSAN kernel compilation
flag, which reports whether any shift causes an overflow.
The register "gr_fe_tpc_fs_r(gpc_index)" is read-only after
Volta, so the gops where we compute the index is not needed.

Bug 200727116

Change-Id: Ib2306103389ba9df77fd59d012ec70e775104989
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573296
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 15:59:48 -07:00
dt
4034de5756 gpu: nvgpu: Fix for smid programming
As the number of available tpc/gpc is more than 4 in the new dgpu,
this fix is needed for correct sm_id config programming.

This was identified after enabling the CONFIG_UBSAN kernel compilation
flag, which reports whether any shift causes an overflow: the
shift overflows u32 when the number of available TPCs is more
than four.

Bug 200727116

Change-Id: I9169a00614e4a648afe4a2d2f8e76c178e8c19eb
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571823
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-07 10:29:03 -07:00
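Note on 4034de5756: a shift that overflows u32 once there are more than four TPCs typically comes from packing one fixed-width field per TPC into a single 32-bit word. The sketch below assumes an 8-bit field (so four fields per register) purely for illustration; the real field width and register layout are not stated in the commit.

```c
#include <stdint.h>

#define SM_ID_BITS     8U                  /* hypothetical field width */
#define FIELDS_PER_REG (32U / SM_ID_BITS)  /* four fields fit in one u32 */

/*
 * Overflowing pattern (shown only as a comment):
 *     reg |= sm_id << (tpc * SM_ID_BITS);
 * For tpc >= 4 the shift count reaches 32 or more, which is undefined
 * for a 32-bit operand and is what UBSAN reports.
 *
 * Safe pattern: select the register with tpc / FIELDS_PER_REG and shift
 * only by the position inside that register.
 */
void pack_sm_ids(const uint8_t *sm_ids, uint32_t num_tpc,
		 uint32_t *regs, uint32_t num_regs)
{
	uint32_t tpc;
	uint32_t i;

	for (i = 0U; i < num_regs; i++)
		regs[i] = 0U;

	for (tpc = 0U; tpc < num_tpc; tpc++) {
		uint32_t reg = tpc / FIELDS_PER_REG;
		uint32_t shift = (tpc % FIELDS_PER_REG) * SM_ID_BITS;

		if (reg >= num_regs)
			break; /* not enough registers supplied */
		regs[reg] |= (uint32_t)sm_ids[tpc] << shift;
	}
}
```

With six TPCs, the fifth and sixth IDs land in the low fields of the second register instead of being shifted out of range.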
dt
9355345610 gpu: nvgpu: Add IPA-PA cache to increase the performance
When the GPU needs to be programmed with a PA (physical address),
the given IPA needs to be converted to a PA by querying the hypervisor.
As this is an IPC between OSes, the call reduces performance
significantly. This change adds an IPA-PA cache to improve
performance. It will be most helpful in the passthrough config.

Bug 3277194

Change-Id: I6a3230d858977313a0ed0f33068055a3b516330a
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2571814
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 10:28:58 -07:00
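Note on 9355345610: the idea of caching IPA-to-PA translations so that repeated lookups avoid the hypervisor call can be sketched as a small direct-mapped cache. Everything below is an assumption made for illustration: the per-page granularity, the cache size, and `hyp_translate_page()`, which merely stands in for the real (expensive) hypervisor query.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12U
#define PAGE_MASK_4K  ((1ULL << PAGE_SHIFT_4K) - 1ULL)
#define CACHE_SLOTS   64U

struct ipa_pa_entry {
	uint64_t ipa_page;
	uint64_t pa_page;
	bool valid;
};

struct ipa_pa_cache {
	struct ipa_pa_entry slot[CACHE_SLOTS];
	uint32_t misses; /* number of simulated hypervisor queries */
};

/* Hypothetical stand-in for the hypervisor IPC: a fixed page offset. */
uint64_t hyp_translate_page(uint64_t ipa_page)
{
	return ipa_page + 0x80000ULL;
}

/* Translate an IPA, querying the "hypervisor" only on a cache miss. */
uint64_t ipa_to_pa(struct ipa_pa_cache *c, uint64_t ipa)
{
	uint64_t page = ipa >> PAGE_SHIFT_4K;
	struct ipa_pa_entry *e = &c->slot[page % CACHE_SLOTS];

	if (!e->valid || e->ipa_page != page) {
		e->ipa_page = page;
		e->pa_page = hyp_translate_page(page);
		e->valid = true;
		c->misses++;
	}
	return (e->pa_page << PAGE_SHIFT_4K) | (ipa & PAGE_MASK_4K);
}
```

A real cache would also need locking and invalidation when mappings change; both are left out of this sketch.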
Ramesh Mylavarapu
ffd0d3962f gpu: nvgpu: gsp: gsp isr and debug trace support
- Created GSP NVRISCV interrupt handle and
  respective functions and register reads.
- Created Debug trace support for GSP firmware.

NVGPU-7084

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I2728150c4db00403aa6e3c043bc19c51677dd9cf
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589430
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 05:37:51 -07:00
Antony Clince Alex
2afd601a40 gpu: nvgpu: update FS mask sysfs entries to RDONLY
Repurpose (gpc,fbp,tpc)_fs_mask sysfs nodes to only report active
physical chiplets after floorsweeping.

StaticPG'ing of chiplets will be handled by (gpc,fbp,tpc)_pg_mask sysfs
nodes. The user will be able to write valid PG masks for the respective
chiplets prior to poweron, which can then be verified using the
(gpc,fbp,tpc)_fs_mask nodes.

Bug 3364907

Change-Id: Ia4132f9c1939b2cb4a8f55f9d99a2b0a5b02184c
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587926
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Chris Dragan <kdragan@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-07 05:35:09 -07:00
Rajesh Devaraj
4d46e9e07a gpu: nvgpu: update doxygen for SDL
This patch updates doxygen for the following functions in SDL:
- nvgpu_report_ctxsw_err()
- nvgpu_report_ecc_err()
- nvgpu_report_host_err()
- nvgpu_report_pmu_err()
- nvgpu_report_pri_err()
- gr_intr_report_ctxsw_err()
- nvgpu_report_mmu_err()
- nvgpu_report_gr_err()

JIRA NVGPU-7001

Change-Id: Ie21908cacaf4add1143d68d9f9a4d2d1315dfdd8
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
(cherry picked from commit c1dc3e7c35d585faed8ed3b9c61f6afe044f7263)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2588991
Reviewed-by: V M S Seeta Rama Raju Mudundi <srajum@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-03 14:40:32 -07:00
Debarshi Dutta
33740b41b6 gpu: nvgpu: free memory during module removal
The following pointers (allocated via kmalloc/DMA) aren't freed during
module removal.

struct nvgpu_gr_config -> gpc_tpc_mask_physical
struct nvgpu_netlist_vars -> ctxsw_regs.etpc.l
struct mm_gk20a -> sysmem_flush
struct nvgpu_pmu_pg -> pg_buf
SGTable corresponding to VPR secure buffer.

Added appropriate free calls.

Bug 3364181

Change-Id: I2105c1f3256b1910f0f514d98f0ee3ae2e34aff7
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586244
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-02 15:43:07 -07:00
Sagar Kamble
79fb97100d gpu: nvgpu: implement GET_BUFFER_INFO ioctl
Userspace applications will need to query buffer information such as
size, comptags allocation status, user associated metadata etc. for
enabling newer IPC mechanisms. Add support for this new ioctl.

Bug 200586313

Change-Id: I87607eb306afa0cce1bec7a1fb2925ec3bc33e50
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480763
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-02 11:42:13 -07:00
Sagar Kamble
ed16377983 gpu: nvgpu: allocate comptags and store metadata in REGISTER_BUFFER ioctl
To enable userspace query about comptags allocation status of a buffer,
comptags are to be allocated only during buffer registration done by
nvrm_gpu. Earlier, they were allocated during map.

nvrm_gpu will be sending metadata blob to be associated with the buffer.
This will have to be stored in the dmabuf privdata for all the buffers
registered by nvrm_gpu.

This patch moves the privdata allocation to buffer registration ioctl.

Remove g->mm.priv_lock as it is not needed now. This lock was added
to protect dmabuf private data setup. That private data is now
handled through dmabuf->ops and setup of dmabuf->ops is done
under dmabuf->lock.

To support legacy userspace, this patch still allocates comptags on
demand on map calls for unregistered buffers.

Bug 200586313

Change-Id: I88b2ca04c733dd02a84bcbf05060bddc00147790
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480761
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-02 11:42:08 -07:00
Jon Hunter
8a4b72a4aa gpu: nvgpu: Fix crash when reading CE_APP debugfs
The CE_APP debugfs nodes are created when the NVGPU driver is probed,
however, the 'ce_app' structure which contains the variables exposed
via the debugfs, is not allocated until nvgpu_finalize_poweron() is
called. Therefore, if the user attempts to access the CE_APP debugfs
nodes before the NVGPU has been powered on, for example, right after
Linux has booted, then this results in a NULL pointer dereference crash.
Fix this by moving the creation of the CE_APP debugfs nodes to
nvgpu_finalize_poweron_linux() which is called after
nvgpu_finalize_poweron().

Bug 200747304

Change-Id: Icd28952112f86887a1d6b6f8beb382f5189461a9
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572106
(cherry picked from commit 35a0c18d93e97265611c3bbfae41b39d9cd183e3)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587367
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-02 07:23:53 -07:00
Jon Hunter
d1b34e50e2 gpu: nvgpu: Fix build for Linux v5.14-rc2
Upstream Linux kernel commits b7eb335e26a9 ("Makefile: Enable
-Wimplicit-fallthrough for Clang") and d936eb238744 ('Revert "Makefile:
Enable -Wimplicit-fallthrough for Clang"') have the net effect of
updating the compiler flag -Wimplicit-fallthrough from
-Wimplicit-fallthrough= to -Wimplicit-fallthrough=5. This causes the
following build error to be seen ...

 nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.c:1042:15: error:
 	this statement may fall through [-Werror=implicit-fallthrough=]
 1042 |    step_count = (freq_step_size_mhz == 0U) ? 0U :
      |    ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1043 |      (u8)(p1xmaster->super.freq_max_mhz -
      |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1044 |       *pfreqmaxlastmhz - 1U) /
      |       ~~~~~~~~~~~~~~~~~~~~~~~~
 1045 |      freq_step_size_mhz;
      |      ~~~~~~~~~~~~~~~~~~
 nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.c:1048:3: note: here
 1048 |   case CTRL_CLK_PROG_1X_SOURCE_ONE_SOURCE:
      |   ^~~~
 cc1: all warnings being treated as errors
 scripts/Makefile.build:271: recipe for target
 	'nvgpu/drivers/gpu/nvgpu/common/pmu/clk/clk_prog.o' failed

Per commit d936eb238744 ('Revert "Makefile: Enable -Wimplicit-fallthrough
for Clang"'), by setting -Wimplicit-fallthrough=5 [0], the explicit
'fall-through' comments in the code are not recognised by the compiler
and cause the above error to be seen. This could be fixed by simply
replacing the 'fall-through' comment with the 'fallthrough;' statement.
However, this requires newer versions of GCC that support it. The
simplest way to fix this error is by ensuring that
-Wimplicit-fallthrough=3 for NVGPU so that fallthrough comments are
recognised by the compiler. Note that we still need to check that GCC
supports this option because older versions do not. It should be noted
that -Wimplicit-fallthrough=3 is the default set by -Wextra. See the
GCC warnings options document [0] for more details.

Bug 3340525

Link: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html [0]
Change-Id: Ia56e4343143185460a37f8a7b0dd229f005acbb9
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2567440
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582509
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rohit Khanna <rokhanna@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Tested-by: Rohit Khanna <rokhanna@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-01 18:49:59 -07:00
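Note on d1b34e50e2: the two ways of marking an intentional fall-through that the commit message contrasts can be shown side by side. Per the GCC documentation, level 3 of -Wimplicit-fallthrough accepts recognised comments, while level 5 accepts only the attribute. The function names and the `FALLTHROUGH` macro below are made up for this sketch (the kernel's own spelling is the `fallthrough;` pseudo-keyword mentioned in the commit).

```c
/* Use the attribute where the compiler supports it, else a no-op. */
#if defined(__has_attribute)
#if __has_attribute(fallthrough)
#define FALLTHROUGH __attribute__((fallthrough))
#endif
#endif
#ifndef FALLTHROUGH
#define FALLTHROUGH do { } while (0)
#endif

/* Marked with a comment: accepted at level 3, rejected at level 5. */
int steps_with_comment(int source)
{
	int steps = 0;

	switch (source) {
	case 2:
		steps += 2;
		/* fall through */
	case 1:
		steps += 1;
		break;
	default:
		break;
	}
	return steps;
}

/* Marked with the attribute: accepted at every warning level. */
int steps_with_attribute(int source)
{
	int steps = 0;

	switch (source) {
	case 2:
		steps += 2;
		FALLTHROUGH;
	case 1:
		steps += 1;
		break;
	default:
		break;
	}
	return steps;
}
```

Both functions behave identically at run time; only the diagnostics differ, which is why pinning the warning level for the module is enough to keep comment-annotated code building.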
Sagar Kamble
7410784b0b gpu: nvgpu: fix clk_arb completion file private data access race
clk_arb completion file descriptor can get closed immediately after
poll finishes in the work item gp10b_clk_arb_run_arbiter_cb. In
that case, the refcount for nvgpu_clk_dev can become zero in
the work item and can lead to invalid access while removing
nvgpu_clk_dev from the lists.

Remove nvgpu_clk_dev from the list before dropping the reference to
it.

Also, delete the nvgpu_clk_dev in completion file release handler
within the session and requests spinlocks to avoid race with
gp10b_clk_arb_run_arbiter_cb using it.

bug 200757277

Change-Id: I054eee547f2a6fa633d7ef55df216ec36647a826
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2569522
(cherry picked from commit ce8548ec05)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2587070
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 09:50:11 -07:00
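Note on 7410784b0b: the ordering rule in that commit (remove the object from the list before dropping the reference to it) can be illustrated with a minimal refcounted list node. The types and helpers below are hypothetical, and the spinlocks that the real fix relies on are omitted.

```c
#include <stdlib.h>

/* Hypothetical refcounted object that also sits on a singly linked list. */
struct clk_dev {
	struct clk_dev *next;
	int refcount;
};

int clk_dev_freed; /* counts frees, only for demonstration */

/* Drop one reference; frees the object when the count reaches zero. */
void clk_dev_put(struct clk_dev *dev)
{
	if (--dev->refcount == 0) {
		clk_dev_freed++;
		free(dev);
	}
}

/* Unlink dev from the list if it is present. */
void clk_dev_list_remove(struct clk_dev **head, struct clk_dev *dev)
{
	struct clk_dev **pp = head;

	while (*pp != NULL) {
		if (*pp == dev) {
			*pp = dev->next;
			dev->next = NULL;
			return;
		}
		pp = &(*pp)->next;
	}
}

/*
 * Correct order: unlink while our reference still keeps dev alive,
 * then drop the reference. dev must not be touched after the put.
 */
void clk_dev_complete(struct clk_dev **head, struct clk_dev *dev)
{
	clk_dev_list_remove(head, dev);
	clk_dev_put(dev);
}
```

Reversing the two calls in `clk_dev_complete()` would walk the list through memory that may already have been freed.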
ajesh
7155ae865c gpu: nvgpu: update queue unit tests
Update queue unit tests for code coverage.

JIRA NVGPU-6904

Change-Id: I49ed6980f2d610cf8359c375a1236e8866ea6795
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555333
(cherry picked from commit f2311f2710cab83b82ed7f5d51c54fa897051686)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560216
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 05:57:54 -07:00
ajesh
3c70d56ddb gpu: nvgpu: update posix thread unit tests
Update the unit tests for posix thread unit to increase
coverage.

JIRA NVGPU-6904

Change-Id: Ib103de1ee37fb4986aa36900772b78b990ccb02a
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555772
(cherry picked from commit cd45d1cd2d095c77d738fdf7746fd258bc58353b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560213
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 05:57:49 -07:00
Debarshi Dutta
6fc27766ed gpu: nvgpu: fix issues due to a previous patch
608decf gpu: nvgpu: add support for powering off gpu
The above commit accidentally removed nvgpu_quiesce from the
nvgpu_pci_remove path. Add it back.

Bug 3365659

Change-Id: I287972c426738a950ace2907610e02b774ab1eff
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586240
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 01:37:17 -07:00
deepak goyal
77d1e765f5 gpu: nvgpu: ga10b: Fix logic for BROM pass status
Current code assumes riscv brom passed if it does not time out.
This patch explicitly checks for brom pass/fail or timeout.
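
A minimal sketch of the three-way check described above, not the driver's
actual code. The status enum, the read callback and the return codes are
hypothetical names chosen for illustration; the real register encoding is
not given in this message.

```c
#include <stdint.h>

/* Hypothetical BROM states; the real hardware encoding is an assumption. */
enum brom_result {
    BROM_RESULT_RUNNING,
    BROM_RESULT_PASS,
    BROM_RESULT_FAIL
};

/* Hypothetical callback that samples the current BROM state. */
typedef enum brom_result (*brom_read_fn)(void *ctx);

/*
 * Poll up to max_polls times.
 * Returns 0 on explicit pass, -1 on explicit fail, -2 on timeout.
 * Merely surviving the poll window is never treated as a pass.
 */
static int wait_for_brom(brom_read_fn read_status, void *ctx,
                         uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        enum brom_result r = read_status(ctx);

        if (r == BROM_RESULT_PASS) {
            return 0;
        }
        if (r == BROM_RESULT_FAIL) {
            return -1;
        }
    }
    return -2;
}

/* Test double: reports RUNNING for running_polls reads, then final. */
struct fake_brom {
    uint32_t running_polls;
    enum brom_result final;
};

static enum brom_result fake_brom_read(void *ctx)
{
    struct fake_brom *f = ctx;

    if (f->running_polls > 0) {
        f->running_polls--;
        return BROM_RESULT_RUNNING;
    }
    return f->final;
}
```

The point of the sketch is that pass, fail and timeout each get their own
return value, so a caller cannot mistake "did not time out" for "passed".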

Bug 3361416

Change-Id: I399a6cf9d32be92b24990532f81892642513ba54
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2585786
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-31 08:54:35 -07:00
Seshendra Gadagottu
d255c64f50 gpu: nvgpu: ga10x: update pdiv_duration for thermal
To keep pdiv_duration at 15usec between steps at 102MHz
utilsclk, update stepping duration value from 0xBF4 to
0x5FA for ga10x.
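
The arithmetic behind the new value can be checked with a small helper.
This is a sketch that assumes the stepping duration field counts utilsclk
cycles, which this message does not state explicitly; cycles_to_ns is a
hypothetical name.

```c
#include <stdint.h>

/*
 * Convert a duration in clock cycles to nanoseconds for a clock given
 * in kHz. Assumes the duration field is a plain cycle count.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t clk_khz)
{
    return (cycles * 1000000ULL) / clk_khz;
}
```

Under that assumption, 0x5FA (1530 cycles) at 102 MHz is 15000 ns, that is
15 usec, while the old 0xBF4 (3060 cycles) at the same clock would be
30 usec.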

Bug 200757274

Change-Id: I333a5b0b35307402a734a7eafc4ab13d20316cd1
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584539
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-30 19:35:54 -07:00
Ramesh Mylavarapu
88293ee42d gpu: nvgpu: read temperature from therm_i2cs_sensor_00_r
Currently, reading the temperature value depends on therm pstate
board objects. In the absence of pstate, reading the temperature
from therm get status fails, which causes a GVS
failure in the NvRmGpuTest_Device_GetTemperature test.
This change adds support for reading the temperature from the
therm sensor_00 register, with the following
limitations:
 - NV_THERM_I2CS_SENSOR_00 doesn't support fractional
   precision.
 - It doesn't support negative temperatures.

BUG-200736830

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I25e577dac9029fcd787a6f71957dbeefd6fe43dd
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584269
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-28 06:56:24 -07:00
Ramesh Mylavarapu
a96c04d097 gpu: nvgpu: disable pstate support for tu104
Disable pstates on TU104, which is no longer POR.

BUG-200736830

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I36a0d5fac5d1294802e5150dcebd5dcb54ad5f2e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584268
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-28 06:56:19 -07:00
Seshendra Gadagottu
135e056e9e gpu: nvgpu: ga10b: set can_slcg/blcg/elcg to true
Add capability to enable/disable clock gating power
features by setting can_xxcg capabilities to
true. The cg features are disabled on tot and will be
enabled once verification is done.

Jira NVGPU-7033
Bug 200766930

Change-Id: I2d2aa25b7c84f3c4de0b12fd6d845a8f792bfd2d
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584540
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-27 20:45:58 -07:00
Antony Clince Alex
bb5bffe571 gpu: nvgpu: enhance CE error reporting documentation
Update documentation for function nvgpu_report_ce_err to include
fine-grained implementation details. In addition, remove redundant
descriptions from error reporting functions.

Jira NVGPU-6948

Change-Id: Ie1675b0260809bfbc6fdeab6748c48347b5f3d7d
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554573
(cherry picked from commit a5f84edde5943358549534b8f736ee931a28c1ad)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555909
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-27 04:17:15 -07:00
Deepak Nibade
3c97f3b932 gpu: nvgpu: disallow binding more channels than MAX channels supported per TSG
There is a HW-specific limit on the number of channel entries that can be
added for each TSG entry in the runlist. Right now SW does nothing
to enforce this limit, so if a user binds more than the supported number
of channels to the same TSG, invalid TSG formation error interrupts are
generated.

Fix this by adding appropriate checks in the steps below:

- Add new field ch_count to struct nvgpu_tsg to keep track of
  channels bound to TSG.
- Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve
  HW specific maximum channel count per TSG.
- Implement the HAL for gk20a and gv11b chips, and assign new HALs for
  all chips appropriately.
- Increment ch_count while binding the channel to TSG and decrement it
  while unbinding.
- While binding a channel to a TSG, check if the current channel count is
  already equal to max channel count. If yes, print an error and bail
  out.
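
The bookkeeping in the steps above can be sketched as follows. This is an
illustration only: struct tsg_sketch and both functions are hypothetical,
the max_channels field stands in for the value the real code retrieves
through gops.runlist.get_max_channels_per_tsg(), and locking is omitted.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct nvgpu_tsg. */
struct tsg_sketch {
    uint32_t ch_count;     /* channels currently bound to this TSG */
    uint32_t max_channels; /* HW-specific limit, normally from the HAL */
};

/* Refuse the bind once the TSG already holds the maximum channel count. */
static int tsg_bind_channel(struct tsg_sketch *tsg)
{
    if (tsg->ch_count >= tsg->max_channels) {
        return -EINVAL;
    }
    tsg->ch_count++;
    return 0;
}

/* Drop the count on unbind; guard against underflow. */
static void tsg_unbind_channel(struct tsg_sketch *tsg)
{
    if (tsg->ch_count > 0) {
        tsg->ch_count--;
    }
}
```

The sketch compares with >= rather than == as a defensive choice; with
correct increment and decrement the two are equivalent.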

Bug 200763991

Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-25 09:47:47 -07:00
Debarshi Dutta
608decf1e6 gpu: nvgpu: add support for powering off gpu
Add support for powering off the IGPU when switching between
legacy and SMC mode (in either direction) or changing the SMC
configuration. The power off can be issued as follows:

echo 0 > /dev/nvgpu/igpu0/power

The following steps are done during a poweroff.
1) Deterministic channel idle
2) Acquire write_lock on l->busy semaphore.
3) Wait till power_usage decrements to indicate 0 active jobs.
4) Invoke pm_runtime_put_sync_suspend()
5) Invoke nvgpu_gr_remove_support() to clear existing GR memory.
6) Release write_lock on l->busy
7) Deterministic channel unidle.

Part of the sequence matches that of the gk20a_do_idle code.
The common parts are extracted into new functions
gk20a_block_new_jobs_and_idle() and gk20a_unblock_jobs()

For the joint-rail case, the current implementation does a railgate
and then sets pm_runtime_set_autosuspend_delay(-1) to disable
regular runtime resume/suspend.

Remove clearing of NVGPU_SUPPORT_MIG status during state change
as it leads to inconsistencies.

Jira NVGPU-6920

Change-Id: I0b3eb3278176122ac061c1e8a94ebfb3c17c3925
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578501
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:50 -07:00
Debarshi Dutta
2e3c3aada6 gpu: nvgpu: fix deinit of GR
Existing implementation of GR de-init doesn't account for multiple
instances of struct nvgpu_gr. As a fix, below changes are added.

1) nvgpu_gr_free is unified for VGPU as well as native.
2) All the GR instances are freed.
3) Appropriate NULL checks are added when freeing GR memories.
4) 2D, 3D, I2M, ZBC, etc. are explicitly disabled when MIG is set.
5) In ioctl_ctrl, checks are added to not return error when zbc is NULL
   for VGPU as requests are rerouted to RMserver.
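
Points 2) and 3) can be sketched as below. The structs and the function
are hypothetical and use plain malloc/free rather than the driver's own
allocators; the real struct nvgpu_gr has many more members.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-instance state; only two members for illustration. */
struct gr_instance_sketch {
    void *zbc;    /* may legitimately be NULL, e.g. when MIG is set */
    void *config;
};

struct gr_sketch {
    struct gr_instance_sketch *instances;
    uint32_t num_instances;
};

/* Free every GR instance, tolerating memory that was never allocated. */
static void gr_free_all(struct gr_sketch *gr)
{
    if (gr == NULL || gr->instances == NULL) {
        return;
    }

    for (uint32_t i = 0; i < gr->num_instances; i++) {
        struct gr_instance_sketch *inst = &gr->instances[i];

        free(inst->zbc);    /* free(NULL) is a no-op in standard C */
        inst->zbc = NULL;
        free(inst->config);
        inst->config = NULL;
    }

    free(gr->instances);
    gr->instances = NULL;
    gr->num_instances = 0;
}
```

Clearing the pointers after freeing makes a second call harmless, which
matters when the same teardown path serves both VGPU and native.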

Jira NVGPU-6920

Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:45 -07:00
Mahantesh Kumbar
b9696ee643 gpu: nvgpu: ga10b: update NVRISCV LSPMU
- Set NVRISCV LSPMU app version to 0.
- Setting app version to 0 helps to load and boot
  multiple LSPMU ucodes without modifying the
  NVGPU driver.
- Add support for PMU NVRISCV prod and dbg bins.
- This is the corresponding change to the LSPMU MPSK CL
  https://git-master.nvidia.com/r/c/tegra/kernel-firmware-t18x/+/2576049

JIRA NVGPU-7061

Change-Id: I800953ca97af3badde1983aa99e09b4fe7453203
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2575341
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-22 11:05:03 -07:00
Seshendra Gadagottu
a743596697 gpu: nvgpu: ga10b: handle floor-swept gpc clock gracefully
If a GPC is floor-swept, then gpcclk enable for that GPC
returns an error. For GPU boot, ignore this error and continue
enabling the other clocks. A more robust mechanism, with a floor-sweeping
check before enabling clocks, will be added in follow-up patches.
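
A minimal sketch of the "ignore and continue" behavior, assuming a
per-GPC enable callback. All names here are hypothetical, and the fake
backend models floor-sweeping with a bitmask (so gpc_count must not
exceed 32 in this sketch).

```c
#include <stdint.h>

/* Hypothetical per-GPC clock enable hook; returns 0 on success. */
typedef int (*gpc_clk_enable_fn)(uint32_t gpc_index, void *ctx);

/*
 * Try to enable the clock of every GPC. A failure for one GPC is
 * counted but does not stop the loop, so boot can proceed.
 * Returns the number of GPCs whose clock enable failed.
 */
static uint32_t enable_gpc_clocks(gpc_clk_enable_fn enable, void *ctx,
                                  uint32_t gpc_count)
{
    uint32_t failed = 0;

    for (uint32_t i = 0; i < gpc_count; i++) {
        if (enable(i, ctx) != 0) {
            failed++;
        }
    }
    return failed;
}

/* Test double: GPCs whose bit is set in floor_swept_mask fail. */
struct fake_clk {
    uint32_t floor_swept_mask;
    uint32_t enabled_mask;
};

static int fake_gpc_clk_enable(uint32_t gpc_index, void *ctx)
{
    struct fake_clk *c = ctx;

    if ((c->floor_swept_mask & (1U << gpc_index)) != 0U) {
        return -1;
    }
    c->enabled_mask |= 1U << gpc_index;
    return 0;
}
```

Returning the failure count instead of dropping it keeps the information
available for logging, without turning it into a boot failure.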

Bug 3362403

Change-Id: I0b64c94918a1c00086a146408e6c4913788249ec
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579569
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-20 14:56:30 -07:00
Seshendra Gadagottu
8ed1487860 gpu: nvgpu: ga10b: Enable clock arb support
Enable clock arbitration support for silicon.

Bug 200764879

Change-Id: I40d47f7f15197a8dd55ca0866e177fd42b8c4e9d
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579556
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-19 14:07:47 -07:00
Richard Zhao
d8e847c90d gpu: nvgpu: vgpu: fix force preemption from debugfs
Check whether force_preemption_gfxp or force_preemption_cilp is
set in debugfs when allocating obj_ctx.

Jira GVSCI-4658

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I87fc7e195c9b0f7ed29ec6c37c8f46b456625fea
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2579218
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-19 14:06:44 -07:00
Richard Zhao
eaa508b2e6 gpu: nvgpu: vgpu: set .set_long_timeslice for ga10b
Jira GVSCI-4658

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I99c2f43504b68b8616b4327edcd1389b29912900
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578287
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-18 14:34:45 -07:00
Vedashree Vidwans
e13ab1f9ea gpu: nvgpu: pmu: remove hw access from remove_pmu_support
GPU HW registers are locked before remove_pmu_support.
Remove functions accessing HW registers.

Bug 3357477

Change-Id: I34a1923bfdb3afacd462f2646e2821569573a81a
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2577627
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-08-17 09:45:42 -07:00