Commit Graph

8549 Commits

Author SHA1 Message Date
Lakshmanan M
ee2aaef308 gpu: nvgpu: Report non zero num_sub_partition_per_fbpa value only for dGPU
Tegra iGPUs have no real FBPA/FBSP units,
so num_sub_partition_per_fbpa should be 0 for iGPUs.

JIRA NVGPU-5656

Change-Id: I30050caf8f9f6b5185404a64dbbbe02f67046093
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2545978
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-16 15:06:30 -07:00
Shashank Singh
9bd91499e3 gpu: nvgpu: fix findings in common.nvgpu from DVR
Fix nvgpu_get_litter_value() doxygen output.

Jira NVGPU-6597

Change-Id: I67ad29d9b9e880695a450fd030ba110bd739cd9b
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2544113
(cherry picked from commit c6b2826700b7435671e31b96d998921680cc9d9c)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2545314
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 13:27:07 -07:00
dt
12a0e3fe61 gpu: nvgpu: Add support to print mig config lists
Add support for showing the available MIG configs when MIG
is disabled for nvgpu-next.

JIRA NVGPU-6721

Change-Id: I8ba742b7850902c1eea4728655c75d795e0bb3a2
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543472
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 13:25:46 -07:00
Divya Singhatwaria
4874bdfbac gpu: nvgpu: Address DVR issues for common.power_features
Fix the common.power_features DVR issues found as
part of 5.2 SWUD Lite units design verification.
1. Add note about various *CG features.
2. nvgpu_cg_init_gr_load_gating_prod description fixed.

JIRA NVGPU-6610

Change-Id: Id28eaa9d15a5481d28a5fd2cc407c82734a6c165
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541739
(cherry picked from commit d19e95407748689a26ae5b5920e6fb50f4399d1f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542078
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 09:08:21 -07:00
Sagar Kamble
e0e337fb83 gpu: nvgpu: set nvgpu power state to POWERED_OFF on poweron fail
When force closing the app, poweron needed in channel close path will
fail as pg_task kthread creation fails with -EINTR (process is
SIGKILL'd so threads don't get created).

Upon poweron failure, device nodes are removed and the nvgpu power
state is not reset to NVGPU_STATE_POWERED_OFF. Hence on further
gk20a_busy attempts, poweron is not attempted and gpu remains
unusable from thereon.

Change the state to POWERED_OFF from POWERING_ON on poweron fail.

Bug 3308828

Change-Id: I2360f11a4937dfe93eb7933b30c13748fb570898
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543797
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 04:58:28 -07:00
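The state rollback described in e0e337fb83 above can be illustrated with a small model. This is a minimal sketch, not the driver's code: the `sketch_` names, the reduced state enum, and the `fail_next_poweron` test hook are all hypothetical, and the early return for a stale POWERING_ON state is only an assumed stand-in for why later gk20a_busy attempts skipped poweron.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the driver's power states. */
enum sketch_power_state {
	SKETCH_POWERED_OFF,
	SKETCH_POWERING_ON,
	SKETCH_POWERED_ON
};

struct sketch_gpu {
	enum sketch_power_state state;
	bool fail_next_poweron; /* test hook: simulate thread creation failing */
};

/* Hypothetical bring-up step; returns 0 on success, negative on error. */
static int sketch_do_poweron(struct sketch_gpu *g)
{
	return g->fail_next_poweron ? -4 /* stands in for -EINTR */ : 0;
}

/* Entry point in the spirit of gk20a_busy: power on if needed. */
static int sketch_busy(struct sketch_gpu *g)
{
	int err;

	if (g->state == SKETCH_POWERED_ON)
		return 0;
	if (g->state == SKETCH_POWERING_ON)
		return -1; /* a stale POWERING_ON would block every retry */

	g->state = SKETCH_POWERING_ON;
	err = sketch_do_poweron(g);
	if (err != 0) {
		/* Roll back so that the next caller can attempt poweron again. */
		g->state = SKETCH_POWERED_OFF;
		return err;
	}
	g->state = SKETCH_POWERED_ON;
	return 0;
}
```

Without the rollback, the model stays in SKETCH_POWERING_ON after one failure and every later call returns early.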
Debarshi Dutta
8f9ac1dea9 gpu: nvgpu: split away power node removal
Presently, gk20a_user_deinit is used to remove all device nodes
including "power" node as well.

Split removal of power node into a separate function
gk20a_power_node_deinit to enable other device removal during the
normal runtime_suspend path to facilitate the fast path for MIG
reconfiguration. The power node can be removed only during a call to
rmmod. This also enables separately powering off the device nodes
in the unlikely case of a poweron failure.

Bug 3308828
Jira NVGPU-6920

Change-Id: Ib045a09a992a63c468492a837b273cca41e20f15
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543014
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-15 04:57:57 -07:00
Divya Singhatwaria
a1d0957a9b gpu: nvgpu: Update GP10B FW version
Updated PMU ucode taken from P4 CL#30066529 for t18x igpu.
The ucode resolves the ELPG_DISALLOW_ACK timeout failure

P4 CL link for this PMU ucode changes:
https://p4sw-swarm.nvidia.com/changes/30066529

Bug 200588696

Change-Id: Ic45c37c75924c581d6ef91ffd754da287d63f4c6
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2544140
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-14 12:58:06 -07:00
Debarshi Dutta
45a1489409 gpu: nvgpu: enable compiling out DGPU specific flag in Hal.Bus unit
read_sw_scratch and write_sw_scratch, which belong to the gops_bus
struct, are moved under the CONFIG_NVGPU_DGPU compiler flag as these are
currently called by DGPU bios specific routines.

Jira NVGPU-6402

Change-Id: I5ff22e6d9ad323b0c209f2b4458b8ee3a4a62226
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542959
(cherry picked from commit 71da44a5dbe3d969d6551dc366813208faf4ed05
in rel-33)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2544003
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-14 05:34:24 -07:00
Seshendra Gadagottu
ae243fa1eb gpu: nvgpu: set l3_alloc hint based on L3 errata
If the errata flag for "L3 SCF cache not supported" is set, then
force the l3_alloc hint to false, so that no L3 memory traffic
is generated by the nvgpu driver.

Bug 3186312
Bug 3288192

Change-Id: Icf776673c2975fdc04cc02bfae28ef26c8deba4d
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2539599
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-12 07:24:09 -07:00
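The hint override described in ae243fa1eb above amounts to a small guard. The following is a minimal sketch under assumed names: `sketch_errata` and `sketch_effective_l3_alloc` are hypothetical and do not reflect how the driver stores or queries errata.

```c
#include <stdbool.h>

/* Hypothetical errata record; the driver's real representation differs. */
struct sketch_errata {
	bool l3_scf_not_supported;
};

/*
 * Return the l3_alloc hint that should actually be used: the caller's
 * request is honoured unless the errata flag forces it off.
 */
static bool sketch_effective_l3_alloc(const struct sketch_errata *e,
				      bool requested)
{
	if (e->l3_scf_not_supported)
		return false;
	return requested;
}
```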
Lakshmanan M
4a3a9d46e3 gpu: nvgpu: Use gr_instance specific api to query the num of sm
Replaced get_no_of_sm() with gr_instance specific api
nvgpu_gr_config_get_no_of_sm()

JIRA NVGPU-5656

Change-Id: I01b786402dde857e7cc30d5370429d02ebe3f428
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543245
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-11 18:05:07 -07:00
Deepak Nibade
c4dee40c49 gpu: nvgpu: fix MISRA violations in common.gr
1. misra_c_2012_rule_8_6_violation: "gm20b_gr_init_fe_go_idle_timeout"
   is declared but never defined.

Fix by adding config CONFIG_NVGPU_HAL_NON_FUSA for header declaration
of "gm20b_gr_init_fe_go_idle_timeout"

2. misra_c_2012_rule_5_7_violation: Identifier "ops" is already used
   to represent a type.

Fix by renaming local variable ops to nonstall_ops in
gm20b_gr_intr_nonstall_isr()

3. missing_default: No default case found for the switch statement
   "switch (offset << 2)"

Fix by adding break and default statements to switch case in
gv11b_gr_intr_handle_sw_method()

Jira NVGPU-6779

Change-Id: I8df097ec66479edcd2e81bf46bab5b5db52ac8c8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541246
(cherry picked from commit c4d9fe0449f8c6ee209051abfe58c6f3a745808d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543012
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-11 18:04:33 -07:00
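The third fix in c4dee40c49 above (adding break and default statements to a switch on `offset << 2`) follows a common pattern, sketched below. The method offsets and the function name are hypothetical placeholders; the real case labels come from the hardware headers and are not shown in the commit message.

```c
/* Hypothetical method offsets, for illustration only. */
#define SKETCH_METHOD_A 0x1000U
#define SKETCH_METHOD_B 0x1004U

/* Returns 1 if the software method was recognised, 0 otherwise. */
static int sketch_handle_sw_method(unsigned int offset)
{
	int handled = 0;

	switch (offset << 2) {
	case SKETCH_METHOD_A:
		handled = 1;
		break;
	case SKETCH_METHOD_B:
		handled = 1;
		break;
	default:
		/* Unknown method: leave handled at 0. */
		break;
	}
	return handled;
}
```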
Lakshmanan M
5394175d5b gpu: nvgpu: Move get_num_hwpm_perfmon() after golden context creation
Querying num_perfmon requires the golden context to be ready. Accessing
the golden context might require a gr_instance_id specific to a GR engine.
On TOT, get_num_hwpm_perfmon() is called from the perfmon HAL, which might
require calling nvgpu_gr_exec_with_err_for_instance().
That function internally calls nvgpu_grmgr_config_gr_remap_window() to
point the gr_window_remap register at the current gr_instance_id for MIG.
This approach indirectly mandates calling
nvgpu_gr_exec_with_err_for_instance(), which can be
completely avoided. get_num_hwpm_perfmon() is just a query call,
so it can be moved after the golden context creation.
This avoids unnecessary invocation of
nvgpu_gr_exec_with_err_for_instance() during perfmon specific
HAL accesses.

1) Moved get_num_hwpm_perfmon() after golden context creation.
2) Added nvgpu_assert() if (g->num_sys_perfmon == 0U).

JIRA NVGPU-5656

Change-Id: I59a6ab4df93763adbc0765fa5e4d1712b2477521
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542438
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-10 19:53:17 -07:00
Konsta Hölttä
4b3591aafb gpu: nvgpu: avoid faulty elpg protection
Don't store the return value of elpg re-enable if disable fails; this
could make the local status value zero again, causing the elpg-protected
call to be executed with elpg still enabled and elpg re-enabled twice.

Commit c905858565 ("gpu: nvgpu: add cg and pg function") introduced
this bug; failure of re-enabling after a failed disable might be another
problem (and it's not clear why this is done in the first place) which
isn't propagated to the caller, but that would belong to another patch.

Bug 200565050

Change-Id: I7cf7a0887ae59e85bf0c56c38aaaadfefd16cc1c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541859
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-10 19:51:13 -07:00
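The status-overwrite bug described in 4b3591aafb above can be reproduced in a small model. This is a sketch under assumptions: the `sketch_` functions and counters are hypothetical, and the buggy variant is reconstructed only from the commit message, not from the driver source.

```c
#include <stdbool.h>

/* Hypothetical model that counts calls so both variants can be compared. */
struct sketch_pg {
	bool disable_fails;
	int enable_calls;
	int protected_calls;
};

static int sketch_elpg_disable(struct sketch_pg *p)
{
	return p->disable_fails ? -1 : 0;
}

static int sketch_elpg_enable(struct sketch_pg *p)
{
	p->enable_calls++;
	return 0;
}

static void sketch_protected_op(struct sketch_pg *p)
{
	p->protected_calls++;
}

/* Buggy shape: the re-enable result overwrites the disable error. */
static int sketch_run_buggy(struct sketch_pg *p)
{
	int status = sketch_elpg_disable(p);

	if (status != 0)
		status = sketch_elpg_enable(p); /* status is zero again */
	if (status == 0) {
		sketch_protected_op(p); /* runs with ELPG still enabled */
		status = sketch_elpg_enable(p); /* second re-enable */
	}
	return status;
}

/* Fixed shape: the re-enable result is dropped and the error is kept. */
static int sketch_run_fixed(struct sketch_pg *p)
{
	int status = sketch_elpg_disable(p);

	if (status != 0) {
		(void)sketch_elpg_enable(p);
		return status;
	}
	sketch_protected_op(p);
	return sketch_elpg_enable(p);
}
```

With a failing disable, the buggy variant runs the protected call once and re-enables twice, while the fixed variant skips the protected call and returns the original error.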
Lakshmanan M
7d473f4dcc gpu: nvgpu: Expose logical mask for MIG
1) Expose logical mask instead of physical mask when MIG is enabled.
   For legacy, NvGpu exposes physical mask.
2) Added fb related info in struct nvgpu_gpu_instance().
3) Added utility api to get the logical id for a given local id
   nvgpu_grmgr_get_gr_gpc_logical_id()
4) Added grmgr api to get max_gpc_count
   nvgpu_grmgr_get_max_gpc_count().
5) Added grmgr's fbp api to get num_fbps and its enable masks.
   nvgpu_grmgr_get_num_fbps()
   nvgpu_grmgr_get_fbp_en_mask()
   nvgpu_grmgr_get_fbp_rop_l2_en_mask()
6) Used grmgr's fbp apis in ioctl_ctrl.c
7) Moved fbp_init_support() in nvgpu_early_init()
8) Added nvgpu_assert handling in grmgr.c
9) Added vgpu hal for get_max_gpc_count().

JIRA NVGPU-5656

Change-Id: I90ac2ad99be608001e7d5d754f6242ad26c70cdb
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538508
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-10 03:05:21 -07:00
Richard Zhao
e2d8bdc38d gpu: nvgpu: unify nvgpu_get_gpfifo_entry_size
Moved nvgpu_get_gpfifo_entry_size implementation to common code.

Jira GVSCI-10880

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia6ccee5e26836662f7c2196ff41658ff41e3a570
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541575
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-09 19:27:25 -07:00
Deepak Nibade
67399a1892 gpu: nvgpu: unit: BVEC test for common.class unit
class_validate_setup is already testing for valid/invalid boundary
values for common.class APIs. Append the valid/invalid list with BVEC
test values.

Fix obsolete gops_class doxygen documentation.

Jira NVGPU-6403

Change-Id: Id713db614919842324f6d655b36dd57043958919
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2539797
(cherry picked from commit 6aed159f9f3eeea553a442af37e3bcc840152154)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2539795
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-09 14:06:07 -07:00
Richard Zhao
9ac7550f35 gpu: nvgpu: unify NV_READ_ONCE and NV_WRITE_ONCE
Implemented NV_READ_ONCE and NV_WRITE_ONCE in common code.

Jira GVSCI-10879

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I5465b4bd1cd44fc7bc1592da01d6be455b1fcdcc
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541559
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-09 03:15:39 -07:00
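The commit message for 9ac7550f35 above does not show how NV_READ_ONCE and NV_WRITE_ONCE are implemented. As a rough illustration only, macros of this kind are often built on volatile-qualified accesses, in the style of the Linux kernel's READ_ONCE and WRITE_ONCE. The `SKETCH_` names below are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
/*
 * Sketch macros (hypothetical names). The volatile-qualified access asks
 * the compiler to perform the load or store as written instead of
 * caching, merging, or repeating it.
 */
#define SKETCH_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Publish a new value, then read it back through the same macros. */
static unsigned int sketch_publish_and_read(unsigned int *slot,
					    unsigned int value)
{
	SKETCH_WRITE_ONCE(*slot, value);
	return SKETCH_READ_ONCE(*slot);
}
```

These macros do not provide memory ordering between different variables; that would need separate barriers or atomics.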
Seshendra Gadagottu
5ec1e0cc21 gpu: nvgpu: make gp10b_tegra_acquire_platform_clocks public
Made gp10b_tegra_acquire_platform_clocks a public function
so that each gpu architecture can supply a clock list with a
different number of clocks.

Jira NVGPU-6707

Change-Id: Iad2156a63e00913374ce5fa4274c95e7488fdb31
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2511795
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sivaram Nair <sivaramn@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-08 21:54:23 -07:00
Deepak Nibade
6fb2b892ce gpu: nvgpu: set/check mmu nack flags only for GPC exceptions
gv11b_mm_mmu_fault_handle_mmu_fault_refch() right now checks/sets
mmu_nack_handled flag for MMU faults from all clients (i.e. GPC/HUB).

Handling of MMU nack in MMU fault handling path is only needed if MMU
nack exception is triggered by SM in GPC. Hence set and check this flag
only if source client is GPC.

In certain cases the CE engine can trigger back-to-back MMU faults
on the same channel. When this happens, the incorrect mmu_nack_handled
handling described above sets the mmu_nack_handled flag while handling
the second MMU fault from CE.

As a result, gv11b_mm_mmu_fault_handle_mmu_fault_refch() could end up
dropping extra channel refcounts and trigger use-after-free scenarios
on that channel.

Bug 3315942

Change-Id: I28d8311edf34a041364dddedb5fc3a5b83132f85
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2540497
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-08 06:48:56 -07:00
Richard Zhao
1685a2404f gpu: nvgpu: vgpu: add b0cc profiler support
- added new commands to bind/unbind hwpm/hwpm_streamout/smpc
- added new command to update get/put for PMA buffer
- tune function nvgpu_perfbuf_update_get_put so it could be reused on
server side.
- enable profiler v2 device for gv11b

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I4226c89ec3040e53dee5381ac8a30c9fd598e5ef
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537683
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:30:03 -07:00
Richard Zhao
a3c4236574 gpu: nvgpu: profiler: create bind/unbind hals
- created gops_profiler
- added HALs for bind/unbind hwpm/hwpm_streamout/smpc
- it helps enable b0cc on vgpu

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I9fd30b134d54a92d1ce8108172aa77237c702bc0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537682
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:57 -07:00
Richard Zhao
4ea92a530b gpu: nvgpu: profiler: remove profiler obj from hwpm bind/unbind
It helps the hwpm bind/unbind functions to be reused on server side.
Server side does not track profiler object.

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ib692c686e940b8123c934b5bb6ba843e09a27246
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537681
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:52 -07:00
Richard Zhao
7664bee12f gpu: nvgpu: profiler: remove profiler obj from smpc bind/unbind
It helps the smpc bind/unbind functions to be reused on server side.
Server side does not track profiler object.

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I5e62901cabb56cb2f2d40d51a249b1404b292f5a
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537680
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:46 -07:00
Richard Zhao
9b66fca165 gpu: nvgpu: move .exec_regops to only execute regops
HAL .exec_regops used to first validate regops then execute it, now
moving it to only execute the regops.

- It helps B0CC on HV. On server side it does not track profiler object,
but regops validation uses the profiler, so moving validation to client
side.
- The change also removes ctx_buffer_offset checking in
validate_reg_op_offset. The offset is already checked against whitelists,
which have been verified when the whitelist was updated. Also vgpu does
not have information of ctx and golden image.
- Added function nvgpu_regops_exec to cover both regops validation and
execution.

Jira GVSCI-10351

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I434e027290e263a8a64a25a55500f7294038c9c4
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534252
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-08 01:29:40 -07:00
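The split described in 9b66fca165 above, where the HAL hook only executes and a wrapper covers validation plus execution, can be sketched as follows. All `sketch_` names, the whitelist check, and the stand-in executor are hypothetical; only the overall shape follows the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified register operation. */
struct sketch_regop {
	uint32_t offset;
	uint32_t value;
};

/* HAL hook that only executes; it performs no validation. */
struct sketch_regops_hal {
	int (*exec_regops)(struct sketch_regop *ops, size_t num_ops);
};

/* Client-side validation: every offset must appear in the whitelist. */
static int sketch_regops_validate(const struct sketch_regop *ops,
				  size_t num_ops, const uint32_t *whitelist,
				  size_t num_allowed)
{
	for (size_t i = 0; i < num_ops; i++) {
		int found = 0;

		for (size_t j = 0; j < num_allowed; j++) {
			if (ops[i].offset == whitelist[j]) {
				found = 1;
				break;
			}
		}
		if (!found)
			return -1;
	}
	return 0;
}

/* Wrapper in the spirit of nvgpu_regops_exec: validate, then execute. */
static int sketch_regops_exec(const struct sketch_regops_hal *hal,
			      struct sketch_regop *ops, size_t num_ops,
			      const uint32_t *whitelist, size_t num_allowed)
{
	int err = sketch_regops_validate(ops, num_ops, whitelist, num_allowed);

	if (err != 0)
		return err;
	return hal->exec_regops(ops, num_ops);
}

/* Stand-in executor: "reads" each register by echoing its offset. */
static int sketch_fake_exec(struct sketch_regop *ops, size_t num_ops)
{
	for (size_t i = 0; i < num_ops; i++)
		ops[i].value = ops[i].offset;
	return 0;
}
```

Because the hook carries no validation state, a server-side implementation can reuse it without tracking the client's profiler object.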
Lakshmanan M
08cd42093d gpu: nvgpu: Add multi gr l2_evict support
1) Added l2_evict support for multi gr
2) Added multi gr handling for the following apis,
   nvgpu_gr_get_cilp_preempt_pending_chid
   nvgpu_gr_clear_cilp_preempt_pending_chid

JIRA NVGPU-5656

Change-Id: Iee6142a49b9a569f2b440077762164af8aee9fb3
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2539734
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-07 13:46:40 -07:00
Lakshmanan M
df87591b7d gpu: nvgpu: Add multi gr handling for debugger and profiler
1) Added multi gr handling for dbg_ioctl apis.
2) Added nvgpu_assert() in gr_instances.h (for legacy mode).
3) Added multi gr handling for prof_ioctl apis.
4) Added multi gr handling for profiler.
5) Added multi gr handling for ctxsw enable/disable apis.
6) Updated update_hwpm_ctxsw_mode() HAL for multi gr handling.

JIRA NVGPU-5656

Change-Id: I3024d5e6d39bba7a1ae54c5e88c061ce9133e710
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538761
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-04 18:07:47 -07:00
Sagar Kamble
1dd3e0761c gpu: nvgpu: fix the usermode mappings deadlock during railgate and munmap
Following locking sequence leads to deadlock:

1. gk20a_pm_prepare_poweroff (alter_usermode_mappings):
   ctrl_privs_lock -> mmap_lock
2. __do_munmap (usermode_vma_close):
   mmap_lock -> ctrl_privs_lock

This deadlock can be resolved by releasing the ctrl_privs_lock so that
munmap can proceed, then retrying the usermode mapping alteration after
a while.

Below is the kernel panic log with deadlock.

[] INFO: task kworker/1:1:116 blocked for more than 120 seconds.
[]       Tainted: G        W         5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:kworker/1:1     state:D stack:    0 pid:  116 ppid:     2 flags:0x00000028
[] Workqueue: pm pm_runtime_work
[] Call trace:
[]  __switch_to+0x104/0x160
[]  __schedule+0x3d4/0x900
[]  schedule+0x74/0x100
[]  rwsem_down_write_slowpath+0x250/0x4b0
[]  down_write+0x6c/0x80
[]  alter_usermode_mappings+0xb4/0x160 [nvgpu]
[]  nvgpu_hide_usermode_for_poweroff+0x24/0x30 [nvgpu]
[]  gk20a_pm_prepare_poweroff+0xe8/0x140 [nvgpu]
[]  gk20a_pm_runtime_suspend+0x78/0xf0 [nvgpu]
[]  pm_generic_runtime_suspend+0x3c/0x60
[]  genpd_runtime_suspend+0xb0/0x2c0
[]  __rpm_callback+0x90/0x150
[]  rpm_callback+0x34/0xa0
[]  rpm_suspend+0xe0/0x5e0
[]  pm_runtime_work+0xbc/0xc0
[]  process_one_work+0x1c0/0x4a0
[]  worker_thread+0x11c/0x430
[]  kthread+0x148/0x170
[]  ret_from_fork+0x10/0x18

[] INFO: task nvrm_gpu_tests:1273 blocked for more than 121 seconds.
[]       Tainted: G        W         5.10.17-tegra #1
[] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[] task:nvrm_gpu_tests  state:D stack:    0 pid: 1273 ppid:  1245 flags:0x00000000
[] Call trace:
[]  __switch_to+0x104/0x160
[]  __schedule+0x3d4/0x900
[]  schedule+0x74/0x100
[]  schedule_preempt_disabled+0x28/0x40
[]  __mutex_lock.isra.0+0x184/0x5c0
[]  __mutex_lock_slowpath+0x24/0x30
[]  mutex_lock+0x5c/0x70
[]  usermode_vma_close+0x30/0x50 [nvgpu]
[]  remove_vma+0x34/0x60
[]  __do_munmap+0x1f4/0x4a0
[]  __vm_munmap+0x74/0xd0
[]  __arm64_sys_munmap+0x3c/0x50
[]  el0_svc_common.constprop.0+0x7c/0x1a0
[]  do_el0_svc+0x34/0xa0
[]  el0_svc+0x1c/0x30
[]  el0_sync_handler+0xa8/0xb0
[]  el0_sync+0x160/0x180
[] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---

Bug 200703921

Change-Id: Ie7f017c92f20061d3bf891079f7fc7fe390f7cf7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2533853
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-04 18:06:11 -07:00
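The back-off-and-retry approach described in 1dd3e0761c above can be sketched with try-locks. This is an illustration under assumptions, not the driver's code: the `sketch_` names are hypothetical, and a C11 `atomic_flag` stands in for the kernel mutex and mmap lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal try-lock standing in for a kernel mutex or rwsem (assumption). */
struct sketch_lock {
	atomic_flag held;
};

static bool sketch_trylock(struct sketch_lock *l)
{
	return !atomic_flag_test_and_set(&l->held);
}

static void sketch_unlock(struct sketch_lock *l)
{
	atomic_flag_clear(&l->held);
}

/*
 * Poweroff-side path: takes ctrl_privs_lock, then needs mmap_lock.
 * Returns 0 on success, or -1 if the caller should retry later.
 */
static int sketch_alter_mappings_once(struct sketch_lock *ctrl_privs_lock,
				      struct sketch_lock *mmap_lock,
				      int *altered)
{
	if (!sketch_trylock(ctrl_privs_lock))
		return -1;
	if (!sketch_trylock(mmap_lock)) {
		/* Back off so the munmap side can take ctrl_privs_lock. */
		sketch_unlock(ctrl_privs_lock);
		return -1;
	}
	(*altered)++;
	sketch_unlock(mmap_lock);
	sketch_unlock(ctrl_privs_lock);
	return 0;
}
```

The key property is that the poweroff path never waits for the second lock while still holding the first, so the opposite lock order in the munmap path can always make progress.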
Deepak Nibade
419a65965b gpu: nvgpu: add mutex for gr_ctx initialization
If user calls IOCTL to allocate object context for two channels in same
TSG in parallel, nvgpu_gr_setup_alloc_obj_ctx() could end up racing and
trying to allocate object context for both channels at the same time.
This could result in corrupting object context.

Fix this by introducing per-TSG mutex ctx_init_lock to serialize context
initialization for all channels within TSG.

In the ideal scenario nvrm_gpu is the only caller of all the IOCTLs, and
nvrm_gpu makes sure to initialize object context for each channel in
serial order, so the new lock does not cause any contention.

Jira NVGPU-6431

Change-Id: Ibb1cbb4878748929bb7f23e8666c283c39ecbf5a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538333
(cherry picked from commit 8be447838dc1ecbd5637eb6bd13b8f338eaf33cd)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538773
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-03 15:59:43 -07:00
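The per-TSG serialization described in 419a65965b above can be sketched with a user-space mutex. The `sketch_` names and the `ctx_initialized` flag are hypothetical, and the sketch assumes the context is shared per TSG and needs initializing only once; `pthread_mutex_t` stands in for the driver's own mutex type.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical TSG model with one shared context. */
struct sketch_tsg {
	pthread_mutex_t ctx_init_lock;
	bool ctx_initialized;
	int init_count; /* counts how often initialization actually ran */
};

/*
 * Called once per channel in the TSG. The lock makes the check and the
 * initialization one atomic step, so two channels cannot both initialize.
 */
static void sketch_alloc_obj_ctx(struct sketch_tsg *tsg)
{
	pthread_mutex_lock(&tsg->ctx_init_lock);
	if (!tsg->ctx_initialized) {
		tsg->init_count++;
		tsg->ctx_initialized = true;
	}
	pthread_mutex_unlock(&tsg->ctx_init_lock);
}
```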
dt
5e82717c96 gpu: nvgpu: Add powernode support to vgpu
The normal gpu is powered on by writing one to the
power node, so this patch adds a power node for vgpu.

Change-Id: I08fbbe8694e02c826a0d5692f5a4c0f4efd396ff
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537053
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-02 19:40:39 -07:00
Sagar Kamble
ac30c4cb65 gpu: nvgpu: change acr bootstrap completion info message
Following information message was printed unconditionally. Often, it
is not useful.

nvgpu_acr_wait_for_completion:100  [INFO]  flcn-0: sctl reg 7021 cpuctl reg 50

It is okay to move this to nvgpu_acr_dbg.

Bug 200734207

Change-Id: Ie66caf20d0e2eb692532e26bf89417342a054cf8
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2536471
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-02 19:40:26 -07:00
Lakshmanan M
edbcf5cfc6 gpu: nvgpu: add multi gr handling for debugger
Added multi gr handling for debugger apis.
Replaced g->gr with nvgpu_gr_get_cur_instance_ptr(g).

JIRA NVGPU-5656

Change-Id: I010eff39b1ebec231b4dbdd53caffc25e1cd54c4
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537784
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-01 12:50:48 -07:00
Debarshi Dutta
11d27743f8 gpu: nvgpu: add NULL checks before freeing ZBC and ZCULL
Disabling NVGPU_SUPPORT_MIG in suspend path leads to inconsistencies.
During driver removal without the flag set, the driver still tries
to free structures that might not have been allocated in the first place.
e.g. nvgpu_gr_zbc_deinit, nvgpu_gr_zcull_deinit.

Added NULL checks for ZBC and ZCULL structures before freeing them as a
solution.
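
A minimal sketch of the guarded-deinit pattern described above. The types and the function gr_zbc_deinit_sketch are hypothetical stand-ins, not the actual nvgpu_gr_zbc_deinit or nvgpu_gr_zcull_deinit code, which is not shown in this log.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a ZBC/ZCULL state object. */
struct zbc_state {
    int *table;
};

/* Hypothetical stand-in for the owning GR structure. */
struct gr_state {
    struct zbc_state *zbc; /* NULL if init never allocated it */
};

/* Returns 1 if something was freed, 0 if there was nothing to free. */
static int gr_zbc_deinit_sketch(struct gr_state *gr)
{
    if (gr == NULL || gr->zbc == NULL) {
        return 0; /* never allocated: skip instead of dereferencing NULL */
    }
    free(gr->zbc->table);
    free(gr->zbc);
    gr->zbc = NULL; /* makes a repeated deinit harmless */
    return 1;
}
```

Clearing the pointer after the free means a second call on the same structure is also a no-op.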

Jira NVGPU-6832

Change-Id: I8a0c64ca982d11fee55542abd3c5bce5a51b4007
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535101
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-01 07:58:57 -07:00
Tejal Kudav
9f43914933 gpu: nvgpu: Move Intr handling common code to CIC
CIC (Central Interrupt controller) will be responsible for the
interrupt handling. common.cic unit is the placeholder for all
interrupt related code. Move interrupt related defines and
Public APIs present in common.mc to common.cic.
Note: The common.mc interrupts related struct definitions are
not moved as part of this patch.

Adapt the code to use interrupt handling related defines and public
APIs migrated from common.mc to common.cic

JIRA NVGPU-6899

Change-Id: I747e2b556c0dd66d58d74ee5bb36768b9370d276
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535618
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-31 19:37:31 -07:00
Deepak Nibade
9034b1676e gpu: nvgpu: compile out GFxP support in safety
GFxP preemption for graphics contexts is not supported in safety.
But the support was enabled along with CONFIG_NVGPU_GRAPHICS since GFxP
preemption was protected under same config.

Add a separate config CONFIG_NVGPU_GFXP to protect all GFxP specific
code, enum values, and HALs.

Disable the config in safety profile.

Jira NVGPU-6893

Change-Id: Iebb5f754a1025dfa6e05a94704bdb8a7123b599a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534986
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-28 15:17:36 -07:00
dt
c1b302652e gpu: nvgpu: Add fix for dev_node leak
Fix a dev_node leak that occurs when user_deinit is called.
The dev_nodes in Linux are created in two phases. In the first
phase, the power dev_nodes (one for legacy and one for v2)
are created. In the second phase, the other dev_nodes are created.
While creating the dev_nodes, the power cdev_region was overwritten
by cdev_region. This is fixed by introducing a new cdev_region and
updating the respective nodes.

JIRA NVGPU-6721

Change-Id: Iec78db8e5fe40cc0b14fb3fecc35b8881dff716f
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535265
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-28 11:39:58 -07:00
Antony Clince Alex
5c80999ec3 gpu: nvgpu: gm20b: update priv ring init sequence
Update priv ring init sequence to poll and validate
enumerate command completion. With this approach it is
no longer required to configure the chiplets to hold off
priv transactions when the ring has not been initialized.
Hence, the write to pri_ringstation_sys_decode_config_r
register is removed.
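
The poll-and-validate idea can be sketched as a bounded polling loop. Everything named here (CMD_BUSY_MASK, read_reg_fn, poll_cmd_done, fake_read) is an assumption for illustration; the real gm20b register layout and accessors are not shown in this log, and a real driver would also delay between reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: non-zero while the command is still in flight. */
#define CMD_BUSY_MASK 0x1u

/* Hypothetical register accessor supplied by the caller. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Poll until the command field reads idle, or give up after max_polls reads. */
static bool poll_cmd_done(read_reg_fn read_cmd, void *ctx, unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if ((read_cmd(ctx) & CMD_BUSY_MASK) == 0u) {
            return true; /* command completed */
        }
    }
    return false; /* caller should treat this as an init failure */
}

/* Fake register for illustration: busy for *ctx more reads, then idle. */
static uint32_t fake_read(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining > 0u) {
        (*remaining)--;
        return CMD_BUSY_MASK;
    }
    return 0u;
}
```

Returning a status lets the init sequence fail explicitly if completion is never observed.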

Bug 3307879

Change-Id: I3f9ede95dea2814f279955884621fd4c028d722f
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527924
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-28 11:36:46 -07:00
Sami Kiminki
5f6ff29aea gpu: nvgpu: report number of syncpoints in nvgpu_as_get_sync_ro_map_arg
Add reporting for the number of syncpoints when mapping the RO
shim. This allows the userspace to perform boundary condition checks
when computing the GPU VA for a syncpoint.
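
A rough sketch of the kind of boundary check userspace could perform with the reported count. The parameter names and the fixed per-syncpoint stride are assumptions; the actual fields of nvgpu_as_get_sync_ro_map_arg are not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the GPU VA of syncpoint `id` inside the read-only shim mapping.
 * Assumes syncpoints are laid out at a fixed stride from base_gpu_va.
 * Returns false when `id` is outside the reported number of syncpoints.
 */
static bool syncpt_gpu_va(uint64_t base_gpu_va, uint32_t stride,
                          uint32_t num_syncpoints, uint32_t id,
                          uint64_t *va)
{
    if (id >= num_syncpoints) {
        return false; /* out of range: do not compute an address */
    }
    *va = base_gpu_va + (uint64_t)id * stride;
    return true;
}
```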

JIRA GCSS-1579

Change-Id: Ia6c9eee917d2c1e08f9905701e03f2b09e01ba60
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2533981
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-27 21:19:38 -07:00
Martin Radev
8834275906 gpu: nvgpu: Validate PMA buffer size
The original code would only truncate the size to 32
bits and later write the value to a hw register. Let's
check that the user-provided buffer is large enough.
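
A minimal sketch of validating a 64-bit size before it is narrowed for a 32-bit register, instead of silently truncating it. The function name and the min_size parameter are assumptions; the actual check added by this patch is not shown in this log.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Validate a user-provided 64-bit size before writing it to a 32-bit
 * register field. Rejects sizes that are too small or that would not
 * survive the narrowing conversion.
 */
static int validate_buf_size(uint64_t user_size, uint64_t min_size,
                             uint32_t *hw_size)
{
    if (user_size < min_size) {
        return -EINVAL; /* buffer is not large enough */
    }
    if (user_size > UINT32_MAX) {
        return -EINVAL; /* would be truncated to 32 bits */
    }
    *hw_size = (uint32_t)user_size;
    return 0;
}
```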

Bug 2510974

Change-Id: I8b14a07a46d30c0b8c7ea63e5bdef53fbd19ec6f
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527148
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:30:35 -07:00
Martin Radev
04ce9faf04 gpu: nvgpu: Minor fixes in ioctl handling
Fixes:
1) gk20a_sched_dev_ioctl allocates a buffer with size
*CTXSW_IOCTL_MAX_ARG_SIZE* but then sanitizes IOC_SIZE
against *SCHED_IOCTL_MAX_ARG_SIZE*. No big deal here
since both are of size 0x20 but may lead to issues in
the future.
2) nvgpu_clk_arb_ioctl_event_dev would BUG_ON if IOC_SIZE
is larger than expected. Let's instead sanitize and return
error.
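
Both fixes amount to checking the ioctl argument size against the same limit that sizes the buffer, and returning an error rather than crashing. A userspace sketch of that pattern follows; SKETCH_IOCTL_MAX_ARG_SIZE, struct arg_buf and fetch_ioctl_arg are hypothetical names, and memcpy stands in for the kernel's copy from user memory.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit; the log mentions 0x20 for both real constants. */
#define SKETCH_IOCTL_MAX_ARG_SIZE 0x20u

struct arg_buf {
    unsigned char data[SKETCH_IOCTL_MAX_ARG_SIZE];
};

/* Copy an ioctl argument only if it fits the buffer it is copied into. */
static int fetch_ioctl_arg(struct arg_buf *dst, const void *src, size_t ioc_size)
{
    if (dst == NULL || src == NULL || ioc_size > sizeof(dst->data)) {
        return -EINVAL; /* sanitize and report instead of asserting */
    }
    memset(dst->data, 0, sizeof(dst->data));
    memcpy(dst->data, src, ioc_size);
    return 0;
}
```

Deriving the limit from sizeof(dst->data) keeps the check and the allocation from drifting apart.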

Jira VFND-1586
Jira VQRM-3741

Change-Id: I9e00796a2b2f4a83c3a04194c34eb4c006b937d3
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2525753
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:30:30 -07:00
Tejal Kudav
e0a1fcf5f5 gpu: nvgpu: Add Central Intr Controller unit
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.

This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.

New APIs are exposed by CIC unit to access its internal data like:
  1. Struct err_desc - the static err handling /injection data per
                       error id
  2. Num_hw_modules  - the number of error reporting HW units
                       supported by CIC

Init and deinit of CIC unit:
  1. CIC unit should be initialized early on during boot so that it
     is available for any interrupt handling.
  2. Initialize CIC just before the interrupts are enabled during
     boot.
  3. Similarly, CIC is disabled late during the deinit cycle, right
     after the interrupts are masked.

LUT:
  1. LUT is currently used only for reporting error to safety
     services in gv11b QNX safety build.
  2. This error handling policy LUT currently has only two levels
     of handling: correctable and quiesce.
  3. Once the error handling policy decision is moved from leaf
     unit nodes to CIC, LUT will be updated to have additional levels
     like fast recovery and full recovery.
  4. Also, then a separate LUT will be added for each platform/build.
  5. In current framework, the LUT is set to NULL for all
     configurations except gv11b.

A report_err() op is added to report errors to safety services.
This op is only effective for the gv11b QNX build and is set to NULL for
other configurations.

NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754

Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-25 14:28:04 -07:00
Tejal Kudav
bced5c5785 gpu: nvgpu: Add CIC specific debug logging API
Add gpu_dbg_cic bit to log_mask to enable/disable Central Interrupt
Controller debug logs.
Define CIC specific debug print API with "CIC |" prefix to help
grep CIC related logs.
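
A small sketch of a mask-gated, prefixed debug print. DBG_CIC_BIT, log_mask and cic_dbg are hypothetical names; the real gpu_dbg_cic bit position and the nvgpu logging API are not shown in this log. The message is written to a caller buffer here so the behavior is easy to check.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit position for the CIC debug category. */
#define DBG_CIC_BIT (1ull << 0)

static uint64_t log_mask;

/*
 * Format a CIC debug message with a fixed "CIC | " prefix.
 * Returns the message length, or 0 when CIC logging is disabled.
 */
static int cic_dbg(char *out, size_t len, const char *fmt, ...)
{
    va_list ap;
    int prefix_len;
    int body_len;

    if ((log_mask & DBG_CIC_BIT) == 0u || len == 0u) {
        return 0;
    }
    prefix_len = snprintf(out, len, "CIC | ");
    if (prefix_len < 0 || (size_t)prefix_len >= len) {
        return prefix_len; /* error, or no room left for the body */
    }
    va_start(ap, fmt);
    body_len = vsnprintf(out + prefix_len, len - (size_t)prefix_len, fmt, ap);
    va_end(ap);
    if (body_len < 0) {
        return body_len;
    }
    return prefix_len + body_len;
}
```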

NVGPU-6521

Change-Id: I86deee761ad9125001cd48d94b43bb2979174d42
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518692
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-25 14:27:58 -07:00
Prateek sethi
84534a050f gpu:nvgpu: Update doxygen range for io APIs
Patch updates the access range to 0 to SIZE-4.

Jira NVGPU-6229

Change-Id: I98606e1310c45e4b7343f739524bd77674080c3a
Signed-off-by: Prateek sethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521643
(cherry picked from commit b01a8689c470c67d32855981b115edba7954f451)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2530175
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-20 06:09:51 -07:00
mkumbar
f3c2c4e730 gpu: nvgpu: Update the FALCON/NVRISCV defines
Update the FALCON/NVRISCV defines

Bug 200728965

Change-Id: I2b45c216cc274e097d6bc99831b934eb29840dc9
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2531635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-20 06:09:41 -07:00
Shashank Singh
57089a1b34 gpu: nvgpu: address comments from common.rc code inspection CR review
- Move unnecessary headers under recovery flag.
- Update doxygen documentation of one API to match the code.

Jira NVGPU-6372

Change-Id: I9cf744c8014ea92f18cc10824e9fcaed9aa7d5de
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527118
(cherry picked from commit cb4b03a3b00321a4c07b3d9cc2768f7183e99c45)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2531583
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-19 07:45:35 -07:00
Seshendra Gadagottu
85efe929ca gpu: nvgpu: prod programming for slcg timer unit
Added init function for common.ptimer unit and called
this init function during nvgpu early init.
int nvgpu_ptimer_init(struct gk20a *g);

Added following helper function for programming
prod values for slcg timer unit:
void nvgpu_cg_slcg_timer_load_enable(struct gk20a *g);

Invoked prod programming for slcg timer unit from
nvgpu_ptimer_init.

Jira NVGPU-6026

Change-Id: I29e32380a4d05ec8276d7ebe59bc2733917f8184
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524037
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-19 04:06:43 -07:00
ajesh
b15bd97c08 gpu: nvgpu: fix misra violation in bug unit
Modify the callback interface from the bug unit to the quiesce unit to remove
a possible cyclic dependency in the bug unit. Make the list of
callbacks from the bug unit UT-specific. The quiesce callback function
and argument are kept in separate variables, and in a normal run the
only callback that the bug unit invokes is the quiesce-specific
function. These changes fix the violation of Rule 17.2 in the bug unit.

JIRA NVGPU-6537

Change-Id: Icb6bc92077f8d26c87425768b09a7194a98e015d
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527207
(cherry picked from commit 7696565648c5dd573a03be19ba9525856b781ea6)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2530900
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-18 18:20:18 -07:00
Martin Radev
d1983f5cfa gpu: nvgpu: Decrement CSS dmabuf ref cnt before ret
The function gk20a_channel_cycle_stats does not decrement the
dmabuf refcnt if vmapping it fails. This patch fixes it by
decrementing the ref cnt before returning.
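
The fix follows the usual rule that an error path must release whatever the function acquired. A simplified sketch with a plain counter; struct buf, buf_get, buf_put, buf_map and attach_buf are hypothetical stand-ins, not the dma-buf API or the actual gk20a_channel_cycle_stats code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted buffer; map_ok simulates mapping success. */
struct buf {
    int refcnt;
    bool map_ok;
};

static void buf_get(struct buf *b) { b->refcnt++; }
static void buf_put(struct buf *b) { b->refcnt--; }

/* Stand-in for a mapping call that can fail. */
static void *buf_map(struct buf *b)
{
    return b->map_ok ? (void *)b : NULL;
}

/* Take a reference and map the buffer; drop the reference if mapping fails. */
static int attach_buf(struct buf *b, void **out)
{
    void *va;

    buf_get(b);
    va = buf_map(b);
    if (va == NULL) {
        buf_put(b); /* balance the get before returning the error */
        return -ENOMEM;
    }
    *out = va;
    return 0;
}
```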

NVGPU-397
NVGPU-415

Change-Id: Iae01ada710adb04fd4e4ba0371eccec5f8765254
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527190
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-18 18:18:25 -07:00
mkumbar
d2349b32ec gpu: nvgpu: update SSMD array size
- Update SSMD array size to hold all supported super-surface
  members.
- Handle the error and report it if an invalid SSMD ID is found.

Issue: The SSMD array size is currently set to 32, but 33
super-surface members are supported overall. Accessing the 33rd
member crashed the system due to an out-of-bounds access.
Fix this by setting the SSMD array size to the actual
number of super-surface members supported.
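
A minimal sketch of the two parts of this fix: size the table from the member count, and reject IDs outside it. SS_MEMBER_COUNT, struct ssmd, ssmd_table and ssmd_lookup are hypothetical names; only the count of 33 comes from the text above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Number of super-surface members stated in the commit message. */
#define SS_MEMBER_COUNT 33u

/* Hypothetical descriptor layout, for illustration only. */
struct ssmd {
    uint32_t offset;
    uint32_t size;
};

/* The table is sized from the member count, not a hard-coded 32. */
static struct ssmd ssmd_table[SS_MEMBER_COUNT];

/* Look up a descriptor, reporting an error for an invalid ID. */
static int ssmd_lookup(uint32_t id, const struct ssmd **out)
{
    if (id >= SS_MEMBER_COUNT) {
        return -EINVAL; /* invalid SSMD ID: report instead of overflowing */
    }
    *out = &ssmd_table[id];
    return 0;
}
```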

Bug 200721968
Bug 200721966

Change-Id: I5ba1084a661d7497056f13a053d2fc79d50f595c
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2528569
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-17 12:56:39 -07:00
Vedashree Vidwans
2514814851 gpu: nvgpu: common.ce fix MISRA 5.7 errors
Rule 5.7 doesn't allow an identifier to be reused. This patch renames
identifier "ops" to resolve this violation.

Jira NVGPU-6281

Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Change-Id: I02da8db6406ccc44b7d8c3037dfd2b062250878f
(cherry-picked from commit 659e54c96d5c8db8ab2f76cd110a11f1e2270c36)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527279
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-17 12:56:19 -07:00
Lakshmanan M
ede8215ca8 gpu: nvgpu: Add NVGPU_SUPPORT_ROP_IN_GPC flag
Added new flag to enable/disable the NVGPU_SUPPORT_ROP_IN_GPC

JIRA NVGPU-5656

Change-Id: Icbcb63a879c4ae4de0701742319eb02e98f66ca6
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2529121
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-14 21:00:44 -07:00