Commit Graph

358 Commits

Author SHA1 Message Date
Deepak Nibade
1ff79b1d2c gpu: nvgpu: remove support for quad reg_op
quad type reg_ops were only needed on Kepler, and not for any other chip
beginning Maxweel.

HAL g->ops.gr.access_smpc_reg() was incorrectly set for Volta and Turing
whereas it was only applicable to Kepler. Delete it.

There is no register in the quad type whitelist since the type itself is
not supported anymore. Remove the empty whitelists for all chips and
also delete below HALs:
g->ops.regops.get_qctl_whitelist()
g->ops.regops.get_qctl_whitelist_count()

hal/regops/regops_gv100.* files are not used anymore. Delete the files
instead of just deleting quad HALs in these files.

Bug 200628391

Change-Id: I4dcc04bef5c24eb4d63d913f492a8c00543163a2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366035
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Rajesh Devaraj
b8c6ad3f5f gpu: nvgpu: remove service IDs
This patch removes the reporting of _ECC_CORRECTED errors which are
not applicable to GV11B. Specifically, this patch removes the code
related to the  reporting of the following service IDs:

NVGUARD_SERVICE_IGPU_SM_SWERR_LRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_CBU_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_PMU_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GPCCS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_FECS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GCC_SWERR_L15_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_FA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_L2TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PTE_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PDE0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_MISS_FIFO_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_S2R_PIXPRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_TSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_RSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_DSTG_BE_ECC_CORRECTED

Bug 200616002

Change-Id: I199c396f9f6a6be007bd6d3c556199b5a73c3c91
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349587
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
fb1433811c gpu: nvgpu: modify gr.falcon.dump_stats
- Add gm20b_gr_falcon_gpccs_dump_stats() to print gpccs context switch
mailbox register values for all gpcs.
- Make gm20b_gr_falcon_fecs_dump_stats() a static function
- Add gm20b_gr_falcon_dump_stats() to trigger
gm20b_gr_falcon_fecs_dump_stats() and gm20b_gr_falcon_gpccs_dump_stats()
- Update legacy chips gr.falcon.dump_stats() to
gm20b_gr_falcon_dump_stats().

JIRA NVGPU-5597

Change-Id: I992c6432f3c2e3049bacc953f9b53ff6c4aa2f36
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357470
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
f6298157bc gpu: nvgpu: update HAL function name
get_ecc_override_val is renamed to gp10b_gr_get_ecc_override_val to
follow the naming convention of the HAL functions correctly.

Change-Id: I80e495e8bec3483651e325b792708382e5327de0
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357925
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
3a11bd69e7 Revert "gpu: nvgpu: modify nvgpu_writel check and loop"
This reverts commit c100ac23455d450a7046c62915014111a0aa2e70.

Bug 3009270

Change-Id: I1db1acac63c841b5383d75ec674fdc2160a0c84d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2356076
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2ad015f7a5 gpu: nvgpu: modify nvgpu_writel check and loop
Currently, nvgpu_writel_loop() writes to a register and immediately
checks if register value is updated. It might take some time for
hardware registers to get updated with value written by software.
Modify nvgpu_writel_loop() to accept number of retries to check if
register value is updated and assert with nvgpu_assert().
Also, move nvgpu_writel_loop() to common code and use generic
nvgpu_readl() and nvgpu_writel() APIs.

JIRA NVGPU-5490

Change-Id: Iaaf24203a91eee3d05de7d0c7dea18113367de5f
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348628
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2d94863cae gpu: nvgpu: move is_tpc_addr and get_tpc_num to common
gr.is_tpc_addr() and gr.get_tpc_num() are chip agnostic hals. Move these
hals to common code.

Jira NVGPU-5504

Change-Id: I50fa7ac876c8667de42df1830bd412b412538508
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349272
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
5d06a59bc5 gpu: nvgpu: Cleanup uart and debugfs debug prints
The gk20a_debug_dump() function implicitly adds a newline since it
uses nvgpu_err() under the hood (for uart destined prints). For the
seq_file destined writes it does not so there is an annoying inconsistency.

Remove the newline that many of the gk20a_debug_dump() calls add and add
the newline to the (now) seq_printf() call. This reduces the length of
debug dump logs and speeds them up - UART is _very_ slow after all.

Also cleanup some formatting issues in the various debug prints I
happened to notice.

JIRA NVGPU-5541

Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
681077d578 gpu: nvgpu: volta+: convert SM broadcast to SM unicast
Starting volta, multiple SMs are supported. Ctxsw regops
require SM broadcast registers to be converted to unicast registers.

Bug 2960720
JIRA NVGPU-5502

Change-Id: Id6e87fcc993587317bcd9b6958233e39d6b41fa7
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340921
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
a039261724 gpu: nvgpu: add gr.process_context_buffer_priv_segment gops
1. Add below gr gops to process context buffer's priv segment.
int (*process_context_buffer_priv_segment)(struct gk20a *g,
					 enum ctxsw_addr_type addr_type,
					 u32 pri_addr,
					 u32 gpc_num, u32 num_tpcs,
					 u32 num_ppcs, u32 ppc_mask,
					 u32 *priv_offset);
Update all chips to use gr_gk20a_process_context_buffer_priv_segment()
as new gr hal.
2. Add and use ppc, tpc and etpc count functions to retrieve total count.

Bug 2960720
JIRA NVGPU-5502

Change-Id: I6cec36c323ff49ded853cd5cbfd9e0a28602b8ed
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340372
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
bb93223a21 gpu: nvgpu: add check_warp_esr_error hal
Set check_warp_esr_error hal pointer to
gv11b_gr_check_warp_esr_error hal function.

Jira NVGPU-4867

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: Ib014c5ff2456836af2fe89f849f37991fe52844e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331804
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
8f3a0f4486 gpu: nvgpu: add sm rams ecc enabled flag
Add sm rams ecc enabled flag.

Move ecc scrubbing timeout defines to
gr_init_gv11b.h

Jira NVGPU-4871

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: Ie43f5947c53be697d0b2fd064d308612856d823a
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328871
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vinod G
6a7bf6cdc0 gpu: nvgpu: update sm ecc_status_error handling
Use gv11b_gr_intr_handle_tpc_sm_ecc_exception
function for future chip to avoid code replication.

Add sm_ecc_status_errors hal to read
the ecc_status_errors

Jira NVGPU-5033

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: I4a25837d9b833a48307b9353b82ff6597f985e41
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325537
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
0e0b966f0c gpu: nvgpu: update gr exception hal
Make generic gr exception static functions
to public functions.

Jira NVGPU-5033

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: I9ac4cbc728edda813a487f80af622559a798b319
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324676
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Vedashree Vidwans <vvidwans@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
25edcc1353 gpu: nvgpu: cta preemption mode HAL for tu104
Use gp10b_ctxsw_prog_set_compute_preemption_mode_cta instead
of gm20b_ctxsw_prog_set_compute_preemption_mode_cta for tu104.

Jira NVGPU-4661

Change-Id: I7b85cbcc139e6843c8b7bd89e0afb6030160362f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2316206
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
sagar
b8a3e54dda gpu: nvgpu: add invalid ctxsw method
* For ctxsw negative testing, nvgpu need to send and invalid method
 * Sending 0xFFFF method will result in triggering ctxsw error intr.

JIRA NVGPU-5080

Change-Id: I6c16137d86ee2ddb25f1508161d9d6befcbcbefe
Signed-off-by: sagar <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2317502
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Thomas Fleury
f43d5df83a gpu: nvgpu: build dGPU in safety
Enable build flags for dGPU in safety, when
NVGPU_FORCE_DGPU_SAFETY_PROFILE is set.

Use libnvgpu-dgpu_safe.exports for dGPU safety build.

Add build flags for tu104 HAL initialization (to solve
undefined symbols in safety build).

Temporarily add non-fusa files needed to build dGPU in safety.
related functions will have to move to fusa files.

Jira NVGPU-4611

Change-Id: I41db0c039c7f15d9191cdb811b4906e779d5cc88
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2310276
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
vinodg
a9d8fc96a7 gpu: nvgpu: hal correction for class error handling
Add new hal function
gp10b_gr_intr_handle_class_error.

Update handle_class_error hal function for
gp10b, gv11b and tu104 to
gp10b_gr_intr_handle_class_error from
gm20b_gr_intr_handle_class_error.
gr_trapped_data_mme_pc uses 12 bits from gp10b.

Move gm20b_gr_intr_handle_class_error hal function
to non-fusa section.

Jira NVGPU-4913

Signed-off-by: vinodg <vinodg@nvidia.com>
Change-Id: Ic93013ba43d4bf409527109f2f2d43db11c4238e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2314249
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
0c23bf57ea gpu: nvgpu: build flag for secure boot
Use CONFIG_NVGPU_GR_FALCON_NON_SECURE_BOOT build flag for
gm20b_gr_falcon_fecs_host_int_enable.

Jira NVGPU-4661

Change-Id: Id7d991b81206d00e38049556b42b4e9a4abd1708
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2313620
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kamble
59c6947fc6 gpu: nvgpu: add CONFIG_NVGPU_TEGRA_FUSE
Encapsulate the tegra fuse functionality under the config flag
CONFIG_NVGPU_TEGRA_FUSE.

Bug 2834141

Change-Id: I54c9e82360e8a24008ea14eb55af80f81d325cdc
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2306432
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
00eec69b3f gpu: nvgpu: add hal to get_ctx_buffer_offsets
Currently, gr_gk20a_get_ctx_buffer_offsets is defined as a function.
However, this function is used in the common code. So, add new GR hal
to get_ctx_buffer_offsets.

Jira NVGPU-5047

Change-Id: I0cec6ff19194fa726722e6af3a2f11a188dc9087
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2310352
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
vinodg
4aff9bcd4e gpu: nvgpu: fix for load imbalance across cta subpartitions
CTA_SUBPARTITION_SKEW load balancing is broken across
subpartitions. SW WAR to disable the CTA_SUBPARTITION_SKEW.

Jira NVGPU-5132
Bug 200593339

Signed-off-by: vinodg <vinodg@nvidia.com>
Change-Id: I3faae882a94fc6262cc287df44994cc04b4fd5d6
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2308905
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
feebc746ca gpu: nvgpu: fix global register access list
For legacy chips (gm20b, gp10b and gv11b), incorrect register
offset is used for global access register list:

incorrect: 0x418300, /* gr_pri_gpcs_rasterarb_line_class  */
correct:   0x418380, /* gr_pri_gpcs_rasterarb_line_class  */

Fix this issue by updating global access register list by using
correct register offset value.

NVGPU-5108

Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Change-Id: Id6722039f8d874dbcb79732dffd727d2ff2a1a72
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2306642
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
6669cbd7de gpu: nvgpu: gv11b: fix veid bundle wait issue
For non go_idle bundles, check should be fe_idle
not gr_idle. fe_gi state will be busy until go_idle
bundle gets processed.

Bug 2804205

Change-Id: I12dd05f59d406aeac9476e0c85b6e457c6bd6bed
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2299895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Abdul Salam
17cc9b2b98 gpu: nvgpu: Refactor Clock unit.
Current clk unit has multiple header files under pmuif folder.
This has combination of public struct which is accessed outside the
unit and private struct which is accessed within clk unit.
This patch segregates them based on their accessibility.
All private items are moved into ucode_clk_inf.h from pmuif which only
clk can access.
All public items are moved into include/clk.h which other units can
access
This will help in documentation of items for public items.

NVGPU-4491

Change-Id: Iccb0571e05ecb3cb13363390bed8c7214409b543
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2292318
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
b0984cbf05 gpu: nvgpu: fix default ctxsw wdt value
MAX wdt timeout value expected by CTXSW is 0x3fff_ffff.
Currenlty nvgpu tries to program it to 0x7fff_ffff.

Fix the hard coded value as per CTXSW expectation.

Bug 200586923

Change-Id: I67e06e988d301006b0c5153c18f5af1630d633b7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2292429
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
b811a0b755 gpu: nvgpu: add hal for fecs ctsw clear mailbox
Add hal to have chip specific fecs ctxsw mailbox clear function.
This hal has following prototype with mailbox reg_index and bitmask
for clear_val:
void (*fecs_ctxsw_clear_mailbox)(struct gk20a *g,
			u32 reg_index, u32  clear_val);

JIRA NVGPU-4870

Change-Id: I1d20309224f856872dc97040ecf7628c60fb2802
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287921
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
9a16bc3fd4 gpu: nvgpu: wait ACK for FECS watchdog timeout
From gv11b onwards, FECS ucode returns an ACK for set watchdog
timeout method. Failure to wait for this ACK was leading to races,
and in some cases, the ACK could be mistaken for the reply to the
next method.

In particular, this happened for the discover golden image size
method which is sent after set watchdog timeout.

With instrumented FECS ucode, it takes longer for the code to
process the set watchdog timeout method, and the write to ack
that method could happen after nvgpu driver clears the mailbox to
send the discover image size method.

With an invalid golden context image size, FECS ended up causing
an MMU fault while attempting to save past allocated buffer.

Added NVGPU_GR_FALCON_METHOD_SET_WATCHDOG_TIMEOUT to be used with
gops_gr_falcon.ctrl_ctxsw, and implemented 2 variants:
- gm20b_gr_falcon_ctrl_ctxsw, without ACK
- gv11b_gr_falcon_ctrl_ctxsw, with ACK

Added NVGPU_GR_FALCON_SUBMIT_METHOD_F_LOCKED flag to allow
executing above method without re-acquiring FECS lock. Longer term,
the 'flags' could be added to gop_gr_falcon.ctrl_ctxsw parameters.

Use gops_gr_falcon.ctrl_ctxsw instead of register writes to invoke
set watchdog timeout method in gm20b_gr_falcon_wait_ctxsw_ready.

Also replaced calls to gm20b_gr_falcon_ctrl_ctxsw to
gops_gr.falcon.ctrl_ctxsw when appropriate, since there are
multiple variants (gm20b, gp10b and gv11b).

Last, fixed clearing of mailbox 0 in gm20b_gr_falcon_bind_instblk.

Bug 200586923

Change-Id: I653b9a216555eec8cd4bb01d6f202bc77b75a939
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287340
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
8c01512e12 gpu: nvgpu: add ops in gr.falcon
- add gr.falcon.fecs_dmemc_write ops
- add gr.falcon.fecs_imemc_write ops
- add gr.falcon.gpccs_dmemc_write ops
- add gr.falcon.gpccs_imemc_write ops

This is required for nvgpu-next.

JIRA NVGPU-4908

Change-Id: I71c59aebf21276146ee62880d6739e850ef3fc78
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2288878
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
sagar
14384edb68 nvgpu: force cilp in cta
For integration testing in safety build, we need to test ctxsw firmware
header interface. inorder to validate we populate header with
unsupported data.

Use CONFIG_NVGPU_CTXSW_FW_ERROR_HEADER_TESTING for testing CILP error.

JIRA NVGPU-4471

Change-Id: Ibcd2f9cccc7a85403d2b006ad8164e37592ccaa1
Signed-off-by: sagar <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284602
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
sagar
9d0d16cffd nvgpu: add insturmentation for ctxsw error codes
For integration testing in safety build, we need to test ctxsw firmware
error codes. inorder to test this, we need some instrumentation in
nvgpu driver.

Added below instrumentation for CTXSW FW error code integration testing.

 1. Reduce the ctxsw watchdog time, so ctxsw watchdog always triggered.
    Use CONFIG_NVGPU_CTXSW_FW_ERROR_WDT_TESTING for testing watchdog

 2. Submit unsupported method to CTXSW FW, it will trigger err.
    Use CONFIG_NVGPU_CTXSW_FW_ERROR_CODE_TESTING for testing err code.

JIRA NVGPU-4471

Change-Id: Ib45a946b3e38d3b6dd5bbee277c5d3e7c55521c0
Signed-off-by: sagar <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284048
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seema Khowala
c8050eabec gpu: nvgpu: gr: fix return value of *handle_sw_method*
Currently chip specific functions for handle_sw_method
does not return -EINVAL if class does not match as
expected. Fix it by setting default return value to
-EINVAL and updating return value to 0 for successful
class/method matches.

JIRA NVGPU-4909

Change-Id: Ifb3aa35215171ddbc64d6f1d23f8944c9fe44b2d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2285848
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
David Li
273ffae939 gpu: nvgpu: add zcull_ctx_debug and setup_debug_z_gamut_offset to access map
-3D API drivers need to write NV_PGRAPH_PRI_GPCS_ZCULL_CTX_DEBUG_ZF32_AS_Z16
  and NV_PGRAPH_PRI_GPCS_SETUP_DEBUG_Z_GAMUT_OFFSET_ZF32_AS_Z16 fields
  to get ZCULL for ZF32 depth values less than 0.25
-GM20B is already added

bug 2427703
bug 2704313

Change-Id: I07a4d7b3a1e09183c56d4c72533bc5d280bc7782
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2279674
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
340f35d76e gpu: nvgpu: remove unnecessary asserts in common.gr hal subunits
Below functions in common.gr hal subunits include unnecessary
asserts to ensure value is not truncated when parsing into U32 size.

gm20b_gr_init_commit_global_attrib_cb()
gp10b_gr_init_commit_global_bundle_cb()
gp10b_gr_init_commit_global_pagepool()
gv11b_gr_init_commit_global_attrib_cb()

Make use of nvgpu_safe_cast_u64_to_u32() and remove unnecessary
asserts

gp10b_gr_init_commit_global_bundle_cb() function checks if size <=
U32_MAX value. But since size is declared as u32, it will always be
<= U32_MAX value so there is no point in the check.
Remove unnecessary check.

Jira NVGPU-4778

Change-Id: I9562afd1b31c3c6b095f607cbdf725d33d87effb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2279898
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:10:29 -06:00
Scott Long
a54c207c37 gpu: nvgpu: hal: misra 12.1 fixes
MISRA Advisory Rule states that the precedence of operators within
expressions should be made explicit.

This change removes the Advisory Rule 12.1 violations from hal code.

Jira NVGPU-3178

Change-Id: If903544e1aa7264dc07f959a65ff666dfe89a230
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2277478
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Lakshmanan M
1c991a58af gpu: nvgpu: Add SM diversity support
To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute on
different hardware resources. This feature proposes modifications
in the software to modify the virtual SM id to TPC mapping across
the mission and redundant contexts. The virtual SM identifier to TPC
mapping is done by nvgpu when setting up the patch context.

The recommendation for the redundant setting is to offset the
assignment by one TPC, and not by one GPC. This will ensure that both
GPC and TPC diversity. The SM and Quadrant diversity will happen
naturally. For kernels with few CTAs, the diversity is guaranteed
to be 100%. In case of completely random CTA allocation,
e.g. large number of CTAs in the waiting queue, the diversity is
1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104.

Added NvGpu CFLAGS to enable/disable the SM diversity support
"CONFIG_NVGPU_SM_DIVERSITY".

This support is only enabled on gv11b and tu104 QNX non safety build.

JIRA NVGPU-4685

Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Seshendra Gadagottu
9160bd29c3 gpu: nvgpu: gm20b: update whitelist reg list
Added gm20b whitelist register access list with following
zcull registers:
gr_pri_gpcs_setup_debug_z_gamut_offset
gr_pri_gpcs_zcull_ctx_debug

Access to these registers is required by 3d API drivers to write
zcull depth values less than 0.25

Bug 2757650

Change-Id: I8eae7b027831b6c61b144898476dcb83cbe09644
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2274559
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
dnibade
ab76dc1ad5 gpu: nvgpu: unit: add coverage tests for gops.gr.init.ecc_scrub_reg
Add new unit test to cover gops.gr.init.ecc_scrub_reg HAL function

gops.gr.init.ecc_scrub_reg HAL can generate TIMEOUT errors which are
not returned to caller currently. Update this HAL to return int value
for error propagation.

Jira NVGPU-4458

Change-Id: I98f4d5af2ef17cc4301951fec4d660638c8ef72c
Signed-off-by: dnibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265456
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
8ab5e07d8f gpu: nvgpu: Update for gr config code coverage.
Replace if statement with nvgpu_assert,this checking is just to
assure following division will not cause system crash.

Jira NVGPU-4531

Change-Id: I213882b56ccfd993066c58bc3fb6c47a6fd92d4a
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2264410
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Deepak Nibade
34020a5999 gpu: nvgpu: fix issues identified by common.gr.obj_ctx negative testing
- nvgpu_gr_ctx_load_golden_ctx_image() does not return any error, change
  the return type to void
- Check for preemption modes greater than CILP in
  nvgpu_gr_ctx_check_valid_preemption_mode
- Check if received class is valid or not in
  nvgpu_gr_setup_set_preemption_mode
- Compile out entire nvgpu_gr_obj_ctx_init_ctxsw_preemption_mode since
  it is really not doing anything in safety
- Remove the switch statement in nvgpu_gr_obj_ctx_set_compute_preemption_mode
  since it is not possible to receive any other value than supported.
  Previous function calls ensure that input values are validated.
- nvgpu_gr_obj_ctx_commit_global_ctx_buffers() does not return any
  error, change the return type to void
- gops.gr.init.preemption_state HAL is not needed in safety since it
  only configures gfxp related timeout
- remove redundant call to gops.gr.init.wait_idle in
  nvgpu_gr_obj_ctx_commit_hw_state. We trigger wait despite earlier
  failure in same function call.

Jira NVGPU-4457

Change-Id: I06a474ef7cc1b16fbc3846e0cad1cda6bb2bf2af
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2260938
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
a73ca0b70e gpu: nvgpu: split GR ECC initialization
Split GR ECC initialization into GPC/TPC and FECS ECC init as FECS ECC
errors during acr_construct_execute need to be reported and handled
hence FECS ECC counters are required to be initialized before
acr_construct_execute.

GPC/TPC ECC counters are dependent on the GR config that will be
initialized only after acr_construct_execute.

nvgpu_gr_intr_init_support is moved to nvgpu_gr_prepare_sw.

FECS ECC interrupt is enabled by default hence interrupt is not
enabled through gr_fecs_host_int_enable_r in nvgpu_gr_prepare_sw.

JIRA NVGPU-4439

Change-Id: Ifc9912f0578015a6ba1e9d38765c42633632b15f
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2261987
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
e32529da57 gpu: nvgpu: compile out unused code in gr.intr with safety build
handle_tex_exception hal is not set for safety build. Add
CONFIG_NVGPU_HAL_NON_FUSA checking for that hal.

log_mme_exception hal is supported only for turing. Add
CONFIG_NVGPU_DGPU checking for that hal.

gr_intr_handle_class_error always return -EINVAL. Change the
return as void to avoid unwanted error checking.

nvgpu_gr_intr_get_channel_from_ctx function parameter curr_tsgid will
never be NULL based on the current call. Remove unwanted
(curr_tsgid != NULL) check from this function.

Jira NVGPU-4454

Change-Id: I165d1cc5f9e308dfb11d905b59151b44f63a31bb
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2259763
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
76c1a704bd gpu: nvgpu: compile out unused code with safety build for gr.falcon
Compile out unused code in gr.falcon for safety build.
By default NVGPU_SEC_SECUREGPCCS is enabled in Safety build. Add
CONFIG_NVGU_GR_FALCON_NON_SECURE_BOOT checking with non secure code.

In gm20b_gr_falcon_wait_ctxsw_ready function watchdog timer value is
calculated based on clock rate, not needed for safety code.
Add CONFIG_NVGPU_HAL_NON_FUSE checking for unused code in that function.

gm20b_gr_falcon_gr_code_less_equal is the last op status we check.
SKIP or any other status following this will be failed in checking
for valid op status. So code reaching this function mean it is for
only for GR_IS_UCODE_OP_LESSER_EQUAL. So removing the checking for 
opc_status != GR_IS_UCODE_OP_LESSER_EQUAL in this function

Jira NVGPU-4453

Change-Id: I156cac59f52779fa7f78052c1f0115d0e8f03bf9
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2258768
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Deepak Nibade
d7971e7444 gpu: nvgpu: add DGPU config for RTV circular buffer
RTV circular context buffer is only supported on TU104 dGPU as of
now. Hence compile out corresponding #define and code from safety build.

Jira NVGPU-4373

Change-Id: I46a3efc92fb247fa08efb925447c248b2a4b9a57
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2255768
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
01039aec35 gpu: nvgpu: disable unused compute sw method for safety
Add missing check with CONFIG_NVGPU_HAL_NON_FUSA in tu10x
code for NVC0C0_SET_SHADER_EXCEPTIONS.

Jira NVGPU-4454

Change-Id: Id8f0560c5061f7c017c84361059007d936dc53b5
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2257034
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
22dea4ca3b gpu: nvgpu: disable unused compute sw method in safety build.
NVC0C0_SET_SHADER_EXCEPTIONS is never used by CUDA software. Hence disable that
feature with CONFIG_NVGPU_HAL_NON_FUSA checking for safety build.

Jira NVGPU-4454

Change-Id: If71b97bf8b2a6a8f8d0c7206b8e801094b5b1b7c
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2256345
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
b0862a168c gpu: nvgpu: compile out nonsecure code in gr.falcon
For Safety code NVGPU_SEC_SECUREGPCCS bit will be set by default.

Compile out the code called for nonsecure gpccs under
CONFIG_NVGPU_GR_FALCON_NON_SECURE_BOOT checking.
This includes:
Hals for load_ctxsw_ucode_header and  load_ctxsw_ucode_boot.
nvgpu_gr_falcon_load_gpccs_with_bootloader and static functions
called from this function.

Jira NVGPU-4453

Change-Id: I56e6d585a26fcf593059a5157de07b77e3b9f00d
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2255490
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Deepak Nibade
4554b4654a gpu: nvgpu: make gops.gr.init.fs_state return void
This HAL function does not return any real error at all.
So just change the return type to void.

In case of vGPU, this function only calls another HAL
gops.gr.config.init_sm_id_table(). So unset gops.gr.init.fs_state()
for vGPU, and call gops.gr.config.init_sm_id_table() directly.

Jira NVGPU-4373

Change-Id: I06a80520e9be50a0703608a79187c553b33aa582
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2247844
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
bb942331e8 gpu: nvgpu: compile out unused code in gr.falcon unit
NVGPU_GR_FALCON_METHOD_HALT_PIPELINE section is used only
with CONFIG_NVGPU_ENGINE_RESET setting.

Jira NVGPU-4453

Change-Id: Ia33e370486ebcd2052b7ec9d530503f93e798cbf
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253865
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Divya Singhatwaria
84a24c9593 gpu: nvgpu: Remove TPC powergate from safety build
- Remove non-safe TPC powergate feature from the safety
  build by introducing a new flag:
  CONFIG_NVGPU_TPC_POWERGATE

- Move nvgpu_init_power_gate_gr() under same compile time flag.
  and move HAL function gr_gv11b_powergate_tpc() to tpc_gv11b.c

- Also, remove the negative test scenario and
  usage of tpc_powergate from unit tests

JIRA NVGPU-4149

Change-Id: If489482401e94de499e472b16b1bc091b00992e6
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2242323
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00