As per Safety_Services, a client must perform polling to ensure that the
previously reported errors are cleared at FSI, in case of back-to-back
error reporting. However, to minimize the polling overhead, NvGPU driver
performs polling only when the error to be reported is corrected error
to ensure that it is not overwriting the previously reported
uncorrected/corrected error. In case of uncorrected errors, it will be
reported without doing polling. This situation leads to a failure in
error reporting, when uncorrected errors are reported back-to-back. This
is acceptable for safety builds where SW quiesce will be triggered
immediately after the reporting of first uncorrected error. In case of
other build configurations, MCU/SEH takes the decision on encountering
uncorrected errors. To handle such build configurations, polling is
enabled for all types of errors, in all build configurations.
This patch also removes an unused macro "ERR_TYPE_MASK".
Bug 3622420
Change-Id: I750b0406faec9b229d8d0c74e986807234362cb9
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707105
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
There shouldn't be an usecase that an fd, installed by nvgpu,
must be shared on exec with the new process. This doesn't only
lead to excessive number of fds in the exec process, but also
can lead to potential security issues.
This patch marks the fds with O_CLOEXEC, so that they get
closed on exec.
Bug 3583628
Change-Id: I3499b1429ac512b2c172e9e628d0a7a1417d72e3
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704350
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
The emulate mode support is determined after chip detect and is flagged
by using NVGPU_SUPPORT_EMULATE_MODE flag. The present logic prevents
user from configuring the emulate mode sysfs knobs if this flag is not
set, however the emulate mode usecase requires the user to configure the
syfs knob prior to power-on, hence defer emulate mode check to a later
stage after chip detect.
Bug 3621460
Change-Id: If522527542fa8d7e95ccbcff43b74adbb9e976e6
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2703953
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mayur Poojary <mpoojary@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: David Li <davli@nvidia.com>
Fix following Coverity defect:
cde.c : Unintentional integer overflow
The Coverity issue suggests that (u64) (nvgpu_ltc_get_ltc_count(g) * nvgpu_ltc_get_slices_per_ltc(g) * nvgpu_ltc_get_cacheline_size(g)) can cause overflow because typecasting is done after multiplication.
This patch solves that issue by typecasting it before multiplication.
CID 10112360
Bug 3460991
Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I314ca7a9adc95fcb09f15eb603b56ad03ce34b99
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697027
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fixed following Coverity Defects:
ioctl_as.c : Bad bit shift operation
mc_tu104.c : Bad bit shift operation
vm.c : Bad bit shift operation
vm_remap.c : Bad bit shift operation
A new linux header file for ilog2 is created.
The files which used the old ilog2 function
have been changed to use the new nvgpu_ilog2
function.
CID 9847922
CID 9869507
CID 9859508
CID 10112314
CID 10127813
CID 10127899
CID 10128004
Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: Ia201eea7cc426c3d6581e1e5ae3b882dbab3b490
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2700994
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix following Coverity Defect:
nvgpu_init.c : Unused value
The ret variable was being reassigned the error code from
nvgpu_cic_mon_deinit(g) without taking into account the previous ret value.
We need to propagate whether there is an error
(the last known error is returned) or not using ret, the temp_ret
variable helps in verifying this.
Similar coding style followed in the entire function.
CID 10127863
Bug 3460991
Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I732ba5269ebbbe68f113e53229df40ae49ccc13c
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697104
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The function nvgpu_profiler_unbind_pm_resources is responsible for
destroying the regops allowlist object, but unfortunately does it
prior to any of the failure-prone operations. Because this function
can be called multiple times, in rare cases it can happen that
object is deallocated twice.
This patch fixes the issue by moving the free operations after the
failure-prone operations.
Bug 3591603
Change-Id: I3415712da561ccf162c9fb7f3ebb942faa9d9420
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693803
(cherry picked from commit I3415712da561ccf162c9fb7f3ebb942faa9d9420)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693799
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Prior to starting any profiling session, GPU power features should be
disabled, not doing this can result in unexpected behavior, print a
error message under such situations.
Jira NVGPU-7326
Bug 3579926
Change-Id: Ia9716d73ed945a588792a7d843aed64bf3b7f83b
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2701501
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
After the board boots up, we have a specific set of settings for
each power mode which contains, fbp, tpc mask values, cpu, gpu
frequencies etc. So, we need to make sure to provide all the setting
values for each mode so that those will be applied correctly.
So, the fbp, tpc, gpc mask could be same in different power modes
and since the golden image context check which comes before is
returning -ENODEV, the nvpmodel service fails to set the power mode.
Thus, we need to compare whether the fbp and gpc mask values are same
before we check for the golden image context.
Bug 3581634
Change-Id: I3fb1398d47cd37bf49ad70cf80b057d4b80dec04
Signed-off-by: Ninad Malwade <nmalwade@nvidia.com>
(cherry picked from commit a27e202402)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2688116
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
RO buffers should not be accessed via any paths in userspace/kernel.
vmap is one such path. nvmap now returns error if vmap is attempted
on RO buffer. However that leads to kernel warning. nvgpu should
ensure that buffer is RW before performing vmap on the buffer.
With this check, dma_buf_vmap will not be called for RO buffers.
Bug 3562426
Change-Id: I4782d2ae6b17243fdb3d0fc5d5d732a17f694a7c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693976
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Puneet Saxena <puneets@nvidia.com>
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Fix CERT issue in nvgpu_gr_falcon_bind_fecs_elpg where nvgpu_pmu_pg_buf
could return NULL. nvgpu_pmu_pg_buf is called from context where PG
will be enabled hence remove the NULL return logic as it is dead
code.
Replace nvgpu_pmu_pg_buf and nvgpu_pmu_pg_buf_get_cpu_va functions by
new function nvgpu_pmu_pg_buf_alloc.
CID 17860
Bug 3512546
Change-Id: I09820a966dadeb258167ce1433ca256f94845896
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2692466
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_ctrl_dev_ioctl clk_set_info: An unscrutinized value num_entries
is used as a loop bound. An attacker could control the number of times
the loop iterates.
Loop iterator is signed int which can lead to unpredictable results,
Hence change it to u32. And sanitize the num_entries parameter.
CID 1993996
Bug 3460991
Change-Id: Ib644cf19f016ab80a3f2d66f156ca863f8e138e1
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693942
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Disable below interrupts on safety as they do not report any error
condition and are not used by CUDA and Graphics(VKSC) on safety
build.
Signoff from CUDA and VKSC is on Bug https://nvbugs/3588603
1. NV_PGRAPH_INTR_NOTIFY: This intr is set when the Notification
style is WRITE_THEN_AWAKEN.
2. NV_PGRAPH_INTR_SEMAPHORE: This is set when a 3d class sempahore is
released as the result ofa SetSemaphoreD method, when the
AwakenEnable field is TRUE.
3. NV_PGRAPH_INTR_BUFFER_NOTIFY: This bit is set when a Mem2mem DMA
completes and the LaunchDma method specifies the interrupt type
as INTERRUPT
4. NV_PGRAPH_INTR_DEBUG_METHODS: This is debug feature and not used
on QNX safety
Bug 3588603
JIRA NVGPU-8166
Change-Id: I6d07dfd2857ac047fac4599421600d364251df76
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2694363
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Update PES, ROP exception handling for NVGPU_ERRATA_3524791. Enable the
errata for all Volta+ chips.
ROP, PES exceptions are being reported using the physical-id,
where logical-id should have been used. All ESR status registers are
reported using logical-id, so this matches with the SW expectation.
To address the (1), update ROP, PES exception handler translate from
physical to logical-id before reading the status registers.
Bug 3524791
Change-Id: Ieacbfb306bb0e69cf0113dc92f18e401573722e3
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2680029
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Volta+ chips supports PES floorsweeping and Ampere+(iGPU) chips supports
ROP floorsweeping. At present, the driver isn't aware of PES, ROP
floorsweeping, make the driver PES, ROP floorsweeping aware by introducing the
following fields in nvgpu_gr_config:
- gpc_(rop/pes)_mask: Contains the bit mask of non FSed ROP/PES units per GPC.
- gpc_(rop/pes)_logical_id_map: Translates per GPC ROP/PES physical id to
logical id.
Introduce the following HAL functions to read PES/ROP FS data:
- gops_fuse.fuse_status_opt_(pes/rop)_gpc: This fuction gets the FS
config from the fuse.
- gops_top.get_max_(pes/rop)_per_gpc: Gets the maximum number of PES/ROP
units that can be present in a GPC.
In addition, introduce the enabled flag NVGPU_SUPPORT_PES_FS to identify chips
which support PES floorsweeping, piggyback on NVGPU_SUPPORT_ROP_IN_GPC
enabled flag to identify ROP floorsweeping.
Bug 3524791
Change-Id: I065bab6c02618fe38892c8c890b069c340b85301
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2679570
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Upstream commit 7938f4218168 ("dma-buf-map: Rename to iosys-map")
renames 'struct dma_buf_map' to 'struct iosys_map' and breaks building
the NVGPU driver with Linux v5.18-rc1. In the NVGPU driver there are
many places where 'dma_buf_map' is used and so to clean-up the code and
minimise the impact of this change, add a gk20a_dmabuf_vmap() and a
gk20a_dmabuf_vunmap() helper function. These new functions support all
kernel versions and eliminate a lot the KERNEL_VERSION ifdefs.
Bug 3598986
Change-Id: Id0f904ec0662f20f3d699b74efd9542d12344228
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693970
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new profiler resource type NVGPU_PROFILER_PM_RESOURCE_TYPE_PC_SAMPLER.
Introduce regops HAL get_hwpm_pc_sampler_register_ranges to get
allowlist for PC_SAMPLER resources. Re-generate allowlist files to include
register ranges for PC_SAMPLER resources.
Update uapi header to advertise new resource type
NVGPU_PROFILER_PM_RESOURCE_ARG_PC_SAMPLER.
Bug 3408536
Change-Id: I7009ef822665771eed727da48ef1e89dcc6b9c4b
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689057
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
- When read from aelpg_param_read sysfs node and
write to aelpg_param_store sysfs node is done it
leads to system crash.
- This issue is seen on safety build as power features
are not enabled.
- To avoid this crash, add aelpg platform flag check in
aelpg_param_read and aelpg_param_store sysfs nodes.
- Also, as AELPG depends on ELPG add can_elpg check before
enabling/disabling aelpg through sysfs node.
Bug 3582946
Change-Id: Iaf709db2b5dc0340390767f4b06a0ac06962ed77
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2690548
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Replace the usage of tegra_fuse_readl with nvmem_cell_read_u32 for the
below fuse registers added as nvmem cells on v5.10+ kernels.
Older nvidia kernels do not have these tegra nvmem cell support.
1. FUSE_GCPLEX_CONFIG_FUSE_0
2. FUSE_RESERVED_CALIB0_0
3. FUSE_PDI0
4. FUSE_PDI1
bug 200633045
Change-Id: I187400720929233fcbc1970c9bbed34347b0a9a7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2670828
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
strncpy to a string from non-null terminating string without checking the
size of the source string can lead to target string become non-null
terminating. This can't be passed to strcat as it expects null
terminating string.
Check the size of the source string "name" in nvgpu_page_allocator_init.
CID 81207
Bug 3512546
Change-Id: I4b245a8c2236038c40912cee72d4dbf1ca14a525
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2692604
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
- When DISALLOW cmd is sent from driver to PMU the actual
completion of the disallow will be acknowledged by PMU
via a PG EVENT: ASYNC_CMD_RESP.
- Disallow needs a delayed ACK from PMU in order to disable
the ELPG.
- If ELPG is already engaged, the DISALLOW cmd will trigger
ELPG exit and then transition to PMU_PG_STATE_DISALLOW.
- After this whole process is completed, PMU will send
DISALLOW_ACK through ASYNC_CMD_RESP msg.
- After disallow command is sent from the driver, NvGPU driver
waits/polls for disallow command ack. This is sent immediately
by msg framework of PMU.
- Then, the driver will poll/wait for ASYNC_CMD_RESP event which
is the delayed DISALLOW ACK.
- The driver captures the ASYNC_CMD_RESP sent from PMU.
- set disallow_state to ELPG_OFF.
- If the driver does not wait/poll for this delayed disallow
ack from PMU, it can result in erros as PMU is still
processing DISALLOW cmd but the driver progressed further.
Bug 3580271
Change-Id: I332180c05b6a398107f065d54e9718b7038fb1b2
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689500
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Update GR suspend routine to clear GR falcon "coldboot_bootstrap_done"
flag, this is needed because GPU power rails are turned off during
suspend cycle due to which GR falcons need to be bootstrapped again
during resume.
Function "nvgpu_gr_falcon_suspend" is added to clear the above mentioned
flag.
Bug 3497398
Bug 3514055
Change-Id: If852a2c09f05c096f287b845c56d8b4f335ec8e7
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2670554
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
The high level API for the timer unit is the same across all OSs, so
get rid of the slight code duplication by moving the timer init
functions under a new file in common code:
- nvgpu_timeout_init_cpu_timer
- nvgpu_timeout_init_cpu_timer_sw
- nvgpu_timeout_init_retry
Much of the timer logic is also duplicated, but it is mixed between OS
specific current time retrieval. With some refactoring and addition of
an OS independent time keeping layer, that logic could also be made
shared.
Change-Id: I75d02ceb0d32022b0ba7f3bcd9fdb13d47039dbc
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2669510
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below listed HSI are handled with PMU ISR handler and
all these triggers interrupt from individual unit upon issue.
-Add ECC check for IMEM, DMEM, DCLS, REG, and MPU as per
HSI req
-Add MEMERR check for GPU_PMU_ACCESS_TIMEOUT_UNCORRECTED
PMU HSI id
-Add IOPMP check for GPU_PMU_ILLEGAL_ACCESS_UNCORRECTED
PMU HSI id
-Add WDT check for GPU_PMU_WDT_UNCORRECTED PMU HSI id
Bug 3491596
Bug 3366818
Change-Id: I751d653e447017ac62a2459da2c6bb9da506f438
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2686566
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Add PMU NVRISCV BR failure HSI support.
- Created a falcon unit function to check for the
BR competition status check and called from
other units as needed.
Bug 3491596
Bug 3366818
Change-Id: I5c3c6a7e6aeaad68f77e6b24f21239e40d9a7f78
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2686370
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
In Drive 6.0, the error reporting is supported only for orin (ga10b)
in dev-main. For this purpose, this patch does the following:
- Removes the redundant reporting of following IDs from gv11b:
- GPU_HOST_PFIFO_SCHED_ERROR
- GPU_HOST_PFIFO_CTXSW_TIMEOUT_ERROR
- GPU_HOST_PBDMA_HCE_ERROR
- GPU_MMU_L1TLB_SA_DATA_ECC_UNCORRECTED
- GPU_MMU_L1TLB_FA_DATA_ECC_UNCORRECTED
- GPU_LTC_CACHE_DSTG_ECC_CORRECTED
- GPU_LTC_CACHE_TSTG_ECC_UNCORRECTED
- Migrates the reporting of following IDs from gv11b to ga10b:
- GPU_SM_L1_TAG_ECC_CORRECTED
- GPU_SM_L1_TAG_ECC_UNCORRECTED
- GPU_SM_CBU_ECC_UNCORRECTED
- GPU_SM_LRF_ECC_UNCORRECTED
- GPU_SM_L1_DATA_ECC_UNCORRECTED
- GPU_SM_ICACHE_L1_DATA_ECC_UNCORRECTED
- GPU_SM_ICACHE_L0_PREDECODE_ECC_UNCORRECTED
- GPU_SM_L1_TAG_MISS_FIFO_ECC_UNCORRECTED
- GPU_SM_L1_TAG_S2R_PIXPRF_ECC_UNCORRECTED
- Removes the unused ID that doesn't have any HSI related to it:
- GPU_HOST_PBDMA_PREEMPT_ERROR
In addition to the above, this patch does the following:
- Updates error IDs related to page fault error.
- Updates look-up table to remove unused error IDs.
JIRA NVGPU-8094
Bug 200729736
Change-Id: Ifea76d38ba609c894560e61ff5a6e406290f919e
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2685249
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit