- On older chips, PMU uses CMD-MSG queue method to
communicate with NvGPU.
- From Turing onwards, PMU uses RPC method for this.
- During poweroff, we release pmu_sequence and reset the
members of the structure.
- For chips that use RPC, we need to free the payload as well
and then reset the members.
- Add pmu_seq_cleanup hal for this.
Bug 4019694
Bug 4059157
Change-Id: Ieb474fe4ed81f54d78480214cde53b51d45652c6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2882267
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently KMD_SCHEDULING_WORKER_THREAD can be enabled/disabled using
compile time flag but this flag does give ability to control the
feature based on the chip.
GSP is enabled only on ga10b where KMD_SCHEDULING_WORKER_THREAD should
be disabled while should be enabled for other chips at the same time
to support GVS tests.
Change adds enabled flag to control KMD_SCHEDULING_WORKER_THREAD based
on the chip.
Bug 3935433
Change-Id: I9d2f34cf172d22472bdc4614073d1fb88ea204d7
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2867023
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Subcontext PDBs and valid mask in the instance blocks of the channels
in various subcontexts has to be updated when new subcontext is
created or a subcontext is removed.
Replayable fault state is cached in the channel structure. Replayable
fault state for subcontext is set based on first channel's bind
parameter. It was earlier programmed in function channel_setup_ramfc.
init_inst_block_core is updated to setup TSG level pdb map and mask.
Added new hal gv11b_channel_bind to enable the subcontext on channel
bind.
Bug 3677982
Change-Id: I58156c5b3ab6309b6a4b8e72b0e798d6a39c1bee
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2719994
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The wait_pending HAL is now modified to simply
check the pending status of a given runlist.
The while loop is removed from this HAL.
A new function nvgpu_runlist_wait_pending_legacy() is
added that emulates the older wait_pending() HAL.
nvgpu_runlist_tick() is modified to accept a 64 bit
"preempt_grace_ns" value.
These changes prepare for upcoming control-fifo parser
changes.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: If3f288eb6f2181743c53b657219b3b30d56d26bc
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766100
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
While creating a new channel, ioctls are called in the below sequence:
1. GPU_IOCTL_OPEN_CHANNEL
2. AS_IOCTL_BIND_CHANNEL
3. TSG_IOCTL_BIND_CHANNEL_EX
4. CHANNEL_ALLOC_GPFIFO_EX
5. CHANNEL_ALLOC_OBJ_CTX.
subctx pdbs and valid mask are programmed in the channel instance block
in the channel ioctls AS_IOCTL_BIND_CHANNEL & CHANNEL_ALLOC_GPFIFO_EX.
Programming them in the ioctl AS_IOCTL_BIND_CHANNEL is redundant.
Remove related hal g->ops.mm.init_inst_block_for_subctxs.
The hal init_inst_block will program context pdb and big page size.
The hal init_inst_block_core will program context pdb, big page size
and subctx 0 pdb. This is used by h/w units (fecs, pmu, hwpm, bar1,
bar2, sec2, gsp, perfbuf etc.).
For user channels, subctx pdbs are programmed as part of ramfc setup.
Bug 3677982
Change-Id: I6656b002d513404c1fd7c3d349933e80cca7e604
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2680907
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Patch defines a ZBC static table and configure it at sw layer. Later
existing API read this sw configuration and program it to hw.
This is applicable only for ga10b safety build and for other chips/
configuration it will be supported in the legacy way.
Bug 3585766
Change-Id: I00d79162c0b096616e3f555da965e82e47c014d1
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2713821
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
BVEC changes for nvgpu_rc_pbdma_fault and nvgpu_rc_mmu_fault
started reporting below MISRA issue.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:321:
1. misra_c_2012_rule_5_1_violation: Declaration with identifier
"nvgpu_tsg_unbind_channel_check_hw_state", which is ambiguous.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:349:
2. other_declaration: The first 31 characters of identifiers
"nvgpu_tsg_unbind_channel_check_ctx_reload" and
"nvgpu_tsg_unbind_channel_check_hw_state" are identical.
Do below renames to fix the issue. Doing both for consistency.
s/nvgpu_tsg_unbind_channel_check_hw_state/nvgpu_tsg_unbind_channel_hw_state_check
s/nvgpu_tsg_unbind_channel_check_ctx_reload/nvgpu_tsg_unbind_channel_ctx_reload_check
JIRA NVGPU-6772
Change-Id: Ib92cabe11c486621351bf15ddb86e20d16d514c4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584152
(cherry picked from commit a619f259c6a4ffccb05550767212989af60c2a90)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2706551
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
The flag pmu->pg->golden_image_initialized is set to
true during initial GPU context creation and is not
cleared while the GPU goes into pm_suspend (during railgate).
Hence, when the GPU resumes after un-railgate it retains
the previous value which can cause ELPG to kick in immediately.
Due to this, when ELPG and Railgating are enabled, IDLE_SNAP
is seen for read access of gr_gpc0_tpc0_sm_arch_r reg.
To resolve this, if golden image is ready set the
pmu->pg->golden_image_initialized to suspend state during railgate,
to delay the early enable of ELPG. Add a new
pmu_init_golden_img_state hal in the NVGPU_INIT_TABLE_ENTRY.
This will be called after all the GR access is done and GPU resumes
completely after un-railgate. This hal will then check if
golden_image_initialized flag is in suspend state, it will set it
to ready state and then re-enable ELPG.
Bug 3431798
Change-Id: I1fee83e66e09b6b78d385bbe60529d0724f79e79
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2639188
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Earlier, buffer metadata support was made dependent on compression.
However that is not required.
Update the enabled flag NVGPU_SUPPORT_BUFFER_METADATA setup for
various hals. Enable it for all from linux characteristics init.
Update REGISTER_BUFFER and GET_BUFFER_INFO ioctls to seggregate
the compile/runtime compression functionality.
If compression is disabled, return error in case comptags are
required else don't fail the REGISTER_BUFFER ioctl.
Bug 200767700
Change-Id: I3850ccc879f180c97b830fb3d652c094b9d28a5b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614378
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GPU HW expects physically contiguous addresses when clearing
the compression bit store in memory. Currently on hypervisor setup,
the DMA_ATTR_FORCE_CONTIGUOUS flag ensures contiguous IPA, but it
is not possible to ensure contiguous physical memory.Disable
compression on virtualized environments until physically contiguous
memory is feasible.
Buffer Metadata support is dependent on compression support.
Move the initialization of NVGPU_SUPPORT_BUFFER_METADATA flag to
common code where NVGPU_SUPPORT_COMPRESSION is initialized.
Bug 200780546
Change-Id: Id94bffc878e275a80948880f0475162d0bb4ddae
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2607830
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
- On pre-silicon platform, static pg will be
done by nvgpu driver. For this, retain structs
and HALs of static pg.
- Add the static pg support under pre-silicon code.
- On silicon, the static pg will be done by BPMP.
- Rename variables used in static pg for better
readability and consistency
Bug 200768322
JIRA NVGPU-6433
Change-Id: Ib31c0f83b751c2b1563a36bd51af78a0bd12a117
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2594801
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
common.fbp has two interfaces to initialize FBP:
1. Public API nvgpu_fbp_init_support
2. HAL fbp.fbp_init_support
nvgpu_fbp_init_support() is only used to initialize HAL
fbp.fbp_init_support. Remove the HAL and use the API directly.
JIRA NVGPU-6644
Change-Id: I2c455e09dbcf5e4fb1dc370b284e4f0d5c678b40
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592047
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
After CONFIG_UBSAN kernel compilation flag to know any shifting
cause overflow or not enablement ,this is identified.
The register "gr_fe_tpc_fs_r(gpc_index)" is read only after
Volta. The gops where we are computing the index is not needed.
Bug 200727116
Change-Id: Ib2306103389ba9df77fd59d012ec70e775104989
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2573296
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
There is HW specific limit on number of channel entries that can be
added for each TSG entry in runlist. Right now there is no checking
to enforce this from SW and hence if User binds more than supported
channels to same TSG, invalid TSG formation error interrupts are
generated.
Fix this by adding appropriate checks in below steps :
- Add new field ch_count to struct nvgpu_tsg to keep track of
channels bound to TSG.
- Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve
HW specific maximum channel count per TSG.
- Implement the HAL for gk20a and gv11b chips, and assign new HALs for
all chips appropriately.
- Increment ch_count while binding the channel to TSG and decrement it
while unbinding.
- While binding channel to TSG, Check if current channel count is
already equal to max channel count. If yes, print an error and bail
out.
Bug 200763991
Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
- Add gops_fbp_fs and gops_gpc_pg struct
- Add HALs to write to NV_FUSE_CTRL_OPT_FBP and
NV_FUSE_CTRL_OPT_GPC fuses needed for floorsweeping
- Add set_fbp_mask and set_gpc_mask to probe FBP and GPC mask
respectively during gpu probe
- Add sysfs node: fbp_fs_mask and gpc_fs_mask to store
FBP and GPC floorsweeping mask sent from userspace
- Move the floorsweeping programming early in NVGPU’s GPU init
function and then issue a PRI init.
JIRA NVGPU-6433
Change-Id: I84764d625c69914c107e1e8c7f29c476c2f64f78
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2499571
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Enable nvriscv debug buffer feature in NVGPU.
Debug buffer is a feature to print the debug log from ucode onto console
in real time.
Debug buffer feature uses the DMEM, queue and SWGEN1 interrupt to share
ucode debug data with NVGPU.
Ucode writes debug message to DMEM and updates offset in queue to trigger
interrupt to NVGPU.
NVGPU copies the debug message from DMEM to local buffer to process and
print onto console.
Debug buffer feature is added under falcon unit and required engine
can utilize the feature by providing required param through public
functions.
Currently GA10B NVRISCV NS/LS PMU ucode has support for this feature
and enabled support on NVGPU side by adding required changes, with this
feature enabled, it is now possible to see prints in real time.
JIRA NVGPU-6959
Change-Id: I9d46020470285b490b6bc876204f62698055b1ec
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548951
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Issue observed:
- In GA10B, it was observed that after recovery happens
ELPG does not engage.
- It was because, after CE reset, when nvgpu_submit_twod test
was run to engage ELPG, IDLE_FLIPPED_PWR_OFF signal was asserted.
- This means that when ELPG was engaged (engine is in PWR_OFF),
some idle signal flips (becomes non-idle) and this causes
IDLE_SNAP. After IDLE_SNAP is hit, ELPG will not engage further.
- After debugging from WAVES, it was observed that:
LCE0/LCE1 are not done with the reset sequence.
- The state of these LCE is RESET0. A pri request (pri read
to NV_CE_PCE_MAP register in CE) is seen that kicks it out of
RESET0. After this state, it goes through few states to update
some internal states (states RESET1/RESET2/PCE_MAP etc) and then
eventually settles down to IDLE state.
Solution:
- Read ce_pce_map_r register in recovery sequence (after ce reset).
- It is observed that when this read is added recovery is complete
and post that when nvgpu_submit_two test is executed, ELPG is engaging.
- This means that a pri read is needed after CE reset so that it settles
to idle state properly and post that ELPG can engage properly.
Bug 200734258
Change-Id: I5bb84921ca62a740fde81ffe6c29ccde4ebb341b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554493
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
On all chips except ga10b, the number of ROP, L2 units per FBP
were in sync, hence, their FS masks could be represented by a single
fuse register NV_FUSE_STATUS_OPT_ROP_L2_FBP. However, on ga10b, the ROP
unit was moved out from FBP to GPC and it no longer matches the number
of L2 units, so the previous fuse register was broken into two -
NV_FUSE_CTRL_OPT_LTC_FBP, NV_FUSE_CTRL_OPT_ROP_GPC.
At present, the driver reads the NV_FUSE_CTRL_OPT_ROP_GPC register
and reports incorrect L2 mask. Introduce HAL function
ga10b_fuse_status_opt_l2_fbp to fix this.
In addition, rename fields and functions to exclusively fetch L2 masks,
this should help accommadate ga10b and future chips in which L2 and ROP units
are not in same. As part of this, the following functions and
fields have been renamed.
- nvgpu_fbp_get_rop_l2_en_mask => nvgpu_fbp_get_l2_en_mask
- fuse.fuse_status_opt_rop_l2_fbp => fuse.fuse_status_opt_l2_fbp
- nvgpu_fbp.fbp_rop_l2_en_mask => nvgpu_fbp.fbp_l2_en_mask
The HAL ga10b_fuse_status_opt_rop_gpc is removed as rop mask is not
used anywhere in the driver nor exposed to userspace.
Bug 200737717
Bug 200747149
Change-Id: If40fe7ecd1f47c23f7683369a60d8dd686590ca4
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551998
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.
CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).
Split the CIC APIs and data-members into above two subunits.
JIRA NVGPU-6899
Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GFxP preemption for graphics contexts is not supported in safety.
But the support was enabled along with CONFIG_NVGPU_GRAPHICS since GFxP
preemption was protected under same config.
Add a separate config CONFIG_NVGPU_GFXP to protect all GFxP specific
code, enum values, and HALs.
Disable the config in safety profile.
Jira NVGPU-6893
Change-Id: Iebb5f754a1025dfa6e05a94704bdb8a7123b599a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534986
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.
This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.
New APIs are exposed by CIC unit to access its internal data like:
1. Struct err_desc - the static err handling /injection data per
error id
2. Num_hw_modules - the number of error reporting HW units
supported by CIC
Init and deinit of CIC unit:
1. CIC unit should be initialized earlyon during boot so that it
is available for any interrupt handling.
2. Initialize CIC just before the interrupts are enabled during
boot.
3. Similarly, CIC is disabled late during deinit cycle; right
after the interrupts are masked.
LUT:
1. LUT is currently used only for reporting error to safety
services in gv11b QNX safety build.
2. This error handling policy LUT currently has only two levels
of handing - correctable and quiecse.
3. Once, the error handling policy decision is moved from leaf
unit nodes to CIC, LUT will be updated to have additional levels
like fast recovery and full recovery.
4. Also, then a separate LUT will be added for each platform/build.
5. In current framework, the LUT is set to NULL for all
configurations except gv11b.
report_err() ops is added to report error to safety services.
This ops is only effective for gv11b qnx build; and set to NULL for
other configurations.
NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754
Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, there are few chip specific erratas present in nvgpu code.
For better traceability of the erratas and corresponding fixes,
introduce flags to indicate existing erratas on a chip. These flags
decide if a corresponding solution is applied to the chip(s).
This patch introduces below functions to handle errata flags:
- nvgpu_init_errata_flags
- nvgpu_set_errata
- nvgpu_is_errata_present
- nvgpu_print_errata_flags
- nvgpu_free_errata_flags
nvgpu_print_errata_flags: print below details of erratas present in chip
1. errata flag name
2. chip where the errata was first discovered
3. short description of the errata
Flags corresponding to erratas present in a chip are set during chip hal
init sequence.
JIRA NVGPU-6510
Change-Id: Id5a8fb627222ac0a585aba071af052950f4de965
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2498095
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Update the nvgpu_runlist_update_for_channel() function:
- Rename it to nvgpu_runlist_update()
- Have it take a pointer to the runlist to update instead
of a runlist ID. For the most part this makes the code
better but there's a few places where it's worse (for
now).
This starts the slow and painful process of moving away from
the non-runlist code using runlist IDs in many places it should
not.
Most of this patch is just fixing compilation problems with
the minor header updates.
JIRA NVGPU-6425
Change-Id: Id9885fe655d1d750625a1c8aceda9e67a2cbdb7a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470304
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
The NEXT bit can remain set for the channel if timeslice expires before
scheduler clears it. Due to this nvgpu fails TSG unbind and in turn
nvrm_gpu fails channel close. In this case, checking the channel hw
state after some time can help see NEXT bit cleared by scheduler.
Reenable the tsg and return -EAGAIN to nvrm_gpu for it to retry again.
Bug 3144960
Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit