nvgpu_rc_pbdma_fault just checks for the id and id_type from struct
nvgpu_pbdma_status_info. These contain invalid values during chsw_load
and chsw_switch. This patch corrects the above bug by checking for the
chsw status and then loading the values for id and type.
The current code reads the pbdma_status info after clearing the
interrupt. Other interrupts can cause enough delay between clearing the
interrupt and pbdma switching the channel leading to invalid channel/tsg
ID. Correct that by reading the pbdma_status info register before
clearing of the pbdma interrupt to correctly read the context
information before the pbdma can switch out the context.
Bug 2648298
Change-Id: Ic2f0682526e00d14ad58f0411472f34388183f2b
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2165047
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove below hals since the corresponding functions are same on all
platforms and they are h/w independent
g->ops.gr.enable_ctxsw()
g->ops.gr.disable_ctxsw()
g->ops.gr.reset()
Call the functions directly at all places
Remove CONFIG_NVGPU_DEBUGGER from places where these functions are
called since they are not debugger dependent
This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery
sequence intact
Jira NVGPU-3506
Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2137255
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move chip specific recovery code for volta onwards
architecture to hal/rc/rc_gv11b.c
Rename
fifo.teardown_ch_tsg -> fifo.recover
gk20a_runlist_update_locked -> nvgpu_runlist_update_locked
Remove
Unused h/w headers from fifo_gv11b.c
Use local variable f instead of g->fifo
JIRA NVGPU-1314
Change-Id: Ia535bbe4780e7241fdd911a8f577c6b98cf0fe53
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2102897
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Basic units like fifo, rc are having dependency on
gr_falcon. Avoided outside gr units dependency on gr_falcon
by moving following functions to gr:
int nvgpu_gr_falcon_disable_ctxsw(struct gk20a *g,
struct nvgpu_gr_falcon *falcon); ->
int nvgpu_gr_disable_ctxsw(struct gk20a *g);
int nvgpu_gr_falcon_enable_ctxsw(struct gk20a *g,
struct nvgpu_gr_falcon *falcon); ->
int nvgpu_gr_enable_ctxsw(struct gk20a *g);
int nvgpu_gr_falcon_halt_pipe(struct gk20a *g); ->
int nvgpu_gr_halt_pipe(struct gk20a *g);
HALs also moved accordingly and updated code to reflect this.
Also moved following data back to gr from gr_falcon:
struct nvgpu_mutex ctxsw_disable_mutex;
int ctxsw_disable_count;
JIRA NVGPU-3168
Change-Id: I2bdd4a646b6f87df4c835638fc83c061acf4051e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2100009
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Removed unused struct from gr_gk20a.h
Change static allocation for struct gr_gk20a to dynamic type.
Change all the files that being affected by that change.
Call gr allocation from corresponding init_support functions, which
are part of the probe functions.
nvgpu_pci_init_support in pci.c
vgpu_init_support in vgpu_linux.c
gk20a_init_support in module.c
Call gr free before the gk20a free call in nvgpu_free_gk20a.
Rename struct gr_gk20a to struct nvgpu_gr
JIRA NVGPU-3132
Change-Id: Ief5e664521f141c7378c4044ed0df5f03ba06fca
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2095798
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Delete apply_ctxsw_timeout_intr ops and add
ctxsw_timeout_enable ops
Move chip specific sched_error and ctxsw_timeout
functions to hal/fifo/fifo_intr_* and hal/fifo/ctxsw_timeout_*
Add nvgpu_rc_ctxsw_timeout function under common/rc/rc.c
Do not check ctxsw timeout for channels that are no more
bound to tsg.
JIRA NVGPU-1312
Change-Id: Ide977fb60b3b72a27d9f22873f7a416c3bd1181d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2075734
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>