Add handling for below two interrupts on top of legacy ones. When pending,
PBDMA is stalled and s/w is expected to execute teardown.
clear_faulted_error: host is asked to clear fault status when no fault has been asserted.
eng_reset: An engine was reset while the PBDMA unit was processing a
channel from a runlist which serves the engine.
JIRA GPUT19X-47
Change-Id: I776e5799a73a1b63c394048fa61b597e621cf544
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1306558
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
if ch/tsg id is unknown and bit mask for the engines that need to be
recovered is not set, runlist mask should correspond to max number of
supported runlists
JIRA GPUT19X-7
Change-Id: I08e67af0846784a7918510d68de34e9162a42bf6
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1458155
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
- detect and decode sched_error type. Any sched error starting with xxx_* is
not supported in h/w and should never be seen by s/w
- for bad_tsg sched error, preempt all runlists to recover as faulted ch/tsg
is unknown. For other errors, just report error.
- ctxsw timeout is not part of sched error fifo interrupt. A new
fifo interrupt, ctxsw timeout is added in gv11b. Add s/w handling.
Bug 1856152
JIRA GPUT19X-74
Change-Id: I474e1a3cda29a450691fe2ea1dc1e239ce57df1a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1317615
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
gk20a_err() and gk20a_warn() require a struct device pointer,
which is not portable across operating systems. The new nvgpu_err()
and nvgpu_warn() macros take struct gk20a pointer. Convert code
to use the more portable macros.
JIRA NVGPU-16
Change-Id: I8c0d8944f625e3c5b16a9f5a2a59d95a680f4e55
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1459822
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Context TSG teardown procedure:
1. Disable scheduling for the engine's runlist via NV_PFIFO_SCHED_DISABLE.
This enables SW to determine whether a context has hung later in the
process: otherwise, ongoing work on the runlist may keep ENG_STATUS from
reaching a steady state.
2. Disable all channels in the TSG being torn down or submit a new runlist
that does not contain the TSG. This is to prevent the TSG from being
rescheduled once scheduling is reenabled in step 6.
3. Initiate a preempt of the engine by writing the bit associated with its
runlist to NV_PFIFO_RUNLIST_PREEMPT. This allows to begin the preempt
process prior to doing the slow register reads needed to determine
whether the context has hit any interrupts or is hung. Do not poll
NV_PFIFO_RUNLIST_PREEMPT for the preempt to complete.
4. Check for interrupts or hangs while waiting for the preempt to complete.
During the pbdma/eng preempt finish polling, any stalling interrupts
relating to runlist must be detected and handled in order for the
preemption to complete.
5. If a reset is needed as determined by step 4:
a. Halt the memory interface for the engine (as per the relevant engine
procedure).
b. Reset the engine via NV_PMC_ENABLE.
c. Take the engine out of reset and reinit the engine (as per
relevant engine procedure)
6. Re-enable scheduling for the engine's runlist via NV_PFIFO_SCHED_ENABLE.
JIRA GPUT19X-7
Change-Id: I1354dd12b4a4f0e4b4a8d9721581126c02288a85
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1327931
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move code that touches host registers to fifo HAL. This sorts out
some of the dependencies between fifo HAL and channel HAL.
Change-Id: I2bff0443ae1c1fa5608e620974b440696d1cfdc4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1323385
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
-implement gv11b specific reset_enable_hw fifo ops
-timeout period in fifo_fb_timeout_r() is set to init instead of max
This register specifies the number of microseconds Host
should wait for a response from FB before initiating a timeout interrupt.
For bringup, this value should be set to a lower value than usual, such as
~.5 milliseconds (500), to help find out bugs in the memory subsystem.
-timeout period in pbdma_timeout_r() is set to init instead of max
This register contains a value used for detecting timeouts.
The timeout value is in microsecond ticks.
The timeouts that use this value are:
GPfifo fetch timeouts to FB for acks, reqs, rdats.
PBDMA connection to LB.
GPfifo processor timeouts to FB for acks, reqs, rdats.
Method processor timeouts to FB for acks, reqs, rdats.
The init value is changed to 64K us based on bug 1816557.
JIRA GPUT19X-74
JIRA GPUT19X-47
Change-Id: I6f818e129c3ea67571d206c5e735607cbfcf6ec6
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1325352
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
CTX_STATUS_SWITCH: Engine save hasn't started yet, continue to poll
CTX_STATUS_INVALID: The engine context has switched off. The
preemption step for this engine is complete.
CTX_STATUS_VALID or CTX_STATUS_CTXSW_SAVE: check the ID field:
* If ID matches the TSG for the context being torn down, the engine
reset procedure can be performed, or SW can continue
waiting for preempt to finish if id is not being torn down.
* If ID does NOT match, the context isn't running on the engine.
CTX_STATUS_LOAD: check the NEXT_ID field:
* If NEXT_ID matches the TSG of the context being torn down, the engine
is loading the context and reset can be performed
immediately or after a delay to allow the context a chance to load and
be saved off, or sw can continue waiting for preempt to finish if id
is not being torn down.
* If NEXT_ID does not match the TSG ID or CHID then the context is no
longer on the engine.
SW may alternatively wait for the CTX_STATUS to reach INVALID, but this
may take longer if an unrelated context is currently on the engine or
being switched to.
JIRA GPUT19X-7
Change-Id: I61499f932019de32e0200084c4c41b21a7cbbd2b
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1327164
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Init device_fatal, channel_fatal and restartable fifo intr pbdma s/w
variables for pbdma_intr_0 interrupt masks.
pbdma_intr_0 field changes for gv11b:-
bit 8(lbreq) does not exists in hw.
bit 28 (syncpoint_illegal)is removed in hw.
bit 20 is reused for clear_faulted_error in hw.
bit 24 (eng_reset) and bit 25 (semaphore) always existed in hw
but never handled in s/w. These are added as channel fatal.
JIRA GPUT19X-47
Change-Id: I13673430408f1cf7ef762075a29b94196f79a349
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1325401
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
preempt completion should be decided based on pbdma and
engine status. preempt_pending field is no longer used
to detect if preempt finished.
add a new function to to be used for preeempting ch and tsg
during recovery. If preempt timeouts while in recovery, do not
issue recovery.
JIRA GPUT19X-7
Change-Id: I0d69d12ee6a118f6628b33be5ba387c72983b32a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1309850
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gv11b sync point support is moved to a shim outside of GPU,
and gv11b does not support sync points anymore. Remove use of
the sync point protection.
JIRA GV11B-47
JIRA GV11B-2
Change-Id: I70f3d2ce0cfe016453efe03f2bbf64c59baeb154
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1300964
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Similar HW header update as has been done for all the other chips.
HW header files are located under:
drivers/gpu/nvgpu/include/nvgpu/hw/gv11b/
And can be included like so:
#include <nvgpu/hw/gv11b/hw_gr_gv11b.h>
Bug 1799159
Change-Id: If39bd71480a34f85bf25f4c36aec0f8f6de4dc9f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284433
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Restore golden context correctly with subcontext header.
Increase subctx header size to hold complete golden context.
Also fill function pointer for freeing context header.
Bug 1834201
Change-Id: Id8a3437bc437fef02ee15333c1163290217d34d1
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1282440
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
gv11b needs atleast one subcontext to submit work. To support
legacy in gv11b, currently main context is always copied into
subcontext0 (veid0) during channel commit instance.
As part of channel commit instance, veid0 for that channel is
created and relevant pdb and context info copied to vedi0.
JIRA GV11B-21
Change-Id: I5147a1708b5e94202fa55e73fa0e53199ab7fced
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1231169
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
For gv11b, userd is allocated from sysmem.
Updated gp_get and gp_put functions to read or
write from sysmem instead of bar1 memory.
In gv11b, after updating gp_put, it is required
to notify pending work to host through channel
doorbell.
JIRA GV11B-1
Change-Id: Iebc52e6ccfc8b9ca0c57b227190e0ce1161076f1
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1226613
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Included all basic ops for gv11b and updated
sm related functions to include new priv register
addresses.
Bug 1735757
Change-Id: Ie48651f918ee97fba00487111e4b28d6c95747f5
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1126961
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>