Commit Graph

21 Commits

Author SHA1 Message Date
David Nieto
345eaef6a7 gpu: nvgpu: GPC MMU ECC support
Adding support for GPC MMU ECC error handling

JIRA: GPUT19X-112

Change-Id: I62083bf2f144ff628ecd8c0aefc8d227a233ff36
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1490772
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-06-04 20:34:57 -07:00
David Nieto
6bc36bded0 gpu: nvgpu: L2 cache tag ECC support
Adding support for L2 cache tag ECC error handling

JIRA: GPUT19X-112

Change-Id: I9a8ebefe97814b341f57a024dfb126013adaac1c
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1489029
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-06-04 20:34:57 -07:00
Seema Khowala
77199c0225 gpu: nvgpu: gv11b: init enable_exceptions gr ops
Enable FE, MEMFMT, DS and GPC exceptions only.
Make sure corresponding HWW_ESR are enabled too.

JIRA GPUT19X-75

Change-Id: Icf47b7e531dd72b59cbc6ac54b5902187f703d61
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1474859
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-30 08:43:35 -07:00
David Nieto
2173add7ae gpu: nvgpu: per-chip GPCCS exception support
Adding support for ISR handling of GPCCS exceptions
and GCC ECC support

JIRA: GPUT19X-83

Change-Id: Ica749dc678f152d536052cf47f2ea2b205a231d6
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1480997
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-24 04:55:42 -07:00
Lakshmanan M
45ca7cb8c5 gpu: nvgpu: gv11b: Add GCC L1.5 parity support
Add handling of GCC L1.5 parity exception.

JIRA GPUT19X-86

Change-Id: Ie83fc306d3dff79b0ddaf2616dcf0ff71fccd4ca
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1485834
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-19 09:44:25 -07:00
Lakshmanan M
5a08eafbe0 gpu: nvgpu: gv11b: Add L1 DATA + iCACHE parity
This CL covers the following parity support (uncorrected error),
1) SM's L1 DATA
2) SM's L0 && L1 icache

Volta Resiliency Id - Volta-634

JIRA GPUT19X-113
JIRA GPUT19X-99

Bug 1807553

Change-Id: Iacbf492028983529dadc5753007e43510b8cb786
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1483681
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-05-18 09:04:51 -07:00
Lakshmanan M
d503a23444 gpu: nvgpu: gv11b: Add LRF + CBU parity support
This CL covers the following parity support (uncorrected error),
1) SM's LRF
2) SM's CBU

Volta Resiliency Id - Volta-637

JIRA GPUT19X-85
JIRA GPUT19X-110

Bug 1775457

Change-Id: I3befb1fe22719d06aa819ef27654aaf97f911a9b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1481791
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-05-18 09:04:39 -07:00
Lakshmanan M
ffc37e50fa gpu: nvgpu: gv11b: Add L1 tags parity support
This CL covers the following parity support (corrected + uncorrected),
1) SM's L1 tags
2) SM's S2R's pixel PRF buffer
3) SM's L1 D-cache miss latency FIFOs

Volta Resiliency Id - Volta-720, Volta-721,  Volta-637

JIRA GPUT19X-85
JIRA GPUT19X-104
JIRA GPUT19X-100
JIRA GPUT19X-103

Bug 1825948
Bug 1825962
Bug 1775457

Change-Id: I53d7231a36b2c7c252395eca27b349eca80dec63
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1478881
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-05-18 09:04:28 -07:00
David Nieto
8c246cb18d gpu: nvgpu: gv11b: MMU parity HWW error intr
Adding support for ISR handling of ecc uncorrectable errors
for volta resiliency (Volta-686)

TODO: move interrupt init out of MC

bug 1881052
JIRA: GPUT19X-82

Change-Id: I45db01a6062445dd1f64a8297744cd15105e3344
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1476603
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-11 06:04:33 -07:00
Seema Khowala
d805731c8e gpu: nvgpu: gv11b: hw header update for CL38424879
Bug 200300756

Change-Id: I2991d306905d2681cfb3031301e1b45a215ff89b
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1466955
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-02 16:05:41 -07:00
Seema Khowala
1d4b22ed88 gpu: nvgpu: gv11b: set soc credits after fs_hub is out of reset
Without these credits, gpu mmu binds over nvlink to soc are hanging.
Also add l2_enabled for mc_elpg_enable.

Bug 1899460

Change-Id: I0b26410d5c8ec9b4c88b319ddd9442f2fd91b321
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1463204
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-18 10:24:54 -07:00
Seema Khowala
457f176785 gpu: nvgpu: gv11b: init handle sched_error & ctxsw_timout ops
- detect and decode sched_error type. Any sched error starting with xxx_* is
  not supported in h/w and should never be seen by s/w
- for bad_tsg sched error, preempt all runlists to recover as faulted ch/tsg
  is unknown. For other errors, just report error.
- ctxsw timeout is not part of sched error fifo interrupt. A new
  fifo interrupt, ctxsw timeout is added in gv11b. Add s/w handling.

Bug 1856152
JIRA GPUT19X-74

Change-Id: I474e1a3cda29a450691fe2ea1dc1e239ce57df1a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1317615
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-12 15:33:50 -07:00
Seema Khowala
2766420dfb gpu: nvgpu: gv11b: implement teardown_ch_tsg fifo ops
Context TSG teardown procedure:

1. Disable scheduling for the engine's runlist via NV_PFIFO_SCHED_DISABLE.
   This enables SW to determine whether a context has hung later in the
   process: otherwise, ongoing work on the runlist may keep ENG_STATUS from
   reaching a steady state.

2. Disable all channels in the TSG being torn down or submit a new runlist
   that does not contain the TSG.  This is to prevent the TSG from being
   rescheduled once scheduling is reenabled in step 6.

3. Initiate a preempt of the engine by writing the bit associated with its
   runlist to NV_PFIFO_RUNLIST_PREEMPT.  This allows to begin the preempt
   process prior to doing the slow register reads needed to determine
   whether the context has hit any interrupts or is hung.  Do not poll
   NV_PFIFO_RUNLIST_PREEMPT for the preempt to complete.

4. Check for interrupts or hangs while waiting for the preempt to complete.
   During the pbdma/eng preempt finish polling, any stalling interrupts
   relating to runlist must be detected and handled in order for the
   preemption to complete.

5. If a reset is needed as determined by step 4:
  a. Halt the memory interface for the engine (as per the relevant engine
     procedure).
  b. Reset the engine via NV_PMC_ENABLE.
  c. Take the engine out of reset and reinit the engine (as per
     relevant engine procedure)
6. Re-enable scheduling for the engine's runlist via NV_PFIFO_SCHED_ENABLE.

JIRA GPUT19X-7

Change-Id: I1354dd12b4a4f0e4b4a8d9721581126c02288a85
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1327931
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-04 16:04:26 -07:00
Seema Khowala
1bce980d09 gpu: nvgpu: gv11b: init and implement reset_enable_hw
-implement gv11b specific reset_enable_hw fifo ops

-timeout period in fifo_fb_timeout_r() is set to init instead of max
This register specifies the number of microseconds Host
should wait for a response from FB before initiating a timeout interrupt.
For bringup, this value should be set to a lower value than usual, such as
~.5 milliseconds (500), to help find out bugs in the memory subsystem.

-timeout period in pbdma_timeout_r() is set to init instead of max
This register contains a value used for detecting timeouts.
The timeout value is in microsecond ticks.
The timeouts that use this value are:
GPfifo fetch timeouts to FB for acks, reqs, rdats.
PBDMA connection to LB.
GPfifo processor timeouts to FB for acks, reqs, rdats.
Method processor timeouts to FB for acks, reqs, rdats.
The init value is changed to 64K us based on bug 1816557.

JIRA GPUT19X-74
JIRA GPUT19X-47

Change-Id: I6f818e129c3ea67571d206c5e735607cbfcf6ec6
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1325352
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-28 09:39:13 -07:00
seshendra Gadagottu
434b1c588b gpu: nvgpu: gv11b: handle l2 related changes
Implemented gv11b specific l2 state init and Configured
ltc_ltcs_ltss_cbc_num_active_ltcs_r with following info:

- cbc_num_active_ltcs is read only for gv11b, so did not
  write any data to that field.
- enforced serilized access to l2 from sysmem and peermem.
- nvlink connected peer trafic sent through l2

JIRA GV11B-71

Change-Id: I63d9ee3f0a6da62e672a34e207f1f5214b6ed1b4
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1312831
GVS: Gerrit_Virtual_Submit
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-02 08:53:33 -08:00
Seema Khowala
2cc03def6a gpu: nvgpu: gv11b: update headers
generate headers for pri ring, pbdma intr and gmmu
with updated reg generator

JIRA  GV11B-47
JIRA  GV11B-7

Change-Id: Id198fb338c03acc52c523754cfd07db01ff9bffd
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1312756
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-01 13:46:43 -08:00
seshendra Gadagottu
207e2ac7d1 gpu: nvgpu: gv11b: reading max veid number
To get maximum number of subctx, sw should read
NV_PGRAPH_PRI_FE_CHIP_DEF_INFO_MAX_VEID_COUNT instead of
LITTER_NUM_SUBCTX.

JIRA GV11B-72

Change-Id: I4d675ba49d8a600da77e7b60da449d9e5ba48971
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1309591
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-27 10:03:23 -08:00
Seema Khowala
8497f45a2e nvgpu: gpu: gv11b: Remove syncpt protection support
In gv11b sync point support is moved to a shim outside of GPU,
and gv11b does not support sync points anymore. Remove use of
the sync point protection.

JIRA GV11B-47
JIRA GV11B-2

Change-Id: I70f3d2ce0cfe016453efe03f2bbf64c59baeb154
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1300964
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-13 17:54:28 -08:00
seshendra Gadagottu
f04a84b7ce gpu: nvgpu: gv11b: chip specific init_elcg_mode
Added thermal registers for gv11b. Implemented chip specific
init_elcg_mode. In thermal control register, engine power auto
control config is removed and added new field for engine holdoff
enable signal.

JIRA GV11B-58

Change-Id: I412d9a232800d25efbdb0a40f14949d3f085fb0e
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1300119
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-07 15:16:53 -08:00
seshendra Gadagottu
d00b2000b5 gpu: nvgpu: gv11b: update zcull and pm pointers
Update zcull and perfmon buffer pointers in context header.
For gv11b maximum 49 bits gpu va possible. But,
zcull and perfmon buffer pointers uses maximum 41 bit
va address (258 bytes aligned). To accommodate this, high pointer
registers needs to be updated in context header.

JIRA GV11B-48

Change-Id: Ibe62b6bfedd32c4f3721e4d19d96cce58ef0f366
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1291852
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
2017-01-27 13:54:40 -08:00
Alex Waterman
4b09997772 nvgpu: gpu: HW header update for Volta
Similar HW header update as has been done for all the other chips.
HW header files are located under:

  drivers/gpu/nvgpu/include/nvgpu/hw/gv11b/

And can be included like so:

  #include <nvgpu/hw/gv11b/hw_gr_gv11b.h>

Bug 1799159

Change-Id: If39bd71480a34f85bf25f4c36aec0f8f6de4dc9f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284433
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-24 15:15:16 -08:00