Commit Graph

476 Commits

Author SHA1 Message Date
Deepak Nibade
d91ea322e1 gpu: nvgpu: fix gpc/tpc index for SMPC broadcast conversion
In gv11b_gr_egpc_etpc_priv_addr_table(), we call
gv11b_gr_update_priv_addr_table_smpc() to convert SMPC broadcast address into
list of unicast addresses

But before calling gv11b_gr_update_priv_addr_table_smpc() we sometimes
incorrectly set gpc_num/tpc_num to zero and that leads to generating incorrect
list of unicast addresses

Remove this incorrect initialization of gpc_num/tpc_num

Also update gv11b_gr_egpc_etpc_priv_addr_table() to receive tpc_num along with
gpc_num

Bug 2099717
Jira NVGPU-580

Change-Id: Idd4e5f78dbe6ca1800efae93c66355d06417d1f2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1691373
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:30 -07:00
Deepak Nibade
78151bb6f9 gpu: nvgpu: use HAL for chiplet offset
We currently use hard coded values of NV_PERF_PMMGPC_CHIPLET_OFFSET and
NV_PMM_FBP_STRIDE which are incorrect for Volta

Add new GR HAL get_pmm_per_chiplet_offset() to get correct value per-chip
Set gr_gm20b_get_pmm_per_chiplet_offset() for older chips
Set gr_gv11b_get_pmm_per_chiplet_offset() for Volta

Use HAL instead of hard coded values wherever required

Bug 200398811
Jira NVGPU-556

Change-Id: I947e7febd4f84fae740a1bc74f99d72e1df523aa
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690028
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:11 -07:00
Deepak Nibade
19aa748be5 gpu: nvgpu: add support to get unicast addresses on volta
We have new broadcast registers on Volta, and we need to generate correct
unicast addresses for them so that we can write those registers to context image

Add new GR HAL create_priv_addr_table() to do this conversion
Set gr_gk20a_create_priv_addr_table() for older chips
Set gr_gv11b_create_priv_addr_table() for Volta

gr_gv11b_create_priv_addr_table() will use the broadcast flags and then generate
appriate list of unicast register for each broadcast register

Bug 200398811
Jira NVGPU-556

Change-Id: Id53a9e56106d200fe560ffc93394cc0e976f455f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690027
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:07 -07:00
Deepak Nibade
4314771142 gpu: nvgpu: add broadcast address decode support for volta
With Volta we have more number of broadcast registers than previous chips
and we don't decode them right now in gr_gk20a_decode_priv_addr()

Add a new GR HAL decode_priv_addr() and set gr_gk20a_decode_priv_addr() for all
previous chips
Add and use gr_gv11b_decode_priv_addr() for Volta

gr_gv11b_decode_priv_addr() will decode all the broadcast registers and set
the broadcast flags apporiately

Define below new broadcast types
PRI_BROADCAST_FLAGS_PMMGPC
PRI_BROADCAST_FLAGS_PMM_GPCS
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCA
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCB
PRI_BROADCAST_FLAGS_PMMFBP
PRI_BROADCAST_FLAGS_PMM_FBPS
PRI_BROADCAST_FLAGS_PMM_FBPGS_LTC
PRI_BROADCAST_FLAGS_PMM_FBPGS_ROP

Bug 200398811
Jira NVGPU-556

Change-Id: Ic673b357a75b6af3d24a4c16bb5b6bc15974d5b7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:03 -07:00
Alex Waterman
d8e2311291 gpu: nvgpu: Only use gr.create_gr_sysfs with CONFIG_SYSFS
Only populate the create_gr_sysfs() functions when the system actually
has SYSFS (i.e is compiling for the Linux kernel). This allows non-
Linux systems to compile.

JIRA NVGPU-525

Change-Id: I3bac34feff376d89c0b63259772c77f7b4a03adc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673824
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-05 11:23:56 -07:00
Deepak Nibade
89e0745fa0 gpu: nvgpu: handle misaligned_addr SM exception
We right now do not handle misaligned_addr SM exception explicitly and hence
we incorrectly initiate CILP on this exception

Handle this exception explicitly in this sequence -
- set error notifier first
- clear the interrupt
- return error from gr_gv11b_handle_warp_esr_error_misaligned_addr() so that
  RC recovery is triggered by gk20a_gr_isr()

Ensure that the error value is propagated back to gk20a_gr_isr() correctly

Use nvgpu_set_error_notifier_if_empty() to set error notifier since this will
prevent overwriting of error notifier value in case gk20a_gr_isr() also tries
to write to some error notifier value

Bug 200388475
Jira NVGPU-554

Change-Id: I84c4d202a8068e738567ccd344e05d9d5f6ad2f0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1686781
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-04 11:49:46 -07:00
Terje Bergstrom
e7cc24eb9b gpu: nvgpu: Correct sign qualifiers for LTC code
In constants we use in LTC code we miss the qualifier indicating
if the constant is signed or unsigned. Add qualifiers for LTC code
and the ZBC related constant used in LTC code.

Change-Id: Id80078722f8a4f50eb53370146437bebb72a3ffc
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683859
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-03 17:05:19 -07:00
Deepak Nibade
4b8432a663 gpu: nvgpu: fix address table for GPCS_TPC6 broadcast conversion
In gr_gk20a_create_priv_addr_table() and gv11b_gr_egpc_etpc_priv_addr_table(),
we create a table of unicast addresses from broadcast addresses
For GPC boardcast addresses like NV_PGRAPH_PRI_EGPCS_ETPC6_SM_*, we generate
the table assuming there are 7 TPCs in all the GPCs

But this is incorrect in some cases like GV100 where GPC0/1 have only 6 TPCs
And hence we end up generating registers which do not exist

Fix this by explicitly checking the number of TPCs and ensuring that address
generated is belongs to valid TPC

Bug 200400376
Jira NVGPU-564

Change-Id: I65d7d6cd7f0bf16171eb54ed71f1f3840ade3495
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1686806
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-03 08:23:08 -07:00
Seema Khowala
298880b961 gpu: nvgpu: gv11b: do not poll if stall intr is set
Do not continue polling if engine save has not started yet
and stall intr is set because if a stall intr is hit,
preemption will anyways not get completed. Just set the
reset_eng_bitmask of the engine for which ctx_status
was being polled, As part of teardown corresponding
engine will be reset.

Bug 2069807

Change-Id: I9a506e0bca1d891ed5cd5d4953e292a40356f8ff
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683694
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-30 14:54:24 -07:00
Sourab Gupta
0b2ea2924b gpu: nvgpu: add gops.fifo.setup_sw
bar1/userd setup is different for RM server. created common function
gk20a_init_fifo_setup_sw_common.

Jira VQRM-3058

Change-Id: I655b54e21ed5f15dcb8e7b01bd9cd129b35ae7a3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665691
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:38 -07:00
Richard Zhao
8d8ff9d34e gpu: nvgpu: add gops.fifo.set_error_notifier
RM Server overrides it for handling stall interrupts.

Jira VQRM-3058

Change-Id: I8b14f073e952d19c808cb693958626b8d8aee8ca
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679709
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:29 -07:00
Richard Zhao
d436ad67b6 gpu: nvgpu: add gops.fifo.channel_suspend/channel_resume
RM Server acts differently for channel suspend/resume.

Jira VQRM-3058

Change-Id: If41e3099164654db448d1157fd7f51dd00c5e201
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679707
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:20 -07:00
Richard Zhao
bcab5c1486 gpu: nvgpu: add gops.fifo.check_tsg_ctxsw_timeout/check_ch_ctxsw_timeout
RM Server acts differently for ctxsw timeout check. It won't check
GP_GET or accumulated timeouts, but notify guest and go to recovery.

Jira VQRM-3058

Change-Id: I428aea34dc517311eb7e73feb556145e916309fb
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:11 -07:00
Richard Zhao
c5f03db98a gpu: nvgpu: add gops.fifo.ch_abort_clean_up
Channel abort clean up is only needed by native and vgpu driver but not
RM server. RM server expects guest will clean up itself. RM server
should not set the callback.

Jira VQRM-3058

Change-Id: I11b49b6f2d51c871e31de16955d487dca82609cb
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679705
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:02 -07:00
Seema Khowala
aa7ee8dac0 gpu: nvgpu: enhance pbus error reporting
-Dump timeout save0 and save1 even if they could
 be unreliable when fecs_tgt in set in save0 . This
 is good to have for debug purposes.
-Add priv_ring hal for decode_error_code
-Decode fecs error code for supported error types

Bug 1998067

Change-Id: I60cb6902d099df4a7df45fa624e44d9e0d46360f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683014
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 13:53:59 -07:00
Seema Khowala
f81d83690f gpu: nvgpu: use gpc_tpc_count[gpc] for number of tpc in a gpc
Using tpc_count instead of gpc_tpc_count indexed by gpc, will result
in pbus error with decode error or client floorswept error codes.
tpc_count represents total number of tpc while gpc_tpc_count[gpc]
represents number of tpc in the indexed gpc.

Bug 1998067

Change-Id: I9adfb98a6c3e209cbb02a8cd5090f6b6adc1ec4b
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1682469
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Tested-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 13:53:50 -07:00
Thomas Fleury
8a64eea483 gpu: nvgpu: fix priv error register reads
Current code does not compute priv error register offsets
properly. This leads to invalid decoding of priv errors, and
can also trigger additional priv errors.

- add GPU_LIT_GPC_PRIV_STRIDE define
- return proj_gpc_priv_stride for GPU_LIT_GPC_PRIV_STRIDE in hals
- use GPU_LIT_GPC_PRIV_STRIDE instead of GPU_LIT_GPC_STRIDE in
  g->ops.priv_ring.isr() to compute priv error register offsets.

Bug 2093058

Change-Id: Ia7c36ccba0441126784bb0e00452f2cf1196ef71
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1682118
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-28 13:32:18 -07:00
Martin Radev
c392a7270f gpu: nvgpu: Reset streaming on perfbuf_enable and perfbuf_disable
Similarly to css_hw_(enable|disable)_snapshot the HWPM
state should be reset on perfbuf_enable and perfbuf_disable
to avoid leaking snapshot data into a freshly mapped buffer.

Bug 1960846

Change-Id: I94826b209ef4b8cb6ad44d3b8667745270c6a7e1
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676009
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-26 09:13:05 -07:00
Deepak Nibade
77b806fe7e gpu: nvgpu: gv100: fix PMA list alignment in ctxsw buffer
GV100 ucode is changed so that it expects LIST_nv_perf_pma_ctx_reg list in
ctxsw buffer to be 256 byte aligned but same change is not applied to other
chip ucodes

ADD new HAL (*add_ctxsw_reg_perf_pma) to configure PMA register list and
define a common HAL gr_gk20a_add_ctxsw_reg_perf_pma() for all other
chips except GV100

Define a separate HAL for GV100 gr_gv100_add_ctxsw_reg_perf_pma() and fix
the required alignment in this function

Bug 1998067

Change-Id: Ie172fe90e2cdbac2509f2ece953cd8552e66fc56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676655
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:38 -07:00
Deepak Nibade
66751bc05d gpu: nvgpu: gv100: fix num_fbpas while adding ctxsw buffer entries
For LIST_nv_pm_fbpa_ctx_regs, we right now call
add_ctxsw_buffer_map_entries_subunits() to add registers corresponding
to all the FBPAs

But while configuring total number of registers, we do not consider
floorswept FBPAs and that causes misalignment in subsequent lists for GV100

Fix this by reading disabled/floorswept FBPAs from fuse and consider only those
FBPAs which are active for GV100

Add new HAL (*add_ctxsw_reg_pm_fbpa) to support this setting and define a
common HAL gr_gk20a_add_ctxsw_reg_pm_fbpa() for all chips except GV100

Define GV100 specific gr_gv100_add_ctxsw_reg_pm_fbpa() with above mentioned
implementation to consider floorsweeping

Bug 1998067

Change-Id: Id560551bb0b8142791c117b6d27864566c90b489
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676654
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:35 -07:00
Alex Waterman
22a95f15e0 gpu: nvgpu: Don't ioremap() regs when using POSIX
When __NVGPU_POSIX__ is defined do no use ioremap(). This operation
probably doesn't make much sense. Currently we have no plans to run
the driver in userspace against a real GPU, hence programming the
nvlink credits registers is simply not necessary.

Also fix an unused variable by returing it as an error.

JIRA NVGPU-525

Change-Id: Ic94d332551f6e25c1836331bf92188e7651546cb
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673815
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-16 07:34:56 -07:00
Shashank Singh
23a855b852 gpu: nvgpu: add fault_ch to record_sm_error_state
fault_ch is needed by rm-server to send the notification to guest VM.
rm-server is going to use gr sources from linux

Jira VQRM-2982

Change-Id: Ifb6e8a9630a471d07b89ffaa7f2ceb309220fd21
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1661665
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-13 14:09:33 -07:00
seshendra Gadagottu
a5364c30b1 gpu: nvgpu: gv11b: pmu: add dma coherent support
Setup pmu apertures based on dma coherent property.

Bug 200394053

Change-Id: I45beff671e4b8741f2b1ffbc811618b074772ea0
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1641609
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-13 14:09:16 -07:00
seshendra Gadagottu
3df619f68a gpu: nvgpu: hal for syncpt_incr_per_release
Create hal to indicate syncpt increments per release.
Legacy chip uses 2 syncpt increments per release and gv1xx
onwards uses 1 syncpt increment per release.

Bug 2066025

Change-Id: I5d6d0a5368ef561f8150fbb7120181f49f6e338b
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1669817
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-12 10:40:17 -07:00
seshendra Gadagottu
7a5a2fb75a gpu: nvgpu: gv11b: set 4byte payload size for sema
Default semaphore payload size is 16byte. Set it to 4 byte
to avoid double increment of associated sync point with
semaphore release.

Also removed extra 0 op function from syncpoint increment
command.

Bug 2066025

Change-Id: Ia282cc5625827d356b5ba963adb7b1b3c703a931
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1669714
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-12 10:40:08 -07:00
Martin Radev
a83c99ecb4 gpu: nvgpu: Use gv11b_css_hw_set_handled_snapshots for GV11B
The value of NV_PERF_PMASYS_MEM_BUMP is different for Volta
and NVGPU_IOCTL_CHANNEL_CYCLE_STATS_SNAPSHOT_CMD_FLUSH did not
have correct behavior on GV11B due to that.
The patch adds an instance of css_hw_set_handled_snapshots
for Volta to fix that.

Bug 1960846
Bug 2068936

Change-Id: Ic057338d3b1b951a66d070267e69a90f136598b9
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1668568
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-08 11:04:31 -08:00
Alex Waterman
418f31cd91 gpu: nvgpu: Enable IO coherency on GV100
This reverts commit 848af2ce6d.

This is a revert of a revert, etc, etc. It re-enables IO coherence again.

JIRA EVLR-2333

Change-Id: Ibf97dce2f892e48a1200a06cd38a1c5d9603be04
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1669722
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-07 18:04:41 -08:00
Aparna Das
ca95adb2d4 gpu: nvgpu: add hal op to handle semaphore pending
The vserver variant for gr handle semaphore pending needs different
functionality to send interrupt to VM. Add HAL operation to allow
overriding vserver usecase.

Jira VQRM-2982

Change-Id: I5fee5a491c6e54344f9da477eaf5881c50335bbc
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658298
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:52 -08:00
Richard Zhao
c6b846d34c gpu: nvgpu: add gops.semaphore_wakeup HAL
vserver handles semaphore differently from native, so it needs a
callback to differentiate from native. Also created common function
mc_gk20a_handle_intr_nonstall to handle all nonstall interrupts.

Jira VQRM-2982

Change-Id: I1b3821717a4005ca4bf2a4dac5dcd335872f48f1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656753
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:43 -08:00
Aparna Das
f6cac2e0c4 gpu: nvgpu: add debugger.post_events HAL op
RM Server will need to set specific HAL op and notify vgpu client.

Jira VQRM-2982

Change-Id: I679565831635ff3fadf0bdc1af5fd7a8679b6fdd
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1660226
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:39 -08:00
Richard Zhao
d6b5d74c5e gpu: nvgpu: make gr functions that are used by vsrv global
Fixed vsrv link errors for gr unification.

Jira VQRM-2982

Change-Id: Icd46792191f1a9aaefbf86d2f3c0b4d5bce2384e
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:30 -08:00
Aparna Das
98d91dd260 gpu: nvgpu: add hal op to handle post event id
The vserver variant for gr post event id needs different
functionality to send interrupt to VM. Add HAL operation
to allow overriding vserver usecase.

Jira VQRM-2982

Change-Id: I915d089ef751023968c1e8ab181c21afeec997a5
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658382
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:21 -08:00
Aparna Das
d654ab4863 gpu: nvgpu: add hal op to handle notify pending
The vserver variant for gr handle notify pending needs different
functionality to send interrupt to VM. Add HAL operation to allow
overriding vserver usecase.

Jira VQRM-2982

Change-Id: I4cb88d4d769a5d5cb98a4ee6ac3fbb74245cb5f2
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658255
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:12 -08:00
Aparna Das
d6c6c6c483 gpu: nvgpu: add hal op for gr set error notifier
The vserver variant for gr set error notifier needs different
functionality to send interrupt to VM. Add HAL operation to
allow overriding vserver usecase.

Jira VQRM-2982

Change-Id: Ia445a27112bb6c5587dbb81100a9dafe5875b338
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1657830
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:03 -08:00
Kirill Artamonov
70bf8275ef gpu: nvgpu: gv11b: implement gfxp wfi controls
/sys/devices/gpu.0/gfxp_wfi_timeout_unit
usec - microseconds
sysclk - gpu clock count

Treat gr_fe_gfxp_wfi_timeout_r as context-switched
register on gv11b.

Set default gfxp_wfi_timeout to 100 usec to match
gp10b at 1GHz.

bug 1888344

Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Change-Id: I7fa64ce6912ae861244856807543b17bd7a26bed
Reviewed-on: https://git-master.nvidia.com/r/1651517
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:51:50 -08:00
Deepak Goyal
963fac8068 gpu: nvgpu: initialize max_subctx_count before use
PMU instance block layout is not getting
populated with subctx pdb info as max_subctx_count is
0x0 and is getting initialized after PMU instance block
initialization gets completed.

Therefore initializing max_subctx_count before using it.
For Volta, FECS can bind itself with the PMU instance
only if SC PDB info is populated.

Bug 2051863
Bug 200392620

Change-Id: Id4fc26502e189c15cb57cb36cc09387dad773dc5
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1666585
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-05 21:18:29 -08:00
Deepak Goyal
26b9194603 gpu: nvgpu: gv11b: Correct PMU PG enabled masks.
PMU ucode records supported feature list for a
particular chip as support mask sent
via PMU_PG_PARAM_CMD_GR_INIT_PARAM.

It then enables selective feature list through
enable mask sent via
PMU_PG_PARAM_CMD_SUB_FEATURE_MASK_UPDATE cmd.

Right now only ELPG state machine mask was enabled.
Only ELPG state machine was getting executed
but other crucial steps in ELPG entry/exit sequence
were getting skipped.

Bug 200392620.
Bug 200296076.

Change-Id: I5e1800980990c146c731537290cb7d4c07e937c3
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665767
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-05 21:18:20 -08:00
Timo Alho
848af2ce6d Revert "Revert "Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working"""
This reverts commit 89fbf39a05.

Bug 2075315

Change-Id: Id34a0376be5160b164931926ec600f77edf69667
Signed-off-by: Timo Alho <talho@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1668487
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
2018-03-05 08:39:57 -08:00
Alex Waterman
89fbf39a05 Revert "Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working""
This reverts commit 5a35a95654.

JIRA EVLR-2333

Change-Id: I923c32496c343d39d34f6d406c38a9f6ce7dc6e0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1667167
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-02 22:10:14 -08:00
Deepak Nibade
37a15ce818 gpu: nvgpu: skip channel abort for deferred reset
In case deferred_reset_pending is set in gk20a_fifo_handle_mmu_fault() and in
gv11b_fifo_teardown_ch_tsg(), we skip resetting the engines and skip setting
the error notifier
Then we call gk20a_channel_abort()/gk20a_fifo_abort_tsg() which aborts the
channels, and resets the syncpoint values to release all the waiters

But since we don't set error notifier this could lead User to assume a
successful submission without any error

To fix this disable channel/TSG in case deferred_reset_pending is set and skip
calls to gk20a_channel_abort()/gk20a_fifo_abort_tsg()

Note that we finally abort the channel when channel is being closed

Bug 200363077

Change-Id: Ia48ca369701c14d1913d8f7b66ed466b7b840224
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664319
(cherry picked from commit ac40d082e8)
Reviewed-on: https://git-master.nvidia.com/r/1666445
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-01 13:53:31 -08:00
Alex Waterman
5a35a95654 Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working"
Also revert other changes related to IO coherence. This may be the
culprit in a recent dev-kernel lockdown.

Bug 2070609

Change-Id: Ida178aef161fadbc6db9512521ea51c702c1564b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665914
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Srikar Srimath Tirumala <srikars@nvidia.com>
2018-02-28 13:49:22 -08:00
Alex Waterman
1170687c33 gpu: nvgpu: Use coherent aperture flag
When using a coherent DMA API wee must make sure to program
any aperture fields with the coherent aperture setting. To
do this the nvgpu_aperture_mask() function was modified to
take a third aperture mask argument, a coherent setting, so
that code can use this function to generate coherent aperture
settings.

The aperture choice is some what tricky: the default version
of this function uses the state of the DMA API to determine
what aperture to use for SYSMEM: either coherent or
non-coherent internally. Thus a kernel user need only specify
the normal nvgpu_mem struct and the correct mask should be
chosen. Due to many uses of nvgpu_mem structs not created
directly from the DMA API wrapper it's easier to translate
SYSMEM to SYSMEM_COH after creation.

However, the GMMU mapping code, will encounter buffers from
userspace with difference coerency attributes than the DMA
API. Thus the __nvgpu_aperture_mask() really respects the
aperture setting passed in regardless of the DMA API state.
This aperture setting is pulled from NVGPU_VM_MAP_IO_COHERENT
since this is either passed in from userspace or set by the
kernel when using coherent DMA. The aperture field in attrs
is upgraded to coh if this flag is set.

This change also adds a coherent sysmem mask everywhere that
it can. There's a couple places that do not have a coherent
register field defined yet. These need to eventually be
defined and added.

Lastly the aperture mask code has been mvoed from the Linux
vm.c code to the general vm.c code since this function has
no Linux dependencies.

Note: depends on https://git-master.nvidia.com/r/1664536 for
new register fields.

JIRA EVLR-2333

Change-Id: I4b347911ecb7c511738563fe6c34d0e6aa380d71
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1655220
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-27 16:03:43 -08:00
Srikar Srimath Tirumala
a26de1185a Revert "gpu: nvgpu: Use gv11b_css_hw_set_handled_snapshots for GV11B"
This reverts commit 2f2e51bbae.

Bug 2068936

Change-Id: I539cdc12a3bd0d9d7fe0ce7dbe9cb7a274eeaa57
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664647
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
2018-02-26 19:01:16 -08:00
Martin Radev
2f2e51bbae gpu: nvgpu: Use gv11b_css_hw_set_handled_snapshots for GV11B
The value of NV_PERF_PMASYS_MEM_BUMP is different for Volta
and NVGPU_IOCTL_CHANNEL_CYCLE_STATS_SNAPSHOT_CMD_FLUSH did not
have correct behavior on GV11B due to that.
The patch adds an instance of css_hw_set_handled_snapshots
for Volta to fix that.
The patch also renames css_hw_set_handled_snapshots
to gk20a_css_hw_set_handled_snapshots to make it more clear
that the function is arch dependent.

Bug 1960846

Change-Id: I92c35a862ecd7f918dd1458c086fc7ae42ca8fc5
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1662427
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 12:03:24 -08:00
Deepak Nibade
8d5536271f gpu: nvgpu: add user API to get a syncpoint
Add new user API NVGPU_IOCTL_CHANNEL_GET_USER_SYNCPOINT which will expose
per-channel allocated syncpoint to user space
API will also return current value of the syncpoint
On supported platforms, this API will also return a RW semaphore address
(corresponding to syncpoint shim) to user space

Add new characteristics flag NVGPU_GPU_FLAGS_SUPPORT_USER_SYNCPOINT to indicate
support for this new API
Add new flag NVGPU_SUPPORT_USER_SYNCPOINT for use of core driver

Set this flag for GV11B and GP10B for now

Add a new API (*syncpt_address) in struct gk20a_channel_sync to get GPU_VA
address of a syncpoint

Add new API nvgpu_nvhost_syncpt_read_maxval() which will read and return MAX
value of syncpoint

Bug 200326065
Jira NVGPU-179

Change-Id: I9da6f17b85996f4fc6731c0bf94fca6f3181c3e0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658009
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 03:48:11 -08:00
Thomas Fleury
0601fd25a5 gpu: nvgpu: gv100: nvlink endpoint driver
The following changes implements the initial (as per bringup) nvlink driver.

(1) SW initialization of nvlink core driver structures
(2) Nvlink interrupt handling
(3) Device initialization (IOCTRL, pll and clocks, device level intr)
(4) Falcon support for minion
(5) Minion load and bootstrapping
(6) Link initialization and DL PROD settings
(7) Device Interface init (and switching HSHUB to nvlink)
(8) HS set/get mode for both link and sublink
(9) Topology discovery and VBIOS settings.
(10) Ensures we get physical contiguous memory when Nvlink is enabled

This driver includes a hack for the current single dev/single link limitation.

JIRA: EVLR-2331
JIRA: EVLR-2330
JIRA: EVLR-2329
JIRA: EVLR-2328

Change-Id: Idca9a819179376cc655784482b24b575a52fa9e5
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-25 21:48:24 -08:00
Alex Waterman
b5d3cf444e gpu: nvgpu: Cleanup unused variables
There are numerous places where variables are assigned to but then
never used. This patch cleans up all these unused variables and
in some cases simplifies surrounding logic.

Also delete unused header includes and add necessary header includes.

JIRA NVGPU-525

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: Ice9ec2a0e97f262d0dcfebe22f83208dbea569d9
Reviewed-on: https://git-master.nvidia.com/r/1662548
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-23 21:53:38 -08:00
seshendra Gadagottu
7a9c2f68f3 gpu: nvgpu: gv11b: check for valid gr map_tiles
Number of gr map_tiles is equal to number of tpcs.
During gr_gv11b_setup_rop_mapping programming, check for validity
of gr map_tiles before programming gr_crstr_gpc_map_tile map.

Bug 200389570
Bug 2051856
JIRA NVGPU-523

Change-Id: Iaeb13c6a433d76ad895f89909e3033f887665619
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1657727
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-22 21:48:30 -08:00
Alex Waterman
b6ab47d396 gpu: nvgpu: Delete more Linux includes
Delete yet more Linux includes in gv11b code.

Change-Id: I96f2c053cb903ac7cc51ad8cdb12f4a75ac95ae1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1657742
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-15 13:23:20 -08:00
Alex Waterman
20b2652b41 gpu: nvgpu: Delete unused Linux headers in gv11b_hal.c
Not sure how these even made it in in the first place...

Change-Id: Ibca4d525926d4dacc7f8b41609dd147f14dd1a0d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1657733
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-15 13:23:16 -08:00