Commit Graph

304 Commits

Author SHA1 Message Date
Deepak Nibade
78151bb6f9 gpu: nvgpu: use HAL for chiplet offset
We currently use hard coded values of NV_PERF_PMMGPC_CHIPLET_OFFSET and
NV_PMM_FBP_STRIDE which are incorrect for Volta

Add new GR HAL get_pmm_per_chiplet_offset() to get correct value per-chip
Set gr_gm20b_get_pmm_per_chiplet_offset() for older chips
Set gr_gv11b_get_pmm_per_chiplet_offset() for Volta

Use HAL instead of hard coded values wherever required

Bug 200398811
Jira NVGPU-556

Change-Id: I947e7febd4f84fae740a1bc74f99d72e1df523aa
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690028
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:11 -07:00
Deepak Nibade
19aa748be5 gpu: nvgpu: add support to get unicast addresses on volta
We have new broadcast registers on Volta, and we need to generate correct
unicast addresses for them so that we can write those registers to context image

Add new GR HAL create_priv_addr_table() to do this conversion
Set gr_gk20a_create_priv_addr_table() for older chips
Set gr_gv11b_create_priv_addr_table() for Volta

gr_gv11b_create_priv_addr_table() will use the broadcast flags and then generate
appriate list of unicast register for each broadcast register

Bug 200398811
Jira NVGPU-556

Change-Id: Id53a9e56106d200fe560ffc93394cc0e976f455f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690027
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:07 -07:00
Deepak Nibade
4314771142 gpu: nvgpu: add broadcast address decode support for volta
With Volta we have more number of broadcast registers than previous chips
and we don't decode them right now in gr_gk20a_decode_priv_addr()

Add a new GR HAL decode_priv_addr() and set gr_gk20a_decode_priv_addr() for all
previous chips
Add and use gr_gv11b_decode_priv_addr() for Volta

gr_gv11b_decode_priv_addr() will decode all the broadcast registers and set
the broadcast flags apporiately

Define below new broadcast types
PRI_BROADCAST_FLAGS_PMMGPC
PRI_BROADCAST_FLAGS_PMM_GPCS
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCA
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCB
PRI_BROADCAST_FLAGS_PMMFBP
PRI_BROADCAST_FLAGS_PMM_FBPS
PRI_BROADCAST_FLAGS_PMM_FBPGS_LTC
PRI_BROADCAST_FLAGS_PMM_FBPGS_ROP

Bug 200398811
Jira NVGPU-556

Change-Id: Ic673b357a75b6af3d24a4c16bb5b6bc15974d5b7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 11:23:03 -07:00
Shashank Singh
e1200259ba gpu: nvgpu: fix misc issue for dgpu code on QNX build
- QNX is pulling dgpu code from linux which has
  multiple build failure on QNX. Like QNX needs
  explicit declaration for all non-static functions.
  Some linux specific headers need to be put under
  __KERNEL__ flag.

Change-Id: I15af1a1f6a069c82f9a81449f4f7c7d48612de42
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665752
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-10 09:43:51 -07:00
Alex Waterman
ceb0ecb766 gpu: nvgpu: gp106: Reduce usage of os_linux.h
In clk_gp106 the Linux platform specific data is only referenced if
debugfs is being used. Thus only include the Linux OS stuff if debugfs
is enabled. This allows us to compile the rest of the clk code in non-
Linux kernel environments.

Also delete os_linux.h from xve_gp106.c since that header is unused
in the XVE code (and also do a minor cleanup by deleting the
pr_warn()).

JIRA NVGPU-525

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: I5a20ba3b02eb2d8741c73ef2ded9276f6aebb957
Reviewed-on: https://git-master.nvidia.com/r/1687090
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-05 11:24:13 -07:00
Alex Waterman
43861331c5 gpu: nvgpu: Cast negative int to u32 before shift
A shift of a negative number is undefined; so to work around
said warning simply cast to a u32 first. In this case the
resulting operation should be ok since the sign bits are
maintained when the 32 bit negative integer is shifted into
a 24 bit negative integer.

JIRA NVGPU-525

Change-Id: I0a35b0ccbccbcf4ac1b0767acad75c082143429e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673826
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-05 11:24:09 -07:00
Alex Waterman
a2ceb5ec31 gpu: nvgpu: Abstract PCI header usage
The mclk code for gp106 requires access to the PCI device
ID headers in Linux. This patch makes it so that the mclk
code does not need to directly include the Linux headers.

This allows us to compile and link the dGPU chips code in
the user space unit testing framework.

JIRA NVGPU-525

Change-Id: I89e2fa7fbb3b67f061e026e48a374951e7934aa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673823
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-05 11:23:47 -07:00
Alex Waterman
29df4f3da6 gpu: nvgpu: gp106 and missing types.h header fixes
Multiple places were missing explicit <nvgpu/types.h> includes but
used various types anyway. Fix that by including <nvgpu/types.h>
where necessary.

A gp106 file directly used the Linux delay header instead of
including <nvgpu/timers.h>.

This patch fixes both problems.

JIRA NVGPU-525

Change-Id: Ib7a30a8ed9098d469d646c0a2bba293087b8de90
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673821
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-30 14:54:12 -07:00
Alex Waterman
12cd49a733 gpu: nvgpu: Cleanup more set but unused variables
This time they were largely located in the various common directories.

JIRA NVGPU-525

Change-Id: I3a6d523b060a0c6761b227267890298c6d2fb19f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673820
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-30 14:54:08 -07:00
Sourab Gupta
0b2ea2924b gpu: nvgpu: add gops.fifo.setup_sw
bar1/userd setup is different for RM server. created common function
gk20a_init_fifo_setup_sw_common.

Jira VQRM-3058

Change-Id: I655b54e21ed5f15dcb8e7b01bd9cd129b35ae7a3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665691
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:38 -07:00
Richard Zhao
8d8ff9d34e gpu: nvgpu: add gops.fifo.set_error_notifier
RM Server overrides it for handling stall interrupts.

Jira VQRM-3058

Change-Id: I8b14f073e952d19c808cb693958626b8d8aee8ca
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679709
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:29 -07:00
Richard Zhao
d436ad67b6 gpu: nvgpu: add gops.fifo.channel_suspend/channel_resume
RM Server acts differently for channel suspend/resume.

Jira VQRM-3058

Change-Id: If41e3099164654db448d1157fd7f51dd00c5e201
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679707
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:20 -07:00
Richard Zhao
bcab5c1486 gpu: nvgpu: add gops.fifo.check_tsg_ctxsw_timeout/check_ch_ctxsw_timeout
RM Server acts differently for ctxsw timeout check. It won't check
GP_GET or accumulated timeouts, but notify guest and go to recovery.

Jira VQRM-3058

Change-Id: I428aea34dc517311eb7e73feb556145e916309fb
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:11 -07:00
Richard Zhao
c5f03db98a gpu: nvgpu: add gops.fifo.ch_abort_clean_up
Channel abort clean up is only needed by native and vgpu driver but not
RM server. RM server expects guest will clean up itself. RM server
should not set the callback.

Jira VQRM-3058

Change-Id: I11b49b6f2d51c871e31de16955d487dca82609cb
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1679705
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 18:54:02 -07:00
Seema Khowala
aa7ee8dac0 gpu: nvgpu: enhance pbus error reporting
-Dump timeout save0 and save1 even if they could
 be unreliable when fecs_tgt in set in save0 . This
 is good to have for debug purposes.
-Add priv_ring hal for decode_error_code
-Decode fecs error code for supported error types

Bug 1998067

Change-Id: I60cb6902d099df4a7df45fa624e44d9e0d46360f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683014
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-29 13:53:59 -07:00
Thomas Fleury
8a64eea483 gpu: nvgpu: fix priv error register reads
Current code does not compute priv error register offsets
properly. This leads to invalid decoding of priv errors, and
can also trigger additional priv errors.

- add GPU_LIT_GPC_PRIV_STRIDE define
- return proj_gpc_priv_stride for GPU_LIT_GPC_PRIV_STRIDE in hals
- use GPU_LIT_GPC_PRIV_STRIDE instead of GPU_LIT_GPC_STRIDE in
  g->ops.priv_ring.isr() to compute priv error register offsets.

Bug 2093058

Change-Id: Ia7c36ccba0441126784bb0e00452f2cf1196ef71
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1682118
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-28 13:32:18 -07:00
Deepak Nibade
77b806fe7e gpu: nvgpu: gv100: fix PMA list alignment in ctxsw buffer
GV100 ucode is changed so that it expects LIST_nv_perf_pma_ctx_reg list in
ctxsw buffer to be 256 byte aligned but same change is not applied to other
chip ucodes

ADD new HAL (*add_ctxsw_reg_perf_pma) to configure PMA register list and
define a common HAL gr_gk20a_add_ctxsw_reg_perf_pma() for all other
chips except GV100

Define a separate HAL for GV100 gr_gv100_add_ctxsw_reg_perf_pma() and fix
the required alignment in this function

Bug 1998067

Change-Id: Ie172fe90e2cdbac2509f2ece953cd8552e66fc56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676655
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:38 -07:00
Deepak Nibade
66751bc05d gpu: nvgpu: gv100: fix num_fbpas while adding ctxsw buffer entries
For LIST_nv_pm_fbpa_ctx_regs, we right now call
add_ctxsw_buffer_map_entries_subunits() to add registers corresponding
to all the FBPAs

But while configuring total number of registers, we do not consider
floorswept FBPAs and that causes misalignment in subsequent lists for GV100

Fix this by reading disabled/floorswept FBPAs from fuse and consider only those
FBPAs which are active for GV100

Add new HAL (*add_ctxsw_reg_pm_fbpa) to support this setting and define a
common HAL gr_gk20a_add_ctxsw_reg_pm_fbpa() for all chips except GV100

Define GV100 specific gr_gv100_add_ctxsw_reg_pm_fbpa() with above mentioned
implementation to consider floorsweeping

Bug 1998067

Change-Id: Id560551bb0b8142791c117b6d27864566c90b489
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676654
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:35 -07:00
Thomas Fleury
6e0de56121 gpu: nvgpu: gp106: fix freq scale in debugfs nodes
For better precision dramdiv4 (MCLK/4) counter is used to measure
MCLK frequency. But the scaling factor of 2 must be taken into
account when reporting dramdiv2_rec_clk1.
The issue was not affecting other counters which use scale=1.

Bug 200386061

Change-Id: Ib3891f3f2dd4206ac36aa3e3290810144f4aa339
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1654536
(cherry picked from commit 6a68207c90feab1caee737013ab7cd5bb3863fb6)
Reviewed-on: https://git-master.nvidia.com/r/1657209
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-12 12:09:37 -07:00
seshendra Gadagottu
3df619f68a gpu: nvgpu: hal for syncpt_incr_per_release
Create hal to indicate syncpt increments per release.
Legacy chip uses 2 syncpt increments per release and gv1xx
onwards uses 1 syncpt increment per release.

Bug 2066025

Change-Id: I5d6d0a5368ef561f8150fbb7120181f49f6e338b
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1669817
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-12 10:40:17 -07:00
Mahantesh Kumbar
cc4b9f540f gpu: nvgpu: PMU super surface support
- Added ops "pmu.alloc_super_surface" to create
  memory space for pmu super surface
- Defined method nvgpu_pmu_sysmem_surface_alloc()
  to allocate pmu super surface memory & assigned
  to "pmu.alloc_super_surface" for gv100
- "pmu.alloc_super_surface" set to NULL for gp106
- Memory space of size "struct nv_pmu_super_surface"
  is allocated during pmu sw init setup if
  "pmu.alloc_super_surface" is not NULL &
  free if error occur.
- Added ops "pmu_ver.config_pmu_cmdline_args_super_surface"
  to describe PMU super surface details to PMU ucode
  as part of pmu command line args command if
  "pmu.alloc_super_surface" is not NULL.
- Updated pmu_cmdline_args_v6 to include member
  "struct flcn_mem_desc_v0 super_surface"
- Free allocated memory for PMU super surface in
  nvgpu_remove_pmu_support() method
- Added "struct nvgpu_mem super_surface_buf" to "nvgpu_pmu" struct
- Created header file "gpmu_super_surf_if.h" to include interface
  about pmu super surface, added "struct nv_pmu_super_surface"
  to hold super surface members along with rsvd[x] dummy space
  to sync members offset with PMU super surface members.

Change-Id: I2b28912bf4d86a8cc72884e3b023f21c73fb3503
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656571
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-07 23:27:49 -08:00
Alex Waterman
418f31cd91 gpu: nvgpu: Enable IO coherency on GV100
This reverts commit 848af2ce6d.

This is a revert of a revert, etc, etc. It re-enables IO coherence again.

JIRA EVLR-2333

Change-Id: Ibf97dce2f892e48a1200a06cd38a1c5d9603be04
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1669722
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-07 18:04:41 -08:00
Aparna Das
ca95adb2d4 gpu: nvgpu: add hal op to handle semaphore pending
The vserver variant for gr handle semaphore pending needs different
functionality to send interrupt to VM. Add HAL operation to allow
overriding vserver usecase.

Jira VQRM-2982

Change-Id: I5fee5a491c6e54344f9da477eaf5881c50335bbc
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658298
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:52 -08:00
Richard Zhao
c6b846d34c gpu: nvgpu: add gops.semaphore_wakeup HAL
vserver handles semaphore differently from native, so it needs a
callback to differentiate from native. Also created common function
mc_gk20a_handle_intr_nonstall to handle all nonstall interrupts.

Jira VQRM-2982

Change-Id: I1b3821717a4005ca4bf2a4dac5dcd335872f48f1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656753
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:43 -08:00
Aparna Das
f6cac2e0c4 gpu: nvgpu: add debugger.post_events HAL op
RM Server will need to set specific HAL op and notify vgpu client.

Jira VQRM-2982

Change-Id: I679565831635ff3fadf0bdc1af5fd7a8679b6fdd
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1660226
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:39 -08:00
Aparna Das
98d91dd260 gpu: nvgpu: add hal op to handle post event id
The vserver variant for gr post event id needs different
functionality to send interrupt to VM. Add HAL operation
to allow overriding vserver usecase.

Jira VQRM-2982

Change-Id: I915d089ef751023968c1e8ab181c21afeec997a5
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658382
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:21 -08:00
Aparna Das
d654ab4863 gpu: nvgpu: add hal op to handle notify pending
The vserver variant for gr handle notify pending needs different
functionality to send interrupt to VM. Add HAL operation to allow
overriding vserver usecase.

Jira VQRM-2982

Change-Id: I4cb88d4d769a5d5cb98a4ee6ac3fbb74245cb5f2
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658255
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:12 -08:00
Aparna Das
d6c6c6c483 gpu: nvgpu: add hal op for gr set error notifier
The vserver variant for gr set error notifier needs different
functionality to send interrupt to VM. Add HAL operation to
allow overriding vserver usecase.

Jira VQRM-2982

Change-Id: Ia445a27112bb6c5587dbb81100a9dafe5875b338
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1657830
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-06 14:52:03 -08:00
Deepak Goyal
26b9194603 gpu: nvgpu: gv11b: Correct PMU PG enabled masks.
PMU ucode records supported feature list for a
particular chip as support mask sent
via PMU_PG_PARAM_CMD_GR_INIT_PARAM.

It then enables selective feature list through
enable mask sent via
PMU_PG_PARAM_CMD_SUB_FEATURE_MASK_UPDATE cmd.

Right now only ELPG state machine mask was enabled.
Only ELPG state machine was getting executed
but other crucial steps in ELPG entry/exit sequence
were getting skipped.

Bug 200392620.
Bug 200296076.

Change-Id: I5e1800980990c146c731537290cb7d4c07e937c3
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665767
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-05 21:18:20 -08:00
Timo Alho
848af2ce6d Revert "Revert "Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working"""
This reverts commit 89fbf39a05.

Bug 2075315

Change-Id: Id34a0376be5160b164931926ec600f77edf69667
Signed-off-by: Timo Alho <talho@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1668487
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
2018-03-05 08:39:57 -08:00
Alex Waterman
89fbf39a05 Revert "Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working""
This reverts commit 5a35a95654.

JIRA EVLR-2333

Change-Id: I923c32496c343d39d34f6d406c38a9f6ce7dc6e0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1667167
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-02 22:10:14 -08:00
Alex Waterman
5a35a95654 Revert "gpu: nvgpu: Get coherency on gv100 + NVLINK working"
Also revert other changes related to IO coherence. This may be the
culprit in a recent dev-kernel lockdown.

Bug 2070609

Change-Id: Ida178aef161fadbc6db9512521ea51c702c1564b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665914
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Srikar Srimath Tirumala <srikars@nvidia.com>
2018-02-28 13:49:22 -08:00
Alex Waterman
1170687c33 gpu: nvgpu: Use coherent aperture flag
When using a coherent DMA API wee must make sure to program
any aperture fields with the coherent aperture setting. To
do this the nvgpu_aperture_mask() function was modified to
take a third aperture mask argument, a coherent setting, so
that code can use this function to generate coherent aperture
settings.

The aperture choice is some what tricky: the default version
of this function uses the state of the DMA API to determine
what aperture to use for SYSMEM: either coherent or
non-coherent internally. Thus a kernel user need only specify
the normal nvgpu_mem struct and the correct mask should be
chosen. Due to many uses of nvgpu_mem structs not created
directly from the DMA API wrapper it's easier to translate
SYSMEM to SYSMEM_COH after creation.

However, the GMMU mapping code, will encounter buffers from
userspace with difference coerency attributes than the DMA
API. Thus the __nvgpu_aperture_mask() really respects the
aperture setting passed in regardless of the DMA API state.
This aperture setting is pulled from NVGPU_VM_MAP_IO_COHERENT
since this is either passed in from userspace or set by the
kernel when using coherent DMA. The aperture field in attrs
is upgraded to coh if this flag is set.

This change also adds a coherent sysmem mask everywhere that
it can. There's a couple places that do not have a coherent
register field defined yet. These need to eventually be
defined and added.

Lastly the aperture mask code has been mvoed from the Linux
vm.c code to the general vm.c code since this function has
no Linux dependencies.

Note: depends on https://git-master.nvidia.com/r/1664536 for
new register fields.

JIRA EVLR-2333

Change-Id: I4b347911ecb7c511738563fe6c34d0e6aa380d71
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1655220
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-27 16:03:43 -08:00
Srikar Srimath Tirumala
a26de1185a Revert "gpu: nvgpu: Use gv11b_css_hw_set_handled_snapshots for GV11B"
This reverts commit 2f2e51bbae.

Bug 2068936

Change-Id: I539cdc12a3bd0d9d7fe0ce7dbe9cb7a274eeaa57
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664647
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
2018-02-26 19:01:16 -08:00
Martin Radev
2f2e51bbae gpu: nvgpu: Use gv11b_css_hw_set_handled_snapshots for GV11B
The value of NV_PERF_PMASYS_MEM_BUMP is different for Volta
and NVGPU_IOCTL_CHANNEL_CYCLE_STATS_SNAPSHOT_CMD_FLUSH did not
have correct behavior on GV11B due to that.
The patch adds an instance of css_hw_set_handled_snapshots
for Volta to fix that.
The patch also renames css_hw_set_handled_snapshots
to gk20a_css_hw_set_handled_snapshots to make it more clear
that the function is arch dependent.

Bug 1960846

Change-Id: I92c35a862ecd7f918dd1458c086fc7ae42ca8fc5
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1662427
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 12:03:24 -08:00
Deepak Nibade
f0cbe19b12 gpu: nvgpu: add user API to get read-only syncpoint address map
Add User space API NVGPU_AS_IOCTL_GET_SYNC_RO_MAP to get read-only syncpoint
address map in user space

We already map whole syncpoint shim to each address space with base address
being vm->syncpt_ro_map_gpu_va

This new API exposes this base GPU_VA address of syncpoint map, and unit size
of each syncpoint to user space.
User space can then calculate address of each syncpoint as
syncpoint_address = base_gpu_va + (syncpoint_id * syncpoint_unit_size)

Note that this syncpoint address is read_only, and should be only used for
inserting semaphore acquires.
Adding semaphore release with this address would result in MMU_FAULT

Define new HAL g->ops.fifo.get_sync_ro_map and set this for all GPUs supported
on Xavier SoC

Bug 200327559

Change-Id: Ica0db48fc28fdd0ff2a5eb09574dac843dc5e4fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1649365
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-07 15:35:47 -08:00
Deepak Goyal
9402f4165b gpu: nvgpu: fix out of bounds access
lsf_ucode_desc_v1 has more size than signature bin.
In memcpy(dest, src, size_to_copy) usage, "size_to_copy"
is more than "size of the src" which is causing out of bounds
access.

Bug 2051856
NVGPU-507

Change-Id: I0aad34df39f95f7e95ccb10539e1fae9f65361a8
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1650140
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-06 10:01:48 -08:00
Seema Khowala
9beefc4551 gpu: nvgpu: add fecs_host_int_enable hal
This will be used to enable fecs interrupts per
chip.

Change-Id: Id99412ca1a9c4caad999c3458b0e9701515db4b9
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1642554
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 13:23:21 -08:00
Richard Zhao
b386768d32 gpu: nvgpu: make .tsg_unbind_channel one layer lower
The message to tell RM server to unbind channel has to be sent after
client unbinds the channel and before client calls tsg release. The
channel has to belong to a tsg on RM server before client submit a
runlist to remove the channel. Or there's a bare channel problem.

By moving .tsg_unbind_channl one layer lower, gk20a_tsg_unbind_channel()
will be common functions for all chip, and it'll call tsg release after
call .tsg_unbind_channel. So vgpu won't need to worry about tsg was
released before sending msg to RM server.

Bug 200382695
Bug 200382785

Change-Id: I32acc122f3f9d5d0628049ccf673225f9e90c87a
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1645383
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 02:40:48 -08:00
Terje Bergstrom
fb0a23ea16 gpu: nvgpu: Implement gp10b variant of cbc_ctrl
Pascal has support for more comptags than Maxwell, but we were using
gm20b definitions for cbc_ctrl on all chips. Specifically field
clear_upper_bound is one bit wider in Pascal.

Implement gp10b version of cbc_ctrl and take that into use in Pascal
and Volta.

Bug 200381317

Change-Id: I7d3cb9e92498e08f8704f156e2afb34404ce587e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1642574
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-24 14:42:16 -08:00
Terje Bergstrom
f3f14cdff5 gpu: nvgpu: Fold T19x code back to main code paths
Lots of code paths were split to T19x specific code paths and structs
due to split repository. Now that repositories are merged, fold all of
them back to main code paths and structs and remove the T19x specific
Kconfig flag.

Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 22:20:15 -08:00
seshendra Gadagottu
193a2ed38c gpu: nvgpu: add sw method for SET_BES_CROP_DEBUG4
Added sw method support for SET_BES_CROP_DEBUG4.
In this sw method:
CLAMP_FP_BLEND_TO_MAXVAL forces overflow and
CLAMP_FP_BLEND_TO_INF blend results to clamp to FP maxval.

Added support for this sw method in gp10b/gp106/gv11b
and gv100.

Bug 2046636

Change-Id: I3a9e97587aca76718f7f504ea3b853f87409092a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1641529
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 15:29:54 -08:00
Alex Waterman
d52b88315a gpu: nvgpu: fix typo
Rename gb10b_init_bar2_vm*() to gp10b_init_bar2_vm*().

Bug 200378257

Change-Id: I9f8a9ef42c82923200d7053c61bab2652b58cbc2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639757
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-18 23:40:35 -08:00
Deepak Goyal
e0dbf3a784 gpu: nvgpu: gv11b: Enable perfmon.
t19x PMU ucode uses RPC mechanism for
PERFMON commands.

- Declared  "pmu_init_perfmon",
  "pmu_perfmon_start_sampling",
  "pmu_perfmon_stop_sampling" and
  "pmu_perfmon_get_samples" in pmu ops
  to differenciate for chips using RPC & legacy
  cmd/msg mechanism.
- Defined and used PERFMON RPC commands for t19x
  	- INIT
	- START
	- STOP
	- QUERY
- Adds RPC handler for PERFMON RPC commands.
- For guerying GPU utilization/load, we need to send PERFMON_QUERY
  RPC command for gv11b.
- Enables perfmon for gv11b.

Bug 2039013

Change-Id: Ic32326f81d48f11bc772afb8fee2dee6e427a699
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1614114
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-18 23:40:02 -08:00
Terje Bergstrom
2f6698b863 gpu: nvgpu: Make graphics context property of TSG
Move graphics context ownership to TSG instead of channel. Combine
channel_ctx_gk20a and gr_ctx_desc to one structure, because the split
between them was arbitrary. Move context header to be property of
channel.

Bug 1842197

Change-Id: I410e3262f80b318d8528bcbec270b63a2d8d2ff9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639532
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-17 12:29:09 -08:00
Terje Bergstrom
ece3d958b3 gpu: nvgpu: Combine gk20a and gp10b free_gr_ctx
gp10b version of free_gr_ctx was created to keep gp10b source code
changes out from the mainline. gp10b was merged back to mainline a
while ago, so this separation is no longer needed. Merge the two
variants.

Change-Id: I954b3b677e98e4248f95641ea22e0def4e583c66
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635127
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-12 12:42:57 -08:00
seshendra Gadagottu
e9de95d7e0 gpu: nvgpu: use chip specific zbc_c/z format reg
Use chip specific gpcs_swdx_dss_zbc_c_format_reg
and gpcs_swdx_dss_zbc_z_format_reg. These registers
are different for gv11b/gv100 from gp10b/gp106.

Change-Id: I9e209c878a11edc986ba4304ff60fcccbb5087aa
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635091
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 08:47:07 -08:00
seshendra Gadagottu
0ac3ba2a99 gpu: nvgpu: gv11b: fix for gfx preemption
Used chip specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes during committing
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.
For gp10b used smaller buffer sizes than specified
value in hw manuals as per sw requirement.

Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode

This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.

Another issue fixed is: gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits
available now. This done because of legacy implementation
of fecs ucode.

Bug 1976694

Change-Id: I2dc923340d34d0dc5fe45419200d0cf4f53cdb23
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635027
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 08:47:03 -08:00
Alex Waterman
2ae16008cd Revert "gpu: nvgpu: gv11b: fix for gfx preemption"
This reverts commit caf168e33e.

Might be causing an intermittency in quill-c03 graphics submit. Super
weird since the only change that seems like it could affect it is the
header file update but that seems rather safe.

Bug 2044830

Change-Id: I14809d4945744193b9c2d7729ae8a516eb3e0b21
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1634349
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Timo Alho <talho@nvidia.com>
Tested-by: Timo Alho <talho@nvidia.com>
2018-01-09 06:32:30 -08:00
seshendra Gadagottu
caf168e33e gpu: nvgpu: gv11b: fix for gfx preemption
Used chip specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes during committing
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.

Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode

This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.

Another issue fixed is: gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits
available now. This done because of legacy implementation
of fecs ucode.

Bug 1976694

Change-Id: I284e29e0815d205c150998b07d0757b5089d3267
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630520
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Tested-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-08 12:16:49 -08:00