Add below two new HALs
gops.fifo.runlist_hw_submit() to submit a new runlist to hardware
gops.fifo.runlist_wait_pending() to wait until runlist write is successful
Set existing API gk20a_fifo_runlist_wait_pending() to
gops.fifo.runlist_wait_pending HAL
Add new API gk20a_fifo_runlist_hw_submit() which submits the runlist to h/w
and set it to gops.fifo.runlist_hw_submit HAL
Jira NVGPUT-20
Change-Id: Ic23f7d947e30883aca0b536de818e79e14733195
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1700548
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Allow a potential IOMMU'ed GMMU mapping for all SYSMEM buffers
inlcuding coherent sysmem. Typically this won't actually happen
since IO coherent mappings will also often be accessed over
NVLINK which is physically addressed.
Also update the comments surrounding this code to take into
account the new NVLINK nuances. Since NVLINK buffers are
directly mapped even when the IOMMU is enabled this is very
deserving of a comment explaining what's going on.
Lastly add some simple functions for checking if an nvgpu_mem
(or a particular aperture field) is a sysmem aperture. Currently
this includes SYSMEM and SYSMEM_COH.
JIRA EVLR-2333
Change-Id: I992d3c25d433778eaad9eef338aa5aa42afe597e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665185
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Cache the rate used in clk_set_rate().
Return that cached rate on clk_get_rate(), don't read from hardware.
This cached rate is used to avoid duplicate requests to clk_set_rate().
Motivation is to support multiple governors for gpu clk.
Reading clock from hardware is unreliable in multi-governor situation.
Relying on hardware clock value could mislead the kernel gpu governor
in its scaling calculations.
Bug 2051688
Change-Id: I43fc056eea6f69fe0889c45640fcb892b658071c
Signed-off-by: Arun Kannan <akannan@nvidia.com>
(cherry picked from commit 7f819a9ba7)
Reviewed-on: https://git-master.nvidia.com/r/1662759
Reviewed-on: https://git-master.nvidia.com/r/1668919
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch deals with cleanups meant to make things simpler for the
upcoming os abstraction patches for the sync framework. This patch
causes some substantial changes which are listed out as follows.
1) sync_timeline is moved out of gk20a_fence into struct
nvgpu_channel_linux. New function pointers are created to facilitate os
independent methods for enabling/disabling timeline and are now named
as os_fence_framework. These function pointers are located in the struct
os_channel under struct gk20a.
2) construction of the channel_sync require nvgpu_finalize_poweron_linux()
to be invoked before invocations to nvgpu_init_mm_ce_context(). Hence,
these methods are now moved away from gk20a_finalize_poweron() and
invoked after nvgpu_finalize_poweron_linux().
3) sync_fence creation is now delinked from fence construction and move
to the channel_sync_gk20a's channel_incr methods. These sync_fences are
mainly associated with post_fences.
4) In case userspace requires the sync_fences to be constructed, we
try to obtain an fd before the gk20a_channel_submit_gpfifo() instead of
trying to do that later. This is used to avoid potential after effects
of duplicate work submission due to failure to obtain an unused fd.
JIRA NVGPU-66
Change-Id: I42a3e4e2e692a113b1b36d2b48ab107ae4444dfa
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1678400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
As part of debug session unification following changes are
required.
-Including bug.h header file to fix the compilation issue
on QNX
- The mechanism of posting debug events is OS specific. In Linux
this works through poll fd, wherein we can make use of nvgpu_cond
variables to poll and trigger the corresponding wait_queue
via nvgpu_cond_broadcast_interruptible() call.
The post event functionality on QNX doesn't work on poll though.
It uses iofunc_notify_trigger to post the debug events to calling
process. As such QNX can't work with nvgpu_cond's.
To overcome this issue, it is proposed to create a OS specific
interface for posting debugger events. Linux can call
nvgpu_cond_broadcast_interruptible() in its implementation, which
makes sense since these are already initialized and poll'ed in the
Linux specific code only.
QNX can implement this interface to call iofunc_notify_* functions,
as per its need
Jira VQRM-2363
Change-Id: I0abdc0787f771040b8aff5384290d7e6549f81fb
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Signed-off-by: Prateek Sethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1696368
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, hyp_read_ipa_pa_info() only translates IPA for RAM
mappings. It fails for MMIO mappings. In particular, it will
fail when attempting to translate addresses in the syncpoint
shim aperture. As a workaround, assume 1:1 IPA to PA mapping
when hyp_read_ipa_pa_info fails, and address is in syncpt
shim aperture.
Bug 2096877
Change-Id: I5267f0a8febf065157910ad3408374cacd398731
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1687796
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
linux driver runs in user's process but qnx driver has dedicate driver
process, so they have different way to get user pid. nvgpu common code
expect calls from os specific code pass pid/tid.
ce/cde open channel for internal use, we use driver pid.
Jira VQRM-3534
Change-Id: I892372ac5f1dc4d25f9928d16992bcc659d12a56
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694145
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gv11b/gk20a_create_priv_addr_table() we do not consider floorswept FBPAs
and just calculate the unicast list assuming all FBPAs are present
This generates incorrect list of unicast addresses
Fix this introducing new HAL ops.gr.split_fbpa_broadcast_addr
Set gr_gv100_get_active_fpba_mask() for GV100
Set gr_gk20a_split_fbpa_broadcast_addr() for rest of the chips
gr_gv100_get_active_fpba_mask() will first get active FPBA mask and generate
unicast list only for active FBPAs
Bug 200398811
Jira NVGPU-556
Change-Id: Idd11d6e7ad7b6836525fe41509aeccf52038321f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694444
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The submit gpfifo flags are splattered everywhere inside the nvgpu
code. Though the usage is inside nvgpu Linux code only, still it
needs to be gotten rid of and replaced with the defines
present in common code.
VQRM-3465
Change-Id: I901b33565b01fa3e1f9ba6698a323c16547a8d3e
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1691979
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove the usage of nvgpu_gpfifo splattered across nvgpu,
and replace with a struct defined in common code.
The usage is still inside Linux, but this helps the
subsequent unification efforts, e.g. to unify the submit
path.
VQRM-3465
Change-Id: I9e5ac697a0c7f85239ddba319085c09481d20d6b
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1691978
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove the usage of nvgpu_fence splattered across nvgpu,
and replace with a struct defined in common code.
The usage is still inside Linux, but this helps the
subsequent unification efforts, e.g. to unify the submit
path.
VQRM-3465
Change-Id: Ic3737450123dfc5e1c40ca5b6b8d8f6b3070aa0d
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1691977
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gv11b_gr_egpc_etpc_priv_addr_table(), we call
gv11b_gr_update_priv_addr_table_smpc() to convert SMPC broadcast address into
list of unicast addresses
But before calling gv11b_gr_update_priv_addr_table_smpc() we sometimes
incorrectly set gpc_num/tpc_num to zero and that leads to generating incorrect
list of unicast addresses
Remove this incorrect initialization of gpc_num/tpc_num
Also update gv11b_gr_egpc_etpc_priv_addr_table() to receive tpc_num along with
gpc_num
Bug 2099717
Jira NVGPU-580
Change-Id: Idd4e5f78dbe6ca1800efae93c66355d06417d1f2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1691373
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gm20b_get_fbp_en_mask(), we read incorrect fuse register to get status
of enabled FBPs
And then we use incorrect arithmetic to calculate fpb_en_mask
Fix this by using correct fuse register and also doing correct arithmetic to get
mask of enabled FBPs
Bug 200398811
Jira NVGPU-556
Change-Id: I79f3ebf590faa9baf176c7a939142c379bf5ebf4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690029
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We currently use hard coded values of NV_PERF_PMMGPC_CHIPLET_OFFSET and
NV_PMM_FBP_STRIDE which are incorrect for Volta
Add new GR HAL get_pmm_per_chiplet_offset() to get correct value per-chip
Set gr_gm20b_get_pmm_per_chiplet_offset() for older chips
Set gr_gv11b_get_pmm_per_chiplet_offset() for Volta
Use HAL instead of hard coded values wherever required
Bug 200398811
Jira NVGPU-556
Change-Id: I947e7febd4f84fae740a1bc74f99d72e1df523aa
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690028
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We have new broadcast registers on Volta, and we need to generate correct
unicast addresses for them so that we can write those registers to context image
Add new GR HAL create_priv_addr_table() to do this conversion
Set gr_gk20a_create_priv_addr_table() for older chips
Set gr_gv11b_create_priv_addr_table() for Volta
gr_gv11b_create_priv_addr_table() will use the broadcast flags and then generate
appriate list of unicast register for each broadcast register
Bug 200398811
Jira NVGPU-556
Change-Id: Id53a9e56106d200fe560ffc93394cc0e976f455f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690027
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
With Volta we have more number of broadcast registers than previous chips
and we don't decode them right now in gr_gk20a_decode_priv_addr()
Add a new GR HAL decode_priv_addr() and set gr_gk20a_decode_priv_addr() for all
previous chips
Add and use gr_gv11b_decode_priv_addr() for Volta
gr_gv11b_decode_priv_addr() will decode all the broadcast registers and set
the broadcast flags apporiately
Define below new broadcast types
PRI_BROADCAST_FLAGS_PMMGPC
PRI_BROADCAST_FLAGS_PMM_GPCS
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCA
PRI_BROADCAST_FLAGS_PMM_GPCGS_GPCTPCB
PRI_BROADCAST_FLAGS_PMMFBP
PRI_BROADCAST_FLAGS_PMM_FBPS
PRI_BROADCAST_FLAGS_PMM_FBPGS_LTC
PRI_BROADCAST_FLAGS_PMM_FBPGS_ROP
Bug 200398811
Jira NVGPU-556
Change-Id: Ic673b357a75b6af3d24a4c16bb5b6bc15974d5b7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1690026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Enabled internal temperature sensor read for gv100
dgpu.
- Added check to temperature read support before
proceeding to read temperature from H/W
- Assigned GP106 temperature HAL's for GV100 as no changes
between GP106 & GV100 H/W registers.
Bug 200352328
Change-Id: I86b5a1859b87ace49a07d0ff3749bb5b085bba91
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673347
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>