membar.sys does synchronization with the whole system (GPU and CPU),
membar.gl does synchronization within the GPU.
In gv11b, fb flush is generating membar.gl instead of membar.sys, which
is an issue. To fix this issue. following WAR is used:
1. Use bar1 engine id and bind it to a particular pdb,
2. Then instead of a fb_flush, issue a tlb invalidate of the bar1 pdb.
Now allocation of vm for bar1 instance block and bar1 binding is done
without check for bar1 support. Only bar1 register mapping is done
based on bar1 support enabled.
Bug 2112790
Change-Id: I76f43f1178a68f10823d48bc9da55d2bd686dd52
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1750257
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Add support for aborting runlist/s. Aborting runlist/s,
will abort all active tsgs and associated active channels
within these active tsgs
-Bare channels are no longer supported. Remove recovery
support for bare channels. In case there are bare
channels, recovery will trigger runlist abort
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: I6bec8a0004508cf65ea128bf641a26bf4c2f236d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640567
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Even though tsg preempt timed out, teardown sequence would by
design require s/w to issue another preempt.
If recovery includes an ENGINE_RESET, to not have race conditions,
use RUNLIST_PREEMPT to kick all work off, and cancel any context
load which may be pending. This is also needed to make sure
that all PBDMAs serving the engine are not loaded when engine is
reset
-Add max retries for pre-si platforms for runlist preempt done
polling loop
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: If9d1731fc17e7e7281b24a696ea0917cd269498c
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1709902
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-is_preempt_pending hal does not need timeout_rc_type input param as
for volta, reset_eng_bitmask is saved if preempt times out. For
legacy chips, recovery triggers mmu fault and mmu fault handler
takes care of resetting engines.
-For volta, no special input param needed to differentiate between
preempt polling during normal scenario and preempt polling during
recovery. Recovery path uses preempt_ch_tsg hal to issue preempt.
This hal does not issue recovery if preempt times out.
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: Ie76a18ae0be880cfbeee615859a08179fb974fa8
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1709799
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For Si platforms, gk20a_get_gr_idle_timeout returns
3000 ms i.e. 3 sec. Currently this time is used for
polling each pbdma, eng and runlist and this conflicts
channel timeout if preempt fails.
Use 1000 ms timeout for polling preempt timeout when
timeouts are enabled else use gk20a_get_gr_idle_timeout.
For non-si platforms, polling loop is depending on
max number of retries.
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: Icdb69b7b7d17292f2b6a43f1d8e9d75ff545d0ae
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1739543
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-During polling eng preempt done, reset eng only if eng stall
intr is pending. Also stop polling for eng preempt done
if eng intr is pending.
-Add max retries for pre-si platforms for poll pbdma and eng
preempt done polling loops.
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: I66b07be9647f141bd03801f83e3cda797e88272f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694137
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- This patch sets required fecs trace functions to hals.
- In gv100, GPU VA needs to programmed in ucode to
get FECS trace. Hence, setting this "NVGPU_FECS_TRACE_VA"
characteristic to be true.
EVLR-2708
Change-Id: I41845a1db8452b8e3fa9d67ecc4cabfac503fd0b
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1747269
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
ACR allocates buffers from vidmem and passes explicitly flag
NVGPU_DMA_NO_KERNEL_MAPPING. Vidmem buffers are never mapped to CPU,
and dGPU ACR does not actually care, so switch to using
nvgpu_alloc_vid_at(). This removes another explicit dependency
to NVGPU_DMA_NO_KERNEL_MAPPING, which is going to be removed.
Change-Id: I532d40c5ba9e71f07461c526bd2a43e1eb01a290
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1753711
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Update the Linux specific code to match the MM API docs in the
previous patch. The user passed page size is plumbed through
the Linux VM mapping calls but is ultimately ignored once the
core VM code is called. This will be handled in the next
patch.
This also adds some code to make the CDE page size picking
happen semi-intelligently. In many cases the CDE buffers can
be mapped with large pages.
Bug 2011640
Change-Id: I20e78e7d5a841e410864b474179e71da1c2482f4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1740610
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When generating the PTE size for a given mapping the code must
consider whether the GPU is being IOMMU'ed. The presence and
usage of an IOMMU implies the buffers will appear contiguous
to the GPU. Without an IOMMU we cannot assume that and therefor
must use small pages regardless of the size of the buffer to
be mapped.
Bug 2011640
Change-Id: I6c64cbcd8844a7ed855116754b795d949a3003af
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1697891
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_submit_channel_gpfifo the gpfifo entries can come from a kernel
buffer or from userspace. To simplify the logic in
gk20a_submit_append_gpfifo, extract out a function that copies the
entries directly from userspace to the gpu memory for performance, and
another function that copies from a kernel buffer to the gpu memory. The
latter is used for kernel submits and when the gpfifo pipe exists which
would mean that the gpfifo memory is in vidmem and is thus not directly
accessible with a kernel virtual pointer.
While this function is being changed a lot, also rename it to start with
nvgpu_ instead of gk20a_.
Additionally, simplify pushbuffer debug tracing by always using the
kernel memory for the prints. Tracing when the gpfifo memory has been
allocated in vidmem is no longer supported; sysmem is almost always used
in practice anyway.
Jira NVGPU-705
Change-Id: Icab843a379a75fb46054dee157a0a54ff9fbba59
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730481
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The biggest remaining Linuxism in the submit path is the
copy_from_user() calls for reading the gpfifo entries to the HW-visible
buffer. Abstract away the copy of one such segment starting at some
offset and keep the wraparound logic and vidmem proxy in the core submit
path.
Jira NVGPU-705
Change-Id: I0c6438045c695e5e3f5da4fbc0c92d2c6e7f32cb
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730480
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a_submit_channel_gpfifo() supports reading the gpfifo entries from
either a kernel buffer or an userspace buffer in an ioctl. Add two
separate entry points: one for the ioctl and another for any other
kernel use. This shortens the function prototypes and simplifies and
clarifies the call sites slightly.
Jira NVGPU-705
Change-Id: If5141a459261a451f78cc50972f4c94d95ba44d1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730479
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Moved PG refcount checking to a wrapper function, this
function manages the refcount and decides whether to call
dbg_set_powergate function.
Instead of checking the dbg_s->is_pg_disabled variable,
code is checking g->dbg_powergating_disabled_refcount
variable to know if powergate is disabled or not.
Updating hwpm ctxsw mode without disabling powergate
will result in priv errors.
Bug 200410871
Bug 2109765
Change-Id: I33c9022cb04cd39249c78e72584dfe6afb7212d0
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1753550
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add an OS-abstracted API for printing the name of the current process
into a log message and convert the single occurrence of current->comm in
submit path power failure to use it.
Jira NVGPU-705
Change-Id: I1a509dcc5aecc3c89ce4582733888081b3e38f1f
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1749833
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In case of mmu nack error interrupt is received twice through SM
reported mmu nack interrupt and mmu fault in undertermined order.
Recover on the first received interrupt to avoid semaphore release
and skip doing a second recovery.
Also fix NULL pointer dereference in function
gv11b_fifo_reset_pbdma_and_eng_faulted when channel reference is
invalid in teardown path.
Bug 200382235
Change-Id: I361a5725d7b6355ebf02b2870727f647fbd7a37e
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1739804
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The watchdog tracks wall clock time. If the GPU's runlist is heavily
congested, other work can last long enough to trigger the watchdog for
trusted kernel channels too.
We don't expect the CDE work to ever get stuck, so disable wdt there.
Bug 200311892
Change-Id: I58c7d23891bc73aaeea0ccfcead567b3c6c13a52
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1493814
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make ecc sysfs hash table per GPU by adding it as
part of nvgpu_os_linux. Using a single hash table
might give incorrect results as GPUs have same filenames
and a filename is used as a key for a lookup.
Add device_attribute as part of struct gk20a_ecc_stat. Using
a single array of pointers of device attribute for an
ecc_stat results in memory leak and incorrect stats if
multiple GPUs are present on the system. This array of pointers
will always hold info for GPU which created sysfs nodes last.
Fix this by making device attribute array per ecc stat per GPU.
Fix ecc stat removal to consider zero sub-units for a given
number of hwunits. The multiplication with zero results
in not removing any sysfs node at all.
Bug 1987855
Change-Id: Ifcacc5623cede8decfe228c02d72786337cd0876
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735989
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add remove_gr_sys() op to gpu_ops to reverse steps
done in create_gr_sysfs().
Make gv11b_tegra_remove() specific to gv11b instead
to properly remove sysfs nodes. This also helps in
having gv11b specific remove steps.
Also, update platform remove function of dGPU i.e.
nvgpu_pci_tegra_remove() to remove sysfs nodes. This
adds parity with iGPU platform remove.
Bug 1987855
Change-Id: Ibbaffac5c24346709347f86444a951461894354d
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735987
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- defined platform agnostic wrapper for mempool
mapping and unmapping.
- used platform agnositc wrapper for device
tree parsing.
- modified css_gr_gk20a to include special
handling incase of rm-server
JIRA: VQRM:3699
Change-Id: I08fd26052edfa1edf45a67be57f7d27c38ad106a
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1733576
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- removed inclusion of linux includes.
- replaced with nvgpu/*.h's
- reformated the function signature of
"css_hw_get_pending_snapshot" and
"css_hw_get_overflow_status" be global instead of
static.
- added get_pending_snapshot and get_overflow_status
to ops->css.
JIRA: VQRM-3699
Change-Id: I177904c263e143b414924c2c28ad6fd3cfd00132
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1732783
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below HALs to setup mmu_fault configuration registers and to read
information registers and set them on Volta
gops.fb.write_mmu_fault_buffer_lo_hi()
gops.fb.write_mmu_fault_buffer_get()
gops.fb.write_mmu_fault_buffer_size()
gops.fb.write_mmu_fault_status()
gops.fb.read_mmu_fault_buffer_get()
gops.fb.read_mmu_fault_buffer_put()
gops.fb.read_mmu_fault_buffer_size()
gops.fb.read_mmu_fault_addr_lo_hi()
gops.fb.read_mmu_fault_inst_lo_hi()
gops.fb.read_mmu_fault_info()
gops.fb.read_mmu_fault_status()
Jira NVGPUT-13
Change-Id: Ia99568ff905ada3c035efb4565613576012f5bef
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1744063
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
A few review comments got lost in the review of moving bus code to
common/bus. This takes care of renaming the header file protection
define, deletes the unnecessary description of the file in header,
and updates copyright years.
Change-Id: Ib7dfe3d8fdf31ff3ea1fbf96fc41f9e454486dd1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1741824
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>