Wait for the PBDMAs and engines to go idle so that pending tasks complete
before suspending.
Update the logic in gk20a_wait_engine_idle to consider the ctxsw status,
and update the PBDMA idle logic to check the PBDMA status and the pb/gp
get/put pointers.
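The idle conditions described above can be sketched roughly as follows. This is only an illustration: the snapshot structures and their field names are hypothetical and do not reflect the actual nvgpu register layout, and treating a PBDMA with no channel loaded as idle is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of PBDMA state; not the real register layout. */
struct pbdma_snapshot {
	bool chan_valid;	/* status reports a channel loaded on the PBDMA */
	uint64_t pb_get;	/* pushbuffer get pointer */
	uint64_t pb_put;	/* pushbuffer put pointer */
	uint32_t gp_get;	/* GPFIFO get pointer */
	uint32_t gp_put;	/* GPFIFO put pointer */
};

/* Hypothetical snapshot of engine state. */
struct engine_snapshot {
	bool busy;		/* engine reports busy */
	bool ctxsw_in_progress;	/* a context switch is still ongoing */
};

/* Assumption: a PBDMA with no channel loaded counts as idle. */
bool pbdma_is_idle(const struct pbdma_snapshot *s)
{
	if (!s->chan_valid)
		return true;
	/* All fetched work is consumed only when get has caught up with put. */
	return s->pb_get == s->pb_put && s->gp_get == s->gp_put;
}

/* The engine is idle only if it is not busy and no ctxsw is in flight. */
bool engine_is_idle(const struct engine_snapshot *s)
{
	return !s->busy && !s->ctxsw_in_progress;
}
```

A real wait loop would poll these conditions with a timeout; that part is omitted here.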
Bug 3789519
Bug 3832838
Change-Id: Ifd105bbb305eaf358423281b192f67d782d773a4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2870162
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
To support concurrent UMD kickoff and railgate (due to VPR resize), nvgpu
must prevent further work submission as soon as it is known that the GPU
needs to be idled. So nvgpu needs to unmap the usermode region as early
as possible in the suspend sequence.
Otherwise, engines will not go idle before poweroff. Hence move the
call to nvgpu_hide_usermode_poweroff to the beginning of
gk20a_pm_prepare_poweroff.
Also, during suspend we ensure that the channels are preempted cleanly.
IRQs should be kept enabled until after channels are suspended, as the
stalling IRQ can block the preemption. Hence move the IRQ disable to
after the channel_suspend call.
gk20a_prepare_poweroff unconditionally sets power_on to false. Hence
there is no need to re-enable IRQs or resume scale in the failure path
of gk20a_pm_prepare_poweroff, as those will be done during the call to
gk20a_pm_finalize_poweron.
Bug 3789519
Change-Id: I03064e7e636252a8f3d8fe9c8c05629ce2ba5fba
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2853584
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
UMD is unable to restrict railgate for deterministic channels
when NVGPU_CAN_RAILGATE is set to false. This causes issues
with VPR resize as there is no means of preventing an active
VPR resize in progress.
Add a fault handler for the usermode region. The fault handler's purpose
is to intercept UMD accesses to the doorbell region while a GPU reset
is in progress. A GPU reset can be triggered by a VPR resize. During a
reset, the corresponding PTEs for the usermode region are zapped. The
fault handler tries to take g->deterministic_busy for read
and blocks until the reset is finished. A VPR resize is guaranteed
to be mutually exclusive due to the use of the g->deterministic_busy
RW semaphore.
Bug 3789519
Change-Id: Ie046ee9be8d9b5d4019359c60a4578097b8d55a3
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2802185
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
%p prints "(ptrval)" instead of a hexadecimal value until the kernel has
gathered enough entropy. During early boot, this can produce an invalid
cache name such as "nvgpu-cache-0x (ptrval)-128-1", and such a
cache name can cause kmem_cache_create() to fail.
To avoid invalid cache names, replace %p with %px in the cache name.
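As a rough user-space illustration of the intended result: %px is a kernel printk extension with no standard printf equivalent, so the sketch below formats the raw address through uintptr_t instead. The helper name and the meaning of the two trailing fields (taken here to be a size and an index) are assumptions based only on the example name above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build a cache name that always contains the raw address in hex,
 * never a "(ptrval)" placeholder. Hypothetical helper; the kernel
 * change itself only swaps %p for %px in the format string.
 * Returns the snprintf() result (length that was or would be written).
 */
int build_cache_name(char *buf, size_t len, const void *owner,
		     size_t size, int idx)
{
	return snprintf(buf, len, "nvgpu-cache-0x%" PRIxPTR "-%zu-%d",
			(uintptr_t)owner, size, idx);
}
```

Because the address is always printed as hex digits, the resulting name contains no spaces or parentheses regardless of how early in boot it is built.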
Bug 4100509
Change-Id: Iae0ae9cf1a30ec91aeddddaafda9e7376fc80796
Signed-off-by: Jake Park <jakep@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2929270
Reviewed-by: Dmitry Pervushin <dpervushin@nvidia.com>
Reviewed-by: Kwangwoo Lee <kwangwool@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
NETLIST_REGIONID_SW_CTX_LOAD writes update gr_gpcs_tpcs_tex_m_dbg2_r to
its default value, which keeps rd coalesce enabled for LG & SU.
Disable rd coalesce for TEX, LG and SU after the NETLIST_REGIONID_SW_CTX_LOAD
writes during gr init and golden ctx init for it to take effect.
For gr sw method handling, don't update the tex rd coalesce on interrupt
with offset *_SET_RD_COALESCE, as we want to keep rd coalescing disabled.
Bug 3881919
Change-Id: Ie7e6616d48f84547ce3380bfa395910b7995c05b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2857141
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, in case of any fecs error, we only dump fecs
ctxsw fw related registers, mailboxes and trace registers.
With this change, we want to ensure we dump the gpccs register
space as well. This will help in debugging ctxsw related
failures.
JIRA NVGPU-9560
Bug 3907163
Change-Id: I61e25883da4455ea1412ca70c5fc3377d9a786a3
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2850402
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
gk20a_scale_target is called through the target member of a
devfreq_profile. It is only called from devfreq's update_devfreq
function or through governor_passive. governor_passive is not used for
nvgpu.
Since update_devfreq already enforces the devfreq limits,
gk20a_scale_target can be simplified by only checking pm_qos
limits, and only if GK20A_PM_QOS is enabled.
This also resolves a race between creating the devfreq sysfs
files and setting 'l->devfreq' in gk20a_scale_init that can
lead to a NULL pointer access when writing to the sysfs files.
Example:
Unable to handle kernel NULL pointer dereference at virtual address 00000430
<snip>
Call trace:
[<000000006aa50d89>] gk20a_scale_target+0x5c/0x120 [nvgpu]
[<00000000e5a63f7c>] update_devfreq+0xec/0x22c
[<0000000014a13c8a>] max_freq_store+0xa8/0xfc
[<0000000072139393>] dev_attr_store+0x48/0x60
[<000000008ec280df>] sysfs_kf_write+0x60/0x70
[<0000000038427ed5>] kernfs_fop_write+0xc4/0x1e0
[<00000000c0b74aa9>] __vfs_write+0x60/0x14c
[<0000000078fcebb4>] vfs_write+0xb0/0x1b4
[<000000007720da30>] SyS_write+0x74/0xf0
[<0000000067443e2c>] __sys_trace_return+0x0/0x4
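The simplified limit check can be sketched as below. The structure and function names are hypothetical, and a runtime flag stands in for the GK20A_PM_QOS build option. Nothing in the sketch touches a devfreq pointer, which is why the race seen in the trace above cannot occur on this path.

```c
#include <stdbool.h>

/* Hypothetical limits in Hz; stand-ins for whatever pm_qos reports. */
struct qos_limits_sketch {
	unsigned long min_freq;
	unsigned long max_freq;
};

/*
 * Clamp the requested frequency to the pm_qos limits only.
 * The devfreq min/max limits are assumed to have been applied by the
 * caller already, so they are not rechecked here.
 */
unsigned long scale_target_sketch(unsigned long requested, bool qos_enabled,
				  const struct qos_limits_sketch *qos)
{
	unsigned long freq = requested;

	if (!qos_enabled)
		return freq;

	if (freq < qos->min_freq)
		freq = qos->min_freq;
	/* Assumption: the maximum wins if the limits ever contradict. */
	if (freq > qos->max_freq)
		freq = qos->max_freq;

	return freq;
}
```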
Bug 3910155
Change-Id: I7193cc5ea85454acf0890b3ca8d1c3526ca8517e
Signed-off-by: Ken Chang <kenc@nvidia.com>
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2828219
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
- When DISALLOW cmd is sent from driver to PMU the actual
completion of the disallow will be acknowledged by PMU
via a PG EVENT: ASYNC_CMD_RESP.
- Disallow needs a delayed ACK from PMU in order to disable
the ELPG.
- If ELPG is already engaged, the DISALLOW cmd will trigger
ELPG exit and then transition to PMU_PG_STATE_DISALLOW.
- After this whole process is completed, PMU will send
DISALLOW_ACK through ASYNC_CMD_RESP msg.
- After disallow command is sent from the driver, NvGPU driver
waits/polls for disallow command ack. This is sent immediately
by msg framework of PMU.
- Then, the driver will poll/wait for ASYNC_CMD_RESP event which
is the delayed DISALLOW ACK.
- The driver captures the ASYNC_CMD_RESP sent from PMU.
- set disallow_state to ELPG_OFF.
- If the driver does not wait/poll for this delayed disallow
ack from PMU, it can result in errors, as PMU is still
processing the DISALLOW cmd while the driver has progressed further.
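The two-stage acknowledgement above can be modelled as a small state machine. This is a simplified sketch: the enum and function names are hypothetical, and it assumes the immediate command ack always arrives before the delayed ASYNC_CMD_RESP.

```c
#include <stdbool.h>

enum disallow_state_sketch {
	DISALLOW_IDLE,		/* no DISALLOW outstanding */
	DISALLOW_WAIT_CMD_ACK,	/* cmd sent, waiting for the immediate ack */
	DISALLOW_WAIT_ASYNC,	/* waiting for the delayed ASYNC_CMD_RESP */
	DISALLOW_ELPG_OFF	/* delayed ack seen, ELPG is off */
};

enum pmu_msg_sketch {
	MSG_CMD_ACK,		/* immediate ack from the PMU msg framework */
	MSG_ASYNC_CMD_RESP	/* delayed DISALLOW_ACK */
};

/* Sending DISALLOW starts the wait for the immediate ack. */
enum disallow_state_sketch disallow_send(void)
{
	return DISALLOW_WAIT_CMD_ACK;
}

/* Advance the state on a PMU message; unexpected messages are ignored. */
enum disallow_state_sketch disallow_on_msg(enum disallow_state_sketch s,
					   enum pmu_msg_sketch m)
{
	switch (s) {
	case DISALLOW_WAIT_CMD_ACK:
		return (m == MSG_CMD_ACK) ? DISALLOW_WAIT_ASYNC : s;
	case DISALLOW_WAIT_ASYNC:
		return (m == MSG_ASYNC_CMD_RESP) ? DISALLOW_ELPG_OFF : s;
	default:
		return s;
	}
}

/* The driver may proceed only after the delayed ack has been captured. */
bool disallow_done(enum disallow_state_sketch s)
{
	return s == DISALLOW_ELPG_OFF;
}
```

The point of the model is that the immediate ack alone never makes disallow_done() true.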
Bug 3580271
Change-Id: I332180c05b6a398107f065d54e9718b7038fb1b2
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689500
(cherry picked from commit fb019bf43a)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2694312
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
On Volta the GPU determines whether to do L3 allocation for a mapping by
checking bit 36 of the physical address. So if a mapping should allocate lines
in the L3 this bit must be set.
However, the physical addresses for 64GB of RAM use the 36th bit,
resulting in a conflict. Thus, add support for disabling L3 support
for SKUs having 64GB of physical memory.
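The conflict can be illustrated with a small sketch. The helper names are hypothetical, and the sketch assumes that "bit 36" is zero-indexed and that the caller knows the highest physical address backed by RAM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per the text: bit 36 of the physical address selects L3 allocation. */
#define L3_ALLOC_BIT 36u

/* Tag a physical address so that the mapping allocates lines in the L3. */
uint64_t with_l3_alloc(uint64_t phys)
{
	return phys | (UINT64_C(1) << L3_ALLOC_BIT);
}

/*
 * The tag bit is only free to use if no real physical address needs it,
 * i.e. the highest RAM address fits entirely below bit 36.
 */
bool l3_alloc_usable(uint64_t max_phys_addr)
{
	return (max_phys_addr >> L3_ALLOC_BIT) == 0;
}
```

Once any valid address has bit 36 set, a tagged address is indistinguishable from a real one, so L3 allocation has to be disabled.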
Bug 3486025
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Ic540e754274cf1d9e6625493962699d21509e540
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2661548
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: Brad Griffis <bgriffis@nvidia.com>
GVS: Gerrit_Virtual_Submit
ACR ucode is encrypted using different keys for prod/dbg boards.
This change adds a check to select ACR ucode based on board type.
Note: This support is added only for t19x.
Bug 2350733
Bug 2672832
Bug 2672836
Bug 2674821
JIRA NVGPU-4001
(cherry picked from commit c19a0f0c26ab94f6bbf4380ab93e458b88589c82)
Change-Id: I2febc2cbe869c06bca0adebd7723b0d6fc1d4b23
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2483968
Tested-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
clk_arb completion file descriptor can get closed immediately after
poll finishes in the work item gp10b_clk_arb_run_arbiter_cb. In
that case, the refcount for nvgpu_clk_dev can become zero in
the work item and can lead to invalid access while removing
nvgpu_clk_dev from the lists.
Remove nvgpu_clk_dev from the list before dropping the reference to
it.
Also, delete the nvgpu_clk_dev in the completion file release handler
while holding the session and requests spinlocks, to avoid a race with
gp10b_clk_arb_run_arbiter_cb using it.
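The ordering fix can be sketched in plain C as follows. The types and helpers are hypothetical stand-ins for nvgpu_clk_dev and its lists, and the spinlocks that the real code holds are left out to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical stand-in for nvgpu_clk_dev. */
struct clk_dev_sketch {
	int refcount;
	struct clk_dev_sketch *next;
};

struct dev_list_sketch {
	struct clk_dev_sketch *head;
};

/* Unlink d from the list, if present. */
void list_unlink_sketch(struct dev_list_sketch *l, struct clk_dev_sketch *d)
{
	struct clk_dev_sketch **pp = &l->head;

	while (*pp != NULL) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return;
		}
		pp = &(*pp)->next;
	}
}

/* Drop one reference; frees the object and returns 1 on the last put. */
int dev_put_sketch(struct clk_dev_sketch *d)
{
	if (--d->refcount == 0) {
		free(d);
		return 1;
	}
	return 0;
}

/*
 * Unlink first, then drop the reference. After the put, d may already
 * be freed, so it must not be touched again (including for list removal).
 */
int release_dev_sketch(struct dev_list_sketch *l, struct clk_dev_sketch *d)
{
	list_unlink_sketch(l, d);
	return dev_put_sketch(d);
}
```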
Bug 200757277
Change-Id: I054eee547f2a6fa633d7ef55df216ec36647a826
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2569522
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Don't store the return value of elpg re-enable if disable fails; this
could make the local status value zero again, causing the elpg-protected
call to be executed with elpg still enabled and elpg re-enabled twice.
Commit c905858565 ("gpu: nvgpu: add cg and pg function") introduced
this bug; failure of re-enabling after a failed disable might be another
problem (and it's not clear why this is done in the first place) which
isn't propagated to the caller, but that would belong to another patch.
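A minimal model of the fixed control flow is shown below. The simulation structure and function names are hypothetical; only the handling of the status value mirrors the description above.

```c
/* Hypothetical simulation state used in place of real ELPG hardware. */
struct elpg_sim {
	int disable_result;	/* value the simulated disable returns */
	int enable_calls;	/* number of times ELPG was re-enabled */
	int protected_calls;	/* number of times the protected call ran */
};

int sim_elpg_disable(struct elpg_sim *s)
{
	return s->disable_result;
}

int sim_elpg_enable(struct elpg_sim *s)
{
	s->enable_calls++;
	return 0;
}

int elpg_protected_call_sketch(struct elpg_sim *s)
{
	int status = sim_elpg_disable(s);

	if (status != 0) {
		/*
		 * Re-enable, but do not store the result: overwriting
		 * status with 0 here would let the call below run with
		 * ELPG still enabled and re-enable ELPG a second time.
		 */
		(void)sim_elpg_enable(s);
	}

	if (status == 0) {
		s->protected_calls++;	/* stands in for the protected call */
		(void)sim_elpg_enable(s);
	}

	return status;
}
```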
Bug 200565050
Change-Id: I7cf7a0887ae59e85bf0c56c38aaaadfefd16cc1c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2541859
(cherry picked from commit 4b3591aafb)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2543030
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement nvgpu plumbing to allow reporting ECC errors (corrected
and uncorrected) to an L1SS service (if one exists).
This patch includes the following:
1) Add code that submits ECC error reports from interrupt context
directly to an L1SS service in Linux.
2) Add support for enabling/disabling the error reports via L1SS's
registration/deregistration API. nvgpu simply invokes an empty function
until the registration is successful.
3) Add a spinlock to correctly handle concurrent access to the
ops used for submitting requests.
4) Add error reporting for a subset of interrupts that can be verified
via external ECC injection logic. A subsequent patch will add the
API for the rest of the interrupts.
5) In case of critical (uncorrected) errors, change nvgpu's state to
the quiesce state.
Jira L4T-1187
Bug 200700400
Change-Id: Id31f70531fba355e94e72c4f9762593e7667a11c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2530411
Tested-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
For Linux, limit the use of the cache to entries smaller than the page size, to
avoid potential problems with running out of CMA memory when allocating large,
contiguous slabs, as would be required for non-iommuable chips.
Also, in nvgpu_pd_cache_do_free(), zero out entries only if the iommu is in use
and PTEs use the cache (since it's the prefetch of invalid PTEs by
the iommu that needs to be avoided).
Bug 3093183
Bug 3100907
Change-Id: I363031db32e11bc705810a7e87fc9e9ac1dc00bd
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2422039
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Large buffers being mapped to GMMU end up needing many
pages for the PTE tables. Allocating these pages one
by one can end up being a performance bottleneck, particularly
in the virtualized case.
Add support for page-sized PTEs to the existing PD cache:
- define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab
for the PD cache, effectively set to 64K bytes
- Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE
- When freeing up cached entries, avoid prefetch errors by
invalidating the entry (memset to 0)
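The size threshold and the invalidate-on-free step can be sketched as follows. The macro and function names are hypothetical, and the slab is modelled as a plain byte buffer rather than GPU-visible memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for NVGPU_PD_CACHE_SIZE, which the text puts at 64K bytes. */
#define PD_CACHE_SIZE_SKETCH ((size_t)64 * 1024)

/* Allocations strictly smaller than the slab size come from the cache. */
bool pd_uses_cache(size_t bytes)
{
	return bytes < PD_CACHE_SIZE_SKETCH;
}

/*
 * On free, zero the entry so that stale PTEs in a recycled slot cannot
 * be prefetched later.
 */
void pd_cache_free_entry_sketch(unsigned char *slab, size_t offset,
				size_t bytes)
{
	memset(slab + offset, 0, bytes);
}
```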
Bug 3093183
Bug 3100907
Change-Id: I2302a1dfeb056b9461159121bbae1be70524a357
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2401783
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The change below added a capability check in the ioctl. However, nvgpu
still advertises support for RESCHEDULE_RUNLIST to all processes even
though the ioctl fails for non-realtime processes.
Clear the ioctl flag for RESCHEDULE_RUNLIST for non-realtime processes.
commit 838ba0a14d ("gpu: nvgpu: check capability for reschedule runlist submit flag")
Author: David Li <davli@nvidia.com>
Date: Tue Sep 12 18:37:00 2017 -0700
NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST is only used by realtime
priority EGL context, which checks for CAP_SYS_NICE during context
creation in userspace, so it wasn't secure against unprivileged program
spoofing submit ioctl with this flag to stall GPU progress of others.
This flag does increase duration of submit by approx 16us,
mostly due to register accesses and PMU FIFO mutex.
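The flag masking can be sketched as below. The flag's bit position and the function name are made up for illustration, and how a process is classified as realtime is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real flag value is defined by nvgpu. */
#define SKETCH_FLAG_RESCHEDULE_RUNLIST (UINT64_C(1) << 0)

/*
 * Return the capability flags to advertise to a process. Non-realtime
 * processes never see RESCHEDULE_RUNLIST, matching what the submit
 * ioctl would actually allow them to do.
 */
uint64_t advertised_flags_sketch(uint64_t supported, bool is_realtime)
{
	if (!is_realtime)
		supported &= ~SKETCH_FLAG_RESCHEDULE_RUNLIST;
	return supported;
}
```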
Bug 2823941
Change-Id: Iecee3989e5af035264b1ed5c1aa9a8576dd90883
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372957
(cherry picked from commit 864213ae55b009b0a026ac380b26276332f79177)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392714
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
On Volta, nvgpu needs to wait for an explicit ACK from CTXSW while
setting the FECS watchdog timeout.
This is a manual port of the fixes 4d7e5026e38528b88a4a168eca9a8b180475b368
and ad89436b03428a42e43042b6a849c15843fdebc4 on dev-main, since a clean
cherry-pick is not possible due to huge file and structure differences.
Bug 200603566
Change-Id: Icba69998ab45eee5fdf2a29e1ac1067589301be6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2371708
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>