Add g->fifo_eng_timeout_us to define engine timeout in microseconds.
It is initialized with GRFIFO_TIMEOUT_CHECK_PERIOD_US. In RM server
case, it can be overriden with value defined in device tree.
Jira EVLR-2674
Change-Id: I69ac2ce779fe575566c8ba48e8cd2d0e6b2d93cf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1728391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
CAU (Counter Aggregation Unit) registers might be split out from SMPC registers
and moved into their own list on some platforms
In gr_gk20a_init_ctx_vars_fw() add support to check if pm_cau list is available
If list is available, count will be set to non-zero here
In add_ctxsw_buffer_map_entries_gpcs(), parse the pm_cau list if count is
non-zero
Bug 2139870
Change-Id: Ia630e7d03481a6f927c6739d28ebfe49f221326f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1733208
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Matthew Braun (SW-GPU) <matthewb@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add gk20a_fifo_profile_snapshot() to store the submit time in a
profiling entry that was acquired from gk20a_fifo_profile_acquire().
Also get rid of ifdef CONFIG_DEBUG_FS by stubbing the acquire and free
functions when debugfs is not enabled. This reduces some cyclomatic
complexity in the submit path.
Jira NVGPU-708
Change-Id: I39829a6475cfe3aa582620219e420bde62228e52
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1729545
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Integrity already typedefs these and complains if you override them
even with the same underlying type.
Since we only use these in the regops_gk20a.h header file (outside of
the Linux specific code, that is) this patch just changes the __uXX to
uXX. With that we can delete the now unnecessary __uXX defs.
JIRA NVGPU-525
Change-Id: I01dd2723b68db2170449342f73c711ee5a589adb
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1721186
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below two new HALs
gops.fb.enable_hub_intr() to enable hub interrupts
gops.fb.disable_hub_intr() to disable hub interrupts
Set existing APIs gv11b_fb_enable/disable_hub_intr() to these HALs
Call the HALs everywhere instead of calling the APIs directly
Jira NVGPUT-44
Change-Id: Id299c6d228733ed365a71be6b180186776cc1306
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1725977
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The forced PRAMIN reads and writes for sysmem buffers haven't worked in
a while since the PRAMIN access code was refactored to work with
vidmem-only sgt allocs. This feature was only ever meant for testing and
debugging PRAMIN access and early dGPU support, but that is stable
enough now so just delete the broken feature instead of fixing it.
Change-Id: Ib31dae4550f3b6fea3c426a2e4ad126864bf85d2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1723725
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
As part of the MISRA fixes, moving all the
gating_reglist files to common/clock_gating dir,
the new directory structure suggested to follow.
Removed unused gating_reglist files for gk20a
JIRA NVGPU-646
Change-Id: I388855befcf991ee68eeffed10fe9ac456210649
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1722330
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-timeouts will be enabled only when timeouts_disabled_refcount
will reach 0
-timeouts_enabled debugfs will change from u32 type to file type
to avoid race enabling/disabling timeout from debugfs and ioctl
-unify setting timeouts_enabled from debugfs and ioctl
Bug 1982434
Change-Id: I54bab778f1ae533872146dfb8d80deafd2a685c7
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1588690
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Code related to MC module is updated for handling
MISRA violations
Rule 10.1: Operands shalln't be an inappropriate
essential type.
Rule 10.3: Value of expression shalln't be assigned
to an object with a narrow essential type.
Rule 10.4: Both operands in an operator shall have
the same essential type.
Rule 14.4: Controlling if statement shall have
essentially Boolean type.
Rule 15.6: Enclose if() sequences with braces.
JIRA NVGPU-646
JIRA NVGPU-659
JIRA NVGPU-671
Change-Id: Ia7ada40068eab5c164b8bad99bf8103b37a2fbc9
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1720926
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below new HALs for bios operations
gops.bios.devinit()
gops.bios.preos()
gops.bios.verify_devinit()
Export existing APIs gp106_bios_devinit() and gp106_bios_preos() and set them
to above HALs on gp106 and gv100
And call new HALs from gp106_bios_init() if supported instead of directly
calling APIs
Jira NVGPUT-48
Change-Id: Ic89f1c86cf6e3e0785b3663fe733b201d6f2f773
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1708382
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST ioctl to reschedule runlist,
and optionally check host and FECS status to preempt pending load of
context not belonging to the calling channel on GR engine during context
switch.
This should be called immediately after a submit to decrease worst case
submit to start latency for high interleave channel.
There is less than 0.002% chance that the ioctl blocks up to couple
miliseconds due to race condition of FECS status changing while being read.
For GV11B it will always preempt pending load of unwanted context since
there is no chance that ioctl blocks due to race condition.
Also fix bug with host reschedule for multiple runlists which needs to
write both runlist registers.
Bug 1987640
Bug 1924808
Change-Id: I0b7e2f91bd18b0b20928e5a3311b9426b1bf1848
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1549050
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Release runlist_lock and then initiate recovery if preempt
timed out. Also do not issue preempt if ch, tsg or runlist
id is invalid. tsgid could be invalid for below call trace
gk20a_prepare_poweroff->gk20a_channel_suspend->
*_fifo_preempt_channel->*_fifo_preempt_tsg
Bug 2065990
Bug 2043838
Change-Id: Ia1e3c134f06743e1258254a4a6f7256831706185
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1662656
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below new HALs
gops.fifo.add_sema_cmd() to insert HOST semaphore acquire/release methods
gops.fifo.get_sema_wait_cmd_size() to get size of acquire command buffer
gops.fifo.get_sema_incr_cmd_size() to get size of release command buffer
Separate out new API gk20a_fifo_add_sema_cmd() to implement semaphore acquire/
release sequence and set it to gops.fifo.add_sema_cmd()
Add gk20a_fifo_get_sema_wait_cmd_size() and gk20a_fifo_get_sema_incr_cmd_size()
to return respective command buffer sizes
Jira NVGPUT-16
Change-Id: Ia81a50921a6a56ebc237f2f90b137268aaa2d749
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1704490
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Added vf change inject support for gv10x
- Updated clk_pmu_vf_inject() to fill required data
for pascal or volta vf change inject support
- Added new ctrl clk interface for gv10x clk domain list
- Added pmu interface for gv10x clk domain list &
vf change inject request
- Modified clk cmd, msg & RPC id's to match
with chips_a_23609936 branch
Bug 200399373
Change-Id: Ib9dc10073386f63bdfd92110c7ec3e09b1c484ce
Signed-off-by: Vaikundanathan S <vaikuns@nvidia.com>
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1700746
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
sync_gk20a.* files are no longer used by core code and only invoked
from linux specific implementations of the OS_FENCE framework which are
under the common/linux directory. Hence, sync_gk20a.* files are also
moved under common/linux.
JIRA NVGPU-66
Change-Id: If623524611373d2da39b63cfb3c1e40089bf8d22
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1712900
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently gpu sysfs retrieves Fmax@Vmin by direct call into Tegra DVFS
driver that introduces compile time dependencies on CONFIG_TEGRA_DVFS.
In addition incorrect clock is used for DVFS information access.
Re-factored sysfs node to use generic GPU clock operation for Fmax@Vmin
read. This would fix a bug in target clock selection, and allows to
remove dependency of sysfs on CONFIG_TEGRA_DVFS.
Updated nvgpu_linux_get_fmax_at_vmin_safe operation itself so it can be
called on platforms that does not support Tegra DVFS, although 0 will
still be returned as Fmax@Vmin on such platforms.
Bug 2045903
Change-Id: I32cce25320df026288c82458c913b0cde9ad4f72
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1710924
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Read GP_GET and GET from hardware every time when the poll timer expires
instead of when the watchdog timer expires. Restart the watchdog timer
if the get pointers have increased since the previous read. This way
stuck channels are detected quicker.
Previously it could have taken at most twice the watchdog timeout limit
for a stuck channel to get recovered; with this change, a coarse sliding
window is used. The polling period is still 100 ms.
The difference is illustrated in the following diagram:
time 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
get a b b b b b b b b b b b b b b b b b b b b
prev - n n n n n n n n n A n n n n n n n n n S
next - A s s s s s s s s s S
"time" represents wall time in polling units; 0 is submit time. For
simplicity, watchdog timeout is ten units. "get" is the GP_GET that
advances a little from a and then gets stuck at b. "prev" is the
previous behaviour, "next" is after this patch. "A" is when the channel
is detected as advanced, and "S" when it's found stuck and recovered;
small "s" is when it's found stuck but when the time limit has not yet
expired and "n" is when the hw state is not read.
Bug 1700277
Bug 1982826
Change-Id: Ie2921920d5396cee652729c6a7162b740d7a1f06
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1710554
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replaced all instances of sync_fence in gk20a_fence* code with
nvgpu_os_fence. Added the API install_fence for the nvgpu_os_fence
abstraction. sync_fence mechanism and its dependencies are completely
removed from the fence_gk20a methods. Due to the recent os_fence changes
and the changes to fence_gk20a, we can finally get rid of all the CONFIG_SYNCS
present in the submit path.
JIRA NVGPU-66
Change-Id: I3551dab04b93b1e94db83fc102a41872be89e9ed
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1701245
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch constructs an abstraction to hide the sync_fence
functionality from the common code. struct nvgpu_os_fence acts as an
abstraction for struct sync_fence.
struct nvgpu_os_fence consists of an ops structure named nvgpu_os_fence_ops
which contains an API to do pushbuffer programming to generate wait
commands for the fence.
The current implementation of nvgpu only allows for wait method on a
sync_fence which was generated using a similar backend(i.e. either
Nvhost Syncpoints or Semaphores). In this patch, a
generic API is introduced which will decide the type of the underlying
implementation of the struct nvgpu_os_fence at runtime and run the
corresponding wait implementation on it.
This patch changes the channel_sync_gk20a's semaphore specific
implementation to use the abstract API. A subsequent patch will make
the changes for the nvhost_syncpoint based implementations as well.
JIRA NVGPU-66
Change-Id: If6675bfde5885c3d15d2ca380bb6c7c0e240e734
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1667218
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We so far do not support unbinding a channel from TSG explicitly through IOCTL
Channel is unbound from TSG when channel fd is closed
But this could create issue in case process is forked and fd is duplicated
In worst case it is possible that duplicate fd is closed sometime later when
TSG is active and that could lead to corruption of TSG state
Fix this by implementing NVGPU_TSG_IOCTL_UNBIND_CHANNEL API
This API will simply call gk20a_tsg_unbind_channel() API to unbind the
channel
Note that we also mark the channel has_timedout since unbound channel does
not have any context of its own and it cannot serve any job
Remove call to gk20a_disable_channel() while closing the channel.
We do not support bare channels without TSG, and if channel is not part of TSG
then it means channel is already unbound from TSG with call to
NVGPU_TSG_IOCTL_UNBIND_CHANNEL, and there is nothing to do while closing the
channel
Bug 200327095
Jira NVGPU-229
Change-Id: Ib7f6710f0dc8125caafabe7bee737622c3dd9fa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1708902
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Switch all logging to nvgpu_log*(). gk20a_dbg* macros are
intentionally left there because of use from other repositories.
Because the new functions do not work without a pointer to struct
gk20a, and piping it just for logging is excessive, some log messages
are deleted.
Change-Id: I00e22e75fe4596a330bb0282ab4774b3639ee31e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1704148
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
segregated os-agnostic function from linux/sim.c and linux/sim_pci.c
to sim.c and sim_pci.c, while retaining os-specific functions.
renamed all gk20a_* api's to nvgpu_*.
renamed hw_sim_gk20a.h to nvgpu/hw_sim.h
moved hw_sim_pci.h to nvgpu/hw_sim_pci.h
JIRA VQRM-2368
Change-Id: I040a6b12b19111a0b99280245808ea2b0f344cdd
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1702425
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PMU ucode is updated to include LDIV slowdown factor in gr_init_param command.
- Defined a new version gr_init_param_v2.
- Updated the PMU FW version code.
- Set the LDIV slowdown factor to 0x1e by default.
- Added sysfs entry to program ldiv_slowdown factor at runtime.
Bug 200391931
Change-Id: Ic66049588c3b20e934faff3f29283f66c30303e4
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1674208
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new HAL gops.mc.isr_nonstall() to handle nonstall interrupts
We already handle nonstall interrupts in nvgpu_intr_nonstall()
But this API is completely in linux specific code
Separate out os-independent code to handle nonstall interrupts in new API
mc_gk20a_isr_nonstall() and set it to HAL gops.mc.isr_nonstall() for all
existing chips
Call this HAL from nvgpu_intr_nonstall()
Jira NVGPUT-8
Change-Id: Iec6a56db03158a72a256f7eee8989a0a8a42ae2f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1706589
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement a worker thread to replace the workqueue based
handling for clk arbiter update callbacks.
The work items scheduled with the thread are of two types,
update_vf_table and arb_update. Previously, there were two
workqueues for handling these two work struct's separately.
Now the single worker thread would process these two events.
If a work item of a particular type is scheduled to be run
on the worker, another instance of same type won't be
scheduled, which mirrors the linux workqueue behavior.
This also removes dependency on linux workqueues/work struct
and makes code portable to be used by QNX also.
Jira VQRM-3741
Change-Id: Ic27ce718c62c7d7c3f8820fbd1db386a159e28f2
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1706032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Mostly just including necessary includes to make sure that
global function declarations actually match their implementations.
Also work around pointer munging warning:
/build/ddpx/linux/kernel/nvgpu/drivers/gpu/nvgpu/common/pmu/pmu.c: In function 'nvgpu_pmu_process_init_msg':
/build/ddpx/linux/kernel/nvgpu/drivers/gpu/nvgpu/common/pmu/pmu.c:348:4: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
(*(u32 *)gid_data.signature == PMU_SHA1_GID_SIGNATURE);
Work around this warning by simply moving the type punning.
This code is certainly dangerous - it assumes the endianness
of the header data is the same as the machine this code is
running on. Apparently it works, though, so this ignores
the warning.
JIRA NVGPU-525
Change-Id: Id704bae7805440bebfad51c8c8365e6d2b7a39eb
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1692454
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
clk_vin data structures updated as new calibration type (v20) is added.
GP106 header does not have vin calibration type.
Assuming V10 if calibration type is not V20.
Add fuse calibration for V20 type.
Bug 200399373
Change-Id: I9449de1ecb0d0873f3bc16f46660f93fab5b9eac
Signed-off-by: Vaikundanathan S <vaikuns@nvidia.com>
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1687591
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below rc_types to be passed to gk20a_fifo_recover
MMU_FAULT
PBDMA_FAULT
GR_FAULT
PREEMPT_TIMEOUT
CTXSW_TIMEOUT
RUNLIST_UPDATE_TIMEOUT
FORCE_RESET
SCHED_ERR
This is nice to have to know what triggered recovery.
Bug 2065990
Change-Id: I202268c5f237be2180b438e8ba027fce684967b6
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1662619
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The return value from NVGPU_COND_WAIT_INTERRUPTIBLE in
channel poll worker is wrongly compared with 0 and the
boolean result is assigned to ret value used subsequently.
Instead, the direct return value from
NVGPU_COND_WAIT_INTERRUPTIBLE should be used.
This bug seems remnant of the following patch which moved
the handling from 'wait_event_timeout' to 'NVGPU_COND_WAIT'.
commit 301965fb77
gpu: nvgpu: Use nvgpu_cond in channel worker
Change-Id: Id48e197756a6855b35a9ee0dc26d014b62ed3860
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1705976
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below two new HALs
gops.fifo.runlist_hw_submit() to submit a new runlist to hardware
gops.fifo.runlist_wait_pending() to wait until runlist write is successful
Set existing API gk20a_fifo_runlist_wait_pending() to
gops.fifo.runlist_wait_pending HAL
Add new API gk20a_fifo_runlist_hw_submit() which submits the runlist to h/w
and set it to gops.fifo.runlist_hw_submit HAL
Jira NVGPUT-20
Change-Id: Ic23f7d947e30883aca0b536de818e79e14733195
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1700548
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Cache the rate used in clk_set_rate().
Return that cached rate on clk_get_rate(), don't read from hardware.
This cached rate is used to avoid duplicate requests to clk_set_rate().
Motivation is to support multiple governors for gpu clk.
Reading clock from hardware is unreliable in multi-governor situation.
Relying on hardware clock value could mislead the kernel gpu governor
in its scaling calculations.
Bug 2051688
Change-Id: I43fc056eea6f69fe0889c45640fcb892b658071c
Signed-off-by: Arun Kannan <akannan@nvidia.com>
(cherry picked from commit 7f819a9ba7)
Reviewed-on: https://git-master.nvidia.com/r/1662759
Reviewed-on: https://git-master.nvidia.com/r/1668919
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch deals with cleanups meant to make things simpler for the
upcoming os abstraction patches for the sync framework. This patch
causes some substantial changes which are listed out as follows.
1) sync_timeline is moved out of gk20a_fence into struct
nvgpu_channel_linux. New function pointers are created to facilitate os
independent methods for enabling/disabling timeline and are now named
as os_fence_framework. These function pointers are located in the struct
os_channel under struct gk20a.
2) construction of the channel_sync require nvgpu_finalize_poweron_linux()
to be invoked before invocations to nvgpu_init_mm_ce_context(). Hence,
these methods are now moved away from gk20a_finalize_poweron() and
invoked after nvgpu_finalize_poweron_linux().
3) sync_fence creation is now delinked from fence construction and move
to the channel_sync_gk20a's channel_incr methods. These sync_fences are
mainly associated with post_fences.
4) In case userspace requires the sync_fences to be constructed, we
try to obtain an fd before the gk20a_channel_submit_gpfifo() instead of
trying to do that later. This is used to avoid potential after effects
of duplicate work submission due to failure to obtain an unused fd.
JIRA NVGPU-66
Change-Id: I42a3e4e2e692a113b1b36d2b48ab107ae4444dfa
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1678400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>