In gk20a_pm_shutdown(), we do not check the return value of
gk20a_pm_prepare_poweroff().
In some cases gk20a_pm_prepare_poweroff() can return -EBUSY (for
example, if engines are busy), in which case we do not clean up the
s/w state but still trigger GPU railgate directly.
If an interrupt fires at the same time, we may try to access a
register while the GPU is already railgated, which leads to a hard
hang in the nvgpu shutdown path.
Make the following changes in the shutdown sequence to fix this:
- check return value of gk20a_wait_for_idle()
- disable activity on all engines with
gk20a_fifo_disable_all_engine_activity()
- ensure engines are idle with gk20a_fifo_wait_engine_idle()
- check return value of gk20a_pm_prepare_poweroff()
- check return value of gk20a_pm_railgate()
Add a print when we bail out early because the GPU is already
railgated.
Also, skip the shutdown sequence for VGPU since we don't need to
clean up the virtual GPU, and the RM server will take care of
cleaning up h/w resources.
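A rough sketch of the resulting ordering (the engine/idle/poweroff
calls are the functions named above; get_gk20a(), gk20a_gpu_is_virtual()
and the railgate-status check are assumptions about surrounding helpers,
and the exact signatures may differ):

  static void gk20a_pm_shutdown_sketch(struct platform_device *pdev)
  {
      struct device *dev = &pdev->dev;
      struct gk20a *g = get_gk20a(dev);          /* assumed helper */
      int err;

      /* Nothing to tear down for vgpu; the RM server owns h/w cleanup. */
      if (gk20a_gpu_is_virtual(dev))             /* assumed helper */
          return;

      /* Bail out early, with a print, if already railgated. */
      if (gk20a_platform_is_railgated(dev)) {    /* assumed helper */
          dev_info(dev, "already railgated, skipping shutdown\n");
          return;
      }

      err = gk20a_wait_for_idle(dev);
      if (err)
          goto fail;
      err = gk20a_fifo_disable_all_engine_activity(g, true);
      if (err)
          goto fail;
      err = gk20a_fifo_wait_engine_idle(g);
      if (err)
          goto fail;
      err = gk20a_pm_prepare_poweroff(dev);
      if (err)
          goto fail;
      err = gk20a_pm_railgate(dev);
      if (err)
          goto fail;
      return;
  fail:
      dev_warn(dev, "%s: failed, err=%d\n", __func__, err);
  }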
Bug 200281010
Change-Id: I2856f9be6cd2de9b0d3ae12955cb1f0a2b6c29be
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1454658
(cherry picked from commit 6de456f840)
Reviewed-on: http://git-master/r/1461150
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
For dGPU, the instance block is in vidmem, and context_ptr
was not computed properly, leading to pid=0 being reported in
FECS traces.
Use gk20a_mm_inst_block_addr(), which handles all cases, to
determine the instance block physical address.
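As a small illustration (the channel field and the 4 KB shift are
assumptions about how context_ptr is encoded, not the literal patch):

  /* Works for both sysmem and vidmem instance blocks. */
  u64 inst_pa = gk20a_mm_inst_block_addr(g, &ch->inst_block);
  u32 context_ptr = (u32)(inst_pa >> 12);   /* assumed 4 KB-aligned pointer */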
Bug 1899195
Change-Id: If003d9f00aff66d808e66c06baf6ded38699981a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1461646
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This register is context_reset, and the value is context-switched with
GPU contexts.
Bug 1855307
Bug 1871606
Change-Id: I4f871eaf94b4bb589ae179fc0ea972943acd0881
Signed-off-by: Jon McCaffrey <jmccaffrey@nvidia.com>
Reviewed-on: http://git-master/r/1452947
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Donghan Ryu <dryu@nvidia.com>
(cherry picked from commit f02591a23ab575d8592d3b2bebf94163c1323f17)
Reviewed-on: http://git-master/r/1459948
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Getting a timeout on the kernel's own CE channels is unrecoverable.
Vidmem freeing also depends on CE to clear pages that have been
used so that they can be reused.
Disable the watchdog on the kernel's CE channels.
Bug 200287270
Change-Id: I87e0aa925d6d20485a5a19d2a6bfd050de34e968
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1454208
(cherry picked from commit 2ff3a9f374)
Reviewed-on: http://git-master/r/1457402
GVS: Gerrit_Virtual_Submit
The nvgpu timer API prints a message when the timer expires, but
expiration alone does not necessarily mean here that the job has
actually timed out; that is tested by comparing gp_get. Change the
expiration check to a plain peek instead of the default, which prints
to the log on expiration.
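A hedged sketch of the difference (helper names follow the nvgpu timer
API as described above; exact flags and signatures may differ):

  struct nvgpu_timeout timeout;
  bool expired;

  nvgpu_timeout_init(g, &timeout, 5000 /* ms, illustrative */,
                     NVGPU_TIMER_CPU_TIMER);

  /* Default check: logs a message naming the caller on expiration. */
  expired = nvgpu_timeout_expired(&timeout);

  /* Quiet check: only reports whether the timer has expired; no log.
   * Used here because expiration alone does not prove the job timed
   * out -- that is decided by comparing gp_get. */
  expired = nvgpu_timeout_peek_expired(&timeout);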
Bug 1887569
Bug 200291842
Jira NVGPU-21
Change-Id: Ifde34cff701eaed2f3ea727dba3ec8affeef26b9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1329731
(cherry picked from commit 580c8112f0)
Reviewed-on: http://git-master/r/1452969
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
On unbind we need to check that interrupts are complete before tearing
down the interrupt threads, but on vgpu those structures are not
initialized as they are managed by the server.
This change makes sure we do not try to free those resources on vgpu
shutdown.
Bug 200293510
JIRA: EASS-1753
Change-Id: I77cb8594e1ad2c53f632e18b0dfc88f784a815e4
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
(cherry picked from commit 1a640fa6a3b41c3de7d63e14ee6770679e2c82af)
Reviewed-on: http://git-master/r/1330770
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Jinyoung Park <jinyoungp@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
This is a quick fix. Eventually, the common probe code should be moved
into a common function shared between vgpu and native gpu.
Bug 200293437
Jira EVLR-1152
Change-Id: I55f0d179d7adba556e0cb404766e14405b3e27e5
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1330229
(cherry picked from commit 7691902fec8abdd621ee17561607efeef615499f)
Reviewed-on: http://git-master/r/1331606
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
In gk20a_suspend_all_sms(), we currently loop
over all GPCs and then loop over all TPCs in the inner
loop.
But this is incorrect and leads to SMs with
invalid GPC/TPC ids.
Fix this by looping over the number of TPCs present in each
GPC in the inner loop (see the sketch below).
Also, fix gk20a_gr_wait_for_sm_lock_down() as follows:
- we currently wait indefinitely for the SM to lock down
- restrict this wait with a timeout on silicon platforms
- return -ETIMEDOUT instead of -EAGAIN
- add more debug prints with additional data
  for SM lock-down failures
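A sketch of the corrected loop nesting (gr->gpc_count and
gr->gpc_tpc_count[] are the fields assumed to carry the per-GPC TPC
counts; the inner call is a placeholder for the actual SM suspend logic):

  struct gr_gk20a *gr = &g->gr;
  u32 gpc, tpc;

  for (gpc = 0; gpc < gr->gpc_count; gpc++) {
      /* Iterate only the TPCs that exist in this GPC, not the
       * maximum TPC count across the chip. */
      for (tpc = 0; tpc < gr->gpc_tpc_count[gpc]; tpc++)
          suspend_single_sm(g, gpc, tpc);   /* placeholder */
  }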
Bug 200258704
Change-Id: Id6fe32e579647fd8ac287a4b2ec80cbf98791e0d
Signed-off-by: Cory Perry <cperry@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1316471
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-on: http://git-master/r/1322373
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This change refactors the teardown in remove to ensure that it is
possible to unload the driver while leaving fds open. This is achieved
by making sure that the SW state is kept alive until all fds are closed
and by checking that subsequent ioctl calls after the teardown fail.
Normally this would be achieved by calls into gk20a_busy(), but in
kickoff we don't call into that to reduce latency, so we need to check
the driver status directly, and also in some of the functions,
as we need to make sure the ioctl does not dereference the device or
platform struct.
bug 200277762
JIRA: EVLR-1023
Change-Id: I163e47a08c29d4d5b3ab79f0eb531ef234f40bde
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320219
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Shreshtha Sahu <ssahu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
(cherry picked from commit e0f2afe5eb)
Reviewed-on: http://git-master/r/1327755
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
After driver remove, the device structure passed to gk20a_busy() can be
invalid. To solve this, the prototype of the function is modified to take
the gk20a struct instead of the device pointer.
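Roughly, the declarations change as follows (the exact prototypes in
gk20a.h may differ slightly):

  /* Before (could dereference a stale device after remove):
   *     int gk20a_busy(struct device *dev);
   */

  /* After: operate on the refcounted gk20a struct directly. */
  int gk20a_busy(struct gk20a *g);
  void gk20a_idle(struct gk20a *g);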
bug 200277762
JIRA: EVLR-1023
Change-Id: I08eb74bd3578834d45115098ed9936ebbb436fdf
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320194
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 2a502bdd5f)
Reviewed-on: http://git-master/r/1327754
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
In the past, we had separate calls for platform and channel busy, but
those got removed. The result is that in the debugger code we
essentially have a double busy call in the powergating enable/disable
path. This change removes the duplicate.
bug 200277762
JIRA: EVLR-1023
Change-Id: Iba70b81700f27b847e1d0222fb69ed1a7a883342
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1323220
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-on: http://git-master/r/1327753
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
The fifo interrupt path was reading the PBDMA interrupt status
after clearing interrupts and this could lead to a situation in
which the host may have advanced to another channel, leading to
the recovery code resetting the wrong channel.
Bug 200278729
JIRA: EVLR-1036
Change-Id: I392423d1eaa8d23acf88454bf113c015e649e13d
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1326461
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit ab401c7068)
Reviewed-on: http://git-master/r/1327292
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
Support for hwpm reservations in the virtual case:
- Add session ops for checking and setting global and context reservations, and
releasing reservations
- in the native case, these just update reservation counts and flags
- in the vgpu case, when the reservation count is 0, check with the RM server
that a reservation is possible: for global reservations, no other guest
can have a reservation; for context reservations, no other guest can have
a global reservation
- in the vgpu case, when the reservation count is decremented to 0, notify
the RM server that the guest no longer has any reservations
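A hedged sketch of the count-based protocol (the field and the vgpu_*
helpers below are placeholders standing in for the actual session ops
and TEGRA_VGPU commands):

  /* Acquire: only ask the RM server when going from 0 to 1. */
  static int acquire_hwpm_reservation(struct gk20a *g, bool global)
  {
      if (g->profiler_reservation_count == 0 &&
          !vgpu_reservation_allowed(g, global))    /* placeholder */
          return -EBUSY;
      g->profiler_reservation_count++;
      return 0;
  }

  /* Release: notify the RM server when dropping back to 0. */
  static void release_hwpm_reservation(struct gk20a *g)
  {
      if (--g->profiler_reservation_count == 0)
          vgpu_reservation_released(g);            /* placeholder */
  }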
Bug 1775465
JIRA VFND-3428
Change-Id: Idf115b730e465e35d0745c96a8f8ab6b645c7cae
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1326166
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The main driver structure is not refcounted properly,
so when the driver unloads, file descriptors associated with the
driver are kept open with dangling references to the main object.
This change adds reference counting to the gk20a structure.
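A minimal sketch of the refcounting pattern (gk20a_get()/gk20a_put()
are the obvious names for such helpers; the refcount field and the free
callback are assumptions):

  #include <linux/kref.h>

  struct gk20a *gk20a_get(struct gk20a *g)
  {
      /* Fails once the last reference is gone, e.g. after remove. */
      if (!kref_get_unless_zero(&g->refcount))
          return NULL;
      return g;
  }

  void gk20a_put(struct gk20a *g)
  {
      /* Frees the gk20a struct when the final reference drops. */
      kref_put(&g->refcount, gk20a_free_cb);   /* assumed callback */
  }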
bug 200277762
JIRA: EVLR-1023
Change-Id: Id892e9e1677a344789e99bf649088c076f0bf8de
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1317420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 74fe1caa2b)
Reviewed-on: http://git-master/r/1324637
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
The driver does not properly tear down the arbiter on PCI driver
unload. This change makes sure that the workqueues are drained before
tearing down the driver.
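A minimal sketch of the idea using standard kernel primitives (the
arbiter's actual work item and workqueue names are assumptions):

  /* Make sure no arbiter work is still queued or running before the
   * structures it touches are freed. */
  cancel_work_sync(&arb->update_fn_work);   /* field name assumed */
  flush_workqueue(arb->update_wq);          /* name assumed */
  destroy_workqueue(arb->update_wq);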
bug 200277762
JIRA: EVLR-1023
Change-Id: If98fd00e27949ba1569dd26e2af02b75897231a7
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320147
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry picked from commit 469308beca)
Reviewed-on: http://git-master/r/1324636
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
We have a gk20a_busy() call in gk20a_buffer_convert_gpu_to_cde(),
and we call gk20a_busy() again in gk20a_submit_channel_gpfifo().
If gk20a_do_idle() is triggered in between these two calls,
this leads to a deadlock and results in an idle failure.
Hence, to avoid this, take the power refcount in a more fine-grained
way, i.e. in gk20a_cde_convert() instead of in
gk20a_buffer_convert_gpu_to_cde().
Keep gk20a_cde_execute_buffer() outside the gk20a_busy()/
gk20a_idle() pair since we take a power refcount in the submit path
anyway.
Add a correct error handling path in gk20a_cde_convert().
Bug 200287073
Change-Id: Iffea2d4c03f42b47dbf05e7fe8fe2994f9c6b37c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1324329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
In gk20a_cde_convert(), we hold cde_app.mutex for almost
everything, which is unnecessary.
This also creates a deadlock scenario when gk20a_do_idle()
is called.
In some cases it is possible that Thread 1
holds the lock and is trying to power on the GPU, while
Thread 2 is trying to power off the GPU and then grab
cde_app.mutex to clean up GPU state.
To fix this, grab the mutex only for gk20a_cde_get_context()
and then release it.
Bug 200287073
Change-Id: Ic4856604652d0d4024abdeb5c2f8f03910c601a5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1324328
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
JIRA: EVLR-1004
(*) Refactor the non-stalling interrupt path to execute the clear in the
top half, so in the dGPU case processing of stalling interrupts does not
block non-stalling ones.
(*) Use a worker thread to do semaphore wakeups and allow batching of
the non-stalling operations.
(*) Fix a bug where some GPUs would not properly track the completion
of interrupts, preventing safe driver unloads.
Change-Id: Icc90a3acba544c97ec6a9285ab235d337ab9eefa
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1312796
Reviewed-on: http://git-master/r/1320848
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
Currently, callbacks from the PM_QOS framework (for
thermal events) result in an RPC call to set the GPU frequency.
Since the governor will now be responsible for setting the desired
rate, the max PM_QOS callback will now cap the possible
GPU frequency with a new RPC call to the server. The server
is responsible for setting the final frequency
based on the cap and desired rates.
Jira VFND-3699
Change-Id: I806e309c40abc2f1381b6a23f2d898cfe26f9794
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1295543
(cherry picked from commit e81693c6e087f8f10a985be83715042fc590d6db)
Reviewed-on: http://git-master/r/1282467
(cherry picked from commit 7b4e0db647572e82a8d53e823c36b465781f4942)
Reviewed-on: http://git-master/r/1321836
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add devfreq governor support in order to allow frequency scaling
in the virtualization config. GPU clock frequency operations are
redirected to the server over RPC.
Bug 200237433
Change-Id: I1c8e565a4fff36d3456dc72ebb20795b7822650e
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1295542
(cherry picked from commit d5c956fc06697eda3829c67cb22987e538213b29)
Reviewed-on: http://git-master/r/1280968
(cherry picked from commit 25e2b3cf7cb5559a6849c0024d42c157564a9be2)
Reviewed-on: http://git-master/r/1321835
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_channel_clean_up_jobs(), move removal of the job from the
channel's job list to before the fences are cleaned up; this prevents
gk20a_channel_abort() from asynchronously trying to dereference an
already freed job.
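A hedged sketch of the reordering (the joblist/fence helpers shown
approximate the channel job-cleanup code; exact names may differ):

  /* Remove the job from the channel's job list first, under the job
   * list lock, so a concurrent gk20a_channel_abort() can no longer
   * reach it... */
  channel_gk20a_joblist_lock(c);
  channel_gk20a_joblist_delete(c, job);
  channel_gk20a_joblist_unlock(c);

  /* ...and only then release the fences backing the job. */
  gk20a_fence_put(job->pre_fence);
  gk20a_fence_put(job->post_fence);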
Bug 1844305
JIRA EVLR-849
Change-Id: I1ba05237aa74be1350007630bfa5eba9988f859a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
(cherry picked from commit 2a9ce58b1b318b95ecfcdf78462f918d090eab99)
Reviewed-on: http://git-master/r/1319026
(cherry picked from commit 990f070b0a363159ce1b21f936b7512f469018ca)
Reviewed-on: http://git-master/r/1322332
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Query the RM server to retrieve the gpu preemption policy
of the guest based on the pct configuration. If the guest is not
allowed to request wfi preemption mode, then set the context
with either gfxp or cta preemption mode only.
Jira VFND-3079
Jira VFND-3081
Change-Id: I60cbf121d6f0e2373568cf40b3dfdb4df76fe02d
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: http://git-master/r/1280903
(cherry picked from commit 16ee09bb59)
Reviewed-on: http://git-master/r/1321661
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add support for clearing single SM error state for CUDA debugger.
In addition to clearing local copy of SM error state,
vgpu_gr_clear_sm_error_state now sends a command to RM server
(TEGRA_VGPU_CMD_CLEAR_SM_ERROR_STATE), to clear global ESR and
warp ESR.
Bug 1791111
Change-Id: I3a1f0644787fd900ec59a0e7974037d46a603487
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1296311
(cherry picked from commit fd07e03c3d086f396e4d65575c576a4dd68c920a)
Reviewed-on: http://git-master/r/1299060
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Cory Perry <cperry@nvidia.com>
Tested-by: Cory Perry <cperry@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add the ability to suspend/resume contexts for a debug session
(NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS) in the virtualized
case:
- added hal function to resume contexts.
- added vgpu support for suspend contexts, i.e. build a list
of channel ids, and send TEGRA_VGPU_CMD_SUSPEND_CONTEXTS
- added vgpu support for resume contexts, i.e. build a list
of channel ids, and send TEGRA_VGPU_CMD_RESUME_CONTEXTS
Bug 1791111
Change-Id: Icc1c00d94a94dab6384ac263fb811c00fa4b07bf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1294761
(cherry picked from commit d17a38eda312ffa92ce92e5bafc30727a8b76c4e)
Reviewed-on: http://git-master/r/1299059
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Cory Perry <cperry@nvidia.com>
Tested-by: Cory Perry <cperry@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a debugfs interface to profile the kickoff ioctl.
It provides the probability distribution and breaks the time down
between: the full ioctl, the kickoff function, job tracking, and
pushbuffer copies.
JIRA: EVLR-1003
Change-Id: I9888b114c3fbced61b1cf134c79f7a8afce15f56
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1308997
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The Linux timer code prints an error message containing the caller when
a timeout happens. The caller is formatted as a function name instead of
a pointer, so don't print 0x in front of it.
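For illustration (a generic printk example rather than the exact nvgpu
message; caller_ip stands in for whatever the timer code records): the
kernel's %pS/%pF specifiers already resolve the address to a symbol
name, so a literal "0x" prefix would be misleading:

  /* Prints e.g. "nvgpu: timeout at some_function+0x5c/0x90" */
  pr_err("nvgpu: timeout at %pS\n", (void *)caller_ip);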
Bug 1799159
Change-Id: I4f86d885aba78ca20ba025a91c8c940d3c927d58
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1315749
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
1) Trace buffer allocation now calls the generic gmmu alloc/map,
so for dGPUs the buffer is allocated in vidmem and for iGPUs it is
allocated in sysmem.
2) Use the pmu surface mechanism to set up the trace buffer params, as
dGPU binaries follow the falcon memory structure to get mem surface
params.
3) Fix a minor coverity issue by removing an unnecessary overwrite
of the count variable in the trace print function.
JIRA DNVGPU-217
Coverity ID 2431386
Change-Id: I2ae49a4e0450481cde2a778447c270a796681dad
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1312404
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Handle pbus and priv stall interrupts first.
In general, critical interrupts should be
handled before any non-critical ones.
-Dump info enabled with gpu_dbg_intr if a priv_ring
interrupt is flagged by fmodel.
JIRA NVGPU-25
Change-Id: Iee767d8c9c933ceb57532c1b5a7fd7812daf1b6d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1311273
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
trace_printk() does an extra stringify operation before calling
do_trace_printk(). The string ends up unused. This has an impact
on the kernel even if we never end up using trace_printk().
Disable use of trace_printk() and introduce a Kconfig option for
re-enabling it.
Change-Id: I2e1014f5231f2089d7dc3cb2539e3eb5f4d58361
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1295298
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement kmem abstraction and tracking in nvgpu. The abstraction
helps move nvgpu's core code away from being Linux dependent and
allows kmem allocation tracking to be done for Linux and any other
OS supported by nvgpu.
Bug 1799159
Bug 1823380
Change-Id: Ieaae4ca1bbd1d4db4a1546616ab8b9fc53a4079d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1283828
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Change nvgpu_kalloc() to nvgpu_big_[mz]alloc(). This is necessary
since the natural free function name for this is nvgpu_kfree(), but
that conflicts with nvgpu_k[mz]alloc() (implemented in a subsequent
patch).
This API exists because not all allocation sizes can be determined
at compile time, and in some cases sizes may vary with the system
page size. Thus always using kmalloc() could lead to OOM errors due
to fragmentation, while always using vmalloc() is wasteful of memory
for small allocations. This API tries to alleviate those problems.
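The underlying idea, as a hedged sketch (the real nvgpu_big_malloc()/
nvgpu_big_free() live in the kmem layer and their exact threshold and
zeroing behavior may differ):

  #include <linux/mm.h>       /* is_vmalloc_addr() */
  #include <linux/slab.h>
  #include <linux/vmalloc.h>

  /* Small requests come from the slab allocator, large ones from
   * vmalloc, so neither fragmentation nor vmalloc overhead bites. */
  static inline void *big_malloc_sketch(size_t size)
  {
      if (size <= PAGE_SIZE)
          return kmalloc(size, GFP_KERNEL);
      return vmalloc(size);
  }

  static inline void big_free_sketch(void *p)
  {
      if (is_vmalloc_addr(p))
          vfree(p);
      else
          kfree(p);
  }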
Bug 1799159
Bug 1823380
Change-Id: I49ec5292ce13bcdecf112afbb4a0cfffeeb5ecfc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1283827
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove nvgpu_gpuid_t18x.h since this file is now visible. Migrate
the relevant definitions and defines into their expected places and
make the code use the real defines. Hiding t18x-specific definitions
is no longer necessary.
Bug 1799159
Change-Id: I47fa2392e46fdb7aacc70aeb0cc8c3f5ca0dc22f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1300976
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added the fecs trace characteristics flag to vgpu to indicate
platform support.
Also moved the fecs flag from ctxsw init to fecs init for native Linux.
This makes the flag init uniform across both platforms.
JIRA EVLR-993
Change-Id: Id62f31954cb5d7028ef01a199548f5f908b51eb0
Signed-off-by: Vishnu Reddy Mandalapu <vmandalapu@nvidia.com>
Reviewed-on: http://git-master/r/1305045
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement a worker thread to replace the delayed works in channel
watchdog and job cleanup. The watchdog runs by polling the channel states
periodically, and job cleanup is performed on channels that are appended
to a work queue consumed by the worker thread. Handling both of these
in the same thread makes it impossible for them to deadlock against each
other, as has previously happened.
The watchdog takes references to channels during checking and possibly
recovering channels. Jobs in the cleanup queue have an additional
reference taken which is released after the channel is processed. The
worker is woken up from periodic sleep when channels are added to the
queue.
Currently, the queue is only used for job cleanups, but it is extendable
for other per-channel works too. The worker can also process other
periodic actions dependent on channels.
Neither the semantics of timeout handling nor those of job cleanup are
significantly changed yet - this patch only serializes them into one
background thread.
Each job that needs cleanup is tracked and holds a reference to its
channel and a power reference, and timeouts can only be processed on
channels that are tracked, so the thread is always idle if the
system is about to be suspended; there is therefore currently no need to
explicitly suspend or stop it.
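A condensed sketch of the worker loop (the wait queue member and the
two processing helpers are placeholders; the real worker carries more
state and bookkeeping):

  static int channel_worker_sketch(void *arg)
  {
      struct gk20a *g = arg;
      unsigned long poll_interval = msecs_to_jiffies(100);

      while (!kthread_should_stop()) {
          /* Sleep until a channel is queued for job cleanup or the
           * watchdog polling period elapses. */
          wait_event_timeout(g->channel_worker_wq,          /* placeholder */
                             worker_has_pending_items(g),   /* placeholder */
                             poll_interval);

          process_queued_job_cleanups(g);   /* placeholder */
          poll_channel_watchdogs(g);        /* placeholder */
      }
      return 0;
  }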
Bug 1848834
Bug 1851689
Bug 1814773
Bug 200270332
Jira NVGPU-21
Change-Id: I355101802f50841ea9bd8042a017f91c931d2dc7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1297183
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
fifo_pbdma_status__size_1_v() and fifo_engine_status__size_1_v()
are not the same for all GPUs. Use the litter value to calculate the
chip-specific fifo_*_status__size_1_v() value.
JIRA GV11B-45
Change-Id: I3d3d45bf79d15e14739fcc18cb1ca987669d5c11
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1312688
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove the restriction that compute preemption mode can only be set for
channels created with the PASCAL_COMPUTE_A class, and allow it to be set
for the PASCAL_A class too.
Also print the compute preemption mode during channel close.
Bug 200284575
Change-Id: I2de3b3acda128e91caa2ab0fd341915ce6e6520b
Signed-off-by: Sandeep Shinde <sashinde@nvidia.com>
Reviewed-on: http://git-master/r/1313286
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Donghan Ryu <dryu@nvidia.com>
In gk20a_vm_map_duplicate_locked(), we always do a linear search
of the rb-tree to find a duplicate entry for the buffer.
In case NVGPU_AS_MAP_BUFFER_FLAGS_FIXED_OFFSET is set, we first
traverse the whole rb-tree linearly and then compare offset_align
with the address found in the rb-tree.
If the rb-tree is very large, this linear lookup takes up to
7 ms and causes huge delays.
Hence, in the NVGPU_AS_MAP_BUFFER_FLAGS_FIXED_OFFSET case, we can
use offset_align to perform a binary search on the rb-tree and then
verify that the dmabuf and kind match the node obtained from
the search (see the sketch below).
This saves a lot of time per lookup.
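A hedged sketch of the fast path (the lookup helpers and buffer fields
approximate the mapped-buffer rb-tree code in mm_gk20a.c; treat the
exact names as assumptions):

  struct mapped_buffer_node *buffer;

  if (flags & NVGPU_AS_MAP_BUFFER_FLAGS_FIXED_OFFSET) {
      /* O(log n): the tree is keyed by GPU VA, so look up the
       * requested fixed offset directly... */
      buffer = find_mapped_buffer_locked(&vm->mapped_buffers,
                                         offset_align);
      /* ...then confirm it really is the same buffer. */
      if (buffer && (buffer->dmabuf != dmabuf || buffer->kind != kind))
          buffer = NULL;
  } else {
      /* Old behavior: O(n) walk comparing dmabuf and kind. */
      buffer = find_mapped_buffer_reverse_locked(&vm->mapped_buffers,
                                                 dmabuf, kind);
  }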
Bug 1874516
Change-Id: Ia4924b64d66e586c14341ae2e2283beac394bf6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1309343
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove the MCLK and GPCCLK domain aliases, now that userspace
has switched to the new enumerations.
Jira DNVGPU-211
Change-Id: I2af2fd67dbed47088d7161ba0605e13dd7c674a5
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1292609
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>