Commit Graph

368 Commits

Author SHA1 Message Date
Deepak Nibade
56df8c5808 gpu: nvgpu: use new List APIs to free channels
Use new APIs from <nvgpu/list.h> to access free
channel list

Define channel_gk20a_from_free_chs() to convert
a list node to struct channel_gk20a

Jira NVGPU-13

Change-Id: Idaf58f04be1c7fc553bea7c8de45951bf82bb340
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1303025
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-31 11:34:25 -07:00
Terje Bergstrom
f04031e5e8 gpu: nvgpu: Move programming of host registers to fifo
Move code that touches host registers and instance block to fifo HAL.
This involves adding HAL ops for the fifo HAL functions that get
called from outside fifo. This clears responsibility of channel by
leaving it only managing channels in software and push buffers.

channel had member ramfc defined, but it was not used, to remove it.

pbdma_acquire_val consisted both of channel logic and hardware
programming. The channel logic was moved to the caller and only
hardware programming was moved.

Change-Id: Id005787f6cc91276b767e8e86325caf966913de9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1322423
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-28 15:55:48 -07:00
Terje Bergstrom
3e39798997 gpu: nvgpu: Remove unnecessary use of dev_name()
Move the name field from struct gpu_ops up to struct gk20a. The field
is not a function op, so it doesn't belong in gpu_ops.

Replace all uses of dev_name() with use of g->name when possible.

JIRA NVGPU-16

Change-Id: Ic6e99e39258cbf3bb7c806962cbbd7de5126688f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1328534
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-28 12:05:03 -07:00
Konsta Holtta
580c8112f0 gpu: nvgpu: avoid false job timeout message
The nvgpu timer API prints a message when the timer expires, but
expiration of that does not necessarily mean here that the job has
actually timed out, which is tested by comparing gp_get. Change the
expiration check to just peek instead of the default which prints to
log on expiration.

Bug 1887569
Jira NVGPU-21

Change-Id: Ifde34cff701eaed2f3ea727dba3ec8affeef26b9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1329731
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-28 09:39:13 -07:00
David Nieto
e0f2afe5eb gpu: nvgpu: refactor teardown to support unbind
This change refactors the teardown in remove to ensure that it is
possible to unload the driver while leaving fds open. This is achieved
by making sure that the SW state is kept alive till all fds are closed
and by checking that subsequent calls to ioctls after the teardown fail.

Normally, this would be achieved ny calls into gk20a_busy(), but in
kickoff we dont call into that to reduce latency, so we need to check
the driver status directly, and also in some of the functions
as we need to make sure the ioctl does not dereference the device or
platform struct

bug 200277762
JIRA: EVLR-1023

Change-Id: I163e47a08c29d4d5b3ab79f0eb531ef234f40bde
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320219
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Shreshtha Sahu <ssahu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2017-03-25 02:06:55 -07:00
David Nieto
2a502bdd5f gpu: nvgpu: pass gk20a struct to gk20a_busy
After driver remove, the device structure passed in gk20a_busy can be
invalid. To solve this the prototype of the function is modified to pass
the gk20a struct instead of the device pointer.

bug 200277762
JIRA: EVLR-1023

Change-Id: I08eb74bd3578834d45115098ed9936ebbb436fdf
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320194
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
2017-03-23 21:05:35 -07:00
Alex Waterman
4a94c135f0 gpu: nvgpu: Use new kmem API functions (channel)
Use the new kmem API functions in the channel and channel
related code.

Also delete the usage of kasprintf() since that must be paired
with a kfree(). Since the kasprintf() doesn't use the nvgpu kmem
machinery (and is Linux specific) instead use a small buffer
statically allocated on the stack.

Bug 1799159
Bug 1823380

Change-Id: Ied0183f57372632264e55608f56539861cc0f24f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1318312
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-22 18:37:10 -07:00
David Nieto
74fe1caa2b gpu: nvgpu: Add refcounting to driver fds
The main driver structure is not refcounted properly,
so when the driver unload, file desciptors associated to the
driver are kept open with dangling references to the main object.

This change adds referencing to the gk20a structure.

bug 200277762
JIRA: EVLR-1023

Change-Id: Id892e9e1677a344789e99bf649088c076f0bf8de
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1317420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-20 16:39:55 -07:00
Peter Daifuku
38d90b6092 gpu: nvgpu: del channel job before fence is closed
In gk20a_channel_clean_up_jobs, move removal of job from channel's job list
to before fences are cleaned up; this will prevent gk20a_channel_abort from
asynchronously trying to dereference an already freed job.

Bug 1844305
JIRA EVLR-849

Change-Id: I1ba05237aa74be1350007630bfa5eba9988f859a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
(cherry picked from commit 2a9ce58b1b318b95ecfcdf78462f918d090eab99)
Reviewed-on: http://git-master/r/1319026
(cherry picked from commit 990f070b0a363159ce1b21f936b7512f469018ca)
Reviewed-on: http://git-master/r/1321624
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-17 10:56:34 -07:00
Deepak Nibade
bf717d6273 gpu: nvgpu: check return value of mutex_init for channel/TSG
- check return value of nvgpu_mutex_init for all the mutexes
  of a channel and TSG
- add corresponding nvgpu_mutex_destroy calls

Jira NVGPU-13

Change-Id: Iba3a5f8bc2261ec684b300dd4237ab7d22fa3630
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1317139
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-14 11:46:58 -07:00
David Nieto
b9feba6efc gpu: nvgpu: in-kernel kickoff profiling
Add a debugfs interface to profile the kickoff ioctl
it provides the probability distribution and separates the information
between time spent in: the full ioctl, the kickoff function, the amount
of time spent in job tracking and the amount of time doing pushbuffer
copies

JIRA: EVLR-1003

Change-Id: I9888b114c3fbced61b1cf134c79f7a8afce15f56
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1308997
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-07 13:42:28 -08:00
Alex Waterman
707ea45e0f gpu: nvgpu: kmem abstraction and tracking
Implement kmem abstraction and tracking in nvgpu. The abstraction
helps move nvgpu's core code away from being Linux dependent and
allows kmem allocation tracking to be done for Linux and any other
OS supported by nvgpu.

Bug 1799159
Bug 1823380

Change-Id: Ieaae4ca1bbd1d4db4a1546616ab8b9fc53a4079d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1283828
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-03 10:34:48 -08:00
Alex Waterman
3966efc2e5 gpu: nvgpu: Give nvgpu_kalloc a less generic name
Change nvgpu_kalloc() to nvgpu_big_[mz]alloc(). This is necessary
since the natural free function name for this is nvgpu_kfree() but
that conflicts with nvgpu_k[mz]alloc() (implemented in a subsequent
patch).

This API exists becasue not all allocation sizes can be determined
at compile time and in some cases sizes may vary across the system
page size. Thus always using kmalloc() could lead to OOM errors due
to fragmentation. But always using vmalloc() is wastful of memory
for small allocations. This API tries to alleviate those problems.

Bug 1799159
Bug 1823380

Change-Id: I49ec5292ce13bcdecf112afbb4a0cfffeeb5ecfc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1283827
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-03 10:34:43 -08:00
Konsta Holtta
f1072a28be gpu: nvgpu: add worker for watchdog and job cleanup
Implement a worker thread to replace the delayed works in channel
watchdog and job cleanups. Watchdog runs by polling the channel states
periodically, and job cleanup is performed on channels that are appended
on a work queue consumed by the worker thread. Handling both of these
two in the same thread makes it impossible for them to cause a deadlock,
as has previously happened.

The watchdog takes references to channels during checking and possibly
recovering channels. Jobs in the cleanup queue have an additional
reference taken which is released after the channel is processed. The
worker is woken up from periodic sleep when channels are added to the
queue.

Currently, the queue is only used for job cleanups, but it is extendable
for other per-channel works too. The worker can also process other
periodic actions dependent on channels.

Neither the semantics of timeout handling or of job cleanups are yet
significantly changed - this patch only serializes them into one
background thread.

Each job that needs cleanup is tracked and holds a reference to its
channel and a power reference, and timeouts can only be processed on
channels that are tracked, so the thread will always be idle if the
system is going to be suspended, so there is currently no need to
explicitly suspend or stop it.

Bug 1848834
Bug 1851689
Bug 1814773
Bug 200270332
Jira NVGPU-21

Change-Id: I355101802f50841ea9bd8042a017f91c931d2dc7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1297183
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-02 17:51:03 -08:00
Terje Bergstrom
b71fa9289d gpu: nvgpu: Do not bind FECS trace on VPR channels
VPR channels can access VPR, and writing to FECS buffer outside of
VPR causes a region violation.

Bug 1877511

Change-Id: Ida466c81e928d1f67bf1b0e7dd6afb799c1ab2f6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1312759
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Tested-by: Season Li <seasonl@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2017-03-02 10:43:40 -08:00
Deepak Nibade
8ee3aa4b31 gpu: nvgpu: use common nvgpu mutex/spinlock APIs
Instead of using Linux APIs for mutex and spinlocks
directly, use new APIs defined in <nvgpu/lock.h>

Replace Linux specific mutex/spinlock declaration,
init, lock, unlock APIs with new APIs
e.g
struct mutex is replaced by struct nvgpu_mutex and
mutex_lock() is replaced by nvgpu_mutex_acquire()

And also include <nvgpu/lock.h> instead of including
<linux/mutex.h> and <linux/spinlock.h>

Add explicit nvgpu/lock.h includes to below
files to fix complilation failures.
gk20a/platform_gk20a.h
include/nvgpu/allocator.h

Jira NVGPU-13

Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1293187
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-22 04:15:02 -08:00
Deepak Nibade
76611c4268 gpu: nvgpu: remove use of mutex_is_locked()
mutex_is_locked() API is defined on Linux only
and not on other OS like QNX.

Hence remove use of this API for OS abstraction
support to nvgpu.

Instead of using mutex_is_locked(), use
mutex_trylock() for same purpose

Jira NVGPU-13

Change-Id: I542daf20a2294153da8e8bfe89e0dc0387297523
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1297184
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-22 04:14:57 -08:00
Peter Boonstoppel
907adfd785 gpu: nvgpu: Add NVGPU_IOCTL_CHANNEL_SET_BOOSTED_CTX
This ioctl can be used on gp10b to set a flag in the context header
indicating this context should be run at elevated clock
frequency. FECS ctxsw ucode will read this flag as part of the context
switch and will request higher GPU clock frequencies from BPMP for the
duration of the context execution.

Bug 1819874

Change-Id: I84bf580923d95585095716d49cea24e58c9440ed
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1292746
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-14 14:54:46 -08:00
Alex Waterman
e7a0c0ae8b gpu: nvgpu: Move from gk20a_ to nvgpu_ in semaphore code
Change the prefix in the semaphore code to 'nvgpu_' since this code
is global to all chips.

Bug 1799159

Change-Id: Ic1f3e13428882019e5d1f547acfe95271cc10da5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284628
Reviewed-by: Varun Colbert <vcolbert@nvidia.com>
Tested-by: Varun Colbert <vcolbert@nvidia.com>
2017-02-13 18:15:03 -08:00
Alex Waterman
aa36d3786a gpu: nvgpu: Organize semaphore_gk20a.[ch]
Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore
code is common to all chips.

Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu
and rename it to semaphore.h. Also update all places where the header
is inluced to use the new path.

This revealed an odd location for the enum gk20a_mem_rw_flag. This should
be in the mm headers. As a result many places that did not need anything
semaphore related had to include the semaphore header file. Fixing this
oddity allowed the semaphore include to be removed from many C files that
did not need it.

Bug 1799159

Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284627
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-13 18:14:45 -08:00
Alex Waterman
7e403974d3 gpu: nvgpu: Simplify ref-counting on VMs
Simplify ref-counting on VMs: take a ref when a VM is bound to a
channel and drop a ref when a channel is freed.

Previously ref-counts were scattered over the driver. Also the CE
and CDE code would bind channels with custom rolled code. This was
because the gk20a_vm_bind_channel() function took an as_share as
the VM argument (the VM was then inferred from that as_share).
However, it is trivial to abtract that bit out and allow a central
bind channel function that just takes a VM and a channel.

Bug 1846718

Change-Id: I156aab259f6c7a2fa338408c6c4a3a464cd44a0c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1261886
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-07 14:54:02 -08:00
Terje Bergstrom
cf8d9ccf8e gpu: nvgpu: Base channel watchdog on gp_get
Instead of checking if a job is complete, only check that channel is
making progress by checking its gp_get is advancing.

This will make the watchdog conservative. Previously a whole job had
x seconds to complete. Now channel has x seconds to get host to
consume each push buffer segment.

Bug 1861838
Bug 200273419
Bug 200263100

Change-Id: I70adc1f50301bce8db7dac675771c251c0f11b70
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294850
Reviewed-by: Automatic_Commit_Validation_User
2017-01-30 09:53:43 -08:00
Terje Bergstrom
e52c6ac1f2 gpu: nvgpu: Bump semaphore timeout
Semaphore acquire timeout is configured to half of watchdog timeout.
This is too short, so bump it to 80% of watchdog timeout.

Bug 200261389

Change-Id: Ie906ea3d3520c2e3f547cff7ffbb1e37459e6d2f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1283623
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-26 12:52:55 -08:00
Mihir Thakkar
bbc2342331 gpu: nvgpu: Debug spew for context priority & Gfxp
Prints out Timeslice value, Interleave level, Graphics preemption
mode and compute preempt mode along with chid, tsgid, pid.

Enable it with setting dbg_mask with 8192

Bug 1855710

Change-Id: I60efef9810587f8fedd4e2ba62ba67d06d84faea
Signed-off-by: Mihir Thakkar <mthakkar@nvidia.com>
Reviewed-on: http://git-master/r/1287141
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-24 14:06:06 -08:00
Konsta Holtta
bf0a666be0 gpu: nvgpu: fix dev_info typo in refcount tracking
The recently added refcount tracking support had a slight mishap in
refactoring some printks. Fix a typo to make the support compile again
when enabled.

Bug 1826754

Change-Id: Ifd76d644932fa219751db82a0beb3c8482ea68c3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1285922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-18 16:47:32 -08:00
Alex Waterman
6e2237ef62 gpu: nvgpu: Use timer API in gk20a code
Use the timers API in the gk20a code instead of Linux specific
API calls.

This also changes the behavior of several functions to wait for
the full timeout for each operation that can timeout. Previously
the timeout was shared across each operation.

Bug 1799159

Change-Id: I2bbed54630667b2b879b56a63a853266afc1e5d8
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1273826
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-18 16:46:33 -08:00
Deepak Nibade
76dc6659ff gpu: nvgpu: print process name on submit failure
Print process name if we fail submit due to gk20a_busy()
failure

This is helpful in debugging and to know the process name
submitting jobs to nvgpu after system shutdown was
already triggered

Bug 200262275

Change-Id: I34d8c07fc96fd5556afa982bfd56f7f3964449d0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1284113
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2017-01-17 07:11:32 -08:00
Peter Boonstoppel
f15a86f265 gpu: nvgpu: Add sysfs nodes for timeslice min/max
The timeslice values that can be selected for a particular channel/tsg
are bounded by a static min/max. This change introduces two sysfs
nodes that allow these bounds to be configured from userspace.

min_timeslice_us
max_timeslice_us

Bug 200251974
Bug 1854791

Change-Id: I5d5a14225eee4090e418c7e43629324114f60768
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1280372
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-12 08:23:56 -08:00
Alex Waterman
b928f10d37 gpu: nvgpu: Start re-organizing the HW headers
Reorganize the HW headers of gk20a. The headers are moved to a
new directory:

  include/nvgpu/hw/gk20a

And from the code are included like so:

  #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h>

This is the first step in reorganizing all of the HW headers for
gm20b, gm206, etc. This is part of a larger effort to re-structure
and make the driver more readable and scalable.

Bug 1799159

Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-11 12:44:14 -08:00
Konsta Holtta
5e68c6e971 gpu: nvgpu: add support for refcount tracking
If enabled, track actions (gets and puts) on channel reference counters.
Dump the most recent actions to syslog when
gk20a_wait_until_counter_is_N gets stuck when closing a channel.
GK20A_CHANNEL_REFCOUNT_TRACKING specifies the size of the action
history. Default is to disable completely, as this has some runtime
overhead.

Bug 1826754

Change-Id: I880b0efe8881044d02ae224c243a51cb6c2db8c1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1262424
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-11 09:13:43 -08:00
Terje Bergstrom
1c9b8c7868 gpu: nvgpu: Move GPFIFO to sysmem
Kickoff latencies when GPFIFO is in vidmem grow significantly as
function of number of GPFIFO entries. Move GPFIFO to sysmem to
improve kickoff latency.

Bug 1848369

Change-Id: Ie95d10df26b4f1370f7250a8fbf0f7ef0211df32
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry-picked from commit 897cb579a759bbe8455ce368413979e91eb0d475)
Reviewed-on: http://git-master/r/1281564
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-09 12:33:22 -08:00
Alex Waterman
6df3992b60 gpu: nvgpu: Move allocators to common/mm/
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.

This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().

Bug 1799159

Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-09 12:33:16 -08:00
Alex Waterman
9e2f7d98d4 gpu: nvgpu: Move deferred interrupt wait code
Move the code that waits for deferred interrupts to nvgpu_common.c and
make it global. Also rename that function to use nvgpu_ as the function
prefix.

Bug 1816516
Bug 1807277

Change-Id: I42c4982ea853af5489051534219bfe8b253c2784
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250027
(cherry picked from commit cb6fb03e20b08e5c3606ae8a5a9c237bfdf9e7da)
Reviewed-on: http://git-master/r/1274475
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-04 15:53:55 -08:00
Alex Waterman
91d977ced4 gpu: nvgpu: Misc fixes for crashes on shutdown
Fix miscellaneous issues seen during driver shutdown.

 o  Make sure pointers are valid before accessing them.
 o  Busy the GPU during channel timeout.
 o  Cancel delayed work on channels.
 o  Avoid access to channels that may have been freed.

Bug 1816516
Bug 1807277

Change-Id: I62df40373fdfb1c4a011364e8c435176a08a7a96
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250026
(cherry picked from commit 64a95fc96c8ef7c5af9c53c4bb3402626e0d2f60)
Reviewed-on: http://git-master/r/1274474
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-04 15:53:55 -08:00
Terje Bergstrom
fef7083e51 gpu: nvgpu: Do not dereference NULL wait_cmd
In gk20a_submit_prepare_syncs(), after we have allocated wait_cmd
we check the results. If we failed to allocate wait_cmd, we still
jump to the error label that tries to free wait_cmd.

Create an own error labal for allocation before wait_cmd, and use
that if we fail to allocate wait_cmd.

Similarly create an error label for incr_cmd, and use that only once
incr_cmd has actually been allocated.

Change-Id: I1f8bc1d947c524038f5f237358a5e6b0dc2e6ac3
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1275742
GVS: Gerrit_Virtual_Submit
2017-01-04 01:44:24 -08:00
Terje Bergstrom
b6031e4e15 gpu: nvgpu: Check for NULL in job allocation
When allocating job channel_gk20a_job structure we assign the result
of allocation to *job_out, but we check the result of allocation
in job_out.

Change the check to check for result in *job_out.

Change-Id: Ia170cfa2dd5730665434b4c223c5a2f9502c744d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1275741
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
2017-01-04 01:44:23 -08:00
Deepak Nibade
505b442551 gpu: nvgpu: acquire mutex for notifier read
We use &ch->error_notifier_mutex to protect
writes and free of error notifier
But we currently do not protect reading of
notifier in gk20a_fifo_set_ctx_mmu_error()
and vgpu_fifo_set_ctx_mmu_error()

Add new API gk20a_set_error_notifier_locked()
which is same as gk20a_set_error_notifier()
but without the locks.

In *_fifo_set_ctx_mmu_error() APIs, acquire
the mutex explicitly, and then use this new
API

gk20a_set_error_notifier() will now just call
gk20a_set_error_notifier_locked() within
a mutex

Bug 1824788
Bug 1844312

Change-Id: I1f3831dc63fe1daa761b2e17e4de3c155f505d6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1273471
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-12-27 01:24:35 -08:00
Alex Waterman
cee118c1d4 gpu: nvgpu: Fix memory leaks
Fix a memory leak introduced when making the priv struct for
TSGs.

Fix another memory leak when introducing a priv struct for
channels.

Bug 1816516

Change-Id: I7b0e62bb6352f7e65acb5501cab9cef055d1f535
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1266889
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-19 15:40:30 -08:00
Alex Waterman
c1be5105da gpu: nvgpu: Allow channel free to be forced
Allow forced channel freeing. This is useful when the driver is
being cleaned up and the gk20a_wait_until_counter_is_N() could
potentially hang.

Bug 1816516
Bug 1807277

Change-Id: I711f5f3f6413d0bb30b4857e785ca3b504b494ee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250022
(cherry picked from commit e132d0e5ae77d758680ac708622a4883bbd69ba3)
Reviewed-on: http://git-master/r/1261918
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-19 15:40:14 -08:00
Alex Waterman
274e1881af gpu: nvgpu: Allow semaphores to be force released
Allow SW to force a semaphore release. Typically SW waits for a
semaphore to reach the value right before the semaphore release
value before doing a release. For example, assuming a semaphore
is to be released by SW by writing a 10 into the semaphore, the
code waits for the semaphore to get to 9 before writing 10.

The problem with this happens when trying to shutdown the GPU
unexpectedly. When aborting a channel after the GPU has terminated
the GPU is potantially no longer processing anything. If a SW
semaphore release is waiting on the semaphore to reach N-1 before
writing N to the semaphore N-1 may never get written by the GPU.
This obviously causes a hang in the SW shutdown. The solution is
to let SW force a semaphore release in the channel_abort case.

Bug 1816516
Bug 1807277

Change-Id: Ib8b4afd86102eacf372362b1748fb6ca04e6fa66
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250021
(cherry picked from commit 2e9fa40902d2c4d5a1febe0bf2db420ce14bc633)
Reviewed-on: http://git-master/r/1261915
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-19 15:40:03 -08:00
Alex Waterman
28fce3e72b gpu: nvgpu: Use struct to hold gk20a pointer
The private_data field in the file pointer passed to release() for
channels originally pointed directly to the referenced channel. The
problem with this is that when the driver is killed and the channel
mmeory is freed that pointer becomes invalid.

The necessity of that channel is to get access to the gk20a struct that
owns the channel. This can instead be accomplished by making a new
private data struct that has a pointer to the gk20a struct directly
instead of requiring the channel to be valid. This lets the release()
function work even if the channels are gone (though in such cases the
release function doesn't do very much).

Change-Id: I5e50bb5b6dd08d38974f8e7b46ba125e9a3f1922
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1246586
(cherry picked from commit 14b7c380c74d2caeb04c47ad3e33332a423a84bb)
Reviewed-on: http://git-master/r/1261913
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-19 15:39:58 -08:00
Deepak Nibade
53a9eceab7 gpu: nvgpu: fix deadlock between clean up and timeout worker
In case one job completes just around timeout boundary,
it is possible that we launch both clean up worker and
timeout worker for same job

Then in clean up worker we try to cancel timeout
worker, and in timeout worker we try to wait for clean
up to finish which leads to deadlock with below stacks

stack 1:
[<ffffffc0000bb484>] cancel_delayed_work_sync+0x10/0x18
[<ffffffc0004f820c>] gk20a_channel_cancel_job_clean_up+0x20/0x44
[<ffffffc0004fc794>] gk20a_channel_abort_clean_up+0x34/0x31c
[<ffffffc0004fcb30>] gk20a_channel_abort+0xb4/0xc0
[<ffffffc0004f3d18>] gk20a_fifo_recover_ch+0x9c/0xec
[<ffffffc0004f3f04>] gk20a_fifo_force_reset_ch+0xdc/0xf8
[<ffffffc0004fa8c4>] gk20a_channel_timeout_handler+0xf8/0x128

stack 2:
[<ffffffc0000bb484>] cancel_delayed_work_sync+0x10/0x18
[<ffffffc0004f82c4>] gk20a_channel_timeout_stop+0x40/0x60
[<ffffffc0004fc488>] gk20a_channel_clean_up_jobs+0x7c/0x238

To fix this, cancel the timeout worker in
gk20a_channel_update() itself instead of cancelling in
gk20a_channel_clean_up_jobs()

Bug 200246829

Change-Id: Idef9de3cae29668f4e25beb564422cf2e3736182
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1259963
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-30 09:19:31 -08:00
seshendra Gadagottu
ef95e43d97 gpu: nvgpu: chip specific init_inst_block
Add function pointer to add chip specific init_inst_block.
Update this function pointer for gk20a and gm20b.

JIRA GV11B-21

Change-Id: I74ca6a8b4d5d1ed36f7b25b7f62361c2789b9540
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1254875
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-21 08:50:45 -08:00
Sachit Kadle
5277ecd79c gpu: nvgpu: fix error handling bug
This change fixes error handling logic in
gk20a_alloc_channel_gpfifo(). In cases, where we don't
allocate a channel_sync at gpfifo allocation time,
we shouldn't attempt to destroy it while handling
an error.

Bug 200253447

Change-Id: I57a78c74bbce84fa17fb0360c59b8f413a9124a7
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1255858
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-11-21 04:00:13 -08:00
Terje Bergstrom
d29afd2c9e gpu: nvgpu: Fix signed comparison bugs
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.

Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-16 21:35:36 -08:00
Terje Bergstrom
8fa5e7c58a gpu: nvgpu: Remove IOCTL FREE_OBJ_CTX
We have never used the IOCTL FREE_OBJ_CTX. Using it leads to context
being only partially available, and can lead to use-after-free.

Bug 1834225

Change-Id: I9d2b632ab79760f8186d02e0f35861b3a6aae649
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1250004
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-11 11:47:42 -08:00
Terje Bergstrom
6410739922 gpu: nvgpu: Skip comparing u32 event_id against < 0
Skip checking of u32 event_id if it's smaller than zero.

Change-Id: I207c244eeff10f294c41a76b53f9393d50a84026
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249967
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-11 08:21:20 -08:00
Terje Bergstrom
5855fe26cb gpu: nvgpu: Do not post events to unbound channels
Change-Id: Ia1157198aad248e12e94823eb9f273497c724b2c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1248366
Tested-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-11-07 15:47:49 -08:00
Sachit Kadle
d539e97415 gpu: nvgpu: fix semaphore wakeup logic
Currently, when we receive a semaphore wakeup interrupt,
we call the channel_update callback, which schedules
deferred job clean-up.

For deterministic channels, we don't allow semaphore-backed
syncs anyways. That means for these channels, if we get a
semaphore wakeup interrupt, it must be for a userspace-managed
semaphore. In this case, there is no need to call into the
channel_update callback. So for deterministic channels, we
skip this.

Bug 1795076

Change-Id: I4cdfecd53144078c5cd4be8a41c5c3b7d74c338e
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1225620
(cherry picked from commit 64a6db0080c3b198ddc2029544f52eb590dc08ff)
Reviewed-on: http://git-master/r/1225615
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-26 15:04:29 -07:00
Konsta Holtta
d96ff67fb0 gpu: nvgpu: fix prealloc resource alloc error handling
Only free the per-channel preallocated job-tracking resources during
channel allocation error path if they have actually been allocated.

Bug 1795076

Change-Id: I2de90504f1042ce372337b68c5405727b4e4abb4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1234983
(cherry picked from commit 62cb75c6baa02d0edecd1f81f1b8b80a985fd715)
Reviewed-on: http://git-master/r/1238329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-26 09:44:44 -07:00