Commit Graph

306 Commits

Seshendra Gadagottu
19b3bd28b3 gpu: nvgpu: use platform data for ptimer source rate
Instead of depending on the clock framework, use platform data
for the ptimer source rate. Remove the ptimerscaling10x platform
data, and use the ptimer source frequency to calculate the
ptimer scaling factor.
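
A minimal sketch of the kind of conversion this enables, assuming a hypothetical
reference-frequency macro and helper name (the real driver's constants, field
names, and exact formula may differ):

    /* Hypothetical sketch: derive a 10x ptimer scaling factor from a
     * platform-supplied source rate instead of querying the clock framework. */
    #define PTIMER_REF_FREQ_HZ 31250000UL    /* assumed reference rate */

    static inline unsigned long ptimer_scaling10x(unsigned long src_freq_hz)
    {
        /* ratio of reference rate to source rate, in 0.1x steps, rounded */
        return (PTIMER_REF_FREQ_HZ * 10UL + src_freq_hz / 2) / src_freq_hz;
    }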

Reviewed-on: http://git-master/r/819030
(cherry picked from commit dd291334d54dab80cab7eb1656dffc48a59610b4)

Change-Id: I7638ce9875a6e440bbfc2ba2da0d0b094b2700ff
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/827300
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-04 14:38:34 -08:00
Deepak Nibade
9592a4e6fc gpu: nvgpu: IOCTL to set TSG timeslice
Add a new IOCTL, NVGPU_IOCTL_TSG_SET_PRIORITY, to allow
setting the timeslice for an entire TSG.

Return an error from the channel-specific IOCTL_CHANNEL_SET_PRIORITY
if the channel is part of a TSG.

Separate out the API gk20a_channel_get_timescale_from_timeslice()
to get timeslice_timeout and scale from the timeslice period.

Use this API to get timeslice_timeout and scale for the TSG and
store them in the tsg_gk20a structure.

Then trigger a runlist update so that the new timeslice values
are re-written to the runlist for the TSG.
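
A hedged sketch of how such a period-to-(timeout, scale) split can work; the
8-bit mantissa limit and one-bit shift step below are assumptions for
illustration, not the exact nvgpu encoding:

    /* Split a timeslice period into a mantissa ("timeout") and an
     * exponent ("scale") for a runlist-style register field pair. */
    static void get_timescale_from_timeslice(unsigned int timeslice_us,
                                             unsigned int *timeout,
                                             unsigned int *scale)
    {
        unsigned int value = timeslice_us;
        unsigned int shift = 0;

        while (value > 255) {    /* assumed 8-bit timeout field */
            value >>= 1;
            shift++;
        }

        *timeout = value;
        *scale = shift;
    }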

Bug 200146615

Change-Id: I555467d034f81b372b31372f0835d72b1c159508
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824206
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-03 14:20:08 -08:00
Janne Hellsten
057c6334f7 gpu: nvgpu: Get rid of legacy gpfifo type
Get rid of the duplicate gpfifo struct to emphasize the fact that
nvgpu_gpfifo is the only memory layout for gpfifo entries that works.
This is the same layout that HW uses.

Also, add a local pointer to the gpfifo memory in
gk20a_submit_channel_gpfifo to get rid of repeated typecasts.
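
For reference, a hedged sketch of the entry layout and the typed local pointer
mentioned above; the field comments are informal summaries and gpfifo_cpu_va is
a stand-in name:

    #include <linux/types.h>

    /* The UAPI gpfifo entry is two 32-bit words, matching the HW layout. */
    struct nvgpu_gpfifo {
        __u32 entry0;    /* low bits of the pushbuffer address */
        __u32 entry1;    /* high address bits, entry length, flags */
    };

    /* A typed local pointer in the submit path avoids repeated casts. */
    static void write_entry_example(void *gpfifo_cpu_va, unsigned int put,
                                    const struct nvgpu_gpfifo *src)
    {
        struct nvgpu_gpfifo *ring = (struct nvgpu_gpfifo *)gpfifo_cpu_va;

        ring[put] = *src;
    }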

Bug 1592391
Bug 1550886

Change-Id: I5432859ef8e7c1aab5907e44098994d7bb807f50
Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/677341
(cherry picked from commit 724c8c6228af81dd440e825bddf545dd6b2b8bd7)
Reviewed-on: http://git-master/r/822548
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-27 08:40:31 -07:00
Terje Bergstrom
2aead8a72f gpu: nvgpu: Disable only channel at zcull bind
At zcull bind we disable the whole GR engine. This is unnecessary, so
instead disable only the channel and make sure it is unloaded.

Also introduce an API in fifo_gk20a.c to do the channel disable.

gr_gk20a_ctx_zcull_setup() was always passed true as its last parameter,
so remove the parameter.

Change-Id: I7ae6e101ec7d1ab3f6ee4e9bcc442d23dbd21247
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/787570
2015-10-23 08:30:23 -07:00
Terje Bergstrom
75c09b96b4 gpu: nvgpu: Protect sync with its own lock
Protect creation and deletion of sync with its own mutex. This prevents
a deadlock in channel abort when abort is called from the submit path.

Bug 200147887

Change-Id: I5d6308b773c1d1a6a89d4590e2e74c74d691f79d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/821127
2015-10-22 11:41:24 -07:00
Deepak Nibade
0269339256 gpu: nvgpu: restart timer instead of cancel
In gk20a_fifo_handle_sched_error(), we currently cancel
the timeout on all channels.
But this could cause us to miss a stuck channel.

Hence, instead of cancelling, restart the timeout on the channel
on which it is already active.

Bug 200133289

Change-Id: I40e7e0e5394911fc110ab6fde39592b885dfaf7d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/816133
Reviewed-by: Ishan Mittal <imittal@nvidia.com>
Tested-by: Ishan Mittal <imittal@nvidia.com>
2015-10-19 23:52:50 -07:00
Deepak Nibade
da8ff40e55 gpu: nvgpu: fix deadlock on timeout lock
In gk20a_channel_timeout_stop(), we take the channel's
timeout lock and then cancel the timeout worker thread.

The timeout worker thread also tries to acquire the same timeout
lock.

Hence, if the timeout handler is already scheduled while
gk20a_channel_timeout_stop() is cancelling the timeout, we will deadlock.

Fix this by moving cancel_delayed_work_sync() out of the locks.
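
A minimal sketch of the resulting ordering, with assumed names for the
channel's timeout state:

    #include <linux/mutex.h>
    #include <linux/types.h>
    #include <linux/workqueue.h>

    /* Stand-in for the channel's timeout bookkeeping (assumed layout). */
    struct ch_timeout {
        struct mutex lock;
        struct delayed_work wq;
        bool initialized;
    };

    static void ch_timeout_stop(struct ch_timeout *t)
    {
        mutex_lock(&t->lock);
        t->initialized = false;
        mutex_unlock(&t->lock);

        /* Cancel outside the lock: the worker also takes t->lock, so a
         * synchronous cancel under the lock can deadlock with a handler
         * that is already running. */
        cancel_delayed_work_sync(&t->wq);
    }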

Bug 200133289
Bug 1695481

Change-Id: Iea78770180b483a63e5e176efba27831174e9dde
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/815922
Reviewed-by: Ishan Mittal <imittal@nvidia.com>
Tested-by: Ishan Mittal <imittal@nvidia.com>
2015-10-19 23:52:04 -07:00
Terje Bergstrom
87adfb4963 Revert "gpu: nvgpu: WAR for bad GPFIFO entries from userspace"
This reverts commit aeb74fc7952718ffab6281c687951499510c4333.
User space was fixed not to send zero-length GPFIFO entries.

Bug 1662670

Change-Id: Iec6bf1870a19db4e8daa2ed4512650b92a37ba92
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/811783
2015-10-08 15:34:48 -07:00
Deepak Nibade
d01a0249c4 gpu: nvgpu: cancel all wdt timeouts while handling SCHED errors
A SCHED error might cause multiple channels' watchdogs to trigger
simultaneously.

Hence, to avoid this conflict, cancel the watchdog timeout on all
channels before recovering from SCHED errors.

Also, define the API gk20a_channel_timeout_stop_all_channels()
to cancel the wdt timeout on all channels.

Bug 200133289

Change-Id: I8324c397891f0a711327b77d0677cd6718af6d01
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/810959
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-07 15:03:28 -07:00
Deepak Nibade
ff417a72e2 gpu: nvgpu: make wdt timeout per-platform
The channel watchdog timeout is currently set to a constant value
of 5 s.

Make this timeout platform specific and set it to 5 s for gm20b
and 7 s for gk20a.

Bug 200133289

Change-Id: I6e7f0fed93a8d5b197ae46807131311196c6636f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/810956
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-07 15:00:18 -07:00
Kirill Artamonov
3ad8438c90 gpu: nvgpu: set correct timeslice value
Scale the timeslice register value based on the platform-specific
ptimer scale coefficient.

Expose timeslice values through debugfs to simplify performance
tuning.

bug 1605552
bug 1603226

Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25
Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-on: http://git-master/r/800007
(cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a)
Reviewed-on: http://git-master/r/808251
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:31:10 -07:00
Deepak Nibade
613990cb39 gpu: nvgpu: implement per-channel watchdog
Implement a per-channel watchdog/timer as per the rules below:
- start the timer when submitting the first job on a channel, or if
  no timer is already running
- cancel the timer when a job completes
- re-start the timer if there is any incomplete job left
  in the channel's queue
- trigger the appropriate recovery method as part of the timeout
  handling mechanism

Handle the timeout as follows:
- get the timed out channel and job data
- disable activity on all engines
- check if the fence is really pending
- get information on the failing engine
- if no engine is failing, just abort the channel
- if an engine is failing, trigger the recovery

Also, add a flag "ch_wdt_enabled" to enable/disable the channel
watchdog mechanism. The watchdog can also be disabled using the
global flag "timeouts_enabled".

Set the watchdog timeout to 5 s using the macro
NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS.
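
A minimal sketch of the start/cancel/re-arm pattern this describes, built on a
delayed work item; the struct and helper names are illustrative, not the
driver's actual ones:

    #include <linux/jiffies.h>
    #include <linux/workqueue.h>

    #define CH_WDT_TIMEOUT_MS 5000    /* stand-in for the default macro above */

    struct ch_wdt {
        struct delayed_work work;
        bool running;
    };

    static void ch_wdt_expired(struct work_struct *work)
    {
        /* timed out: inspect the oldest pending job, verify its fence is
         * still pending, then abort the channel or trigger recovery */
    }

    static void ch_wdt_init(struct ch_wdt *wdt)
    {
        INIT_DELAYED_WORK(&wdt->work, ch_wdt_expired);
        wdt->running = false;
    }

    static void ch_wdt_start(struct ch_wdt *wdt)
    {
        if (wdt->running)
            return;    /* a timer already covers an earlier pending job */
        wdt->running = true;
        schedule_delayed_work(&wdt->work, msecs_to_jiffies(CH_WDT_TIMEOUT_MS));
    }

    static void ch_wdt_stop(struct ch_wdt *wdt)
    {
        wdt->running = false;
        cancel_delayed_work_sync(&wdt->work);
        /* on job completion the caller re-arms via ch_wdt_start() if
         * incomplete jobs remain in the channel's queue */
    }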

Bug 200133289

Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797072
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-28 09:08:12 -07:00
Vijayakumar
b8faddfe2a gpu: nvgpu: fix runlist update timeout handling
bug 1625901
1) disable ELPG before doing GR reset when runlist update times out
2) add mutex for GR reset to avoid multiple threads resetting GR
3) protect GR reset with FECS mutex so that no one else submits methods

Change-Id: I02993fd1eabe6875ab1c58a40a06e6c79fcdeeae
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/793643
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-16 09:44:00 -07:00
Deepak Nibade
cf0351a560 gpu: nvgpu: disable channel before adjusting syncpoints
As per the current sequence in gk20a_channel_abort(),
we first balance the syncpoint values associated with the
failing channel, and then abort it.

Reverse this sequence so that we first disable the channel
and only then balance the syncpoints.

Bug 200133289

Change-Id: I5a748afce437e728a5ff6c8a030a75d0f627c622
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797071
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-11 08:46:32 -07:00
Leonid Moiseichuk
54c2ae59f0 gpu: nvgpu: cyclestats snapshot permissions rework
The cyclestats snapshot feature is expected on new devices.
The detection code was isolated into a separate function, and a run-time
check was added to validate/allow ioctl calls on the current GPU.

Bug 1674079

Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/781697
(cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4)
Reviewed-on: http://git-master/r/793630
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-04 09:03:07 -07:00
Alex Waterman
00721d4cb8 gpu: nvgpu: WAR for bad GPFIFO entries from userspace
Userspace sometimes sends GPFIFO entries with a zero length. This
is a problem when the address of the zero-length pushbuffer is
larger than 32 bits: the high bits are interpreted as an opcode and
either trigger an operation that should not happen or are treated as
invalid.

Oddly, this WAR is only necessary on simulation.

Change-Id: I8be007c50f46d3e35c6a0e8512be88a8e68ee739
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/753379
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/755814
Reviewed-by: Automatic_Commit_Validation_User
2015-06-11 10:16:25 -07:00
Terje Bergstrom
c5367d7350 gpu: nvgpu: Do not pre-allocate cmdbuf entries
Do not preallocate cmdbuf tracking entries. Allocate them only when
needed.

Bug 200104160

Change-Id: I12f8392723c301a368af1e280893ff993480477f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743953
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/755148
2015-06-09 11:14:34 -07:00
Konsta Holtta
6085c90f49 gpu: nvgpu: add per-channel refcounting
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.

Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail if
the channel is either not opened or in the process of being closed. Also,
add safeguards against accidental use of closed channels, specifically
by setting ch->g = NULL in channel free. This makes it obvious if a
freed channel is used.

The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after referencing
to the channel has been denied.

Finally, fix some races in channel and TSG force reset IOCTL path,
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg(), while the affected engines have been identified,
the appropriate MMU faults triggered, and the MMU faults handled. In this
case, make sure that the MMU fault does not attempt to query the hardware
about the failing channel or TSG ids. This should make channel recovery
safer also in the regular (i.e., non-interrupt-handler) context.
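
A minimal sketch of the get/put refcounting pattern described above, with
assumed names; the real code also serializes the check against channel close
with a lock:

    #include <linux/atomic.h>
    #include <linux/types.h>

    /* Stand-in for the channel's refcount state (assumed fields). */
    struct chan_ref {
        atomic_t ref_count;
        bool referenceable;    /* cleared while the channel is not open or closing */
    };

    /* May fail (return NULL) if the channel is not open or being closed. */
    static struct chan_ref *chan_get(struct chan_ref *ch)
    {
        /* the real code holds a lock around this check-and-increment */
        if (!ch->referenceable)
            return NULL;
        atomic_inc(&ch->ref_count);
        return ch;
    }

    static void chan_put(struct chan_ref *ch)
    {
        atomic_dec(&ch->ref_count);
        /* the free path waits for ref_count to reach 0 before teardown */
    }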

Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810

Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
2015-06-09 11:13:43 -07:00
Terje Bergstrom
0dc66952e4 gpu: nvgpu: Use vmalloc only when size >4K
When the allocation size is 4k or below, we should use kmalloc. vmalloc
should be used only for larger allocations.

Introduce nvgpu_alloc, which checks the size, and decides the API
to use.
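
A minimal sketch of such a size-gated allocator pair, under the assumption that
the threshold is one page; the names are illustrative:

    #include <linux/mm.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>

    /* Pick the allocation API based on size (sketch, not the exact helper). */
    static void *size_gated_alloc(size_t size)
    {
        if (size > PAGE_SIZE)
            return vzalloc(size);
        return kzalloc(size, GFP_KERNEL);
    }

    static void size_gated_free(void *p)
    {
        if (is_vmalloc_addr(p))
            vfree(p);
        else
            kfree(p);
    }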

Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:24:11 -07:00
Leonid Moiseichuk
837ceffcab gpu: nvgpu: cyclestats mode E snapshots support
This is the kernel supporting code for cyclestats mode E.
Cyclestats mode E is implemented following the Windows design in user space
and requires the following operations to be implemented:
- attach a client for shared hardware buffer of device
- detach client from shared hardware buffer
- flush means copy of available data from hardware buffer to private
  client buffers according to perfmon IDs assigned for clients
- perfmon IDs management for user-space clients
- a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added

Bug 1573150

Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/740653
(cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f)
Reviewed-on: http://git-master/r/753274
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-06 07:23:24 -07:00
Seshendra Gadagottu
38cee4d7ef gpu: nvgpu: update channel_setup_ramfc interface
Pass a flags parameter to channel_setup_ramfc to indicate
nvgpu_alloc_gpfifo_args characteristics.

Bug 1645628

Change-Id: Ia40b37c5c7b208d459aa84f1b022036dd5e1b599
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/744526
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-02 11:17:39 -07:00
Deepak Nibade
4531a27adc gpu: nvgpu: fix channel leak with immediate close
If a GPU channel is closed immediately after opening, without
performing any operation on it, we leak that channel.

E.g. the command below leaks a channel:
echo > /dev/nvhost-gpu

Fix this leak by releasing the channel before returning.

Change-Id: I2598e3cabec6996cb1cf8066a1e6d7d5864ae02b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/743235
(cherry picked from commit 428771509b4431ebe88e38061b495cabc5192327)
Reviewed-on: http://git-master/r/744279
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-05-19 20:32:38 -07:00
Konsta Holtta
7072bdc513 gpu: nvgpu: check sync existence in channel update
The channel sync object can get deleted before all channel updates have
finished if the channel is freed before them, so work around a null
dereference by testing whether the sync exists. Channel and/or c->sync
refcounting would be necessary for a proper fix.

Bug 200076344

Change-Id: Ica8ef2df9cd95cfa593cd4f41768dbb6641357b2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/734266
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:31:39 +05:30
Terje Bergstrom
06be77da37 gpu: nvgpu: Do not send WFI when finishing channel
The channel teardown process sends a WFI method to ensure that all
work has been completed. But we also preempt the channel a while
later, which also ensures that all work is completed.

Remove the code for submitting WFI, and rely on preemption to handle
idling the pipe.

Change-Id: I2af029184440ee73e70d377f15690ddaf9b8599f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/735067
Reviewed-on: http://git-master/r/737527
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:41 -07:00
Terje Bergstrom
9bbffa11de gpu: nvgpu: Reconfigure instance block with syncpt
Re-setup RAMFC once a sync point ID has been allocated for a channel.

Change-Id: Idbac406bea1c94c89ef587dda08fddc740c1fadb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/711302
Reviewed-on: http://git-master/r/737526
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:35 -07:00
Terje Bergstrom
029ccf28ec gpu: nvgpu: Sem wakeup to post event
Add posting of a channel event whenever we do a wakeup due to a semaphore.

Change-Id: Id1765123de93bcbc0822af7926d7f4e9919ffe10
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/726420
2015-04-04 19:17:38 -07:00
Gagan Grover
29ceb36331 gpu: nvgpu: Fix the GPFIFO submit condition
GPFIFO HW with a fifo of size N can actually accept only
N-1 entries. Also, the kernel can insert two extra
entries, before and after the user GPFIFOs.

So, a GPFIFO of size N can accept at most N-3 user
gpfifos.
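
A minimal sketch of the resulting capacity check (names assumed):

    #include <linux/errno.h>

    /* A HW ring of gpfifo_size entries holds at most gpfifo_size - 1, and
     * the kernel adds one entry before and one after the user's, hence -3. */
    static int check_gpfifo_request(unsigned int gpfifo_size,
                                    unsigned int num_user_entries)
    {
        if (num_user_entries > gpfifo_size - 3)
            return -ENOSPC;    /* can never fit; do not wait for a timeout */
        return 0;
    }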

Bug 1613125

Change-Id: I173526afb70dddf1b2b9ec0a99890335c81f0e02
Signed-off-by: Gagan Grover <ggrover@nvidia.com>
Reviewed-on: http://git-master/r/725380
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 19:02:56 -07:00
Terje Bergstrom
69ef2e6fa3 gpu: nvgpu: Increase GPFIFO priv cmd queue size
We're doing two sync point increments, so we need to allocate space
for both.

Change-Id: I663abb18a930eb3955379d5f8dd1b08c8fa56897
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/719884
2015-04-04 19:02:41 -07:00
Deepak Nibade
87ccc6a02f gpu: nvgpu: improve error print in check_gp_put()
Improve the error print in check_gp_put(): dump the put
pointer stored in the channel's gpfifo structure and
the put pointer that we read from memory.

Bug 200089835

Change-Id: I7e854398811330a43d9d9bf86bf29c5986ab22b7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/722536
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 19:02:16 -07:00
Terje Bergstrom
92b667b888 gpu: nvgpu: Use common allocator for GPFIFO
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: I81701427ae29b298039a77f1634af9c14237812e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/719872
2015-04-04 19:01:37 -07:00
Terje Bergstrom
2fc4169293 gpu: nvgpu: Use common allocator for cmd queue
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: If93063acbbfaa92aef530208241988427b5df8eb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/719871
2015-04-04 19:01:36 -07:00
Terje Bergstrom
1eded55286 gpu: nvgpu: Use gk20a_mem_phys instead of sg_phys
There were still a couple of places using sg_phys directly. Use new
gk20a_mem_phys() to make the code shorter.

Change-Id: I6eb9b14e0c14a27ec39bacd06ab24e31e99769ca
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717502
2015-04-04 19:00:43 -07:00
Konsta Holtta
d86fc5414b gpu: nvgpu: use vmalloc for temp gpfifo in submit
Use vmalloc instead of kzalloc for the temporary gpfifo buffer copied
from userspace in the ioctl when submitting gpfifos. The data may be too
big for kzalloc, and it doesn't need to be physically contiguous.
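
A minimal sketch of that copy path; the two-word entry struct and the argument
names are stand-ins for illustration:

    #include <linux/types.h>
    #include <linux/uaccess.h>
    #include <linux/vmalloc.h>

    struct gpfifo_entry {    /* stand-in for the UAPI two-word entry */
        __u32 entry0;
        __u32 entry1;
    };

    /* Copy the user's gpfifo list into a virtually contiguous temp buffer. */
    static struct gpfifo_entry *copy_user_gpfifo(const void __user *user_ptr,
                                                 unsigned int num_entries)
    {
        size_t size = (size_t)num_entries * sizeof(struct gpfifo_entry);
        struct gpfifo_entry *buf = vmalloc(size);

        if (!buf)
            return NULL;
        if (copy_from_user(buf, user_ptr, size)) {
            vfree(buf);
            return NULL;
        }
        return buf;
    }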

Bug 1617747
Bug 200081843

Change-Id: I66a43d17eb13a2783bc1f0598a38abbf330b2ba6
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/715207
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 19:00:21 -07:00
Terje Bergstrom
7290a6cbd5 gpu: nvgpu: Implement common allocator and mem_desc
Introduce mem_desc, which holds all information needed for a buffer.
Implement helper functions for allocation and freeing that use this
data type.

Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/712699
2015-04-04 18:59:26 -07:00
Konsta Holtta
e1ae4e3053 gpu: nvgpu: protect channel ioctls with a mutex
Add a big mutex for protecting the channel during ioctls, in case the
userspace uses the same channel from several threads at once. The lock
is taken during all operations except CHANNEL_WAIT, which could deadlock.

Bug 1603482

Change-Id: Ibed962eadc9f00645abd54413dde9aaee00377ab
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/678871
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:27 -07:00
Terje Bergstrom
bdb34abd93 gpu: nvgpu: Per-chip PBDMA signature
PBDMA HW signature depends on the chip.

Change-Id: If57d721d9bb77a090f967930a1aa2037bf4a16fe
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/672922
2015-04-04 18:07:37 -07:00
Terje Bergstrom
9bf82585aa gpu: nvgpu: Install fd after no errors can happen
fd_install() should be called only once all other initialization is
done and no errors can happen.
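
A minimal sketch of that ordering, with a hypothetical setup helper standing in
for everything that can fail:

    #include <linux/fcntl.h>
    #include <linux/file.h>
    #include <linux/fs.h>

    /* hypothetical stand-in for all channel initialization that can fail */
    static int channel_setup(struct file *filp) { return 0; }

    /* Reserve the fd first, do all fallible setup, and publish the file last. */
    static int open_channel_fd(struct file *filp)
    {
        int err;
        int fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);

        if (fd < 0)
            return fd;

        err = channel_setup(filp);
        if (err) {
            put_unused_fd(fd);
            return err;
        }

        fd_install(fd, filp);    /* no failure paths after this point */
        return fd;
    }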

Bug 1589104

Change-Id: I822511a64d4c6fa59c8e772a896dbd6818459c97
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/706928
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
2015-04-04 18:07:35 -07:00
Terje Bergstrom
226c671f8e gpu: nvgpu: More robust recovery
Make recovery a more straightforward process. When we detect a fault,
trigger an MMU fault, wait for it to trigger, and complete recovery.
Also reset the engines before aborting the channel to ensure no stray sync
point increments can happen.

Change-Id: Iac685db6534cb64fe62d9fb452391f43100f2999
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709060
(cherry picked from commit 95c62ffd9ac30a0d2eb88d033dcc6e6ff25efd6f)
Reviewed-on: http://git-master/r/707443
2015-04-04 18:06:37 -07:00
Terje Bergstrom
a3b26f25a2 gpu: nvgpu: TLB invalidate after map/unmap
Always invalidate the TLB after mapping or unmapping, and remove the
delayed TLB invalidate.

Change-Id: I6df3c5c1fcca59f0f9e3f911168cb2f913c42815
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/696413
Reviewed-by: Automatic_Commit_Validation_User
2015-04-04 18:06:37 -07:00
Konsta Holtta
adb33505b2 gpu: nvgpu: add open channel ioctl to ctrl node
Add the ioctl to open a new GPU channel also to the control node, in
addition to the current open ioctl on the channel node, for improved
process startup performance. The new channel fd creation is refactored
into a separate function which is called from both the ctrl and channel
ioctls.

Bug 1604952

Change-Id: I3357ceec694c0e6d7a85807183884324cb725d3a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/679516
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:05:59 -07:00
Konsta Holtta
2dda8077ec gpu: nvgpu: unify instance block initialization
Create gk20a_init_inst_block() to reduce reg write clutter when
initializing instance blocks, which is done in several places.

Change-Id: Idcb8b604851a849e0bb6abce5743c9f4cbf98033
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/672434
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:04:35 -07:00
Janne Hellsten
9148a1e627 gpu: nvgpu: gk20a: Optimize gpfifo entry copy
Use memcpy for copying gpfifo inputs into the gpfifo ring buffer.
This speeds up one command-buffer-heavy benchmark from 82 FPS to 86
FPS. The speedup is due to a) the faster memory move and b) zero tracing
overhead when PB tracing is disabled.
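
A minimal sketch of the bulk copy, splitting at the ring wrap point so at most
two memcpy calls are needed; the names and the caller guarantees noted in the
comments are assumptions:

    #include <linux/string.h>
    #include <linux/types.h>

    struct gpfifo_word_pair {    /* stand-in for the two-word gpfifo entry */
        __u32 entry0;
        __u32 entry1;
    };

    static void copy_gpfifo_entries(struct gpfifo_word_pair *ring,
                                    unsigned int ring_size,   /* in entries */
                                    unsigned int put,         /* current put index */
                                    const struct gpfifo_word_pair *src,
                                    unsigned int count)       /* count <= free space */
    {
        unsigned int until_wrap = ring_size - put;
        unsigned int first = count < until_wrap ? count : until_wrap;

        memcpy(&ring[put], src, first * sizeof(*src));
        if (count > first)    /* wrapped: copy the remainder to the start */
            memcpy(&ring[0], src + first, (count - first) * sizeof(*src));
    }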

Bug 1550886

Change-Id: If95ebff53745bbf59edeac32ad4f32f10f1ea7ee
Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Reviewed-on: http://git-master/r/676967
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:03:43 -07:00
Janne Hellsten
f6587d13e4 gpu: nvgpu: gk20a: Add a gpfifo wait trace point
Add a couple of trace points for tracking when we need to wait for
space in the gpfifo ring buffer.  This wait can introduce significant
latencies to rendering with large gpfifo entry inputs so it's good to
be able to measure how often this path is taken.

Bug 1592391
Bug 1550886

Change-Id: I7f362e9c307eeffeeecaaba268ef2e3613e54597
Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Reviewed-on: http://git-master/r/674021
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:03:23 -07:00
Seshendra Gadagottu
8b887af59a gpu: nvgpu: correct channel open sequence
Correct the sequence to bind and enable the channel
only after channel gpfifo allocation is done.

Bug 1591647

Change-Id: I539458d1b666c0403cca1abcf8271b9c8c09f52c
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/671208
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:02:17 -07:00
Konsta Holtta
4ccb162da7 gpu: nvgpu: unify instance block creation
Reduce copy-paste code in instance block allocation and deletion with
functions purposed for that.

Change-Id: I2c8ae6a317ac89e2c857dde4296cb4316b8aaafe
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/668698
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 15:06:38 -07:00
Sam Payne
d1d1fbfb60 gpu: nvgpu: check gpfifo size against request
If a request larger than the allocated fifo is
submitted, an error is returned immediately
rather than waiting for a timeout while enough space
becomes available in the fifo (the timeout
will not trigger in this case).

bug 1563401

Change-Id: I264dee2673dc8722034881f9e7db7bb137a8c0c8
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/665113
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 15:06:17 -07:00
Konsta Holtta
4f3647ca32 gpu: nvgpu: protect channel abort with submit lock
Bug 200065789

Change-Id: I59eb93c7929a77cd4de4be40fd7902cd05e536c7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/665655
(cherry-picked from commit 4ee1893926557b01d7058a0a4c1c23e4476d7668)
Reviewed-on: http://git-master/r/668850
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
2015-04-04 15:06:04 -07:00
Timo Alho
31a436b3a1 Revert "gpu: nvgpu: Enable syncpt reclaim only on gm20b"
This reverts commit 8eefb93c21934b101d7f423c38d9ea384a45fad6.

Bug 1585422

Change-Id: I217e0ffe6c230ee3c63d9aec1c48ce9c41770468
Signed-off-by: Timo Alho <talho@nvidia.com>
Reviewed-on: http://git-master/r/659426
2015-03-18 12:12:26 -07:00
Terje Bergstrom
3683428235 gpu: nvgpu: Enable syncpt reclaim only on gm20b
gm20b has more channels than sync points. We use aggressive reclaim
of sync points to offset that. Disable aggressive reclaim for gk20a
because it is not needed there.

Bug 1583849

Change-Id: I2a74b0504150a54cb8a97016effe20c5d905ac95
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/657095
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
2015-03-18 12:12:25 -07:00
Terje Bergstrom
2d71d633cf gpu: nvgpu: Physical page bits to be per chip
Retrieve the number of physical page bits based on the chip.

Bug 1567274

Change-Id: I5a0f6a66be37f2cf720d66b5bdb2b704cd992234
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/601700
2015-03-18 12:12:19 -07:00