Commit Graph

227 Commits

Alex Waterman
4e21f4a148 gpu: nvgpu: export gk20a_free_priv_cmdbuf
Export gk20a_free_priv_cmdbuf() so that the channel_sync_gk20a.c code
can call this function. This is necessary for error paths in the
semaphore wait/incr functions.
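
A minimal sketch of the kind of error path this enables (the alloc helper
and the failure condition are illustrative, not the exact driver code):

    /* Sketch: free a partially built priv cmdbuf on an error path. */
    static int semaphore_incr_sketch(struct channel_gk20a *c)
    {
        struct priv_cmd_entry *cmd = NULL;
        int err;

        err = gk20a_channel_alloc_priv_cmdbuf(c, 4, &cmd); /* assumed helper */
        if (err)
            return err;

        if (setup_fails(c, cmd)) {              /* hypothetical failure */
            gk20a_free_priv_cmdbuf(c, cmd);     /* now callable from here */
            return -EINVAL;
        }
        return 0;
    }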

Bug 1732449
JIRA DNVGPU-12

Change-Id: Id2ea13e5553d50475ee1bbf94781e18590321fdf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1162686
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-14 14:01:03 -07:00
Terje Bergstrom
edd080b05a gpu: nvgpu: Disable channel watchdog
Bug 200198908

Change-Id: I4dfb3517f5467f8b5449e65290453ba1c828243d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1163439
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-06-14 04:50:48 -07:00
Lakshmanan M
a295d90cac gpu: nvgpu: Add uapi support for non-graphics engines
Extend the existing NVGPU_GPU_IOCTL_OPEN_CHANNEL interface to allow
opening channels on runlists other than the primary (i.e., graphics)
runlist. This is required to push work to dGPU engines that have
their own runlists, such as the asynchronous copy engines and the
multimedia engines.
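
One plausible shape for the extended ioctl argument, as a hedged sketch
(the exact uapi layout lives in the nvgpu uapi header):

    #include <linux/types.h>

    /* Sketch only: field names are illustrative. */
    struct nvgpu_gpu_open_channel_args_sketch {
        union {
            __s32 channel_fd;   /* out: fd of the newly opened channel */
            __s32 runlist_id;   /* in: engine runlist; -1 = graphics */
        };
    };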

Minor change: added active_engines_list allocation
and assignment for the fifo_vgpu back end.

JIRA DNVGPU-25

Change-Id: I3ed377e2c9a2b4dd72e8256463510a62c64e7a8f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161541
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-13 07:45:19 -07:00
Alex Waterman
832e4fce1e gpu: nvgpu: Rework the channel timeout handler messages
Rework the messages in the channel timeout handler to be a little
more verbose and clearer about what is happening.

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ifc018d99c647b3036caa8ad453e5e3dfc4151396
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1153669
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-13 07:38:57 -07:00
Alex Waterman
5f36481371 gpu: nvgpu: Remove dead priv_cmdbuf code
Remove the gp_get and gp_put pointers from the priv_cmdbuf code. These
pointers appear to track the position of the priv_cmdbuf in the gp_fifo.
However, they are not used for anything, nor are they needed for anything
in the future. This code appears to be a relic left over from the past.

Change-Id: Ibed1a6d51fa0cac12c5e0429760e8e2f611fc899
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1161859
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-13 07:38:14 -07:00
Deepak Nibade
2bc8f4e36d gpu: nvgpu: fix event id polling
In gk20a_event_id_poll(), we always set the mask value
and return it. This causes poll() from the UMD to always
succeed, irrespective of whether an event was really generated or not.

Fix this by adding a flag, event_posted, for each event.
Set this flag while posting the event.
In gk20a_event_id_poll(), set the mask value only if
this flag is set, and clear the flag once the mask is set.
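
A hedged sketch of the fixed poll logic (field names follow the commit
message; the lock is assumed):

    static unsigned int gk20a_event_id_poll_sketch(struct file *filep,
                                                   poll_table *wait)
    {
        unsigned int mask = 0;
        struct gk20a_event_id_data *data = filep->private_data;

        poll_wait(filep, &data->event_id_wq, wait);

        mutex_lock(&data->lock);            /* assumed lock */
        if (data->event_posted) {           /* only if really posted */
            mask = POLLPRI | POLLIN;
            data->event_posted = false;     /* consume the event */
        }
        mutex_unlock(&data->lock);

        return mask;
    }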

Bug 200089620

Change-Id: If14236547c611fe4bfa1410ff5b69c9fa02d43bb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1160253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-10 08:14:49 -07:00
Lakshmanan M
6299b00beb gpu: nvgpu: Add multiple engine and runlist support
This CL covers the following modifications:
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
   for the gm206 GPU family
5) Added a generic mechanism to identify the
   CE engine pri_base address for gm206
   (CE0, CE1 and CE2)
6) Removed the hard-coded engine_id logic and
   made it generic
7) Code cleanup for readability

JIRA DNVGPU-26

Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-07 12:31:34 -07:00
Deepak Nibade
8b05c705fb gpu: nvgpu: fix TSG abort sequence
In gk20a_fifo_abort_tsg(), we loop through the channels of the
TSG and call gk20a_channel_abort() for each channel.

This is incorrect, since we disable and preempt each
channel separately, whereas we should disable all channels
at once and use the TSG-specific API to preempt the TSG.

Fix this with the sequence below:
- gk20a_disable_tsg() to disable all channels
- preempt the TSG if required
- for each channel in the TSG:
  - set the has_timedout flag
  - call gk20a_channel_abort_clean_up() to clean up channel state

Also, separate out a common gk20a_channel_abort_clean_up() API
which can be called from both the channel and TSG abort routines.

In gk20a_channel_abort(), call gk20a_fifo_abort_tsg() if the
channel is part of a TSG.

Add a new argument "preempt" to gk20a_fifo_abort_tsg() and
preempt the TSG if the flag is set.
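
A hedged sketch of the fixed abort order (list and lock names are
assumptions):

    void gk20a_fifo_abort_tsg_sketch(struct gk20a *g, u32 tsgid, bool preempt)
    {
        struct tsg_gk20a *tsg = &g->fifo.tsg[tsgid];
        struct channel_gk20a *ch;

        gk20a_disable_tsg(tsg);                 /* 1) disable all channels */

        if (preempt)
            g->ops.fifo.preempt_tsg(g, tsgid);  /* 2) TSG-level preempt */

        mutex_lock(&tsg->ch_list_lock);         /* assumed list lock */
        list_for_each_entry(ch, &tsg->ch_list, ch_entry) {
            ch->has_timedout = true;            /* 3) mark ... */
            gk20a_channel_abort_clean_up(ch);   /* ... and clean up */
        }
        mutex_unlock(&tsg->ch_list_lock);
    }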

Bug 200205041

Change-Id: I4eff5394d26fbb53996f2d30b35140b75450f338
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157190
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-01 13:04:02 -07:00
Deepak Nibade
31fa6773ef gpu: nvgpu: use correct APIs for disable and preempt
In gr_gk20a_ctx_zcull_setup(), gr_gk20a_update_smpc_ctxsw_mode(),
and in gk20a_channel_suspend(), we call channel-specific APIs
to disable/preempt/enable the channel, but we do not consider
TSGs in this case.

Hence, use the correct APIs (below) in the above functions, which
handle channels and TSGs internally:
gk20a_disable_channel_tsg()
gk20a_fifo_preempt()
gk20a_enable_channel_tsg()
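
A hedged sketch of what such a wrapper does (the TSG predicate is an
assumption):

    /* Dispatch to the TSG or bare-channel variant transparently. */
    void gk20a_disable_channel_tsg_sketch(struct gk20a *g,
                                          struct channel_gk20a *ch)
    {
        if (gk20a_is_channel_marked_as_tsg(ch))   /* assumed predicate */
            gk20a_disable_tsg(&g->fifo.tsg[ch->tsgid]);
        else
            gk20a_disable_channel(ch);
    }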

Bug 200205041

Change-Id: Ieed378dac4ad2322b35f9102706176ec326d386c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157189
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-01 13:03:56 -07:00
Terje Bergstrom
97a58123e8 gpu: nvgpu: Free unbound channels
If a channel cannot be allocated a GPFIFO, its status will remain
unbound. We still need to free the channel when the fd gets closed.

Bug 1769481

Change-Id: I8a9170f431cfa98dcf9e3cbc082393f02d2203db
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1154725
2016-05-31 12:24:08 -07:00
Richard Zhao
d707c5a444 gpu: nvgpu: add tsg support for vgpu
- make tsg_gk20a.c call the HAL for enabling/disabling channels
- add preempt_tsg HAL callbacks
- add tsg bind/unbind channel HAL callbacks
- add the corresponding tsg callbacks for vgpu

Bug 1702773
JIRA VFND-1003

Change-Id: I2cba74b3ebd3920ef09219a168e6433d9574dbe8
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144932
(cherry picked from commit c3787de7d38651d46969348f5acae2ba86b31ec7)
Reviewed-on: http://git-master/r/1126942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-31 10:47:22 -07:00
Deepak Nibade
dc7af18bf8 gpu: nvgpu: suppress prints in submit path
When we run out of gpfifo space or private command buffer
space, we get error spew like the following:
__gk20a_channel_syncpt_incr: not enough priv cmd buffer space
gk20a_submit_channel_gpfifo: fail

Dumping these prints to the UART increases submit
latencies.

But on these failures we return -ENOSPC to the UMD, and the
UMD then retries the submit; hence it is unnecessary to dump
these prints.

Hence, remove the error prints for insufficient space,
and use gk20a_dbg_fn() instead of gk20a_err() to report failure
in gk20a_submit_channel_gpfifo().

Bug 200202653

Change-Id: I49efd7c6c554dd4fbfa4e66d196eb352e69f92c6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1152378
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-24 12:35:34 -07:00
Thomas Fleury
64f2e3ee9b gpu: nvgpu: update trace for sched params
Use an inline function instead of a macro to
"expand" all channel parameters.

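The general idea, as a hedged sketch (the tracepoint name and channel
fields here are illustrative): an inline function gives type checking and
exactly-once argument evaluation where a macro pastes its argument
textually:

    /* Before: textual expansion, no type checking, 'ch' evaluated thrice. */
    #define trace_sched_params_macro(ch) \
        trace_gk20a_sched_params((ch)->hw_chid, (ch)->tsgid, \
                                 (ch)->timeslice_us)

    /* After: the same parameter expansion, but type checked and with
     * 'ch' evaluated exactly once. */
    static inline void trace_sched_params_sketch(struct channel_gk20a *ch)
    {
        trace_gk20a_sched_params(ch->hw_chid, ch->tsgid, ch->timeslice_us);
    }
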
Jira EVLR-244
Jira EVLR-318

Change-Id: I4e8c5ee6bc9da36564af171be809f50dd2dfd439
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1150050
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-21 11:34:18 -07:00
Thomas Fleury
47e3d2e905 gpu: nvgpu: fix engine reset in FECS trace
In the virtualization case, the VM server is the only one
allowed to write to the ctxsw ring buffer. It also
generates an event in case of an engine reset.
Only generate a tracepoint on the guest OS side.

JIRA EVLR-314

Change-Id: I2cb09780a9b41237fe196ef1f3515923f36a24a4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1130743
(cherry picked from commit 4bbf9538e2a3375eb86b2feea6c605c3eec2ca40)
Reviewed-on: http://git-master/r/1133614
(cherry picked from commit 2076d944db41b37143c27795b3cffd88e99e0b00)
Reviewed-on: http://git-master/r/1150046
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-21 11:33:23 -07:00
Thomas Fleury
4df6cd4a34 gpu: nvgpu: add ctxsw channel reset event
Generate a ctxsw channel reset event when an engine needs to be reset.
This event is generated by the driver, while other events are
generated by FECS.

JIRA EVLR-314

Change-Id: I7791cf3e538782464c37c442c871acb177484566
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1129029
(cherry picked from commit 114038a1a5d9e8941bc53f3e95115b01dd1f8c6e)
Reviewed-on: http://git-master/r/1134379
(cherry picked from commit 15fa2a7b48a0937dfd449ca0c4ed5ad3a863d6bf)
Reviewed-on: http://git-master/r/1123916
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-21 11:29:13 -07:00
Konsta Holtta
dc45473eeb gpu: nvgpu: use mem_desc in priv_cmd_entry
Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a
reference to the underlying mem_desc (within priv_cmd_queue) paired with
an offset, for buffer aperture flexibility.
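
A hedged sketch of the resulting access pattern (the entry field names
are assumptions):

    /* Write one word of a priv cmd entry through the mem_desc reference
     * instead of a raw cpu pointer. */
    static void priv_cmd_append_sketch(struct gk20a *g,
                                       struct priv_cmd_entry *e,
                                       u32 word, u32 value)
    {
        /* e->mem references the queue's mem_desc; e->off is the entry's
         * starting word within it. */
        gk20a_mem_wr32(g, e->mem, e->off + word, value);
    }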

JIRA DNVGPU-21
JIRA DNVGPU-23

Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145922
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-18 11:54:34 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and the mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(as until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; also add gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8- and 16-bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
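
A hedged sketch of the resulting accessor family (argument names are
illustrative; w is a 32-bit word index, offset is in bytes):

    u32  gk20a_mem_rd32(struct gk20a *g, struct mem_desc *mem, u32 w);
    void gk20a_mem_wr32(struct gk20a *g, struct mem_desc *mem, u32 w,
                        u32 data);

    /* byte-indexed accesses and memcpy()-like helpers */
    u32  gk20a_mem_rd(struct gk20a *g, struct mem_desc *mem, u32 offset);
    void gk20a_mem_wr(struct gk20a *g, struct mem_desc *mem, u32 offset,
                      u32 data);
    void gk20a_mem_rd_n(struct gk20a *g, struct mem_desc *mem, u32 offset,
                        void *dest, u32 size);
    void gk20a_mem_wr_n(struct gk20a *g, struct mem_desc *mem, u32 offset,
                        void *src, u32 size);
    void gk20a_memset(struct gk20a *g, struct mem_desc *mem, u32 offset,
                      u32 value, u32 size);

    /* map/unmap abstraction replacing bare vmap()/vunmap() pairs */
    int  gk20a_mem_begin(struct gk20a *g, struct mem_desc *mem);
    void gk20a_mem_end(struct gk20a *g, struct mem_desc *mem);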

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add a separate IOCTL, NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE,
to allow setting preemption modes from the UMD.

Define the preemption modes in nvgpu.h and use them everywhere.
Remove the mode definitions from mm_gk20a.h.

Also, we currently support setting only one preemption mode on a
channel, but it is possible to have multiple preemption modes (one for
graphics and one for compute) set simultaneously.

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode).

The NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE API likewise supports
setting two separate preemption modes, i.e., one for graphics
and one for compute.

Make the necessary changes in code to support two preemption
modes.
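
A hedged sketch of the ioctl argument (field names follow the commit
message; the zero convention is an assumption):

    #include <linux/types.h>

    /* Two independent mode fields; 0 could mean "leave unchanged". */
    struct nvgpu_preemption_mode_args_sketch {
        __u32 graphics_preempt_mode;   /* e.g. WFI or GFXP */
        __u32 compute_preempt_mode;    /* e.g. WFI, CTA or CILP */
    };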

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Haley Teng
4c4d0e6eb2 nvgpu: vgpu: create fifo.force_reset_ch in gpu_ops
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.
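
A hedged sketch of the HAL hook (the vgpu handler name is an assumption):

    struct fifo_ops_sketch {
        int (*force_reset_ch)(struct channel_gk20a *ch, bool verbose);
    };

    /* native init: gops->fifo.force_reset_ch = gk20a_fifo_force_reset_ch;
     * vgpu init:   gops->fifo.force_reset_ch = vgpu_fifo_force_reset_ch; */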

Bug 200184349

Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 09:52:04 -07:00
Alex Waterman
70d5313882 gpu: nvgpu: update priv_cmdbuff computation
Update the priv_cmdbuf size computation to take into account the amount
of memory semaphores take. Since semaphores always require more
memory than sync-pts, the sync-pt computation has been dropped.
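
A hedged sketch of the sizing rule (the per-command word counts are
illustrative, not the real values):

    /* Size the queue for the worst case, which is now the semaphore path. */
    static u32 priv_cmdbuf_size_sketch(u32 gpfifo_entries)
    {
        u32 wait_words = 8;   /* assumed size of a semaphore wait cmd */
        u32 incr_words = 10;  /* assumed size of a semaphore incr cmd */

        /* one wait plus one incr per job, one job per gpfifo entry */
        return gpfifo_entries * (wait_words + incr_words) * (u32)sizeof(u32);
    }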

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ic05c26b4d1ed9cbd03d3239655c4607bb418396c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1141420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-06 12:13:52 -07:00
Thomas Fleury
93678f571c gpu: nvgpu: Add trace and debugfs for sched params
JIRA EVLR-244
JIRA EVLR-318

Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-05 09:25:02 -07:00
Deepak Nibade
71beac3de0 gpu: nvgpu: cancel clean up before update_fn_work
In gk20a_channel_suspend(), we first cancel the worker
update_fn_work and then cancel the job clean-up worker.

But it is possible that update_fn_work gets re-scheduled from
the job clean-up worker after it was cancelled,
and in that case update_fn_work might run after
we power off the GPU.

Fix this by first cancelling the job clean-up worker and
then the update_fn_work worker.
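
A hedged sketch of the corrected ordering (worker field names follow the
commit messages):

    static void channel_cancel_workers_sketch(struct channel_gk20a *ch)
    {
        /* 1) the clean-up worker is the one that can re-arm
         *    update_fn_work, so cancel it first */
        cancel_delayed_work_sync(&ch->clean_up.wq);

        /* 2) now update_fn_work can no longer be re-scheduled */
        cancel_work_sync(&ch->update_fn_work);
    }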

Bug 200187905

Change-Id: Ia192c515702f14becf60d92c6471d8c0e892551e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120426
(cherry picked from commit b132b6c5aa9ab88c6733f36906f1b874114ec72d)
Reviewed-on: http://git-master/r/1134888
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-03 09:02:13 -07:00
Deepak Nibade
caed43e0a2 gpu: nvgpu: return from worker if gpu is not up
During the GPU shutdown path, it is possible that we
shut down the GPU while the worker thread is still
running gk20a_channel_update().

Hence, before accessing gp_put/get, check whether the GPU
is up or not.

Bug 200166139

Change-Id: Iba3ec173041a84527c4700a93f20564a842cfb01
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935193
(cherry picked from commit c81ea5fe383c44e872754b363968af57d84225ac)
Reviewed-on: http://git-master/r/1121917
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-03 09:01:20 -07:00
Alex Waterman
4c5e29436d gpu: nvgpu: Flush FB before checking semaphores
Before checking semaphore values to determine whether jobs have been
completed, flush the FB. If this is not done, outstanding writes can
still be pending despite the semaphore memory being mapped as volatile
in the GMMU.
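
A hedged sketch of the check (the flush hook exists in the mm HAL; the
release helper is an assumption):

    static bool semaphore_released_sketch(struct gk20a *g,
                                          struct gk20a_semaphore *s)
    {
        g->ops.mm.fb_flush(g);  /* make outstanding writes visible */
        return gk20a_semaphore_is_released(s);  /* assumed helper */
    }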

Bug 1732449
JIRA DNVGPU-12

Change-Id: I67b596cd23a5465af05d6d173641a579cb7f168c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133787
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-29 14:44:48 -07:00
Alex Waterman
a191fd905a gpu: nvgpu: Schedule channel update when jobs actually finish
Do not schedule channel update callbacks unless a job has actually
finished. This saves a lot of callbacks to the CDE code that don't
do anything when semaphores are enabled.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I2f9a78498b08ebca44ee6a5171931a07721767f1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133786
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-29 14:44:24 -07:00
Deepak Nibade
63319c0fb9 gpu: nvgpu: fix sparse warnings
Fix below sparse warnings :
drivers/gpu/nvgpu/gk20a/gk20a.c:764:5: warning: symbol
'gk20a_pm_finalize_poweron' was not declared. Should it be static?
drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2504:14:
warning: symbol 'gk20a_event_id_poll' was not declared. Should it be
static?
drivers/gpu/nvgpu/gk20a/channel_gk20a.c:2538:5:
warning: symbol 'gk20a_event_id_release' was not declared. Should it be
static?

Bug 200067946
Bug 200088648

Change-Id: I5c23e7ee09c1a18fe2eeff12f80a3c2bf73120ef
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1128060
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com>
2016-04-20 02:16:14 -07:00
Deepak Nibade
2ac8c9729a gpu: nvgpu: make jobs_lock more fine grained
While processing all the jobs in gk20a_channel_clean_up_jobs(),
we currently acquire jobs_lock, traverse the list,
clean up the jobs, and then release the lock.

But in this case we might hold the lock for too long,
blocking the submit path.

Hence, make jobs_lock more fine-grained by restricting
it to list accesses only.
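
One way to restrict the lock to list accesses, as a hedged sketch (the
clean-up helper is hypothetical; the real patch may re-take the lock per
node instead of splicing):

    static void clean_up_jobs_sketch(struct channel_gk20a *c)
    {
        struct channel_gk20a_job *job, *n;
        struct list_head jobs;

        INIT_LIST_HEAD(&jobs);

        mutex_lock(&c->jobs_lock);
        list_splice_init(&c->jobs, &jobs);  /* detach the whole list */
        mutex_unlock(&c->jobs_lock);

        /* the submit path can take jobs_lock while we clean up */
        list_for_each_entry_safe(job, n, &jobs, list)
            channel_clean_up_job(c, job);   /* hypothetical helper */
    }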

Bug 200187553

Change-Id: If82af8ff386f7bc29061cfd57fdda7df62f11c17
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120412
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:17:30 -07:00
Deepak Nibade
c8d17e9167 gpu: nvgpu: remove submit lock
Remove the submit lock, since we have moved to
more fine-grained locks.

Remove the API check_gp_put(), since we cannot call
it in the submit path due to latencies, and we cannot
call it in gk20a_channel_clean_up_jobs() anymore
since it would fail there without the lock.

Bug 200187553

Change-Id: I05b9fa95c9009000e13232d8fa567336eeee11c6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120411
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:16:44 -07:00
Deepak Nibade
e0c9da1fe9 gpu: nvgpu: implement sync refcounting
We currently free the sync when we find the job list empty.
If aggressive_sync is set to true, we also try to free the
sync during the channel unbind() call.

But we rarely free the sync from channel_unbind(),
since freeing it when the job list is empty is
aggressive enough.

Hence, remove the sync-free code from channel_unbind().

Implement refcounting for the sync (sketched below):
- get a refcount while submitting a job (and
  allocate the sync if it is not allocated already)
- put a refcount while freeing the job
- if refcount==0 and aggressive_sync_destroy is
  set, free the sync
- if aggressive_sync_destroy is not set, we
  free the sync at channel close time
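
A hedged sketch of the get/put pair (field locations are assumptions):

    static int channel_sync_get_sketch(struct channel_gk20a *c)
    {
        mutex_lock(&c->sync_lock);
        if (!c->sync) {
            c->sync = gk20a_channel_sync_create(c);  /* lazy alloc */
            if (!c->sync) {
                mutex_unlock(&c->sync_lock);
                return -ENOMEM;
            }
        }
        atomic_inc(&c->sync->refcount);              /* assumed field */
        mutex_unlock(&c->sync_lock);
        return 0;
    }

    static void channel_sync_put_sketch(struct channel_gk20a *c)
    {
        if (atomic_dec_and_test(&c->sync->refcount) &&
            c->aggressive_sync_destroy) {            /* assumed flag */
            c->sync->destroy(c->sync);
            c->sync = NULL;  /* otherwise freed at channel close */
        }
    }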

Bug 200187553

Change-Id: I74e24adb15dc26a375ebca1fdd017b3ad6d57b61
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120410
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:16:13 -07:00
Deepak Nibade
1c96bc6942 gpu: nvgpu: add lock for fences
All pre/post fence accesses in last_submit are
currently protected by submit lock
In order to remove the submit lock, move all fence
accesses under own lock i.e. fence_lock

Bug 200187553

Change-Id: I0132d1933dc92db8c5ed8c9311e49a030aa2d38c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120409
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:15:37 -07:00
Deepak Nibade
dfac8ce704 gpu: nvgpu: support binding multiple channels to a debug session
We currently bind only one channel to a debug session,
but some use cases might need multiple channels bound
to the same debug session.

Add this support by adding a list of channels to the debug session.
The list structure is implemented as struct dbg_session_channel_data.

The list node dbg_s_list_node is currently defined in struct
dbg_session_gk20a, but this is inefficient when we need to
add a debug session to multiple channels.

Hence, add a new reference structure, dbg_session_data, to
store the dbg_session pointer and the list entry.

For each NVGPU_DBG_GPU_IOCTL_BIND_CHANNEL call, create
two reference structures (dbg_session_channel_data for the channel
and dbg_session_data for the debug session) and bind them together.
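
A hedged sketch of the two reference structures (fields beyond those
named in the message are assumptions):

    struct dbg_session_channel_data_sketch {
        int channel_fd;
        struct channel_gk20a *ch;
        struct list_head ch_entry;    /* node in the session's channel list */
    };

    struct dbg_session_data_sketch {
        struct dbg_session_gk20a *dbg_s;
        struct list_head dbg_s_entry; /* node in the channel's session list */
    };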

Define an API, nvgpu_dbg_gpu_get_session_channel(), which
gets the first channel in the debug session's list.
Use this API wherever we refer to a channel bound to a debug
session.

Remove the dbg_sessions define in struct gk20a, since it is
not being used anywhere.

Add a new API, NVGPU_DBG_GPU_IOCTL_UNBIND_CHANNEL, to support
unbinding a channel from a debug session.

Bug 200156699

Change-Id: I3bfa6f9cd5b90e7254a75c7e64ac893739776b7f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120331
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:34 -07:00
Terje Bergstrom
7d8e219389 gpu: nvgpu: Use sysmem aperture for SoC memory
In a Tegra GPU, SoC memory has to be accessed as vidmem. In a discrete
GPU, it has to be accessed as sysmem.
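
A hedged sketch of the resulting selection (the flag and the aperture
constants are hypothetical names for illustration):

    static u32 soc_memory_aperture_sketch(struct gk20a *g)
    {
        /* iGPU: SoC memory sits behind the same fabric as video memory;
         * dGPU: SoC memory is reached over the bus, i.e. sysmem. */
        return g->is_igpu ? APERTURE_VIDMEM_SKETCH : APERTURE_SYSMEM_SKETCH;
    }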

Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
2016-04-15 08:50:34 -07:00
Terje Bergstrom
9b5427da37 gpu: nvgpu: Support GPUs with no physical mode
Support GPUs which cannot choose between SMMU and physical
addressing.

Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-04-13 13:12:41 -07:00
Terje Bergstrom
e8bac374c0 gpu: nvgpu: Use device instead of platform_device
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.

Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
2016-04-08 09:42:41 -07:00
Thomas Fleury
b2dd107455 gpu: nvgpu: add trace event for channel reset
Change-Id: I319e877978b7f483108ef8f67c05702b71709f62
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120501
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 13:56:38 -07:00
Deepak Nibade
16658fd39d gpu: nvgpu: post BPT_INT/PAUSE and BLOCKING_SYNC events
Post EVENT_ID_BPT_INT when bpt.int is pending.
Post EVENT_ID_BPT_PAUSE when bpt.pause is pending.
Post EVENT_ID_BLOCKING_SYNC whenever there is a
non-stalling semaphore interrupt indicating work
completion from the GR/CE2 engine.

Bug 200089620

Change-Id: I91b7bf48f8585f0d318298fc0c4a66d42055f0a7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1112274
(cherry picked from commit d2b744b1f9acac56435cd7e7ab9a7a845579ef24)
Reviewed-on: http://git-master/r/1120321
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 08:45:47 -07:00
Deepak Nibade
ce04ae15bb gpu: nvgpu: APIs to post event id events
Add the channel and TSG APIs below to post events
on the event_id interface:

gk20a_channel_event_id_post_event()
gk20a_tsg_event_id_post_event()
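
A hedged sketch of the channel variant (the lock and the lookup helper
are assumptions):

    void gk20a_channel_event_id_post_event_sketch(struct channel_gk20a *ch,
                                                  int event_id)
    {
        struct gk20a_event_id_data *data =
            gk20a_channel_get_event_data_from_id(ch, event_id);

        if (!data)
            return;

        mutex_lock(&data->lock);    /* assumed lock */
        data->event_posted = true;  /* consumed by poll() */
        mutex_unlock(&data->lock);
        wake_up_interruptible_all(&data->event_id_wq);
    }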

Bug 200089620

Change-Id: I0cfadc9ffdb880b2410f97758fad47905c620db1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1112267
(cherry picked from commit 9f50d7da4500af4dbf4dabe7916eda6fc220f4fb)
Reviewed-on: http://git-master/r/1120320
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-07 08:44:58 -07:00
Deepak Nibade
5f10073540 gpu: nvgpu: add TSG support to channel event id
Add an NVGPU_IOCTL_TSG_EVENT_ID_CTRL API to extend channel
event id support to TSGs.

This API accepts an event_id (like BPT.INT or
BPT.PAUSE) and a command to enable
the event, and returns a file descriptor on which
we can raise the event (if cmd=enable).

Events generated for TSGs reuse the file
operations "gk20a_event_id_ops".

Bug 200089620

Change-Id: I2f563c6d3a0988eb670caac2d3c7c6795724792c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030776
(cherry picked from commit 72b61fa266279038f013e582be80c21808e1038d)
Reviewed-on: http://git-master/r/1120319
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-07 08:44:38 -07:00
Deepak Nibade
e87ba53235 gpu: nvgpu: add channel event id support
With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can
raise events to user space, but user space
cannot distinguish between the various types of events.
To overcome this, we need a finer-grained API to
deliver the various events to user space.

Remove the old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL
and all of its supporting code (we can remove
this since user space has not started using the
API at all).

Add a new API, NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL,
which accepts an event_id (like BPT.INT or
BPT.PAUSE) and a command to enable the event,
and returns a file descriptor on which
we can raise the event (if cmd=enable).
The event is disabled when the file descriptor is closed.

Add file operations "gk20a_event_id_ops"
to support polling on the event fd.

Also add an API, gk20a_channel_get_event_data_from_id(),
to get the event_data of an event from its id.
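
A hedged userspace sketch of the intended flow (the ioctl name is from
the commit message; the args struct and its fields are assumptions):

    #include <poll.h>
    #include <sys/ioctl.h>
    #include <linux/types.h>

    /* Illustrative stand-in for the real args struct in the uapi header. */
    struct event_id_ctrl_args_sketch {
        __u32 cmd;        /* enable/disable command */
        __u32 event_id;   /* e.g. the BPT.INT event */
        __s32 event_fd;   /* out: fd on which the event is raised */
    };

    static int wait_for_event_sketch(int channel_fd, __u32 cmd_enable,
                                     __u32 event_id)
    {
        struct event_id_ctrl_args_sketch args = {
            .cmd = cmd_enable,
            .event_id = event_id,
        };
        struct pollfd pfd;

        if (ioctl(channel_fd, NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL, &args))
            return -1;

        pfd.fd = args.event_fd;
        pfd.events = POLLIN;
        return poll(&pfd, 1, -1);  /* blocks until the event is posted */
    }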

Bug 200089620

Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030775
(cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275)
Reviewed-on: http://git-master/r/1120318
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-07 08:43:49 -07:00
Seshendra Gadagottu
f5ce107f19 gpu: nvgpu: dump gpu status on channel wdt
To get more useful debug info, dump the GPU status
on a channel watchdog timeout.

Bug 200163782

Change-Id: Ie160e3a65650b87f812e0c1d1d9b54814b45a1d7
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1115439
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-29 11:14:21 -07:00
Anton Vorontsov
1c40d09c4c gpu: nvgpu: Add support for FECS ctxsw tracing
Bug 1648908

This commit adds support for FECS ctxsw tracing. The code is compiled
conditionally under CONFIG_GK20_CTXSW_TRACE.
This feature requires an updated FECS ucode that writes one record to a
ring buffer on each context switch. On the RM/kernel side, the GPU driver
reads records from the master ring buffer and generates trace entries into
a user-facing VM ring buffer. For each record in the master ring buffer,
the RM/kernel has to retrieve the vmid+pid of the user process that
submitted the related work.
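
A hedged sketch of the consume loop (all names are illustrative; the real
ring accessors live in the FECS trace code):

    static void fecs_trace_drain_sketch(struct gk20a *g)
    {
        u32 get = fecs_ring_get_sketch(g);   /* hypothetical accessors */
        u32 put = fecs_ring_put_sketch(g);

        while (get != put) {
            struct fecs_record_sketch *r = fecs_ring_record_sketch(g, get);
            int vmid;
            pid_t pid;

            /* map the FECS context pointer back to its submitter */
            lookup_context_owner_sketch(r->context_ptr, &vmid, &pid);
            emit_user_trace_entry_sketch(g, r, vmid, pid);
            get = fecs_ring_advance_sketch(g, get);
        }
        fecs_ring_set_get_sketch(g, get);    /* ack the consumed records */
    }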

Features currently implemented:
- master ring buffer allocation
- debugfs to dump master ring buffer
- FECS record per context switch (with both current and new contexts)
- dedicated device for ctxsw tracing (access to VM ring buffer)
- SOF generation (and access to PTIMER)
- VM ring buffer allocation, and reconfiguration
- enable/disable tracing at user level
- event-based trace filtering
- context_ptr to vmid+pid mapping
- read system call for ctxsw dev
- mmap system call for ctxsw dev (direct access to VM ring buffer)
- poll system call for ctxsw dev
- save/restore register on ELPG/CG6
- separate user ring from FECS ring handling

Features requiring ucode changes:
- enable/disable tracing at FECS level
- actual busy time on engine (bug 1642354)
- master ring buffer threshold interrupt (P1)
- API for GPU to CPU timestamp conversion (P1)
- vmid/pid/uid based filtering (P1)

Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1022737
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-23 07:48:47 -07:00
Aingara Paramakuru
82da6ed595 gpu: nvgpu: add support to set channel timeslice
As part of improving GPU scheduling, userspace can now set a
channel's timeslice, within reasonable limits imposed by the
kernel driver.
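
A hedged sketch of the clamp (the bounds and the re-program helper are
illustrative):

    #define TIMESLICE_MIN_US_SKETCH 1000
    #define TIMESLICE_MAX_US_SKETCH 50000

    static int channel_set_timeslice_sketch(struct channel_gk20a *ch, u32 us)
    {
        if (us < TIMESLICE_MIN_US_SKETCH || us > TIMESLICE_MAX_US_SKETCH)
            return -EINVAL;

        ch->timeslice_us = us;
        return channel_update_runlist_sketch(ch);  /* hypothetical */
    }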

JIRA VFND-1312
Bug 1729664

Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1020917
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-22 10:42:45 -07:00
Richard Zhao
97108797a2 gpu: nvgpu: enable semaphore acquire timeout only when timeouts_enabled is set
Bug 1727687

Change-Id: I7a7a4a2011b029474122fdbfbeb02b6302a5902b
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1011486
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-22 10:19:23 -07:00
Aingara Paramakuru
2a58d3c27b gpu: nvgpu: improve channel interleave support
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).

By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).
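
A hedged sketch of what the levels mean for runlist construction (the
repeat counts are illustrative, not the values documented in nvgpu.h):

    static u32 runlist_entries_for_level_sketch(u32 interleave_level)
    {
        switch (interleave_level) {
        case NVGPU_RUNLIST_INTERLEAVE_LEVEL_LOW:    return 1;
        case NVGPU_RUNLIST_INTERLEAVE_LEVEL_MEDIUM: return 2;
        case NVGPU_RUNLIST_INTERLEAVE_LEVEL_HIGH:   return 4;
        default:                                    return 1;
        }
    }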

As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.

JIRA VFND-1302
Bug 1729664

Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-15 16:23:44 -07:00
Konsta Holtta
f07a046a52 gpu: nvgpu: validate wait notification offset
Make sure that the notification object fits within the supplied buffer.
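
A hedged sketch of the bounds check (written to be safe against offset
overflow):

    static int validate_notification_offset_sketch(u64 buffer_size,
                                                   u64 offset,
                                                   u64 object_size)
    {
        if (offset > buffer_size || object_size > buffer_size - offset)
            return -EINVAL;
        return 0;
    }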

Bug 1739182

Change-Id: Ifb66f848e3758438f37645be6f534f5b60260214
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1026431
(cherry picked from commit 2484c47f123c717030aa00253446e8756e1a0807)
Reviewed-on: http://git-master/r/1030875
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-15 16:22:32 -07:00
Konsta Holtta
ec023c3ff7 gpu: nvgpu: validate error notifier offset
Make sure that the notifier object fits within the supplied buffer.

Bug 1739183
Bug 1739932

Change-Id: I713574ce797ffc23cec10b5114f469dbadc68f1e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1026410
(cherry picked from commit f476b93eb19b962b8760457102448bd533efc54d)
Reviewed-on: http://git-master/r/1028737
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-15 16:21:44 -07:00
Deepak Nibade
8b665ac6b2 gpu: nvgpu: move clean up of jobs to separate worker
We currently clean up the jobs in gk20a_channel_update(),
which is called from the nvhost worker thread.

Instead of doing this, schedule another delayed worker,
clean_up_work, to clean up the jobs (with a delay of 1 jiffy).

Keep update_gp_get() in channel_update() and not in the
delayed worker, since this helps with better
bookkeeping of gp_get.

Also, this scheduling delays job clean-up, so
that more jobs are batched for clean-up
and hence less time is consumed by the worker.
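
A hedged sketch of the split (worker field names follow the commit
messages):

    /* channel_update() keeps the gp_get book-keeping and only kicks the
     * delayed worker; the jobs themselves are reaped later, batched. */
    static void channel_update_sketch(struct channel_gk20a *c)
    {
        update_gp_get(c->g, c);                    /* keep gp_get current */
        schedule_delayed_work(&c->clean_up.wq, 1); /* 1 jiffy delay */
    }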

Bug 1718092

Change-Id: If3b94b6aab93c92da4cf0d1c74aaba756f4cd838
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/931701
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-05 08:07:10 -08:00
Richard Zhao
8fb33d92b0 gpu: nvgpu: vgpu: add channel_set_priority support
- add gops.fifo.channel_set_priority and move the current code
  to the native callback
- implement the callback for vgpu

Bug 1701079

Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/932829
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-25 15:22:22 -08:00
Deepak Nibade
cd09ac26c7 gpu: nvgpu: move resetup_ramfc() out of sync_lock
We currently have this sequence:
- acquire sync_lock
- sync_create
- resetup_ramfc()
- release sync_lock

But this can lead to a deadlock in case resetup_ramfc()
triggers the stack below:
- resetup_ramfc()
  - channel_preempt() - preemption fails
  - trigger recovery
  - channel_abort()
    - acquire sync_lock

Fix this by moving resetup_ramfc() out of sync_lock.

resetup_ramfc() is still protected by submit_lock, and
hence we cannot free the sync after allocation and
before resetup.
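
A hedged sketch of the reordered sequence (the HAL hook name is an
assumption):

    static int channel_resetup_sync_sketch(struct channel_gk20a *c)
    {
        mutex_lock(&c->sync_lock);
        if (!c->sync)
            c->sync = gk20a_channel_sync_create(c);
        mutex_unlock(&c->sync_lock);  /* drop before touching RAMFC */

        /* resetup may preempt and, on failure, trigger recovery and
         * channel_abort(), which takes sync_lock again; submit_lock,
         * still held by the caller, keeps the new sync alive */
        return c->g->ops.fifo.resetup_ramfc(c);
    }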

Bug 200165811

Change-Id: Iebf74d950d6f6902b6d180c2cd8cd2d50493062c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/931726
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-21 07:52:11 -08:00
Richard Zhao
7095a72e56 gpu: nvgpu: fix tsg bugs
- correct the runlist entry type for tsg
- consider tsg when preempting a channel

Bug 1617046

Change-Id: Ie067df17fb53ae91c49403637a5f35fc3710e0b3
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/926571
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 08:34:55 -08:00