Commit Graph

212 Commits

Author SHA1 Message Date
Seema Khowala
27e3546175 gpu: nvgpu: add new tsg functions for ctxsw timeout re-org
Add nvgpu_tsg_set_error_notifier function for setting error_notifier
for all channels of a tsg.

Add nvgpu_tsg_timeout_debug_dump_state function for finding if
timeout_debug_dump is set for any of the channels of a tsg.

Add nvgpu_tsg_set_timeout_accumulated_ms to set
timeout_accumulated_ms for all the channels of a tsg.

JIRA NVGPU-1312

Change-Id: Ib2daf2d462c2cf767f5a6e6fd3436abf6860091d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2077626
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-22 05:20:01 -07:00
Deepak Nibade
7fa2189fb3 gpu: nvgpu: move fecs_trace operations under gr
Move g->ops.fecs_trace.*() HAL operations under gr operations as
g->ops.gr.fecs_trace.*()

Also rename gk20a_ctxsw_*() functions used in common code to the
format nvgpu_gr_fecs_trace_*()

Jira NVGPU-1880

Change-Id: Idf2f8fb3d7ba2832bf1837fd97b70b3cee412123
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2070767
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-16 05:05:41 -07:00
Deepak Nibade
1208ad7cef gpu: nvgpu: rearrange linux specific fecs trace support
We have 3 header files for FECS tracing support
include/nvgpu/gr/fecs_trace.h : common header
include/nvgpu/ctxsw_trace.h : header that includes both common and
                              os-specific functions
os/linux/ctxsw_trace.h : linux specific header

Remove the second header since it is not needed.

Move all structures that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all function declarations that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all linux specific declarations in os/linux/ctxsw_trace.h and
rename this file as os/linux/fecs_trace_linux.h

Also rename os/linux/ctxsw_trace.c to os/linux/fecs_trace_linux.c

Jira NVGPU-1880

Change-Id: I05cc4489c4b6a64880b7d59c02b22cd2244d5e22
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2070766
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-16 05:05:32 -07:00
Thomas Fleury
ffed5095db gpu: nvgpu: move fifo init/deinit code to common
Add fifo sub-unit to common.fifo to handle init/deinit code
and global support functions.

Split init into:
- nvgpu_channel_setup_sw
- nvgpu_tsg_setup_sw
- nvgpu_fifo_setup_sw
- nvgpu_runlist_setup_sw
- nvgpu_engine_setup_sw
- nvgpu_userd_setup_sw
- nvgpu_pbdma_setup_sw

Split de-init into
- nvgpu_channel_cleanup_sw
- nvgpu_tsg_cleanup_sw
- nvgpu_fifo_cleanup_sw
- nvgpu_runlist_cleanup_sw
- nvgpu_engine_cleanup_sw
- nvgpu_userd_cleanup_sw
- nvgpu_pbdma_cleanup_sw

Added the following HALs
- runlist.length_max
- fifo.init_pbdma_info
- fifo.userd_entry_size

Last 2 HALs should be moved resp. to pbdma and userd sub-units,
when available.

Added vgpu implementation of above hals
- vgpu_runlist_length_max
- vgpu_userd_entry_size
- vgpu_channel_count

Use hals in vgpu_fifo_setup_sw.

Jira NVGPU-1306

Change-Id: I954f56be724eee280d7b5f171b1790d33c810470
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2029620
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-14 20:35:22 -07:00
Seema Khowala
cb91bf1e13 gpu: nvgpu: protect recovery with engines_reset_mutex
Rename gr_reset_mutex to engines_reset_mutex and acquire it
before initiating recovery. Recovery running in parallel with
engine reset is not recommended.

On hitting engine reset, h/w drops the ctxsw_status to INVALID in
fifo_engine_status register. Also while the engine is held in reset
h/w passes busy/idle straight through. fifo_engine_status registers
are correct in that there is no context switch outstanding
as the CTXSW is aborted when reset is asserted.

Use deferred_reset_mutex to protect deferred_reset_pending variable
If deferred_reset_pending is true then acquire engines_reset_mutex
and call gk20a_fifo_deferred_reset.
gk20a_fifo_deferred_reset would also check the value of
deferred_reset_pending before initiating reset process

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I47de669a6203e0b2e9a8237ec4e4747339b9837c
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2022373
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-13 06:34:31 -07:00
Deepak Nibade
3391aa9d84 gpu: nvgpu: move fecs_trace bind/unbind calls to gr/fecs_trace unit
Move below calls to gr/fecs_trace unit
gk20a_fecs_trace_bind_channel()
gk20a_fecs_trace_unbind_channel()

And rename them to
nvgpu_gr_fecs_trace_bind_channel()
nvgpu_gr_fecs_trace_unbind_channel()

We are not accessing any fifo/ch/tsg construct in gr/fecs_trace unit
hence update parameter list of above APIs to receive inst_block,
gr_ctx, subctx pointers directly instead of receiving channel_gk20a

Delete gk20a/fecs_trace_gk20a.* files since they are no longer
required. All the contents in those files are now moved to gr/fecs_trace
unit

Jira NVGPU-1880

Change-Id: I7ef9f0b66781b45155035237172ae400f02740e4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2032707
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-08 07:07:27 -08:00
Seema Khowala
5222d0ff4f gpu: nvgpu: do not do timeout_debug_dump for non fifo_error_idle_timeout
Any recovery that goes through gk20a_fifo_recover path e.g. gr error,
mmu fault or any recovery that involves engine recovery as well, will
still dump the full debug dump. This change will just avoid dumping debug
dump for force reset channels and pbdma intr if they do not involve
engine recovery. For FIFO_ERROR_IDLE_TIMEOUT error notifiers that
involves tsg recovery only, debug_dump will happen only if
timeout_debug_dump is set. timeout_debug_dump by default is set to true
but can be changed using NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX.

Bug 2092051

Change-Id: Ibbf3cd2c44c586d9deb9e61ffbf37945b8d9e428
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-07 15:14:24 -08:00
Nicolas Benech
ee6ef2a719 gpu: nvgpu: resolve MISRA 17.7 for WARN_ON
MISRA Rule-17.7 requires the return value of all functions to be used.
Fix is either to use the return value or change the function to return
void. This patch ensures that WARN and WARN_ON always return void; and
introduces a new nvgpu_do_assert construct to trigger the equivalent
of WARN_ON(true) so that stack can be dumped (depends on OS support)

JIRA NVGPU-677

Change-Id: Ie2312c5588ceb5b1db825d15a096149b63b69af4
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2018706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-05 11:14:46 -08:00
Debarshi Dutta
8db1955d74 gpu: nvgpu: split semaphore.c file into multiple units
The file semaphore.c is now split into 4 units namely
semaphore, semaphore_hw, semaphore_pool and semaphore_sea.

Each of the above units now have separate compilation units under
common/semaphore/. The public APIs corresponding to each unit is
present in include/nvgpu/semaphore.h. The dependency graph of the
below units is as follows where '->' indicates left depends on right.

semaphore -> semaphore_hw -> semaphore_pool -> semaphore_sea

Some of the other major changes made in this patch are as follows
  i) Renamed some of the functions.
  ii) Some functions are changed from private to public.
  iii) Public header for semaphore contains only the declaration of the
       corresponding structs as an opaque structure.
  iv) Constructed a private header to contain internal functions common
      to all the units and struct definitions corresponding to each unit.
  v)  Added new functions to provide access to internal members of the
      units.

Jira NVGPU-2076

Change-Id: I6f111647ba9a9a9f8ef9c658f316cd5d6276c703
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2022782
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-27 12:54:15 -08:00
Seema Khowala
2c0933de05 gpu: nvgpu: rename ch_timedout to unserviceable
ch_timedout is not a good variable name for broken and
unusable state of the channel. Rename ch_timedout to
unserviceable

Bug 2092051
Bug 2429295

Change-Id: I633eaff61928d5ef9836dcdc162b07e7a5e03881
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1996865
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-22 20:21:37 -08:00
Philip Elcan
c02bccd6db gpu: nvgpu: cond: use u32 for COND_WAIT timeout
The type for the timeout parameter to the NVGPU_COND_WAIT and
NVGPU_COND_WAIT_INTERRUPTIBLE macros was too weak. This updates these
macros to require a u32 for the timeout.

Users of the macros are updated to be compliant as necessary.

This addresses MISRA 10.3 violations for implicit conversions of types
of different size or essential type.

JIRA NVGPU-1008

Change-Id: I12368dfa81b137c35bd056668c1867f03a73b7aa
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017503
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-21 10:24:24 -08:00
Seema Khowala
13f37f9c70 gpu: nvgpu: remove gk20a_is_channel_marked_as_tsg
Use tsg_gk20a_from_ch to get tsg pointer for tsgid of a channel. For
invalid tsgid, tsg pointer will be NULL

Bug 2092051
Bug 2429295
Bug 2484211

Change-Id: I82cd6a2dc5fab4acb147202af667ca97a2842a73
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2006722
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-21 10:23:50 -08:00
Konsta Holtta
c330d8fd98 gpu: nvgpu: add channel HAL section for ccsr_*
Split out ops that belong to channel unit to a new section called
channel. Channel is a broad concept; this includes just the code that
accesses channel registers (ccsr_*). This is effectively just renaming;
the implementation still stays put.

The word "channel" is also dropped from certain HAL entries to avoid
redundancy (e.g., channel.disable_channel -> channel.disable).
fifo.get_num_fifos gets an entirely new name: channel.count.

Jira NVGPU-1307

Change-Id: I9a08103e461bf3ddb743aa37ababee3e0c73c861
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017261
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-12 17:05:34 -08:00
Philip Elcan
fa81cf9000 gpu: nvgpu: fifo: cleanup MISRA 10.3 violations
MISRA 10.3 prohibits assigning of objects of different size or essential
type. This fixes a number of violations in the common/fifo code.

JIRA NVGPU-1008

Change-Id: I138c27eb86f6e0f9481c39a94d6632e2b4360af8
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2009940
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 12:55:27 -08:00
Konsta Holtta
49506f257e gpu: nvgpu: split update_runlist HAL API in two
A comment for gk20a_fifo_update_runlist() says:

 /* add/remove a channel from runlist
    special cases below: runlist->active_channels will NOT be changed.
    (ch == NULL && !add) means remove all active channels from runlist.
    (ch == NULL &&  add) means restore all active channels on runlist. */

Those special cases call for a new function, so add that. Delete the
update_runlist HAL op and add update_for_channel (like update_runlist
without the special cases) and reload (no channel to add or remove, just
the special cases).

While at it, rename gk20a_fifo_update_runlist_ids to
nvgpu_runlist_reload_ids. It's common across chips and does what the
reload HAL does but for a list of several IDs.

Jira NVGPU-1922

Change-Id: I9a99ab03a636a1214c021faad359d2b304a9472f
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2013058
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-08 12:56:09 -08:00
Adeel Raza
d828e013db gpu: nvgpu: common: MISRA rule 15.6 fixes
MISRA rule 15.6 requires that all if/else/loop blocks should be enclosed
by brackets. This patch adds brackets to single line if/else/loop blocks
in the common directory.

JIRA NVGPU-775

Change-Id: I0dfb38dbf256d49bc0391d889d9fbe5e21da5641
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2011655
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 19:23:47 -08:00
Deepak Nibade
254253732c gpu: nvgpu: add new unit for GR subcontext
Add new unit common/gr/subctx.c to manage GR subcontext
This unit provides interfaces to allocate/free/load GR subcontext

Add new header file include/nvgpu/gr/subctx.h to declare all the
interfaces.

Right now channel_gk20a structure directly includes a nvgpu_mem
for context header.
Declare a new structure nvgpu_gr_subctx for subcontext and include
this from channel_gk20a

Make all necessary changes to refer ctx_header from subctx instead
of directly referencing it from channel

Jira NVGPU-1613

Change-Id: I9eb1ee8f26fa88d2881f9b294935b65e9cbcc9b4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1990129
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-02 03:03:43 -08:00
Seema Khowala
013ca60edd gpu: nvgpu: remove code for ch not bound to tsg
- Remove handling for channels that are no more bound to tsg
  as channel could be referenceable but no more part of a tsg
- Use tsg_gk20a_from_ch to get pointer to tsg for a given channel
- Clear unhandled gr interrupts

Bug 2429295
JIRA NVGPU-1580

Change-Id: I9da43a2bc9a0282c793b9f301eaf8e8604f91d70
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1972492
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-01 11:58:57 -08:00
Seema Khowala
aacc33bb47 gpu: nvgpu: do not use raw spinlock for ch->timeout.lock
With PREEMPT_RT kernel, regular spinlocks are mapped onto sleeping
spinlocks (rt_mutex locks), and raw spinlocks retain their behaviour.

Schedule while atomic can occur in gk20a_channel_timeout_start,
as it acquires ch->timeout.lock raw spinlock, and then calls
functions that acquire ch->ch_timedout_lock regular spinlock.

Bug 200484795

Change-Id: Iacc63195d8ee6a2d571c998da1b4b5d396f49439
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2004100
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-28 12:44:00 -08:00
Konsta Holtta
4e85ebc05f gpu: nvgpu: use channel pointer for update_runlist
A naked channel ID does not carry good information about the channel
validity and is a very low level construct for an API of this level.
Refactor the runlist updating fifo APIs to take a channel pointer.

While at it, delete the channel and wait_for_finish parameters from
gk20a_fifo_update_runlist_ids() - the only caller is suspend and resume
and the parameters were always null for channel and true for wait.

Jira NVGPU-1309
Jira NVGPU-1737

Change-Id: Ied350bc8e482d8e311cc708ab0c7afdf315c61cc
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1997744
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-25 11:44:47 -08:00
Konsta Holtta
6fda25e958 gpu: nvgpu: move runlist HAL ops to separate section
Split out ops that belong to runlist unit to a new section called
runlist. This is effectively just renaming; the implementation still
stays put.

Jira NVGPU-1309

Change-Id: Ib928164f8008f680d9cb13c969e3304ef727abba
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1997823
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-24 04:14:31 -08:00
Konsta Holtta
d1d1f56c49 gpu: nvgpu: skip nvgpu syncpoint in usermode submits
The nvgpu managed syncpoint is not needed for anything if a channel uses
usermode submits; in that case the channel would allocate an
user-managed syncpoint and use that. Create the channel sync in
nvgpu_channel_setup_bind() only if usermode submit is not enabled.

Bug 200466905

Change-Id: I976f4b4fd0c3131cb310c72b286329fb16f1f29a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1990270
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-09 09:35:18 -08:00
Konsta Holtta
8979a97af3 gpu: nvgpu: abstract out timeout rewinding
The channel timeout ends up in a strange state during timeout handling
for a brief moment; it can become stopped and started again, and the
timeout lock is released in the middle. Add a more explicit rewind
function to reset the timeout to start if it's active. The active check
allows to use this from gk20a_channel_timeout_restart_all_channels(), so
that's also modified.

Also replace the return statements with more readable control flow in
gk20a_channel_timeout_handler().

Change-Id: Ia7d67242dfc149ace1f4f841a837e90b6c985308
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1989327
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-08 08:24:55 -08:00
Adeel Raza
c961b7ed1d nvgpu: fifo: fix invalid ID macros
MISRA rule 10.1 prohibits using signed values with bitwise operators.
Make fifo invalid ID macros compliant with this MISRA rule.

Also use these macros in source code instead of hardcoded numbers to
make the code more readable.

JIRA NVGPU-1006

Change-Id: I2f336d1decbc53b08f93587f2e00ea2cce47f72b
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1983700
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-06 19:24:13 -08:00
Konsta Holtta
e05c0d13a0 gpu: nvgpu: add runlist unit to common
Extract non-chip-specific code that manages the runlists (init, update,
reschedule etc.) to a new file in the common directory. Move the
declarations to a new matching runlist.h header.

Jira NVGPU-1309

Change-Id: I3c7e0032899516487037f47ddc9a7e7aa4b0b33a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1978058
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-04 11:15:34 -08:00
Seema Khowala
13aed4da44 gpu: nvgpu: remove log_fn prints in _gk20a_channel_from_id
Remove nvgpu_log_fn for _gk20a_channel_from_id as enabing log_fn
prints during debugging become very noisy due to these prints.

Change-Id: I52ef193d13af87924dbde59a55c892e98e95bc85
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1982263
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-01-02 08:35:28 -08:00
Philip Elcan
90024cb73a gpu: nvgpu: misc MISRA 14.4 fixes
This fixes a few lingering MISRA Rule 14.4 violations.  Rule 14.4
requires that the condition of an if statement be a boolean.

JIRA NVGPU-1022

Change-Id: Ib6293e00e0436fceee9f7bf0ada1b6ac01a82faa
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1975424
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-19 11:24:42 -08:00
Debarshi Dutta
fb114f8fda gpu: nvgpu: move gk20a_fifo_recover_ch to channel unit
gk20a_fifo_recover_ch does high-level calls and invokes
gk20a_fifo_recover. This function belongs to the channel unit and is
moved to the file channel.c. Also, the function is renamed to
nvgpu_channel_recover.

Jira NVGPU-1237

Change-Id: I31890f85fdb2c42648cc063dd9c4e7e35930dcef
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1970033
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-14 21:54:58 -08:00
Debarshi Dutta
57f03e3a20 gpu: nvgpu: move channel functions to common
Any channel specific functions having high-level software-centric
operations belong to the channel unit and not the FIFO unit.
Move the below public functions as well as their dependent
static functions to common/fifo/channel.c. Also, rename the functions
to use the prefix nvgpu_channel_*.

gk20a_fifo_set_ctx_mmu_error_ch
gk20a_fifo_error_ch
gk20a_fifo_check_ch_ctxsw_timeout

Jira NVGPU-1237

Change-Id: Id6b6d69bbed193befbfc4c30ecda1b600d846199
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1932358
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-14 21:54:17 -08:00
Thomas Fleury
7e68e5c83d gpu: nvgpu: userd slab allocator
We had to force allocation of physically contiguous memory for
USERD in nvlink case, as a channel's USERD address is computed as
an offset from fifo->userd address, and nvlink bypasses SMMU.

With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.

PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous, to setup the BAR1 inst block.

- Add slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to related nvgpu_mem descriptor
- ch->userd_offset is the offset from the beginning of the slab

- Pre-allocate GPU VAs for the whole BAR1
- Add g->ops.mm.bar1_map() method
  - gk20a_mm_bar1_map() uses fixed mapping in BAR1 region
  - vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1
  - TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.

Bug 2422486
Bug 200474793

Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-11 16:24:10 -08:00
Sai Nikhil
303fc7496c gpu: nvgpu: common: fix MISRA Rule 10.4 Violations
MISRA Rule 10.4 only allows the usage of arithmetic operations on
operands of the same essential type category.

Adding "U" at the end of the integer literals or casting operands
to have same type of operands when an arithmetic operation is
performed.

This fixes violations where an arithmetic operation is performed on
signed and unsigned int types.

JIRA NVGPU-992

Change-Id: I27e3e59c3559c377b4bd3cbcfced90fdf90350f2
Signed-off-by: Sai Nikhil <snikhil@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1921459
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-11 10:26:16 -08:00
Debarshi Dutta
9abe9fe062 gpu: nvgpu: replace input param chid with pointer to channel
preempt_channel needs to use the channel to pass it to other
public functions, get access to a tsg etc. This qualifies it to take a
pointer to a channel as an input parameter instead of a chid.

Increment the channel ref counter using the function
gk20a_channel_from_id in functions where we get the chid from the h/w
registers directly. Once the prempt_channel function call is done,
use a gk20a_channel_put on the referenced channel.

Jira NVGPU-1461

Change-Id: I6c87c8104cfcb418d468c8c590087fd4aeabf4bd
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1963200
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-12-05 21:55:10 -08:00
Debarshi Dutta
e5bebd880f gpu: nvgpu: replace tsgid input variable with pointer to a struct tsg_gk20a
replace tsgid with a pointer to a struct tsg_gk20a in the function
gk20a_fifo_tsg_abort(). gk20a_fifo_tsg_abort needs to enumerate through
all the channels within the tsg as well as pass the tsg pointer to
other functions, qualifying the need to use a pointer instead as an
input parameter.

Jira NVGPU-1461

Change-Id: I59cec05d5d778f733d0c3e9ffadf46e74e249080
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1956567
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-30 08:14:48 -08:00
Konsta Holtta
7df3d58750 gpu: nvgpu: add safe channel id lookup
Add gk20a_channel_from_id() to retrieve a channel, given a raw channel
ID, with a reference taken (or NULL if the channel was dead). This makes
it harder to mistakenly use a channel that's dead and thus uncovers bugs
sooner. Convert code to use the new lookup when applicable; work remains
to convert complex uses where a ref should have been taken but hasn't.

The channel ID is also validated against FIFO_INVAL_CHANNEL_ID; NULL is
returned for such IDs. This is often useful and does not hurt when
unnecessary.

However, this does not prevent the case where a channel would be closed
and reopened again when someone would hold a stale channel number. In
all such conditions the caller should hold a reference already.

The only conditions where a channel can be safely looked up by an id and
used without taking a ref are when initializing or deinitializing the
list of channels.

Jira NVGPU-1460

Change-Id: I0a30968d17c1e0784d315a676bbe69c03a73481c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1955400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-27 12:24:38 -08:00
Seema Khowala
def687d4df gpu: nvgpu: check ch_timedout for poll/restart
poll_timeouts and timeout_restart_all_channels should
only handle channels that have not been recovered/aborted.
Check ch_timedout status of the channel to make sure
channel is still alive to be used. A channel reference
could still be available even if it is recovered but not
closed.

Bug 2404865

Change-Id: I016c8b9952ef1d4c349c2a2a2ca55cb81326d380
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929339
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-15 15:36:15 -08:00
Seema Khowala
88cff206ae gpu: nvgpu: do not suspend/resume recovered channel
Already torn down channels should not be suspended or
resumed. A channel reference could still be available
even if it is recovered but not closed. Use ch_timedout
status to check if channel is already recovered/aborted.

Bug 2404865

Change-Id: I718eab6032ee94a9322da7a239a978b388de2b01
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929338
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-15 15:36:06 -08:00
Seema Khowala
1f54ea09e3 gpu: nvgpu: rename has_timedout and make it thread safe
Currently has_timedout variable is protected by wmb at places
where it is being set and there is no correspoding rmb whenever
has_timedout variable is read. This is prone to errors for
concurrent execution. This change is supposed to fix this issue.
Rename has_timedout variable of channel struct to ch_timedout.
Also to avoid rmb every time ch_timedout is read,
ch_timedout_spinlock is added to protect ch_timedout
variable for taking care of concurrent execution.

Bug 2404865
Bug 2092051

Change-Id: I0bee9f50af0a48720aa8b54cbc3af97ef9f6df00
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1930935
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-15 15:35:57 -08:00
smadhavan
503b897b45 gpu: nvgpu: Fix MISRA rule 8.3 violations
MISRA rule 8.3 requires that all declarations of a function
shall use the same parameter names and type qualifiers. There
are cases where the parameter names do not match between
function prototype and declaration. This patch will fix some of
these violations by renaming the prototype parameter.

JIRA NVGPU-847

Change-Id: I980ca7ba8adc853de9c1b6f6c7e7b3e4ac12f88e
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1926980
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-15 15:35:47 -08:00
Srirangan Madhavan
63d1b7113a gpu: nvgpu: Fix MISRA 12.2 misc bit shift errors
MISRA rule 12.2 states that the right hand operand of a shift
operator shall lie in the range zero to one less than the width
in bits of the essential type of the left hand operand. This
patch will fix these violations by casting them to an appropriate
type or using the relevant BITxx() macros.

JIRA NVGPU-666

Change-Id: I57b6081e9bd98c45ca9f7aa5f35e1d2d66ed0134
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1945655
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-14 09:14:37 -08:00
Amurthyreddy
23f35e1b2f gpu: nvgpu: MISRA 14.4 bitwise operation as boolean
MISRA rule 14.4 doesn't allow the usage of integer types as booleans
in the controlling expression of an if statement or an iteration
statement.

Fix violations where the result of a bitwise operation is used as a
boolean in the controlling expression of if and loop statements.

JIRA NVGPU-1020

Change-Id: I6a756ee1bbb45d43f424d2251eebbc26278db417
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1936334
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-13 09:45:25 -08:00
Amurthyreddy
1023c6af14 gpu: nvgpu: MISRA 14.4 boolean fixes
MISRA rule 14.4 doesn't allow the usage of non-boolean variable as
boolean in the controlling expression of an if statement or an
iteration statement.

Fix violations where a non-boolean variable is used as a boolean in the
controlling expression of if and loop statements.

JIRA NVGPU-1022

Change-Id: I61a2d24830428ffc2655bd9c45bb5403c7f22c09
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1943058
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-07 10:35:22 -08:00
Amurthyreddy
710aab6ba4 gpu: nvgpu: MISRA 14.4 boolean fixes
MISRA rule 14.4 doesn't allow the usage of non-boolean variable as
boolean in the controlling expression of an if statement or an
iteration statement.

Fix violations where a non-boolean variable is used as a boolean in the
controlling expression of if and loop statements.

JIRA NVGPU-1022

Change-Id: I957f8ca1fa0eb00928c476960da1e6e420781c09
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1941002
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-07 10:35:13 -08:00
Nicolas Benech
cb2a05dd92 gpu: nvgpu: Fix LibC MISRA 17.7 in common
MISRA Rule-17.7 requires the return value of all functions to be used.
Fix is either to use the return value or change the function to return
void. This patch contains fix for all 17.7 violations instandard C functions
in common code.

JIRA NVGPU-1036

Change-Id: Id6dea92df371e71b22b54cd7a521fc22812f9b69
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929899
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-11-01 17:15:37 -07:00
Alex Waterman
c64f9432b1 gpu: nvgpu: Fix comment in priv_cmd_buf allocation
Update the comment to fix obvious issues and describe the
new allocation logic.

Bug 2327792

Change-Id: Ica0dd4159467e3023cc487a2bf9f525db3ad76e6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1831096
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-29 17:06:45 -07:00
Alex Waterman
b9ec592f1d gpu: nvgpu: Make priv_cmd_buf honor num_in_flight jobs
If num_in_flight jobs is set use that to determine the proper
size of the priv_cmd_buf. If num_in_flight is not set then use
the original logic: the priv_cmd_buf is sized based on a worst
case assumption for the GPFIFO.

Also clean up MISRA issues.

Bug 2327792

Change-Id: Ie192caeb6cc48fdcac57e5cbb71c534aeaf46011
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1831095
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-29 17:06:41 -07:00
Alex Waterman
05ec7b80eb gpu: nvgpu: Use deterministic flag to decide pre-alloc
Instead of using num_inflight_jobs to determine whether to pre-alloc
resources for a channel use the c->deterministic flag and the
number of inflight jobs field. Non-determinsitic channels do not
require pre-alloced resources and deterministic channels with 0
in flight jobs (i.e no kernel job tracking, AKA fast path sumits)
also do not require pre-alloced resources.

Bug 2327792

Change-Id: I7e8eb0478c22e005ca2c46c555415afa0ded0be1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1850123
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-29 17:06:37 -07:00
Konsta Holtta
99b1c6dcdf gpu: nvgpu: support usermode submit buffers
Import userd and gpfifo buffers from userspace if provided via
NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX. Also supply the work submit token
(i.e., the hw channel id) to userspace.

To keep the buffers alive, store their dmabuf and attachment/sgt handles
in nvgpu_channel_linux. Our nvgpu_mem doesn't provide such data for
buffers that are mainly in kernel use. The buffers are freed via a new
API in the os_channel interface.

Fix a bug in gk20a_channel_free_usermode_buffers: also unmap the
usermode gpfifo buffer.

Bug 200145225

Change-Id: I8416af7085c91b044ac8ccd9faa38e2a6d0c3946
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1795821
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-29 08:04:43 -07:00
Debarshi Dutta
6fe9bb835b gpu: nvgpu: access channel_sync via public API
struct nvgpu_channel_sync is moved to a private header i.e.
channel_sync_priv.h present in common/sync/. All accesses to callback
functions inside the struct nvgpu_channel_sync in NVGPU driver is replaced by
the public channel_sync specific APIs.

Jira NVGPU-1093

Change-Id: I52d57b3d458993203a3ac6b160fb569effbe5a66
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929783
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-26 02:12:23 -07:00
Adeel Raza
dc37ca4559 gpu: nvgpu: MISRA fixes for composite expressions
MISRA rules 10.6, 10.7, and 10.8 prevent mixing of types in composite
expressions. Resolve these violations by casting variables/constants to
the appropriate types.

Jira NVGPU-850
Jira NVGPU-853
Jira NVGPU-851

Change-Id: If6db312187211bc428cf465929082118565dacf4
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1931156
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-25 11:13:38 -07:00
Amurthyreddy
f8ce19f879 gpu: nvgpu: MISRA 14.4 Function pointer as boolean
MISRA rule-14.4 doesn't allow the usage of function pointers & integer
types as booleans in the controlling expression of an if statement or
an iteration statement.

Fix violations where a function pointer or a function whose return
value is an integer, is used as a boolean in the controlling expression
of if and loop statements.

JIRA NVGPU-1021

Change-Id: Ic5336268394ba4396ce80744c25930d2fb44dc42
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1932147
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-10-24 17:01:39 -07:00