Commit Graph

380 Commits

Author SHA1 Message Date
Thomas Fleury
1701a267bc gpu: nvgpu: move setup ramfc code to common
Create ramfc under common/fifo

Created the following HAL:
- ramfc.setup
- ramfc.commit_userd

Moved setup code to ramfc HAL:
- vgpu_channel_setup_ramfc
- gk20a_fifo_setup_ramfc
- channel_gp10b_setup_ramfc
- channel_gv11b_setup_ramfc
- channel_tu104_setup_ramfc

Renamed as:
- <chip>_ramfc_setup

Moved commit userd code to ramfc HAL:
- gk20a_fifo_commit_userd
- channel_gp10b_commit_userd

Renamed as:
- <chip>_ramfc_commit_userd

Jira NVGPU-1750

Change-Id: Ieb1bd2866fd77601edd218f879ababf4f90db54a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2069947
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-27 20:35:04 -07:00
Vinod G
4777c81f82 gpu: nvgpu: move gk20a_gr_flush_channel_tlb to common.gr.init
Move gk20a_gr_flush_channel_tlb function to common.gr.init as
nvgpu_gr_flush_channel_tlb function.

JIRA NVGPU-1885

Change-Id: I4979266d826b0d188b09bbad156103bb11005c84
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2081368
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-26 21:15:03 -07:00
Seema Khowala
434931799a gpu: nvgpu: remove channel.check_ctxsw_timeout
nvgpu_channel_check_ctxsw_timeout is removed as ctxsw
timeout is not checked for channel that is not bound to
tsg.

JIRA NVGPU-1312

Change-Id: I8d12251e478a959d150b736206396c338575b2ec
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2079513
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 22:49:00 -07:00
Seema Khowala
dfafddcc21 gpu: nvgpu: move common and chip specific ctxsw timeout
Delete apply_ctxsw_timeout_intr ops and add
ctxsw_timeout_enable ops

Move chip specific sched_error and ctxsw_timeout
functions to hal/fifo/fifo_intr_* and hal/fifo/ctxsw_timeout_*

Add nvgpu_rc_ctxsw_timeout function under common/rc/rc.c

Do not check ctxsw timeout for channels that are no more
bound to tsg.

JIRA NVGPU-1312

Change-Id: Ide977fb60b3b72a27d9f22873f7a416c3bd1181d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2075734
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 22:47:45 -07:00
Seema Khowala
fe2a599700 gpu: nvgpu: rename fifo_eng_timeout_us
Rename fifo_eng_timeout_us to ctxsw_timeout_period_ms for
clarity.

JIRA NVGPU-1312

Change-Id: I23faff3df7160c1193f797ac03769ef2ecf4449e
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2076776
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 22:47:09 -07:00
Seema Khowala
9393e2a90a gpu: nvgpu: rename timeout of channel struct to wdt
Rename channel_gk20a_timeout to nvgpu_channel_wdt.
Rename timeout variable of channel_gk20a struct to wdt.
Rename ch_wdt_timeout_ms to ch_wdt_init_limit_ms.

Rename gk20a_channel_timeout_* to nvgpu_channel_wdt_*

JIRA NVGPU-1312

Change-Id: Ida78426cc007b53f3d407cf85428d15f7fe7518a
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2077641
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 22:46:52 -07:00
Seema Khowala
737de7eac5 gpu: nvgpu: rename timeout_* of channel struct
timeout_ms_max is renamed as ctxsw_timeout_max_ms
timeout_debug_dump is renamed as ctxsw_timeout_debug_dump
timeout_accumulated_ms is renamed as ctxsw_timeout_accumulated_ms
timeout_gpfifo_get is renamed as ctxsw_timeout_gpfifo_get

gk20a_channel_update_and_check_timeout is renamed as
nvgpu_channel_update_and_check_ctxsw_timeout

JIRA NVGPU-1312

Change-Id: Ib5c8829c76df95817e9809e451e8c9671faba726
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2076847
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 22:46:36 -07:00
Thomas Fleury
9a0b8c0234 gpu: nvgpu: add non-safe compile flag NVGPU_USERD
common.fifo.userd unit has both safe as well as non-safe functions.
The build flag NVGPU_USERD is used to restrict the use of
non-safe functions of the userd unit in safety builds.

Jira NVGPU-2713

Change-Id: Idf3b244b24816789892ea802c2dcb42ca92649e1
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2075928
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 16:24:45 -07:00
Nitin Kumbhar
a2314ee780 gpu: nvgpu: move no_of_sm to common.gr.config
1. Move no_of_sm from gr to common.gr.config
2. Add nvgpu_gr_config_get_no_of_sm() API in gr.config
to fetch no_of_sm.

JIRA NVGPU-1884

Change-Id: I3c6c20a12cd7f9939a349a409932195f17392943
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2073583
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 11:55:20 -07:00
Nitin Kumbhar
30eea4ff2b gpu: nvgpu: create common.gr.zcull
1. Separate out zcull unit from gr
2. Move zcull HALs from gr to common.hal.gr.zcull
3. Move common zcull functions to common.gr.zcull

JIRA NVGPU-1883

Change-Id: Icfc297cf3511f957aead01044afc6fd025a04ebb
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2076547
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-25 01:55:14 -07:00
Thomas Fleury
7abb0324b4 gpu: nvgpu: move engine reset to common
Move gk20a_fifo_reset_engine to common engine code.

Added the following hals:
- gr.halt_pipe
- gr.reset

Jira NVGPU-1306
Jira NVGPU-1315

Change-Id: I97b195720bcf8bf9dbc6882f0830e33942441b7b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2035216
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-22 12:47:03 -07:00
Thomas Fleury
696d212718 gpu: nvgpu: move userd to separate unit
Add userd unit under common/fifo

Moved userd setup/cleanup from fifo:
- nvgpu_userd_setup_sw
- nvgpu_userd_cleanup_sw

Moved common userd code from hals:
- nvgpu_userd_init_slabs
- nvgpu_userd_free_slabs
- nvgpu_userd_init_channel

Replaced the following hals
- fifo.userd_gp_get
- fifo.userd_gp_put
- fifo.userd_pb_get
- fifo.setup_userd
- fifo.userd_entry_size

With
- userd.gp_get
- userd.gp_put
- userd.pb_get
- userd.init_mem
- userd.entry_size

Also added the following hals
- userd.setup_sw: init slabs and reserve userd gpu_va
- userd.cleanup_sw: de-init slabs and free gpu_va
- userd.setup_hw: setup writeback timeout

Jira NVGPU-2713

Change-Id: Ide854a38531a3ce00e61045449ddd010c956bdeb
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2035116
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-22 06:25:55 -07:00
Seema Khowala
27e3546175 gpu: nvgpu: add new tsg functions for ctxsw timeout re-org
Add nvgpu_tsg_set_error_notifier function for setting error_notifier
for all channels of a tsg.

Add nvgpu_tsg_timeout_debug_dump_state function for finding if
timeout_debug_dump is set for any of the channels of a tsg.

Add nvgpu_tsg_set_timeout_accumulated_ms to set
timeout_accumulated_ms for all the channels of a tsg.

JIRA NVGPU-1312

Change-Id: Ib2daf2d462c2cf767f5a6e6fd3436abf6860091d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2077626
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-22 05:20:01 -07:00
Debarshi Dutta
61fd020137 gpu: nvgpu: move engine_status and pbdma_status HAL units to hal/fifo
engine_status and pbdma_status HAL specific units are moved from
common/fifo to hal/fifo.

Jira NVGPU-1315
Jira NVGPU-2950

Change-Id: Ic4eab71bcc53c04f6b855d5eebe919bce57a3494
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2071271
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-21 04:36:48 -07:00
Thomas Fleury
30591ce1a7 gpu: nvgpu: move runlist_submit_lock to runlist
runlist_submit_lock is used by runlist unit.
Move initialization to nvgpu_runlist_setup_sw.

Jira NVGPU-1306

Change-Id: I8476a12b51cd0831de700c793559d46cd43b784e
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2076707
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-20 16:26:24 -07:00
Thomas Fleury
6729d67f59 gpu: nvgpu: vpgu: use fifo common init/deinit code
Native and vgpu were using different paths for fifo
init/deinit code.

Use same nvgpu_fifo_init_support for init code:

nvgpu_fifo_init_support
  g->ops.fifo.setup_sw
    vgpu_fifo_setup_sw (NEW)

Use same nvgpu_fifo_remove_support for deinit code:

nvgpu_fifo_remove_support (NEW)
  g->ops.fifo.cleanup_sw (NEW)
    vgpu_fifo_cleanup_sw (NEW)

Also implemented gk20a_fifo_cleanup_sw for native case.

Jira NVGPU-1306
Jira NVGPU-2855

Change-Id: Iefe303cc224f804a206422e2efffda9da1616d89
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2029649
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-18 16:56:04 -07:00
Deepak Nibade
7fa2189fb3 gpu: nvgpu: move fecs_trace operations under gr
Move g->ops.fecs_trace.*() HAL operations under gr operations as
g->ops.gr.fecs_trace.*()

Also rename gk20a_ctxsw_*() functions used in common code to the
format nvgpu_gr_fecs_trace_*()

Jira NVGPU-1880

Change-Id: Idf2f8fb3d7ba2832bf1837fd97b70b3cee412123
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2070767
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-16 05:05:41 -07:00
Deepak Nibade
1208ad7cef gpu: nvgpu: rearrange linux specific fecs trace support
We have 3 header files for FECS tracing support
include/nvgpu/gr/fecs_trace.h : common header
include/nvgpu/ctxsw_trace.h : header that includes both common and
                              os-specific functions
os/linux/ctxsw_trace.h : linux specific header

Remove the second header since it is not needed.

Move all structures that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all function declarations that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all linux specific declarations in os/linux/ctxsw_trace.h and
rename this file as os/linux/fecs_trace_linux.h

Also rename os/linux/ctxsw_trace.c to os/linux/fecs_trace_linux.c

Jira NVGPU-1880

Change-Id: I05cc4489c4b6a64880b7d59c02b22cd2244d5e22
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2070766
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-16 05:05:32 -07:00
Thomas Fleury
ffed5095db gpu: nvgpu: move fifo init/deinit code to common
Add fifo sub-unit to common.fifo to handle init/deinit code
and global support functions.

Split init into:
- nvgpu_channel_setup_sw
- nvgpu_tsg_setup_sw
- nvgpu_fifo_setup_sw
- nvgpu_runlist_setup_sw
- nvgpu_engine_setup_sw
- nvgpu_userd_setup_sw
- nvgpu_pbdma_setup_sw

Split de-init into
- nvgpu_channel_cleanup_sw
- nvgpu_tsg_cleanup_sw
- nvgpu_fifo_cleanup_sw
- nvgpu_runlist_cleanup_sw
- nvgpu_engine_cleanup_sw
- nvgpu_userd_cleanup_sw
- nvgpu_pbdma_cleanup_sw

Added the following HALs
- runlist.length_max
- fifo.init_pbdma_info
- fifo.userd_entry_size

Last 2 HALs should be moved resp. to pbdma and userd sub-units,
when available.

Added vgpu implementation of above hals
- vgpu_runlist_length_max
- vgpu_userd_entry_size
- vgpu_channel_count

Use hals in vgpu_fifo_setup_sw.

Jira NVGPU-1306

Change-Id: I954f56be724eee280d7b5f171b1790d33c810470
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2029620
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-14 20:35:22 -07:00
Seema Khowala
cb91bf1e13 gpu: nvgpu: protect recovery with engines_reset_mutex
Rename gr_reset_mutex to engines_reset_mutex and acquire it
before initiating recovery. Recovery running in parallel with
engine reset is not recommended.

On hitting engine reset, h/w drops the ctxsw_status to INVALID in
fifo_engine_status register. Also while the engine is held in reset
h/w passes busy/idle straight through. fifo_engine_status registers
are correct in that there is no context switch outstanding
as the CTXSW is aborted when reset is asserted.

Use deferred_reset_mutex to protect deferred_reset_pending variable
If deferred_reset_pending is true then acquire engines_reset_mutex
and call gk20a_fifo_deferred_reset.
gk20a_fifo_deferred_reset would also check the value of
deferred_reset_pending before initiating reset process

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I47de669a6203e0b2e9a8237ec4e4747339b9837c
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2022373
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-13 06:34:31 -07:00
Seema Khowala
672e6bc31e gpu: nvgpu: disable elpg before ctxsw_disable
if fecs is sent stop_ctxsw method, elpg entry/exit cannot happen
and may timeout. It could manifest as different error signatures
depending on when stop_ctxsw fecs method gets sent with respect
to pmu elpg sequence. It could come as pmu halt or abort or
maybe ext error too.

If ctxsw failed to disable, do not read engine info and just abort tsg.

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I5f3ba07663bcafd3f0083d44c603420b0ccf6945
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2014914
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-13 06:34:02 -07:00
Debarshi Dutta
dc0e037d8c gpu: nvgpu: move engine_status_dump functions to common.fifo.hal.engine_status
The functions gk20a_dump_eng_status and gv11b_dump_eng_status belongs
to engine_status HAL unit.

1) The corresponding declaration and definitions of the above functions
are moved from fifo_{arch} files to engine_status_{arch} files.

2) The corresponding HAL pointer .dump_eng_status is moved from
fifo to engine_status HAL unit.

3) gv11b_dump_eng_status is now based to gv100b_dump_eng_status

4) Small changes in the files for ENGINE_STATUS such as correction of
HEADER DEFINES etc

Jira NVGPU-1315

Change-Id: I7fc06eab97206bc3b78c6f5c7aa30fa2c034961c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033632
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-12 13:36:29 -07:00
Debarshi Dutta
8fae143b57 gpu: nvgpu: remove HAL pointer for gk20a_fifo_wait_engine_idle
The corresponding HAL pointer for gk20a_fifo_wait_engine_idle is not
being invoked anywhere and hence they are removed from the code.

The function gk20a_fifo_wait_engine_idle belongs to engine unit and is
only called in a non-safe build, hence its moved to engine unit and is
restricted by a non-safe build flag NVGPU_ENGINE
Also, gk20a_fifo_wait_engine_idle is renamed to nvgpu_engine_wait_for_idle

Jira NVGPU-1315

Change-Id: Ie550c7e46a4284dfe368859d828b1994df34185f
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033631
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-12 13:36:14 -07:00
Debarshi Dutta
adc27cc9b4 gpu: nvgpu: move engine_activity functions to common.fifo.engine unit.
The following functions belong to engine unit and are moved
gk20a_fifo_enable_engine_activity
gk20a_fifo_enable_all_engine_activity
gk20a_fifo_disable_engine_activity
gk20a_fifo_disable_all_engine_activity

These are renamed by replacing gk20a_fifo with nvgpu_engine as prefix.
These functions are only invoked by linux build and not required for
safety build and hence they are defined when
-DNVGPU_ENGINE is enabled.

Jira NVGPU-1315

Change-Id: I39d820879bb55b40e754526c657d794930a4b6a1
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2032606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-12 13:36:00 -07:00
Deepak Nibade
3391aa9d84 gpu: nvgpu: move fecs_trace bind/unbind calls to gr/fecs_trace unit
Move below calls to gr/fecs_trace unit
gk20a_fecs_trace_bind_channel()
gk20a_fecs_trace_unbind_channel()

And rename them to
nvgpu_gr_fecs_trace_bind_channel()
nvgpu_gr_fecs_trace_unbind_channel()

We are not accessing any fifo/ch/tsg construct in gr/fecs_trace unit
hence update parameter list of above APIs to receive inst_block,
gr_ctx, subctx pointers directly instead of receiving channel_gk20a

Delete gk20a/fecs_trace_gk20a.* files since they are no longer
required. All the contents in those files are now moved to gr/fecs_trace
unit

Jira NVGPU-1880

Change-Id: I7ef9f0b66781b45155035237172ae400f02740e4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2032707
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-08 07:07:27 -08:00
Seema Khowala
5222d0ff4f gpu: nvgpu: do not do timeout_debug_dump for non fifo_error_idle_timeout
Any recovery that goes through gk20a_fifo_recover path e.g. gr error,
mmu fault or any recovery that involves engine recovery as well, will
still dump the full debug dump. This change will just avoid dumping debug
dump for force reset channels and pbdma intr if they do not involve
engine recovery. For FIFO_ERROR_IDLE_TIMEOUT error notifiers that
involves tsg recovery only, debug_dump will happen only if
timeout_debug_dump is set. timeout_debug_dump by default is set to true
but can be changed using NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX.

Bug 2092051

Change-Id: Ibbf3cd2c44c586d9deb9e61ffbf37945b8d9e428
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-07 15:14:24 -08:00
Thomas Fleury
b64ee64fa7 gpu: nvgpu: do not free individual runlists
gk20a_fifo_delete_runlist was invoking nvgpu_kfree for each
runlist, but active runlists are now stored in the
g->active_runlist_info array.

Remove nvgpu_kfree for individual runlists.
Also clear active_runlist_info and num_runlists fields in
gk20a_fifo_delete_runlist.

Bug 2470115
Bug 2522374

Change-Id: Ie678c16af31ed8345ca0f015c17d61a3965b924d
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030970
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-07 11:46:25 -08:00
Thomas Fleury
70453a8606 Revert "Revert "gpu: nvgpu: allocate only active runlists""
This reverts commit f67bc51e51.

Currently a fifo_runlist_info_gk20a structure is allocated and
initialized for each possible runlist. But only a few runlists
are actually used.

Skip allocation and initialization of inactive runlists. Active
runlists info is stored in the active_runlist_info array.If a
runlist is active, then runlist_info[runlist_id] points to one
entry in active_runlist_info. Otherwise, runlist_info[runlist_id]
is NULL.

Operations that used to walk through all runlists are modified
to walk though active runlists only.

Bug 2470115
Bug 2522374

Change-Id: I98253ebebb4b1ba5957b57329820b94444b9d41b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030409
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-07 11:46:15 -08:00
Thomas Fleury
c23738969d Revert "Revert "gpu: nvgpu: array of pointers to runlists""
This reverts commit ade1d50cbe.

Currently a fifo_runlist_info_gk20a structure is allocated and
initialized for each possible runlist. But only a few runlists
are actually used.

Use an array of pointers to runlists in fifo_gk20a. The array
keeps existing indexing by runlist_id. In this patch a context
is still allocated for each possible runlist, but follow up
patch will allow to skip context allocation for inactive
runlists.

Bug 2470115
Bug 2522374

Change-Id: I0deb6981bc6f5152bdf121f0a44429748aa14687
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030407
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-07 11:45:59 -08:00
Debarshi Dutta
675a2b6858 gpu: nvgpu: added non-functional changes to engines unit
The following changes are made in this patch.

1) nvgpu driver is incorrectly using u32 to store enum values in some
functions. Replaced them with correct type enum nvgpu_fifo_engine

2) change parameter type in nvgpu_engine_get_ids from engine_id[]
to *engine_ids

3) rename some function names to remove redundant characters to make
the name shorter.

4) Removed the initialization of enum nvgpu_fifo_engine in functions
where we assign a value before direct access.

Jira NVGPU-1315

Change-Id: Ic65b40c9cb1e90ad278cb36a00e1c9de51724f27
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2020230
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-06 04:45:20 -08:00
Nicolas Benech
ee6ef2a719 gpu: nvgpu: resolve MISRA 17.7 for WARN_ON
MISRA Rule-17.7 requires the return value of all functions to be used.
Fix is either to use the return value or change the function to return
void. This patch ensures that WARN and WARN_ON always return void; and
introduces a new nvgpu_do_assert construct to trigger the equivalent
of WARN_ON(true) so that stack can be dumped (depends on OS support)

JIRA NVGPU-677

Change-Id: Ie2312c5588ceb5b1db825d15a096149b63b69af4
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2018706
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-05 11:14:46 -08:00
Mahantesh Kumbar
4bb9b0b987 gpu: nvgpu: use support_ls_pmu flag to check LS PMU support
Currently PMU support enable check is done with multiple
methods which added complexity to know status of PMU
support.

Changed to replace multiple methods with support_pmu
flag to know the PMU support, support_pmu will be updated
at init stage based on platform/chip specific settings
to know the PMU support status.

Cleaned up support_pmu flag check with platform specific
PMU members in multiple places & moved check to
public functions

JIRA NVGPU-173

Change-Id: Ief2c64250d1f78e3b054203be56499e4d1d9b046
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2024024
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-04 03:33:16 -08:00
Thomas Fleury
ade1d50cbe Revert "gpu: nvgpu: array of pointers to runlists"
This reverts commit 5fdda1b075.

Bug 2522374

Change-Id: Icb5e2181b056dc2247291c7f0e47d46c29095286
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030293
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Hoang Pham <hopham@nvidia.com>
2019-02-28 17:51:37 -08:00
Thomas Fleury
f67bc51e51 Revert "gpu: nvgpu: allocate only active runlists"
This reverts commit 45fa0441f7.

Bug 2522374

Change-Id: Icb80b7a31c7588a269850a3768ab0238dbec67b1
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030292
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Hoang Pham <hopham@nvidia.com>
2019-02-28 17:51:22 -08:00
Thomas Fleury
45fa0441f7 gpu: nvgpu: allocate only active runlists
Currently a fifo_runlist_info_gk20a structure is allocated and
initialized for each possible runlist. But only a few runlists
are actually used.

Skip allocation and initialization of inactive runlists.
Active runlists info is stored in the active_runlist_info array.
If a runlist is active, then runlist_info[runlist_id] points to
one entry in active_runlist_info. Otherwise, runlist_info[runlist_id]
is NULL.

Operations that used to walk through all runlists are modified to
walk though active runlists only.

Bug 2470115

Change-Id: Icd10281dc904bdee581ebc9cfeb662018ecca121
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2025385
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-27 17:54:54 -08:00
Thomas Fleury
5fdda1b075 gpu: nvgpu: array of pointers to runlists
Currently a fifo_runlist_info_gk20a structure is allocated and
initialized for each possible runlist. But only a few runlists
are actually used.

Use an array of pointers to runlists in fifo_gk20a. The array
keeps existing indexing by runlist_id. In this patch a context
is still allocated for each possible runlist, but follow up
patch will allow to skip context allocation for inactive
runlists.

Bug 2470115

Change-Id: I1615043cea84db35a270ade64695d51f85c1193a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2025203
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-27 17:54:37 -08:00
Debarshi Dutta
8db1955d74 gpu: nvgpu: split semaphore.c file into multiple units
The file semaphore.c is now split into 4 units namely
semaphore, semaphore_hw, semaphore_pool and semaphore_sea.

Each of the above units now have separate compilation units under
common/semaphore/. The public APIs corresponding to each unit is
present in include/nvgpu/semaphore.h. The dependency graph of the
below units is as follows where '->' indicates left depends on right.

semaphore -> semaphore_hw -> semaphore_pool -> semaphore_sea

Some of the other major changes made in this patch are as follows
  i) Renamed some of the functions.
  ii) Some functions are changed from private to public.
  iii) Public header for semaphore contains only the declaration of the
       corresponding structs as an opaque structure.
  iv) Constructed a private header to contain internal functions common
      to all the units and struct definitions corresponding to each unit.
  v)  Added new functions to provide access to internal members of the
      units.

Jira NVGPU-2076

Change-Id: I6f111647ba9a9a9f8ef9c658f316cd5d6276c703
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2022782
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-27 12:54:15 -08:00
Seema Khowala
2c0933de05 gpu: nvgpu: rename ch_timedout to unserviceable
ch_timedout is not a good variable name for broken and
unusable state of the channel. Rename ch_timedout to
unserviceable

Bug 2092051
Bug 2429295

Change-Id: I633eaff61928d5ef9836dcdc162b07e7a5e03881
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1996865
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-22 20:21:37 -08:00
Thomas Fleury
8610ae5fdc gpu: nvgpu: skip buffer allocation for unused runlists
Currently, 2x 1MB buffers are allocated per possible
runlist (which totals 26MB in GV100 case). But only
a few runlists are actually used.

Skip runlist buffer allocation for unused runlists.

Bug 2470115

Change-Id: Ifc9a36c38d302ca758d1fe99d293a1bfbde85ac7
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2024279
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-22 15:24:11 -08:00
Philip Elcan
c02bccd6db gpu: nvgpu: cond: use u32 for COND_WAIT timeout
The type for the timeout parameter to the NVGPU_COND_WAIT and
NVGPU_COND_WAIT_INTERRUPTIBLE macros was too weak. This updates these
macros to require a u32 for the timeout.

Users of the macros are updated to be compliant as necessary.

This addresses MISRA 10.3 violations for implicit conversions of types
of different size or essential type.

JIRA NVGPU-1008

Change-Id: I12368dfa81b137c35bd056668c1867f03a73b7aa
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017503
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-21 10:24:24 -08:00
Seema Khowala
13f37f9c70 gpu: nvgpu: remove gk20a_is_channel_marked_as_tsg
Use tsg_gk20a_from_ch to get tsg pointer for tsgid of a channel. For
invalid tsgid, tsg pointer will be NULL

Bug 2092051
Bug 2429295
Bug 2484211

Change-Id: I82cd6a2dc5fab4acb147202af667ca97a2842a73
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2006722
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-21 10:23:50 -08:00
Debarshi Dutta
9767366c60 gpu: nvgpu: add pbdma_status unit
A new unit pbdma_status is added. The unit provides a HAL
ops function pointer read_pbdma_status_info() to read and produce
a struct of type nvgpu_pbdma_status_info. Additionally, the unit
provides public APIs to retrieve data from the struct
nvgpu_pbdma_status_info.

Jira NVGPU-1311

Change-Id: Ic89c78703c3738b91be8d18ba970a591658d4022
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2019976
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-19 04:17:00 -08:00
Debarshi Dutta
061aa66adc gpu: nvgpu: move engine specific functions to common/fifo
The following changes are done in this patch.

1) gk20a_fifo_get_engine_info() is moved to common/fifo/engine.c
and is renamed to gk20a_fifo_get_active_engine_info() to reflect
accurately the purpose of the function.

2) move the definition of enum fifo_engine to <nvgpu/engines.h> and
add the prefix NVGPU_

3) move the following functions related to engines in fifo_gk20a.c to
common/fifo/engines.c and replace their signature by adding the prefix
nvgpu_engine and removing gk20a_fifo.

gk20a_fifo_get_active_engine_info
gk20a_fifo_engine_enum_from_type
gk20a_fifo_get_engine_ids
gk20a_fifo_is_valid_engine_id
gk20a_fifo_get_gr_engine_id
gk20a_fifo_act_eng_interrupt_mask
gk20a_fifo_engine_interrupt_mask
gk20a_fifo_get_all_ce_engine_reset_mask

Jira NVGPU-1315

Change-Id: I63d9dcd905a0bebcc9a4c65776cf6ec7a0837acf
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2011298
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-15 09:44:19 -08:00
Konsta Holtta
93e15f9c43 gpu: nvgpu: rename redundant runlist names in HAL
Drop the "runlist_" part in the runlist section of the HAL ops. For
example:

- old: g->ops.runlist.runlist_wait_pending
- new: g->ops.runlist.wait_pending

At the same time, drop the "fifo_" part from the function names. For
example:

- old: gk20a_fifo_runlist_wait_pending
- new: gk20a_runlist_wait_pending

Also rename eng_runlist_base_size to count_max. The size of the
eng_runlist_base register array depicts the maximum possible number of
runlists in the chip for which count_max is more descriptive.

Jira NVGPU-1309

Change-Id: Ie9e94b9f65cd10d3e682d19954f240adb6e311be
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017403
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-14 18:52:29 -08:00
Debarshi Dutta
ddcdf364b7 gpu: nvgpu: use public APIs of engine_status_info unit
nvgpu driver presently uses h/w functions to read and process
the engine_status registers. H/w headers shouldn't be directly invoked
by common code and should be called via HAL layer. This patch replaces
the h/w headers with the APIs in the engine_status_info unit.

Jira NVGPU-1315

Change-Id: I767a2b116b07cce4f4b587e6da8dd118afa27de5
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2005470
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-13 14:34:03 -08:00
Debarshi Dutta
e60bae8ec4 gpu: nvgpu: add engine_status_info unit
A new unit nvgpu_engine_status_info is added. The unit provides a HAL
ops function pointer read_engine_status_info() to read and produce
a struct of type nvgpu_engine_status_info. Additionally, the unit
provides public APIs to retrieve data from the struct
nvgpu_engine_status_info.

Jira NVGPU-1315

Change-Id: I6c167c36081bee5c9a8db51d3467c8f5f02c2685
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2003886
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-13 14:34:00 -08:00
Konsta Holtta
38c548a39c gpu: nvgpu: Add channel.reset_faulted HAL
Add a HAL op for resetting the eng_faulted and pbdma_faulted states on a
channel. This used to be a local feature in fifo_gv11b.c; the HAL is
defined for all chips from gv11b onwards.

Jira NVGPU-1307

Change-Id: I120a59c429851cc69e712ddd5b06a4b3d16c06c9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017269
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-12 17:06:37 -08:00
Konsta Holtta
44e4d69734 gpu: nvgpu: add channel.force_ctx_reload HAL
Isolate the write to ccsr_channel_force_ctx_reload behind a HAL op.

Jira NVGPU-1307

Change-Id: Iaef7d740f4a89e4a45c7de28f001a7dea98ce066
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017268
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-12 17:06:28 -08:00
Konsta Holtta
9457a5ea91 gpu: nvgpu: add eng_faulted to channel HAL for gv11b+
The ccsr_channel_eng_faulted field exists from Volta onwards. Implement
the read_state HAL op for those chips, and store that bit as a boolean
in the channel state info.

Jira NVGPU-1307

Change-Id: Ie997892f2d3db0725496661a4d3083e7396894cc
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017267
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-12 17:06:18 -08:00
Konsta Holtta
cd4b2f642c gpu: nvgpu: add HAL for reading ccsr_channel
Refactor read accesses to the ccsr_channel register for channel state to
be done via a channel HAL op for all chips. A new op called read_state
is added for this; information needed by other units is collected in a
new struct nvgpu_channel_hw_state.

Jira NVGPU-1307

Change-Id: Iff9385c08e17ac086d97f5771a54b56b2727e3c4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2017266
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-12 17:06:09 -08:00