Commit Graph

342 Commits

Author SHA1 Message Date
Thomas Fleury
2088dc5d85 gpu: nvgpu: remove dead code for get gr runlist_id
nvgpu_engine_get_gr_runlist_id gets the first instance of
active GR engine using nvgpu_engine_get_ids. Therefore the
engine_id is already known to be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.

Jira NVGPU-4511

Change-Id: Ifcc0851e3d14d862e2ed7b21ea57f17a66eca9dd
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2263617
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
ca17622b7e gpu: nvgpu: set invalid veid for non GR engines
In nvgpu_engine_mmu_fault_id_to_eng_id_and_veid, set veid to
invalid for non-GR engines.

Jira NVGPU-4511

Change-Id: I2cec7898f8f7dec15224fdf70c444c0dd6de8a16
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262220
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
6fa5da61d7 gpu: nvgpu: use engine_id to access engine_info
Generalize use of "engine_id" variable name to index f->engine_info.

Jira NVGPU-4511

Change-Id: Ie3bc2c701dc3bab833d6ac134273dd6a102528c2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262219
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
66b68edd6b gpu: nvgpu: iterator name for active_engines
Some functions used engine_id or eng_id to index active_engines_list,
which could get confusing when used in conjunction with similar
variable as active_engine_id or act_eng_id.

Use generic iterator name i or j instead, to make it clear that
f->active_engines_list is NOT indexed by engine id.

Jira NVGPU-4511

Change-Id: I07a6bf00dfb6d4e608b10f2f79e38a70e557428c
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262218
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
269fe8bea6 gpu: nvgpu: compile channel dbg_s_* only for debugger
Channel's dbg_s_lock and dbg_s_list are only needed when
CONFIG_NVGPU_DEBUGGER is defined.

Conditionally compile those fields, so that they are
not present in safety build and related documentation.

Jira NVGPU-4376

Change-Id: Ie2e99a39e5cbb60fb05d3eccc4c57242f0eef303
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2273262
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
67696c6870 gpu: nvgpu: conditionally compile tsg event ids
event_id_list and event_id_list_locks fields are only
needed in nvgpu_tsg when CONFIG_NVGPU_CHANNEL_TSG_CONTROL
is defined.

Conditionally compile those fields and related code,
so that they are removed from safety build.

Jira NVGPU-4376

Change-Id: I8678aa1b8cd4166aa37bcb42cda1eb9c703fd32f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2273261
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
ddutta
c9bb9da6da gpu: nvgpu: Reduce CCM for nvgpu_channel_setup_bind
A new static function channel_setup_bind_prechecks is constructed.
All precondition checks present in nvgpu_channel_setup_bind are moved
to channel_setup_bind_prechecks.

Jira NVGPU-4063

Change-Id: I1c784bd74628ba95f427d9b53629016e8b0acb9a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268076
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Vedashree Vidwans
71040ef04f gpu: nvgpu: unit: mm: mmu_fault gv11b_fusa UT
This unit test covers most of the nvgpu.hal.mm.mmu_fault.gv11b_fusa
module lines and almost all branches.

Jira NVGPU-2218

Change-Id: I7c95876a0b1b4bb4b86eb15e21ca0da747d06162
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2258545
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Vedashree Vidwans
55b3890642 gpu: nvgpu: fix bugs in fifo cleanup sw function
Currently, GPU fifo sw_ready flag is not reset after fifo clean_up
execution. This patch resets g->fifo.sw_ready flag in
nvgpu_fifo_cleanup_sw_common() to indicate fifo attributes are reset.
Also, pbdma setup and cleanup functions are optional and may not be
populated. This patch modifies nvgpu_fifo_cleanup_sw_common() to
executes nvgpu_pbdma_cleanup_sw() if pbdma.cleanup_sw is populated.

Jira NVGPU-4339

Change-Id: I6fd53577afdd0a15c75f15b54a916e70e850d1b0
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2237809
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
1fc9a427e0 gpu: nvgpu: tear down TSG on unbind HAL failure
Currently nvgpu_tsg_unbind ignores return code from
g->ops.tsg.unbind_channel. For consistency, tear down
TSG in case an error occurs in the unbind HAL.

Also make sure to restore valid ops for fifo.preempt_tsg
in test_gr_setup_free_obj_ctx, to avoid unbind failure.

Jira NVGPU-4387

Change-Id: I27a9c0daa365d05684149fc4bb17874d60ae1fde
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2248159
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
1865b2804b gpu: nvgpu: bail out on HAL failure in TSG bind
Currently nvgpu_tsg_bind adds that channel to TSG's channel
list, even if g->ops.tsg.bind_channel fails.
Instead, bail out from function, and return an error.

Jira NVGPU-4387

Change-Id: I02dd836d9d499ddbe9b269856e39b2a7c9ccfe64
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2248158
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
a560a378a1 gpu: nvgpu: ce: fifo: fix CE interrupt mask
Fix bug where the CE mask includes other engine types besides just CEs
in nvgpu_ce_engine_interrupt_mask().

The intent of this API is to return mask of CE interrupts. However, the
if clause in the for loop is only excluding engine interrupts if the CE
stall or non-stall ISR is NULL. So, it does not distinquish between CE
or GR engine interrupts if the CE ISR is non-null.

Since the expectation is to not return CE interrupts if the ISRs are
NULL, just return a 0 mask if either ISR is NULL without having to
bother with the loop.

If the ISRs are set in the CE HAL, within the loop, only add interrupts
to the mask returned if the engine type is actually a CE engine (i.e. do
not include GR engine interrupts).

JIRA NVGPU-2224

Change-Id: Ic0048b00f16590fec50bb0858bd3f4498a00650d
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2256269
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
945e9ebee2 gpu: nvgpu: checks in nvgpu_engine_init_info
Return error in nvgpu_engine_init_info if g->ops.top.get_device_info
is NULL. In particular, do not attempt to init CE info.

Jira NVGPU-3693

Change-Id: I521cb43233a48b6e765ffd0b1feee81a30dbd739
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2242699
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
a8c9c800cd gpu: nvgpu: reorganization of MC interrupts control
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.

For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.

JIRA NVGPU-4336

Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
36da08388b gpu: nvgpu: fix check on channel_fatal_0_intr_descs
nvgpu_pbdma_init_intr_descs was checking device_fatal_0_intr_descs
instead of channel_fatal_0_intr_descs to assign
f->intr.pbdma.channel_fatal_0.

Jira NVGPU-3490

Change-Id: Ied8fb9db0bd43e7cb76b6b9f41b0ed5639181d72
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2241798
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
2edf3db10a gpu: nvgpu: move mc gpu_ops out of gk20a.h and add doxygen comments for HALs
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.

JIRA NVGPU-2524

Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
9169e8c048 gpu: nvgpu: mc: move mc declarations to mc.h
Move declarations that belong to mc from gk20a.h to mc.h where they
belong.

JIRA NVGPU-2532

Change-Id: I91934ff60e2735c61d16459c04507fed6e1c96d7
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2214421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Peter Daifuku
8f42de2775 gpu: nvgpu: channel_setup_bind: must be bound to TSG
In nvgpu_channel_setup_bind, return an error if the
channel isn't bound to a TSG, as future operations
rely on being bound.

Update usermode setup_bind test to bind channel to
the tsg before calling nvgpu_setup_bind

Manual port from rel-32

Bug 200543218

Change-Id: If33b01b8176c7488445c23080ad9d11f341bff43
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2215160
(cherry picked from commit 56f8e5b878)
Reviewed-on: https://git-master.nvidia.com/r/2218885
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
14b94f7099 gpu: nvgpu: doxygen for fifo HAL
Add documentation for fifo HALs that are called
from other units.
- fifo_init_support
- fifo_suspend
- preempt_tsg
- preempt_runlists_for_rc
- intr_0_isr
- intr_1_isr

Jira NVGPU-4104

Change-Id: I7a7bc4384ef3d9cb5f0b4a6a3ecf0c9ad2de85da
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2213611
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
6c3c360462 gpu: nvgpu: protect nvgpu power state access using spinlock
IRQs can get triggered during nvgpu power-on due to MMU fault, invalid
PRIV ring or bus access etc. Handlers for those IRQs can't access the
full state related to the IRQ unless nvgpu is fully powered on.

In order to let the IRQ handlers know about the nvgpu power-on state
gk20a.power_on_state variable has to be protected through spinlock
to avoid the deadlock due to usage of earlier power_lock mutex.

Further the IRQs need to be disabled on local CPU while updating the
power state variable hence use spin_lock_irqsave and spin_unlock_-
irqrestore APIs for protecting the access.

JIRA NVGPU-1592

Change-Id: If5d1b5e2617ad90a68faa56ff47f62bb3f0b232b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2203860
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Vedashree Vidwans
b634c76cf1 gpu: nvgpu: return error for allocation failure
This patch modifies nvgpu_runlist_setup_sw() to return error code for
allocation failures.

Jira NVGPU-3699

Change-Id: I61d38658ef943474f9ceaf00979dd219714de820
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2211121
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Scott Long
52a4dd74e2 gpu: nvgpu: fix misra 18.4 violations
This change eliminates MISRA Advisory Rule 18.4 violations in the
following cases:

 * nvgpu_submit_append_gpfifo_user_direct()
 * nvgpu_submit_append_gpfifo_common()
    - use array-indexing to access gpfifo entry lists
 * gv11b_gr_intr_record_sm_error_state()
    - use array-indexing to access sm_error_states table

Advisory Rule 18.4 states that the +, -, +=, and -= operators should
not be applied to an expression of pointer type.

JIRA NVGPU-3798

Change-Id: I736930e4ba09a88888b0ef48f62496c4082ea5a1
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2210173
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Philip Elcan
065f98f669 gpu: nvgpu: init: add return for all init APIs
This adds return values for all init APIs. This make all the init APIs
have the same signature. This is a prerequisite to making a table of
init functions.

JIRA NVGPU-3980

Change-Id: I5b71fd06ad248092af133ffe908e2930acb6d2b0
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2202973
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Scott Long
77ffea99bd gpu: nvgpu: fix misra 18.4 violations
This change eliminates MISRA Advisory Rule 18.4
violations in the following by accessing g->fifo.channel
with array indexing:

 * nvgpu_channel_init_support()
 * nvgpu_channel_semaphore_wakeup()

Advisory Rule 18.4 states that the +, -, +=, and -=
operators should not be applied to an expression
of pointer type.

JIRA NVGPU-3798

Change-Id: I6b1bf360db6ec25894cc0ea430c33067e0cddf64
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2207550
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Philip Elcan
9378675213 gpu: nvgpu: whitelist MISRA violations for WARN_ON/BUG_ON
Whitelist false positive violations cause by a Coverity bug that
that overrides the WARN_ON/BUG_ON macros. See nvbug 2277532 for
details on the bug.

JIRA NVGPU-4031

Change-Id: I395f97c89580195485e93275663a062f26ab6fc7
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2207326
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Shashank Singh
6fd0d972ae nvgpu: gpu: include qnx_init unit in doxygen documentation
-Include qnx_init unit in doxygen documentation.
-Add documentation for gk20a_busy/idle and similar functions.
-Remove must_check return value as misra already reports violation for
 that.

Jira NVGPU-2571

Change-Id: I9573cb61865677944809dcc494d92f63cc6e0f58
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176755
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Richard Zhao
e770573468 gpu: nvgpu: disable mmu debug mode before unbind ch from tsg
disable mmu debug mode needs to reference tsg struct, so it must be
called when ch can trace back to tsg.

Bug 2586624

Change-Id: I050b557fb7abbf7e52faec242a1c290742e86c0d
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2206636
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
Tested-by: Kajetan Dutka <kdutka@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Vedashree Vidwans
1155e394a1 gpu: nvgpu: return error for allocation failure
This patch modifies nvgpu_runlist_setup_sw() to return error code for
allocation failures. This patch also modifies nvgpu_runlist_cleanup_sw()
to check active_runlist_info pointer before freeing runlist->mem.

Jira NVGPU-3699

Change-Id: Id6e72188ae5e921568c7ad016c115676358edf58
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2197346
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Thomas Fleury
e7e0879217 gpu: nvgpu: return int for tsg.init_eng_method_buffers
nvgpu_kzalloc can fail in gv11b_init_eng_method_buffers.
Added checks on returned pointer.
Also changed g->ops.tsg.init_eng_method_buffers to return an int,
and check return value in callers.

Jira NVGPU-3788

Change-Id: Icb541665c40b89d512929cc9cf9f6a3e7a0033db
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2205851
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Debarshi Dutta
eecd562be6 gpu: nvgpu: reduce CCM for channel_free
Reduce CCM complexity of channel_free from 20 to 10
by extracting out multiple groups of code into different static
functions.

Jira NVGPU-4063

Change-Id: Iafff739a9db089681c1d74ac6eba1b3c365ee627
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2205286
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Debarshi Dutta
6e2f5a85d3 gpu: nvgpu: rectify incorrect setting of pbdma_acquire_timeout
The driver was incorrectly setting pbdma_acquire_timeout during default
init when kernelmode submits were disabled. This is corrected to make
the behavior similar to the previous mode. Also, added logging for the
pbdma_acquire_timeout value being set in NV_RAMFC_

Jira NVGPU-3172

Change-Id: Ic39638386bd999871cd8eafec70a3770bc648f93
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2203580
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Debarshi Dutta
a3d21d7127 gpu: nvgpu: change CCM for runlist unit
1) Reduce CCM for nvgpu_runlist_setup_sw by extracting the mapping between
runlist_info and active_runlist into a separate static function
nvgpu_init_active_runlist_mapping.

nvgpu_runlist_setup_sw:
Previous MCC TCC | Current MCC TCC
         12   12 |          6   6

nvgpu_init_active_runlist_mapping:
Previous MCC TCC | Current MCC TCC
	 N/A N/A |          8   8

2) Reduce CCM for nvgpu_runlist_get_runlists_mask by restructuring the
function.
nvgpu_runlist_get_runlists_mask:
Previous MCC TCC | Current MCC TCC
         11   11 |         10   10

Jira NVGPU-4063

Change-Id: I458df50f15b2c4b2eeae8432a7687b83f9049194
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2200378
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Debarshi Dutta
d7ae490dff gpu: nvgpu: Reduce CCM for channel function
Reduce CCM for nvgpu_channel_suspend_all_serviceable_ch by early
calling channel.unbind

nvgpu_channel_suspend_all_serviceable_ch:
Previous MCC TCC | Current MCC TCC
         11   11 |          8   8

Jira NVGPU-4063

Change-Id: If701c7d83cbde31a19bbc19866962322c58c370d
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2201486
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Rajesh Devaraj
935c5f6578 gpu: nvgpu: fix misra violations in SDL
This patch addresses misra violations due to SDL error reporting
callbacks. In particular, it addresses the following misra violation:

- misra_c_2012_directive_4_7_violation: Calling function
  "nvgpu_report_*_err()" which returns error information without testing
  the error information.

JIRA NVGPU-4025

Change-Id: Ia10b6b3fd9c127a8c5189c3b6ba316f243cedf04
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196895
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Adeel Raza
252ddc4f05 gpu: nvgpu: add coverity whitelisting support
Add macros for whitelisting coverity violations. These macros use pragma
directives. The pragma directives and whitelisting macros are only
enabled when a coverity scan is being run.

The whitelisting macros have been added to a new header called
static_analysis.h. The contents of safe_ops.h (CERT C safe ops) have
been moved into static_analysis.h because this will be the new header
for static analysis related macros/defines/etc.

JIRA NVGPU-3820

Change-Id: I9c63f20f670880b420415535738034619314b7c3
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180600
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Debarshi Dutta
6f9dfeaab1 gpu: nvgpu: fix misra violations in hal.fifo and common.fifo
The following misra violations are fixed in the current patch.

1) misra_c_2012_directive_4_7_violation: Calling function
"nvgpu_report_host_err" which returns error information without testing
the error information.

2) misra_c_2012_directive_4_7_violation: The variable "intr_0_en_mask"
which contains error information hasn't been tested.

3) misra_c_2012_directive_4_7_violation: Calling function
"gv11b_fifo_intr_0_error_mask(g)" which returns error information
without testing the error information.

4) misra_c_2012_rule_8_6_violation: "gk20a_fifo_bar1_snooping_disable"
is declared but never defined.

5) misra_c_2012_rule_8_6_violation: "gm20b_fuse_check_priv_security" is
declared but never defined.

6) misra_c_2012_rule_8_6_violation: "gm20b_fuse_status_opt_gpc" is
declared but never defined.

Jira NVGPU-3881

Change-Id: I731cd1d99649e07cb39aa75c4715e17eedd4d927
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2188161
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:01:38 -06:00
Scott Long
9a5ea7174d gpu: nvgpu: fix misra 13.4 violation
Advisory Rule 13.4 states that the result of an assignment
operator should not be used.

This change eliminates the Advisory Rule 13.4 violation from
channel_setup_kernelmode().

Jira NVGPU-3178

Change-Id: I6dcbfacec080f99fa4aa6f8e9aa716e994761a6e
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2186588
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-29 18:23:45 -07:00
Vedashree Vidwans
5fd301c61b gpu: nvgpu: fix race for channel sync read/write
CTS test dEQP-VK.api.object_management.max_concurrent.device_group
crashes with invalid userspace memory access.
Currently, nvgpu_submit_prepare_syncs() races with
nvgpu_channel_clean_up_jobs() and this race condition is exposed when
aggressive_sync_destroy_thresh is set to non-zero value.
nvgpu_submit_prepare_syncs() gets ref for c->sync to submit job and
releases channel sync_lock immediately. Meanwhile,
nvgpu_worker_poll_work() triggers nvgpu_channel_clean_up_jobs(), which
destroys ref'd c->sync pointer.
Channel sync is deleted by nvgpu_channel_clean_up_jobs() only if
aggressive_sync_destroy_thresh is non-zero.
So, nvgpu_channel_clean_up_jobs() and nvgpu_submit_prepare_syncs() will
race only in this scenario.
Hence, if aggressive_sync_destroy_thresh value is non-zero, this patch
protects channel's sync pointer by holding channel sync_lock
during complete execution of nvgpu_submit_prepare_syncs().

Bug 2613870

Change-Id: I030d8df7af10d4ed86f921b5cf60de2b1d60e5d3
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2181360
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 17:44:15 -07:00
Vedashree Vidwans
83fea157a3 Revert "gpu: nvgpu: fix race for channel sync read/write"
This reverts commit e22d743a20.

Change-Id: I4ea0a8158030d2fb9700ef5b84f8d77e579c1025
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2182350
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 17:44:00 -07:00
Thomas Fleury
f422aee393 gpu: nvgpu: use refcnt for ch mmu_debug_mode
Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt.
If channel is enabled multiple times by userspace, then ref count is
updated accordingly. There is an expectation that enable/disable
calls are balanced for setting channel's mmu debug mode.
When unbinding the channel, decrease refcnt for the channel until it
reaches 0.
Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it
can be retrieved from ch.

Bug 2515097

Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2184702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 16:54:51 -07:00
Thomas Fleury
8057514a9f gpu: nvgpu: set FB/HSMMU debug mode
Set NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and NV_PFB_PRI_MMU_DEBUG_CTRL
in addition to NV_PGRAPH_PRI_GPCS_MMU_DEBUG_CTRL, in
NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE

Bug 2515097

Change-Id: I1763b43e79fac3edb68a35980683d58bfa89519f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2115785
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 16:54:26 -07:00
Vedashree Vidwans
7bc3cdcf95 gpu: nvgpu: use vpr resize enabled API
This patch adds nvgpu API in linux and posix to query vpr resize.
The new API nvgpu_is_vpr_resize_enabled() is used in
nvgpu_submit_channel_gpfifo().
Previously, if non-deterministic channel has timeout disabled and
GPU cannot railgate on some platform, then channel doesn't power ref
count and results in video freeze. To resolve non-determinstic channel
job tracking needs to be enabled if vpr resize is supported or if GPU
can railgate.

Bug 200532122

Change-Id: Icfbff6253762b195b2f5955749343974b1a7a269
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2171093
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 14:24:19 -07:00
Thomas Fleury
95bb19827e gpu: nvgpu: add sw quiesce
For safety build, nvgpu driver should enter SW quiesce state
in case an uncorrectable error has occurred. In this state, any
activity on the GPU should be prevented, without powering off the GPU.
Also, a minimal set of operations should be used to enter SW quiesce
state.

Entering SW quiesce state does the following:
- set sw_quiesce_pending: when this flag is set, interrupt
  handlers exit after masking interrupts. This should help mitigate
  an interrupt storm.
- wake up thread to complete quiescing.

The thread performs the following:
- set NVGPU_DRIVER_IS_DYING to prevent allocation of new resources
- disable interrupts
- disable fifo scheduling
- preempt all runlists
- set error notifier for all active channels

Note: for channels with usermode submit enabled, userspace can
still ring doorbell, but this will not trigger any work on
engines since fifo scheduling is disabled.

Jira NVGPU-3493

Change-Id: I639a32da754d8833f54dcec1fa23135721d8d89a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2172391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-27 10:37:21 -07:00
Debarshi Dutta
486815f81f gpu: nvgpu: fix misra violations for fifo units.
The following violations are fixed in this patch

a) misra_c_2012_rule_2_1_violation: This code cannot be reached: "return
err;".

b) misra_c_2012_directive_4_7_violation: Calling function
"nvgpu_preempt_channel(g, ch)" which returns error information without
testing the error information.

c) misra_c_2012_rule_8_6_violation: "" is declared but never defined for
following functions

1) gm20b_dump_engine_status
2) gp10b_ramfc_setup
3) gp10b_ramfc_get_syncpt
4) gp10b_ramfc_set_syncpt
5) gk20a_fifo_intr_0_enable
6) gk20a_fifo_intr_0_isr
7) gk20a_fifo_handle_sched_error
8) gk20a_fifo_is_mmu_fault_pending
9) gk20a_fifo_intr_set_recover_mask
10) gk20a_fifo_intr_unset_recover_mask
11) gk20a_init_fifo_reset_enable_hw
12) gk20a_init_fifo_setup_hw
13) nvgpu_tsg_set_runlist_interleave
14) gm20b_dump_engine_status
15) gp10b_pbdma_channel_fatal_0_intr_descs
16) gp10b_pbdma_allowed_syncpoints_0_index_f
17) gp10b_pbdma_allowed_syncpoints_0_valid_f
18) gp10b_pbdma_allowed_syncpoints_0_index_v
19) gk20a_runlist_reschedule

The above functions declarations are now embedded within
CONFIG_NVGPU_HAL_NON_FUSA

d) The function nvgpu_channel_abort_clean_up has a UMD version and hence
its taken out of CONFIG_NVGPU_KERNEL_MODE_SUBMIT to avoid errors of
type c above.

Jira NVGPU-3881

Change-Id: I5f85c7070e1d2f0b18d14db07ce22a01c29f0e40
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2181032
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-27 04:48:41 -07:00
Seema Khowala
2f731c5fa8 gpu: nvgpu: Add doxygen documentation in tsg.h
- Add doxygen documentation.
- Remove unused fields of nvgpu_tsg struct:
-- timeslice_timeout
-- timeslice_scale
- Remove unused functions:
-- nvgpu_tsg_set_runlist_interleave
- nvgpu_tsg_post_event_id is not supported in safety build.
  This function is moved under CONFIG_NVGPU_CHANNEL_TSG_CONTROL
  compiler flag.
- Below functions are moved under CONFIG_NVGPU_KERNEL_MODE_SUBMIT
  nvgpu_tsg_ctxsw_timeout_debug_dump_state
  nvgpu_tsg_set_ctxsw_timeout_accumulated_ms
- Rename
  gk20a_is_channel_active -> nvgpu_tsg_is_channel_active
  release_used_tsg -> nvgpu_tsg_release_used_tsg
- nvgpu_tsg_unbind_channel_common declared static
- Fix build issue when CONFIG_NVGPU_CHANNEL_TSG_CONTROL is disabled
  Remove CONFIG_NVGPU_CHANNEL_TSG_CONTROL for
  nvgpu_gr_setup_set_preemption_mode as it is needed in safety build.
  By default compute preemption mode will be set to WFI. CUDA will
  change it to CTA during context init time.

JIRA NVGPU-3595

Change-Id: I8ff6cabc8b892c691d951c37cdc0721e820a0297
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2151489
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-26 16:06:42 -07:00
vinodg
087d4d3df4 gpu: nvgpu: rmmod support in dgpu simulation
Changes added to support "rmmod nvgpu" in dgpu simulation after gpu
poweron.

nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec
engines in simulation as simulation doesnt support timeout.
These engines are not valid engines in nvgpu engine list.
Add nvgpu_engine_check_valid_id before checking engine status.

Simulation crash on accessing 0xb81604 top interrupt register.
Add func_priv_cpu_intr_top__size_1_v() function to get the supported
size than using default MAX_INTR_TOP_REGS.

nvlink is not supprted in dgpu simulation. Avoid warning for
-ENODEV return.

Avoid register read following gpu power off completion.

Bug 2498574

Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179930
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-21 23:38:56 -07:00
Seema Khowala
e22d743a20 gpu: nvgpu: fix race for channel sync read/write
CTS test dEQP-VK.api.object_management.max_concurrent.device_group
crashes with invalid userspace memory access.
Currently, nvgpu_submit_prepare_syncs() races with
nvgpu_channel_clean_up_jobs() and this race condition is exposed when
aggressive_sync_destroy_thresh is set to non-zero value.
nvgpu_submit_prepare_syncs() gets ref for c->sync to submit job and
releases channel sync_lock. Meanwhile, nvgpu_worker_poll_work()
triggers nvgpu_channel_clean_up_jobs(), which destroys ref'd c->sync
pointer.
This patch protects channel's sync pointer by holding channel sync_lock
during complete execution of nvgpu_submit_prepare_syncs().

Bug 2613870

Change-Id: I6f3d48aff361d1cb38c30d2ce5de276d0c55fb6f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176929
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-20 11:46:02 -07:00
Seema Khowala
c21347e26e gpu: nvgpu: fix missing sync_lock acquire
sync_lock is released without acquiring it in channel free.
Fix it by adding acquire sync_lock.

Bug 2613870

Change-Id: I790636c3209414041a1962d6f3ef5f03cc827561
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176961
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-19 12:56:24 -07:00
Scott Long
4277f65834 gpu: nvgpu: fix misra 2.7 violations
Advisory Rule 2.7 states that there should be no unused
parameters in functions.

This patch removes unused function parameters from the following:

 * nvgpu_channel_ctxsw_timeout_debug_dump_state()
 * nvgpu_channel_destroy()
 * nvgpu_tsg_destroy()
 * nvgpu_rc_pdbma_fault()

Jira NVGPU-3178

Change-Id: I12ad0d287fd7980533663a9776428ef5d4fd1fb9
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176066
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-16 16:06:04 -07:00
Thomas Fleury
3845316f1a gpu: nvgpu: ctxsw timeout for usermode submits
If ctxsw timeout occurs for a channel with usermode submit enabled,
do not attempt to read GP_GET from USERD, since it is not mapped in
kernel memory. Instead, assume that there is no progress on PBDMA.
Recovery will occur in case ctxsw_timeout_accumulated_ms reaches max.

Bug 2674388

Change-Id: I8ccb73f5e51cab6ecdcabbb2359865197163d4aa
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176006
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Kajetan Dutka <kdutka@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-16 14:16:00 -07:00