Commit Graph

4614 Commits

Author SHA1 Message Date
Konsta Holtta
7e96b14390 gpu: nvgpu: track opened Linux ctrl files
An upcoming patch will need to enumerate opened ctrl nodes; track them
in a list, protected by a mutex.

Bug 200145225
Bug 200541476

Change-Id: I50dc15056832a3bb53fbdd7bd2bffcdaecc7b21c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1811840
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit d53495400e
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/2170005
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-15 00:58:25 -07:00
Konsta Holtta
8281262187 gpu: nvgpu: add usermode_base HAL
Add a HAL function pointer to fifo to for reading the usermode_cfg0
register and implement it for gv11b.

Bug 200145225
Bug 200541476

Change-Id: I5f77b15d3b502d9370b1f14129314eaf51a9d7d1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1811839
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit fddb296924
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/2170004
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-15 00:58:15 -07:00
Konsta Holtta
eca2cf043e gpu: nvgpu: store bus addr of gpu regs
Usermode submit needs to access the usermode region of registers from
userspace. Store the start address of register resource in struct
nvgpu_os_linux to be used in remap to userspace.

Bug 200145225
Bug 200541476

Change-Id: I3796b6bf67942af0cc16c86accb82a013032bfc8
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1811838
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 38c11db264
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/2169921
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-15 00:58:05 -07:00
Debarshi Dutta
58ee7561f7 gpu: nvgpu: Add CHANNEL_SETUP_BIND IOCTL
For a long time now, the ALLOC_GPFIFO_EX channel IOCTL has done much
more than just gpfifo allocation, and its signature does not match
support that's needed soon. Add a new one called SETUP_BIND to hopefully
cover our future needs and deprecate ALLOC_GPFIFO_EX.

Change nvgpu internals to match this new naming as well.

Bug 200145225
Bug 200541476

Change-Id: I766f9283a064e140656f6004b2b766db70bd6cad
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1835186
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from e0c8a16c8d
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/2169882
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-15 00:57:45 -07:00
Divya Singhatwaria
ae175e45ed gpu: nvgpu: Use TPC_PG_MASK to powergate the TPC
- In GV11B, read fuse_status_opt_tpc_gpc register
  to read which TPCs are floorswept.
- The driver will also read sysfs node: tpc_pg_mask
- Based on these two values "can_tpc_powergate" will
  be set to true or false and mask will be used to write to
  fuse_ctrl_opt_tpc_gpc register to powergate the TPC.
- can_tpc_powergate = true indicates that the mask value
  sent from userspace is valid and can be used to power gate
  the desired TPC
- can_tpc_powergate = false indicates that the mask value
  sent from userspace is not valid and cannot  be used to
  power gate the desired TPC.

Bug 200532639

Change-Id: Ib0806e4c96305a13b3574e8063ad8e16770aa7cd
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2159219
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-02 12:57:24 -07:00
Debarshi Dutta
47f6bc0c2e gpu: nvgpu: Fix the race between runtime PM and L2 flush
gk20a_mm_l2_flush flushes the L2 cache when "struct gk20a->power_on"
is true. But it doesn't acquire power lock when doing that, which
creates a race that runtime PM might suspend the GPU in the middle
of L2 flush. The FB flush looks having the same issue with L2 flushing.

This patch fixes that by calling pm_runtime_get_if_in_use at the
beginning of the ioctl. This API from PM does a compare and add to
the usage count. If the device was not in use, it simply returns
without incrementing the usage count as its unnecessary to wake up
the GPU(using e.g. a call to gk20a_busy()) as the caches are
flushed when the device would be resumed anyways.

Bug 2643951

Change-Id: I2417f7ca3223c722dcb4d9057d32a7e065b9e574
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2151532
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mark Zhang <markz@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-02 07:12:00 -07:00
Debarshi Dutta
9fdb446b47 gpu: nvgpu: add missing ops assignment
fecs_trace is not enabled in rel-32 for gm20b due to a missing
assignment of gops->fecs_trace from gm20b->fecs_trace. This patch
corrects this by adding the required line.

Bug 2052906

Change-Id: I90c360d170373534270b0125a5905bee512d5316
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2164991
GVS: Gerrit_Virtual_Submit
Reviewed-by: Jonathan Mccaffrey <jmccaffrey@nvidia.com>
Tested-by: Jonathan Mccaffrey <jmccaffrey@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-01 06:56:43 -07:00
Jeremy Ho
42c2bdfb9f gpu: nvgpu: remove reversed ordering for deadlock
In some cases, we would get deadlock issue due to there are two locks
acquisition on common clk driver's lock and nvgpu driver's locks. At
the bug, inconsistent lock ordering problem will come with one thread
gets "nvgpu lock -> clk lock" and the other thread gets "clk lock ->
nvgpu lock".

Slove the latter path with one-time initializing clk_parent entry
and use cached data afterward.

Bug 2555115

Change-Id: I31c5c2728f406307e7cfd4e555f4db0c163234d8
Signed-off-by: Jeremy Ho <jeremyh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2146727
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aleksandr Frid <afrid@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-07-16 06:12:03 -07:00
Peter Daifuku
00b41b8538 gpu: nvgpu: align size to page size in vgpu map
Align size to the page size in vgpu_gp10b_locked_gmmu_map
before setting up the memory descriptors being passed to the
RM server

Bug 2212569
Bug 200528973

Change-Id: I7149f3116c2c4c909f77cd791f5954ad8c486073
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1953444
(cherry picked from commit 0babd46eb4)
Reviewed-on: https://git-master.nvidia.com/r/2140963
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Steinle <tsteinle@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-06-24 23:14:56 -07:00
Seema Khowala
e5c8bbb391 gpu: nvgpu: set channel to serviceable after it is bound to tsg
Channel's unserviceable status should to set to false only
after channel is bound to tsg.

Bug 200460037

Change-Id: I24976c673b3b08cc652d2c203b9fc1f3aaed403f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2135923
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-06-14 23:59:39 -07:00
sumitg
fadfa3289f gpu: nvgpu: vgpu: correct param to sysfs_attr_init
Pass correct attr parameter to sysfs_attr_init().
This fixes the compilation error on enabling debug
lock alloc.
 error: ‘struct device_attribute’ has no member named ‘key’

Bug 200464909
Bug 2604007

Change-Id: Ia0d2672b1c8fe9eb4807b4809892dcdc0cff2669
Signed-off-by: sumitg <sumitg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2034954
(cherry picked from commit daa4d7e42b)
Reviewed-on: https://git-master.nvidia.com/r/2132154
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Phoenix Jung <pjung@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-06-11 11:32:02 -07:00
Kary Jin
fea9e05454 gpu: nvgpu: add check for "vm->num_user_mapped_buffers"
The "nvgpu_big_zalloc()" will be failed if the passed-in argument
"vm->num_user_mapped_buffers" is zero. The returned value is 16
which will bypass the NULL-check and then causes the panic.

This patch adds a check on the "vm->num_user_mapped_buffers" to
avoid the zero is passed-in the "nvgpu_big_zalloc()".

Bug 2603292

Change-Id: I399eecf72a288e13992730651a34a6cea1ef56d1
Signed-off-by: Kary Jin <karyj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2123499
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Daniel Fu <danifu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-30 22:17:11 -07:00
Debarshi Dutta
1f867543da gpu: nvgpu: Add DT support for TPC_PG_POWERGATE
Added support for TPC_PG_POWERGATE during probe for nvgpu via DT.
A new DT binding GV11B_FUSE_OPT_TPC_DISABLE is supported by nvgpu
driver that checks for valid masks and updates the global tpc_pg_mask
flag.

Bug 200518434

Change-Id: Ia65ae518b48e36d28de5e9375bc994232f6a9438
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2117783
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-15 16:14:30 -07:00
Debarshi Dutta
543a904e63 gpu: nvgpu: fecs ctxsw trace for gm20b
Register gk20a non-arch-specific functions for gm20b
gpu_ops.fecs_trace,

Register nvgpu_os_linux_ops.fecs_trace.init_debugfs

gp10b_fecs_trace_flush is now replaced by gm20b_fecs_trace_flush in
fecs_trace_gm20b.* and the fecs_trace_gp10b.* files are removed.

Bug 2052906

Change-Id: I245c91ae8e6015b87bafeb3ec023b98fe4c57501
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2115247
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-14 14:59:33 -07:00
Peng Liu
27625718c4 Revert "gpu: nvgpu: cache gpu clk rate"
This reverts commit e9a6d179a4 ("gpu: nvgpu: cache gpu clk rate")

 - Real clock rate doesn't always equal clock rate requested by caller
 - call of clk_set_rate() and update of cached_rate are not atomic
 - Real root cause for Bug 2051688 is in bpmp and gboost design

Bug 2538692

Change-Id: I9248e0c69e2271ed2d0070587db59afa6f8160f2
Signed-off-by: Peng Liu <pengliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2109708
(cherry picked from commit cc70f89bb4)
Reviewed-on: https://git-master.nvidia.com/r/2113647
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aaron Tian <atian@nvidia.com>
Tested-by: Aaron Tian <atian@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-10 06:43:12 -07:00
Shih-hsin Li
af95d14bb0 gpu: nvgpu: fix synchronization in nvgpu_vm_map
The mapping early returned from nvgpu_vm_map might already
be unmapped during channel clean up. Increase refcount of
an already mapped buffer inside the scope of update_gmmu_lock
mutex to avoid this race.

Bug 200494150

Change-Id: I66d9272e42c40cd3aae7ba3bb8106ec37691bf8e
Signed-off-by: Shih-hsin Li <seasonl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2114163
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vinayak Pane <vpane@nvidia.com>
Reviewed-by: Daniel Fu <danifu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-09 20:43:17 -07:00
Debarshi Dutta
6509bb49da gpu: nvgpu: protect recovery with engines_reset_mutex
Rename gr_reset_mutex to engines_reset_mutex and acquire it
before initiating recovery. Recovery running in parallel with
engine reset is not recommended.

On hitting engine reset, h/w drops the ctxsw_status to INVALID in
fifo_engine_status register. Also while the engine is held in reset
h/w passes busy/idle straight through. fifo_engine_status registers
are correct in that there is no context switch outstanding
as the CTXSW is aborted when reset is asserted.

Use deferred_reset_mutex to protect deferred_reset_pending variable
If deferred_reset_pending is true then acquire engines_reset_mutex
and call gk20a_fifo_deferred_reset.
gk20a_fifo_deferred_reset would also check the value of
deferred_reset_pending before initiating reset process

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I47de669a6203e0b2e9a8237ec4e4747339b9837c
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2022373
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from cb91bf1e13
in dev-main)
Reviewed-on: https://git-master.nvidia.com/r/2024901
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-09 14:42:33 -07:00
Debarshi Dutta
4d8ad643d6 gpu: nvgpu: wait for gr.initialized before changing cg/pg
set gr.initialized to false in the beginning of gk20a_gr_reset() and
set it to true at the end of successful execution of gk20a_gr_reset.

Use gk20a_gr_wait_initialized() to enable/disable cg/pg
functions to make sure engine is out of reset and initialized.

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: Ic7b0b71382c6d852a625c603dad8609c43b7f20f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from 7e2f124fd1 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2111038
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-09 14:42:14 -07:00
Debarshi Dutta
bdaacf5441 gpu: nvgpu: disable elpg before ctxsw_disable
if fecs is sent stop_ctxsw method, elpg entry/exit cannot happen
and may timeout. It could manifest as different error signatures
depending on when stop_ctxsw fecs method gets sent with respect
to pmu elpg sequence. It could come as pmu halt or abort or
maybe ext error too.

If ctxsw failed to disable, do not read engine info and just abort tsg.

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I5f3ba07663bcafd3f0083d44c603420b0ccf6945
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2014914
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2018156
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-09 14:41:50 -07:00
Debarshi Dutta
c81cc032c4 gpu: nvgpu: add cg and pg function
Add new power/clock gating functions that can be called by
other units.

New clock_gating functions will reside in cg.c under
common/power_features/cg unit.

New power gating functions will reside in pg.c under
common/power_features/pg unit.

Use nvgpu_pg_elpg_disable and nvgpu_pg_elpg_enable to disable/enable
elpg and also in gr_gk20a_elpg_protected macro to access gr registers.

Add cg_pg_lock to make elpg_enabled, elcg_enabled, blcg_enabled
and slcg_enabled thread safe.

JIRA NVGPU-2014

Change-Id: I00d124c2ee16242c9a3ef82e7620fbb7f1297aff
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2025493
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from c905858565 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2108406
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-09 14:41:30 -07:00
Anuj Gangwar
f495f52c70 nvgpu: Change the path in the dependent files
changes in path because we move the nvhost linux user-interface
from include/linux/ to include/uapi/linux

depends on I2e116dc8f6c33f53c03fb56b923931b6e600b534

Bug 2062672

Change-Id: If2e165852432d5795cf6680cfeb5d4b661fdee74
Signed-off-by: Anuj Gangwar <anujg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1953731
(cherry picked from commit 4e7333967d)
Reviewed-on: https://git-master.nvidia.com/r/2110254
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-03 13:43:59 -07:00
Seema Khowala
889271dc04 gpu: nvgpu: change err to info print if failing eng id is -1
For handle_sched_error, change err to info print for failing eng
id returned as -1 i.e. FIFO_INVAL_ENGINE_ID as no engine is found
busy doing ctxsw. May be ctxsw already finished for the context
for which ctxsw timeout intr was triggered.

Possible Causes:
a)
On hitting engine reset, h/w drops the ctxsw_status to INVALID in
fifo_engine_status register. Also while the engine is held in reset
h/w passes busy/idle straight through. fifo_engine_status registers
are correct in that there is no context switch outstanding
as the CTXSW is aborted when reset is asserted.
This is just a side effect of how gv100 and earlier versions of
ctxsw_timeout behave.
With gv10b and later, h/w snaps the context at the point of error
so that s/w can see the tsg_id which caused the HW timeout.
b)
If engines are not busy and ctxsw state is valid then intr occurred
in the past and if the ctxsw state has moved on to VALID from LOAD
or SAVE, it means that whatever timed out eventually finished
anyways. The problem with this is that s/w cannot conclude which
context caused the problem as maybe more switches occurred before
intr is handled.

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: Ia79bee6e860fb179ee39024c963671d4f8245227
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2030866
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry-picked from d27f875d2c
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2076126
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-02 02:43:42 -07:00
Seema Khowala
dd282e229a gpu: nvgpu: do not do timeout_debug_dump for non fifo_error_idle_timeout
Any recovery that goes through gk20a_fifo_recover path e.g. gr error,
mmu fault or any recovery that involves engine recovery as well, will
still dump the full debug dump. This change will just avoid dumping debug
dump for force reset channels and pbdma intr if they do not involve
engine recovery. For FIFO_ERROR_IDLE_TIMEOUT error notifiers that
involves tsg recovery only, debug_dump will happen only if
timeout_debug_dump is set. timeout_debug_dump by default is set to true
but can be changed using NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX.

Bug 2092051

Change-Id: Ibbf3cd2c44c586d9deb9e61ffbf37945b8d9e428
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033068
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 5222d0ff4f
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2076117
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-02 02:43:11 -07:00
Seema Khowala
ef69df6dae gpu: nvgpu: add hal to mask/unmask intr during teardown
ctxsw timeout error prevents recovery as it can get triggered
periodically. Disable ctxsw timeout interrupt to allow recovery.

Bug 2092051
Bug 2429295
Bug 2484211
Bug 1890287

Change-Id: I47470e13968d8b26cdaf519b62fd510bc7ea05d9
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2019645
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 68c13e2f04
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2024899
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-05-02 02:43:02 -07:00
Seema Khowala
9bde6f8950 gpu: nvgpu: gv11b: add missing tsg_mark_error
nvgpu_tsg_mark_error is missing in teardown path for aborting tsg.
Without this, channels corresponding to tsg being aborted will not be
set to timedout (unserviceable) and also notifier_wq and semaphore_wq
will not be woken up.

Bug 2092051
Bug 2429295
Bug 2484211

Change-Id: Ie71c9a3b7a7fd1aa8cb9ec5d0dc30ccaeadfeae5
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1999026
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 7fed0c1937
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2086594
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-04-16 03:57:43 -07:00
Peng Liu
3a11883f7f gpu: nvgpu: using pmu counters for load estimate
PMU counters #0 and #4 are used to count total cycles and busy cycles.
These counts are used by podgov to estimate GPU load.

PMU idle intr status register is used to monitor overflow. Overflow
rarely occurs because frequency governor reads and resets the counters
at a high cadence. When overflow occurs, 100% work load is reported to
frequency governor.

Bug 1963732

Change-Id: I046480ebde162e6eda24577932b96cfd91b77c69
Signed-off-by: Peng Liu <pengliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1939547
(cherry picked from commit 34df003519)
Reviewed-on: https://git-master.nvidia.com/r/1979495
Reviewed-by: Aaron Tian <atian@nvidia.com>
Tested-by: Aaron Tian <atian@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Tested-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-04-01 15:27:17 -07:00
Deepak Nibade
f1be222687 gpu: nvgpu: fix invalid TSG pointer
In gr_gp10b_set_cilp_preempt_pending() we already extract TSG pointer
by calling tsg_gk20a_from_ch() which safely returns correct TSG or
NULL in error case

But before calling g->ops.fifo.post_event_id() we again extract TSG
by directly accessing g->fifo.tsg array, and this could result in
getting invalid TSG pointer

Fix this by removing direct TSG extraction through g->fifo.tsg

Bug 2444819
Jira NVGPU-1601

Change-Id: I9d49b5309c74e162828e7cb7d97556aae939a07c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1984954
(cherry picked from commit dcd3778b5e)
Reviewed-on: https://git-master.nvidia.com/r/2077313
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-04-01 09:12:33 -07:00
Deepak Nibade
8282b72a04 gpu: nvgpu: fix channel reference leak in error case
In gr_gp10b_get_cilp_preempt_pending_chid(), we leak the channel
reference if tsg_gk20a_from_ch() returns NULL
Fix this by calling gk20a_channel_put() in error case

Bug 2444819
Jira NVGPU-1601

Change-Id: Ic5d036c6d043b0b95dd2a564afcc0add67c1ca02
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1984953
(cherry picked from commit 2322cb131c)
Reviewed-on: https://git-master.nvidia.com/r/2077312
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-04-01 09:12:25 -07:00
Peter Daifuku
9e329ca39b gpu: nvgpu: tsg: ensure unbound channel is disabled
Multiple threads could be unbinding different channels from
the same tsg at the same time. At the point where we
remove the channel from the tsg's channel list, call
disable_channel again, in case another thread had
re-enabled the channel after we had disabled it.

Bug 200404549

Change-Id: I9abbc08dc11fe1f7a0abada88376c0ef96b56610
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2083337
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Satish Arora <satisha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-29 03:57:26 -07:00
Seema Khowala
e00804594b gpu: nvgpu: remove gk20a_is_channel_marked_as_tsg
Use tsg_gk20a_from_ch to get tsg pointer for tsgid of a channel. For
invalid tsgid, tsg pointer will be NULL

Bug 2092051
Bug 2429295
Bug 2484211

Change-Id: I82cd6a2dc5fab4acb147202af667ca97a2842a73
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2006722
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 13f37f9c70
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2025507
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-18 11:30:16 -07:00
Preetham Chandru R
77ee4144ce gpu: nvgpu: add compatibility version
Add compatibility version to page table and dma mapping structure.

Bug 200438879

Change-Id: I04b4601f71ae2b3e75843f39f5445ecca2b16677
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2029086
(cherry picked from commit 8bbbd09caa)
Reviewed-on: https://git-master.nvidia.com/r/2071427
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-13 14:43:56 -07:00
Dmitry Pervushin
4269d56d02 nvgpu: more changes to clean loading/unloading
Bug 200487652

Change-Id: Ib52cc6a85a19ea0396c8ab584c5ce9970f93085a
Signed-off-by: Dmitry Pervushin <dpervushin@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2020386
(cherry picked from commit 617dff478c3687a08ed5b77f4ac2073b290c57ea)
Reviewed-on: https://git-master.nvidia.com/r/2035720
GVS: Gerrit_Virtual_Submit
Reviewed-by: Rahul Jain (SW-TEGRA) <rahuljain@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-11 11:00:46 -07:00
Dmitry Pervushin
7c8d212b50 gpu: do not release managed resource
l->bar is a managed resource, it will be released automatically
Therefore, there is no need to explicitly unmap it

Bug 200487652

Change-Id: Ic543baa770d9cbcf7e7319281c4a27fab4b4b4df
Signed-off-by: dmitry pervushin <dpervushin@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2012324
GVS: Gerrit_Virtual_Submit
Reviewed-by: Rahul Jain (SW-TEGRA) <rahuljain@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-03-11 10:58:11 -07:00
Seema Khowala
c9d4df288d gpu: nvgpu: remove code for ch not bound to tsg
- Remove handling for channels that are no more bound to tsg
  as channel could be referenceable but no more part of a tsg
- Use tsg_gk20a_from_ch to get pointer to tsg for a given channel
- Clear unhandled gr interrupts

Bug 2429295
JIRA NVGPU-1580

Change-Id: I9da43a2bc9a0282c793b9f301eaf8e8604f91d70
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1972492
(cherry picked from commit 013ca60edd
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2018262
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Tested-by: Debarshi Dutta <ddutta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-22 18:59:18 -08:00
Seema Khowala
465aff5f0d gpu: nvgpu: do not use raw spinlock for ch->timeout.lock
With PREEMPT_RT kernel, regular spinlocks are mapped onto sleeping
spinlocks (rt_mutex locks), and raw spinlocks retain their behaviour.

Schedule while atomic can occur in gk20a_channel_timeout_start,
as it acquires ch->timeout.lock raw spinlock, and then calls
functions that acquire ch->ch_timedout_lock regular spinlock.

Bug 200484795

Change-Id: Iacc63195d8ee6a2d571c998da1b4b5d396f49439
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2004100
(cherry picked from commit aacc33bb47
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2017923
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Tested-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-18 06:02:00 -08:00
Konsta Holtta
5e440e63d6 gpu: nvgpu: abstract out timeout rewinding
The channel timeout ends up in a strange state during timeout handling
for a brief moment; it can become stopped and started again, and the
timeout lock is released in the middle. Add a more explicit rewind
function to reset the timeout to start if it's active. The active check
allows to use this from gk20a_channel_timeout_restart_all_channels(), so
that's also modified.

Also replace the return statements with more readable control flow in
gk20a_channel_timeout_handler().

Bug 200484795

Change-Id: Ia7d67242dfc149ace1f4f841a837e90b6c985308
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1989327
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry picked from commit 8979a97af3
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2017922
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Tested-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-18 06:01:57 -08:00
Preetham Chandru R
b5d13e16ae gpu: nvgpu: rename dma map/umap interfaces
On Desktop verion, map is called nvidia_p2p_dma_map_pages and umap is
called nvidia_p2p_dma_umap_pages. So renamed these two apis to match
the desktop version.

Bug 200438879

Change-Id: I66301c48b832dfed8c3950678f473c2f82b8761a
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2014940
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-14 06:00:16 -08:00
Seema Khowala
cbf6394482 gpu: nvgpu: check ch_timedout for poll/restart
poll_timeouts and timeout_restart_all_channels should
only handle channels that have not been recovered/aborted.
Check ch_timedout status of the channel to make sure
channel is still alive to be used. A channel reference
could still be available even if it is recovered but not
closed.

Bug 2404865

Change-Id: I016c8b9952ef1d4c349c2a2a2ca55cb81326d380
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929339
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit def687d4df
in rel-32)
Reviewed-on: https://git-master.nvidia.com/r/2016995
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-13 13:19:45 -08:00
Seema Khowala
f78918fd6c gpu: nvgpu: do not suspend/resume recovered channel
Already torn down channels should not be suspended or
resumed. A channel reference could still be available
even if it is recovered but not closed. Use ch_timedout
status to check if channel is already recovered/aborted.

Bug 2404865

Change-Id: I718eab6032ee94a9322da7a239a978b388de2b01
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929338
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 88cff206ae
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2016994
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-13 13:19:41 -08:00
Seema Khowala
220860d043 gpu: nvgpu: rename has_timedout and make it thread safe
Currently has_timedout variable is protected by wmb at places
where it is being set and there is no correspoding rmb whenever
has_timedout variable is read. This is prone to errors for
concurrent execution. This change is supposed to fix this issue.
Rename has_timedout variable of channel struct to ch_timedout.
Also to avoid rmb every time ch_timedout is read,
ch_timedout_spinlock is added to protect ch_timedout
variable for taking care of concurrent execution.

Bug 2404865
Bug 2092051

Change-Id: I0bee9f50af0a48720aa8b54cbc3af97ef9f6df00
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1930935
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 1f54ea09e3
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2016975
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-13 13:19:37 -08:00
Debarshi Dutta
18643ac135 gpu: nvgpu: replace input param chid with pointer to channel
preempt_channel needs to use the channel to pass it to other
public functions, get access to a tsg etc. This qualifies it to take a
pointer to a channel as an input parameter instead of a chid.

Increment the channel ref counter using the function
gk20a_channel_from_id in functions where we get the chid from the h/w
registers directly. Once the prempt_channel function call is done,
use a gk20a_channel_put on the referenced channel.

Jira NVGPU-1461

Change-Id: I6c87c8104cfcb418d468c8c590087fd4aeabf4bd
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1963200
(cherry picked from commit 9abe9fe062
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2013728
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 08:18:47 -08:00
Debarshi Dutta
a8f0cb89f4 gpu: nvgpu: replace input param chid with pointer to channel
gk20a_fifo_recover_channel takes a reference to the channel via its
chid before passing the channel pointer to other public functions such
as gk20a_channel_abort and gk20a_fifo_error_ch. This qualifies the
gk20a_fifo_recover_channel to take a pointer to a channel instead of
only chid.

Jira NVGPU-1461

Change-Id: I338a12a05e5ccee785a202fea7848db5201a3a39
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1963199
(cherry picked from commit 99acb8011a
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2013727
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 08:18:44 -08:00
Debarshi Dutta
d9efcd5871 gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a
The function gk20a_fifo_recover_tsg has to pass a valid struct tsg to
other functions from within. This qualifies it to have a pointer to
struct tsg_gk20a as an input parameter.

Tsg specific parts of the gk20a_fifo_preempt_timeout_rc are now moved
into another function gk20a_fifo_preempt_timeout_rc_tsg
that takes a tsg as an input and passes it to gk20a_fifo_recover_tsg.
The pointer to a tsg is also used to enumerate channels from within.

The function gk20a_fifo_preempt_timeout_rc now contains only channel
specific code.

Jira NVGPU-1461

Change-Id: Ice0a9921567841fb5586a7e4e010c442ca6cf172
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1961675
(cherry picked from commit e19cea7ab3
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2013726
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 08:18:40 -08:00
Debarshi Dutta
ef9de9e992 gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a
gv11b_fifo_preempt_tsg needs to access the runlist_id of the tsg as
well as pass the tsg pointer to other public functions such as
gk20a_fifo_disable_tsg_sched. This qualifies the preempt_tsg to use a
pointer to a struct tsg_gk20a instead of just using the tsgid.

Jira NVGPU-1461

Change-Id: I01fbd2370b5746c2a597a0351e0301b0f7d25175
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959068
(cherry picked from commit 1e78d47f15
in rel-32)
Reviewed-on: https://git-master.nvidia.com/r/2013725
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 08:18:36 -08:00
Debarshi Dutta
5b8ecbc51f gpu: nvgpu: replace tsgid input variable with pointer to a struct tsg_gk20a
replace tsgid with a pointer to a struct tsg_gk20a in the function
gk20a_fifo_tsg_abort(). gk20a_fifo_tsg_abort needs to enumerate through
all the channels within the tsg as well as pass the tsg pointer to
other functions, qualifying the need to use a pointer instead as an
input parameter.

Jira NVGPU-1461

Change-Id: I59cec05d5d778f733d0c3e9ffadf46e74e249080
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1956567
(cherry picked from commit e5bebd880f
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2013724
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-11 08:18:33 -08:00
Konsta Holtta
7e8ba851a8 gpu: nvgpu: delete raw chid lookup
This (dangerous) array lookup with no channel references is now unused.

Jira NVGPU-1460

Change-Id: Ic6bdbcf19fc8996bc6ff02a40afe3224bdd5bc27
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1955402
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 4a53854a92 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2008517
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 09:04:33 -08:00
Konsta Holtta
301a9d2426 gpu: nvgpu: store ch ptr in gr isr data
Store a channel pointer that is either NULL or a referenced channel to
avoid confusion about channel ownership. A pure channel ID is dangerous.

Jira NVGPU-1460

Change-Id: I6f7b4f80cf39abc290ce9153ec6bf5b62918da97
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1955401
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 4e6d9afab8 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2008516
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 09:04:24 -08:00
Konsta Holtta
3794afbeb1 gpu: nvgpu: add safe channel id lookup
Add gk20a_channel_from_id() to retrieve a channel, given a raw channel
ID, with a reference taken (or NULL if the channel was dead). This makes
it harder to mistakenly use a channel that's dead and thus uncovers bugs
sooner. Convert code to use the new lookup when applicable; work remains
to convert complex uses where a ref should have been taken but hasn't.

The channel ID is also validated against FIFO_INVAL_CHANNEL_ID; NULL is
returned for such IDs. This is often useful and does not hurt when
unnecessary.

However, this does not prevent the case where a channel would be closed
and reopened again when someone would hold a stale channel number. In
all such conditions the caller should hold a reference already.

The only conditions where a channel can be safely looked up by an id and
used without taking a ref are when initializing or deinitializing the
list of channels.

Jira NVGPU-1460

Change-Id: I0a30968d17c1e0784d315a676bbe69c03a73481c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1955400
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 7df3d58750
in dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2008515
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 09:04:20 -08:00
Philip Elcan
ed6e396090 gpu: nvgpu: channel: make chid u32
The chid member of the channel_gk20a struct was being used as a unsigned
value. By being declared as an int, it was causing MISRA 10.3 violations
for implicit assignment of different types.

JIRA NVGPU-647

Change-Id: I7477fad6f0c837cf7ede1dba803158b1dda717af
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1918470
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 1c7bb9b538 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2008514
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 09:04:11 -08:00
Philip Elcan
bace52ac7a gpu: nvgpu: make tsgid a consistent type
Different units were declaring tsgid as int or u32. This makes everyone
use u32. This change resolves MISRA 10.3 violations for implicit
assingment to different types.

JIRA NVGPU-647

Change-Id: I78660e737acb0dad76dd538e5dd37f4527cf5acd
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1918469
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit f5cac144a0 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/2008513
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-02-05 09:04:02 -08:00