Process long regops lists in 4-kB fragments, overcoming the overly
low limit of 128 reg ops per IOCTL call. Bump the list limit to 1024
and report the limit in GPU characteristics.
Bug 200248726
Change-Id: I3ad49139409f32aea8b1226d6562e88edccc8053
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/1253716
(cherry picked from commit 22314619b28f52610cb8769cd4c3f9eb01904eab)
Reviewed-on: http://git-master/r/1266652
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- PG statistics read support for multiple engines
- updated stat_dmem_offset member to array to hold
dmem offset of PG engines
- PMU allocates memory in DMEM for each PG engine requested,
updated gk20a_pmu_get_elpg_residency_gating() to get
engine statistics for requested PG engine
JIRA DNVGPU-71
Change-Id: I2ddade37f85716f757bf33034dbff816184577eb
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1250506
(cherry picked from commit 68ba7a97d6662b87d0e489365d8afb8e2d237a03)
Reviewed-on: http://git-master/r/1270972
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
- Added enable_mscg, mscg_enabled & mscg_stat flags,
mscg_enabled flag can be used to controll
mscg enable/disable at runtime along with mscg_stat flag.
- Added defines & interface to support ms/mclk-change/post-init-param
- Added defines for lpwr tables read from vbios.
- HAL to support post init param which is require
to setup clockgating interface in PMU & interfaces used during
mscg state machine.
- gk20a_pmu_pg_global_enable() can be called when pg support
required to enable/disable, this also checks & wait
if pstate switch is in progress till it complets
- pg_mutex to protect PG-RPPG/MSCG enable/disable
JIRA DNVGPU-71
Change-Id: If312cefc888a4de0a5c96898baeaac1a76e53e46
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1247554
(cherry picked from commit e6c94948b8058ba642ea56677ad798fc56b8a28a)
Reviewed-on: http://git-master/r/1270971
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
- pmu_init_powergating loops & init multiple
PG engines based on PG engines supported
- generalize pg init param HAL to support
multiple PG-engine init based on PG engine
parameter
- HAL's to return supported PG engines on chip &
its sub features of engine.
- Send Allow/Disallow for PG engines which are
enabled & supported.
- Added defines for pg engines
JIRA DNVGPU-71
Change-Id: I236601e092e519a269fcb17c7d1c523a4b51405f
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1247409
(cherry-picked from commit 1c138cc475bac7d3c3fbbd5fb18cfcb2e7fdf67a)
Reviewed-on: http://git-master/r/1269319
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Store pending sema waits so that they can be explicitly handled when
the driver dies. If the sema_wait is freed before the pending wait is
either handled or canceled problems occur.
Internally the sync_fence_wait_async() function uses the kernel timers.
That uses a linked list of possible events. That means every so often
the kernel iterates through this list. If the list node that is in the
sync_fence_waiter struct is freed before it can be removed from the
pending timers list then the kernel timers list can be corrupted. When
the kernel then iterates through this list crashes and other related
problems can happen.
Bug 1816516
Bug 1807277
Change-Id: Iddc4be64583c19bfdd2d88b9098aafc6ae5c6475
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250025
(cherry picked from commit 01889e21bd31dbd7ee85313e98079138ed1d63be)
Reviewed-on: http://git-master/r/1261920
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Check if the GPU is present after each register read. If the a register
read returns 0xffffffff then it's possible the GPU has fallen off the
bus for some reason or another. However, to confirm that a register read
is due to a dead GPU vs just a 0xffffffff being returned by happenstance
the chip ID register is read which should never return 0xffffffff. If
that read returns 0xffffffff as well then certainly the GPU is dead.
Bug 1805082
Bug 1816516
Bug 1807277
Change-Id: I4de61b56289217d9c0d8167e84615a67c8bde8a9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1239518
(cherry picked from commit bd50828de20aba9b2887ee99c2269602c21a793f)
Reviewed-on: http://git-master/r/1261916
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add HAL interface for TSG open, which is intended to
be called from the exisiting gk20a_tsg_open function.
The tsg_open entryoint is only implemented for vgpu,
as the server needs to clear metadata when a tsg is opened.
Bug 200215060
Change-Id: Icc8fd602f31e52d9fa9b2e7786b665b9e7b9294e
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1249218
(cherry picked from commit 35c86f7c796c6574d3dc336e20012ea5c16d7cb4)
Reviewed-on: http://git-master/r/1256468
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
JIRA DNVGPU-129
1)Add function hook for PMU VFE event handler which will do for VF
curve re-evaluation
2)Add function hook to send temperature limit of GPU sensor
3)Call VFE event handler from PMU's event handle function
Change-Id: I2e3577d3d895e97e6ad06e92f0f4827f9855d0b6
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1245393
(cherry picked from commit 1a5c6c32cdec73fb23735430f43577eda675e5af)
Reviewed-on: http://git-master/r/1268060
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_scale_target(), to check for duplicate
freq requests we compare current frequency with
devfreq->previous_freq
But for very first request after boot, we have
devfreq->previous_freq set to MIN freq
And in case we evaluate new frew as MIN freq
then we skip calling postscale() and scaling
of EMC clock
This results in keeping EMC at MAX value
To fix this, add new variable last_freq in
gk20a structure.
Use this variable to store frequency value
and to compare for duplicate requests
Bug 200255163
Bug 200257544
Change-Id: Icfc57234c63f68cce8ccf8221237105272dad853
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1263747
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
JIRA DNVGPU-143
(1) Added conversion routines in ctrl_gk20a.c to
do conversions between Hz and MHZ
(2) Use new api to prevent corruption of requests
is multiple threads on same session commit
simultaneously
Change-Id: I87875e593d2cc90647d5c4f60a4e293ed3ea6b83
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1239460
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1267152
Reviewed-by: Automatic_Commit_Validation_User
Install one completion fd per SET request.
Notifications on dedicated event fd.
Changed frequencies unit to Hz from MHz.
Remove sequence numbers from dummy arbiter.
Added effective clock type (query frequency from counters).
Jira DNVGPU-125
Change-Id: Ica364eccdf85b188fd208f770e4eae0e9f0379e9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1230224
(cherry picked from commit f9b06686c090c676e60e1e137fdc9bbfc76d4843)
Reviewed-on: http://git-master/r/1243109
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add ioctls for clock range and VF points query.
Add ioctls to set target mhz, and get actual mhz.
Jira DNVGPU-125
Change-Id: I7639789bb15eabd8c98adc468201dba3a6e19ade
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1223473
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 5e635ae34221c99a739321bcfc1418db56c1051d)
Reviewed-on: http://git-master/r/1243107
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Instead of using enum type for litter values, use
define macros. This will fix:
1. Resolve ambiguity associated with enum type size.
2. Litter values can be extended easily in future chips.
JIRA GV11B-21
Change-Id: Idca5144ea3754820c67831a716bb0aaf2e375eb2
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1254854
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We right now obtain pm_qos frequency requirments in
qos notifier callback gk20a_scale_qos_notify()
But now we want to limit GPU frequencies based on
frequency limited from devfreq nodes
And devfreq requirement should precede over
qos requirements
Hence, move all frequency estimation and clipping
to function gk20a_scale_target() which sets the
frequency at the end
Bug 200245796
Change-Id: I0572c676dce0acc0917924a11e4c0fb4a9db4e6e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1243427
(cherry picked from commit 81c757a3232463d126aecba64ca0c55d8e4423d2)
Reviewed-on: http://git-master/r/1239936
Reviewed-by: Aaron Huang <aaronh@nvidia.com>
Tested-by: Aaron Huang <aaronh@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following implementation,
1) Power Sensor Table parsing.
2) Power Topology Table parsing.
3) Add debugfs interface to get the current power(mW), current(mA) and
voltage(uV) information from PMU.
4) Power Policy Table Parsing
5) Implement PMU boardobj interface for pmgr module.
6) Over current protection.
JIRA DNVGPU-47
Change-Id: I620f4470aa704f1cc920e03947831440fbb0eb05
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1217176
(cherry picked from commit ed56743c2ac8dc325c75f85a82271d2d5ed8d96a)
Reviewed-on: http://git-master/r/1241952
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following small modifications,
1) Add proper memset size handling during pmu surface cleanup
2) Reset the pmu surface mem desc pointer after deallocate the memory
JIRA DNVGPU-47
Change-Id: I400f8c4d3f5dc650d4fc6669cef6a1e41a70f4ab
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1220100
(cherry picked from commit 1f171b977be51db20c2dfc56b3f6e3dd6b4b9095)
Reviewed-on: http://git-master/r/1240881
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move ELCG parameter programming to a new function in therm,
elcg_init_idle_filter. Implement gk20a variant and use it for gk20a
and gm20b.
JIRA DNVGPU-74
Change-Id: I8ef400f3a6195311fb9e7da8db6c34993d62f461
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1220433
(cherry picked from commit f6654ae4d83d31cd40b317bf55922964bbfa575d)
Reviewed-on: http://git-master/r/1239421
GVS: Gerrit_Virtual_Submit
Implement the ability to swap between different PCIe bus speeds. This
code is called during init in case the GPU is not running at the max
supported PCIe bus speed.
JIRA DNVGPU-89
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1218178
(cherry picked from commit 8dcd3e10f46f524c9bac9fd5dae0f0a899123c23)
Change-Id: I21f96110578a68d5c5e30ae21776cff69aefba5d
Reviewed-on: http://git-master/r/1227922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We take an extra power refcount when we disable railgating
through railgate_enable sysfs
And that breaks suspend/resume since we check for power
refcount first in gk20a_pm_suspend()
Fix this with following :
- set a flag user_railgate_disabled when User
disables railgating through sysfs railgate_enable
- in gk20a_pm_suspend(), drop one power refcount
if flag is set
- in gk20a_pm_resume(), take one refcount again
if flag is set
Fix __gk20a_do_idle() to consider this extra refcount as well.
Add new variable target_ref_cnt and use it instead of
assuming target refcount of 1
In case User has disabled rail gating, set this target
refcount as 2
Also, export gk20a_idle_nosuspend() which drop power refcount
without triggering suspend
Bug 200233570
Change-Id: Ic0e35c73eb74ffefea1cd90d1b152650d9d2043d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1236047
(cherry picked from commit 6e002d57da4b5c58ed79889728bb678d3aa1f1b1)
Reviewed-on: http://git-master/r/1235219
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There are eight tiles per map tile register and
depending on how many tpcs are present, there is
a chance that s/w will be accessing un-allocated
memory for reading tile values from temp buffers.
Bug 1735760
Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1221580
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Suppress error message when nvgpu tries to load VBIOS overlay, but
one is not found. This situation is normal. This is done by moving
gk20a_request_firmware() to be nvgpu generic function
nvgpu_request_firmware(), and adding a NO_WARN flag to it.
Introduce also a NO_SOC flag to suppress attempt to load firmware
from SoC specific directory in addition to the chip specific
directory. Use it for dGPU firmware files.
Bug 200236777
Change-Id: I0294d3308f029a6a6d3c2effa579d5f69a91e418
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1223840
(cherry picked from commit cca44c3f010f15918cdd2259c15170ba1917828a)
Reviewed-on: http://git-master/r/1233353
GVS: Gerrit_Virtual_Submit
It attaches the neccesary namemap structures to the clock struct so we can enumerate the clock domains in the debugfs code in nvgpu-t18x.
the other is to add an accessor for the fields.
JIRA DNVGPU-98
Change-Id: I6e5c6e763b2b88daa1995f4136a9a7b33ea25b17
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1199083
Reviewed-on: http://git-master/r/1204016
(cherry picked from commit b9d95a45791b93ddc010d1aeddbe798d2a9705d4)
Reviewed-on: http://git-master/r/1227910
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Putting the wmb() before the write only ensures that any previous
writes are done. But this doesn't really do anything for the
writel_relaxed(). The point of the wmb() here is to ensure that
the write performed by the writel_relaxed() is actually done
before proceeding.
Bug 1811382
Change-Id: I7250ea074b8548c899acfd34d816de466cf53b6f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1216434
(cherry picked from commit c9aa02dc61138615d971902fe58dc6a113cdf00a)
Reviewed-on: http://git-master/r/1223457
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>