In gk20a_scale_target(), to check for duplicate
freq requests we compare current frequency with
devfreq->previous_freq
But for very first request after boot, we have
devfreq->previous_freq set to MIN freq
And in case we evaluate new frew as MIN freq
then we skip calling postscale() and scaling
of EMC clock
This results in keeping EMC at MAX value
To fix this, add new variable last_freq in
gk20a structure.
Use this variable to store frequency value
and to compare for duplicate requests
Bug 200255163
Bug 200257544
Change-Id: Icfc57234c63f68cce8ccf8221237105272dad853
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1263747
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
JIRA DNVGPU-143
(1) Added conversion routines in ctrl_gk20a.c to
do conversions between Hz and MHZ
(2) Use new api to prevent corruption of requests
is multiple threads on same session commit
simultaneously
Change-Id: I87875e593d2cc90647d5c4f60a4e293ed3ea6b83
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1239460
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1267152
Reviewed-by: Automatic_Commit_Validation_User
Install one completion fd per SET request.
Notifications on dedicated event fd.
Changed frequencies unit to Hz from MHz.
Remove sequence numbers from dummy arbiter.
Added effective clock type (query frequency from counters).
Jira DNVGPU-125
Change-Id: Ica364eccdf85b188fd208f770e4eae0e9f0379e9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1230224
(cherry picked from commit f9b06686c090c676e60e1e137fdc9bbfc76d4843)
Reviewed-on: http://git-master/r/1243109
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add ioctls for clock range and VF points query.
Add ioctls to set target mhz, and get actual mhz.
Jira DNVGPU-125
Change-Id: I7639789bb15eabd8c98adc468201dba3a6e19ade
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1223473
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 5e635ae34221c99a739321bcfc1418db56c1051d)
Reviewed-on: http://git-master/r/1243107
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Instead of using enum type for litter values, use
define macros. This will fix:
1. Resolve ambiguity associated with enum type size.
2. Litter values can be extended easily in future chips.
JIRA GV11B-21
Change-Id: Idca5144ea3754820c67831a716bb0aaf2e375eb2
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1254854
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We right now obtain pm_qos frequency requirments in
qos notifier callback gk20a_scale_qos_notify()
But now we want to limit GPU frequencies based on
frequency limited from devfreq nodes
And devfreq requirement should precede over
qos requirements
Hence, move all frequency estimation and clipping
to function gk20a_scale_target() which sets the
frequency at the end
Bug 200245796
Change-Id: I0572c676dce0acc0917924a11e4c0fb4a9db4e6e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1243427
(cherry picked from commit 81c757a3232463d126aecba64ca0c55d8e4423d2)
Reviewed-on: http://git-master/r/1239936
Reviewed-by: Aaron Huang <aaronh@nvidia.com>
Tested-by: Aaron Huang <aaronh@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following implementation,
1) Power Sensor Table parsing.
2) Power Topology Table parsing.
3) Add debugfs interface to get the current power(mW), current(mA) and
voltage(uV) information from PMU.
4) Power Policy Table Parsing
5) Implement PMU boardobj interface for pmgr module.
6) Over current protection.
JIRA DNVGPU-47
Change-Id: I620f4470aa704f1cc920e03947831440fbb0eb05
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1217176
(cherry picked from commit ed56743c2ac8dc325c75f85a82271d2d5ed8d96a)
Reviewed-on: http://git-master/r/1241952
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following small modifications,
1) Add proper memset size handling during pmu surface cleanup
2) Reset the pmu surface mem desc pointer after deallocate the memory
JIRA DNVGPU-47
Change-Id: I400f8c4d3f5dc650d4fc6669cef6a1e41a70f4ab
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1220100
(cherry picked from commit 1f171b977be51db20c2dfc56b3f6e3dd6b4b9095)
Reviewed-on: http://git-master/r/1240881
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move ELCG parameter programming to a new function in therm,
elcg_init_idle_filter. Implement gk20a variant and use it for gk20a
and gm20b.
JIRA DNVGPU-74
Change-Id: I8ef400f3a6195311fb9e7da8db6c34993d62f461
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1220433
(cherry picked from commit f6654ae4d83d31cd40b317bf55922964bbfa575d)
Reviewed-on: http://git-master/r/1239421
GVS: Gerrit_Virtual_Submit
Implement the ability to swap between different PCIe bus speeds. This
code is called during init in case the GPU is not running at the max
supported PCIe bus speed.
JIRA DNVGPU-89
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1218178
(cherry picked from commit 8dcd3e10f46f524c9bac9fd5dae0f0a899123c23)
Change-Id: I21f96110578a68d5c5e30ae21776cff69aefba5d
Reviewed-on: http://git-master/r/1227922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We take an extra power refcount when we disable railgating
through railgate_enable sysfs
And that breaks suspend/resume since we check for power
refcount first in gk20a_pm_suspend()
Fix this with following :
- set a flag user_railgate_disabled when User
disables railgating through sysfs railgate_enable
- in gk20a_pm_suspend(), drop one power refcount
if flag is set
- in gk20a_pm_resume(), take one refcount again
if flag is set
Fix __gk20a_do_idle() to consider this extra refcount as well.
Add new variable target_ref_cnt and use it instead of
assuming target refcount of 1
In case User has disabled rail gating, set this target
refcount as 2
Also, export gk20a_idle_nosuspend() which drop power refcount
without triggering suspend
Bug 200233570
Change-Id: Ic0e35c73eb74ffefea1cd90d1b152650d9d2043d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1236047
(cherry picked from commit 6e002d57da4b5c58ed79889728bb678d3aa1f1b1)
Reviewed-on: http://git-master/r/1235219
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There are eight tiles per map tile register and
depending on how many tpcs are present, there is
a chance that s/w will be accessing un-allocated
memory for reading tile values from temp buffers.
Bug 1735760
Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1221580
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Suppress error message when nvgpu tries to load VBIOS overlay, but
one is not found. This situation is normal. This is done by moving
gk20a_request_firmware() to be nvgpu generic function
nvgpu_request_firmware(), and adding a NO_WARN flag to it.
Introduce also a NO_SOC flag to suppress attempt to load firmware
from SoC specific directory in addition to the chip specific
directory. Use it for dGPU firmware files.
Bug 200236777
Change-Id: I0294d3308f029a6a6d3c2effa579d5f69a91e418
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1223840
(cherry picked from commit cca44c3f010f15918cdd2259c15170ba1917828a)
Reviewed-on: http://git-master/r/1233353
GVS: Gerrit_Virtual_Submit
It attaches the neccesary namemap structures to the clock struct so we can enumerate the clock domains in the debugfs code in nvgpu-t18x.
the other is to add an accessor for the fields.
JIRA DNVGPU-98
Change-Id: I6e5c6e763b2b88daa1995f4136a9a7b33ea25b17
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1199083
Reviewed-on: http://git-master/r/1204016
(cherry picked from commit b9d95a45791b93ddc010d1aeddbe798d2a9705d4)
Reviewed-on: http://git-master/r/1227910
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Putting the wmb() before the write only ensures that any previous
writes are done. But this doesn't really do anything for the
writel_relaxed(). The point of the wmb() here is to ensure that
the write performed by the writel_relaxed() is actually done
before proceeding.
Bug 1811382
Change-Id: I7250ea074b8548c899acfd34d816de466cf53b6f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1216434
(cherry picked from commit c9aa02dc61138615d971902fe58dc6a113cdf00a)
Reviewed-on: http://git-master/r/1223457
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add support for cyclestats snapshots in the virtual case
Bug 1700143
JIRA EVLR-278
Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1211547
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
We have completely different versions of probe for
nvgpu and pci device
Extract out common steps into nvgpu_probe() function
and separate it out in new file nvgpu_common.c
Divide task of nvgpu_probe() into further smaller
functions
Do platform specific things (like irq handling,
memresource management, power management) only in
individual probes and then call nvgpu_probe() to
complete the common initialization
Move all debugfs initialization to common gk20a_debug_init()
This also helps to bringup all debug nodes to pci device
Pass debugfs_symlink name as a parameter to gk20a_debug_init()
This allows us to set separate debugfs symlink for nvgpu
and pci device
In case of railgating, cde and ce debugfs, check if
platform supports them or not
Copy vidmem_is_vidmem from platform to mm structure
and set it to true for pci device
Return from gk20a_scale_init() if we don't have either of
governor or qos_notifier
Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc()
to receive device pointer instead of platform_device
Export gk20a_railgating_debugfs_init() so that we can call
it from gk20a_debug_init()
Jira DNVGPU-56
Jira DNVGPU-58
Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204207
(cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe)
Reviewed-on: http://git-master/r/1204462
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a new debug message type: gpu_dbg_map_v. This is used for mapping
messages that are not specifically memory map operations.
Also cleanup the memory mapping debugging a bit since there was one
duplicate print and the memory map print was difficult to parse
visually. As a result the message has been modified to put the most
important information first in an easily readable format.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ib19c9371ee958009ab5a2d89b9610e699d070ee2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198593
(cherry picked from commit 51dba53b06ca171cdb13d1707f2d026b0ce29f07)
Reviewed-on: http://git-master/r/1147670
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Flushing timestamp record method can fail in case FECS is not
processing the main method queue. In particular, this occurs
in case of ctxsw timeout, where we process fifo sched interrupts
from the host, but FECS is still waiting for idle (grWFI).
In such scenario, this adds huge delay in fifo recovery
procedure (timeout on FECS method). Since flushing the last
(incomplete) record from FECS would only be useful in that case
(context switch ongoing), remove flush operation on engine
reset. Note that an explicit ENGINE_RESET event (with pid)
is inserted in user-facing ctxsw buffer on engine reset.
Bug 200228310
Change-Id: I885525f8f197f81266b50db161bb511867fc74f4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1207305
(cherry picked from commit 44391b6204fd648949295f90481b0c424d9a5ddf)
Reviewed-on: http://git-master/r/1208414
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
We currently post bpt events (bpt.int and bpt.pause) even
before we process and clear the interrupts and this
could cause races with UMD
Fix this by posting bpt events only after we are done
processing the interrupts
Bug 200209410
Change-Id: Ic3ff7148189fccb796cb6175d6d22ac25a4097fb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1184109
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>