This CL covers the following implementation,
1) Power Sensor Table parsing.
2) Power Topology Table parsing.
3) Add debugfs interface to get the current power(mW), current(mA) and
voltage(uV) information from PMU.
4) Power Policy Table Parsing
5) Implement PMU boardobj interface for pmgr module.
6) Over current protection.
JIRA DNVGPU-47
Change-Id: I620f4470aa704f1cc920e03947831440fbb0eb05
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1217176
(cherry picked from commit ed56743c2ac8dc325c75f85a82271d2d5ed8d96a)
Reviewed-on: http://git-master/r/1241952
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following small modifications,
1) Add proper memset size handling during pmu surface cleanup
2) Reset the pmu surface mem desc pointer after deallocate the memory
JIRA DNVGPU-47
Change-Id: I400f8c4d3f5dc650d4fc6669cef6a1e41a70f4ab
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1220100
(cherry picked from commit 1f171b977be51db20c2dfc56b3f6e3dd6b4b9095)
Reviewed-on: http://git-master/r/1240881
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move ELCG parameter programming to a new function in therm,
elcg_init_idle_filter. Implement gk20a variant and use it for gk20a
and gm20b.
JIRA DNVGPU-74
Change-Id: I8ef400f3a6195311fb9e7da8db6c34993d62f461
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1220433
(cherry picked from commit f6654ae4d83d31cd40b317bf55922964bbfa575d)
Reviewed-on: http://git-master/r/1239421
GVS: Gerrit_Virtual_Submit
Implement the ability to swap between different PCIe bus speeds. This
code is called during init in case the GPU is not running at the max
supported PCIe bus speed.
JIRA DNVGPU-89
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1218178
(cherry picked from commit 8dcd3e10f46f524c9bac9fd5dae0f0a899123c23)
Change-Id: I21f96110578a68d5c5e30ae21776cff69aefba5d
Reviewed-on: http://git-master/r/1227922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We take an extra power refcount when we disable railgating
through railgate_enable sysfs
And that breaks suspend/resume since we check for power
refcount first in gk20a_pm_suspend()
Fix this with following :
- set a flag user_railgate_disabled when User
disables railgating through sysfs railgate_enable
- in gk20a_pm_suspend(), drop one power refcount
if flag is set
- in gk20a_pm_resume(), take one refcount again
if flag is set
Fix __gk20a_do_idle() to consider this extra refcount as well.
Add new variable target_ref_cnt and use it instead of
assuming target refcount of 1
In case User has disabled rail gating, set this target
refcount as 2
Also, export gk20a_idle_nosuspend() which drop power refcount
without triggering suspend
Bug 200233570
Change-Id: Ic0e35c73eb74ffefea1cd90d1b152650d9d2043d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1236047
(cherry picked from commit 6e002d57da4b5c58ed79889728bb678d3aa1f1b1)
Reviewed-on: http://git-master/r/1235219
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There are eight tiles per map tile register and
depending on how many tpcs are present, there is
a chance that s/w will be accessing un-allocated
memory for reading tile values from temp buffers.
Bug 1735760
Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1221580
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Suppress error message when nvgpu tries to load VBIOS overlay, but
one is not found. This situation is normal. This is done by moving
gk20a_request_firmware() to be nvgpu generic function
nvgpu_request_firmware(), and adding a NO_WARN flag to it.
Introduce also a NO_SOC flag to suppress attempt to load firmware
from SoC specific directory in addition to the chip specific
directory. Use it for dGPU firmware files.
Bug 200236777
Change-Id: I0294d3308f029a6a6d3c2effa579d5f69a91e418
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1223840
(cherry picked from commit cca44c3f010f15918cdd2259c15170ba1917828a)
Reviewed-on: http://git-master/r/1233353
GVS: Gerrit_Virtual_Submit
It attaches the neccesary namemap structures to the clock struct so we can enumerate the clock domains in the debugfs code in nvgpu-t18x.
the other is to add an accessor for the fields.
JIRA DNVGPU-98
Change-Id: I6e5c6e763b2b88daa1995f4136a9a7b33ea25b17
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1199083
Reviewed-on: http://git-master/r/1204016
(cherry picked from commit b9d95a45791b93ddc010d1aeddbe798d2a9705d4)
Reviewed-on: http://git-master/r/1227910
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Putting the wmb() before the write only ensures that any previous
writes are done. But this doesn't really do anything for the
writel_relaxed(). The point of the wmb() here is to ensure that
the write performed by the writel_relaxed() is actually done
before proceeding.
Bug 1811382
Change-Id: I7250ea074b8548c899acfd34d816de466cf53b6f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1216434
(cherry picked from commit c9aa02dc61138615d971902fe58dc6a113cdf00a)
Reviewed-on: http://git-master/r/1223457
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add support for cyclestats snapshots in the virtual case
Bug 1700143
JIRA EVLR-278
Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1211547
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
We have completely different versions of probe for
nvgpu and pci device
Extract out common steps into nvgpu_probe() function
and separate it out in new file nvgpu_common.c
Divide task of nvgpu_probe() into further smaller
functions
Do platform specific things (like irq handling,
memresource management, power management) only in
individual probes and then call nvgpu_probe() to
complete the common initialization
Move all debugfs initialization to common gk20a_debug_init()
This also helps to bringup all debug nodes to pci device
Pass debugfs_symlink name as a parameter to gk20a_debug_init()
This allows us to set separate debugfs symlink for nvgpu
and pci device
In case of railgating, cde and ce debugfs, check if
platform supports them or not
Copy vidmem_is_vidmem from platform to mm structure
and set it to true for pci device
Return from gk20a_scale_init() if we don't have either of
governor or qos_notifier
Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc()
to receive device pointer instead of platform_device
Export gk20a_railgating_debugfs_init() so that we can call
it from gk20a_debug_init()
Jira DNVGPU-56
Jira DNVGPU-58
Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204207
(cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe)
Reviewed-on: http://git-master/r/1204462
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a new debug message type: gpu_dbg_map_v. This is used for mapping
messages that are not specifically memory map operations.
Also cleanup the memory mapping debugging a bit since there was one
duplicate print and the memory map print was difficult to parse
visually. As a result the message has been modified to put the most
important information first in an easily readable format.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ib19c9371ee958009ab5a2d89b9610e699d070ee2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198593
(cherry picked from commit 51dba53b06ca171cdb13d1707f2d026b0ce29f07)
Reviewed-on: http://git-master/r/1147670
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Flushing timestamp record method can fail in case FECS is not
processing the main method queue. In particular, this occurs
in case of ctxsw timeout, where we process fifo sched interrupts
from the host, but FECS is still waiting for idle (grWFI).
In such scenario, this adds huge delay in fifo recovery
procedure (timeout on FECS method). Since flushing the last
(incomplete) record from FECS would only be useful in that case
(context switch ongoing), remove flush operation on engine
reset. Note that an explicit ENGINE_RESET event (with pid)
is inserted in user-facing ctxsw buffer on engine reset.
Bug 200228310
Change-Id: I885525f8f197f81266b50db161bb511867fc74f4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1207305
(cherry picked from commit 44391b6204fd648949295f90481b0c424d9a5ddf)
Reviewed-on: http://git-master/r/1208414
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
We currently post bpt events (bpt.int and bpt.pause) even
before we process and clear the interrupts and this
could cause races with UMD
Fix this by posting bpt events only after we are done
processing the interrupts
Bug 200209410
Change-Id: Ic3ff7148189fccb796cb6175d6d22ac25a4097fb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1184109
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Added preemption mode (WFI, GFXP, CTA and CILP) support for gp10x
family gr class (PASCAL_B and PASCAL_COMPUTE_B).
Bug 200221149
Change-Id: I859a4d2db518bca0ffeb0d85a6bb271f6b15db87
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1193207
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Added interface to allow kernel to create privileged CE channels for
page migration and clearing support between sysmem and videmem.
JIRA DNVGPU-53
Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173085
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Added a dedicated device node to allow an
app manager to control TSG scheduling parameters:
- Get list of TSGs
- Get list of recent TSGs
- Get list of TSGs per pid
- Get TSG current scheduling parameters
- Set TSG timeslice
- Set TSG runlist interleave
Jira VFND-1586
Change-Id: I014c9d1534bce0eaea6c25ad114cf0cff317af79
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1160384
(cherry picked from commit 75ca739517cc7f7f76714b5f6a1a57c39b8cb38e)
Reviewed-on: http://git-master/r/1167021
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
This patch calculates:
-Total time spent by GPU with rails gated.
-Total time spent by GPU with rails ungated.
-Total Railgating Cycles.
and dumps this information in debugfs file.
This feature requires CONFIG_DEBUG_FS set to true.
Bug 200195100
Change-Id: I1379f11237ce4900076947e18524caaa3304c7cb
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: http://git-master/r/1178308
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
We currenlty initialize both runtime PM and pm_domains frameworks
and use pm_domain to control runtime power management of NvGPU
But since GPU has a separate rail, using pm_domain is not
strictly required
Hence remove pm_domain support and use runtime PM only for all
the power management
This also simplifies the code a lot
Initialization in gk20a_pm_init()
- if railgate_delay is set, set autosuspend delay of runtime PM
- try enabling runtime PM
- if runtime PM is now enabled, keep GPU railgated
- if runtime PM is not enabled, keep GPU unrailgated
- if can_railgate = false, disable runtime PM and keep
GPU unrailgated
Set gk20a_pm_ops with below callbacks for runtime PM
static const struct dev_pm_ops gk20a_pm_ops = {
.runtime_resume = gk20a_pm_runtime_resume,
.runtime_suspend = gk20a_pm_runtime_suspend,
.resume = gk20a_pm_resume,
.suspend = gk20a_pm_suspend,
}
Move gk20a_busy() to use runtime checks of pm_runtime_enabled()
instead of using compile time checks on CONFIG_PM
Clean up some pm_domain related code
Remove use of gk20a_pm_enable/disable_clk() since this
should be already done in platform specific unrailgate()/
railgate() APIs
Fix "railgate_delay" and "railgate_enable" sysfs to use
runtime PM calls
For VGPU, disable runtime PM during vgpu_pm_init()
With this, we will initialize vgpu with vgpu_pm_finalize_poweron()
upon first call to gk20a_busy()
Jira DNVGPU-57
Change-Id: I6013e33ae9bd28f35c25271af1239942a4fa0919
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1163216
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped to
gpu.
JIRA DNVGPU-18
JIRA DNVGPU-76
Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169308
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that have
a mem_desc available.
Jira DNVGPU-76
Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Revamp the support the nvgpu driver has for semaphores.
The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degredation in performance.
To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.
To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a sempahore may be released on one address
space but acquired in another.
Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e released) by a
malicious channel. Each channel then gets a RW mapping of it's own
semaphore. This way a channel may only acquire other channel's
semaphores but may both acquire and release its own semaphore.
The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
Bug 1732449
JIRA DNVGPU-12
Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This method is called when setting up gr
hardware. It is meant to adjust preemption
parameters.
Bug 1593548
Jira VFND-1894
Change-Id: I0f5aa3212bec3058a0493366bed6fe2a365c9542
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162625
(cherry picked from commit c2e6d12570af28b3aae087401d7f670df40d40bd)
Reviewed-on: http://git-master/r/1166987
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Define specific QoS notifier for common clk framework
and protect it with CONFIG_COMMON_CLK
This new API will first get min/max requirements from
pm_qos and set min/max freq values in devfreq
A call to update_devfreq() will then ensure that
new estimated frequency is clipped appropriately
between min and max values
This also ensures that frequency is set along with
all the book-keeping
Add below platform specific notifier callback and use it
with pm_qos_add_notifier()
int (*qos_notify)()
If qos_notify is set, then only register the callback
We currently support only one qos_id which is treated
as notifier for min frequency
Remove dependency on qos_id, and use appropriate QoS
APIs like pm_qos_read_min/max_bound()
Store devfreq's min/max frequency in struct gk20a
for reference
Bug 1772462
Change-Id: I63d6d17451d19c9d376b67df7db775b38929287d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1161161
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add CE engine to vgpu engine list. CE engine is defined differently
for different GPUs, so we also add HAL for initializing the engine
info.
Bug 1780185
Change-Id: I5ae265551feac08d0c4d45402dd3277514e62b2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1169720
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
-update get_netlist_name ops declaration to support
to load GPU FW based on GPU-ARCH
-"GAxxx" string used to get size for "gm204/" or
"gm206/" which will added to NETIMAGE path like
"gm204/NETC_img.bin"
Change-Id: I5bfa13df014533a885c4328d3c767e51c29f9255
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1166783
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move all places that read ptimer to use the callback.
It's for add vgpu implementation of read ptimer.
Bug 1395833
Change-Id: Ia339f2f08d75ca4969a443fffc9a61cff1d3d2b7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1159587
(cherry picked from commit a01f804684f875c9cffc31eb2c1038f2f29ec66f)
Reviewed-on: http://git-master/r/1158449
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
When debug dump is called from an interrupt thread, we do not want
to call gk20a_busy() because it causes race in case rail gating is
being engaged at the same time. It has to be called from all debugfs
paths.
Bug 200198908
Bug 1770522
Change-Id: I7eda7d029b0a59cce0320ecc1b750dc2f4d7ccf0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1163440
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>