Commit Graph

213 Commits

Author SHA1 Message Date
Terje Bergstrom
7d44a8d8d8 gpu: nvgpu: Support mclk initialization
Add ops for calling mclk initialization.

JIRA DNVGPU-85

Change-Id: I2e9da80fdb014d916b40513d605c38711818d2f6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1203975
(cherry picked from commit 9be482c4ece7ffc550ae19f133638c808b3a768f)
Reviewed-on: http://git-master/r/1217300
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-08 20:06:06 -07:00
Mahantesh Kumbar
39c48cb8bf gpu: nvgpu: get bios perf and clk table ptr
Implement support for reading perf and clk tables from VBIOS.

JIRA DNVGPU-83

Change-Id: I095fea08479161362e4c2ffa7500ee6a57d6d447
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1202602
(cherry picked from commit fb7c7356f131a198bd655a25fc6ff17067477e1b)
Reviewed-on: http://git-master/r/1217299
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-08 20:05:58 -07:00
Peter Daifuku
9aa7de15c2 gpu: nvgpu: vgpu: cyclestat snapshot support
Add support for cyclestats snapshots in the virtual case

Bug 1700143
JIRA EVLR-278

Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1211547
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 16:04:09 -07:00
Deepak Nibade
70cad5fbb5 gpu: nvgpu: unify nvgpu and pci probe
We have completely different versions of probe for
nvgpu and pci device
Extract out common steps into nvgpu_probe() function
and separate it out in new file nvgpu_common.c
Divide task of nvgpu_probe() into further smaller
functions

Do platform specific things (like irq handling,
memresource management, power management) only in
individual probes and then call nvgpu_probe() to
complete the common initialization

Move all debugfs initialization to common gk20a_debug_init()
This also helps to bring up all debug nodes on the pci device

Pass debugfs_symlink name as a parameter to gk20a_debug_init()
This allows us to set separate debugfs symlink for nvgpu
and pci device

In case of railgating, cde and ce debugfs, check if
platform supports them or not

Copy vidmem_is_vidmem from platform to mm structure
and set it to true for pci device

Return from gk20a_scale_init() if we don't have either of
governor or qos_notifier

Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc()
to receive device pointer instead of platform_device

Export gk20a_railgating_debugfs_init() so that we can call
it from gk20a_debug_init()

Jira DNVGPU-56
Jira DNVGPU-58

Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204207
(cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe)
Reviewed-on: http://git-master/r/1204462
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:51 -07:00
Alex Waterman
9eac0fd849 gpu: nvgpu: Add debugging to the semaphore code
Add GPU debugging to the semaphore code.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I98466570cf8d234b49a7f85d88c834648ddaaaee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198594
(cherry picked from commit 420809cc31fcdddde32b8e59721676c67b45f592)
Reviewed-on: http://git-master/r/1153671
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-08-30 10:04:30 -07:00
Alex Waterman
0e69c6707b gpu: nvgpu: Add gpu_dbg_map_v message type
Add a new debug message type: gpu_dbg_map_v. This is used for mapping
messages that are not specifically memory map operations.

Also cleanup the memory mapping debugging a bit since there was one
duplicate print and the memory map print was difficult to parse
visually. As a result the message has been modified to put the most
important information first in an easily readable format.

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ib19c9371ee958009ab5a2d89b9610e699d070ee2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198593
(cherry picked from commit 51dba53b06ca171cdb13d1707f2d026b0ce29f07)
Reviewed-on: http://git-master/r/1147670
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-08-30 10:04:23 -07:00
Thomas Fleury
fba43012c0 gpu: nvgpu: do not flush FECS record on engine reset
Flushing timestamp record method can fail in case FECS is not
processing the main method queue. In particular, this occurs
in case of ctxsw timeout, where we process fifo sched interrupts
from the host, but FECS is still waiting for idle (grWFI).
In such a scenario, this adds a huge delay to the fifo recovery
procedure (timeout on FECS method). Since flushing the last
(incomplete) record from FECS would only be useful in that case
(context switch ongoing), remove flush operation on engine
reset. Note that an explicit ENGINE_RESET event (with pid)
is inserted in user-facing ctxsw buffer on engine reset.

Bug 200228310

Change-Id: I885525f8f197f81266b50db161bb511867fc74f4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1207305
(cherry picked from commit 44391b6204fd648949295f90481b0c424d9a5ddf)
Reviewed-on: http://git-master/r/1208414
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-29 16:14:40 -07:00
Richard Zhao
198b895a88 gpu: nvgpu: use force_reset_ch in ch wdt handler
- let force_reset_ch pass down err code
- force_reset_ch callback can cover vgpu too.

Bug 1776876
JIRA VFND-2151

Change-Id: I48f7890294c6455247198e0cab5f21f83f61f0e1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1202255
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-18 15:03:54 -07:00
Deepak Nibade
59a115f3fe gpu: nvgpu: post bpt events after processing
We currently post bpt events (bpt.int and bpt.pause) even
before we process and clear the interrupts and this
could cause races with UMD

Fix this by posting bpt events only after we are done
processing the interrupts

Bug 200209410

Change-Id: Ic3ff7148189fccb796cb6175d6d22ac25a4097fb
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1184109
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-10 11:17:58 -07:00
Peter Daifuku
38a59acc77 gpu: nvgpu: move dbg_session_ops to gops
Move dbg_session_ops to gops for better code consistency

JIRA VFND-1905

Change-Id: I04a11d77dd8c26d9922e80e556822f80dd2bc36d
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1192641
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-07-30 11:29:20 -07:00
Lakshmanan M
92415fd366 gpu: nvgpu: Add preemption mode support for gp10x
Added preemption mode (WFI, GFXP, CTA and CILP) support for gp10x
family gr class (PASCAL_B and PASCAL_COMPUTE_B).

Bug 200221149

Change-Id: I859a4d2db518bca0ffeb0d85a6bb271f6b15db87
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1193207
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-07-28 22:25:58 -07:00
neha
f3d89a2997 gpu: nvgpu: Full chip support for ctxsw
nvgpu changes needed to handle the newly added ctxsw lists
Fix regops support for ppc registers

Squashed from:
Change-Id: I08e6dec3bb2f7aa51de912c9d1c84a350ce07f72
Signed-off-by: neha <njoshi@nvidia.com>
Reviewed-on: http://git-master/r/1151010
(cherry picked from commit fd03ad9f09e66f78db88fb7ece448e26e0515821)

and:
Change-Id: I75a7f810ee0b613c22ac2cef2d936563d8067f97
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1158888
(cherry picked from commit f00a7fcc57fb937b800e46760087ff6f7637520c)

Bug 200180000
Bug 1771830

Reviewed-on: http://git-master/r/1164397
(cherry picked from commit 7028f051e4f37edeff90a9923f022cec6c645a8f)
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Change-Id: I796ddf93ef37170843a4a6b44190cd6780d25852
Reviewed-on: http://git-master/r/1183588
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-07-22 15:10:22 -07:00
Lakshmanan M
89aecd1202 gpu: nvgpu: Add nvgpu infra to allow kernel to create privileged CE channels
Added interface to allow kernel to create privileged CE channels for
page migration and clearing support between sysmem and vidmem.

JIRA DNVGPU-53

Change-Id: I3e18d18403809c9e64fa45d40b6c4e3844992506
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173085
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-07-20 03:09:28 -07:00
Thomas Fleury
c8ffe0fdec gpu: nvgpu: add sched control API
Added a dedicated device node to allow an
app manager to control TSG scheduling parameters:
- Get list of TSGs
- Get list of recent TSGs
- Get list of TSGs per pid
- Get TSG current scheduling parameters
- Set TSG timeslice
- Set TSG runlist interleave

Jira VFND-1586

Change-Id: I014c9d1534bce0eaea6c25ad114cf0cff317af79
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1160384
(cherry picked from commit 75ca739517cc7f7f76714b5f6a1a57c39b8cb38e)
Reviewed-on: http://git-master/r/1167021
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-07-18 23:12:51 -07:00
Deepak Goyal
e875b4a66c gpu: nvgpu: Debugfs support for Railgating stats.
This patch calculates:
-Total time spent by GPU with rails gated.
-Total time spent by GPU with rails ungated.
-Total Railgating Cycles.
and dumps this information in debugfs file.
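
The three counters above can be kept with very little state. The following is a minimal sketch of that bookkeeping, not the driver's code: the struct, the function names and the millisecond timestamps are all assumptions made for illustration, and a "cycle" is assumed to mean one ungated-to-gated transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; none of these names come from the driver. */
struct railgate_stats {
	bool     is_gated;          /* current rail state */
	uint64_t last_change_ms;    /* time of the last recorded state change */
	uint64_t total_gated_ms;    /* accumulated time with rails gated */
	uint64_t total_ungated_ms;  /* accumulated time with rails ungated */
	uint64_t railgating_cycles; /* ungated -> gated transitions (assumed) */
};

void railgate_stats_init(struct railgate_stats *s, bool gated, uint64_t now_ms)
{
	s->is_gated = gated;
	s->last_change_ms = now_ms;
	s->total_gated_ms = 0;
	s->total_ungated_ms = 0;
	s->railgating_cycles = 0;
}

/* Record the rail state observed at now_ms (a monotonic timestamp). */
void railgate_stats_update(struct railgate_stats *s, bool gated, uint64_t now_ms)
{
	uint64_t elapsed;

	assert(now_ms >= s->last_change_ms);
	elapsed = now_ms - s->last_change_ms;

	/* Charge the elapsed time to the state we are leaving. */
	if (s->is_gated)
		s->total_gated_ms += elapsed;
	else
		s->total_ungated_ms += elapsed;

	/* Count a cycle only when the rails actually become gated. */
	if (gated && !s->is_gated)
		s->railgating_cycles++;

	s->is_gated = gated;
	s->last_change_ms = now_ms;
}
```

A debugfs read handler would then only need to print the three totals.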

This feature requires CONFIG_DEBUG_FS set to true.

Bug 200195100

Change-Id: I1379f11237ce4900076947e18524caaa3304c7cb
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: http://git-master/r/1178308
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-07-18 04:02:06 -07:00
Deepak Nibade
e27c72446b gpu: nvgpu: simplify power management
We currently initialize both runtime PM and pm_domains frameworks
and use pm_domain to control runtime power management of NvGPU

But since GPU has a separate rail, using pm_domain is not
strictly required
Hence remove pm_domain support and use runtime PM only for all
the power management
This also simplifies the code a lot

Initialization in gk20a_pm_init()
- if railgate_delay is set, set autosuspend delay of runtime PM
- try enabling runtime PM
- if runtime PM is now enabled, keep GPU railgated
- if runtime PM is not enabled, keep GPU unrailgated
- if can_railgate = false, disable runtime PM and keep
  GPU unrailgated
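
The initialization rules listed above can be read as a small decision function. The sketch below models only the decision, under stated assumptions: the input and result structs and the name pm_init_decide() are hypothetical, and "runtime_pm_available" stands in for whether enabling runtime PM succeeds. It makes no real kernel calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rail_state { RAIL_GATED, RAIL_UNGATED };

/* Hypothetical inputs to the decision; not the driver's structures. */
struct pm_init_input {
	bool can_railgate;
	int  railgate_delay_ms;      /* 0 means "not set" (assumption) */
	bool runtime_pm_available;   /* stand-in for "enabling succeeded" */
};

struct pm_init_result {
	bool runtime_pm_enabled;
	bool autosuspend_delay_set;
	int  autosuspend_delay_ms;
	enum rail_state rail;
};

struct pm_init_result pm_init_decide(const struct pm_init_input *in)
{
	struct pm_init_result r = { false, false, 0, RAIL_UNGATED };

	assert(in != NULL);

	/* If railgate_delay is set, use it as the autosuspend delay. */
	if (in->railgate_delay_ms > 0) {
		r.autosuspend_delay_set = true;
		r.autosuspend_delay_ms = in->railgate_delay_ms;
	}

	/* Try enabling runtime PM; gated if enabled, ungated otherwise. */
	r.runtime_pm_enabled = in->runtime_pm_available;
	r.rail = r.runtime_pm_enabled ? RAIL_GATED : RAIL_UNGATED;

	/* can_railgate = false overrides: no runtime PM, rails stay up. */
	if (!in->can_railgate) {
		r.runtime_pm_enabled = false;
		r.rail = RAIL_UNGATED;
	}
	return r;
}
```

Keeping the override as the last step mirrors the order of the list, so a platform that cannot railgate always ends up ungated regardless of the earlier steps.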

Set gk20a_pm_ops with below callbacks for runtime PM
static const struct dev_pm_ops gk20a_pm_ops = {
	.runtime_resume = gk20a_pm_runtime_resume,
	.runtime_suspend = gk20a_pm_runtime_suspend,
	.resume = gk20a_pm_resume,
	.suspend = gk20a_pm_suspend,
};

Move gk20a_busy() to use runtime checks of pm_runtime_enabled()
instead of using compile time checks on CONFIG_PM

Clean up some pm_domain related code

Remove use of gk20a_pm_enable/disable_clk() since this
should be already done in platform specific unrailgate()/
railgate() APIs

Fix "railgate_delay" and "railgate_enable" sysfs to use
runtime PM calls

For VGPU, disable runtime PM during vgpu_pm_init()
With this, we will initialize vgpu with vgpu_pm_finalize_poweron()
upon first call to gk20a_busy()

Jira DNVGPU-57

Change-Id: I6013e33ae9bd28f35c25271af1239942a4fa0919
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1163216
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-08 00:58:53 -07:00
Konsta Holtta
b8915ab5aa gpu: nvgpu: support in-kernel vidmem mappings
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped to
gpu.

JIRA DNVGPU-18
JIRA DNVGPU-76

Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169308
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-06 03:34:23 -07:00
Konsta Holtta
e12c5c8594 gpu: nvgpu: initial support for vidmem apertures
add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that have
a mem_desc available.

Jira DNVGPU-76

Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-04 23:10:59 -07:00
Alex Waterman
dfd5ec53fc gpu: nvgpu: Revamp semaphore support
Revamp the support the nvgpu driver has for semaphores.

The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degradation in performance.

To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.

To implement this fast path the semaphore memory had to be shared
between channels. Previously since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd() what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a semaphore may be released on one address
space but acquired in another.

Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so-called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e., released) by a
malicious channel. Each channel then gets a RW mapping of its own
semaphore. This way a channel may only acquire other channel's
semaphores but may both acquire and release its own semaphore.

The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
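
The two ideas described above, choosing the fast path by inspecting the fence and restricting releases to a channel's own semaphore, can be sketched as follows. This is an illustrative model only: the types, enum values and function names are hypothetical and do not reflect the driver's fence or semaphore structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence description; the real fence code is not shown here. */
enum fence_backing { FENCE_GPU_SEMAPHORE, FENCE_SYNCPT };
enum wait_path { WAIT_FAST_GPU_ACQUIRE, WAIT_SLOW_SW_WAIT };

struct fence_sketch {
	enum fence_backing backing;
	int owner_channel;
};

/* Fast path only when the fence is backed by a GPU semaphore. */
enum wait_path choose_wait_path(const struct fence_sketch *f)
{
	assert(f != NULL);
	return f->backing == FENCE_GPU_SEMAPHORE ? WAIT_FAST_GPU_ACQUIRE
						 : WAIT_SLOW_SW_WAIT;
}

struct sema_access {
	bool can_acquire;
	bool can_release;
};

/*
 * Model of the mapping scheme: the global mapping of the semaphore sea is
 * read-only, so any channel may acquire any semaphore, but only the owning
 * channel has a RW mapping and may release.
 */
struct sema_access sema_access_for(int channel, int sema_owner)
{
	struct sema_access a;

	a.can_acquire = true;
	a.can_release = (channel == sema_owner);
	return a;
}
```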

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-28 15:49:11 -07:00
Thomas Fleury
d150daf75e gpu: nvgpu: add init_preemption_state gr method
This method is called when setting up gr
hardware. It is meant to adjust preemption
parameters.

Bug 1593548
Jira VFND-1894

Change-Id: I0f5aa3212bec3058a0493366bed6fe2a365c9542
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162625
(cherry picked from commit c2e6d12570af28b3aae087401d7f670df40d40bd)
Reviewed-on: http://git-master/r/1166987
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-28 10:00:07 -07:00
Deepak Nibade
61d4e27607 gpu: nvgpu: add QoS notifier for common clk framework
Define specific QoS notifier for common clk framework
and protect it with CONFIG_COMMON_CLK

This new API will first get min/max requirements from
pm_qos and set min/max freq values in devfreq

A call to update_devfreq() will then ensure that
new estimated frequency is clipped appropriately
between min and max values
This also ensures that frequency is set along with
all the book-keeping
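
The clipping step described above amounts to a clamp of the estimated frequency into the [min, max] window taken from pm_qos. A minimal sketch follows; clip_freq() is a hypothetical helper, not a devfreq function, and it assumes the caller guarantees min <= max.

```c
#include <assert.h>

/* Clamp an estimated frequency into [min_freq, max_freq]. */
unsigned long clip_freq(unsigned long estimated,
			unsigned long min_freq, unsigned long max_freq)
{
	assert(min_freq <= max_freq);

	if (estimated < min_freq)
		return min_freq;
	if (estimated > max_freq)
		return max_freq;
	return estimated;
}
```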

Add below platform specific notifier callback and use it
with pm_qos_add_notifier()
int (*qos_notify)()
Register the callback only if qos_notify is set

We currently support only one qos_id which is treated
as notifier for min frequency
Remove dependency on qos_id, and use appropriate QoS
APIs like pm_qos_read_min/max_bound()

Store devfreq's min/max frequency in struct gk20a
for reference

Bug 1772462

Change-Id: I63d6d17451d19c9d376b67df7db775b38929287d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1161161
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-27 09:14:04 -07:00
Terje Bergstrom
475af509e1 gpu: nvgpu: vgpu: Add CE engine to engine list
Add CE engine to vgpu engine list. CE engine is defined differently
for different GPUs, so we also add HAL for initializing the engine
info.

Bug 1780185

Change-Id: I5ae265551feac08d0c4d45402dd3277514e62b2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1169720
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
2016-06-24 09:10:39 -07:00
Mahantesh Kumbar
10b75f9cdd gpu: nvgpu: update get_netlist_name ops declaration
-update get_netlist_name ops declaration to support
to load GPU FW based on GPU-ARCH
-"GAxxx" string used to get size for "gm204/" or
 "gm206/" which will be added to the NETIMAGE path, like
 "gm204/NETC_img.bin"

Change-Id: I5bfa13df014533a885c4328d3c767e51c29f9255
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1166783
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-21 15:20:49 -07:00
Richard Zhao
86225cb04e gpu: nvgpu: add read_ptimer to gops
Move all places that read ptimer to use the callback.
This allows adding a vgpu implementation of read ptimer.

Bug 1395833

Change-Id: Ia339f2f08d75ca4969a443fffc9a61cff1d3d2b7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1159587
(cherry picked from commit a01f804684f875c9cffc31eb2c1038f2f29ec66f)
Reviewed-on: http://git-master/r/1158449
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-16 14:06:46 -07:00
Terje Bergstrom
1409d216e5 gpu: nvgpu: Fix gk20a_busy() in debug dump
When debug dump is called from an interrupt thread, we do not want
to call gk20a_busy() because it causes race in case rail gating is
being engaged at the same time. It has to be called from all debugfs
paths.

Bug 200198908
Bug 1770522

Change-Id: I7eda7d029b0a59cce0320ecc1b750dc2f4d7ccf0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1163440
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-06-14 04:50:56 -07:00
Terje Bergstrom
3daeac112b Revert "gpu: nvgpu: take power refcount in ISR"
This reverts commit 2219f38727. That commit leaves the
GPU powered on in some tests that require powering down the GPU.

Change-Id: I79d44fed729e98692021c57bbeff6a0ef2e8c983
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1161846
2016-06-09 11:20:28 -07:00
Konsta Holtta
d215bc1107 gpu: nvgpu: detect vidmem configuration from HW
Read video memory size from hardware during initialization for devices
that support it.

JIRA DNVGPU-14

Change-Id: If190f2d89f7148520ee274ca674f972987c8056d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157215
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-08 12:05:05 -07:00
Deepak Nibade
2219f38727 gpu: nvgpu: take power refcount in ISR
We sometimes see race conditions where power refcount
is zero during ISR or bottom half.
If bottom half calls gk20a_busy(), it will lead to
boot up of GPU, but it is also possible that we are
already trying to poweroff GPU since power refcount
is zero

Fix this by taking a power refcount with gk20a_busy_noresume()
in ISR and then dropping this refcount at the end of
bottom half
Add new API gk20a_idle_nosuspend() to drop a refcount
without initiating suspend
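
The refcount pairing described above can be modelled in a few lines. This is a sketch under assumptions: struct power_ref and the three functions are stand-ins for illustration, not the gk20a_busy_noresume() and gk20a_idle_nosuspend() implementations, and power-off is modelled as a simple check of the count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical power state: a usage count plus an on/off flag. */
struct power_ref {
	int  usage_count;
	bool powered;
};

/* ISR side: take a reference without powering the device on. */
void busy_noresume(struct power_ref *p)
{
	assert(p != NULL);
	p->usage_count++;
}

/* End of bottom half: drop the reference without initiating suspend. */
void idle_nosuspend(struct power_ref *p)
{
	assert(p != NULL && p->usage_count > 0);
	p->usage_count--;
}

/* A power-off attempt proceeds only when no references are held. */
bool try_poweroff(struct power_ref *p)
{
	assert(p != NULL);
	if (p->usage_count > 0)
		return false;
	p->powered = false;
	return true;
}
```

While the ISR holds its reference, a concurrent power-off attempt is refused, which is the race the commit message describes.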

Bug 200198908
Bug 1770522

Change-Id: Iec3d4dc8d468f49b71919d2bbc327da48b97bcab
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1160035
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-08 11:22:59 -07:00
Lakshmanan M
6299b00beb gpu: nvgpu: Add multiple engine and runlist support
This CL covers the following modifications:
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
   for gm206 GPU family
5) Added generic mechanism to identify the
   CE engine pri_base address for gm206
   (CE0, CE1 and CE2)
6) Removed hard-coded engine_id logic and
   made it generic
7) Code cleanup for readability

JIRA DNVGPU-26

Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-07 12:31:34 -07:00
Mahantesh Kumbar
f99de40936 gpu: nvgpu: WPR & PMU interface update
Update WPR interface & PMU interface
to support latest ACR/PMU ucode versions

Change-Id: I4d1bd7a5c43751e96c1db58832cd316006d56954
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1158070
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-04 15:21:35 -07:00
Mahantesh Kumbar
9d13ddc17d gpu: nvgpu: update HAL of ACR BL
-update HAL of ACR BL which can support
gm204/gm206 and DMATRFBASE method to global

JIRA DNVGPU-10

Change-Id: I56fc7ce040dadb6473f6f375ee6ce90783a046ad
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1154954
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-01 11:24:12 -07:00
Richard Zhao
7a134457a8 gpu: nvgpu: vgpu: add tsg set timeslice support
Bug 1702773
JIRA VFND-1496

Change-Id: Ice570df78d974fa59f2a932caf0e6249b13493a1
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144929
(cherry picked from commit 8b6ec996f3773e497a040a8fe4148e01e8dc35fa)
Reviewed-on: http://git-master/r/1150705
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-31 10:47:47 -07:00
Richard Zhao
d707c5a444 gpu: nvgpu: add tsg support for vgpu
- make tsg_gk20a.c call HAL for enable/disable channels
- add preempt_tsg HAL callbacks
- add tsg bind/unbind channel HAL callbacks
- add corresponding tsg callbacks for vgpu

Bug 1702773
JIRA VFND-1003

Change-Id: I2cba74b3ebd3920ef09219a168e6433d9574dbe8
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144932
(cherry picked from commit c3787de7d38651d46969348f5acae2ba86b31ec7)
Reviewed-on: http://git-master/r/1126942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-31 10:47:22 -07:00
Lakshmanan M
f3cb140a71 gpu: nvgpu: Add device_info_data support
Added device_info_data parsing
support for maxwell GPU series.

JIRA DNVGPU-26

Change-Id: I06dbec6056d4c26501e607c2c3d67ef468d206f4
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1151602
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-27 11:36:52 -07:00
Mahantesh Kumbar
147330c2da gpu: nvgpu: move & rename acr_gm20b to acr_desc
acr_gm20b renamed to acr_desc to support
multiple gpu chips

JIRA DNVGPU-10

Change-Id: Ib3b38d5845043f026ddc365a682b7bb454463326
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1152401
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-26 16:06:30 -07:00
Mahantesh Kumbar
e9d5e7dfca gpu: nvgpu: secure boot HAL update
Updated/added secure boot HAL with methods
required to support multiple GPU chips.

JIRA DNVGPU-10

Change-Id: I343b289f2236fd6a6b0ecf9115367ce19990e7d5
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1151784
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-26 16:04:25 -07:00
Terje Bergstrom
dc08f78c57 gpu: nvgpu: Move PCI devnodes to own directory
To be able to scan, PCI devnodes need to be in a directory with read
permission. By default /dev is read protected by SELinux policy. Move
the devnodes to their own directory so that reading this one
directory can be allowed.

At the same time rename the nodes to start with string "card-".

JIRA DNVGPU-54

Change-Id: I0df4ced08afd1f3a468e983d07395ffcb8050365
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1152745
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-05-25 11:58:54 -07:00
Terje Bergstrom
fb64e1f1b9 gpu: nvgpu: Add support for gm204 and gm206
Add support for chips gm204 and gm206. Adds also support for reading
VBIOS and booting devinit and pre-os images on PMU.

Change-Id: I4824b44245611e5379ace62793cc37158048f432
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120467
GVS: Gerrit_Virtual_Submit
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-23 14:15:25 -07:00
Thomas Fleury
64f2e3ee9b gpu: nvgpu: update trace for sched params
Use an inline function instead of a macro to
"expand" all channel parameters.

Jira EVLR-244
Jira EVLR-318

Change-Id: I4e8c5ee6bc9da36564af171be809f50dd2dfd439
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1150050
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-21 11:34:18 -07:00
Terje Bergstrom
72ae2dedf5 gpu: nvgpu: Add HAL op for PMU reset
Sequence to reset PMU is different for iGPU and dGPU. Specialize
and implement iGPU version.

Change-Id: I5b9ff2c018a736bc9e27b90d0942c52706b12a12
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1150540
2016-05-20 13:58:00 -07:00
Peter Daifuku
ce0fe5082e gpu: nvgpu: hwpm broadcast register support
Add support for hwpm broadcast registers (ltc and lts)

In gr_gk20a_find_priv_offset_in_buffer, replace "Unknown address type" error
with informational message: gr_gk20a_exec_ctx_ops calls
gk20a_get_ctx_buffer_offsets and if that fails,
calls gr_gk20a_get_pm_ctx_buffer_offsets; HWPM registers will fail the first
call, so an error or warning is overkill.

Bug 1648200

Change-Id: I197b82579e9894652add4ff254418f818981415a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1131365
(cherry picked from commit 9f30a92c5d87f6dadd34cc37396a6b10e3a72751)
Reviewed-on: http://git-master/r/1133628
(cherry picked from commit 7eb7cfd998852ba7f7c4c40d3db286f66e83ab3a)
Reviewed-on: http://git-master/r/1127749
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-19 15:58:24 -07:00
Terje Bergstrom
67a41e46a2 gpu: nvgpu: Read all fields of device_info
We were not using the engine_type field in device info, and the code
did not handle chained entries properly. The code assumed that first
entry is for graphics and second for CE, which is not always true.

Improve the code to go through all entries of device_info, and
preserve values across entries until we reach the last entry.
Only last entry triggers a write to fifo engine info.

There can also be multiple engines with same type, so accumulate
interrupts and reset ids from all of them.
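
The parsing strategy described above (carry values across chained entries, commit only on the last entry, and accumulate across engines of the same type) can be sketched as below. The entry layout is entirely hypothetical: the real device_info register format is not given here, so struct dev_entry, its fields and parse_device_info() are assumptions made only to show the control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded entry; not the hardware register layout. */
enum entry_kind { ENTRY_ENUM, ENTRY_ENGINE_TYPE };

struct dev_entry {
	enum entry_kind kind;
	bool     chained;      /* true if another entry for this device follows */
	uint32_t engine_type;  /* valid for ENTRY_ENGINE_TYPE */
	uint32_t intr_id;      /* valid for ENTRY_ENUM */
	uint32_t reset_id;     /* valid for ENTRY_ENUM */
};

struct engine_info {
	bool     present;
	uint32_t intr_mask;
	uint32_t reset_mask;
};

void parse_device_info(const struct dev_entry *entries, size_t n,
		       struct engine_info *engines, size_t num_types)
{
	uint32_t engine_type = 0, intr_mask = 0, reset_mask = 0;
	bool have_type = false;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct dev_entry *e = &entries[i];

		/* Preserve values across the entries of one chain. */
		if (e->kind == ENTRY_ENUM) {
			assert(e->intr_id < 32 && e->reset_id < 32);
			intr_mask |= UINT32_C(1) << e->intr_id;
			reset_mask |= UINT32_C(1) << e->reset_id;
		} else {
			engine_type = e->engine_type;
			have_type = true;
		}

		if (e->chained)
			continue;

		/* Only the last entry of a chain writes engine info; OR-ing
		 * accumulates over several engines of the same type. */
		if (have_type && engine_type < num_types) {
			engines[engine_type].present = true;
			engines[engine_type].intr_mask |= intr_mask;
			engines[engine_type].reset_mask |= reset_mask;
		}
		engine_type = 0;
		intr_mask = 0;
		reset_mask = 0;
		have_type = false;
	}
}
```

Note that the sketch does not depend on the order of entries within a chain, which avoids the earlier assumption that the first entry is always graphics and the second always CE.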

As the code got fixed, now it reads the engine enum correctly from
hardware. We used to compare that against CE0, but we should compare
against CE2.

gk20a_fifo_reset_engine() uses wrong constants - it is passed an
internal numbering of engines, but it compares them against hardware
engine enum.

Change-Id: Ia59273921c602d2a090f7a5b1404afb0fca2532c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1147746
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-05-18 08:15:11 -07:00
Terje Bergstrom
211edaefb7 gpu: nvgpu: Fix CWD floorsweep programming
Program CWD TPC and SM registers correctly. The old code did not work
when there are more than 4 TPCs.

Refactor init_fs_mask to reduce code duplication.

Change-Id: Id93c1f8df24f1b7ee60314c3204e288b91951a88
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1143697
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-05-16 10:57:48 -07:00
Terje Bergstrom
773b3f2034 gpu: nvgpu: Do not program max ways evict
Setting max_ways_evict reserves some of L2 for CB. In gk20a CB is in
dedicated RAM, so we don't need to reserve space for it.

The code gets invoked only on gk20a.

Change-Id: Ib8efec8c5e90c135bd0c10bb1eaa3f797ec68698
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144993
2016-05-13 16:07:00 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.
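
As a rough illustration of the accessor family described above, the sketch below implements word-indexed, byte-indexed, memcpy()-like and fill accessors behind a begin/end pair for a sysmem-only buffer. All names and signatures here are hypothetical (they deliberately do not reuse the gk20a_mem_* names, whose real signatures are not shown in this log), and the vidmem case is left as an unsupported stub.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

/* Hypothetical buffer descriptor; not the driver's mem_desc. */
struct mem_sketch {
	enum aperture aperture;
	void  *cpu_va;   /* valid only between mem_begin() and mem_end() */
	size_t size;     /* buffer size in bytes */
	void  *backing;  /* stand-in for the pages a real mapping would use */
};

/* Stands in for the vmap()/vunmap() pair; vidmem is not handled here. */
int mem_begin(struct mem_sketch *m)
{
	if (m->aperture != APERTURE_SYSMEM)
		return -1;
	m->cpu_va = m->backing;
	return 0;
}

void mem_end(struct mem_sketch *m)
{
	m->cpu_va = NULL;
}

/* memcpy()-like accessors with byte offsets. */
void mem_rd_n(struct mem_sketch *m, size_t offset, void *dst, size_t n)
{
	assert(m->cpu_va != NULL && offset <= m->size && n <= m->size - offset);
	memcpy(dst, (uint8_t *)m->cpu_va + offset, n);
}

void mem_wr_n(struct mem_sketch *m, size_t offset, const void *src, size_t n)
{
	assert(m->cpu_va != NULL && offset <= m->size && n <= m->size - offset);
	memcpy((uint8_t *)m->cpu_va + offset, src, n);
}

/* Word-indexed 32-bit accessors. */
uint32_t mem_rd32(struct mem_sketch *m, size_t word)
{
	uint32_t v;

	mem_rd_n(m, word * sizeof(v), &v, sizeof(v));
	return v;
}

void mem_wr32(struct mem_sketch *m, size_t word, uint32_t v)
{
	mem_wr_n(m, word * sizeof(v), &v, sizeof(v));
}

/* Byte-indexed 32-bit read; the offset must be word aligned. */
uint32_t mem_rd(struct mem_sketch *m, size_t offset)
{
	assert(offset % sizeof(uint32_t) == 0);
	return mem_rd32(m, offset / sizeof(uint32_t));
}

/* Fill part of the buffer with a constant byte. */
void mem_fill(struct mem_sketch *m, size_t offset, uint8_t c, size_t n)
{
	assert(m->cpu_va != NULL && offset <= m->size && n <= m->size - offset);
	memset((uint8_t *)m->cpu_va + offset, c, n);
}
```

Because callers only ever see the descriptor, a later change could route the same calls to a different access method based on the aperture without touching them.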

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Richard Zhao
bc72480f8d gpu: nvgpu: add fuse overrides for tpc disabling
- add fuse_override in gops. Implement it starting from gm20b.
- set cwd fs register, so cuda won't use disabled TPCs

Bug 1757262
Bug 200169697

Change-Id: If7bac58bd3a6bcf2925197ea5b7c2d10a77e0933
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
(cherry picked from commit 66cde7724815e9e5e85ab9b07fc985a78530222f)
Reviewed-on: http://git-master/r/1132177
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Tested-by: Adeel Raza <araza@nvidia.com>
2016-05-10 12:29:49 -07:00
Deepak Nibade
771f742703 gpu: nvgpu: add supported preemptions to gpu characteristics
Add below flag fields to gpu characteristics to indicate
supported and default preemption modes on platform for
graphics and compute

__u32 graphics_preemption_mode_flags;
__u32 compute_preemption_mode_flags;
__u32 default_graphics_preempt_mode;
__u32 default_compute_preempt_mode;

Add struct nvgpu_preemption_modes_rec to struct gr_gk20a
to store these values locally

Use platform specific get_preemption_mode_flags() to
get the flags and define gk20a/gm20b specific
get_preemption_mode_flags() API

Bug 1646259

Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133595
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:53 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD

Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h

Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)

API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute

Make necessary changes in code to support two preemption
modes
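
A context that tracks two independent preemption modes might look like the sketch below. The flag values, struct names and the convention that a mode of 0 means "leave unchanged" are all assumptions for illustration; only the mode names (WFI, GFXP, CTA, CILP) and the idea of separate graphics and compute modes come from this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real definitions live in nvgpu.h. */
#define GFX_PREEMPT_WFI   (1u << 0)
#define GFX_PREEMPT_GFXP  (1u << 1)
#define COMP_PREEMPT_WFI  (1u << 0)
#define COMP_PREEMPT_CTA  (1u << 1)
#define COMP_PREEMPT_CILP (1u << 2)

/* Supported modes, as a platform might report them. */
struct preempt_caps {
	uint32_t graphics_flags;
	uint32_t compute_flags;
};

/* Per-context state: one graphics mode and one compute mode. */
struct ctx_preempt {
	uint32_t graphics_mode;
	uint32_t compute_mode;
};

/* A request is valid if it is 0 (no change) or one supported flag. */
static int mode_ok(uint32_t mode, uint32_t supported)
{
	if (mode == 0)
		return 1;
	return (mode & (mode - 1)) == 0 && (mode & supported) != 0;
}

int set_preemption_mode(struct ctx_preempt *ctx, const struct preempt_caps *caps,
			uint32_t graphics_mode, uint32_t compute_mode)
{
	assert(ctx != NULL && caps != NULL);

	/* Validate both requests before changing anything. */
	if (!mode_ok(graphics_mode, caps->graphics_flags) ||
	    !mode_ok(compute_mode, caps->compute_flags))
		return -EINVAL;

	if (graphics_mode)
		ctx->graphics_mode = graphics_mode;
	if (compute_mode)
		ctx->compute_mode = compute_mode;
	return 0;
}
```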

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Haley Teng
4c4d0e6eb2 nvgpu: vgpu: create fifo.force_reset_ch in gpu_ops
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.
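
The function-pointer approach described above can be sketched as follows. The struct layout, the two backend functions and init_fifo_ops() are hypothetical stand-ins chosen for illustration; only the idea of a fifo.force_reset_ch pointer assigned differently for vgpu and non-vgpu comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct gpu;

/* Hypothetical ops table with a single fifo callback. */
struct gpu_ops_sketch {
	struct {
		int (*force_reset_ch)(struct gpu *g, int chid);
	} fifo;
};

enum reset_path { PATH_NONE, PATH_NATIVE, PATH_VGPU };

struct gpu {
	bool is_virtual;
	struct gpu_ops_sketch ops;
	enum reset_path last_path;  /* records which backend ran (demo only) */
};

int native_force_reset_ch(struct gpu *g, int chid)
{
	(void)chid;
	g->last_path = PATH_NATIVE;
	return 0;
}

int vgpu_force_reset_ch(struct gpu *g, int chid)
{
	(void)chid;
	g->last_path = PATH_VGPU;
	return 0;
}

/* Pick the backend once, at init time. */
void init_fifo_ops(struct gpu *g)
{
	g->ops.fifo.force_reset_ch = g->is_virtual ? vgpu_force_reset_ch
						   : native_force_reset_ch;
}

/* Callers go through the pointer and never test is_virtual themselves. */
int force_reset_channel(struct gpu *g, int chid)
{
	assert(g != NULL && g->ops.fifo.force_reset_ch != NULL);
	return g->ops.fifo.force_reset_ch(g, chid);
}
```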

Bug 200184349

Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 09:52:04 -07:00
Thomas Fleury
93678f571c gpu: nvgpu: Add trace and debugfs for sched params
JIRA EVLR-244
JIRA EVLR-318

Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-05 09:25:02 -07:00