Commit Graph

282 Commits

Author SHA1 Message Date
Terje Bergstrom
4cc1457703 gpu: nvgpu: Move clk bypass div code to clk init
Clock bypass divider was changed just before resetting priv ring.
Move the code to a new clk op instead so that it is executed only on
gk20a.

Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/764970
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Aleksandr Frid <afrid@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2015-07-03 07:51:42 -07:00
Sami Kiminki
e7ba93fefb gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.

Bug 1614735
Bug 1623949

Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-30 08:35:23 -07:00
Sumit Singh
1f99458ed3 gpu: nvgpu: Uncomment suspend/resume ops
As upstream has removed them, but we are still using these.
So uncommenting these callback assignment.

Bug 200070810

Change-Id: I26a221f9d76f6acef70095eb8afcf440057f464c
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2015-06-29 11:42:30 +05:30
Sumit Singh
c65d41e768 gpu: nvgpu: Add DT support for gpu power-domain for T132
Make modification to add DT support for gpu
power-domain for T132 chip.

Bug 200070810

Change-Id: Iac63c8fb5fc5280e9a9f5758e63c9da009f3813d
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/739698
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-06-29 11:42:28 +05:30
Sumit Singh
c61ab8facb Revert "HACK: Disable genpd_pm_subdomain_attach"
This reverts commit 83699a4ec9ebf55f6cc12c76e57dad1d4ec2fbfa.

This hack was put in place as upstream has removed of_node
field from generic_pm_domain structure. But as we are still
using it, so removing this hack.

Bug 200100078

Change-Id: I14e533786fb814e361c580e2883ceff1f63d251f
2015-06-29 11:38:48 +05:30
Konsta Holtta
6085c90f49 gpu: nvgpu: add per-channel refcounting
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.

Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail, if
the channel is either not opened or in a process of being closed. Also,
add safeguards for protecting accidental use of closed channels,
specifically, by setting ch->g = NULL in channel free. This will make it
obvious if freed channel is attempted to be used.

The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after referencing
to the channel has been denied.

Finally, fix some races in channel and TSG force reset IOCTL path,
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg(), while the affected engines have been identified,
the appropriate MMU faults triggered, and the MMU faults handled. In this
case, make sure that the MMU fault does not attempt to query the hardware
about the failing channel or TSG ids. This should make channel recovery
more safe also in the regular (i.e., not in the interrupt handler) context.

Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810

Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
2015-06-09 11:13:43 -07:00
Leonid Moiseichuk
837ceffcab gpu: nvgpu: cyclestats mode E snapshots support
That is a kernel supporting code for cyclestats mode E.
Cyclestats mode E implemented following Windows-design in user-space
and required the following operations to be implemented:
- attach a client for shared hardware buffer of device
- detach client from shared hardware buffer
- flush means copy of available data from hardware buffer to private
  client buffers according to perfmon IDs assigned for clients
- perfmon IDs management for user-space clients
- a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added

Bug 1573150

Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/740653
(cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f)
Reviewed-on: http://git-master/r/753274
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-06 07:23:24 -07:00
Bharat Nihalani
b8aa486109 Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.

Bug 200112195

Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-04 10:41:00 -07:00
Bharat Nihalani
1d8fdf5695 Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.

Bug 200106514

Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
2015-06-02 20:18:55 -07:00
Terje Bergstrom
e19d349858 gpu: nvgpu: Support >32bit addresses in simulation
Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/747935
(cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc)
Reviewed-on: http://git-master/r/749235
2015-06-01 08:16:28 -07:00
Alex Waterman
01f359f3f1 Revert "Revert "gpu: nvgpu: New allocator for VA space""
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.

The original commit was actually fine.

Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-19 13:09:00 -07:00
Deepak Nibade
7ddd6e261e gpu: nvgpu: wait for running jobs to finish before shutdown
In gk20a_pm_shutdown(), we currently call __pm_runtime_disable()
which prevents h/w access to new requests made after shutdown() call
Also, once gk20a_pm_shutdown() completes, platform code will
just rail gate the GPU

But it is possible that some other thread is already accessing
h/w while shutdown() was triggered and this can result in hang

Hence, wait until all currently executing jobs are finished
before returning from gk20a_pm_shutdown()

Also, we need to wait for GPU's usage count to become 1 since
platform code will increase the usage count and then call
shutdown(). Hence usage count of 1 indicates that GPU is idle

Bug 200099940

Change-Id: I1f2457829e2737c07302d13f355353a30c3b4e67
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/734920
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-05-18 11:33:48 +05:30
Deepak Nibade
e7ea596363 Revert "gpu: nvgpu: Fix gk20a shutdown issue"
This reverts commit c18edd3686115ca0b7d8bb08b35f23264f865358.

Proper fixes for shutdown issue are being added with below changes
http://git-master/r/#/c/738509/
http://git-master/r/#/c/734920/

Hence revert this workaround

Bug 200099940

Change-Id: I74b29c804af2bdb9d95c6b93c5308a323575ae57
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/739082
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:39 +05:30
Deepak Nibade
d9854e4782 gpu: nvgpu: jump to fail path if pm_runtime_get_sync() fails
Currently we execute pm_runtime_get_sync() and then
gk20a_scale_notify_busy() without checking return value of
pm_runtime_get_sync()

In case of shutdown of GPU is already initiate, we get
a hard hang due to this as per below sequence :
- one thread invokes GPU shutdown and then forcibly rail
  gates the GPU
- another thread (unaware of shutdown) calls gk20a_busy()
- since runtime PM is disabled in shutdown path,
  pm_runtime_get_sync() fails
- but we still go on running gk20a_scale_notify_busy() which
  tries to access some GPU registers and hangs

Fix this by jumping to failure path in case
pm_runtime_get_sync() fails

Bug 200099940

Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/738509
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:38 +05:30
Sami Kiminki
520ff00e87 gpu: nvgpu: Implement compbits mapping
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.

Compbits mapping is conservative and it may map more than what is
strictly needed. This is because two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.

Bug 200077571

Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:19 +05:30
Terje Bergstrom
d20afe7bd4 gpu: nvgpu: Dynamic betacb size
Allow querying and setting default betacb size via debugfs. For global buffers
the value takes effect upon first boot of GPU, and has no effect after that.

Bug 1628352

Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733328
2015-05-18 11:32:40 +05:30
Sumit Singh
96ffe0c64d gpu: nvgpu: Fix gk20a shutdown issue
With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot
was getting hung while shutting-down gk20a. It was
happening because genpd_dev_pm_detach() was railgating
gk20a while other thread was still accessing it.

So, assigning NULL to dev->pm_domain->detach for gk20a,
so that genpd_dev_pm_detach() is not called during gk20a
shutdown, which will not railgate it.

This patch will be reverted once we have clean shutdown
for gk20a.

Bug 200070810
Bug 200099940

Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/735455
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-05-18 11:31:58 +05:30
Deepak Nibade
c5f2d00d04 gpu: nvgpu: fix deadlock on railgate_lock during race condition
We have below race condition during __gk20a_do_idle()
and force_reset case :

- before execution of __gk20a_do_idle(), a process drops the last
  usage count of GPU, which triggers GPU railgate process
- but before GPU is really railgated (there is 500 mS delay),
  some process calls __gk20a_do_idle()
- in __gk20a_do_idle(), we first take railgate_lock
- then we check if GPU is already railgated or not
- since it is not railgated yet (due to 500 mS delay), this
  returns false
- then we call pm_runtime_get_noresume() which just increases the
  usage counter
- in this particular case, this call just increases usage count to
  1 from 0, but whereas GPU is already on its way to railgate
- while we check if GPU usage count drops to one, GPU gets railgated
- now if we have force_reset=true case, we will end up calling
  pm_runtime_get_sync() which will take railgate_lock lock _again_
  and try to unrailgate GPU
- this causes a deadlock on railgate_lock

To fix this, use below sequence :

- take railgate_lock
- check if GPU is already railgated
- release railgate_lock
- call pm_runtime_get_sync() which will keep GPU active even if
  railgating is already triggered
- take railgate_lock again to prevent unrailgate in futher process

Also, add more descriptive comments to explain the flow

Bug 1624537

Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/719625
(cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231)
Reviewed-on: http://git-master/r/719620
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-05-18 11:18:24 +05:30
Alex Van Brunt
900f63393d gpu: nvgpu: don't reset clk that doesn't exist
If the clock is null, calling the reset function will crash the
kernel. So, don't call the reset function.

Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/742361
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-15 15:42:41 -07:00
Terje Bergstrom
aa25a952ea Revert "gpu: nvgpu: New allocator for VA space"
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.

Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2015-05-12 02:46:39 -07:00
Alex Waterman
a2e8523645 gpu: nvgpu: New allocator for VA space
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scaleable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.

The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although, that split is not removed quite yet, the new allocator
enables that to happen.

The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.

Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.

Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:53:25 -07:00
Terje Bergstrom
b3a85df53b gpu: nvgpu: SMMU bypass
Improve GMMU mapping code to cope with discontiguous buffers.

Add debugfs entry that allows bypassing SMMU and disabling big pages.

Bug 1605769

Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:01 -07:00
Dan Willemsen
75b50e8588 HACK: Disable genpd_pm_subdomain_attach
Upstream doesn't keep track of the DT node in the genpd struct anymore.

Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
2015-04-04 19:17:41 -07:00
Seshendra Gadagottu
d788d86e01 tegra: gpu: disable touch boost for gpu
Gpu boosting with input events is causing more gpu power consumption
than required. To avoid this, touch boot for gpu is disabled by not
registering gpu device for cfboost frame work. Current rail gate
entry/exit latencies are fast enough to give smooth user experience.

Bug 200087243

Change-Id: I18673d9c95a44ce9bee87e860b4edb29212dc766
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/721989
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 19:01:57 -07:00
Amit Sharma (SW-TEGRA)
5b28f2fe75 gpu: nvgpu: make the local function static
Fixed the following sparse warnings by making below APIs static:
- gk20a.c: warning: symbol 'gk20a_pm_restore_debug_setting' was not declared.
                    Should it be static?
- gr_gk20a.c: warning: symbol 'gr_gk20a_rop_l2_en_mask' was not declared.
	               Should it be static?
- gr_gm20b.c: warning: symbol 'gr_gm20b_rop_l2_en_mask' was not declared.
		       Should it be static?

Bug 200067946

Change-Id: I334893bb6614171bff835d270716a7dd262c9ba7
Signed-off-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/718756
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-04-04 19:00:32 -07:00
Supriya
81f5ffbfae gpu: nvgpu: Correct irq and elpg disable sequence
Bug 200066741

Change-Id: I873835c8aff0c53ac475090d727754ce1ccca0ee
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/715632
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:59:24 -07:00
sujeet baranwal
2155dfeaba gpu: nvgpu: Gpu characterstics enhancement
New members are added in nvgpu_gpu_characterstics to export more
information required specially from CUDA tools.

Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80
Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/679202
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:58:05 -07:00
sujeet baranwal
895675e1d5 gpu: nvgpu: Removal of regops from CUDA driver
The current CUDA drivers have been using the regops to
directly accessing the GPU registers from user space through
the dbg node. This is a security hole and needs to be avoided.
The patch alternatively implements the similar functionality
in the kernel and provide an ioctl for it.

Bug 200083334

Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/711758
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:58:04 -07:00
Sumit Singh
86637dcef9 gpu: nvgpu: Add DT support for gpu power-domain
First, defining a new structure to support gk20a
power domain. Then making necessary modifications
to add so as to add DT support for gpu power-domain.

bug 200070810

Change-Id: I29e1c24b181e14743d3969103abfd1882d171f07
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/668973
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-04-04 18:57:41 -07:00
Deepak Nibade
45e261ac19 gpu: nvgpu: add flag for CAR reset in do_idle()
Add "force_reset" flag to __gk20a_do_idle()

For real world use cases like VPR resizing, we cannot wait
for railgate_delay (which is 500 mS). Hence use CAR reset
for this use case. (this is done via gk20a_do_idle() API
with force_reset = true)

Some of the test cases make use of sysfs "force_idle" and
they expect GPU to be into really railgated state and
not in CAR reset.
Hence when called from sysfs, set force_reset = false.

When global flag "force_reset_in_do_idle" is set, it will
override local flags and force CAR reset case.
This is desired in cases where railgating is not enabled

Also, set force_reset_in_do_idle = false for GM20B since
railgating has been enabled for GM20B

Bug 1592997

Change-Id: I6c5af2977c7211ef82551a86a7c1eb51b8ccee60
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/711615
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:54 -07:00
Terje Bergstrom
f3a920cb01 gpu: nvgpu: Refactor page mapping code
Pass always the directory structure to mm functions instead of
pointers to members to it. Also split update_gmmu_ptes_locked()
into smaller functions, and turn the hard
coded MMU levels (PDE, PTE) into run-time parameters.

Change-Id: I315ef7aebbea1e61156705361f2e2a63b5fb7bf1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/672485
Reviewed-by: Automatic_Commit_Validation_User
2015-04-04 18:08:16 -07:00
Terje Bergstrom
4aef10c950 gpu: nvgpu: Set compression page per SoC
Compression page size varies depending on architecture. Make it
129kB on gk20a and gm20b.

Also export some common functions from gm20b.

Bug 1592495

Change-Id: Ifb1c5b15d25fa961dab097021080055fc385fecd
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/673790
2015-04-04 18:04:45 -07:00
Dan Willemsen
b53b2973fe gpu: nvgpu: Fix/HACK for v3.18
Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
2015-03-18 20:19:10 -07:00
Dan Willemsen
e6292247ad gpu: remove devm_request_and_ioremap
Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
2015-03-18 12:12:37 -07:00
Deepak Nibade
d4b3b74044 gpu: nvgpu: add error prints for do_idle() failure
Add error prints in gk20a_do_idle() to narrow down
the failure point

Bug 200064302

Change-Id: Iffe1151bdc200a79b88e273b3b01523f8e46d130
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/664446
(cherry picked from commit bf1cd9b5551d27cb5cc468795cd147376f48e482)
Reviewed-on: http://git-master/r/666218
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-03-18 12:12:35 -07:00
Terje Bergstrom
0bc513fc46 gpu: nvgpu: Remove gk20a sparse texture & PTE freeing
Remove support for gk20a sparse textures. We're using implementation
from user space, so gk20a code is never invoked.

Also removes ref_cnt for PTEs, so we never free PTEs when unmapping
pages, but only at VM delete time.

Change-Id: I04d7d43d9bff23ee46fd0570ad189faece35dd14
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/663294
2015-03-18 12:12:32 -07:00
Aingara Paramakuru
5bac50c044 gpu: nvgpu: vgpu: debugger interface fixes
To run CUDA apps, the following minimal changes have been
made:
- power-gating is disabled for vgpu
- regop rd/wr returns -ENOSYS

Tools (debugger/profiler) support is known to not work and
not needed at this time.

Bug 200043227

Change-Id: I923caad78450e72d310fb9290cf2849ed5460ad5
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/592878
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:25 -07:00
Konsta Holtta
e99b59d14a gpu: nvgpu: add gk20a_scale_exit()
When removing the module, remove the device from devfreq and free
resources allocated when scaling is initialized.

Bug 1476801

Change-Id: I7bb0f8112a5bf7e5ce2fc56cf8af7059d910002c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/594444
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:22 -07:00
Konsta Holtta
6049301229 gpu: nvgpu: remove platform device on exit
Add ->remove() for undoing the ->probe() and ->late_probe() in
gk20a_platform devices, and call it when gk20a is removed.

Bug 1476801

Change-Id: Ic9b29c0a7ea4a4cae7b5a0f66774bd799eb28434
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/594443
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:22 -07:00
Konsta Holtta
1deb73b9c6 gpu: nvgpu: clean up module deinitialization
Fix gk20a module removal so that the driver can be correctly removed
when built as a module (CONFIG_GK20A=m). Reinsertion correctly will need
more missing deinit calls, which are not yet implemented.

This change has no effect yet on any builds since the module is built in
by the defconfig.

- don't free threaded irqs that are bound to the device
- destroy cde if it is enabled
- remove all debugfs entries recursively generated directly and by cde,
  pmu, clk etc.
- free secure buffer if it exists
- remove pm_runtime_put, since it doesn't have a pairing get
- pm_runtime defines proper function stubs if it is not enabled, so
  remove ifdefs and query pm_runtime_enabled()
- null and free gk20a only after all deinit has been done

Bug 1476801

Change-Id: I73ad72832bdb501fd7071d6ac68d461ee63a760d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/594442
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:18 -07:00
Terje Bergstrom
c0668f05ea gpu: nvgpu: Retrieve intr & reset id from HW
Query interrupt number and reset id from HW. Use the number
from HW when enabling and detecting interrupts.

Bug 200036089
Bug 1567274

Change-Id: If9cb4db79a19dcb193ba7ad9db7081f4fe1ab433
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/600988
2015-03-18 12:12:14 -07:00
Sami Kiminki
d11fbfe7b1 gpu: nvgpu: GPU characteristics additions
Add the following info into GPU characteristics: available big page
sizes, support indicators for sync fence fds and cycle stats, gpc
mask, SM version, SM SPA version and warp count, and IOCTL interface
levels. Also, add new IOCTL to fetch TPC masks.

Bug 1551769
Bug 1558186

Change-Id: I8a47d882645f29c7bf0c8f74334ebf47240e41de
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/562904
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:07 -07:00
Deepak Nibade
0e89e42318 gpu: nvgpu: force CAR reset in do_idle() for gm20b
In gk20a_do_idle(), we wait for platform->railgate_delay
to allow GPU to go into rail gate

But sometimes we set platform->railgate_delay = INT_MAX
to disable GPU rail gating but allow it to suspend during
low power state
Due to this, force_idle API fails (it waits for INT_MAX)

To fix this, allow forcing CAR reset instead of rail gating
with flag "force_reset_in_do_idle" defined in gk20a_platform
Set this flag for gm20b until we fix the railgate_delay

Bug 1517584

Change-Id: I031aa56f87d4db3727e2c3a3e5eeaf18503dd449
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593704
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:04 -07:00
Terje Bergstrom
afc470e867 gpu: nvgpu: Do not call ELPG if disabled
Do not call PMU ELPG calls if ELPG should be disabled. Also skips
initialization of PMU ucode if PMU is disabled.

Bug 1567274

Change-Id: Ia9cd3b553c358142ee05a1b0e0832f9412f7cf17
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/593335
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
2015-03-18 12:12:02 -07:00
Deepak Nibade
b3f575074b gpu: nvgpu: fix sparse warnings
Fix below sparse warnings :

warning: Using plain integer as NULL pointer
warning: symbol <variable/funcion> was not declared. Should it be static?
warning: Initializer entry defined twice

Also, remove dead functions

Bug 1573254

Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593363
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-03-18 12:12:01 -07:00
Seshendra Gadagottu
797e4dd319 gpu: nvgpu: cde: cancel delayed_work during suspend
During gpu suspend, cancel all pending delayed cde work
to avoid issues of scheduling this delayed work
during suspend/resume when gpu is not ready.

Bug 1574000

Change-Id: I2b6bfa489435a781dc576a077f9af01b1e1628ce
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/593557
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Tested-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-03-18 12:12:01 -07:00
Deepak Nibade
c3661adef8 gpu: nvgpu: fix reset clock in gm20b
To assert reset on GPU, we store "gpu_ref" clock in
platform->clk[0] and use it to assert/deassert reset

But for gm20b, "gpu_ref" is no longer resettable.

To fix this, add two callbacks in gk20a_platform :
.reset_assert and .reset_deassert
Also, add a pointer "clk_reset" to store the clock
which needs to be reset

For gk20a specific implementation, we continue to
reset platform->clk[0]

For gm20b specific implementation, we first request
"gpu_gate" clock, store it and use it to assert reset

Bug 1513685
Bug 1517584

Change-Id: I15a583a4a07eb663b442084be8b8c7d0c7c7a142
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
2015-03-18 12:12:00 -07:00
Terje Bergstrom
ca95cc76bb gpu: nvgpu: Assign T18x an own platform data
Bug 1572701

Change-Id: Id135eb2328765d00349b478d695914f7f8c5edf0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/592095
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-03-18 12:11:59 -07:00
Kenneth Adams
aec94d8093 gpu: nvgpu: T18x support
nvgpu framework and build for T18x

Bug 1567274

Change-Id: I77835302a1110573008869d1106eface512bb9b1
Signed-off-by: Ken Adams <kadams@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:11:57 -07:00
Terje Bergstrom
8371833f42 gpu: nvgpu: Per-chip interrupt processing
Move accesses to MC registers under HAL so that they can be
reimplemented per chip.

Do chip detection and HAL initialization only once.

Bug 1567274

Change-Id: I20bf2f439d267d284bfd536f1a1dfb5d5a2dce4c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/590385
2015-03-18 12:11:56 -07:00