Commit Graph

694 Commits

Author SHA1 Message Date
Deepak Nibade
d9854e4782 gpu: nvgpu: jump to fail path if pm_runtime_get_sync() fails
Currently we execute pm_runtime_get_sync() and then
gk20a_scale_notify_busy() without checking return value of
pm_runtime_get_sync()

In case shutdown of the GPU is already initiated, we get
a hard hang due to this, as per the sequence below:
- one thread invokes GPU shutdown and then forcibly rail
  gates the GPU
- another thread (unaware of shutdown) calls gk20a_busy()
- since runtime PM is disabled in shutdown path,
  pm_runtime_get_sync() fails
- but we still go on running gk20a_scale_notify_busy() which
  tries to access some GPU registers and hangs

Fix this by jumping to failure path in case
pm_runtime_get_sync() fails

Bug 200099940

Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/738509
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:38 +05:30
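
As an illustration of commit d9854e4782 above, the fail-path idea can be sketched in plain userspace C. This is a minimal sketch, not the driver code: `struct busy_gpu`, `fake_pm_runtime_get_sync()`, `fake_scale_notify_busy()` and `fake_gk20a_busy()` are hypothetical stand-ins for the kernel functions named in the commit message, and the error value used is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the real struct gk20a. */
struct busy_gpu {
    bool runtime_pm_enabled; /* false once the shutdown path has run */
    int usage_count;
    int register_accesses;   /* counts touches of (pretend) GPU registers */
};

/* Stand-in for pm_runtime_get_sync(): fails when runtime PM is disabled. */
static int fake_pm_runtime_get_sync(struct busy_gpu *g)
{
    if (!g->runtime_pm_enabled)
        return -EACCES; /* assumed error code for this sketch */
    g->usage_count++;
    return 0;
}

/* Stand-in for gk20a_scale_notify_busy(): touches GPU registers. */
static void fake_scale_notify_busy(struct busy_gpu *g)
{
    g->register_accesses++;
}

/* Sketch of the fixed flow: leave before any register access on failure. */
int fake_gk20a_busy(struct busy_gpu *g)
{
    int ret;

    assert(g != NULL);

    ret = fake_pm_runtime_get_sync(g);
    if (ret < 0)
        goto fail;

    fake_scale_notify_busy(g);
    return 0;

fail:
    return ret;
}
```

With runtime PM disabled, `fake_gk20a_busy()` returns the error and the register access counter stays at zero, which is the behaviour the commit message asks for.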
Kerwin Wan
8d177e7b74 gpu: nvgpu: use vzalloc for mm entries
When the system is low on memory, kzalloc can fail if the
kernel requests a contiguous memory block larger than PAGE_SIZE.

Bug 200096099

Change-Id: I44e217ffa6aa6c453a4d4afba45a8ee3b5756cc1
Signed-off-by: Kerwin Wan <kerwinw@nvidia.com>
Reviewed-on: http://git-master/r/732197
(cherry picked from commit 62861976421415f93e98a0a9f977ac1f66046714)
Reviewed-on: http://git-master/r/737057
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
2015-05-18 11:33:31 +05:30
Konsta Holtta
16fc6e3931 gpu: nvgpu: protect missing sgl in gk20a_mem_phys
Return zero for missing sgl (sgt is already checked) instead of
attempting to dereference NULL. Those NULL conditions should be almost
nonexistent, and zero is not normally used.

When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.

Change-Id: I7033979091951cba3e3004ddc7550cd327ad0baf
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/737759
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:27 +05:30
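
The NULL protection described in commit 16fc6e3931 above might look roughly like the following. This is a simplified sketch under stated assumptions: the `fake_*` types are hypothetical, cut-down mirrors of the kernel's scatter-gather structures, and `fake_mem_phys()` only imitates the idea behind gk20a_mem_phys().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mirrors of the kernel's scatterlist types. */
struct fake_sgl { uint64_t phys_addr; };
struct fake_sgt { struct fake_sgl *sgl; };
struct fake_mem_desc { struct fake_sgt *sgt; };

/*
 * Return the physical address of the first entry, or 0 when either the
 * table or its list is missing, instead of dereferencing NULL.
 */
uint64_t fake_mem_phys(const struct fake_mem_desc *mem)
{
    if (mem->sgt == NULL || mem->sgt->sgl == NULL)
        return 0;
    return mem->sgt->sgl->phys_addr;
}
```

Callers in this sketch have to treat 0 as "no address", which matches the commit's remark that zero is not normally used.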
Sami Kiminki
520ff00e87 gpu: nvgpu: Implement compbits mapping
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.

Compbits mapping is conservative and it may map more than what is
strictly needed. This is for two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.

Bug 200077571

Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:19 +05:30
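
To make the "conservative mapping" remark in commit 520ff00e87 above concrete, here is a small arithmetic sketch. It is an illustration only: `compbits_map_window()` and `struct map_window` are hypothetical names, cache lines are assumed to start at offset 0 of the compbits store, and the cache line size passed in the usage note is an invented value chosen so that it is not a multiple of 4 kB.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096u /* 4 kB small page, as stated in the commit */

struct map_window {
    uint64_t start; /* inclusive byte offset */
    uint64_t end;   /* exclusive byte offset */
};

/* Round down to a granule that need not be a power of two. */
static uint64_t round_down_to(uint64_t x, uint64_t granule)
{
    return (x / granule) * granule;
}

/* Round up to a granule that need not be a power of two. */
static uint64_t round_up_to(uint64_t x, uint64_t granule)
{
    return ((x + granule - 1) / granule) * granule;
}

/* Compute the byte range that has to be mapped for [offset, offset + len). */
struct map_window compbits_map_window(uint64_t offset, uint64_t len,
                                      uint64_t cacheline_size)
{
    struct map_window w;

    assert(len > 0 && cacheline_size > 0);

    /* Reason 2: every touched aggregate cache line must be fully visible. */
    w.start = round_down_to(offset, cacheline_size);
    w.end = round_up_to(offset + len, cacheline_size);

    /* Reason 1: the mapping itself is done on small page alignment. */
    w.start = round_down_to(w.start, SMALL_PAGE_SIZE);
    w.end = round_up_to(w.end, SMALL_PAGE_SIZE);

    return w;
}
```

For example, with a made-up cache line size of 6144 bytes, a request for 100 bytes at offset 7000 yields the window [4096, 12288), so 8192 bytes are mapped to expose 100 bytes of comptag data.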
Anders Kugler
069accc857 gpu: nvgpu: tegra gpu to emc frequency mapping
o emc clock scaling (bug fix):
  Take the gpu load into account for gpu frequencies less
  than or equal to fmax @ Vmin.

Bug 1591643

Change-Id: I0298adfdd4b7111557907c3bd6022fd6005355f0
Signed-off-by: Anders Kugler <akugler@nvidia.com>
Reviewed-on: http://git-master/r/735846
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:32:41 +05:30
Terje Bergstrom
d20afe7bd4 gpu: nvgpu: Dynamic betacb size
Allow querying and setting default betacb size via debugfs. For global buffers
the value takes effect upon first boot of GPU, and has no effect after that.

Bug 1628352

Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733328
2015-05-18 11:32:40 +05:30
Sumit Singh
96ffe0c64d gpu: nvgpu: Fix gk20a shutdown issue
With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot
was getting hung while shutting-down gk20a. It was
happening because genpd_dev_pm_detach() was railgating
gk20a while another thread was still accessing it.

So assign NULL to dev->pm_domain->detach for gk20a,
so that genpd_dev_pm_detach() is not called during gk20a
shutdown and therefore does not railgate it.

This patch will be reverted once we have clean shutdown
for gk20a.

Bug 200070810
Bug 200099940

Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/735455
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-05-18 11:31:58 +05:30
Alex Frid
30e47f6984 gpu: nvgpu: Combine delays with GK20A parameters
Specified locking timeout and IDDQ exit delay as GK20A PLL parameters,
and used this data instead of hard-coded numbers.

Change-Id: I59e16ed11fdba6911f2751195d182e68aed96851
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/735481
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
2015-05-18 11:31:55 +05:30
David Li
11e732387d gpu: nvgpu: fix setting gr_pd_ab_dist_cfg1_r()
gr_*__set_alpha_circular_buffer_size() left max_batches field of
  gr_pd_ab_dist_cfg1_r as 0 which results in too many alpha beta
  transitions and poor performance when tessellation or geometry
  shaders are used

Change-Id: If18feb1119e9672005455155dc56337cd444a1f1
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: http://git-master/r/735476
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:31:47 +05:30
Konsta Holtta
024c14c3f5 gpu: nvgpu: dbg level for per-write ctx patch msg
The message "per-write ctx patch begin?" is a legacy message for warning
about probably inefficient code, but it's written at error loglevel.
Silence it out a bit by using gk20a_dbg_info(). The inefficient paths
can be fixed later.

Bug 200075565

Change-Id: Idae821aef3001ea5016de22a1a87fec747c42d31
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/734248
2015-05-18 11:31:41 +05:30
Konsta Holtta
7072bdc513 gpu: nvgpu: check sync existence in channel update
The channel sync object can get deleted before all channel updates have
finished if the channel is freed before them, so work around a null
dereference by testing if the sync exists. Channel and/or c->sync
refcounting would be necessary for proper fix.

Bug 200076344

Change-Id: Ica8ef2df9cd95cfa593cd4f41768dbb6641357b2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/734266
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:31:39 +05:30
Mahantesh Kumbar
ae2a356f36 gpu: nvgpu: updated gpmu interface data struct.
- pmu version 19494277 is from CL 19495746
- updated gpmu interface data struct with
  respect to latest pmu ucode interface headers.
gpmuifpg.h - 19199047
gpmuifperfmon.h - 18238819
gpmuifpmu.h - 19199047
gpmuifacr.h - 19343196
gpmuifcmn.h - 19264862
rmflcnbl.h - 19317152

Bug 200085428

Change-Id: I7db56dcf5a3038b40da37a69e8723a2e9a652e4b
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/728461
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:31:38 +05:30
Terje Bergstrom
3090ace793 gpu: nvgpu: Do not leak ACR header
4b6f83704f054f5b21e05873fa5862c667a9992e tried to fix ACR related
leak. It fell short, because the data structures related were local
and thus the leak was not really fixed.

This patch stores the ACR ucode blob in a global variable, which
survives across rail gating.

Change-Id: Iec3ac9d41156baa26048e079732568c0a95264f4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733732
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2015-05-18 11:31:31 +05:30
Terje Bergstrom
539fc07012 gpu: nvgpu: zbc: disable activity only from ioctl
Move the fifo engine activity disabling and wait-for-idle from the
lowest-level functions higher, into the ioctl path of zbc operations, so
that the sw initialization path wouldn't call them. During the init
path, the disable isn't necessary, and the code path could result in a
deadlock in the fifo runlist mutex.

Change-Id: Icf5c270ba29bc1c7f88874fba2d176d68e11278a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733668
2015-05-18 11:19:51 +05:30
Alex Frid
d1342b8aa2 gpu: nvgpu: Combine delays with GM20B parameters
Added delays definitions to GPCPLL parameters structure:
- locking timeout delay (applied to locking in fixed frequency mode and
  to PLL dynamic ramp in any mode)
- lock delay for GPCPLL NA mode
- IDDQ exit delay in any mode

Specified delay parameters for GM20B PLL, and used this data instead of
hard-coded numbers.

Change-Id: I63ce0abc9ee900c36ec34b8641513db3cbb6f7d5
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/732094
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
2015-05-18 11:19:49 +05:30
Alex Frid
1767c77951 gpu: nvgpu: Add GPU voltage debug access
- Added GPU voltage debug print to the initial locking of GPCPLL under
  bypass (available only when GPCPLL is in NA mode).
- Added /sys/kernel/debug/gpu.0/voltage debugfs node to read voltage
  through GPCPLL (available only when GPCPLL is in NA mode).

Change-Id: I6643ad4d1b228ec4cbc4ff5e8716cce3ef9dccfc
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/731572
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
2015-05-18 11:19:48 +05:30
Alex Waterman
603e28fbdc gpu: nvgpu: Use MC API for SECURITY_CARVEOUT2
This removes all direct access to the MC registers. This requires
that the MC be loaded before the GPU.

Bug 1540908

Change-Id: I90bcde62f65a0c0d73a2bbe92cbf4a980c671c7d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/453653
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Supriya Sharatkumar <ssharatkumar@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:45 +05:30
Terje Bergstrom
eed15a6bb7 Revert "gpu: nvgpu: Skip reg read of gpc2clk"
This reverts commit 259842f9d222dd2ca2e66bddaceef4a2fd626bc7.
The commit clears some init values that are never restored.

Change-Id: I4efee115863cbfb08b2e280a58b525cb49adc0b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/732428
2015-05-18 11:19:43 +05:30
Terje Bergstrom
e88a606932 gpu: nvgpu: Power up GPU in CDE only when converting
The GPU does not need to be powered up if user space calls kernel and there
is no new work to be done.

Bug 1623918

Change-Id: I531aa7033530ae652d13684d8f8568a0e05fc2e1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/732748
2015-05-18 11:19:43 +05:30
Deepak Nibade
2bdba8f161 gpu: nvgpu: fix compile error with ALLOCATOR_DEBUG
Fix compile time error of missing argument when
ALLOCATOR_DEBUG is enabled

Bug 200095967

Change-Id: I600330f3a75cf777d9cd35ec1f00fdd926fba429
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/731320
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sri Krishna Chowdary <schowdary@nvidia.com>
Tested-by: Sri Krishna Chowdary <schowdary@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:37 +05:30
Scott Long
6be61e4ae1 gpu: nvgpu: gm20b: correct hdr #define
__REGOPS_GK20A_H_ -> __REGOPS_GM20B_H_

Bug 1634208

Change-Id: Ic623563492c084162bfad10f895896d77b4192ed
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: http://git-master/r/729749
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:31 +05:30
Deepak Nibade
087ce7301b gpu: nvgpu: return error if GPU not initialized
While writing to sysfs "tpc_fs_mask", we need to have
GPU initialized (we need to have called gk20a_busy()
at least once before)

If this has not happened yet, return an error

Bug 1456969

Change-Id: I09db6bcaa44b8939246cb5ed1205f3fbc0ee0552
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/731327
(cherry picked from commit 0dbbcf60bbad6b9a31392d2290a3e26c5daa1e5d)
Reviewed-on: http://git-master/r/731671
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-05-18 11:19:27 +05:30
Terje Bergstrom
916a557bd6 gpu: nvgpu: Fill in ACR header only once
We call prepare_ucode_blob() once each time we un-railgate. We
allocate and prepare the header for ACR ucode there, but the header
never gets freed.

Allocate and prepare the ACR header only once.

Change-Id: I948da8b47d6bb2fa021868d7038d2cc35eccb460
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729745
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2015-05-18 11:19:24 +05:30
Konsta Holtta
c19c046446 gpu: nvgpu: protect missing sgt in gk20a_mem_phys
Return zero for missing sgt instead of attempting to dereference NULL.
Those NULL conditions should be almost nonexistent, and zero is not
normally used.

When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.

Change-Id: Id8ce37798d6fd3ceeb96a3f521c82569fccf30aa
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/729006
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:24 +05:30
Seshendra Gadagottu
c90a897c8e gpu: nvgpu: gm20b: enable slcg fb
Bug 1550628

Change-Id: I8daed555704b49ee0d50530e3d51c03027d31fc5
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/719892
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:18:55 +05:30
Alex Waterman
e3b62a54c9 gpu: nvgpu: fix return code in *_ltc_cbc_ctrl()
Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl()
functions. Before, a positive return would always happen. Now,
if there's a timeout, -EBUSY is returned.

Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729165
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:18:54 +05:30
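
The return-code convention described in commit e3b62a54c9 above (0 on success, -EBUSY on timeout) can be sketched as a bounded polling loop. This is a hypothetical userspace illustration: `struct fake_ltc`, `fake_ltc_op_done()` and `fake_ltc_cbc_ctrl()` are invented names, and the real functions poll hardware registers rather than a counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model: the operation completes after N polls. */
struct fake_ltc {
    int polls_until_idle;
};

/* Pretend register poll: true once the pending operation has finished. */
static bool fake_ltc_op_done(struct fake_ltc *ltc)
{
    if (ltc->polls_until_idle > 0) {
        ltc->polls_until_idle--;
        return false;
    }
    return true;
}

/* Returns 0 when the operation completes, -EBUSY when polling runs out. */
int fake_ltc_cbc_ctrl(struct fake_ltc *ltc, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (fake_ltc_op_done(ltc))
            return 0;
    }
    return -EBUSY;
}
```

Returning a negative errno on timeout lets callers use the usual `if (err)` check, which a positive "always succeeded" value would defeat.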
Alex Waterman
72b565452e gpu: nvgpu: Fix timeout in gm20b's LTC flush
The flush timeout check should have compared the timeout against the
current time (jiffies), not the snapshot taken when the L2 flush started.

Change-Id: Idba0ccbfeeab9e3fadd0b5bed7073acefbd403e3
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729090
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:18:39 +05:30
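
The timeout bug described in commit 72b565452e above can be illustrated with a simulated tick counter. This is a sketch under assumptions: `fake_jiffies` stands in for the kernel's jiffies, `fake_time_after()` imitates the spirit of the kernel's time_after() macro, and `fake_flush_times_out()` is a hypothetical polling loop rather than the driver's flush code.

```c
#include <stdbool.h>

/* Hypothetical tick counter standing in for the kernel's jiffies. */
static unsigned long fake_jiffies;

/* Wrap-tolerant "a is after b", in the spirit of time_after(). */
static bool fake_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Poll a pretend flush that needs ticks_until_done ticks to finish.
 * Returns true if the timeout expired first, false if the flush completed.
 */
bool fake_flush_times_out(unsigned long timeout_ticks,
                          unsigned long ticks_until_done)
{
    unsigned long start = fake_jiffies;          /* snapshot at flush start */
    unsigned long deadline = start + timeout_ticks;

    for (;;) {
        if (fake_jiffies - start >= ticks_until_done)
            return false; /* flush completed in time */

        /*
         * Compare the current time against the deadline. Comparing the
         * 'start' snapshot instead would never report a timeout.
         */
        if (fake_time_after(fake_jiffies, deadline))
            return true;

        fake_jiffies++; /* pretend one tick passes per poll */
    }
}
```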
Terje Bergstrom
3b8c8972ef gpu: nvgpu: Use common allocator for ACR
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.

Bug 1605769

Change-Id: Ib70db4dff782176ed7f92b6809c8415b8c35abe1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721120
2015-05-18 11:18:27 +05:30
Deepak Nibade
c5f2d00d04 gpu: nvgpu: fix deadlock on railgate_lock during race condition
We have below race condition during __gk20a_do_idle()
and force_reset case :

- before execution of __gk20a_do_idle(), a process drops the last
  usage count of GPU, which triggers GPU railgate process
- but before GPU is really railgated (there is a 500 ms delay),
  some process calls __gk20a_do_idle()
- in __gk20a_do_idle(), we first take railgate_lock
- then we check if GPU is already railgated or not
- since it is not railgated yet (due to the 500 ms delay), this
  returns false
- then we call pm_runtime_get_noresume() which just increases the
  usage counter
- in this particular case, this call just increases usage count to
  1 from 0, whereas GPU is already on its way to railgate
- while we check if GPU usage count drops to one, GPU gets railgated
- now if we have force_reset=true case, we will end up calling
  pm_runtime_get_sync() which will take railgate_lock lock _again_
  and try to unrailgate GPU
- this causes a deadlock on railgate_lock

To fix this, use below sequence :

- take railgate_lock
- check if GPU is already railgated
- release railgate_lock
- call pm_runtime_get_sync() which will keep GPU active even if
  railgating is already triggered
- take railgate_lock again to prevent unrailgate in further process

Also, add more descriptive comments to explain the flow

Bug 1624537

Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/719625
(cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231)
Reviewed-on: http://git-master/r/719620
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-05-18 11:18:24 +05:30
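
The locking order proposed in commit c5f2d00d04 above can be sketched with a toy non-recursive lock. Everything here is hypothetical: `struct rg_lock`, `struct rg_gpu`, `rg_pm_runtime_get_sync()` and `rg_do_idle_prepare()` are invented for illustration, and the assert in `rg_lock_take()` is how this single-threaded sketch models the deadlock that a second acquisition would cause.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: taking it while held trips the assert. */
struct rg_lock {
    bool held;
};

static void rg_lock_take(struct rg_lock *l)
{
    assert(!l->held); /* a second take models the deadlock */
    l->held = true;
}

static void rg_lock_release(struct rg_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Hypothetical GPU state for this sketch. */
struct rg_gpu {
    struct rg_lock railgate_lock;
    bool railgated;
    int usage_count;
};

/*
 * Stand-in for pm_runtime_get_sync(): bumps the usage count and, if the
 * GPU is railgated, unrailgates it while holding railgate_lock.
 */
static void rg_pm_runtime_get_sync(struct rg_gpu *g)
{
    g->usage_count++;
    if (g->railgated) {
        rg_lock_take(&g->railgate_lock);
        g->railgated = false;
        rg_lock_release(&g->railgate_lock);
    }
}

/*
 * Fixed ordering: check under the lock, drop the lock, resume, then take
 * the lock again. Returns whether the GPU was railgated on entry and
 * leaves railgate_lock held for the caller.
 */
bool rg_do_idle_prepare(struct rg_gpu *g)
{
    bool was_railgated;

    rg_lock_take(&g->railgate_lock);
    was_railgated = g->railgated;
    rg_lock_release(&g->railgate_lock);

    rg_pm_runtime_get_sync(g); /* safe: the lock is not held here */

    rg_lock_take(&g->railgate_lock);
    return was_railgated;
}
```

Calling `rg_pm_runtime_get_sync()` while still holding `railgate_lock` would hit the assert in this model, which mirrors the deadlock in the original sequence.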
Alex Van Brunt
900f63393d gpu: nvgpu: don't reset clk that doesn't exist
If the clock is null, calling the reset function will crash the
kernel. So, don't call the reset function.

Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/742361
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-15 15:42:41 -07:00
Terje Bergstrom
aa25a952ea Revert "gpu: nvgpu: New allocator for VA space"
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.

Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2015-05-12 02:46:39 -07:00
Alex Waterman
a2e8523645 gpu: nvgpu: New allocator for VA space
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scalable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.

The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although that split is not removed quite yet, the new allocator
enables that to happen.

The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.

Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.

Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:53:25 -07:00
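
For readers unfamiliar with buddy allocation, the core arithmetic behind a scheme like the one in commit a2e8523645 above can be sketched as follows. This shows the generic textbook technique, not the nvgpu implementation: `buddy_order_for()` and `buddy_offset_of()` are hypothetical helpers, offsets are relative to the base of the managed space, and the minimum block size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest order such that (min_block << order) covers 'size' bytes. */
unsigned int buddy_order_for(uint64_t size, uint64_t min_block)
{
    unsigned int order = 0;

    assert(size > 0 && min_block > 0);

    while ((min_block << order) < size)
        order++;
    return order;
}

/*
 * Offset of the buddy of the block at 'offset' with the given order.
 * Two buddies differ only in the bit that equals their block size, so a
 * single XOR finds the partner to split from or merge with.
 */
uint64_t buddy_offset_of(uint64_t offset, unsigned int order,
                         uint64_t min_block)
{
    uint64_t block = min_block << order;

    assert(offset % block == 0); /* blocks are aligned to their own size */
    return offset ^ block;
}
```

Because only split blocks need bookkeeping, a mostly empty address space costs very little metadata, which is the memory advantage the commit message attributes to the buddy scheme over a bitmap.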
Alex Waterman
0566aee853 gpu: nvgpu: WAR for simulator bug
On linsim, when the push buffers are allowed to be allocated with small
pages above 4GB the simulator crashes. This patch ensures that for
linsim all small page allocations are forced to be below 4GB in the
GPU VA space. By doing so the simulator no longer crashes.

This bug has come up because the GPU buddy allocator work generates
allocations at the top of the address space first. Thus push buffers
were located at between 12GB and 16GB in the GPU VA space.

Change-Id: Iaef0af3fda3f37ac09a66b5e1179527d6fe08ccc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740728
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:52:09 -07:00
Alex Waterman
e206fdecb3 gpu: nvgpu: Fix off-by-one error in PDE calculations
The number of entries in the next level PDE data structure was one
half of what was needed since the bit shift was 1 bit too small.

Change-Id: Id4981f230dd206ae94336cddab117312e143e6a1
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740727
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:51:38 -07:00
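
The kind of off-by-one described in commit e206fdecb3 above can be shown with a short calculation. This is an assumed illustration, not the actual fix: `pde_entries()` is a hypothetical helper, and the bit positions in the usage note are invented.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of entries in a page directory level that is indexed by the
 * inclusive virtual address bit range [lo_bit, hi_bit]. The range holds
 * (hi_bit - lo_bit + 1) bits, so leaving out the "+ 1" makes the shift
 * one bit too small and halves the entry count.
 */
uint64_t pde_entries(unsigned int hi_bit, unsigned int lo_bit)
{
    assert(hi_bit >= lo_bit && hi_bit - lo_bit < 63);
    return UINT64_C(1) << (hi_bit - lo_bit + 1);
}
```

With made-up bit positions 25 down to 17, the level is indexed by 9 bits and needs 512 entries; shifting by 8 instead would give 256.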
Alex Waterman
4d405809a9 gpu: nvgpu: Reduce BAR1 kernel size
Reduce the BAR1 size in the kernel to match the reserved size in the
DTB. This caused problems for the buddy allocator since the allocator
can sometimes allocate from higher memory before lower memory in the
managed space. This would cause the kernel to access unmapped memory.

Change-Id: I70b72ef5bb4db01253e5087757051ef852e99bc6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740726
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:51:08 -07:00
Sami Kiminki
8d6fe0f2ef gpu: nvgpu: Implement compbits padding for mapping
Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds
extra alignment to compbits allocation for safe compbits mapping.

Bug 200077571

Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/730763
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/740693
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-11 08:50:49 -07:00
Terje Bergstrom
5a5662fffb gpu: nvgpu: Export VPR allocator
Export functions for VPR allocation.

Bug 1625090

Change-Id: Ief54613402965da3f41d8dd4a463c75729a3941a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/737847
Reviewed-on: http://git-master/r/738574
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:06 -07:00
Terje Bergstrom
b3a85df53b gpu: nvgpu: SMMU bypass
Improve GMMU mapping code to cope with discontiguous buffers.

Add debugfs entry that allows bypassing SMMU and disabling big pages.

Bug 1605769

Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:01 -07:00
Vijayakumar
4425e9ebcf gpu: nvgpu: use 4K hole for pmu VM
bug N/A

with 128MB hole we are running into PDE
errors when 64K big page is used instead
of 128k

Signed-off-by: Vijayakumar <vsubbu@nvidia.com>

Change-Id: Id887b32484e2114a8707e7d534e6ebf5e108b83f
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/733497
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/737532
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:58:57 -07:00
Terje Bergstrom
852822b2ef gpu: nvgpu: Record size of page table level
Record size of each page table level. The size of level 0 depends
on size of the address space, and we generally do not support the
whole address space.

Change-Id: Iab47505af1a641e193d9e98a2246e522813f221a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729730
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-on: http://git-master/r/737531
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:58:52 -07:00
Terje Bergstrom
2204f2a524 gpu: nvgpu: Use common allocator for patch
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.

Bug 1605769

Change-Id: Idf51831e8be9cabe1ab9122b18317137fde6339f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721030
Reviewed-on: http://git-master/r/737530
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:57:34 -07:00
Terje Bergstrom
5486503343 gpu: nvgpu: Align VA of compressible buffer
Ensure that the GPU VA for a buffer is aligned correctly if
compression is enabled.

Bug 1605769

Change-Id: I12566ddd554da7cc9fb41dd553576c534ac96ba8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/725767
Reviewed-on: http://git-master/r/737529
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:49 -07:00
Terje Bergstrom
d47b01d74f gpu: nvgpu: Free all page table levels
Convert the loop to free page tables into a recursive loop that goes
through all levels.

Change-Id: I3ab8f021bd8263f2f6dad29b5fbd0e6212c55a86
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/711393
Reviewed-on: http://git-master/r/737528
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:44 -07:00
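
The recursive teardown mentioned in commit d47b01d74f above can be sketched on a toy page table tree. This is illustrative only: `struct fake_pt` and `fake_free_pt_levels()` are hypothetical, the fan-out of 4 is arbitrary, and the real driver frees GPU page table memory rather than heap nodes.

```c
#include <stddef.h>
#include <stdlib.h>

#define FAKE_PT_FANOUT 4 /* arbitrary fan-out for this sketch */

/* Hypothetical page table node: each entry may point to a lower level. */
struct fake_pt {
    struct fake_pt *children[FAKE_PT_FANOUT];
};

/*
 * Free a table and every level below it, children first.
 * Returns the number of tables freed.
 */
unsigned int fake_free_pt_levels(struct fake_pt *pt)
{
    unsigned int freed = 0;

    if (pt == NULL)
        return 0;

    for (size_t i = 0; i < FAKE_PT_FANOUT; i++)
        freed += fake_free_pt_levels(pt->children[i]);

    free(pt);
    return freed + 1;
}
```

Recursing before freeing the parent means the same function handles any depth, so adding a page table level does not require another hand-written loop.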
Terje Bergstrom
06be77da37 gpu: nvgpu: Do not send WFI when finishing channel
The channel teardown process sends a WFI method to ensure that all
work has been completed. But we also preempt the channel a while
later, which also ensures that all work is completed.

Remove the code for submitting WFI, and rely on preemption to handle
idling the pipe.

Change-Id: I2af029184440ee73e70d377f15690ddaf9b8599f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/735067
Reviewed-on: http://git-master/r/737527
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:41 -07:00
Terje Bergstrom
9bbffa11de gpu: nvgpu: Reconfigure instance block with syncpt
Resetup RAMFC once sync point id is allocated for a channel.

Change-Id: Idbac406bea1c94c89ef587dda08fddc740c1fadb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/711302
Reviewed-on: http://git-master/r/737526
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:55:35 -07:00
Alex Waterman
6e1dfd0131 platform: tegra: mc: Centralize header files
Place all header files under linux/platform/tegra/. Also update
all source files that include the moved headers to correctly
reflect their new location.

Change-Id: Iff5738d3ad75e93519d1a4b573b80d03e6a9b053
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/728636
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Tested-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-on: http://git-master/r/733651
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:55:21 -07:00
Mahantesh Kumbar
586bc05700 gpu: nvgpu: made gm20b_pmu_init_acr() global.
- Made gm20b_pmu_init_acr() global so that it can be
  accessed from pmu-T18x.

Bug 200085428

Change-Id: Ic262997d5c6f97cecf12d17d9a64a9d1cd20c83b
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/732210
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Tested-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/735727
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-04-27 10:45:37 -07:00
Dan Willemsen
75b50e8588 HACK: Disable genpd_pm_subdomain_attach
Upstream doesn't keep track of the DT node in the genpd struct anymore.

Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
2015-04-04 19:17:41 -07:00
Terje Bergstrom
029ccf28ec gpu: nvgpu: Sem wakeup to post event
Add posting a channel event whenever we do a wakeup due to semaphore.

Change-Id: Id1765123de93bcbc0822af7926d7f4e9919ffe10
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/726420
2015-04-04 19:17:38 -07:00
Terje Bergstrom
10e97dccc5 gpu: nvgpu: Check alignment of fixed allocs
When mapping buffer on a fixed address, ensure that the alignment of
buffer and the address are compatible. When freeing, retrieve page
size from the VA instead of choosing it again.

Bug 1605769

Change-Id: I4f73453996cd53a912b6a414caa41563cde28da7
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/725764
2015-04-04 19:17:38 -07:00
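
The alignment check described in commit 10e97dccc5 above amounts to a mask test. The following is a minimal sketch assuming power-of-two page sizes: `check_fixed_map_alignment()` is a hypothetical helper, and returning -EINVAL is an assumption about how such a mismatch might be reported.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if a fixed GPU virtual address is aligned to the page size the
 * buffer will be mapped with, -EINVAL otherwise.
 */
int check_fixed_map_alignment(uint64_t fixed_va, uint64_t page_size)
{
    /* Reject sizes that are zero or not a power of two. */
    if (page_size == 0 || (page_size & (page_size - 1)) != 0)
        return -EINVAL;

    /* Any low bit set below the page size means the address is misaligned. */
    if ((fixed_va & (page_size - 1)) != 0)
        return -EINVAL;

    return 0;
}
```

For instance, address 0x21000 is acceptable for a 4 kB page but not for a 128 kB page.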