Commit Graph

309 Commits

Author SHA1 Message Date
Richard Zhao
5b7588a50b gpu: nvgpu: add characteristics flag NVGPU_GPU_FLAGS_SUPPORT_TSG
NVGPU_GPU_FLAGS_SUPPORT_TSG indicates that both the kernel driver and
the device support time slice groups (TSG).

Bug 1617046
Bug 200155618

Change-Id: Ib3490a32b773222560c58f1fd6d32bffcb97d6cd
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1010173
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-02-11 10:27:37 -08:00
Deepak Nibade
8d311e5a91 gpu: nvgpu: add max freq to gpu characteristics
Bug 200097029

Change-Id: Id63dad1629b1d1919cbbfb20b0cb85d4855f526d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1000724
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-02 08:49:33 -08:00
Seshendra Gadagottu
9e02111a76 gpu: nvgpu: fix race condition with poweroff
When gpu rail-gating is enabled, it is possible that
both the rail-gating code and system shutdown can start
executing gk20a_pm_prepare_poweroff() in parallel.
To synchronize this execution, protect gk20a_pm_prepare_poweroff()
with a mutex lock.
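
A minimal sketch of the serialization, assuming a poweroff_lock
mutex embedded in struct gk20a (field and helper names here are
assumptions, not the actual patch):

  #include <linux/mutex.h>

  static int gk20a_pm_prepare_poweroff(struct device *dev)
  {
          struct gk20a *g = get_gk20a(dev);
          int ret = 0;

          /* rail-gating and shutdown may both arrive here; serialize them */
          mutex_lock(&g->poweroff_lock);
          if (g->power_on)
                  ret = gk20a_channel_suspend(g);
          /* ... remaining poweroff steps ... */
          mutex_unlock(&g->poweroff_lock);
          return ret;
  }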

Bug 200168805

Change-Id: I19536a43ed20c3e82b32c316922dc3e19e3f59bb
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/999548
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-29 12:58:22 -08:00
Seshendra Gadagottu
3a68fb2d78 gpu: nvgpu: suspend cde cleanly
CDE sometimes deadlocks because of a pending CDE
operation. So do things cleanly: first suspend CDE, then
suspend the channels.
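
A sketch of the intended ordering in the suspend path (helper names
follow the driver's conventions but are assumptions here):

  /* quiesce CDE first so no CDE operation is left pending on a channel */
  gk20a_cde_suspend(g);
  ret = gk20a_channel_suspend(g);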

Bug 1709757

Change-Id: Iaf566b63d9efb13aa2691c19e2df676c70f26afc
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/926574
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 17:47:13 -08:00
dmitry pervushin
2113479bbf drivers: allow selected drivers to async probe
List of drivers that now want async probe (sketch below):
- sdhci-tegra
- qspi-mtd
- nvmap
- gk20a
- dc
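
In mainline kernels this opt-in is expressed through the driver's
probe_type field; a sketch for gk20a, assuming the mainline
PROBE_PREFER_ASYNCHRONOUS mechanism is what this tree carries:

  static struct platform_driver gk20a_driver = {
          .probe  = gk20a_probe,
          .driver = {
                  .name       = "gk20a",
                  /* probe from a worker thread instead of serially at boot */
                  .probe_type = PROBE_PREFER_ASYNCHRONOUS,
          },
  };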

Bug 200083391

Change-Id: Ie0a0677961b704c78d4eb2cdab9f0e9a925a3ca1
Reviewed-on: http://git-master/r/923738
(cherry-picked from 75c067e83c7cde2a37c4fae01719e40c5b7d2835)
Signed-off-by: dmitry pervushin <dpervushin@nvidia.com>
Reviewed-on: http://git-master/r/923121
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
Tested-by: Sumeet Gupta <sumeetg@nvidia.com>
2016-01-13 10:39:00 -08:00
Peter Pipkorn
2b064ce65e gpu: nvgpu: add high priority channel interleave
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when a lot of
lower priority work is present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2ms to 3.0ms in the process, to prevent long-running high
priority apps from hogging the GPU too much.

Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.

Add a new runlist max length register, used for allocating
a suitably sized runlist.

Limit the number of interleaved channels to 32.
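
A sketch of the interleaving idea at runlist construction time (list
heads and helpers are illustrative, not the actual code):

  /* emit every high priority channel between each pair of other
     channels; the high priority set is capped at 32 channels */
  list_for_each_entry(ch, &f->channels, runlist_node) {
          struct channel_gk20a *hi;

          if (interleave_enabled)
                  list_for_each_entry(hi, &f->high_prio, runlist_node)
                          entry = add_runlist_entry(entry, hi->hw_chid);
          entry = add_runlist_entry(entry, ch->hw_chid);
  }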

This change reduces the maximum time a lower priority job
runs (one timeslice) before we check that high priority
jobs are running.

Tested with gles2_context_priority (still passes).
Basic sanity testing is done with graphics_submit
(one app is high priority).

Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
--drawsperframe 20000 --triangles 50 --runtime 30 --finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
--drawsperframe 20000 --triangles 50 --runtime 30 --finish

Prior to this change, the relative performance between
high priority work and normal priority work came down
to the timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will get roughly half of the entire GPU time, meaning
that after the initial drop, its performance is less likely
to degrade further as more apps run on the system.

This change takes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forward without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.

Support for priorities with TSG is future work.
Support for interleaving mid + high priority channels,
instead of just high, is also future work.

Bug 1419900

Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-11 09:04:01 -08:00
Deepak Nibade
632a86f7cd gpu: nvgpu: IOCTL to disable watchdog per-channel
Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable
the watchdog per-channel.

Also, if the watchdog is disabled, we currently schedule
the worker with the MAX timeout.
Instead, do not schedule any worker at all if the
watchdog is disabled.
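
A sketch of the ioctl plumbing (the args struct and defines mirror
the nvgpu uapi style but are assumptions here):

  case NVGPU_IOCTL_CHANNEL_WDT:
          /* remember the choice; the watchdog start path checks it
             and schedules no worker at all when disabled */
          ch->wdt_enabled =
                  ((struct nvgpu_channel_wdt_args *)buf)->wdt_status ==
                          NVGPU_IOCTL_CHANNEL_ENABLE_WDT;
          break;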

Bug 1683059
Bug 1700277

Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/837962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-30 08:48:59 -08:00
Deepak Nibade
10f6da09eb gpu: nvgpu: fix Coverity issues
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
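
One representative class, "unsigned compared to 0", as a generic
sketch (not the actual nvgpu hunks):

  u32 offset = compute_offset();

  if (offset < 0)         /* always false: offset is unsigned */
          return -EINVAL; /* fix: make offset signed, or drop the check */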

Bug 1703084

Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-11-25 00:45:58 -08:00
Richard Zhao
1246629c19 gpu: nvgpu: abstract set mmu debug mode
Add a new operation g->ops.mm.set_debug_mode and let the other
places that set debug mode call this callback.

This prepares for adding a vgpu set-mmu-debug-mode hook.
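
A sketch of the HAL shape this introduces (the op name is from the
message; the surrounding code is illustrative):

  struct gpu_ops {
          struct {
                  void (*set_debug_mode)(struct gk20a *g, bool enable);
          } mm;
          /* ... */
  };

  /* native chips install the register-writing implementation ... */
  gops->mm.set_debug_mode = gk20a_mm_set_debug_mode;

  /* ... and every call site goes through the callback, so vgpu can
     install its own hook later */
  g->ops.mm.set_debug_mode(g, true);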

JIRA VFND-1005
Bug 1594604

Change-Id: I1d227a0c0f96adb0035ae16ae1f4fbfa739bf0a7
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/833497
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2015-11-23 14:31:02 -08:00
Mahantesh Kumbar
b8b6df791b gpu: nvgpu: update name for gpu debugfs node
Create a constant name for the gpu debugfs node across all chips.

Bug n/a

Change-Id: I359b82b5389c49d8fe2a31ace49ff6daa1edfb10
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/805397
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
(cherry-picked from commit 17a3882cde09412c68f7a0ee4765f45be1a51c45)
Reviewed-on: http://git-master/r/817014
2015-11-23 08:37:41 -08:00
Sami Kiminki
9d2c9072c8 gpu: nvgpu: User-space managed address space support
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.

When an address space is marked as userspace-managed, the following
changes are in effect:

- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
  except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
  ref increments at kickoffs and decrements at job completion are
  skipped.
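
A user-space sketch of creating such an address space (the args
struct is abridged; consult the nvgpu uapi header for the real
layout):

  struct nvgpu_alloc_as_args args = {0};

  args.flags = NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED;
  if (ioctl(ctrl_fd, NVGPU_GPU_IOCTL_ALLOC_AS, &args) == 0)
          as_fd = args.as_fd; /* all maps must now be fixed-address */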

Bug 1614735
Bug 1623949
Bug 1660392

Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-18 09:45:07 -08:00
Deepak Nibade
3684ed8fd6 gpu: nvgpu: use railgate instead of reset_assert
In gk20a_do_idle(), if can_railgate = false, we do
reset_assert() on the GPU.

But asserting reset might have dependencies on a
specific h/w sequence for ensuring a proper reset.

Hence avoid calling reset_assert() and directly
call the platform specific unrailgate() and railgate()
APIs, which take care of the correct reset sequence.
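
A sketch of the substitution in gk20a_do_idle() (platform hook names
per the message; control flow simplified):

  if (!platform->can_railgate) {
          /* was: reset_assert()/reset_deassert() around the idle
             window; railgate() owns the proper h/w reset sequence */
          platform->railgate(dev);
          /* ... do_idle work with the GPU held railgated ... */
          platform->unrailgate(dev);
  }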

Bug 200142989
Bug 200137963
Bug 1678611

Change-Id: Ide886dd88b8422ad36de52d54378b1edd9c7bbd6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/820322
(cherry picked from commit d621ddd49da976a75c14aa7aaa37f700fb4e83f2)
Reviewed-on: http://git-master/r/822515
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-10-26 04:03:56 -07:00
Deepak Nibade
0334d7e303 gpu: nvgpu: unrailgate only if pm_domains not enabled
Currently we unrailgate the GPU if railgating is not enabled
or pm_domains are not enabled.

But if railgating is not enabled and pm_domains
are enabled, we explicitly unrailgate the GPU in gk20a_pm_init(),
and then runtime PM unrailgates it again when the first user
space request arrives - setting the unrailgate refcount to 2.

Now for gk20a_do_idle(), we need to railgate the GPU on the
first call, but that does not happen since the unrailgate
refcount != 1.

Hence, in case railgating is not enabled, we should
unrailgate the GPU from only one place, i.e. when the first user
space request arrives.
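
A sketch of the resulting condition in gk20a_pm_init() (names
assumed):

  /* unrailgate here only when pm_domains will not do it on the
     first user space request */
  if (!platform->can_railgate && !IS_ENABLED(CONFIG_PM_GENERIC_DOMAINS))
          platform->unrailgate(dev);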

Bug 200142989
Bug 200137963
Bug 1678611

Reviewed-on: http://git-master/r/820321
(cherry picked from commit 452a1ff8da8e3f47caed2371440f9ad150bf8699)

Change-Id: Ic9fe2267c9df5629315c30c1404c2b3044c1265a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/822296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-10-26 01:53:15 -07:00
Deepak Nibade
1cf6539990 gpu: nvgpu: rework secure_page allocation path
Currently, if can_railgate = false, then we have the below
sequence to allocate the secure_page:
- unrailgate GPU (forever)
- reset_assert()
- allocate secure_page
- reset_deassert()

But if we allocate the secure page even before unrailgating the GPU
for the first time, then we can avoid the reset_assert()/deassert()
calls, since the GPU should already be in reset/railgated at
boot time.

Hence, rework this sequence as below (see the sketch after the list):
- init required mutexes, set platform->reset_control
- allocate secure page (GPU should already be in reset
  at this point)
- gk20a_pm_init() which unrailgates the GPU in case of
  can_railgate = false
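
A sketch of the reworked probe-time order (function names are
assumptions matching the list above):

  gk20a_init_support(dev);                   /* mutexes, reset_control */
  if (platform->secure_page_alloc)
          platform->secure_page_alloc(dev);  /* GPU still in reset here */
  gk20a_pm_init(dev);       /* unrailgates if can_railgate = false */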

Bug 200137963
Bug 1678611

Change-Id: I79d0543bb5cf1eaf1009e1e6ac142532d84514a5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797153
(cherry picked from commit 368004501943d38c003747f6bec0384fed57ee65)
Reviewed-on: http://git-master/r/816005
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-12 08:58:35 -07:00
Kirill Artamonov
3ad8438c90 gpu: nvgpu: set correct timeslice value
Scale the timeslice register value based on the platform
specific ptimer scale coefficient.

Expose timeslice values through debugfs to simplify performance
tuning.
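
A sketch of the scaling (the coefficient source and helpers are
assumptions; the real patch takes them from platform data):

  /* the timeslice register counts ptimer ticks, so scale the
     microsecond value by the platform ptimer coefficient first */
  u32 scaled_us = timeslice_us * platform->ptimer_scale_num /
                  platform->ptimer_scale_den;
  u32 value = timeslice_encode(scaled_us); /* timeout/timescale fields */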

bug 1605552
bug 1603226

Change-Id: I49f86f22d58d26a366ee1b5f5a9ab9d7f896ad25
Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-on: http://git-master/r/800007
(cherry picked from commit 00c85ef24cf28ffaa81eb53fff7edef1c699220a)
Reviewed-on: http://git-master/r/808251
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:31:10 -07:00
Deepak Nibade
ed0737ab60 gpu: nvgpu: support reset_control API
Bug 200137963

Change-Id: I3197af905c945540b97ba191e5695d970d77af8e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797154
(cherry picked from commit 8a50245ea636deb87a3d9435fb115b4eac88fac9)
Reviewed-on: http://git-master/r/808247
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:29:37 -07:00
Richard Zhao
cc793c34cc gpu: nvgpu: let shutdown callback call vgpu_pm_prepare_poweroff for vgpu
This fixes an issue where the system hangs on reboot.

Bug 1638850

Change-Id: If53a31e86c10b2fce4a22fe4fcf92106d86c95ef
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/803234
(cherry picked from commit 4dbea2c7037a5244ccb9d6e886023c29ba584892)
Reviewed-on: http://git-master/r/808245
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:27:52 -07:00
Kirill Artamonov
ad113cf0a5 gpu: nvgpu: create debugfs node early
Create the debugfs node before platform->probe() is called.

This allows chip specific debugfs entries to go to the correct
directory.

bug 1525327
bug 1581799

Change-Id: I2d91bdc1e72dac6787938eff01218c9f871029cb
Signed-off-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-on: http://git-master/r/796092
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/778729
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-09-30 08:21:42 -07:00
Deepak Nibade
613990cb39 gpu: nvgpu: implement per-channel watchdog
Implement per-channel watchdog/timer as per below rules :
- start the timer while submitting first job on channel or if
  no timer is already running
- cancel the timer when job completes
- re-start the timer if there is any incomplete job left
  in the channel's queue
- trigger appropriate recovery method as part of timeout
  handling mechanism

Handle the timeout as per below :
- get timed out channel, and job data
- disable activity on all engines
- check if fence is really pending
- get information on failing engine
- if no engine is failing, just abort the channel
- if engine is failing, trigger the recovery

Also, add a flag "ch_wdt_enabled" to enable/disable the channel
watchdog mechanism. The watchdog can also be disabled using the
global flag "timeouts_enabled".

Set the default watchdog timeout to 5s using the macro
NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS.
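
A sketch of the arm/cancel lifecycle described above (field and
helper names are illustrative):

  #define NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS 5000

  static void channel_wdt_start(struct channel_gk20a *ch)
  {
          if (!ch->wdt_enabled || !ch->g->timeouts_enabled)
                  return;
          schedule_delayed_work(&ch->wdt_work,
                  msecs_to_jiffies(NVGPU_CHANNEL_WATCHDOG_DEFAULT_TIMEOUT_MS));
  }

  static void channel_wdt_on_job_complete(struct channel_gk20a *ch)
  {
          cancel_delayed_work(&ch->wdt_work);
          if (!list_empty(&ch->jobs)) /* incomplete jobs left? re-arm */
                  channel_wdt_start(ch);
  }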

Bug 200133289

Change-Id: I401cf14dd34a210bc429f31bd5216a361edf1237
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/797072
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-28 09:08:12 -07:00
Leonid Moiseichuk
54c2ae59f0 gpu: nvgpu: cyclestats snapshot permissions rework
The cyclestats snapshot feature is expected on new devices.
The detection code was isolated into a separate function, and a
run-time check was added to validate/allow ioctl calls on the
current GPU.

Bug 1674079

Change-Id: Icc2f1e5cc50d39b395d31d5292c314f99d67f3eb
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/781697
(cherry picked from commit bdd23136b182c933841f91dd2829061e278a46d4)
Reviewed-on: http://git-master/r/793630
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-04 09:03:07 -07:00
Sam Payne
22dacd2685 gpu: nvgpu: dump PGRAPH_PRI on error
Dump NV_PGRAPH_PRI_GPC0_GPCCS_FS_GPC
whenever the pbus sends the 0xbadf13 error.

bug 1662268

Change-Id: I302ffe5c86098e7235ecc8c071a5e2c852455565
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/789090
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-08-31 14:38:05 -07:00
Yogesh
05a6b54914 gpu: nvgpu: Inject function addresses
Inject function addresses of gk20a_do_idle and
gk20a_do_unidle once the nvgpu module loads.

Bug 1476801

Change-Id: I67a8ae7fb654524616c2c2c710013cbc097a3f32
Signed-off-by: Yogesh Bhosale <ybhosale@nvidia.com>
Reviewed-on: http://git-master/r/785047
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-08-21 15:14:44 -07:00
Sumit Singh
77a816e7bf Merge branch 'power-domain-t186' into 'kernel-3.18'
Add device-tree support for tegra power-domains and power-gating
for t186, then perform the related cleanup.
Also enable TEGRA_MC_DOMAINS, PM_GENERIC_DOMAINS_OF and
TEGRA_POWERGATE for t186.

Bug 200105664

Change-Id: I548c6b71a1577afa439a39a0eafc317a1c3cbc68
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2015-08-04 12:43:12 +05:30
Deepak Nibade
a76c6fc950 gpu: nvgpu: prepare_poweroff() in shutdown()
gk20a_pm_shutdown() is the last callback before
GPU railgating is forced by platform code.

Hence we need to call prepare_poweroff() before
returning from shutdown(), mainly to clean up the
things below:

1. disable interrupts to ensure that GPU is not
   processing any interrupts while railgating

2. disable clocks (and related flags) to ensure
   no h/w access from exported clock ops

Note that GPU railgate will be triggered by platform
code since config CONFIG_PM_GENERIC_DOMAINS_OF is
enabled by default
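
A sketch of the resulting shutdown callback (simplified; the real
ordering has more steps):

  static void gk20a_pm_shutdown(struct platform_device *pdev)
  {
          /* block new h/w access, then quiesce interrupts and clocks
             before platform code force-railgates the GPU */
          __pm_runtime_disable(&pdev->dev, false);
          gk20a_pm_prepare_poweroff(&pdev->dev);
  }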

Bug 200123584

Change-Id: Ifaa0d1ba9b01d49bf5cc85d9c9a9feb3815866d8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/770485
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-07-22 02:23:25 -07:00
Sumit Singh
25f0faeb37 gpu: nvgpu: clean-up the code
As CONFIG_PM_GENERIC_DOMAINS_OF is enabled, clean up
the code which remains unused when this config is enabled.

Bug 200070810

Change-Id: I884ca3d6fb8fa6acdff8c1b2fbe66a672758274a
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2015-07-21 11:46:23 +05:30
Sumit Singh
a19024c4c3 gpu: nvgpu: Add DT support for gpu power-domain
Make modification to add DT support for gpu
power-domain for T186 chip.

Bug 200105664

Change-Id: Ief8d0a6c84918578c52d153db7eac02587b67ee7
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2015-07-21 11:46:22 +05:30
Leonid Moiseichuk
1e7b5ea793 gpu: nvgpu: cyclestats snapshots are only for t210
The cyclestats mode-E feature is supported by userspace only
for t210 devices, so the kernel should advertise it only for t210.

Also, a small check was added to prevent a BUG in dma-buf.c:826
if the device runs short of memory.

Bug 1662506

Change-Id: I8417a8cdd9092e64126382f379d171932e4592a1
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/767073
(cherry picked from commit 06f86b6e78bae5e26e32466716c18e7918efb1b1)
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/767148
Reviewed-by: Automatic_Commit_Validation_User
2015-07-10 00:31:03 -07:00
Terje Bergstrom
4cc1457703 gpu: nvgpu: Move clk bypass div code to clk init
Clock bypass divider was changed just before resetting priv ring.
Move the code to a new clk op instead so that it is executed only on
gk20a.

Change-Id: Ic8084a4a5fac23770f50b50f910ced2543ba0f28
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/764970
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Aleksandr Frid <afrid@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2015-07-03 07:51:42 -07:00
Sami Kiminki
e7ba93fefb gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.
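
A sketch of how a batch brackets a series of maps (names modeled on
the driver's style; treat the signatures as assumptions):

  struct vm_gk20a_mapping_batch batch;
  int i;

  gk20a_vm_mapping_batch_start(&batch);
  for (i = 0; i < num_buffers; i++)
          gk20a_vm_map_buffer(vm, dmabuf_fds[i], &offsets[i],
                              flags, &batch);
  /* one gk20a_busy()/gk20a_idle(), one L2 flush and one TLB
     invalidate cover the whole batch */
  gk20a_vm_mapping_batch_finish(vm, &batch);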

Bug 1614735
Bug 1623949

Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-30 08:35:23 -07:00
Sumit Singh
1f99458ed3 gpu: nvgpu: Uncomment suspend/resume ops
Upstream has removed them, but we are still using these,
so uncomment these callback assignments.

Bug 200070810

Change-Id: I26a221f9d76f6acef70095eb8afcf440057f464c
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2015-06-29 11:42:30 +05:30
Sumit Singh
c65d41e768 gpu: nvgpu: Add DT support for gpu power-domain for T132
Make modification to add DT support for gpu
power-domain for T132 chip.

Bug 200070810

Change-Id: Iac63c8fb5fc5280e9a9f5758e63c9da009f3813d
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/739698
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-06-29 11:42:28 +05:30
Sumit Singh
c61ab8facb Revert "HACK: Disable genpd_pm_subdomain_attach"
This reverts commit 83699a4ec9ebf55f6cc12c76e57dad1d4ec2fbfa.

This hack was put in place because upstream removed the of_node
field from the generic_pm_domain structure. But as we are still
using it, remove this hack.

Bug 200100078

Change-Id: I14e533786fb814e361c580e2883ceff1f63d251f
2015-06-29 11:38:48 +05:30
Konsta Holtta
6085c90f49 gpu: nvgpu: add per-channel refcounting
Add reference counting for channels, and wait for reference count to
get to 0 in gk20a_channel_free() before actually freeing the channel.
Also, change free channel tracking a bit by employing a list of free
channels, which simplifies the procedure of finding available channels
with reference counting.

Each use of a channel must have a reference taken before use or held
by the caller. Taking a reference of a wild channel pointer may fail, if
the channel is either not opened or in the process of being closed. Also,
add safeguards protecting against accidental use of closed channels,
specifically, by setting ch->g = NULL in channel free. This will make it
obvious if a freed channel is attempted to be used.
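
The usage pattern this enforces, sketched with the
gk20a_channel_get()/gk20a_channel_put() pair described above:

  /* a wild channel pointer must be referenced before any use */
  struct channel_gk20a *ch = gk20a_channel_get(ch_ptr);

  if (!ch)
          return; /* not opened, or in the process of being closed */
  /* ... safe to use the channel here ... */
  gk20a_channel_put(ch);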

The last user of a channel might be the deferred interrupt handler,
so wait for deferred interrupts to be processed twice in the channel
free procedure: once for providing last notifications to the channel
and once to make sure there are no stale pointers left after further
referencing of the channel has been denied.

Finally, fix some races in channel and TSG force reset IOCTL path,
by pausing the channel scheduler in gk20a_fifo_recover_ch() and
gk20a_fifo_recover_tsg(), while the affected engines have been identified,
the appropriate MMU faults triggered, and the MMU faults handled. In this
case, make sure that the MMU fault does not attempt to query the hardware
about the failing channel or TSG ids. This should make channel recovery
safer also in the regular (i.e., not in the interrupt handler) context.

Bug 1530226
Bug 1597493
Bug 1625901
Bug 200076344
Bug 200071810

Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/448463
(cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353)
Reviewed-on: http://git-master/r/755147
Reviewed-by: Automatic_Commit_Validation_User
2015-06-09 11:13:43 -07:00
Leonid Moiseichuk
837ceffcab gpu: nvgpu: cyclestats mode E snapshots support
This is the kernel supporting code for cyclestats mode E.
Cyclestats mode E is implemented following the Windows design in
user-space and requires the following operations to be implemented:
- attach a client to the shared hardware buffer of a device
- detach a client from the shared hardware buffer
- flush, i.e. copy available data from the hardware buffer to private
  client buffers according to the perfmon IDs assigned to clients
- perfmon ID management for user-space clients
- a NVGPU_GPU_FLAGS_SUPPORT_CYCLE_STATS_SNAPSHOT capability added

Bug 1573150

Change-Id: I9e09f0fbb2be5a95c47e6d80a2e23fa839b46f9a
Signed-off-by: Leonid Moiseichuk <lmoiseichuk@nvidia.com>
Reviewed-on: http://git-master/r/740653
(cherry picked from commit 79fe89fd4cea39d8ab9dbef0558cd806ddfda87f)
Reviewed-on: http://git-master/r/753274
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-06 07:23:24 -07:00
Bharat Nihalani
b8aa486109 Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.

Bug 200112195

Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-04 10:41:00 -07:00
Bharat Nihalani
1d8fdf5695 Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes a GPU pbdma interrupt to be generated.

Bug 200106514

Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
2015-06-02 20:18:55 -07:00
Terje Bergstrom
e19d349858 gpu: nvgpu: Support >32bit addresses in simulation
Change-Id: I96282b4e047ba8b5369dac039f0f51856c69235b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/747935
(cherry-picked from commit 0bb090745b4122fc4149b1bd6026138a1b9a32bc)
Reviewed-on: http://git-master/r/749235
2015-06-01 08:16:28 -07:00
Alex Waterman
01f359f3f1 Revert "Revert "gpu: nvgpu: New allocator for VA space""
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.

The original commit was actually fine.

Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-19 13:09:00 -07:00
Deepak Nibade
7ddd6e261e gpu: nvgpu: wait for running jobs to finish before shutdown
In gk20a_pm_shutdown(), we currently call __pm_runtime_disable(),
which prevents h/w access by new requests made after the shutdown()
call. Also, once gk20a_pm_shutdown() completes, platform code will
just rail gate the GPU.

But it is possible that some other thread is already accessing the
h/w when shutdown() is triggered, and this can result in a hang.

Hence, wait until all currently executing jobs are finished
before returning from gk20a_pm_shutdown().

Also, we need to wait for the GPU's usage count to become 1, since
platform code will increase the usage count and then call
shutdown(). Hence a usage count of 1 indicates that the GPU is idle.
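
A sketch of the wait (the retry bound is an assumption;
dev->power.usage_count is the runtime PM reference counter):

  /* platform code took one reference before calling shutdown(),
     so a usage count of 1 means no job is holding the GPU busy */
  int retries = 50;

  while (atomic_read(&dev->power.usage_count) > 1 && retries--)
          msleep(20);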

Bug 200099940

Change-Id: I1f2457829e2737c07302d13f355353a30c3b4e67
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/734920
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-05-18 11:33:48 +05:30
Deepak Nibade
e7ea596363 Revert "gpu: nvgpu: Fix gk20a shutdown issue"
This reverts commit c18edd3686115ca0b7d8bb08b35f23264f865358.

Proper fixes for shutdown issue are being added with below changes
http://git-master/r/#/c/738509/
http://git-master/r/#/c/734920/

Hence revert this workaround

Bug 200099940

Change-Id: I74b29c804af2bdb9d95c6b93c5308a323575ae57
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/739082
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:39 +05:30
Deepak Nibade
d9854e4782 gpu: nvgpu: jump to fail path if pm_runtime_get_sync() fails
Currently we execute pm_runtime_get_sync() and then
gk20a_scale_notify_busy() without checking the return value of
pm_runtime_get_sync().

If a GPU shutdown has already been initiated, we get
a hard hang due to this, as per the sequence below:
- one thread invokes GPU shutdown and then forcibly rail
  gates the GPU
- another thread (unaware of shutdown) calls gk20a_busy()
- since runtime PM is disabled in the shutdown path,
  pm_runtime_get_sync() fails
- but we still go on running gk20a_scale_notify_busy(), which
  tries to access some GPU registers and hangs

Fix this by jumping to the failure path in case
pm_runtime_get_sync() fails.
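
The shape of the fix in gk20a_busy(), sketched with simplified error
handling:

  ret = pm_runtime_get_sync(dev);
  if (ret < 0)
          goto fail; /* shutdown disabled runtime PM; skip h/w access */

  gk20a_scale_notify_busy(dev);
  return 0;

  fail:
          pm_runtime_put_noidle(dev); /* balance the failed get */
          return ret;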

Bug 200099940

Change-Id: I022f2dfa9408f640fb44e6f4b10a437688779c0a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/738509
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:38 +05:30
Sami Kiminki
520ff00e87 gpu: nvgpu: Implement compbits mapping
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling (from
the GPU layout to the CDEH/CDEV formats) into userspace.

Compbits mapping is conservative and it may map more than what is
strictly needed. This is for two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. The cache line size is not necessarily a multiple of the small page
size.
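
As an illustration with made-up numbers: if an aggregate cache line
spans 16 comptag lines and a buffer needs only lines 2-3 of it, the
whole 16-line cache line must be mapped, and the mapping is then
rounded out to 4kB page boundaries on both ends - so noticeably more
compbits memory can become visible than the two lines strictly
require.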

Bug 200077571

Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:19 +05:30
Terje Bergstrom
d20afe7bd4 gpu: nvgpu: Dynamic betacb size
Allow querying and setting the default betacb size via debugfs. For global
buffers the value takes effect upon the first boot of the GPU, and has no
effect after that.

Bug 1628352

Change-Id: Ib63f4299249c41eab1b36cc501b525cc54211195
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/733328
2015-05-18 11:32:40 +05:30
Sumit Singh
96ffe0c64d gpu: nvgpu: Fix gk20a shutdown issue
With CONFIG_PM_GENERIC_DOMAINS_OF enabled, device reboot
was hanging while shutting down gk20a. This was happening
because genpd_dev_pm_detach() was railgating
gk20a while another thread was still accessing it.

So, assign NULL to dev->pm_domain->detach for gk20a,
so that genpd_dev_pm_detach() is not called during gk20a
shutdown and does not railgate it.

This patch will be reverted once we have clean shutdown
for gk20a.

Bug 200070810
Bug 200099940

Change-Id: Ie2e89ea01a98a9d4f2f68a3ab07b6923ffa374f6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
Reviewed-on: http://git-master/r/735455
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2015-05-18 11:31:58 +05:30
Deepak Nibade
c5f2d00d04 gpu: nvgpu: fix deadlock on railgate_lock during race condition
We have the below race condition between __gk20a_do_idle()
and the force_reset case:

- before execution of __gk20a_do_idle(), a process drops the last
  usage count of the GPU, which triggers the GPU railgate process
- but before the GPU is really railgated (there is a 500 ms delay),
  some process calls __gk20a_do_idle()
- in __gk20a_do_idle(), we first take railgate_lock
- then we check if the GPU is already railgated or not
- since it is not railgated yet (due to the 500 ms delay), this
  returns false
- then we call pm_runtime_get_noresume(), which just increases the
  usage counter
- in this particular case, this call just increases the usage count to
  1 from 0, whereas the GPU is already on its way to being railgated
- while we check if the GPU usage count drops to one, the GPU gets
  railgated
- now in the force_reset=true case, we end up calling
  pm_runtime_get_sync(), which takes railgate_lock _again_
  and tries to unrailgate the GPU
- this causes a deadlock on railgate_lock

To fix this, use the below sequence (sketched after the list):

- take railgate_lock
- check if the GPU is already railgated
- release railgate_lock
- call pm_runtime_get_sync(), which will keep the GPU active even if
  railgating is already triggered
- take railgate_lock again to prevent unrailgating in further
  processing
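
Sketched, with the 500 ms railgate delay elided (hook names per the
driver's platform struct, treated as assumptions):

  mutex_lock(&platform->railgate_lock);
  is_railgated = platform->is_railgated(dev);
  mutex_unlock(&platform->railgate_lock);

  /* safe now: keeps the GPU active (unrailgating it if needed)
     without holding railgate_lock */
  pm_runtime_get_sync(dev);

  /* block any further unrailgate while we idle the GPU */
  mutex_lock(&platform->railgate_lock);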

Also, add more descriptive comments to explain the flow

Bug 1624537

Change-Id: I0febc65d7bfac03ee738be200cf321322ffbe5a6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/719625
(cherry picked from commit 480284eda16e2b50ee6368bad3d15574e098b231)
Reviewed-on: http://git-master/r/719620
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-05-18 11:18:24 +05:30
Alex Van Brunt
900f63393d gpu: nvgpu: don't reset clk that doesn't exist
If the clock is null, calling the reset function will crash the
kernel. So, don't call the reset function.
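
A sketch of the guard (the clock field and legacy Tegra reset helper
are assumptions):

  /* the reset helper dereferences the clk; skip it when absent */
  if (g->clk.tegra_clk)
          tegra_periph_reset_assert(g->clk.tegra_clk);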

Change-Id: I37ef25c8dca67bec8bf6654eb6e275b866bdae53
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/742361
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-15 15:42:41 -07:00
Terje Bergstrom
aa25a952ea Revert "gpu: nvgpu: New allocator for VA space"
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.

Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2015-05-12 02:46:39 -07:00
Alex Waterman
a2e8523645 gpu: nvgpu: New allocator for VA space
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scalable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.

The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although that split is not removed quite yet, the new allocator
enables that to happen.

The buddy allocator is also very scalable. It manages spaces from the
relatively small comptag space to the enormous GPU VA space and
everything in between. This is important since the GPU has lots of
different sized spaces that need managing.

Currently there are certain limitations. For one, the allocator does
not handle the fixed allocations from CUDA very well. It can do so,
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.

Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:53:25 -07:00
Terje Bergstrom
b3a85df53b gpu: nvgpu: SMMU bypass
Improve GMMU mapping code to cope with discontiguous buffers.

Add a debugfs entry that allows bypassing the SMMU and disabling big pages.

Bug 1605769

Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:01 -07:00
Dan Willemsen
75b50e8588 HACK: Disable genpd_pm_subdomain_attach
Upstream doesn't keep track of the DT node in the genpd struct anymore.

Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
2015-04-04 19:17:41 -07:00