Commit Graph

1353 Commits

Author SHA1 Message Date
Konsta Holtta
1a63ca3a65 gpu: nvgpu: fall back to sysmem for generic allocs
In gk20a_gmmu_alloc_attr(), which is used for in-kernel allocations,
fall back to attempting to allocate sysmem when vidmem allocation fails.

Bug 1809939

Change-Id: I0397026fd1b3bc803f6d8bb7409e05ab31ec961d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1215447
(cherry picked from commit 3ec37992b830cee917e8ad35ede50e048907014a)
Reviewed-on: http://git-master/r/1217687
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:23:46 -07:00
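
The fallback described in 1a63ca3a65 can be sketched in plain userspace C. This is only an illustration: `mem_sketch`, `alloc_vidmem_sketch()`, `alloc_sysmem_sketch()` and the `vidmem_sketch_available` switch are hypothetical stand-ins, not the driver's gk20a_gmmu_alloc_attr() code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical aperture tag; the real driver's types differ. */
enum aperture_sketch { AP_INVALID, AP_VIDMEM, AP_SYSMEM };

struct mem_sketch {
    void *ptr;
    enum aperture_sketch aperture;
};

/* Test switch that lets the vidmem stand-in be forced to fail. */
bool vidmem_sketch_available = true;

void *alloc_vidmem_sketch(size_t size)
{
    return vidmem_sketch_available ? malloc(size) : NULL;
}

void *alloc_sysmem_sketch(size_t size)
{
    return malloc(size);
}

/* Try vidmem first; on failure fall back to sysmem. */
int alloc_with_fallback_sketch(struct mem_sketch *mem, size_t size)
{
    mem->ptr = alloc_vidmem_sketch(size);
    if (mem->ptr) {
        mem->aperture = AP_VIDMEM;
        return 0;
    }

    mem->ptr = alloc_sysmem_sketch(size);
    if (mem->ptr) {
        mem->aperture = AP_SYSMEM;
        return 0;
    }

    mem->aperture = AP_INVALID;
    return -ENOMEM;
}
```

Recording the aperture in the descriptor lets the caller free the buffer through the matching path later.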
Terje Bergstrom
eee2744d49 gpu: nvgpu: When powering down, abort if not idle
When trying to power down GPU the engine might be still busy. In this
case delay power down by returning -EBUSY from
gk20a_pm_runtime_suspend().

Bug 200224907

Change-Id: Ibad74c090add24a185bc1a7a02df367af9b95ced
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1213042
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:23:37 -07:00
Aingara Paramakuru
3366506072 gpu: nvgpu: move gpfifo submit wait to userspace
Instead of blocking for gpfifo space in the nvgpu driver,
return -EAGAIN and allow userspace to decide the blocking
policy.

Bug 1795076

Change-Id: Ie091caa92aad3f68bc01a3456ad948e76883bc50
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1202591
(cherry picked from commit 8056f422c6a34a4239fc4993c40c2e517c932714)
Reviewed-on: http://git-master/r/1203800
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-15 12:23:29 -07:00
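
A rough model of the split described in 3366506072: the submit path never blocks and reports -EAGAIN, and the caller picks its own waiting policy. All names here (`gpfifo_sketch`, `gpfifo_submit_sketch()`, `submit_with_retries()`) are hypothetical, and the bounded retry loop is just one possible userspace policy, not what any real client does.

```c
#include <errno.h>

struct gpfifo_sketch {
    unsigned int capacity; /* total entries in the ring */
    unsigned int used;     /* entries not yet consumed by the GPU */
};

/* Driver side: never blocks, reports -EAGAIN when there is no room. */
int gpfifo_submit_sketch(struct gpfifo_sketch *f, unsigned int n)
{
    if (n > f->capacity - f->used)
        return -EAGAIN;
    f->used += n;
    return 0;
}

/* Stand-in for "the GPU consumed everything"; for illustration only. */
void gpfifo_drain_sketch(struct gpfifo_sketch *f)
{
    f->used = 0;
}

/* Userspace side: retry a bounded number of times, waiting in between. */
int submit_with_retries(struct gpfifo_sketch *f, unsigned int n,
                        int max_retries,
                        void (*wait_for_space)(struct gpfifo_sketch *))
{
    int err = gpfifo_submit_sketch(f, n);

    while (err == -EAGAIN && max_retries-- > 0) {
        wait_for_space(f);
        err = gpfifo_submit_sketch(f, n);
    }
    return err;
}
```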
Konsta Holtta
b700d3a040 gpu: nvgpu: fix null access in page table allocation
Check entry->mem.sgt for validity before attempting to dereference it in
a debug print.

Bug 1809939

Change-Id: If7aa7444c162a076d8f23a88dfd2e3e0a9c33813
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1215522
(cherry picked from commit 48c25cd4f1db9d5bb07847af4de29d8f369b52e3)
Reviewed-on: http://git-master/r/1220547
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-14 14:13:55 -07:00
Konsta Holtta
6029684eb0 gpu: nvgpu: fix chunk size mismatch in page allocator
When allocating discontiguous memory composed of several chunks,
update also the number of pages used by the current chunk, if a large
chunk was not available and a retry is performed with a smaller one.
Failing to do this would result in too few chunks reserved for a large
enough allocation in certain conditions.

Bug 1805067

Change-Id: I9d14864724d228b42c47eb4669fbe0f789334397
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1214914
(cherry picked from commit 9bece931b13e4dad808622462d4d98d421cfb383)
Reviewed-on: http://git-master/r/1220546
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-14 14:13:40 -07:00
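
The accounting rule from 6029684eb0 can be shown with a small model. It assumes a hypothetical allocator in which any chunk larger than `max_chunk_pages` fails; the point is that `chunk_pages` itself shrinks on retry, so the remaining page count is reduced by what was actually reserved.

```c
#include <stddef.h>

/* Hypothetical availability model: chunks above max_chunk_pages fail. */
int try_alloc_chunk_sketch(size_t chunk_pages, size_t max_chunk_pages)
{
    return chunk_pages <= max_chunk_pages;
}

/*
 * Reserve `pages` pages as a series of power-of-two chunks.
 * Returns the number of pages actually reserved and stores the
 * number of chunks used in *nr_chunks.
 */
size_t alloc_discontig_sketch(size_t pages, size_t max_chunk_pages,
                              size_t *nr_chunks)
{
    size_t reserved = 0;

    *nr_chunks = 0;
    while (pages > 0) {
        size_t chunk_pages = 1;

        /* Start with the largest power of two that still fits. */
        while (chunk_pages * 2 <= pages)
            chunk_pages *= 2;

        /* Retry with smaller chunks, updating chunk_pages each time. */
        while (chunk_pages > 0 &&
               !try_alloc_chunk_sketch(chunk_pages, max_chunk_pages))
            chunk_pages /= 2;

        if (chunk_pages == 0)
            break; /* nothing available at any size */

        /* Account for the size that was really allocated. */
        reserved += chunk_pages;
        pages -= chunk_pages;
        (*nr_chunks)++;
    }
    return reserved;
}
```

Subtracting the originally requested chunk size instead of the shrunken one would end the loop early, which matches the "too few chunks" symptom in the commit message.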
Konsta Holtta
93d3199019 gpu: nvgpu: test free user vidmem atomically
An empty list of soon-to-be-freed userspace vidmem buffers is not enough
to safely assume that an allocation may succeed or not if tried again,
because removal from the list and actually marking the memory freed is
not atomic. Fix this by using an atomic counter for the number of
pending frees (so that it's still safe to first remove from the job list
and then perform the free), and making allocation attempts combined with
a test of pending frees atomic.

This still does not guarantee that there is memory available (as the
actual amount of pending memory in bytes plus the current free amount
isn't computed), but removes the race that produces false negatives in
case a single program expects repeated frees and allocs to succeed.

Bug 1809939

Change-Id: I6a92da2e21cbf3f886b727000c924d56f35ce55b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1217078
(cherry picked from commit 83c1f1e70dccd92fdd4481132cf5b6717760d432)
Reviewed-on: http://git-master/r/1220545
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-14 13:03:44 -07:00
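
A minimal userspace sketch of the idea in 93d3199019, using C11 atomics. It assumes the -EAGAIN / -ENOMEM convention from f31e575ed6 further down this log; the struct, the byte-count model of vidmem, and the tiny spinlock (standing in for whatever lock the driver uses) are all hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

struct vidmem_sketch {
    atomic_flag lock;         /* tiny spinlock, stand-in for a kernel lock */
    atomic_int pending_frees; /* buffers queued for freeing, not yet freed */
    size_t free_bytes;        /* memory currently available */
};

static void sketch_lock(struct vidmem_sketch *v)
{
    while (atomic_flag_test_and_set(&v->lock))
        ; /* spin */
}

static void sketch_unlock(struct vidmem_sketch *v)
{
    atomic_flag_clear(&v->lock);
}

/* A buffer has been handed to the worker but is not freed yet. */
void sketch_queue_free(struct vidmem_sketch *v)
{
    atomic_fetch_add(&v->pending_frees, 1);
}

/* Worker: release the memory first, then drop the pending count. */
void sketch_worker_free(struct vidmem_sketch *v, size_t bytes)
{
    sketch_lock(v);
    v->free_bytes += bytes;
    sketch_unlock(v);
    atomic_fetch_sub(&v->pending_frees, 1);
}

/* Sample the pending count and attempt the allocation as one step. */
int sketch_alloc(struct vidmem_sketch *v, size_t bytes)
{
    int pending, err = 0;

    sketch_lock(v);
    pending = atomic_load(&v->pending_frees);
    if (v->free_bytes >= bytes)
        v->free_bytes -= bytes;
    else
        err = pending ? -EAGAIN : -ENOMEM;
    sketch_unlock(v);
    return err;
}
```

Because the counter is decremented only after the memory is back, a failed allocation that saw a zero count cannot have raced with an in-flight free. As the commit message notes, -EAGAIN still does not promise that a retry will succeed.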
Peter Daifuku
54e22a2bae gpu: nvgpu: vgpu: NULL out unused css entries
Fix the cyclestats snapshot HAL entries in the vgpu case: the ones that
don't apply need to be NULLed out.

Bug 1700143
JIRA EVLR-278

Change-Id: I1b5f4652d1bf3283d96fdb3c2f66c4f69a9f6acc
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1217507
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2016-09-13 11:34:03 -07:00
Aingara Paramakuru
c0dd9ea9c8 gpu: nvgpu: use spinlock for ch timeout lock
The channel timeout lock guards a very small critical section. Use a
spinlock instead of a mutex for performance.

Bug 1795076

Change-Id: I94940f3fbe84ed539bcf1bc76ca6ae7a0ef2fe13
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1200803
(cherry picked from commit 4fa9e973da141067be145d9eba2ea74e96869dcd)
Reviewed-on: http://git-master/r/1203799
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-13 10:13:41 -07:00
Shardar Shariff Md
7ff4a760a8 gpu: nvgpu: change the usage of tegra_fuse_readl
tegra_fuse_readl() prototype is changed to match upstreamed
fuse driver, so change implementation accordingly.

Bug 200233653

Change-Id: I01f23cfafd5923d86ac48e67b36132ce690e962b
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1217374
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-09-12 23:09:16 -07:00
David Nieto
24c38aed59 gpu: nvgpu: fix pmu_copy_to_dmem spew
The error check was not taking account of
the DMEM address wrap-around

JIRA DNVGPU-34

Change-Id: Ibfed5532c3ee785b3061e6837f012939118a7ece
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1206460
(cherry picked from commit 080953c20f91068ccaaa564d9492a1582ffa28fe)
Reviewed-on: http://git-master/r/1218297
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-12 16:06:44 -07:00
Terje Bergstrom
2d35eee68f gpu: nvgpu: Call init_cbc only when defined
Call init_cbc only when it contains a non-NULL pointer.

Bug 1799537

Change-Id: Ic23f264e10daff30365bf3cf86ac9c155f50e497
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1208008
(cherry picked from commit ec69fa15c32f49d96939fd9a672faec45e078dfa)
Reviewed-on: http://git-master/r/1217298
Reviewed-by: Automatic_Commit_Validation_User
2016-09-12 16:06:44 -07:00
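
The guard described in 2d35eee68f is the usual pattern for optional entries in an ops table. A minimal sketch, with a hypothetical `ltc_ops_sketch` struct and callback signature that are not taken from the driver:

```c
#include <stddef.h>

/* Hypothetical ops table; a chip may leave init_cbc unset (NULL). */
struct ltc_ops_sketch {
    void (*init_cbc)(int *state);
};

/* Example callback used only for this illustration. */
void mark_cbc_initialized(int *state)
{
    *state = 1;
}

/* Call the hook only when the chip actually provides one. */
void call_init_cbc_sketch(const struct ltc_ops_sketch *ops, int *state)
{
    if (ops->init_cbc)
        ops->init_cbc(state);
}
```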
Vijayakumar Subbu
589179ad00 gpu: nvgpu: refactor pmu include
Split the pmu include files so that many more APIs can be added:
pmu_api.h - all the current APIs used in igpu
pmu_common.h - common defines for all APIs
pmu_gk20a.h - SW defines needed specifically for nvgpu,
like the PMU version, PMU SW structure definitions etc.
Splitting the APIs into separate files allows us to use
auto-generated PMU task headers from RM.

We have a script which generates pmu interface header files
in Linux format. It replaces RM with NV. Adding typedefs to the existing
pmu code makes the auto-generated files easy to compile and add.

JIRA DNVGPU-85

Change-Id: I851b88769fe8d60561a44754ddb7dde45b45959e
Signed-off-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1192702
Reviewed-on: http://git-master/r/1203124
(cherry picked from commit 0fe5f020c3f934cf2cc5336f1b6c3bafaf9e0c2a)
Reviewed-on: http://git-master/r/1217301
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 20:06:06 -07:00
Terje Bergstrom
7d44a8d8d8 gpu: nvgpu: Support mclk initialization
Add ops for calling mclk initialization.

JIRA DNVGPU-85

Change-Id: I2e9da80fdb014d916b40513d605c38711818d2f6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1203975
(cherry picked from commit 9be482c4ece7ffc550ae19f133638c808b3a768f)
Reviewed-on: http://git-master/r/1217300
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-08 20:06:06 -07:00
Mahantesh Kumbar
39c48cb8bf gpu: nvgpu: get bios perf and clk table ptr
Implement support for reading perf and clk tables from VBIOS.

JIRA DNVGPU-83

Change-Id: I095fea08479161362e4c2ffa7500ee6a57d6d447
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1202602
(cherry picked from commit fb7c7356f131a198bd655a25fc6ff17067477e1b)
Reviewed-on: http://git-master/r/1217299
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-08 20:05:58 -07:00
Terje Bergstrom
f56ed459dd gpu: nvgpu: Skip calling undefined prod callbacks
Fix rest of code to not call prod callbacks that are set to NULL.

Bug 1799537

Change-Id: I756bb1f7ef58ba753ac43a2be6f125107be3cf34
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1209133
(cherry picked from commit 5f4d7b42b6101407fde8c4a7dcdd3633eca85ae5)
Reviewed-on: http://git-master/r/1217297
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-08 20:05:49 -07:00
Terje Bergstrom
a0dd3ee5be gpu: nvgpu: Allocate vidmem fds from 1024
Allocate vidmem fds from 1024 onwards. This prevents us from
using up the 0-1023 range which is tracked per process, and
fits within FD_SETSIZE.

Bug 200222681

Change-Id: I104b81f2831f1816ff66fc245fa63013d78001ec
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1199269
(cherry picked from commit 5d5cbaf6a63dd31538fa35081b70e103d8a658f4)
Reviewed-on: http://git-master/r/1217294
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2016-09-08 20:05:41 -07:00
Terje Bergstrom
a222bc55b5 gpu: nvgpu: Do not print error on unknown engine
Unknown engine is expected, as we do not support all dGPU engines.
Remove the error spew.

JIRA DNVGPU-26

Change-Id: I6f7897c6ead168f1d8100421d16d0540a7f7b542
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1206449
(cherry picked from commit 4cc610755df94065afd28a90c63aca8fff9685b1)
Reviewed-on: http://git-master/r/1217292
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2016-09-08 20:05:40 -07:00
Peter Daifuku
9aa7de15c2 gpu: nvgpu: vgpu: cyclestat snapshot support
Add support for cyclestats snapshots in the virtual case

Bug 1700143
JIRA EVLR-278

Change-Id: I376a8804d57324f43eb16452d857a3b7bb0ecc90
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1211547
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 16:04:09 -07:00
Deepak Nibade
70cad5fbb5 gpu: nvgpu: unify nvgpu and pci probe
We have completely different versions of probe for
nvgpu and pci device
Extract out common steps into nvgpu_probe() function
and separate it out in new file nvgpu_common.c
Divide task of nvgpu_probe() into further smaller
functions

Do platform specific things (like irq handling,
memresource management, power management) only in
individual probes and then call nvgpu_probe() to
complete the common initialization

Move all debugfs initialization to common gk20a_debug_init()
This also helps to bringup all debug nodes to pci device

Pass debugfs_symlink name as a parameter to gk20a_debug_init()
This allows us to set separate debugfs symlink for nvgpu
and pci device

In case of railgating, cde and ce debugfs, check if
platform supports them or not

Copy vidmem_is_vidmem from platform to mm structure
and set it to true for pci device

Return from gk20a_scale_init() if we don't have either of
governor or qos_notifier

Fix gk20a_alloc_debugfs_init() and gk20a_secure_page_alloc()
to receive device pointer instead of platform_device

Export gk20a_railgating_debugfs_init() so that we can call
it from gk20a_debug_init()

Jira DNVGPU-56
Jira DNVGPU-58

Change-Id: I3cc048082b0a1e57415a9fb8bfb9eec0f0a280cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204207
(cherry picked from commit add6bb0a3d5bd98131bbe6f62d4358d4d722b0fe)
Reviewed-on: http://git-master/r/1204462
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:51 -07:00
Deepak Nibade
f31e575ed6 gpu: nvgpu: remove blocking wait for vidmem allocation
We have a blocking 1 sec wait for vidmem allocation
Remove this blocking wait and just return proper error
code to the caller

In case we have some buffers to be cleaned up in the
list (clear_list_head), return EAGAIN so that caller
can retry
Otherwise return ENOMEM indicating that no memory is
available right now

Jira DNVGPU-84

Change-Id: Ife2b17c989fc80e568f03bb18ad75b93a25be962
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204969
(cherry picked from commit 2bacdf0bc6d5b1cdcb8be37e574ca5f4f0663cae)
Reviewed-on: http://git-master/r/1213451
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:49 -07:00
Konsta Holtta
dc3976e4c3 gpu: nvgpu: use vidmem for gr ctx if available
Use the common gk20a_gmmu_alloc() that tries vidmem too.

Jira DNVGPU-24

Change-Id: I5dfd7eaab737a5290b4d21ac575d6b89777a567e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1209077
(cherry picked from commit e3085d37735c8f1cf4845621f29fe9d2689aad4b)
Reviewed-on: http://git-master/r/1184330
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:47 -07:00
Deepak Nibade
0ca01a3355 gpu: nvgpu: fix non-contiguous pramin access
In pramin_access_batched(), in each iteration of the
loop we first decide size of data that we should write
in that iteration.
In case this size is equal to length of the chunk, we
need to move to use next chunk for subsequent iteration

But since we change offset variable before we check
above, we end up using same chunk in next iteration

Fix this by correcting the sequence to first check if
we should move to next chunk and then only adjust
the offset variable

Jira DNVGPU-24

Change-Id: I58c2e24678f4c6dfbe33bf111edd06788629eca8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1210892
(cherry picked from commit 83cc179199692d28a93b3b884c9bc094ff513298)
Reviewed-on: http://git-master/r/1213450
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:46 -07:00
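
The ordering issue fixed in 0ca01a3355 can be illustrated with a simplified walk over chunks. This is not pramin_access_batched() itself: the `window` parameter (a cap on how much is handled per iteration), the chunk struct, and the `written` output array are assumptions made for the sketch.

```c
#include <stddef.h>

struct chunk_sketch {
    size_t length; /* bytes in this chunk */
};

/*
 * Spread `size` bytes over the chunks, at most `window` bytes per
 * iteration, and record how many bytes land in each chunk.
 */
void walk_chunks_sketch(const struct chunk_sketch *chunks, size_t nr_chunks,
                        size_t size, size_t window, size_t *written)
{
    size_t idx = 0, offset = 0, i;

    for (i = 0; i < nr_chunks; i++)
        written[i] = 0;
    if (window == 0)
        return; /* no progress would be possible */

    while (size > 0 && idx < nr_chunks) {
        size_t left = chunks[idx].length - offset;
        size_t n = size < left ? size : left;

        if (n > window)
            n = window;

        written[idx] += n;
        size -= n;

        /*
         * Decide whether the chunk is exhausted BEFORE touching offset;
         * only then either advance to the next chunk or adjust offset.
         */
        if (n == left) {
            idx++;
            offset = 0;
        } else {
            offset += n;
        }
    }
}
```

If `offset` were advanced before the comparison, the exhausted-chunk test would use the wrong remaining length and the next iteration would stay on the same chunk, which is the behavior the commit describes.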
Sri Krishna chowdary
4e0e13d92b gpu: nvgpu: suppress unbind operation
Unbind on nvgpu results in kernel panic.
Suppress it to avoid kernel panic.
Proper fix should follow later on.

bug 1779085

Change-Id: Ibc966ac031f7f04406db63310e2f5ea126649ac0
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/1212759
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
2016-09-01 23:44:54 -07:00
Deepak Nibade
f5895f44ea gpu: nvgpu: fix compilation errors for 32 bit arch
Converting return value of sg_dma_address() (which is u64)
into a pointer results in compilation failure on 32 bit
machines

Hence convert address first into uintptr_t and then into
pointer

Change-Id: I8e036af8f4c936b88883cf8af1491f03025ed356
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1211243
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:24 -07:00
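
The two-step conversion from f5895f44ea looks like this in portable C. The helper names are hypothetical; the sketch uses `uint64_t` from `<stdint.h>` where the kernel would use `u64`.

```c
#include <stdint.h>

/*
 * A direct (void *)addr cast from a 64-bit integer triggers a
 * "cast to pointer from integer of different size" diagnostic on
 * 32-bit targets. Going through uintptr_t makes the narrowing explicit.
 */
void *addr_to_ptr_sketch(uint64_t addr)
{
    return (void *)(uintptr_t)addr;
}

/* The reverse direction, for symmetry. */
uint64_t ptr_to_addr_sketch(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}
```

On a 32-bit target the intermediate cast silently drops the upper 32 bits, so this is only safe when the address is known to fit in a pointer.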
Deepak Nibade
f43231f7a5 gpu: nvgpu: enable big page support for pci
While mapping the buffer, first check if buffer is in
vidmem, and if yes convert allocation into base address
And then walk through each chunk to decide the alignment

Add new API gk20a_mm_get_align() which returns the
alignment based on scatterlist and aperture, and use
this API to get alignment during mapping

Enable big page support for pci by unsetting disable_bigpage

Jira DNVGPU-97

Change-Id: I358dc98fac8103fdf9d2bde758e61b363fea9ae9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1207673
(cherry picked from commit d14d42290eed4aa7a2dd2be25e8e996917a58e82)
Reviewed-on: http://git-master/r/1210959
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:15 -07:00
Deepak Nibade
737d634630 gpu: nvgpu: make default vidmem page size of 64k
Allocate 64k pages for vidmem by default
Also make sure that base address of vidmem is aligned
to page size

Jira DNVGPU-20

Change-Id: Ie2e5111f942467754db5b45f1518d72c925d3d19
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206405
(cherry picked from commit 542ebf7f571ba6dc631466e562f7d8e05df4a9a6)
Reviewed-on: http://git-master/r/1210958
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:08 -07:00
Konsta Holtta
4f6c989898 gpu: nvgpu: use vidmem for page tables if available
Use the common gk20a_gmmu_alloc() that tries vidmem too.

Jira DNVGPU-20

Change-Id: I4ea02bc4962d299c6f71444048d4a2a22bd80f55
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206404
(cherry picked from commit 7297727cce8c5c7b26f82afe98cc5428135b4777)
Reviewed-on: http://git-master/r/1178831
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:01 -07:00
Deepak Nibade
44c5b5877b gpu: nvgpu: add new API to get base address for sysmem/vidmem buffers
Add new API gk20a_mem_get_base_addr() which will return vidmem
base address in case of vidmem and IOVA address in case of
sysmem

Even though vidmem allocations are non-contiguous, this API
is useful (and should only be used) for allocations with one
chunk (e.g. page tables)

Also, since page tables could either reside in sysmem or vidmem,
use this API to get address of page tables

Jira DNVGPU-20

Change-Id: Ie04af9ca7bfccfec1a8a8e4be2c507cef5cef8e1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206403
(cherry picked from commit a8c74dc188878f2948fa1e0e47bf1837fba6c5e0)
Reviewed-on: http://git-master/r/1210957
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:53 -07:00
Deepak Nibade
93a436f581 gpu: nvgpu: allocate blob space early
Allocating blob space for pmu might need fixed address
allocation in vidmem and during boot up

But if some page tables are allocated before blob space,
blob space allocation could fail

Fix this by allocating blob space early during boot up

Jira DNVGPU-20

Change-Id: I30eca1023c8f8f8be101bb7e160ba57a7040911a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206402
(cherry picked from commit fad4309ce345ed3879f497bda27f2eceb1084dbb)
Reviewed-on: http://git-master/r/1210956
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:46 -07:00
Lakshmanan M
8de995d4af gpu: nvgpu: Add proper timeout handling for vidmem clear operations
The gk20a_fence_wait() API may be interrupted by a signal before its
timeout has actually elapsed. This CL adds a retry (-ERESTARTSYS) mechanism
for when gk20a_fence_wait() returns before its timeout has elapsed.

Bug 200230544

Change-Id: I347ed2004935a8b9413f95dcb6fca2b74bf49f2a
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1206265
(cherry picked from commit d3ef533942487785d84d109f985ae648eb3c2434)
Reviewed-on: http://git-master/r/1210955
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:35 -07:00
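
One way to picture the retry in 8de995d4af is a loop that keeps track of the remaining timeout. Everything here is a stand-in: `fence_wait_sketch()` only simulates interrupted waits, and `SKETCH_ERESTARTSYS` is defined locally because the kernel's ERESTARTSYS is not exposed in userspace `<errno.h>`.

```c
#include <errno.h>

#define SKETCH_ERESTARTSYS 512 /* local stand-in for the kernel value */

struct fence_sketch {
    int interrupts_left;   /* waits that will be cut short by a "signal" */
    long ms_per_interrupt; /* time that passes before each interruption */
};

/* Simulated wait: interrupted while interrupts_left > 0, then succeeds. */
int fence_wait_sketch(struct fence_sketch *f, long timeout_ms,
                      long *elapsed_ms)
{
    if (f->interrupts_left > 0) {
        f->interrupts_left--;
        *elapsed_ms = f->ms_per_interrupt < timeout_ms ?
                      f->ms_per_interrupt : timeout_ms;
        return -SKETCH_ERESTARTSYS;
    }
    *elapsed_ms = 0;
    return 0;
}

/* Retry interrupted waits until the overall timeout is used up. */
int fence_wait_retry_sketch(struct fence_sketch *f, long timeout_ms)
{
    long remaining = timeout_ms;

    while (remaining > 0) {
        long elapsed = 0;
        int err = fence_wait_sketch(f, remaining, &elapsed);

        if (err != -SKETCH_ERESTARTSYS)
            return err;
        remaining -= elapsed;
    }
    return -ETIMEDOUT;
}
```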
Konsta Holtta
89d1075f26 gpu: nvgpu: use vidmem for gpfifos if available
Use the common gk20a_gmmu_alloc() that tries vidmem too.

Jira DNVGPU-21

Change-Id: Ie22cb0f5ed70ec71567fc85d348b3526c9a32b02
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204304
(cherry picked from commit 07cb99baeb10194c520addd77517841a6f99df93)
Reviewed-on: http://git-master/r/1169310
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:24 -07:00
Deepak Nibade
713f1ddcdf gpu: nvgpu: support pramin access for non-contiguous vidmem
API pramin_access_batched() currently only supports contiguous
allocations.
Modify this API to support non-contiguous allocations from
page allocator as well

Update gk20a_mem_wr32() and gk20a_mem_rd32() to reuse
pramin_access_batched()

Use gk20a_memset() in gk20a_gmmu_free_attr_vid() to clear
vidmem pages for kernel buffers

Jira DNVGPU-30

Change-Id: I43630912f4837d8ebc6b9c58f4f427218ef9725b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204303
(cherry picked from commit 2f84f141d02fd2f641cb18a48896fb3ae5f7e51f)
Reviewed-on: http://git-master/r/1210954
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:07 -07:00
Deepak Nibade
9ebd051779 gpu: nvgpu: track allocator and user for each mem
Store allocator pointer for each mem_desc
This pointer should be used while freeing the mem
instead of assuming a common allocator

Add flag user_mem to mem_desc which will be set
only in case of User vidmem allocations

We will delay free of mem in worker only if this
flag is set on mem. Otherwise, we will free it
immediately
This is needed so that all kernel allocations
can work with both sysmem and vidmem

Jira DNVGPU-84

Change-Id: Ib9a9209b164bc56b7880448f86bd6d42b324cc86
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1203099
(cherry picked from commit 8f0b0122f36a0b6f1932fa9a98d7eb03b1f623d1)
Reviewed-on: http://git-master/r/1210953
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:54 -07:00
Deepak Nibade
50fec50bff gpu: nvgpu: fix memory leak in case of failure
In __gk20a_alloc_pages(), if we fail to allocate a chunk
we free previously allocated chunks in error path
But we do not free up the memory reserved in those chunks
which could lead to OOM situations

Fix this by calling gk20a_free() for each chunk in error
path

Jira DNVGPU-96

Change-Id: I68aa18d68a5282405016e688c790ccbc0c2a0d69
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1203098
(cherry picked from commit f096bd1675600f4e2fc2d686f2911bb945fbbf0b)
Reviewed-on: http://git-master/r/1210952
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-01 09:10:45 -07:00
Deepak Nibade
3b6819bdf4 gpu: nvgpu: disable sync_fence for CE jobs
We do not need sync_fence for CE jobs submitted in
gk20a_ce_execute_ops() since all the waiters of
fence are in kernel space only

Jira DNVGPU-84

Change-Id: Idad6c40abcefb86e60a5327bbbff6827b1ca33cc
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1201347
(cherry picked from commit e294b2d37cf79182bb9a255adb188eb6afa47c27)
Reviewed-on: http://git-master/r/1210951
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:42 -07:00
Deepak Nibade
6a48f4b335 gpu: nvgpu: clear vidmem buffers in worker
We clear buffers allocated in vidmem in buffer free path.
But to clear buffers, we need to submit CE jobs and this
could cause issues/races if free called from critical
path

Hence solve this by moving buffer clear/free to a worker

gk20a_gmmu_free_attr_vid() will now just put mem_desc into
a list and schedule a worker
And worker thread will traverse the list and clear/free
the allocations

In struct gk20a_vidmem_buf, mem variable is statically
allocated. But since we delay free of mem, convert this
variable into a pointer and allocate it dynamically

Since we delay free of vidmem memory, it is now possible
to face OOM conditions during allocations. Hence, while
allocating, block until we have sufficient memory
available, with an upper limit of 1 s

Jira DNVGPU-84

Change-Id: I7925590644afae50b6fc04c6e1e43bbaa1c220fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1201346
(cherry picked from commit b4dec4a30de2431369d677acca00e420f8e581a5)
Reviewed-on: http://git-master/r/1210950
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:31 -07:00
Deepak Nibade
f79639f618 gpu: nvgpu: clear whole vidmem on first allocation
We currently clear vidmem pages in gk20a_gmmu_alloc_attr_vid_at()
i.e. allocation path for each buffer
But since buffer allocation path could be latency critical,
clear whole vidmem first and before first User allocation
in gk20a_vidmem_buf_alloc()

And then clear buffer pages while releasing the buffer
In this way, we can ensure that vidmem pages are already cleared
during buffer allocation path

At a later stage, clearing of pages can be removed from free path
and moved to a separate worker as well

At this point, first allocation has overhead of clearing whole
vidmem which takes about 380 ms and this should improve once
clocks are raised.
Also, this is a one-time latency, and subsequent allocations
should not have any overhead for clearing at all

Add API gk20a_vidmem_clear_all() to clear whole vidmem
We have WPR buffers allocated during boot up and
at fixed address in vidmem.
To prevent overwriting to these buffers in gk20a_vidmem_clear_all(),
clear whole vidmem except for the bootstrap allocator carveout

Add new API gk20a_gmmu_clear_vidmem_mem() to clear one mem_desc

Jira DNVGPU-84

Change-Id: I5661700585c6241a6a1ddeb5b7c068d3d2aed4b3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1194301
(cherry picked from commit 950ab61a04290ea405968d8b0d03e3bd044ce83d)
Reviewed-on: http://git-master/r/1193158
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:20 -07:00
Alex Waterman
aa7f4bf251 gpu: nvgpu: Add a bootstrap vidmem allocator
Add an allocator for allocating vidmem before the CE has had a
chance to be initialized (and clear the rest of vidmem).

Jira DNVGPU-84

Change-Id: I5166607a712b3a6eb4c2906b8c7d002c68a6567b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1197204
(cherry picked from commit b4e68e84eedd952637b2332d8dc73a9090d6d62e)
Reviewed-on: http://git-master/r/1210949
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:10 -07:00
Deepak Nibade
c845b21012 gpu: nvgpu: support GMMU mappings for vidmem page allocator
Switch to use page allocator for vidmem

Support GMMU mappings for page (non-contiguous page allocator)
in update_gmmu_ptes_locked()
If aperture is VIDMEM, traverse each chunk in an allocation
and map it to GPU VA separately

Fix CE page clearing to support page allocator

Fix gk20a_pramin_enter() to get base address from new
allocator
Define API gk20a_mem_get_vidmem_addr() to get base address
of allocation. Note that this API should not be used if we
have more than 1 chunk

Jira DNVGPU-96

Change-Id: I725422f3538aeb477ca4220ba57ef8b3c53db703
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1199177
(cherry picked from commit 1afae6ee6529ab88cedd5bcbe458fbdc0d4b1fd8)
Reviewed-on: http://git-master/r/1197647
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:00 -07:00
Cory Perry
c38cc24e1a gpu: nvgpu: send only one event to the debugger
Event notifications on TSGs should only be sent to the channel that caused the
event to happen in the first place, not every channel in the tsg.  Any more and
the debugger will not be able to tell what channel actually got the event.
Worse yet, if all the channels in a tsg are bound to the same debug session
(as is the case with cuda-gdb), then multiple nvgpu events for the same gpu
event will be triggered, causing events to be buffered and the client to get
out of sync.

One gpu exception, one nvgpu event per tsg.

Bug 1793988

Signed-off-by: Cory Perry <cperry@nvidia.com>
Change-Id: I4efb83b0593bd1af38f2342c80793d9db56e42b1
Reviewed-on: http://git-master/r/1194203
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-01 08:03:51 -07:00
Alex Waterman
0ea97181f2 gpu: nvgpu: Fix error handling of __semaphore_bitmap_alloc()
The return from __semaphore_bitmap_alloc() is an int for which a
negative value indicates a failure. That return value was being
directly cast to an unsigned int before being checked for a
negative error code. This obviously isn't a good idea.

Coverity ID 38754

Change-Id: I50c0478e5504988b059e69b929e9c2e465df7cc0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1210317
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-31 14:04:12 -07:00
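
The pattern behind 0ea97181f2: keep the return value in a signed int until the error check is done, and convert to unsigned only afterwards. `bitmap_alloc_sketch()` is a hypothetical stand-in for __semaphore_bitmap_alloc(), which is not reproduced here.

```c
#include <errno.h>

/* Stand-in allocator: returns an index >= 0, or a negative error code. */
int bitmap_alloc_sketch(int force_error)
{
    return force_error ? -ENOMEM : 3;
}

/*
 * Casting the result straight to unsigned int would turn -ENOMEM into a
 * huge positive value, and a later "< 0" test could never be true.
 * Check the signed value first, then convert.
 */
int alloc_index_sketch(int force_error, unsigned int *idx)
{
    int ret = bitmap_alloc_sketch(force_error);

    if (ret < 0)
        return ret;

    *idx = (unsigned int)ret;
    return 0;
}
```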
Alex Waterman
9247b610d2 gpu: nvgpu: Fix possible overflow in buddy allocator
Fix a possible overflow in the buddy allocator's initialization code. In
practice it should never happen that pde size is greater than 32bits but
this makes coverity happy.

Coverity ID 54964

Change-Id: I886fd962bb3e9e328f7305bdcf69827979a39a21
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1210316
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-31 14:04:11 -07:00
Bharat Nihalani
a3452ea763 gpu: nvgpu: gk20a: Use spin_lock for jobs_lock
This is done to boost performance of the GPU submit time, which
is critical for compute use-cases.

Bug 200215465
Bug 1804898

Conflicts:
	drivers/gpu/nvgpu/gk20a/channel_gk20a.c

Change-Id: Ic4884ee4eac910b92b84a47fdc1b2e9f26b2f1f0
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/1199860
Reviewed-on: http://git-master/r/1209834
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-31 14:04:10 -07:00
David Pu
91241ca8e9 gpu: nvgpu: fix build error when CONFIG_DEBUG_FS=n
Add an 'ifdef CONFIG_DEBUG_FS' check to fix the following compilation error
when CONFIG_DEBUG_FS=n (which is used for Android 'production' builds):

mm_gk20a.c: In function 'gk20a_mm_debugfs_init':
mm_gk20a.c:4824:2: error: implicit declaration of function 'debugfs_create_x64' [-Werror=implicit-function-declaration]
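
A rough sketch of the guard pattern, assuming the debugfs call is simply compiled out when the option is off. The names ending in _sketch are hypothetical, and the real patch may place the guard differently (for example around the whole init function). CONFIG_DEBUG_FS is normally provided by the kernel configuration; it is left undefined here to mimic CONFIG_DEBUG_FS=n.

```c
#include <stdio.h>

#ifdef CONFIG_DEBUG_FS
/* Hypothetical stand-in for a debugfs helper; only exists when enabled. */
static int debugfs_create_x64_sketch(const char *name)
{
	printf("created debugfs node %s\n", name);
	return 1;
}
#endif

/*
 * Hypothetical init routine. The call is wrapped in the same guard as
 * the declaration, so builds without CONFIG_DEBUG_FS never reference
 * the missing function. Returns the number of nodes created.
 */
static int mm_debugfs_init_sketch(void)
{
	int created = 0;

#ifdef CONFIG_DEBUG_FS
	created += debugfs_create_x64_sketch("example_node");
#endif

	return created;
}
```

Compiling with -DCONFIG_DEBUG_FS enables both the helper and the call; without it, the function still builds and creates no nodes.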


Bug 1778001

Change-Id: I785288a37b96c391b84925d5971d2691cf80206e
Signed-off-by: David Pu <dpu@nvidia.com>
Reviewed-on: http://git-master/r/1210393
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-31 12:03:54 -07:00
Alex Waterman
cfa0f2a6ea gpu: nvgpu: Turn the debug macro back to pr_info
The debug prints from the allocators should be regular informational
prints rather than warnings.

Bug 1799159

Change-Id: Ic6e3c38fa286c4acd6fcba51dc59158dc2d655fc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1201372
(cherry picked from commit 107caf4ce68a7c76023ee1e66a98c5570f401059)
Reviewed-on: http://git-master/r/1208478
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-30 10:04:35 -07:00
Alex Waterman
e8a159defd gpu: nvgpu: Clarify comment in allocator code
One of the flags that is defined for allocators has not yet been
implemented. This clarifies the comment and explains why the flag
has been defined even though it is not yet implemented.

Bug 1799159

Change-Id: I1e84439d63ca391941cee8e5362ffd9cc959744b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1201371
(cherry picked from commit 8e6566b173f17d9c169a9fa0f6104f4bbf608dc1)
Reviewed-on: http://git-master/r/1208477
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-30 10:04:34 -07:00
Alex Waterman
9e43258438 gpu: nvgpu: Add checking in allocator functions
Add checks to make sure function pointers are valid before attempting
to call said function.

Also, ensure that any allocator created defines the following 3 functions
at minimum:

  alloc()
  free()
  fini()
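
The checks described above could look roughly like the sketch below. The struct layout, the optional base() callback, and all names ending in _sketch or starting with stub_ are assumptions made for illustration, not the driver's actual definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; only alloc/free/fini are mandatory. */
struct allocator_ops_sketch {
	unsigned long (*alloc)(void *priv, unsigned long len);
	void (*free)(void *priv, unsigned long addr);
	void (*fini)(void *priv);
	unsigned long (*base)(void *priv);	/* optional */
};

/* Reject an allocator that lacks any of the three required functions. */
static int allocator_ops_validate(const struct allocator_ops_sketch *ops)
{
	if (!ops || !ops->alloc || !ops->free || !ops->fini)
		return -EINVAL;

	return 0;
}

/* Optional callback: check the pointer before calling through it. */
static unsigned long allocator_base(const struct allocator_ops_sketch *ops,
				    void *priv)
{
	if (ops && ops->base)
		return ops->base(priv);

	return 0;
}

/* Trivial example callbacks so the table can be populated. */
static unsigned long stub_alloc(void *priv, unsigned long len)
{
	(void)priv;
	return len ? 0x1000UL : 0;
}

static void stub_free(void *priv, unsigned long addr)
{
	(void)priv;
	(void)addr;
}

static void stub_fini(void *priv)
{
	(void)priv;
}
```

A table that fills in stub_alloc, stub_free and stub_fini passes validation; leaving any of the three NULL makes allocator_ops_validate() return -EINVAL.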

Bug 1799159

Change-Id: I4cd3d5746ccb721c723a161c9487564846027572
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1200059
(cherry picked from commit e26557a49d7ca6629ada24f12a3be396b0ae22cd)
Reviewed-on: http://git-master/r/1208476
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-30 10:04:31 -07:00
Alex Waterman
9eac0fd849 gpu: nvgpu: Add debugging to the semaphore code
Add GPU debugging to the semaphore code.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I98466570cf8d234b49a7f85d88c834648ddaaaee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198594
(cherry picked from commit 420809cc31fcdddde32b8e59721676c67b45f592)
Reviewed-on: http://git-master/r/1153671
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-08-30 10:04:30 -07:00
Alex Waterman
0e69c6707b gpu: nvgpu: Add gpu_dbg_map_v message type
Add a new debug message type: gpu_dbg_map_v. This is used for mapping
messages that are not specifically memory map operations.

Also clean up the memory mapping debugging a bit since there was one
duplicate print and the memory map print was difficult to parse
visually. As a result the message has been modified to put the most
important information first in an easily readable format.

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ib19c9371ee958009ab5a2d89b9610e699d070ee2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198593
(cherry picked from commit 51dba53b06ca171cdb13d1707f2d026b0ce29f07)
Reviewed-on: http://git-master/r/1147670
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-08-30 10:04:23 -07:00
Alex Waterman
39624a04d8 gpu: nvgpu: Add semaphore debugging info
Add semaphore debugging information to the gk20a channel state
debug dump.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I7caafd4f6420e1c478be22e236513603c315ce5e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1198592
(cherry picked from commit 3fa247adf5fdd8c9b16a24fec00903fdc3abc90a)
Reviewed-on: http://git-master/r/1133793
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-08-30 10:04:13 -07:00