Commit Graph

195 Commits

Author SHA1 Message Date
Konsta Holtta
dc45473eeb gpu: nvgpu: use mem_desc in priv_cmd_entry
Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a
reference to the underlying mem_desc (within priv_cmd_queue) paired with
an offset, for buffer aperture flexibility.

JIRA DNVGPU-21
JIRA DNVGPU-23

Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145922
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-18 11:54:34 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD

Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h

Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)

API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute

Make necessary changes in code to support two preemption
modes

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Thomas Fleury
93678f571c gpu: nvgpu: Add trace and debugfs for sched params
JIRA EVLR-244
JIRA EVLR-318

Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-05 09:25:02 -07:00
Konsta Holtta
98679b0ba4 gpu: nvgpu: adapt gk20a_mm_entry for mem_desc
For upcoming vidmem refactor, replace struct gk20a_mm_entry's contents
identical to struct mem_desc, with a struct mem_desc member. This makes
it possible to use the page table buffers like the others too.

JIRA DNVGPU-23
JIRA DNVGPU-20

Change-Id: I714ee5dcb33c27aaee932e8e3ac367e84610b102
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139694
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-03 13:27:22 -07:00
Alex Waterman
8891fd8267 gpu: nvgpu: Add gk20a_gmmu_fixed_map() function
Add a function to allow the kernel to do fixed mappings. Necessary
for the semaphore functionality since there needs to be a common
address in each VM for the semaphores.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I2b451db2d3cb3c003d951f7b0ffc87f6c91db7dc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133789
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-29 14:43:48 -07:00
Terje Bergstrom
b10e02f537 gpu: nvgpu: Program NISO sysmem flush addr
Program sysmem flush address to prevent random accesses of
address 0.

Change-Id: I886170395f036805f02e0bce7ecd3c8c46b921df
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-04-25 08:16:10 -07:00
Terje Bergstrom
7d8e219389 gpu: nvgpu: Use sysmem aperture for SoC memory
In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it
has to be accessed as sysmem.

Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
2016-04-15 08:50:34 -07:00
Alex Van Brunt
a596f0bd6d debugfs: Pass bool pointer to debugfs_create_bool()
Port the change 621a5f7ad9cd1ce7933f1d302067cbd58354173c from kernel.org
to the nvgpu driver.

bug 200187033

Change-Id: I7d742f614161d9d4ed59c4216d7c730d57ef4116
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1118397
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-14 06:51:52 -07:00
Terje Bergstrom
9b5427da37 gpu: nvgpu: Support GPUs with no physical mode
Support GPUs which cannot choose between SMMU and physical
addressing.

Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-04-13 13:12:41 -07:00
Peter Daifuku
6eeabfbdd0 gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch
Add support for SMPC and HWPM context switching when virtualized

Bug 1648200
JIRASW EVLR-219
JIRASW EVLR-253

Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1122034
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-08 12:34:50 -07:00
Terje Bergstrom
e8bac374c0 gpu: nvgpu: Use device instead of platform_device
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.

Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
2016-04-08 09:42:41 -07:00
Peter Daifuku
37155b65f1 gpu: nvgpu: support for hwpm context switching
Add support for hwpm context switching

Bug 1648200

Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1120450
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 11:05:49 -07:00
Vishal Annapurve
cd29c45e67 gpu: nvgpu: Fix compilation with CONFIG_DEBUG_FS disabled
This change fixes issues with kernel compilation when
CONFIG_DEBUG_FS is disabled.

Bug 1737085

Change-Id: I74719674d07ae071e3df99b0dda249b54173f40b
Signed-off-by: Vishal Annapurve <vannapurve@nvidia.com>
Reviewed-on: http://git-master/r/1024167
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sandeep Trasi <strasi@nvidia.com>
2016-03-29 02:00:58 -07:00
Alex Waterman
fbc21ed2ee gpu: nvgpu: split address space for fixed allocs
Allow a special address space node to be split out from the
user adress space or fixed allocations. A debugfs node,

  /d/<gpu>/separate_fixed_allocs

Controls this feature. To enable it:

  # echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs

Where <SPLIT_ADDR> is the address to do the split on in the
GVA address range. This will cause the split to be made in
all subsequent address space ranges that get created until it
is turned off. To turn this off just echo 0x0 into the same
debugfs node.

Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1030303
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-25 13:19:17 -07:00
Konsta Holtta
db7095ce51 gpu: nvgpu: bitmap allocator for comptags
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.

The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.

This commit reverts partially the combination of the above commit and
also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed
a bug which is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is possible,
so that is also restored.

The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.

Bug 200145635

Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 17:44:27 -08:00
Terje Bergstrom
9812bd5eea gpu: nvgpu: Control comptagline assignment from kernel
On Maxwell comptaglines are assigned per 128k, but preferred big page
size for graphics is 64k. Bit 16 of GPU VA is used for determining
which half of comptagline is used.

This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VA. In one mapping the page might be mapped to
lower half of 128k comptagline, and in another mapping the page might
be mapped to upper half.

Turn on mode where MSB of comptagline in PTE is used instead of bit 16
for determining the comptagline lower/upper half selection.

Bug 1704834

Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-01-05 07:50:02 -08:00
Deepak Nibade
2d40ebb1ca gpu: nvgpu: rework private command buffer free path
We currently allocate private command buffers (wait_cmd
and incr_cmd) before submitting the job but we never
free them explicitly.
When private command queue of the channel is full, we
then try to recycle/remove free command buffers.
But this recycling happens during submit path, and
hence that particular submit path takes much longer

Rework this as below :
- add reference of command buffers to job structure
- when job completes, free the command buffers
  explicitly
- remove the code to recycle buffers since it should
  not be needed now

Note that command buffers need to be freed in order of
their allocation. Ensure this with error print before
freeing the command buffer entry

Bug 200141116
Bug 1698667

Change-Id: Id4b69429d7ad966307e0d122a71ad55076684307
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/827638
(cherry picked from commit c6cefd69b71c9b70d6df5343b13dfcfb3fa99598)
Reviewed-on: http://git-master/r/835802
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-23 08:33:01 -08:00
Sami Kiminki
9d2c9072c8 gpu: nvgpu: User-space managed address space support
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.

When an address space is marked as userspace-managed, the following
changes are in effect:

- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
  except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
  ref increments at kickoffs and decrements at job completion are
  skipped.

Bug 1614735
Bug 1623949
Bug 1660392

Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-18 09:45:07 -08:00
Sami Kiminki
30632cec54 gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows the
userspace to be agnostic to the dmabuf implementation, as the generic
dmabuf fd interface does not have a reliable way for buffer
identification.

Bug 1614735
Bug 1623949
Bug 1660392

Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-17 12:38:44 -08:00
Konsta Holtta
411c3a9a4f gpu: nvgpu: use a separate big vm for cde
Allocate a separate VM for CDE channels instead of using the system
(PMU) vm, and make it much bigger than the PMU's to fit the maximum
number of CDE channels there.

Bug 1566740

Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/815149
Reviewed-by: Automatic_Commit_Validation_User
2015-11-10 23:19:45 -08:00
Terje Bergstrom
8452348539 gpu: nvgpu: Do not use G_ELPG_FLUSH
G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
instead.

Bug 1698618

Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/828656
(cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
Reviewed-on: http://git-master/r/829415
2015-11-10 10:33:39 -08:00
Terje Bergstrom
cccd038f8d Revert "gpu: nvgpu: Implement sparse PDEs"
This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
introduces a regression in T124.

Bug 1702063

Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/830786
GVS: Gerrit_Virtual_Submit
2015-11-09 10:11:23 -08:00
Terje Bergstrom
4b5c08f4c0 gpu: nvgpu: Implement sparse PDEs
Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/758651
GVS: Gerrit_Virtual_Submit
2015-10-30 16:36:06 -07:00
Aingara Paramakuru
ee18a3ae26 gpu: nvgpu: vgpu: re-factor gr ctx management
Move the gr ctx management to the GPU HAL. Also,
add support for a new interface to allocate gr ctxsw
buffers.

Bug 1677153

Change-Id: I5a7980acf4de0de7dbd94b7dd20f91a6196dc989
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/806961
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/817009
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-22 07:39:56 -07:00
Seshendra Gadagottu
5b7b59714a gpu: nvgpu: add support to remove bar2 mm
Adding support to remove bar2 mm on gpu module
remove.

Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/814571
(cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c)
Reviewed-on: http://git-master/r/815429
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-12 18:22:12 -07:00
Jussi Rasanen
f4b6d4d176 gpu: nvgpu: fix ctag computation overflow with 8GB
Bug 1689976

Change-Id: I97ad14c9698030b630d3396199a2a5296c661392
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/806590
(cherry picked from commit c90cd5ee674d6357db3be2243950ff0d81ef15ef)
Reviewed-on: http://git-master/r/808249
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:30:23 -07:00
Sami Kiminki
eade809c26 gpu: nvgpu: Separate kernel and user GPU VA regions
Separate the kernel and userspace regions in the GPU virtual address
space. Do this by reserving the last part of the GPU VA aperture for
the kernel, and extend GPU VA aperture accordingly for regular address
spaces. This prevents the kernel polluting the userspace-visible GPU
VA regions, and thus, makes the success of fixed-address mapping more
predictable.

Bug 200077571

Change-Id: I63f0e73d4c815a4a9fa4a9ce568709974690ef0f
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/747191
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-07 12:37:15 -07:00
Alex Waterman
ca26ce6f8c gpu: nvgpu: Increase VA space to 40 bits
Now that the buddy allocator is merged we can increase the VA space
without dramatically increasing memory usage by the allocator.

40 bits is the max VA space available on gk20a and gm20b.

Change-Id: I7bc8d86e35b28f041e9a435f2571c8288970c8ee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/745076
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/771152
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
2015-07-20 11:33:07 -07:00
Terje Bergstrom
63714e7cc1 gpu: nvgpu: Implement priv pages
Implement support for privileged pages. Use them for kernel allocated buffers.

Change-Id: I720fc441008077b8e2ed218a7a685b8aab2258f0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/761919
2015-07-03 17:59:12 -07:00
Sami Kiminki
e7ba93fefb gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.

Bug 1614735
Bug 1623949

Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-30 08:35:23 -07:00
Alex Waterman
099af76674 gpu: nvgpu: Remove simulation WAR
The WAR put into simulation to avoid a simulator crash can now be
removed (c85be1a0968de813fe9b99ebd5c261dcb0ca8875). The first issue
with the failing test was found to be GPFIFO entries that were not
invalid.

Other issues are still present with the test and are fixed in a
later commit.

Change-Id: I7d3def2e384eede82cfc82b961f09ca23b239d30
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/753378
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/755815
Reviewed-by: Automatic_Commit_Validation_User
2015-06-11 10:16:47 -07:00
Terje Bergstrom
c5367d7350 gpu: nvgpu: Do not pre-allocate cmdbuf entries
Do not preallocate cmdbuf tracking entries. Allocate them only when
needed.

Bug 200104160

Change-Id: I12f8392723c301a368af1e280893ff993480477f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743953
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/755148
2015-06-09 11:14:34 -07:00
Terje Bergstrom
de48cc83b4 gpu: nvgpu: Fix build problem by including slab.h
Change-Id: I2346ed48bf82c032194264bab999097bf86404e2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/745885
(cherry picked from commit aba5f1ce08c2813c8bd570f36a5d15aa2a1aeaf8)
Reviewed-on: http://git-master/r/753286
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:26:32 -07:00
Terje Bergstrom
0dc66952e4 gpu: nvgpu: Use vmalloc only when size >4K
When allocation size is 4k or below, we should use kmalloc. vmalloc
should be used only for larged allocations.

Introduce nvgpu_alloc, which checks the size, and decides the API
to use.

Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:24:11 -07:00
Bharat Nihalani
b8aa486109 Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.

Bug 200112195

Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-04 10:41:00 -07:00
Bharat Nihalani
1d8fdf5695 Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.

Bug 200106514

Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
2015-06-02 20:18:55 -07:00
Alex Waterman
01f359f3f1 Revert "Revert "gpu: nvgpu: New allocator for VA space""
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.

The original commit was actually fine.

Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-19 13:09:00 -07:00
Konsta Holtta
16fc6e3931 gpu: nvgpu: protect missing sgl in gk20a_mem_phys
Return zero for missing sgl (sgt is already checked) instead of
attempting to dereference NULL. Those NULL conditions should be almost
nonexistent, and zero is not normally used.

When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddendly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.

Change-Id: I7033979091951cba3e3004ddc7550cd327ad0baf
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/737759
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:27 +05:30
Sami Kiminki
520ff00e87 gpu: nvgpu: Implement compbits mapping
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.

Compbits mapping is conservative and it may map more than what is
strictly needed. This is because two reasons: 1) mapping must be done
on small page alignment (4kB), and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.

Bug 200077571

Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:19 +05:30
Konsta Holtta
c19c046446 gpu: nvgpu: protect missing sgt in gk20a_mem_phys
Return zero for missing sgt instead of attempting to dereference NULL.
Those NULL conditions should be almost nonexistent, and zero is not
normally used.

When reading gk20a_mem_phys() in gk20a_gr_get_chid_from_ctx() from an
isr, the mem desc may race with channel deletion and get suddendly
zeroed, even if the channel's in_use flag would be set. Plain zero
results in expected behaviour.

Change-Id: Id8ce37798d6fd3ceeb96a3f521c82569fccf30aa
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/729006
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:24 +05:30
Terje Bergstrom
aa25a952ea Revert "gpu: nvgpu: New allocator for VA space"
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.

Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2015-05-12 02:46:39 -07:00
Alex Waterman
a2e8523645 gpu: nvgpu: New allocator for VA space
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scaleable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.

The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although, that split is not removed quite yet, the new allocator
enables that to happen.

The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.

Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.

Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:53:25 -07:00
Alex Waterman
0566aee853 gpu: nvgpu: WAR for simulator bug
On linsim, when the push buffers are allowed to be allocated with small
pages above 4GB the simulator crashes. This patch ensures that for
linsim all small page allocations are forced to be below 4GB in the
GPU VA space. By doing so the simulator no longer crashes.

This bug has come up because the GPU buddy allocator work generates
allocations at the top of the address space first. Thus push buffers
were located at between 12GB and 16GB in the GPU VA space.

Change-Id: Iaef0af3fda3f37ac09a66b5e1179527d6fe08ccc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740728
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:52:09 -07:00
Alex Waterman
4d405809a9 gpu: nvgpu: Reduce BAR1 kernel size
Reduce the BAR1 size in the kernel to match the reserved size in the
DTB. This caused problems for the buddy allocator since the allocator
can sometimes allocate from higher memory before lower memory in the
managed space. This would cause the kernel to access unmapped memory.

Change-Id: I70b72ef5bb4db01253e5087757051ef852e99bc6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740726
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:51:08 -07:00
Sami Kiminki
8d6fe0f2ef gpu: nvgpu: Implement compbits padding for mapping
Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds
extra alignment to compbits allocation for safe compbits mapping.

Bug 200077571

Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/730763
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/740693
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-11 08:50:49 -07:00
Terje Bergstrom
b3a85df53b gpu: nvgpu: SMMU bypass
Improve GMMU mapping code to cope with discontiguous buffers.

Add debugfs entry that allows bypassing SMMU and disabling big pages.

Bug 1605769

Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:01 -07:00
Terje Bergstrom
852822b2ef gpu: nvgpu: Record size of page table level
Record size of each page table level. The size of level 0 depends
on size of the address space, and we generally do not support the
whole address space.

Change-Id: Iab47505af1a641e193d9e98a2246e522813f221a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729730
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-on: http://git-master/r/737531
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:58:52 -07:00
Terje Bergstrom
2204f2a524 gpu: nvgpu: Use common allocator for patch
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.

Bug 1605769

Change-Id: Idf51831e8be9cabe1ab9122b18317137fde6339f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721030
Reviewed-on: http://git-master/r/737530
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:57:34 -07:00
Terje Bergstrom
1b7b271980 gpu: nvgpu: Use common allocator for context
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.

Bug 1605769

Change-Id: I10c226e2377aa867a5cf11be61d08a9d67206b1d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/720507
2015-04-04 19:17:24 -07:00