Commit Graph

213 Commits

Terje Bergstrom
c5367d7350 gpu: nvgpu: Do not pre-allocate cmdbuf entries
Do not preallocate cmdbuf tracking entries. Allocate them only when
needed.

Bug 200104160

Change-Id: I12f8392723c301a368af1e280893ff993480477f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743953
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/755148
2015-06-09 11:14:34 -07:00
Terje Bergstrom
de48cc83b4 gpu: nvgpu: Fix build problem by including slab.h
Change-Id: I2346ed48bf82c032194264bab999097bf86404e2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/745885
(cherry picked from commit aba5f1ce08c2813c8bd570f36a5d15aa2a1aeaf8)
Reviewed-on: http://git-master/r/753286
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:26:32 -07:00
Terje Bergstrom
0dc66952e4 gpu: nvgpu: Use vmalloc only when size >4K
When the allocation size is 4k or below, we should use kmalloc; vmalloc
should be used only for larger allocations.

Introduce nvgpu_alloc, which checks the size and decides which API to
use.
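
To illustrate the dispatch (a minimal sketch; the helper's exact
signature and the clear-on-alloc flag are assumptions):

    #include <linux/slab.h>
    #include <linux/vmalloc.h>

    /* Small allocations use kmalloc(); anything over a page falls back
     * to vmalloc() so large requests do not demand physically
     * contiguous memory. */
    static void *nvgpu_alloc(size_t size, bool clear)
    {
        if (size <= PAGE_SIZE)
            return clear ? kzalloc(size, GFP_KERNEL)
                         : kmalloc(size, GFP_KERNEL);
        return clear ? vzalloc(size) : vmalloc(size);
    }

    static void nvgpu_free(void *p)
    {
        kvfree(p);  /* frees both kmalloc'd and vmalloc'd pointers */
    }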

Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:24:11 -07:00
Bharat Nihalani
b8aa486109 Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.

Bug 200112195

Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-04 10:41:00 -07:00
Bharat Nihalani
1d8fdf5695 Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes a GPU pbdma interrupt to be generated.

Bug 200106514

Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
2015-06-02 20:18:55 -07:00
Alex Waterman
01f359f3f1 Revert "Revert "gpu: nvgpu: New allocator for VA space""
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.

The original commit was actually fine.

Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-19 13:09:00 -07:00
Konsta Holtta
16fc6e3931 gpu: nvgpu: protect missing sgl in gk20a_mem_phys
Return zero for a missing sgl (the sgt is already checked) instead of
attempting to dereference NULL. Those NULL conditions should be almost
nonexistent, and zero is not normally used as a valid address.

When gk20a_mem_phys() is called from gk20a_gr_get_chid_from_ctx() in an
ISR, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag is set. Returning plain zero
results in the expected behaviour.
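
A reduced sketch of the guard, with mem_desc trimmed to the one field
used here (the real struct carries more members):

    #include <linux/scatterlist.h>
    #include <linux/types.h>

    struct mem_desc {            /* reduced for illustration */
        struct sg_table *sgt;
    };

    /* Check sgt first, then the sgl inside it; a racing teardown may
     * have zeroed either, and returning 0 beats faulting in an ISR. */
    static inline u64 gk20a_mem_phys(struct mem_desc *mem)
    {
        if (mem->sgt && mem->sgt->sgl)
            return sg_phys(mem->sgt->sgl);
        return 0;
    }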

Change-Id: I7033979091951cba3e3004ddc7550cd327ad0baf
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/737759
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:27 +05:30
Sami Kiminki
520ff00e87 gpu: nvgpu: Implement compbits mapping
Implement NVGPU_AS_IOCTL_GET_BUFFER_COMPBITS_INFO for requesting info
on compbits-mappable buffers; and NVGPU_AS_IOCTL_MAP_BUFFER_COMPBITS,
which enables mapping compbits to the GPU address space of said
buffers. This, subsequently, enables moving comptag swizzling from GPU
to CDEH/CDEV formats to userspace.

Compbits mapping is conservative and may map more than what is
strictly needed. This is for two reasons: 1) mapping must be done with
small-page (4kB) alignment, and 2) GPU comptags are swizzled all
around the aggregate cache line, which means that the whole cache line
must be visible even if only some comptag lines are required from
it. Cache line size is not necessarily a multiple of the small page
size.
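
Illustrative arithmetic for the conservative window (the function name
and parameters are assumptions, not the driver's API):

    #include <linux/kernel.h>  /* ALIGN(), round_down(), DIV_ROUND_UP() */
    #include <linux/types.h>

    #define SMALL_PAGE_SIZE 4096ULL

    /* Round the requested compbit range out to the aggregate cache
     * line, then to the small page; both steps only grow the window. */
    static void compbits_window(u64 offset, u64 size, u64 cacheline,
                                u64 *win_offset, u64 *win_size)
    {
        u64 start = offset - (offset % cacheline);
        u64 end = DIV_ROUND_UP(offset + size, cacheline) * cacheline;

        start = round_down(start, SMALL_PAGE_SIZE);
        end = ALIGN(end, SMALL_PAGE_SIZE);

        *win_offset = start;
        *win_size = end - start;
    }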

Bug 200077571

Change-Id: I5ae88fe6b616e5ea37d3bff0dff46c07e9c9267e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/719710
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:33:19 +05:30
Konsta Holtta
c19c046446 gpu: nvgpu: protect missing sgt in gk20a_mem_phys
Return zero for a missing sgt instead of attempting to dereference
NULL. Those NULL conditions should be almost nonexistent, and zero is
not normally used as a valid address.

When gk20a_mem_phys() is called from gk20a_gr_get_chid_from_ctx() in an
ISR, the mem desc may race with channel deletion and get suddenly
zeroed, even if the channel's in_use flag is set. Returning plain zero
results in the expected behaviour.

Change-Id: Id8ce37798d6fd3ceeb96a3f521c82569fccf30aa
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/729006
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-18 11:19:24 +05:30
Terje Bergstrom
aa25a952ea Revert "gpu: nvgpu: New allocator for VA space"
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.

Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
2015-05-12 02:46:39 -07:00
Alex Waterman
a2e8523645 gpu: nvgpu: New allocator for VA space
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scalable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.

The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although that split is not removed quite yet, the new allocator
enables that to happen.

The buddy allocator is also very scalable. It manages everything from
the relatively small comptag space to the enormous GPU VA space, and
everything in between. This is important since the GPU has many
differently sized spaces that need managing.

Currently there are certain limitations. For one, the allocator does
not handle the fixed allocations from CUDA very well. It can do so,
but with certain caveats: the PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.
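
A toy sketch of the buddy idea, not the driver's implementation: blocks
split in half on demand, so metadata grows with what is allocated
rather than with the span of the whole VA space:

    #include <linux/slab.h>
    #include <linux/types.h>

    struct buddy {
        u64 start;           /* base GPU VA of this block       */
        u32 order;           /* block spans (min_size << order) */
        struct buddy *left;  /* halves; NULL until split        */
        struct buddy *right;
    };

    /* Split a block into its two "buddies" (error handling omitted). */
    static void buddy_split(struct buddy *b, u64 min_size)
    {
        u64 half = min_size << (b->order - 1);

        b->left = kzalloc(sizeof(*b->left), GFP_KERNEL);
        b->right = kzalloc(sizeof(*b->right), GFP_KERNEL);
        b->left->start = b->start;
        b->left->order = b->order - 1;
        b->right->start = b->start + half;
        b->right->order = b->order - 1;
    }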

Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:53:25 -07:00
Alex Waterman
0566aee853 gpu: nvgpu: WAR for simulator bug
On linsim, when push buffers are allowed to be allocated with small
pages above 4GB, the simulator crashes. This patch ensures that on
linsim all small-page allocations are forced to be below 4GB in the
GPU VA space. With that constraint the simulator no longer crashes.

This bug came up because the GPU buddy allocator work generates
allocations at the top of the address space first. Thus push buffers
were located between 12GB and 16GB in the GPU VA space.

Change-Id: Iaef0af3fda3f37ac09a66b5e1179527d6fe08ccc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740728
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:52:09 -07:00
Alex Waterman
4d405809a9 gpu: nvgpu: Reduce BAR1 kernel size
Reduce the BAR1 size in the kernel to match the size reserved in the
DTB. The mismatch caused problems for the buddy allocator: the
allocator can sometimes allocate from higher memory before lower memory
in the managed space, which would cause the kernel to access unmapped
memory.

Change-Id: I70b72ef5bb4db01253e5087757051ef852e99bc6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740726
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-11 08:51:08 -07:00
Sami Kiminki
8d6fe0f2ef gpu: nvgpu: Implement compbits padding for mapping
Implement NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS, which adds
extra alignment to compbits allocation for safe compbits mapping.

Bug 200077571

Change-Id: I3a74ebb81412e4e1e69501debeb9ef4e2056ef1a
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/730763
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/740693
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2015-05-11 08:50:49 -07:00
Terje Bergstrom
b3a85df53b gpu: nvgpu: SMMU bypass
Improve GMMU mapping code to cope with discontiguous buffers.

Add a debugfs entry that allows bypassing the SMMU and disabling big
pages.
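
Why discontiguous buffers need work when the SMMU is bypassed, as a
sketch (map_one() stands in for the per-chunk GMMU update and is
hypothetical): with no SMMU there is no single contiguous IOVA, so
each scatterlist segment is mapped at its own offset.

    #include <linux/scatterlist.h>
    #include <linux/types.h>

    static void map_sgt_chunks(struct sg_table *sgt, u64 gpu_va,
                               void (*map_one)(u64 va, u64 phys, u64 len))
    {
        struct scatterlist *sg;
        u64 offset = 0;
        int i;

        for_each_sg(sgt->sgl, sg, sgt->nents, i) {
            map_one(gpu_va + offset, sg_phys(sg), sg->length);
            offset += sg->length;
        }
    }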

Bug 1605769

Change-Id: I14d32c62293a16ff8c7195377c75a85fa8061083
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717503
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/737533
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2015-05-05 13:59:01 -07:00
Terje Bergstrom
852822b2ef gpu: nvgpu: Record size of page table level
Record the size of each page table level. The size of level 0 depends
on the size of the address space, and we generally do not support the
whole address space.

Change-Id: Iab47505af1a641e193d9e98a2246e522813f221a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/729730
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-on: http://git-master/r/737531
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:58:52 -07:00
Terje Bergstrom
2204f2a524 gpu: nvgpu: Use common allocator for patch
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: Idf51831e8be9cabe1ab9122b18317137fde6339f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721030
Reviewed-on: http://git-master/r/737530
Reviewed-by: Alexander Van Brunt <avanbrunt@nvidia.com>
Tested-by: Alexander Van Brunt <avanbrunt@nvidia.com>
2015-05-05 13:57:34 -07:00
Terje Bergstrom
1b7b271980 gpu: nvgpu: Use common allocator for context
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: I10c226e2377aa867a5cf11be61d08a9d67206b1d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/720507
2015-04-04 19:17:24 -07:00
Deepak Nibade
38fc3a48a0 gpu: nvgpu: add platform specific get_iova_addr()
Add a platform-specific API pointer, (*get_iova_addr)(), which can be
used to get the IOVA/physical address from a given scatterlist and
flags.

Use this API via g->ops.mm.get_iova_addr() instead of calling
gk20a_mm_iova_addr() directly; the indirection makes the lookup
platform-specific.
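
The shape of the indirection, reduced for illustration (the ops struct
layout is an assumption; names follow this message):

    #include <linux/scatterlist.h>
    #include <linux/types.h>

    struct gk20a;

    struct gpu_mm_ops {
        u64 (*get_iova_addr)(struct gk20a *g,
                             struct scatterlist *sgl, u32 flags);
    };

    /* Per-platform probe code installs its implementation, e.g.
     *     g->ops.mm.get_iova_addr = gk20a_mm_iova_addr;
     * and every caller dispatches through the pointer. */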

Bug 1605653

Change-Id: I798763db1501bd0b16e84daab68f6093a83caac2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/713089
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 19:02:17 -07:00
Terje Bergstrom
42d17018b4 gpu: nvgpu: Use common allocator for compbit store
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: I7c1662b669ed8c86465254f6001e536141051ee5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/720435
2015-04-04 19:01:53 -07:00
Terje Bergstrom
90e42e424a gpu: nvgpu: Helper for no kernel mapping alloc
Reduce the amount of duplicate code around memory allocation by
introducing a variant of the allocation helper that does not map the
allocated buffer into the kernel address space.

NO_KERNEL_MAPPING allocations return a struct page **, so store the
result of the allocation in a new field of mem_desc.
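
A sketch using the DMA-attrs API of this kernel generation (the helper
name is illustrative):

    #include <linux/dma-attrs.h>
    #include <linux/dma-mapping.h>

    static void *alloc_no_kernel_mapping(struct device *dev, size_t size,
                                         dma_addr_t *iova)
    {
        DEFINE_DMA_ATTRS(attrs);

        dma_set_attr(DMA_ATTR_NO_KERNEL_MAPPING, &attrs);
        /* No kernel VA is created; the return value is an opaque
         * cookie (a struct page ** on this platform) that mem_desc
         * keeps for the matching dma_free_attrs() call. */
        return dma_alloc_attrs(dev, size, iova, GFP_KERNEL, &attrs);
    }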

Bug 1605769

Change-Id: Ib760b9e6d34b229b04d1fb4f3abf10648670fc69
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/721029
2015-04-04 19:01:37 -07:00
Terje Bergstrom
92b667b888 gpu: nvgpu: Use common allocator for GPFIFO
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: I81701427ae29b298039a77f1634af9c14237812e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/719872
2015-04-04 19:01:37 -07:00
Terje Bergstrom
2fc4169293 gpu: nvgpu: Use common allocator for cmd queue
Reduce the amount of duplicate code around memory allocation by using
common helpers and a common data structure for storing the results of
allocations.

Bug 1605769

Change-Id: If93063acbbfaa92aef530208241988427b5df8eb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/719871
2015-04-04 19:01:36 -07:00
Terje Bergstrom
7290a6cbd5 gpu: nvgpu: Implement common allocator and mem_desc
Introduce mem_desc, which holds all information needed for a buffer.
Implement helper functions for allocation and freeing that use this
data type.
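
A reduced sketch of the container (fields inferred from the commit
messages in this log, not the exact layout):

    #include <linux/scatterlist.h>
    #include <linux/types.h>

    struct mem_desc {
        void *cpu_va;         /* kernel mapping, if any           */
        struct page **pages;  /* set by NO_KERNEL_MAPPING allocs  */
        struct sg_table *sgt; /* physical layout                  */
        dma_addr_t iova;      /* DMA address                      */
        size_t size;
        u64 gpu_va;           /* GPU VA once mapped into a vm     */
    };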

Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/712699
2015-04-04 18:59:26 -07:00
Seshendra Gadagottu
cf0085ec23 gpu:nvgpu: add support for unmapped ptes
Add support for unmapped PTEs during GMMU map.

Bug 1587825

Change-Id: I6e42ef58bae70ce29e5b82852f77057855ca9971
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/696507
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:57:59 -07:00
Konsta Holtta
5f6cc1289e Revert "gpu: nvgpu: cache cde compbits buf mappings"
This reverts commit 9968badd26490a9d399f526fc57a9defd161dd6c. The commit
accidentally introduced some memory leaks.

Change-Id: I00d8d4452a152a8a2fe2d90fb949cdfee0de4c69
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/714288
Reviewed-by: Juha Tukkinen <jtukkinen@nvidia.com>
2015-04-04 18:57:23 -07:00
Seshendra Gadagottu
b653c8f62f gpu: nvgpu: add bar2 memory descriptor
Save the BAR2 memory descriptor in the mm data structure.

Bug 1587825

Change-Id: I3063c3e28a4e583c2d2c099402077d2d25fc6e68
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/682101
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:52 -07:00
Seshendra Gadagottu
173aceb5b7 gpu: nvgpu: setup chip specific mm hw init
Add support for setting up MM HW init per SoC.

Bug 1587825

Change-Id: Ie5c5e49a767cfb14e3dbbb6902349284cd3dca95
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/681784
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:52 -07:00
Konsta Holtta
593c7a3f30 gpu: nvgpu: cache cde compbits buf mappings
Don't unmap compbits_buf from the system vm early; instead, store it in
the dmabuf's private data and unmap it later, when all user mappings to
that buffer have disappeared.

Bug 1546619

Change-Id: I333235a0ea74c48503608afac31f5e9f1eb4b99b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/661949
(cherry picked from commit ed2177e25d9e5facfb38786b818330798a14b9bb)
Reviewed-on: http://git-master/r/661835
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:24 -07:00
Terje Bergstrom
f3a920cb01 gpu: nvgpu: Refactor page mapping code
Always pass the directory structure to mm functions instead of
pointers to its members. Also split update_gmmu_ptes_locked() into
smaller functions, and turn the hard-coded MMU levels (PDE, PTE) into
run-time parameters.
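
A sketch of describing levels as data (the shape is illustrative; the
real struct also carries per-level update callbacks):

    #include <linux/types.h>

    struct gk20a_mmu_level {
        int hi_bit;        /* top VA bit this level decodes    */
        int lo_bit;        /* bottom VA bit this level decodes */
        size_t entry_size; /* bytes per entry at this level    */
    };

    /* One indexing helper serves every level, so the walker can loop
     * over an array of levels instead of special-casing PDE and PTE. */
    static u32 level_index(u64 va, const struct gk20a_mmu_level *l)
    {
        return (u32)((va >> l->lo_bit) &
                     ((1ULL << (l->hi_bit - l->lo_bit + 1)) - 1));
    }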

Change-Id: I315ef7aebbea1e61156705361f2e2a63b5fb7bf1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/672485
Reviewed-by: Automatic_Commit_Validation_User
2015-04-04 18:08:16 -07:00
Konsta Holtta
3877adcd65 gpu: nvgpu: add hw perfmon buffer mapping ioctls
Map/unmap buffers for HWPM and deal with its instance block; this is
the minimum work required to run the HWPM via regops from userspace.

Bug 1517458
Bug 1573150

Change-Id: If14086a88b54bf434843d7c2fee8a9113023a3b0
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/673689
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:08:02 -07:00
Terje Bergstrom
f9fd5bbabe gpu: nvgpu: Unify PDE & PTE structs
Introduce a new struct gk20a_mm_entry. Allocate and store PDE and PTE
arrays using the same structure. Pass a pointer to this struct between
functions in the memory code whenever possible.

Change-Id: Ia4a2a6abdac9ab7ba522dafbf73fc3a3d5355c5f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/696414
2015-04-04 18:07:35 -07:00
Terje Bergstrom
a3b26f25a2 gpu: nvgpu: TLB invalidate after map/unmap
Always invalidate TLB after mapping or unmapping, and remove the
delayed TLB invalidate.

Change-Id: I6df3c5c1fcca59f0f9e3f911168cb2f913c42815
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/696413
Reviewed-by: Automatic_Commit_Validation_User
2015-04-04 18:06:37 -07:00
Konsta Holtta
5b6e8995b2 gpu: nvgpu: use vm param to vm_map/unmap_buffer
Pass the vm instead of the as share to the userspace buffer-mapping
functions, since they also need to be called from places other than
the AS device ioctls, and the as share is specific to those.

Bug 1573150

Change-Id: I994872f23ea7b1582361f3f4fabbd64b4786419c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/674020
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:04:48 -07:00
Terje Bergstrom
4aef10c950 gpu: nvgpu: Set compression page per SoC
Compression page size varies depending on architecture. Make it 128kB
on gk20a and gm20b.

Also export some common functions from gm20b.

Bug 1592495

Change-Id: Ifb1c5b15d25fa961dab097021080055fc385fecd
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/673790
2015-04-04 18:04:45 -07:00
Seshendra Gadagottu
f4883ab97a gpu:nvgpu: add bar2 aperture support
Bug 1587825

Change-Id: I884c6b268aabb04b4990713395ebedf92410e02a
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/659239
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:04:38 -07:00
Konsta Holtta
2dda8077ec gpu: nvgpu: unify instance block initialization
Create gk20a_init_inst_block() to reduce reg write clutter when
initializing instance blocks, which is done in several places.

Change-Id: Idcb8b604851a849e0bb6abce5743c9f4cbf98033
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/672434
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:04:35 -07:00
Alex Waterman
a99bbc5f60 gpu: nvgpu: make larger address space work
Implement several fixes to allow the GVA address space to grow beyond
32GB, and increase the address space to 128GB.

 o  Implement dynamic allocation of PDE backing pages. The memory
    to store the PDE entries was hard coded to 1 page. Now the
    number of pages necessary is computed dynamically based on the
    size of the address space and the size of large pages.

 o  Fix an arithmetic problem in the gm20b sparse texture code
    that caused large address spaces to be truncated when sparse
    PDEs/PTEs were being filled in. This caused a kernel panic
    when freeing the address space since a lot of the backing
    PTE memory was not allocated.

 o  Change the address space split for large and small pages. Small
    pages now occupy the bottom 16GB of the address space. Large
    pages are used for the rest of the address space. Now, with a
    128GB address space, there are 112GB of large page GVA available.

This patch exists to allow large (16GB) sparse textures to be
allocated without running into out-of-memory issues and kernel panics.
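
Illustrative arithmetic behind the first fix (entry size and per-PDE
coverage are assumptions): with a 37-bit (128GB) VA and 128MB covered
per PDE, 1024 entries of 8 bytes need 8kB of backing, i.e. two pages
rather than the single page previously hard coded.

    #include <linux/mm.h>    /* PAGE_ALIGN(), PAGE_SHIFT */
    #include <linux/types.h>

    static u32 pde_backing_pages(u32 va_bits, u32 pde_cov_bits,
                                 u32 entry_bytes)
    {
        u64 bytes = (1ULL << (va_bits - pde_cov_bits)) * entry_bytes;

        return (u32)(PAGE_ALIGN(bytes) >> PAGE_SHIFT);
    }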

Bug 1574267

Change-Id: I7c59ee54bd573dfc53b58c346156df37a85dfc22
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/671204
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 18:02:38 -07:00
Terje Bergstrom
5df3d09e16 gpu: nvgpu: gm20b: Enable CTA preemption
CTA preemption needs to be enabled by setting a value in the context.
Set it for gm20b.

Bug 200063473
Bug 1517461

Change-Id: I080cd71b348d08f834fd23ebbe7443dba79224db
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/661299
2015-04-04 15:06:45 -07:00
Konsta Holtta
4ccb162da7 gpu: nvgpu: unify instance block creation
Reduce copy-paste code in instance block allocation and deletion with
helper functions purposed for that.

Change-Id: I2c8ae6a317ac89e2c857dde4296cb4316b8aaafe
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/668698
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 15:06:38 -07:00
Janne Hellsten
57eeccb4a5 gpu: nvgpu: use u32 for priv_cmd_queue get/put/size
Switch to a larger integer type for priv_cmd_queue get/put/size
fields.  The previous 16-bit int type overflowed on >= 2048 gpfifo
buffer sizes.  This triggered a div-by-zero kernel panic.
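
The failure mode in miniature (the per-entry size is assumed for
illustration):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t entries = 2048;     /* gpfifo size that triggered it */
        uint32_t entry_bytes = 32;   /* assumed                       */
        uint32_t size = entries * entry_bytes; /* 65536               */
        uint16_t stored = (uint16_t)size;      /* wraps to 0          */

        printf("real size %u, stored in u16: %u\n", size, stored);
        /* any "get % stored" wrap-around arithmetic then divides by 0 */
        return 0;
    }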

Bug 1592391

Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Change-Id: Ibffcbbd145f39fdb4a63d05b1dcb42bb4b101795
Reviewed-on: http://git-master/r/667103
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-04-04 15:06:05 -07:00
Terje Bergstrom
0bc513fc46 gpu: nvgpu: Remove gk20a sparse texture & PTE freeing
Remove support for gk20a sparse textures. We're using the user-space
implementation, so the gk20a code is never invoked.

Also remove the ref_cnt for PTEs, so we never free PTEs when unmapping
pages, but only at VM delete time.

Change-Id: I04d7d43d9bff23ee46fd0570ad189faece35dd14
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/663294
2015-03-18 12:12:32 -07:00
Terje Bergstrom
0d9bb7f82e gpu: nvgpu: Per-chip context creation
Add HAL for context creation, and expose functions that T18x context
creation needs.

Bug 1517461
Bug 1521790
Bug 200063473

Change-Id: I63d1c52594e851570b677184a4585d402125a86d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/660237
2015-03-18 12:12:27 -07:00
Terje Bergstrom
5477d0f4c2 gpu: nvgpu: Generic mem_desc & allocation
Make mem_desc a generic container for buffers. Add functions that
allocate and map buffers into an address space and store their data in
a mem_desc.

Change-Id: I031643442c6fd41f5e7222fe9b7bfcaf9b784db5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/660908
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
2015-03-18 12:12:27 -07:00
Terje Bergstrom
2d71d633cf gpu: nvgpu: Physical page bits to be per chip
Retrieve the number of physical page bits based on the chip.

Bug 1567274

Change-Id: I5a0f6a66be37f2cf720d66b5bdb2b704cd992234
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/601700
2015-03-18 12:12:19 -07:00
Jussi Rasanen
529962911c gpu: nvgpu: cde: Combine H and V passes
When using CDE firmware v1, combine H and V swizzling passes into one
pushbuffer submission. This removes one GPU context switch, almost
halving the time taken for swizzling.

Map only the compbit part of the destination surface.

Bug 1546619

Change-Id: I95ed4e4c2eefd6d24a58854d31929cdb91ff556b
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/553234
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:12:06 -07:00
Terje Bergstrom
2eb6dcb469 gpu: nvgpu: Implement 64k large page support
Implement support for 64kB large page size. Add an API to create an
address space via IOCTL so that we can accept flags, and assign one
flag for enabling 64kB large page size.

Also add APIs to set the per-context large page size. This is
possible only on Maxwell, so return an error if the caller tries to
set the large page size on Kepler.

Default large page size is still 128kB.
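
A sketch of the Maxwell-only check (names and the exact rule are
assumptions):

    #include <linux/errno.h>
    #include <linux/sizes.h>
    #include <linux/types.h>

    static int validate_big_page_size(bool is_maxwell, u32 big_page_size)
    {
        if (big_page_size == SZ_128K) /* default, allowed everywhere */
            return 0;
        if (!is_maxwell)              /* Kepler: 128kB only          */
            return -EINVAL;
        if (big_page_size != SZ_64K)
            return -EINVAL;
        return 0;
    }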

Change-Id: I20b51c8f6d4a984acae8411ace3de9000c78e82f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:11:46 -07:00
Terje Bergstrom
ecc6f27fd1 gpu: nvgpu: Common VM initializer
Merge initialization code from gk20a_init_system_vm(),
gk20a_init_bar1_vm() and gk20a_vm_alloc_share() into gk20a_init_vm().

Remove redundant page size data, and move the page size fields to be
VM specific.

Bug 1558739
Bug 1560370

Change-Id: I4557d9e04d65ccb48fe1f2b116dd1bfa74cae98e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:11:46 -07:00
Konsta Holtta
719923ad9f gpu: nvgpu: rename gpu ioctls and structs to nvgpu
To help remove the nvhost dependency from nvgpu, rename ioctl defines
and structures used by nvgpu such that nvhost is replaced by nvgpu.
Duplicate some structures as needed.

Update header guards and such accordingly.

Change-Id: Ifc3a867713072bae70256502735583ab38381877
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542620
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:11:33 -07:00
Lauri Peltonen
41f6befed0 gpu: nvgpu: Support ZBC color tracking
The compression state tracking user-space API already accepts and
returns the ZBC color used for the surface. Actually store the color
in the kernel so that the feature works.

Bug 1536227
Bug 1524301

Change-Id: I264e1eeb90f0c4d40fe35fc2479b0ce83e19a7d7
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/497476
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Tested-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-03-18 12:11:15 -07:00