Comptags are assigned per 128k. For 64k pages this means we need to
assign same index to two pages. Change the logic to use byte sizes.
Change-Id: If298d6b10f1c1cad8cd390f686d22db103b02d12
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/594403
When using CDE firmware v1, combine H and V swizzling passes into one
pushbuffer submission. This removes one GPU context switch, almost
halving the time taken for swizzling.
Map only the compbit part of the destination surface.
Bug 1546619
Change-Id: I95ed4e4c2eefd6d24a58854d31929cdb91ff556b
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/553234
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Trying to kfree pmu.vm or bar1.vm is not allowed, since they are not
directly allocated. Separate the vm kfree from the actual vm support
removal, so that they can be done individually.
Change-Id: I7628f546b94e0de909371ce315e4cb065e5ef953
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/592112
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Some vm's do not have big pages.
Bug 1476801
Change-Id: Ic82ca7a1380834ea30582631af224c81fd01e4bb
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/592113
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Fix below sparse warnings :
warning: Using plain integer as NULL pointer
warning: symbol <variable/funcion> was not declared. Should it be static?
warning: Initializer entry defined twice
Also, remove dead functions
Bug 1573254
Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593363
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Allow compression always when page size matches the big page
size for the context.
Bug 1558739
Change-Id: I27b0aed06c24d69bd1555626b9affb1149536615
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/590422
Convert GR and LTC HALs to use const structs, and initialize them
with macros.
Bug 1567274
Change-Id: Ia3f24a5eccb27578d9cba69755f636818d11275c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/590371
VM size should not depend on CPU architecture. It should be always
u64.
Change-Id: I81539807f6674877fd04f0079b2bec05b2a0640d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/562466
Implement support for 64kB large page size. Add an API to create an
address space via IOCTL so that we can accept flags, and assign one
flag for enabling 64kB large page size.
Also adds APIs to set per-context large page size. This is possible
only on Maxwell, so return error if caller tries to set large page
size on Kepler.
Default large page size is still 128kB.
Change-Id: I20b51c8f6d4a984acae8411ace3de9000c78e82f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Merge initialization code from gk20a_init_system_vm(),
gk20a_init_bar1_vm() and gk20a_vm_alloc_share() into gk20a_init_vm().
Remove redundant page size data, and move the page size fields to be
VM specific.
Bug 1558739
Bug 1560370
Change-Id: I4557d9e04d65ccb48fe1f2b116dd1bfa74cae98e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
gk20a_vm_map_buffer() used to ignore silently map requests for
non-dmabuf fd:s. Fix this by returning the error code from
dma_buf_get().
Bug 1566862
Change-Id: If01b03f43b67b17d9fb997d914db871520f50c6e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
When validating buffers to be mapped, check that the buffer end does not
overflow over the virtual address node space.
Bug 1562361
Change-Id: I3c78ec7380584ae55f1e6bf576f524abee846ddd
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Move nvgpu ioctls from the many user space interface headers to a new
single nvgpu.h header under include/uapi. No new code or replaced names
are introduced; this change only moves the definitions and changes
include directives accordingly.
Bug 1434573
Change-Id: I4d02415148e437a4e3edad221e08785fac377e91
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542651
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
To help remove the nvhost dependency from nvgpu, rename ioctl defines
and structures used by nvgpu such that nvhost is replaced by nvgpu.
Duplicate some structures as needed.
Update header guards and such accordingly.
Change-Id: Ifc3a867713072bae70256502735583ab38381877
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542620
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Expose supported nvgpu ioctls to userspace via bits in the flags field
of nvhost_gpu_characteristics; currently define two bits for special
memory allocation support.
Bug 1539747
Change-Id: I1bc9333b12825d07a00b7a4136ae9d35816a5855
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/495942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
dma_buf_get returns PTR_ERRs, so fix checking for null to proper IS_ERR
in gk20a_vm_map_buffer. Buffer mapping from user space with ioctls would
also have paniced here if an improper handle would be passed.
Change-Id: I245fe41cd209e49fc9265e56340c1c8215ffb1d2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/498320
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Alloc space writes the page size to a field that requires pgsz_idx.
That can cause corruption in internal kernel structures.
Clear_sparse treated a parameter as page size instead of index.
Bug 1549451
Change-Id: I73ce17b99aae6865056facce72d2ab9ca8b3f81d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/495692
Regenerate clock gating lists. Add new blocks, and takes them into
use. Also moves some clock gating settings to be applied at the
earliest possible moment right after reset.
Change-Id: I21888186c200f7a477c63bd3332e8ed578f63741
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/457698
The nvgpu driver now supports using the Tegra graphics virtualization
interfaces to support gk20a in a virtualized environment.
Bug 1509608
Change-Id: I6ede15ee7bf0b0ad8a13e8eb5f557c3516ead676
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/440122
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Current implementation is based on config GK20A_PHYS_PAGE_TABLES
to have APIs to create/free/map/unmap phys pages
Remove this config based implementaion and move the APIs so that
they are called at runtime based on tegra_platform_is_linsim()
In generic APIs, we first check if platform is linsim and if it
is then we forward the call to phys page specific APIs
Change-Id: I23eb6fa6a46b804441f18fc37e2390d938d62515
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/488843
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Gk20a unmaps the addresses binding to dummy page to clear sparse.
On Gm20b, we need to free the allocated page table entry for sparse
memory.
Bug 1538384
Change-Id: Ie2409ab016c29f42c5f7d97dd7287b093b47f9df
Signed-off-by: Kevin Huang <kevinh@nvidia.com>
Reviewed-on: http://git-master/r/448645
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This patch adds mm helpers to access compression backing store
from in-kernel shader.
Bug 1409151
Change-Id: Icb4f6dc0b5a35fdb97bc4221ab3657866f775fae
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/440263
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com>
GVS: Gerrit_Virtual_Submit
In simulation and emulation 50ms is not enough to ensure a job is
complete. Bump it to 5s when not running on silicon.
Bug 1510751
Change-Id: I90883b70ce2a75a8f07344f713d647b3fa0d0c7d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/432044
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Chris Dragan <kdragan@nvidia.com>
Tested-by: Chris Dragan <kdragan@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
VPR resize is done by forcing GPU to idle and then updating
VPR size from TLK.
There is no need now to call vpr_resize funtion from kernel
and hence these functions can be removed.
Bug 1487804
Change-Id: I758a6e0a99a58757866f1138b0a89594e2a33908
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/421703
(cherry picked from commit 391d9bacf053fe0dacffc76c36768f82912ad1f4)
Reviewed-on: http://git-master/r/419612
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove redundant cache maintenance operations. Instance blocks and
graphics context buffers are uncached, so they do not need any cache
maintenance.
Bug 1421824
Change-Id: Ie0be67bf0be493d9ec9e6f8226f2f9359cba9f54
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406948
PMU, FECS and GPCCS use the same address space. We used to initialize
the address space only if PMU is enabled. Create the system address
space always.
FECS and GPCCS used to have slower bit bang and faster DMA method
for loading ucode. Slower method is needed when FECS and GPCCS do not
have an address space. Remove the slower method as not anymore
needed.
Change-Id: I155619741ecc36aa6bf13a9c1ccb03c7c1330f0a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406771
This patch addresses two issues in fixes offset mappings:
- VA unmapping did not use lists safely. This caused an application hang
if the application did not free all (fixed offset) buffers before quiting.
- GPU was not powered closing AS node. If the address space had areas that
were not freed, the driver tried to access hw without powering it up first.
Change-Id: Ida526d222ea4e03b8d765eca16574ddc1823e60d
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/405872
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
If DMA address is not defined, use the physical address.
Bug 1500983
Change-Id: Ic33b21f74c8c2760e43146b87eec7ea467fc87be
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
(cherry picked from commit 8ae9a6567349241ce1cfff383526b0d9d39c28a1)
Reviewed-on: http://git-master/r/415238
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
TLB invalidate can have a race if several contexts use the same
address space. One thread starting an invalidate allows another
thread to submit before invalidate is completed.
Bug 1502332
Change-Id: I074ec493eac3b153c5f23d796a1dee1d8db24855
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/407578
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
This patch modifies the code to fall-back to 4k pages if the current
VA does not support 128k pages.
Bug 1409151
Change-Id: I94e9ca5953740388db689bc9306b0392191e29d2
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
On 64 bit architecture, we have plenty of vmalloc space
so that we can keep all the ptes mapped always
But on 32 but architecture, vmalloc space is limited and
hence we have to map/unmap ptes when required
Hence add new APIs for arch64 and call them if
IS_ENABLED(CONFIG_ARM64) is true
Bug 1443071
Change-Id: I091d1d6a3883a1b158c5c88baeeff1882ea1fc8b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/387642
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add two fb flushes after bar1 bind. This should hang the thread instead
of whole system in case there is a BAR1 hang.
Change-Id: I2385a243711219297b889daa30c9fc81106e5825
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/390183