Alloc space writes the page size to a field that requires pgsz_idx.
That can cause corruption in internal kernel structures.
Clear_sparse treated a parameter as a page size instead of a page size index.
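For clarity, a minimal sketch of the distinction (all names here are
hypothetical, not the actual nvgpu code):

    /* Hypothetical sketch: the field expects an index into the
     * page-size table, not the page size in bytes. */
    enum gmmu_pgsz_idx { GMMU_PAGE_SIZE_SMALL = 0, GMMU_PAGE_SIZE_BIG = 1 };

    static u32 page_size_to_idx(u32 page_size)
    {
        return page_size == SZ_128K ? GMMU_PAGE_SIZE_BIG
                                    : GMMU_PAGE_SIZE_SMALL;
    }

    /* Buggy: va_node->pgsz_idx = page_size;                   */
    /* Fixed: va_node->pgsz_idx = page_size_to_idx(page_size); */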
Bug 1549451
Change-Id: I73ce17b99aae6865056facce72d2ab9ca8b3f81d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/495692
Regenerate clock gating lists. Add new blocks and take them into
use. Also move some clock gating settings so that they are applied at
the earliest possible moment, right after reset.
Change-Id: I21888186c200f7a477c63bd3332e8ed578f63741
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/457698
The nvgpu driver can now use the Tegra graphics virtualization
interfaces to support gk20a in a virtualized environment.
Bug 1509608
Change-Id: I6ede15ee7bf0b0ad8a13e8eb5f557c3516ead676
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/440122
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
The current implementation is based on the config option
GK20A_PHYS_PAGE_TABLES, which provides APIs to create/free/map/unmap
phys pages.
Remove this config-based implementation and move the APIs so that
they are selected at runtime based on tegra_platform_is_linsim().
In the generic APIs, we first check whether the platform is linsim,
and if it is, we forward the call to the phys-page-specific APIs.
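A minimal sketch of the runtime dispatch, assuming hypothetical alloc
helpers (only tegra_platform_is_linsim() is from this change):

    /* Generic API: forward to the phys-page path on linsim, otherwise
     * use the regular DMA-backed path. Helper names are hypothetical. */
    static int gk20a_alloc_pages(struct vm_gk20a *vm, void **handle,
                                 size_t size)
    {
        if (tegra_platform_is_linsim())
            return alloc_phys_pages(vm, handle, size);
        return alloc_dma_pages(vm, handle, size);
    }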
Change-Id: I23eb6fa6a46b804441f18fc37e2390d938d62515
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/488843
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Gk20a clears sparse mappings by unmapping the addresses bound to the
dummy page. On Gm20b, we additionally need to free the page table
entries allocated for sparse memory.
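One way to express the per-chip difference is a HAL hook that each
chip implements; a sketch with hypothetical names:

    /* Hypothetical per-chip hook: the gk20a variant rebinds the range
     * to the dummy page, while the gm20b variant also frees the PTEs
     * allocated for the sparse range. */
    struct mm_ops {
        int (*clear_sparse)(struct vm_gk20a *vm, u64 vaddr, u64 size,
                            u32 pgsz_idx);
    };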
Bug 1538384
Change-Id: Ie2409ab016c29f42c5f7d97dd7287b093b47f9df
Signed-off-by: Kevin Huang <kevinh@nvidia.com>
Reviewed-on: http://git-master/r/448645
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This patch adds mm helpers to access the compression backing store
from an in-kernel shader.
Bug 1409151
Change-Id: Icb4f6dc0b5a35fdb97bc4221ab3657866f775fae
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/440263
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Lauri Peltonen <lpeltonen@nvidia.com>
GVS: Gerrit_Virtual_Submit
In simulation and emulation, 50ms is not enough to ensure a job
completes. Bump the timeout to 5s when not running on silicon.
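A sketch of the timeout selection (the helper name and constants are
illustrative; tegra_platform_is_silicon() is the usual Tegra check):

    /* Use a 5 s job timeout on simulation/emulation, 50 ms on silicon. */
    #define JOB_TIMEOUT_SILICON_MS  50
    #define JOB_TIMEOUT_SIM_MS      5000

    static u32 job_timeout_ms(void)
    {
        return tegra_platform_is_silicon() ? JOB_TIMEOUT_SILICON_MS
                                           : JOB_TIMEOUT_SIM_MS;
    }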
Bug 1510751
Change-Id: I90883b70ce2a75a8f07344f713d647b3fa0d0c7d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/432044
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Chris Dragan <kdragan@nvidia.com>
Tested-by: Chris Dragan <kdragan@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
VPR resize is done by forcing the GPU to idle and then updating the
VPR size from TLK.
There is no longer any need to call the vpr_resize functions from the
kernel, so they can be removed.
Bug 1487804
Change-Id: I758a6e0a99a58757866f1138b0a89594e2a33908
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/421703
(cherry picked from commit 391d9bacf053fe0dacffc76c36768f82912ad1f4)
Reviewed-on: http://git-master/r/419612
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove redundant cache maintenance operations. Instance blocks and
graphics context buffers are uncached, so they do not need any cache
maintenance.
Bug 1421824
Change-Id: Ie0be67bf0be493d9ec9e6f8226f2f9359cba9f54
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406948
PMU, FECS and GPCCS use the same address space. We used to initialize
the address space only if the PMU was enabled. Create the system
address space always.
FECS and GPCCS used to have a slower bit-bang method and a faster DMA
method for loading ucode. The slower method is needed only when FECS
and GPCCS do not have an address space. Remove the slower method, as
it is no longer needed.
Change-Id: I155619741ecc36aa6bf13a9c1ccb03c7c1330f0a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406771
This patch addresses two issues in fixed offset mappings:
- VA unmapping did not use lists safely (see the sketch below). This caused
an application hang if the application did not free all (fixed offset)
buffers before quitting.
- The GPU was not powered when closing the AS node. If the address space
had areas that were not freed, the driver tried to access hardware without
powering it up first.
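The list-safety part follows the standard kernel pattern: when entries
are removed during traversal, list_for_each_entry_safe() must be used
instead of list_for_each_entry(). A sketch with hypothetical structure
names:

    struct mapping *m, *tmp;

    /* Unsafe: removing 'm' frees the node the iterator stands on. */
    list_for_each_entry(m, &vm->fixed_mappings, list)
        remove_mapping(m);

    /* Safe: 'tmp' holds the next node before 'm' can be removed. */
    list_for_each_entry_safe(m, tmp, &vm->fixed_mappings, list)
        remove_mapping(m);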
Change-Id: Ida526d222ea4e03b8d765eca16574ddc1823e60d
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/405872
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
If the DMA address is not defined, use the physical address.
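A minimal sketch of the fallback for scatterlist-backed buffers (the
helper name is hypothetical; sg_dma_address() and sg_phys() are
standard kernel APIs):

    /* Prefer the IOMMU/DMA address; fall back to the CPU physical
     * address when no DMA address has been set up. */
    static u64 buffer_addr(struct scatterlist *sgl)
    {
        if (sg_dma_address(sgl))
            return (u64)sg_dma_address(sgl);
        return (u64)sg_phys(sgl);
    }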
Bug 1500983
Change-Id: Ic33b21f74c8c2760e43146b87eec7ea467fc87be
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
(cherry picked from commit 8ae9a6567349241ce1cfff383526b0d9d39c28a1)
Reviewed-on: http://git-master/r/415238
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
TLB invalidate can race if several contexts use the same address
space: one thread starting an invalidate allows another thread to
submit before the invalidate has completed.
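One way to close such a race is to serialize the whole invalidate
sequence per address space; a sketch with a hypothetical per-VM lock:

    /* Hold the lock across trigger and wait, so another thread cannot
     * submit between starting the invalidate and its completion. */
    static void vm_tlb_invalidate(struct vm_gk20a *vm)
    {
        mutex_lock(&vm->tlb_lock);        /* hypothetical lock */
        trigger_tlb_invalidate(vm);       /* write invalidate registers */
        wait_tlb_invalidate_done(vm);     /* poll until hw completes */
        mutex_unlock(&vm->tlb_lock);
    }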
Bug 1502332
Change-Id: I074ec493eac3b153c5f23d796a1dee1d8db24855
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/407578
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
This patch modifies the code to fall back to 4k pages if the current
VA does not support 128k pages.
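A sketch of the fallback decision (field and helper names are
hypothetical):

    /* Prefer 128k pages; fall back to 4k when the VA space does not
     * support big pages. */
    static u32 choose_page_size(struct vm_gk20a *vm, u32 requested)
    {
        if (requested == SZ_128K && !vm->big_pages_supported)
            return SZ_4K;
        return requested;
    }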
Bug 1409151
Change-Id: I94e9ca5953740388db689bc9306b0392191e29d2
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
On 64-bit architectures we have plenty of vmalloc space, so we can
keep all the PTEs mapped at all times.
But on 32-bit architectures vmalloc space is limited, and hence we
have to map/unmap PTEs when required.
Hence, add new APIs for arm64 and call them if
IS_ENABLED(CONFIG_ARM64) is true.
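A sketch of the split (helper names are hypothetical;
IS_ENABLED(CONFIG_ARM64) is from this change):

    /* On 64-bit, PTE pages stay mapped; on 32-bit, map on demand. */
    static void *pte_map(struct page_table_entry *pte)
    {
        if (IS_ENABLED(CONFIG_ARM64))
            return pte->cpu_va;      /* permanently mapped */
        return map_pte_32bit(pte);   /* temporary vmalloc mapping */
    }

    static void pte_unmap(struct page_table_entry *pte, void *va)
    {
        if (!IS_ENABLED(CONFIG_ARM64))
            unmap_pte_32bit(pte, va);
    }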
Bug 1443071
Change-Id: I091d1d6a3883a1b158c5c88baeeff1882ea1fc8b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/387642
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add two FB flushes after the BAR1 bind. If there is a BAR1 hang, this
should hang only the calling thread instead of the whole system.
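A sketch of the intent (assuming a flush helper along the lines of
gk20a_mm_fb_flush(); the bind call is illustrative):

    /* After binding BAR1, issue two FB flushes. If BAR1 is hung, the
     * flush blocks only the calling thread instead of wedging the
     * whole system on a later access. */
    bar1_bind(g, inst_block);
    gk20a_mm_fb_flush(g);
    gk20a_mm_fb_flush(g);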
Change-Id: I2385a243711219297b889daa30c9fc81106e5825
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/390183