This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.
Bug 200106514
Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl()
functions. Before a positive return woudl always happen. Now,
if there's a timeout -EBUSY is returned.
Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729165
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scaleable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.
The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although, that split is not removed quite yet, the new allocator
enables that to happen.
The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.
Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.
Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.
Bug 1605769
Change-Id: I7c1662b669ed8c86465254f6001e536141051ee5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/720435
Use busy looping for l2 tag flush and elpg flush
operations. This is making total flash time more
accurate and reduced overall time compared with
usleep. Also added trace points to measure
performance for these operations.
Also corrected timeout error check for non-silicon
platforms.
Bug 200081799
Change-Id: I63410bb7528db9258501633996fbdee5fdec1c74
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/710472
(cherry picked from commit 18684cf9d5d6870a1a1fd5711c4fc2d733caad20)
Reviewed-on: http://git-master/r/710986
GVS: Gerrit_Virtual_Submit
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
We have infinite timeouts for loops in emulation. Some functions with
the loops still return error if we exceed the original retry count.
Change-Id: I1f9ddbfc0acd9f30f6bd49d9e748d8d8fbefa154
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709491
Fix below sparse warnings :
warning: Using plain integer as NULL pointer
warning: symbol <variable/funcion> was not declared. Should it be static?
warning: Initializer entry defined twice
Also, remove dead functions
Bug 1573254
Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593363
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Convert GR and LTC HALs to use const structs, and initialize them
with macros.
Bug 1567274
Change-Id: Ia3f24a5eccb27578d9cba69755f636818d11275c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/590371
gk20a and gm20b calculate L2 size with different parameters. Split
the function for calculating size so that it does not query GPU id.
Bug 1567274
Change-Id: I09510c1bf0286c9df125d74e51df322c32bde646
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
L2 bypass registers have moved in gm20b. Move the code to
ltc_common.c, which gets compiled once per chip version.
Change-Id: I0ab4dd03c78b8ad8abc7a7b18c094b6002827587
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/499220
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Current implementation is based on config GK20A_PHYS_PAGE_TABLES
to have APIs to create/free/map/unmap phys pages
Remove this config based implementaion and move the APIs so that
they are called at runtime based on tegra_platform_is_linsim()
In generic APIs, we first check if platform is linsim and if it
is then we forward the call to phys page specific APIs
Change-Id: I23eb6fa6a46b804441f18fc37e2390d938d62515
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/488843
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
When exiting rail gate, we reloaded default ZBC values. The correct
behavior is to reload the values.
Bug 1447255
Change-Id: I7aad3586dda91a91a3629062a27001af281b955e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/418346
ELPG flush is initiated from a common broadcast register, but must be
waited on via per-L2 registers. Split gk20a and gm20b versions of
the flush.
Change-Id: I75c2d65e8da311b50d35bee70308b60464ec2d4d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/401545
Reviewed-by: Automatic_Commit_Validation_User
CBC frontdoor access works incorrectly in the simulator if CBC
is allocated from IOVA. This patch makes CBC allocation to happen
from physical memory if are running in simulator.
Bug 1409151
Change-Id: Ia1d1ca35b5a0375f4707824df3ef06ad1b9117d4
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
This patch adds necessary code to store the gpu configuration into
gr structure.
Bug 1409151
Change-Id: I045b21ebdc849833380a3d953d951f8352842ac7
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
We poll completion of flush sequence by polling the broadcast
register. The polling should be done for a per-slice register
instead.
Bug 1457723
Change-Id: I10aba939175b6d05b05f5f26eebebcbe09d9b4a7
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/382521
Reviewed-by: Juha Tukkinen <jtukkinen@nvidia.com>
Tested-by: Juha Tukkinen <jtukkinen@nvidia.com>