There were still a couple of places using sg_phys directly. Use new
gk20a_mem_phys() to make the code shorter.
Change-Id: I6eb9b14e0c14a27ec39bacd06ab24e31e99769ca
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/717502
Fixed the following sparse warnings by making below APIs static:
- gk20a.c: warning: symbol 'gk20a_pm_restore_debug_setting' was not declared.
Should it be static?
- gr_gk20a.c: warning: symbol 'gr_gk20a_rop_l2_en_mask' was not declared.
Should it be static?
- gr_gm20b.c: warning: symbol 'gr_gm20b_rop_l2_en_mask' was not declared.
Should it be static?
Bug 200067946
Change-Id: I334893bb6614171bff835d270716a7dd262c9ba7
Signed-off-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/718756
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Introduce mem_desc, which holds all information needed for a buffer.
Implement helper functions for allocation and freeing that use this
data type.
Change-Id: I82c88595d058d4fb8c5c5fbf19d13269e48e422f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/712699
New members are added in nvgpu_gpu_characterstics to export more
information required specially from CUDA tools.
Change-Id: I907f3bcbd272405a13f47ef6236bc2cff01c6c80
Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/679202
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
The current CUDA drivers have been using the regops to
directly accessing the GPU registers from user space through
the dbg node. This is a security hole and needs to be avoided.
The patch alternatively implements the similar functionality
in the kernel and provide an ioctl for it.
Bug 200083334
Change-Id: Ic5ff5a215cbabe7a46837bc4e15efcceb0df0367
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/711758
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
We currently have below exceptions enabled but we do
not have any handler for them. So if any of these
exception is raised, we do not clear it.
NV_PGRAPH_EXCEPTION_PD
NV_PGRAPH_EXCEPTION_SCC
NV_PGRAPH_EXCEPTION_DS
NV_PGRAPH_EXCEPTION_MME
NV_PGRAPH_EXCEPTION_SKED
Hence do not enable above exceptions.
Bug 200078514
Change-Id: I0dd3a2299f80f3fe06994818f64151e7cc83a84e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/714166
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gk20a_gr_isr(), handle memfmt exception as below :
- read NV_PGRAPH_PRI_MEMFMT_HWW_ESR
- debug print for contents of above register
- write same value back to NV_PGRAPH_PRI_MEMFMT_HWW_ESR and
clear the exception
Bug 200078514
Change-Id: I5b9afacd7f99b5a37de953041582b3a53b863642
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/713713
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Added support for dumping vpr/wpr info for gm20b.
This dump info called when ever gk20a_mm_fb_flush
is timed-out.
Bug 200082817
Change-Id: I21b0372d0e3f976a189c9c428c015165b715bf88
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/711439
(cherry picked from commit b69897d71c8f6119b49ceb8d3273cdb354178cc5)
Reviewed-on: http://git-master/r/712675
GVS: Gerrit_Virtual_Submit
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
Use busy looping for l2 tag flush and elpg flush
operations. This is making total flash time more
accurate and reduced overall time compared with
usleep. Also added trace points to measure
performance for these operations.
Also corrected timeout error check for non-silicon
platforms.
Bug 200081799
Change-Id: I63410bb7528db9258501633996fbdee5fdec1c74
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/710472
(cherry picked from commit 18684cf9d5d6870a1a1fd5711c4fc2d733caad20)
Reviewed-on: http://git-master/r/710986
GVS: Gerrit_Virtual_Submit
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
Allow enabling of PC sampling hardware workaround. It is only
applicable to gm20b.
Bug 1517458
Bug 1573150
Change-Id: Iad6a3ae556489fb7ab9628637d291849d2cd98ea
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/710421
Pass always the directory structure to mm functions instead of
pointers to members to it. Also split update_gmmu_ptes_locked()
into smaller functions, and turn the hard
coded MMU levels (PDE, PTE) into run-time parameters.
Change-Id: I315ef7aebbea1e61156705361f2e2a63b5fb7bf1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/672485
Reviewed-by: Automatic_Commit_Validation_User
Introduce a new struct gk20a_mm_entry. Allocate and store PDE and PTE
arrays using the same structure. Always pass pointer to this struct
when possible between functions in memory code.
Change-Id: Ia4a2a6abdac9ab7ba522dafbf73fc3a3d5355c5f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/696414
Add below APIs to dump various GR status registers
1. debugfs : /d/gpu.0/gr_status
Read this debugfs at runtime to get status registers
2. API gk20a_gr_debug_dump()
Add this API in code to dump registers at any point
Bug 200062436
Change-Id: Ic1115b5a2fc16362954b5ed8a9e70afb872a8d91
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/486465
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
bug 200078367
using udelay for fecs status polling
during GR init phase brings down fecs
transaction time to < 20usec from few
hundred usec.
Change-Id: I61a27daaf1187ac086a42779b46aa3fbee3b37f2
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/691918
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Bug 200066741
ACR ucode has mechanism to skip WPR blob copy for second time,
in case WPR size is sent as 0 to acr ucode.
With above there is a saving of around 0.5 ms, but, in
conjunction with acr change to disable LS sig verification,
and scrubbing empty spaces in WPR sections to 0. This change
can reduce railgate exit latency by 4ms
ACR ucodes to be checked in main, as a different CL, and after
getting prod signs for ACR
Change-Id: I9d662027abf0b2615176d17433ff3ec3ae53d78a
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/681892
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Make recovery a more straightforward process. When we detect a fault,
trigger MMU fault, and wait for it to trigger, and complete recovery.
Also reset engines before aborting channel to ensure no stray sync
point increments can happen.
Change-Id: Iac685db6534cb64fe62d9fb452391f43100f2999
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709060
(cherry picked from commit 95c62ffd9ac30a0d2eb88d033dcc6e6ff25efd6f)
Reviewed-on: http://git-master/r/707443
Compression page size varies depending on architecture. Make it
129kB on gk20a and gm20b.
Also export some common functions from gm20b.
Bug 1592495
Change-Id: Ifb1c5b15d25fa961dab097021080055fc385fecd
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/673790
Make gr_gm20b_alloc_gr_ctx static. It is used only in the same file.
Bug 200067946
Change-Id: I484ff84ebe9a356f251db5a14ca0e60db64578bf
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/673267
Implement several fixes for allowing the GVA address space to grow
to larger than 32GB and increase the address space to 128GB.
o Implement dynamic allocation of PDE backing pages. The memory
to store the PDE entries was hard coded to 1 page. Now the
number of pages necessary is computed dynamically based on the
size of the address space and the size of large pages.
o Fix an arithmetic problem in the gm20b sparse texture code
that caused large address spaces to be truncated when sparse
PDEs/PTEs were being filled in. This caused a kernel panic
when freeing the address space since a lot of the backing
PTE memory was not allocated.
o Change the address space split for large and small pages. Small
pages now occupy the bottom 16GB of the address space. Large
pages are used for the rest of the address space. Now, with a
128GB address space, there are 112GB of large page GVA available.
This patch exists to allow large (16GB) sparse textures to be allocated
without running into lack of memory issues and kernel panics.
Bug 1574267
Change-Id: I7c59ee54bd573dfc53b58c346156df37a85dfc22
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/671204
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
enables non-blocking interrupts in ce2 all other
ce2 interrupts are cleared and not handled.
bug 200036089
Change-Id: I9f47b06c677c72ac523019e6a3f70fedd07830a2
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/671783
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
CTA preemption needs to be enabled by setting a value in context. Set
it for gm20b.
Bug 200063473
Bug 1517461
Change-Id: I080cd71b348d08f834fd23ebbe7443dba79224db
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/661299
The default zbc entries were never populated in zbc HW table
because the conditional flag "gr->sw_ready" was always set thus
avoided the zbc default loading function call. Now zbc default
loading would happen only during boot time in sw structure.Hw
zbc regs would be loaded from that structure every time a
railgate exit happens.
Bug 1580210
Change-Id: Ie3e40738cbc84cf724c3f3871f15b17a5c84025a
Signed-off-by: Sujeet Baranwal <sbaranwal@nvidia.com>
Reviewed-on: http://git-master/r/662306
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Tested-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move the debug dump to HAL and add a stub for vgpu.
Bug 1595164
Change-Id: Ifdcdd8a8caca7a41919dad075fee1c87032f53b0
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/662722
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove an undesired register from the regops whitelist on both
gk20a and gm20b.
Bug 1589732
Change-Id: I7747fafd3c2c32a9c5ce6388be73c7f61e509f0a
Signed-off-by: Matt Craighead <mcraighead@nvidia.com>
Reviewed-on: http://git-master/r/663373
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove support for gk20a sparse textures. We're using implementation
from user space, so gk20a code is never invoked.
Also removes ref_cnt for PTEs, so we never free PTEs when unmapping
pages, but only at VM delete time.
Change-Id: I04d7d43d9bff23ee46fd0570ad189faece35dd14
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/663294
Remove an undesired register from the regops whitelist on both
gk20a and gm20b.
Bug 1589712
Change-Id: I76e8ff1f4b68d6d5ce2c11adc08d984df7883e5e
Signed-off-by: Matt Craighead <mcraighead@nvidia.com>
Reviewed-on: http://git-master/r/663371
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Make pagepool size query into a function instead of storing the value
during boot time in a structure. This simplifies the structure and
users of pagepool size do not need to worry about whether it has
already been set.
Change-Id: Iba16e840cdf9b6c39449730237aa7d8fdff47848
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/660907
Fix below sparse warnings :
kernel/drivers/gpu/nvgpu/gm20b/mm_gm20b.c:283:5:
warning: symbol 'gm20b_mm_get_big_page_sizes' was not declared.
Should
it be static?
kernel/drivers/gpu/nvgpu/gm20b/clk_gm20b.c:1055:12:
warning: symbol 'gm20b_clk_get' was not declared. Should it be
static?
Bug 200032218
Change-Id: Id199b4b1853b3c933c91509fd550c7b5538cff29
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/660133
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
In simulation we disable timeouts system-wide. Use the system-wide
timeout for ACR boot to enable ACR boot in simulation.
Bug 1546850
Change-Id: I58fc0485725195feab24ae5fe4f249116668bbcc
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/606273