A reference to the default scheduling domain is taken when a TSG is
opened. Although the explicit bind is designed to support only one bind,
the TSG is bound to the default one implicitly at that point. Release
the reference to avoid leaking it.
The domain might be null at that point if the default domain has been
removed; in that case there's just no domain to put back.
Change-Id: I7db43f7bbb2a8c86c391280eb7fa32431c8982da
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2663420
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
The nvgpu timeout API has an internal override for presilicon mode by
default: in presi simulation environments the timeouts never trigger.
This behaviour is intended in the original usecase of the timer unit
with hardware polling loops. In pure software logic though, the timer
must trigger after the specified timeout even in presi mode so add a new
init function to produce a timer for software logic. Use this new kind
of timer in channel and scheduling worker threads.
The channel worker currently times out for just the purpose of the
channel watchdog timer which has its own internal timer. Although that's
just software, the general expectation is that the watchdog does not
trigger in presilicon tests that run slower than usual. The internal
watchdog timer thus keeps the non-sw mode.
Bug 3521828
Change-Id: I48ae8522c7ce2346a930e766528d8b64195f81d8
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662541
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Currently, VAB implementation is using fixed number of access bits. This
value can be computed using fb_mmu_vidmem_access_bit_size_f() value.
- Modify VAB implementation to compute number of access bits.
- Modify nvgpu_vab structure to hold VAB entry size corresponding to
number of access bits.
- Information given by nvgpu_vab structure is more related to the GPU
than nvgpu_mm structure. Move nvgpu_vab struct element to gk20a struct.
- Add fb.set_vab_buffer_address to update vab buffer address in hw
registers.
- Rename gr.vab_init HAL to gr.vab_reserve to avoid any confusion about
when this HAL should be used.
-Replace gr.vab_release and gr.vab_recover with gr.vab_configure HAL.
Bug 3465734
Change-Id: I1b67bfa9be7728be5bda978c6bb87b196d55ab65
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2659467
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
CBC contig allocation requires mempool node in DT and the
node can be used for contig allocations. The code duplication
can be avoided by unifying the code from vgpu.
Change-Id: I6eaa1d0c9db47b158602bf0ba68ce4e09cf487a7
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650459
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
At present, for each resume cycle the driver sends the
"nvgpu_cbc_op_clear" command to L2 cache controller, this causes the
contents of the compression bit backing store to be cleared, and results
in corrupting the metadata for all the compressible surfaces already allocated.
Fix this by updating cbc.init function to be aware of resume state and
not clear the compression bit backing store, instead issue
"nvgpu_cbc_op_invalide" command, this should leave the backing store in a
consistent state across suspend/resume cycles.
The updated cbc.init HAL for gv11b is reusable acrosss multiple chips, hence
remove unnecessary chip specific cbc.init HALs.
Bug 3483688
Change-Id: I2de848a083436bc085ee98e438874214cb61261f
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2660075
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In DRIVE 6.0, NvGPU needs to support error reporting in QNX-Safety,
QNX-Standard, and Linux. To support error reporting in all these
platform variants, SDL unit will be moved from QNX to common code.
As part of this refactoring activity, this patch removes ops assignment
for report error. Also, it removes API calls that are used to take
time-stamp for stall interrupt thread. This time-stamp APIs will be
brought back later, if required to support periodic diagnostics.
JIRA NVGPU-7353
Change-Id: I38536019dc7165e6a97674863b37d009854af948
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2655958
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
CONFIG_NVGPU_NVMAP_NEXT was defined for all kernels except 4.9. However,
Android builds nvgpu without NV_BUILD_KERNEL_OPTIONS set and it fails
while looking for definitions of the functions nvmap_dma_alloc_attrs
and nvmap_dma_free_attrs.
Define it for kstable if CONFIG_TEGRA_NVMAP_NEXT is set and define it
for kernel 5.10 (downstream) explicitly.
Bug 3445216
Change-Id: If73ca56cfc5668d6e318f470b31d999d663a4483
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2654677
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Kevin Kuo (SW-GPU) <kevkuo@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Separate nvgpu_cg_blcg/slcg_fb_ltc_load_enable function
into nvgpu_cg_blcg/slcg_fb_load_enable and
nvgpu_cg_blcg/slcg_ltc_load_enable.
Program fb slcg/blcg prod values during fb init and
program ltc slcg/blcg prod values after acr boot to
have correct privilege for ltc cg programming.
Update unit tests to have sperate blcg/slcg hal for
fb and ltc programming.
Bug 3423549
Change-Id: Icdb45528abe1a3ab68a47f689310dee9a4fe9366
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2646039
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Here, the freq_counter is set to track the count of number of
frequencies enumerated and capped by GP10B_MAX_SUPPORTED_FREQS.
There is an early terminating condition when new_rate equals max_rate.
The line following this is set to
WARN_ON(freq_counter == GP10B_MAX_SUPPORTED_FREQS);
This line is probably incorrect and contradicts the above loop as in
there is definite probability of freq_counter equaling
GP10B_MAX_SUPPORTED_FREQS. Probably the original intention might have
been to catch an off-by-1 error where freq_counter equals
GP10B_MAX_SUPPORTED_FREQS + 1.
Even then instead of printing a warning message, a better idea is to
handle the possible bug in the code itself.
Bug 3407276
Change-Id: I7f2a9d5c41be62227d08045e959e16c4228fbff4
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623380
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
All the VPR related code has been removed from core kernel from K5.10
onwards. So make use of NvMap exported APIs for handling of VPR carveout
from K5.10 onwards.
Use those exported APIs in nvgpu when CONFIG_NVGPU_NVMAP_NEXT is
defined. This config will be defined on K5.10 onwards.
Bug 3415117
Change-Id: I7be68a3a60a8b7811a7233681421f85395a81e2b
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650509
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Allocator (bitmap, buddy, page) debugfs files are not cleaned up when
the allocators are destroyed. This leads to warning logs from nvgpu
like below:
[21073.493000] debugfs: File 'gk20a_as_17' in directory 'allocators' already present!
[21073.493026] debugfs: File 'gk20a_as_17-sys' in directory 'allocators' already present!
Remove the per-allocator debugfs node when destroying an allocator in
runtime.
While at this, add missing nvgpu_allocator locking to the function
nvgpu_bitmap_alloc_destroy. And create nop functions for the
functions nvgpu_init_alloc_debug and nvgpu_fini_alloc_debug
when CONFIG_DEBUG_FS is not defined to avoid adding the
CONFIG checks at multiple places.
Move gk20a_debug_deinit to the end of gk20a_free_cb called in nvgpu_put
as that tears down all debugfs entries. Allocator destroy happens as
part of nvgpu_put call and it can lead to invalid debugfs dentry
access if gk20a_debug_deinit is called before it.
Bug 3481097
Change-Id: I8a66bcf6ade7e5707f9207c78a54d12d7bd94c02
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2648012
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
To support nvmap as OOT module in kstable it implemented the APIs
nvmap_dma_alloc_attrs and nvmap_dma_free_attrs to replace usage
of kernel dma_alloc_attrs and dma_free_attrs. nvmap APIs have
special handling for VPR carveout.
Use those exported APIs in nvgpu when CONFIG_NVGPU_NVMAP_NEXT is
defined. This config will be defined only for kstable builds.
JIRA LS-458
Bug 200754700
Change-Id: I717aa579d29ee10c006b044f6b0fafbedc57dba8
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2647951
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
VPR functionality is split up as static VPR and VPR resize. Static VPR
is supported on all kernels. VPR resize is enabled only on 4.9 kernel.
Enable CONFIG_NVGPU_VPR unconditionally in Linux Makefile. Compile
VPR resize related functionality in nvgpu under the check for
Linux kernel version using new define NVGPU_VPR_RESIZE_SUPPORTED.
JIRA LS-458
Bug 200754700
Change-Id: Ib92f7f1b95afc6c69fbdf33354459c147337350c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2647619
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
nvs worker thread is created on each resume and deinitialized on every
suspend. nvgpu can be resumed when process is getting killed. Thread
creation can fail when the process is getting killed. That will lead
to driver resume failure.
To avoid the issue above, don't stop the nvs worker thread in suspend
and let the first created thread handle the nvs work always.
Deinitialize the nvs worker thread during nvgpu unload.
Also, log the error returned by nvgpu_thread_create in the function
nvgpu_worker_start.
bug 3480192
Change-Id: I8d5d9e7716a950b162cc3c2d9fcfde07c4edfcf6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2646218
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
gops.gr.init.set_default_compute_regs() HAL configures compute specific
settings in safety build and this eliminates need of using SW methods.
Define this HAL for Orin safety build and configure sked check related
registers from the HAL. Other settings done on gv11b are no more
applicable for ga10b safety.
Bug 3456240
Change-Id: Ic125cdf414a5402511949015e3424b8cb2dab1e0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2646284
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch performs the following improvements for VAB:
1) It avoids an infinite loop when collecting VAB information.
Previously, nvgpu incorrectly assumed that the valid bit would
be eventually set for the checker when polling. It may not be set
if a VAB-related fault has occurred.
2) It handles the VAB_ERROR mmu fault which may be caused for various
reasons: invalid vab buffer address, tracking in protected mode,
etc. The recovery sequence is to set the vab buffer size to 0 and
then to the original size. This clears the VAB_ERROR bit. After
reseting, the old register values are again set in the recovery
code sequence.
3) Use correct number of VAB buffers. There's only one VAB buffer on
ga10b, not two.
4) Simplify logic.
Bug 3374805
Bug 3465734
Bug 3473147
Change-Id: I716f460ef37cb848ddc56a64c6f83024c4bb9811
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621290
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
During gpu re-initialization(rail gate exit/sc7 exit), vab buffers
needs to programmed in hw by writing vab buffer address to
NV_PFB_PRI_MMU_VIDMEM_ACCESS_BIT_BUFFER_LO(HI)_ADDR and
setting vab entries to NV_PFB_PRI_MMU_VIDMEM_ACCESS_BIT_BUFFER_SIZE_VAL.
For this moved nvgpu nvgpu_fb_vab_init_hal from
nvgpu_init_mm_setup_sw to nvgpu_init_mm_support.
nvgpu_init_mm_setup_sw is skipping during re-init, because
sw_ready is set during first boot.
Bug 3468562
Change-Id: Iee2bd4bc5165397ea4f9cca0ba6744eaf361a342
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2643244
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>