The function gk20a_mm_l2_flush incorrectly returns an error value
when it skips l2_flush when hardware is powered off.
This causes the following prints to occur even when the behavior is expected.
gv11b_mm_l2_flush:43 [ERR] gk20a_mm_l2_flush failed
nvgpu_gmmu_unmap_locked:1043 [ERR] gk20a_mm_l2_flush[1] failed
The above errors occur from the following paths
1) gk20a_remove -> gk20a_free_cb -> gk20a_remove_support ->
nvgpu_pmu_remove_support -> nvgpu_pmu_pg_deinit ->
nvgpu_dma_unmap_free
2) gk20a_remove -> gk20a_free_cb -> gk20a_remove_support ->
nvgpu_remove_mm_support -> gv11b_mm_mmu_fault_info_mem_destroy ->
nvgpu_dma_unmap_free
Since, these do not belong in the Poweron/Poweroff path, its okay to
skip flushing them when the hardware has powered off.
Fixed the userspace tests by allocating g->mm.bar1.vm to prevent NULL access
in gv11b_mm_l2_flush->tlb_invalidate.
Jira LS-77
Change-Id: I3ca71f5118daf4b2eeacfe5bf83d94317f29d446
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2523751
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Mapping of large buffers to GMMU end up needing many
pages for the PTE tables. Allocating these one by one
can end up being a performance bottleneck, particularly
in the virtualized case.
This is adding the following changes:
- As the TLB invalidation doesn't have access to mem_off,
allow top-level allocation by alloc_cache_direct().
- Define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab
for the PD cache, effectively set to 64K bytes
- Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE
When freeing up cached entries, avoid prefetch errors by
invalidating the entry (memset to 0).
- Try to fall back to direct allocation of smaller chunk for
contiguous allocation failures.
- Unit test changes.
Bug 200649243
Change-Id: I0a667af0ba01d9147c703e64fc970880e52a8fbc
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404371
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Many tests used various incarnations of the mock register framework.
This was based on a dump of gv11b registers. Tests that greatly
benefitted from having generally sane register values all rely
heavily on this framework.
However, every test essentially did their own thing. This was not
efficient and has caused a some issues in cleaning up the device and
host code.
Therefore introduce a much leaner and simplified register framework.
All unit tests now automatically get a good subset of the gv11b
registers auto-populated. As part of this also populate the HAL with
a nvgpu_detect_chip() call. Many tests can now _probably_ have all
their HAL init (except dummy HAL stuff) deleted. But this does
require a few fixups here and there to set HALs to NULL where tests
expect HALs to be NULL by default.
Where necessary HALs are cleared with a memset to prevent unwanted
code from executing.
Overall, this imposes a far smaller burden on tests to initialize
their environments.
Something to consider for the future, though, is how to handle
supporting multiple chips in the unit test world.
JIRA NVGPU-5422
Change-Id: Icf1a63f728e9c5671ee0fdb726c235ffbd2843e2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335334
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The two MM functions nvgpu_set_pd_level() and nvgpu_vm_do_free_entries()
are simple recursive functions for processing page descriptors (PDs) by
traversing the levels in the PDs. MISRA Rule 17.2 prohibits functions
calling themselves because "unless recursion is tightly controlled, it
is not possible to determine before execution what the worst-case stack
usage could be." So, this change limits the recursion depth of each of
these functions to the maximum number of page table levels by checking
the level against the MM HAL max_page_table_levels().
This also required configuring this HAL in various unit tests as they
were no previously using this HAL.
JIRA NVGPU-3489
Change-Id: Iadee3fa5ba9f45cd643ac6c202e9296d75d51880
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2224450
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>