The problem here, and the solution, require some background, so let's start there.

During page table programming, page directories (PDs) are allocated as needed. Each PD can range in size, depending on chip, from 256 bytes all the way up to 32KB (gk20a 2-level page tables). In HW, two distinct PTE sizes are supported: large and small. The HW supports mixing these at will: the second to last level PDE has pointers to both a small and a large PD with corresponding PTEs.

Nvgpu doesn't handle that well, and as a result we historically split the GPU virtual address space into a small page region and a large page region. This makes the GMMU programming logic easier since we only have to worry about one type of PD for any given region. But it presents issues for CUDA and UVM: they want to be able to mix PTE sizes in the same GPU virtual memory range.

In general we still don't support true dual page directories, that is, page directories with both the small and large next-level PD populated. However, we will allow adjacent PDs to have different sized next-level PDs.

Each last level PD maps the same amount. On Pascal+ that's 2MB, regardless of the PTE coverage (large or small). That means the last level PD will differ in size depending on the PTE size.

So, going back to the SW: we allocate PDs as needed when programming the page tables. When we do this allocation we allocate just enough space for the PD to contain the necessary number of PTEs for the page size. The problem manifests when a PD flips in size from large to small PTEs. Consider the following mapping operations:

  map(gpu_va -> phys)   [large pages]
  unmap(gpu_va)
  map(gpu_va -> phys)   [small pages]

In the first map/unmap we allocate all the PDs and PTEs necessary to build this translation, assuming a large page size. When unmapping, as an optimization/quirk of nvgpu, we leave the PDs around; they may well be used again in the future. But if we swap the size of the mapping from large to small, we now need more space in the PD for PTEs. The logic in the GMMU code assumes that if the PD has memory allocated, then that memory is sufficient. This worked back when there was no potential for a PD to swap page size, but now that there is, we have to re-allocate any PD that doesn't have enough space for the required PTEs.

So that's the fix: reallocate PDs when they require more space than they currently have.

Change-Id: I9de70da6acfd20c13d7bdd54232e4d4657840394
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1933076
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
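
For illustration only, the standalone C sketch below shows the shape of the reallocation check the commit describes: a PD whose cached allocation is smaller than what the current PTE size requires is freed and re-allocated, instead of being assumed sufficient. All names here (pd_sketch, pd_needs_realloc, pd_allocate_sketch) are hypothetical and do not match nvgpu's real API, and the byte counts in main() are illustrative, loosely following the commit's numbers (a 2MB last-level PD needs far fewer PTE bytes with large pages than with small pages).

/*
 * Hypothetical sketch of the "reallocate a too-small PD" fix.
 * Not nvgpu code; stand-in names and sizes only.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct pd_sketch {
	void   *mem;    /* backing memory for the PD, NULL if unallocated */
	size_t  bytes;  /* size of the current allocation */
};

/* Is there an existing allocation that is too small for what we need now? */
static bool pd_needs_realloc(const struct pd_sketch *pd, size_t needed)
{
	return pd->mem != NULL && pd->bytes < needed;
}

/*
 * Allocate (or re-allocate) a PD so it can hold @needed bytes of PTEs.
 * The old assumption was "pd->mem != NULL implies big enough", which
 * breaks once a region can flip from large to small PTEs.
 */
static int pd_allocate_sketch(struct pd_sketch *pd, size_t needed)
{
	if (pd_needs_realloc(pd, needed)) {
		/* Discard the smaller PD before allocating a bigger one. */
		free(pd->mem);
		pd->mem = NULL;
		pd->bytes = 0;
	}

	if (pd->mem == NULL) {
		pd->mem = calloc(1, needed);
		if (pd->mem == NULL)
			return -1;
		pd->bytes = needed;
	}

	return 0;
}

int main(void)
{
	struct pd_sketch pd = { NULL, 0 };

	/* map with large pages: the last-level PD needs only a small allocation */
	pd_allocate_sketch(&pd, 256);

	/* unmap: the PD memory is intentionally kept around */

	/* re-map the same range with small pages: more PTEs, bigger PD needed,
	 * so the sketch frees the 256-byte allocation and grabs 4KB instead */
	pd_allocate_sketch(&pd, 4096);

	free(pd.mem);
	return 0;
}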