The pd_cache header declarations were originally part of the
gmmu.h header. This is not good from a unit isolation perspective,
so this patch moves all the pd_cache specifics over to a new
header file: <nvgpu/pd_cache.h>.
A couple of static inlines that were possible when the code lived
in gmmu.h have also been turned into real, first-class functions.
This allows the pd_cache.h header to avoid including the gmmu.h
header file.
Also fix an issue in the nvgpu_pd_write() function: the data
argument was being passed as a size_t for no particular reason. It
has now been changed to a u32.
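For reference, the prototype change is roughly as follows. Only the
data argument is from this patch; the other parameters are an
illustrative sketch, see <nvgpu/pd_cache.h> for the real
declaration:

  /* Before: the data word was passed as a size_t. */
  void nvgpu_pd_write(struct gk20a *g, struct nvgpu_gmmu_pd *pd,
                      size_t w, size_t data);

  /* After: a PD entry word is a 32-bit quantity, so u32 fits. */
  void nvgpu_pd_write(struct gk20a *g, struct nvgpu_gmmu_pd *pd,
                      size_t w, u32 data);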
JIRA NVGPU-1444
Change-Id: Iead9a0d998396d2289ffcb3b48765d770400397b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1965271
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Split the nvgpu_sgt code out from the nvgpu_mem code. Although the
two chunks of code are related, the SGT code is distinct and as
such should be its own unit. To do this a new source file has been
added - nvgpu_sgt.c - which contains all of the common nvgpu_sgt
APIs. These are facade APIs that abstract away the details of how
any given nvgpu_sgt is actually implemented.
An abstract unit - nvgpu_sgt_os - was also defined. This unit
exists solely for the nvgpu_sgt unit to call, so that the OS
specific nvgpu_sgt_os_create_from_mem() API could be moved out of
the common nvgpu_sgt unit. Note that this also updates the name of
the function that the OS specific units are expected to provide.
Common code may still use the generic nvgpu_sgt_create_from_mem()
API.
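The resulting layering is sketched below; the parameter lists are
illustrative, the real prototypes live in the nvgpu_sgt and
nvgpu_sgt_os headers:

  /* nvgpu_sgt.c - the common facade that common code calls. */
  struct nvgpu_sgt *nvgpu_sgt_create_from_mem(struct gk20a *g,
                                              struct nvgpu_mem *mem)
  {
          /* Delegate to the OS specific implementation. */
          return nvgpu_sgt_os_create_from_mem(g, mem);
  }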
JIRA NVGPU-1391
Change-Id: I37f5b2bbf9f84c0fb6bc296c3e04ea13518bd4d0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1946012
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The problem here, and the solution, require some background, so
let's start there.
During page table programming, page directories (PDs) are
allocated as needed. Each PD can range in size, depending on the
chip, from 256 bytes all the way up to 32KB (gk20a 2-level page
tables).
In HW, two distinct PTE sizes are supported: large and small, and
the HW supports mixing the two at will. The second-to-last level
PDE has pointers to both a small and a large PD with corresponding
PTEs. Nvgpu doesn't handle that well, and as a result we have
historically split the GPU virtual address space into a small page
region and a large page region. This makes the GMMU programming
logic easier, since we then only have to worry about one type of PD
for any given region.
But this presents issues for CUDA and UVM: they want to be able to
mix PTE sizes in the same GPU virtual memory range. In general we
still don't support true dual page directories, that is, page
directories with both the small and the large next-level PD
populated. However, we will allow adjacent PDs to have
different-sized next-level PDs.
Each last-level PD maps the same amount of GPU VA; on Pascal+
that's 2MB, and this is true regardless of the PTE coverage (large
or small). That means the last-level PD will differ in size
depending on the PTE size.
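As a rough illustration (exact page and entry sizes vary by chip;
4KB/64KB pages and 8-byte PD entries are assumed here):

  small PTEs: 2MB / 4KB  = 512 entries * 8B = 4KB of PD memory
  large PTEs: 2MB / 64KB =  32 entries * 8B = 256B of PD memory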
So, going back to the SW: we allocate PDs as needed while
programming the page tables, and when we do this allocation we
allocate just enough space for the PD to contain the necessary
number of PTEs for the page size. The problem manifests when a PD
flips from large to small PTEs.
Consider the following mapping operations:

  map(gpu_va -> phys)   [large-pages]
  unmap(gpu_va)
  map(gpu_va -> phys)   [small-pages]
In the first map/unmap we go and allocate all the necessary PDs
and PTEs to build this translation, and we do so assuming a large
page size. When unmapping, as an optimization/quirk of nvgpu, we
leave the PDs around, since we know they may well be used again in
the future.
But if we swap the size of the mapping from large to small, we now
need more space in the PD for PTEs. The logic in the GMMU code,
however, assumes that if the PD has memory allocated then that
memory is sufficient. That worked back when there was no potential
for a PD to swap page sizes, but now that there is, we have to
re-allocate any PD that doesn't have enough space for the required
PTEs.
So that's the fix - reallocate PDs when they require more
space than they currently have.
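Conceptually, the allocation path now compares the PD's existing
backing size against what the requested PTE size needs, rather
than only checking that some memory is present. A minimal sketch,
with hypothetical helper names standing in for the real
pd_cache/page_table routines:

  /*
   * Sketch only: pd_size(), pd_free() and pd_allocate() are
   * placeholders for the actual nvgpu helpers.
   */
  static int pd_ensure_space(struct vm_gk20a *vm,
                             struct nvgpu_gmmu_pd *pd,
                             u32 bytes_needed)
  {
          if (pd->mem != NULL && pd_size(pd) >= bytes_needed) {
                  return 0;         /* Current PD is big enough. */
          }
          if (pd->mem != NULL) {
                  pd_free(vm, pd);  /* Too small: drop the old PD... */
          }
          return pd_allocate(vm, pd, bytes_needed); /* ...and realloc. */
  }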
Change-Id: I9de70da6acfd20c13d7bdd54232e4d4657840394
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1933076
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add two new sub-directories under MM: gmmu and allocators.
The allocators directory is for all of the allocator code we have.
There's a fair amount of it, and as such it could be considered a
component with a number of sub-units.
The new GMMU directory will contain the GMMU component (which used
to be a single unit). The new GMMU component is composed of the
page_table and pd_cache units. Also, when we migrate the
chip-specific GMMU code out of mm_gk20a.c and mm_gp10b.c, it will
be placed in this new GMMU directory.
JIRA NVGPU-1390
Change-Id: I7aa47ea2a32612b7d69972671fccb72770e1ae09
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1944385
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>