This "HAL" exists to handle the vGPU specific bind channel operation.
This patch moves the native function implementation to common/mm/vm.c
and renames its gk20a prefix to nvgpu to follow the convention for
vGPU vs native HAL functions.
JIRA NVGPU-2042
Change-Id: I02b9ebf0d53d58a6d2ede544e34f2b8ff1b1eb42
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2104540
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make a hal/mm/gmmu sub-unit for the GMMU HAL code and move the
gk20a specific HAL code there; gp10b will be moved in the next patch.
This change also updates all the GMMU related HAL usage, of which
there is quite a bit. Generally the only change is that a .gmmu needs
to be inserted into the HAL path. Each HAL init was also updated.
JIRA NVGPU-2042
Change-Id: I6c46bdfddb8e021f56103d9457fb3e2a226f8947
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2099693
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename the two native GPU GMMU map/unmap functions and update the
HAL initializations to reflect this:
gk20a_locked_gmmu_map -> nvgpu_gmmu_map_locked
gk20a_locked_gmmu_unmap -> nvgpu_gmmu_unmap_locked
This matches what other units do for handling vGPU "HAL" indirection.
Also move the function declarations to <nvgpu/gmmu.h> since these are
shared among all non-vGPU chips. But since these are still technically
HAL operations they should never be called directly. This is a bit of
an organizational issue that I have not thought through how to solve
yet.
Ideally they would go into a "hal/mm/gmmu/" include somewhere, but
that A) doesn't yet exist, and B) those are chip specific; these
functions are native specific. Ugh.
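Purely as an illustration of the HAL indirection being set up here, the
following userspace sketch shows the shape of it; the struct layout and
the vgpu_* names are made-up stand-ins, not the real nvgpu definitions.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

struct gops_mm_gmmu {
        uint64_t (*map_locked)(uint64_t pa, uint64_t size);
        void     (*unmap_locked)(uint64_t va);
};

/* Native implementation: shared by all non-vGPU chips. */
static uint64_t nvgpu_gmmu_map_locked(uint64_t pa, uint64_t size)
{
        printf("native map: pa=0x%llx size=0x%llx\n",
               (unsigned long long)pa, (unsigned long long)size);
        return pa;      /* identity "mapping" for the sketch */
}

static void nvgpu_gmmu_unmap_locked(uint64_t va)
{
        printf("native unmap: va=0x%llx\n", (unsigned long long)va);
}

/* A vGPU implementation would instead forward the request to the RM server. */
static uint64_t vgpu_gmmu_map_locked(uint64_t pa, uint64_t size)
{
        (void)size;
        printf("vgpu map via RM server\n");
        return pa;
}

static void vgpu_gmmu_unmap_locked(uint64_t va)
{
        (void)va;
        printf("vgpu unmap via RM server\n");
}

int main(void)
{
        bool is_vgpu = false;
        struct gops_mm_gmmu gmmu;

        /* HAL init picks the backend; callers only ever use the pointers. */
        gmmu.map_locked = is_vgpu ? vgpu_gmmu_map_locked : nvgpu_gmmu_map_locked;
        gmmu.unmap_locked = is_vgpu ? vgpu_gmmu_unmap_locked : nvgpu_gmmu_unmap_locked;

        gmmu.unmap_locked(gmmu.map_locked(0x1000, 0x2000));
        return 0;
}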
JIRA NVGPU-2042
Change-Id: Ibc614f2928630d12eafcec6ce73019628b44ad94
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2099692
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new MM HAL directory to contain all MM related HAL units.
As part of this change, add a cache unit to the MM HAL. This contains
several related fixes:
1. Move the cache code in gk20a/mm_gk20a.c and gv11b/mm_gv11b.c to
the new cache HAL. Update makefiles and header includes to take
this into account. Also rename gk20a_{read,write}l() to their
nvgpu_ variants.
2. Update the MM gops: move the cache related functions to the new
cache HAL and update all calls to this HAL to reflect the new
name.
3. Update some direct calls to gk20a MM cache ops to pass through
the HAL instead.
4. Update the unit tests for various MM related things to use the
new MM HAL locations.
This change accomplishes two architecture design goals. Firstly it
removes one of the multiple HW includes from mm_gk20a.c (the flush HW header).
Secondly it moves code from the gk20a/ and gv11b/ directories into
more proper locations under hal/.
JIRA NVGPU-2042
Change-Id: I91e4bdca4341be4dbb46fabd72622b917769f4a6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2095749
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gv11b_mm_l2_flush was not checking error codes from the various
functions it was calling. MISRA Rule-17.7 requires the return value
of all functions to be used. This patch checks the return values and
propagates any error upstream.
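As a sketch of the pattern applied here (with hypothetical helper names
standing in for the real HAL calls):

#include <stdio.h>

static int fb_flush(void) { return 0; }
static int l2_flush_hw(int invalidate) { (void)invalidate; return 0; }

static int l2_flush(int invalidate)
{
        int err;

        err = fb_flush();               /* return value was previously dropped */
        if (err != 0) {
                return err;             /* now propagated upstream */
        }

        err = l2_flush_hw(invalidate);
        if (err != 0) {
                return err;
        }

        return fb_flush();
}

int main(void)
{
        printf("l2_flush -> %d\n", l2_flush(1));
        return 0;
}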
JIRA NVGPU-677
Change-Id: I9005c6d3a406f9665d318014d21a1da34f87ca30
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1998809
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new unit, common/gr/ctx.c, to manage the GR context.
This unit provides interfaces to allocate/free/map/unmap the GR context,
patch context, pm context, and ctxsw {preempt/spill/betacb/pagepool/rtvcb}
buffers.
It also provides APIs to set the sizes of the above buffers.
Add a new header file, include/nvgpu/gr/ctx.h, to declare all the interfaces.
Move the nvgpu_gr_ctx, patch_desc, pm_ctx_desc, and zcull_ctx_desc
structures to this unit.
Add a new structure, nvgpu_gr_ctx_desc, to hold context description
parameters. For now we add the sizes of all the buffers here.
Add this structure to gr_gk20a for global reference.
Remove gr_gp10b_alloc_buffer() since it is no longer used.
Rename g->ops.gr.alloc_gfxp_rtv_cb() to g->ops.gr.init_gfxp_rtv_cb()
since this HAL now only sets the size of the rtvcb ctxsw buffer.
Remove gr->ctx_vars.buffer_size and gr->ctx_vars.buffer_total_size
since they were redundant; we already have gr->ctx_vars.golden_image_size
to denote the golden image size.
Jira NVGPU-1527
Change-Id: I8847b347f80235209dd5e28d979e79984ab85408
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1987702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Follow-up change to rename g->ops.mm.bar1_map (and its implementations)
to the more specific g->ops.mm.bar1_map_userd.
Also use nvgpu_big_zalloc() to allocate the USERD slab memory descriptors.
Bug 2422486
Bug 200474793
Change-Id: Iceff3bd1d34d56d3bb9496c179fff1b876b224ce
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1970891
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We had to force allocation of physically contiguous memory for
USERD in the nvlink case, as a channel's USERD address is computed as
an offset from the fifo->userd address, and nvlink bypasses the SMMU.
With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.
PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous in order to set up the BAR1 inst block.
- Add a slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for the slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to the related nvgpu_mem descriptor.
- ch->userd_offset is the offset from the beginning of the slab
  (see the sketch after this list).
- Pre-allocate GPU VAs for the whole BAR1.
- Add a g->ops.mm.bar1_map() method.
- gk20a_mm_bar1_map() uses a fixed mapping in the BAR1 region.
- vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1.
- TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.
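The slab arithmetic looks roughly like the userspace sketch below. The
constants are assumptions for the example only (a 512B USERD entry and
a 4 KiB page); the real channels-per-slab count simply follows from
PAGE_SIZE / USERD size on the target platform.

#include <stdio.h>

#define USERD_ENTRY_SIZE        512u
#define SLAB_SIZE               4096u   /* stands in for PAGE_SIZE */
#define CHANNELS_PER_SLAB       (SLAB_SIZE / USERD_ENTRY_SIZE)

struct userd_location {
        unsigned int slab_index;        /* which slab holds this channel */
        unsigned int userd_offset;      /* offset of the USERD region in it */
};

static struct userd_location userd_locate(unsigned int chid)
{
        struct userd_location loc;

        loc.slab_index = chid / CHANNELS_PER_SLAB;
        loc.userd_offset = (chid % CHANNELS_PER_SLAB) * USERD_ENTRY_SIZE;
        return loc;
}

int main(void)
{
        unsigned int chid;

        for (chid = 0u; chid < 10u; chid++) {
                struct userd_location loc = userd_locate(chid);

                printf("ch %2u -> slab %u, offset 0x%03x\n",
                       chid, loc.slab_index, loc.userd_offset);
        }
        return 0;
}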
Bug 2422486
Bug 200474793
Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 8.3 requires that all declarations of a function
shall use the same parameter names and type qualifiers. There
are cases where the parameter names do not match between the
function prototype and its definition. This patch will fix some of
these violations by renaming the prototype parameter.
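A minimal illustration of the rule, using a made-up function; the point
is only that every declaration and the definition use identical
parameter names and type qualifiers.

#include <stdio.h>

/* Before: the prototype said "int val" while the definition said "int value". */
int add_one(int value);

int add_one(int value)
{
        return value + 1;
}

int main(void)
{
        printf("%d\n", add_one(41));
        return 0;
}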
JIRA NVGPU-847
Change-Id: I980ca7ba8adc853de9c1b6f6c7e7b3e4ac12f88e
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1926980
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 8.3 requires that all declarations of a function
shall use the same parameter names and type qualifiers. There
are cases where the parameter names do not match between the
function prototype and its definition. This patch will fix some of
these violations by renaming the parameter as required.
JIRA NVGPU-847
Change-Id: I3f7280b0e4c21b1c2d70fd7f899cf920075f87a3
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1927103
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changed the enum gmmu_pgsz_gk20a into macros and updated all of its
instances.
The enum gmmu_pgsz_gk20a was being used in for loops, where it was
compared with an integer. This violates MISRA rule 10.4, which only
allows arithmetic operations on operands of the same essential type
category. Changing this enum into macros fixes the violation.
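A sketch of the resulting shape (the macro names are written from
memory and may not match the tree exactly): the page-size selector
becomes plain unsigned macros, so comparisons against integer loop
counters stay within one essential type category.

#include <stdio.h>

#define GMMU_PAGE_SIZE_SMALL    0U
#define GMMU_PAGE_SIZE_BIG      1U
#define GMMU_PAGE_SIZE_KERNEL   2U
#define GMMU_NR_PAGE_SIZES      3U

int main(void)
{
        unsigned int pgsz;

        /* With an enum loop variable this comparison mixed essential types. */
        for (pgsz = 0U; pgsz < GMMU_NR_PAGE_SIZES; pgsz++) {
                printf("page size index %u\n", pgsz);
        }
        return 0;
}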
JIRA NVGPU-993
Change-Id: I6f18b08bc7548093d99e8229378415bcdec749e3
Signed-off-by: Amulya <Amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1795593
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The current code does not properly calculate the indices within the PDE
to access the proper entry, and it has a bug in the assignment of the big
page entries. This change fixes the issue by:
(1) Passing a pointer to the level structure and dereferencing the
index offset to the next level.
(2) Changing the format of the address.
(3) Ensuring big pages are only selected if their address is set.
Bug 200364599
Change-Id: I46e32560ee341d8cfc08c077282dcb5549d2a140
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1610562
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Bhosale <dbhosale@nvidia.com>
Delete the Linux headers and make some modifications to get rid of the
minor compilation issues that resulted.
- Add <linux/iommu.h> to os_linux.h
- Delete #if 0 code that "flushed" a buffer in gr_gk20a.c
- Delete FLUSH_CPU_DCACHE() macro
- Move the cache flush definitions to <nvgpu/linux/vm.h>
and include this header in sim_gk20a.c. This file will
not be used by QNX so this should be fine.
- Add <linux/pci_ids.h> to gp106/bios_gp106.c and
gp106/mclk_gp106.c.
- Move a function to common/linux/dmabuf.h since it is a
dmabuf related function and uses a struct device pointer
as an argument.
JIRA NVGPU-30
Change-Id: I11f56b98524c7fac3efa91b4686592130e5f8a46
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1585510
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move much of the remaining generic MM code to a new common location:
common/mm/mm.c. Also add a corresponding <nvgpu/mm.h> header. This
mostly consists of init and cleanup code to handle the common MM
data structures like the VIDMEM code, address spaces for various
engines, etc.
A few more in-depth changes were made as well.
1. alloc_inst_block() has been added to the MM HAL. This used to be
defined directly in the gk20a code but it used a register. As a
result, if this register hypothetically changes in the future,
it would need to become a HAL anyway. This patch preempts that
and for now just defines all HALs to use the gk20a version.
2. Rename as much as possible: global functions are, for the most
part, prepended with nvgpu (there are a few exceptions which I
have yet to decide what to do with). Functions that are static
are renamed to be as consistent with their functionality as
possible since in some cases function effect and function name
have diverged.
JIRA NVGPU-30
Change-Id: Ic948f1ecc2f7976eba4bb7169a44b7226bb7c0b5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1574499
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Convert the work_struct used by the vidmem background clearing to
a thread to make it more cross platform. The thread waits on a
condition variable to determine when work needs to be done. The
signal comes from the DMA API when it enqueues a new nvgpu_mem that
needs clearing.
Add logic for handling suspend: the CE cannot be accessed while
the GPU is suspended. As such the background thread must be paused
while the GPU is suspended and the CE is not available.
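The wait/signal pattern is roughly the following userspace sketch; the
names and the integer work counter are invented for the example, while
the real code clears queued nvgpu_mem buffers via the CE.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int pending;
static bool stop;

static void *clear_thread(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&lock);
        while (!stop || pending > 0) {
                if (pending == 0) {
                        pthread_cond_wait(&cond, &lock);
                        continue;
                }
                pending--;
                pthread_mutex_unlock(&lock);
                printf("clearing one queued allocation\n");
                pthread_mutex_lock(&lock);
        }
        pthread_mutex_unlock(&lock);
        return NULL;
}

static void enqueue_clear(void)
{
        pthread_mutex_lock(&lock);
        pending++;
        pthread_cond_signal(&cond);     /* always signal; no was_empty check */
        pthread_mutex_unlock(&lock);
}

int main(void)
{
        pthread_t tid;

        pthread_create(&tid, NULL, clear_thread, NULL);
        enqueue_clear();
        enqueue_clear();

        pthread_mutex_lock(&lock);
        stop = true;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        pthread_join(tid, NULL);
        return 0;
}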
Several other changes were also made:
o Move the code that enqueues a nvgpu_mem from the DMA API
code to a function in the VIDMEM code.
o Move nvgpu_vidmem_get_pending_alloc() to the Linux specific
code as this function is only used there. It's a trivial
function that QNX can easily implement as well.
o Remove the was_empty logic from the enqueue. Now just always
signal the condition variable when a new nvgpu_mem comes in.
o Move CE suspend to after MM suspend.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: Ie9286ae5a127c3fced86dfb9794e7d81eab0491c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1574498
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Refactor the last nvgpu_vm functions from the mm_gk20a.c code. This
removes some usages of dma_buf from the mm_gk20a.c code, too, which
helps make mm_gk20a.c less Linux specific.
Also delete some header files that are no longer necessary in
gk20a/mm_gk20a.c which are Linux specific. The mm_gk20a.c code is now
quite close to being Linux free.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: I72b370bd85a7b029768b0fb4827d6abba42007c3
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1566629
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move most of the dma_buf usage present in the mm_gk20a.c code
out to Linux specific code and some common/mm code. There are
two primary groups of code:
1. dma_buf priv field code (for holding comptag data)
2. Comptag usage that relies on dma_buf pointers
For (1) the dma_buf code was simply moved to common/linux/dmabuf.c
since most of this code is clearly Linux specific.
The comptag code was a bit more complicated since there are two
parts to it. First, there is the code that manages the comptag
memory; this is essentially a simple allocator and was moved to
common/mm/comptags.c since it can be shared across all chips. The
second set of code was moved to common/linux/comptags.c since it is
the interface between dma_bufs and the comptag memory.
Two other fixes were done as well:
- Add struct gk20a to the comptag allocator init so that
the proper nvgpu_vzalloc() function could be used.
- Add necessary includes to common/linux/vm_priv.h.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: I96c57f2763e5ebe18a2f2ee4b33e0e1a2597848c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1566628
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove an SGL reference from the mm_gk20a.c code. This code is
common code and as such all linuxisms need to be fixed. It just
so happens that this particular function is only used by the
CDE code which is only present in Linux. So simply move this
function over to the CDE code.
JIRA NVGPU-30
JIRA NVGPU-225
JIRA NVGPU-226
Change-Id: Ifb0cb427c742c6d9cada382ace4a52f52474c379
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1576436
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Split VIDMEM support into its own code files organized as such:
common/mm/vidmem.c - Base vidmem support
common/linux/vidmem.c - Linux specific user-space interaction
include/nvgpu/vidmem.h - Vidmem API definitions
Also use the config to enable/disable VIDMEM support in the makefile
and remove as many CONFIG_GK20A_VIDMEM preprocessor checks as possible
from the source code.
And lastly update a while-loop that iterated over an SGT to use the
new for_each construct for iterating over SGTs.
Currently this organization is not perfectly adhered to. More patches
will fix that.
JIRA NVGPU-30
JIRA NVGPU-138
Change-Id: Ic0f4d2cf38b65849c7dc350a69b175421477069c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540705
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename get_physical_addr_bits and related functions to something that
more clearly conveys what they are doing. The basic idea of these
functions is to translate from a physical GPU address to an IOMMU GPU
address. To do that, a particular bit (which varies from chip to chip)
is added to the physical address.
JIRA NVGPU-68
Change-Id: I536cc595c4397aad69a24f740bc74db03f52bc0a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1542966
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The basic nvgpu_mem_sgl implementation provides support
for OS specific scatter-gather list implementations by
simply copying them node by node. This is inefficient,
taking extra time and memory.
This patch implements an nvgpu_mem_sgt struct to act as
a header which is inserted at the front of any scatter-
gather list implementation. This labels every struct
with a set of ops which can be used to interact with
the attached scatter-gather list.
Since nvgpu common code only has to interact with these
function pointers, any sgl implementation can be used.
Initialization only requires the allocation of a single
struct, removing the need to copy or iterate through the
sgl being converted.
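A compact sketch of the ops-header idea (names and signatures below are
illustrative, not the exact nvgpu ones): a small struct of function
pointers sits in front of whatever native list the OS provides, and
common code walks the list only through those pointers.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct sgt_ops {
        void *(*next)(void *sgl);
        uint64_t (*phys)(void *sgl);
        uint64_t (*length)(void *sgl);
};

struct mem_sgt {
        const struct sgt_ops *ops;      /* header: how to walk the list */
        void *sgl;                      /* opaque OS implementation */
};

/* A trivial "native" list used to exercise the abstraction. */
struct demo_sgl {
        struct demo_sgl *next;
        uint64_t phys;
        uint64_t length;
};

static void *demo_next(void *sgl) { return ((struct demo_sgl *)sgl)->next; }
static uint64_t demo_phys(void *sgl) { return ((struct demo_sgl *)sgl)->phys; }
static uint64_t demo_length(void *sgl) { return ((struct demo_sgl *)sgl)->length; }

static const struct sgt_ops demo_ops = {
        .next = demo_next, .phys = demo_phys, .length = demo_length,
};

int main(void)
{
        struct demo_sgl b = { NULL, 0x200000, 0x1000 };
        struct demo_sgl a = { &b, 0x100000, 0x2000 };
        struct mem_sgt sgt = { &demo_ops, &a };
        void *sgl;

        /* Common code iterates purely through the ops; nothing is copied. */
        for (sgl = sgt.sgl; sgl != NULL; sgl = sgt.ops->next(sgl)) {
                printf("chunk: phys=0x%llx len=0x%llx\n",
                       (unsigned long long)sgt.ops->phys(sgl),
                       (unsigned long long)sgt.ops->length(sgl));
        }
        return 0;
}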
Jira NVGPU-186
Change-Id: I2994f804a4a4cc141b702e987e9081d8560ba2e8
Signed-off-by: Sunny He <suhe@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1541426
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The last major item preventing the core MM code in the nvgpu
driver from being platform agnostic is the usage of Linux
scatter-gather tables and scatter-gather lists. These data
structures are used throughout the mapping code to handle
discontiguous DMA allocations and are also overloaded to represent
VIDMEM allocs.
The notion of a scatter-gather table is crucial to a HW device
that can handle discontiguous DMA. The GPU has an MMU which
allows the GPU to do page gathering and present a virtually
contiguous buffer to the GPU HW. As a result it makes sense
for the GPU driver to use some sort of scatter-gather concept
to maximize memory usage efficiency.
To that end this patch keeps the notion of a scatter gather
list but implements it in the nvgpu common code. It is based
heavily on the Linux SGL concept. It is a singly linked list
of blocks - each representing a chunk of memory. To map or
use a DMA allocation SW must iterate over each block in the
SGL.
This patch implements the most basic level of support for this
data structure. There are certainly easy optimizations that
could be done to speed up the current implementation. However,
this patch's goal is simply to divest the core MM code of any
last Linux'isms. Speed and efficiency come next.
Change-Id: Icf44641db22d87fa1d003debbd9f71b605258e42
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1530867
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove the kernel feature to map compbit backing store areas to GPU VA.
The feature is not used, and the relevant user space code is getting
removed, too.
Change-Id: I94f8bb9145da872694fdc5d3eb3c1365b2f47945
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1547898
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Remove the mm.get_iova_addr() HAL and replace it with a new HAL
called mm.gpu_phys_addr(). This new HAL provides the real phys
address that should be passed to the GPU from a physical address
obtained from a scatter list. It also provides a mechanism by
which the HAL code can add extra bits to a GPU physical address
based on the attributes passed in. This is necessary during GMMU
page table programming.
Also remove the flags argument from the various address functions.
This flag was used to add an IO coherence bit to the GPU physical
address, which is not supported.
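The idea behind a gpu_phys_addr()-style hook, sketched with an invented
attribute and an example bit position (neither is a real chip value):

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

struct map_attrs {
        bool l3_alloc;  /* hypothetical attribute selecting an extra bit */
};

static uint64_t gpu_phys_addr(uint64_t phys, const struct map_attrs *attrs)
{
        uint64_t addr = phys;

        if (attrs != NULL && attrs->l3_alloc) {
                addr |= (1ULL << 36);   /* example bit only */
        }
        return addr;
}

int main(void)
{
        struct map_attrs attrs = { .l3_alloc = true };

        printf("0x%llx -> 0x%llx\n",
               (unsigned long long)0x12345000ULL,
               (unsigned long long)gpu_phys_addr(0x12345000ULL, &attrs));
        return 0;
}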
JIRA NVGPU-30
Change-Id: I69af5b1c6bd905c4077c26c098fac101c6b41a33
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1530864
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Refactor the sync_debugfs LTC HAL op so that the logic to enable
or disable LTC goes to the common code function nvgpu_ltc_sync_enabled()
and the LTC HAL set_enabled op only performs the hardware register access.
Create a new common function nvgpu_init_ltc_support() to initialize
the LTC software variable, and move hardware initialization of LTC to
be called from it.
JIRA NVGPU-62
Change-Id: Ib1cf4f5b83ca3dac08407464ed56a732e0a33923
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1528262
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In some cases page directories require less than a full page of memory.
For example, on Pascal, the final PD level for large pages is only 256 bytes;
thus 16 PDs can fit in a single page. To allocate an entire page for each of
these 256 B PDs is extremely wasteful. This patch aims to alleviate the
wasted DMA memory from having small PDs in a full page by packing multiple
small PDs into a single page.
The packing is implemented as a slab allocator - each page is a slab and
from each page multiple PD instances can be allocated. Several modifications
to the nvgpu_gmmu_pd struct also needed to be made to support this. The
nvgpu_mem is now a pointer and there's an explicit offset into the nvgpu_mem
struct so that each nvgpu_gmmu_pd knows what portion of the memory it's
using.
The nvgpu_pde_phys_addr() function and the pd_write() functions also require
some changes since the PD no longer is always situated at the start of the
nvgpu_mem.
Initialization and cleanup of the page tables for each VM was slightly
modified to work through the new pd_cache implementation. Some PDs (i.e.
the PDB), despite not being a full page, still require a full page for
alignment purposes (HW requirements). Thus a direct allocation method for
PDs is still provided. This is also used when a PD that could in principle
be cached is greater than a page in size.
Lastly a new debug flag was added for the pd_cache code.
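The offset-based accesses look roughly like the sketch below, with
assumed sizes (4 KiB backing page, 256 B PDs) and simplified types; the
point is that each PD carries a pointer to the shared backing memory
plus its own byte offset, and both pd_write()-style accesses and the
physical address are computed relative to that offset.

#include <stdio.h>
#include <stdint.h>

#define BACKING_PAGE_SIZE       4096u
#define SMALL_PD_SIZE           256u

struct pd_mem {                         /* stands in for the shared nvgpu_mem */
        uint64_t phys_base;
        uint32_t words[BACKING_PAGE_SIZE / 4u];
};

struct pd {
        struct pd_mem *mem;             /* shared backing page */
        uint32_t mem_offs;              /* this PD's byte offset within it */
};

static uint64_t pd_phys_addr(const struct pd *pd)
{
        return pd->mem->phys_base + pd->mem_offs;
}

static void pd_write(struct pd *pd, uint32_t word_idx, uint32_t value)
{
        pd->mem->words[pd->mem_offs / 4u + word_idx] = value;
}

int main(void)
{
        struct pd_mem page = { .phys_base = 0x80000000ULL };
        struct pd pd5 = { .mem = &page, .mem_offs = 5u * SMALL_PD_SIZE };

        pd_write(&pd5, 0u, 0xdeadbeefu);
        printf("PD 5: phys=0x%llx first word=0x%x\n",
               (unsigned long long)pd_phys_addr(&pd5),
               page.words[pd5.mem_offs / 4u]);
        return 0;
}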
JIRA NVGPU-30
Change-Id: I64c8037fc356783c1ef203cc143c4d71bbd5d77c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1506610
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Update the high level mapping logic. Instead of iterating over the
GPU VA, iterate over the scatter-gather table chunks. As a result
each GMMU page table update call is simplified dramatically.
This also modifies the chip level code to no longer require an SGL
as an argument. Each call to the chip level code will be guaranteed
to be contiguous so it only has to worry about making a mapping from
virt -> phys.
This removes the dependency on Linux that the chip code currently
has. With this patch the core GMMU code still uses the Linux SGL but
the logic is highly transferable to a different, nvgpu specific,
scatter gather list format in the near future.
The last major update is to push most of the page table attribute
arguments into a struct. That struct is passed on through the various
mapping levels. This makes the function calls simpler and
easier to follow.
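Both ideas together look roughly like this sketch (all names, fields
and values are illustrative): mapping walks the scatter-gather chunks
instead of raw GPU VA, and the per-map parameters travel in one
attributes struct.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

struct chunk {
        struct chunk *next;
        uint64_t phys;
        uint64_t length;
};

struct gmmu_attrs {
        unsigned int pgsz;      /* page size index */
        unsigned int kind;      /* kind/compression attribute */
        bool cacheable;
        bool read_only;
};

/* Chip-level hook: maps one physically contiguous range, no SGL involved. */
static void map_contiguous(uint64_t virt, uint64_t phys, uint64_t length,
                           const struct gmmu_attrs *attrs)
{
        printf("map va=0x%llx -> pa=0x%llx len=0x%llx pgsz=%u\n",
               (unsigned long long)virt, (unsigned long long)phys,
               (unsigned long long)length, attrs->pgsz);
}

static void map_sgl(uint64_t virt, const struct chunk *sgl,
                    const struct gmmu_attrs *attrs)
{
        const struct chunk *c;

        for (c = sgl; c != NULL; c = c->next) {
                map_contiguous(virt, c->phys, c->length, attrs);
                virt += c->length;
        }
}

int main(void)
{
        struct chunk b = { NULL, 0x300000, 0x1000 };
        struct chunk a = { &b, 0x100000, 0x2000 };
        struct gmmu_attrs attrs = { .pgsz = 0u, .kind = 0u, .cacheable = true,
                                    .read_only = false };

        map_sgl(0x40000000ULL, &a, &attrs);
        return 0;
}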
JIRA NVGPU-30
Change-Id: Ibb6b11755f99818fe642622ca0bd4cbed054f602
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master/r/1484104
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Remove gk20a support. Leave only gk20a code which is reused by other
GPUs.
JIRA NVGPU-38
Change-Id: I3d5f2bc9f71cd9f161e64436561a5eadd5786a3b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master/r/1507927
GVS: Gerrit_Virtual_Submit
Separate the non-chip specific GMMU mapping implementation code
out of mm_gk20a.c. This puts all of the chip-agnostic code into
common/mm/gmmu.c in preparation for rewriting it.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I6f7fdac3422703f5e80bb22ad304dc27bba4814d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480228
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Support only VM pointers and ref-counting for maintaining VMs. This
dramatically reduces the complexity of the APIs, avoids the API
abuse that has existed, and ensures that future VM usage is
consistent with current usage.
Also remove the combined VM free/instance block deletion. Any place
where this was done is now replaced with an explicit free of the
instance block and an nvgpu_vm_put().
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: Ib73e8d574ecc9abf6dad0b40a2c5795d6396cc8c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480227
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Unify the initialization routines for the vGPU and regular GPU paths.
This helps avoid any further code divergence. This also assumes that
the code running on the regular GPU essentially works for the vGPU.
The only addition is that the regular GPU path calls an API in the
vGPU code that sends the necessary RM server message.
JIRA NVGPU-12
JIRA NVGPU-30
Change-Id: I37af1993fd8b50f666ae27524d382cce49cf28f7
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1480226
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
As the vidmem_is_vidmem flag has two separate meanings packed into one
bit, split it into two bits in the enabled() API:
Add the NVGPU_MM_HONORS_APERTURE bit, which is the same as vidmem_is_vidmem
with its original meaning, and use it to test which aperture bits to
write to hardware.
Add the NVGPU_MM_UNIFIED_MEMORY bit, which has the opposite meaning: the
GPU shares the SoC memory. When this flag is false, the GPU has its
own local video memory.
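At a call site the split reads roughly as below, using a plain bitmask
in place of the real enabled() bookkeeping; the two flag names come
from this description, the rest is illustrative.

#include <stdio.h>
#include <stdbool.h>

#define NVGPU_MM_HONORS_APERTURE        (1u << 0)
#define NVGPU_MM_UNIFIED_MEMORY         (1u << 1)

static bool is_enabled(unsigned int flags, unsigned int flag)
{
        return (flags & flag) != 0u;
}

int main(void)
{
        /* A discrete-VRAM style configuration for the example. */
        unsigned int flags = NVGPU_MM_HONORS_APERTURE;

        if (is_enabled(flags, NVGPU_MM_HONORS_APERTURE)) {
                printf("write real aperture bits (sysmem vs vidmem) to HW\n");
        }
        if (!is_enabled(flags, NVGPU_MM_UNIFIED_MEMORY)) {
                printf("GPU has its own local video memory\n");
        }
        return 0;
}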
Jira NVGPU-86
Change-Id: I2d0bed3b1ede5a712be99323d3035b154bb23c3a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1496080
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>