Add a new MM HAL directory to contain all MM related HAL units.
As part of this change add cache unit to the MM HAL. This contains
several related fixes:
1. Move the cache code in gk20a/mm_gk20a.c and gv11b/mm_gv11b.c to
the new cache HAL. Update makefiles and header includes to take
this into account. Also rename gk20a_{read,write}l() to their
nvgpu_ variants.
2. Update the MM gops: move the cache related functions to the new
cache HAL and update all calls to this HAL to reflect the new
name.
3. Update some direct calls to gk20a MM cache ops to pass through
the HAL instead.
4. Update the unit tests for various MM related things to use the
new MM HAL locations.
This change accomplishes two architecture design goals. Firstly it
removes a multiple HW include from mm_gk20a.c (the flush HW header).
Secondly it moves code from the gk20a/ and gv11b/ directories into
more proper locations under hal/.
JIRA NVGPU-2042
Change-Id: I91e4bdca4341be4dbb46fabd72622b917769f4a6
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2095749
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added the following HALs
- ramin.base_shift
- ramin.alloc_base
Use above HALs in mm, instead of using hw definitions.
Defined nvgpu_inst_block_ptr to
- get inst_block address,
- shift if by base_shift
- assert upper 32 bits are 0
- return lower 32 bits
Added missing #include for <nvgpu/mm.h>
Jira NVGPU-3015
Change-Id: I558a6f4c9fbc6873a5b71f1557ea9ad8eae2778f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2077840
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new variable in nvgpu_as_map_buffer_ex_args for app
to specify the platform atomic support for the page.
When platform atomic attribute flag is set, pte memory
aperture is set to be coherent type.
renamed nvgpu_aperture_mask_coh -> nvgpu_aperture_mask_raw
function.
bug 200473147
Change-Id: I18266724dafdc8dfd96a0711f23cf08e23682afc
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2012679
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gv11b_mm_l2_flush was not checking error codes from the various
functions it was calling. MISRA Rule-17.7 requires the return value
of all functions to be used. This patch now checks return values and
propagates the error upstream.
JIRA NVGPU-677
Change-Id: I9005c6d3a406f9665d318014d21a1da34f87ca30
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1998809
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 10.1 mandates that the correct data types are used as
operands of operators. For example, only unsigned integers can be used
as operands of bitwise operators.
This patch fixes rule 10.1 vioaltions for gk20a.
JIRA NVGPU-777
JIRA NVGPU-1006
Change-Id: I965eae017350156c6692fd585292b7a54e4190d8
Signed-off-by: Sai Nikhil <snikhil@nvidia.com>
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1971010
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Follow-up change to rename g->ops.mm.bar1_map (and implementations)
to more specific g->ops.mm.bar1_map_userd.
Also use nvgpu_big_zalloc() to allocate userd slabs memory descriptors.
Bug 2422486
Bug 200474793
Change-Id: Iceff3bd1d34d56d3bb9496c179fff1b876b224ce
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1970891
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We had to force allocation of physically contiguous memory for
USERD in nvlink case, as a channel's USERD address is computed as
an offset from fifo->userd address, and nvlink bypasses SMMU.
With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.
PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous, to setup the BAR1 inst block.
- Add slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to related nvgpu_mem descriptor
- ch->userd_offset is the offset from the beginning of the slab
- Pre-allocate GPU VAs for the whole BAR1
- Add g->ops.mm.bar1_map() method
- gk20a_mm_bar1_map() uses fixed mapping in BAR1 region
- vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1
- TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.
Bug 2422486
Bug 200474793
Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The pd_cache header declarations were oriignally part of the
gmmu.h header. This is not good from a unit isolation perspective
so this patch moves all the pd_cache specifics over to a new
header file: <nvgpu/pd_cache.h>.
Also a couple of static inlines that were possible when the code
was part of gmmu.h were turned into real, first class functions.
This allowed the pd_cache.h header to not include the gmmu.h
header file.
Also fix an issue in the nvgpu_pd_write() function where the data
was being passed as a size_t for some reason. This has now been
changed to a u32.
JIRA NVGPU-1444
Change-Id: Ib9e9e5a54544de403bfcd8e11c30de05721ddbcc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1966352
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This reverts commit 15603b9fd5.
Causes a build break in the PD cache unit test. Not sure how this
passed GVS - must have been a race or something? Unclear.
Change-Id: Ia484a801d098d69441326fa1dd40a1c86e2e23ce
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1966335
The pd_cache header declarations were originally part of the
gmmu.h header. This is not good from a unit isolation perspective
so this patch moves all the pd_cache specifics over to a new
header file: <nvgpu/pd_cache.h>.
Also a couple of static inlines that were possible when the code
was part of gmmu.h were turned into real, first class functions.
This allows the pd_cache.h header to not include the gmmu.h
header file.
Also fix an issue in the nvgpu_pd_write() function where the data
was being passed as a size_t for some reason. This has now been
changed to a u32.
JIRA NVGPU-1444
Change-Id: Iead9a0d998396d2289ffcb3b48765d770400397b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1965271
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 12.2 states that the right hand operand of a shift
operator shall lie in the range zero to one less than the width
in bits of the essential type of the left hand operand. This
patch will fix these violations by casting them to an appropriate
type or using the relevant BITxx() macros.
JIRA NVGPU-666
Change-Id: I57b6081e9bd98c45ca9f7aa5f35e1d2d66ed0134
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1945655
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Rule 10.4 only allows the usage of arithmetic operations on
operands of the same essential type category.
Adding "U" at the end of the integer literals to have same type of
operands when an arithmetic operation is performed.
This fixes violation where an arithmetic operation is performed on
signed and unsigned int types.
JIRA NVGPU-992
Change-Id: I4c04e2720a3b068909cc4af6847d4718568c13ea
Signed-off-by: Sai Nikhil <snikhil@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1822740
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 14.4 doesn't allow the usage of non-boolean variable as
boolean in the controlling expression of an if statement or an
iteration statement.
Fix violations where a non-boolean variable is used as a boolean in the
controlling expression of if and loop statements.
JIRA NVGPU-1022
Change-Id: I957f8ca1fa0eb00928c476960da1e6e420781c09
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1941002
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Update the PD cache code to handle 64KB pages. To do this the
number of partial/full lists is expanded for when we have 64K
pages. Currently we only explicitly support 4K and 64K page
sizes. Other pages sizes (16K for example) will fail compilation
during preprocessing.
This change also cleans up the definitions for some of the
internal structs. They have been moved into pd_cache.c since
they are not used outside of pd_cache.c.
This allows the following functions to be removed from the global
context:
__nvgpu_pd_cache_alloc_direct()
__nvgpu_pd_cache_free_direct()
They have been replaced by calls to nvgpu_pd_{alloc,free}().
The nvgpu_pd_mem_entry alloc_map also had to be expanded to a
real bitmap. 32 or 64 bits is not sufficient for packing 256
byte PDs into a 64K page (there's 256 PDs per nvgpu_pd_mem_entry
in that case). To prevent doing too many find_first_zero
operations on the bitmap an 'allocs' field was also added which
tracks how many allocs are done. We can use this instead of
comparing a mask against the bitmap to determine if an
nvgpu_pd_mem_entry is full.
Note: there's still a limitation with the TLB invalidate code:
it simply assumes an nvgpu_mem is a 1 to 1 with a PDB. This means
we can't invalidate a PDB allocated at an offset greater than 0
in a nvgpu_pd_mem_entry. This in turn means we must always use a
full page size for a context's PDB.
Bug 1977822
Change-Id: I6a7a3a95b7c902bc6487cba05fde58fbc4a25da5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1718755
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule-14.4 doesn't allow the usage of function pointers & integer
types as booleans in the controlling expression of an if statement or
an iteration statement.
Fix violations where a function pointer or a function whose return
value is an integer, is used as a boolean in the controlling expression
of if and loop statements.
JIRA NVGPU-1021
Change-Id: Ic5336268394ba4396ce80744c25930d2fb44dc42
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1932147
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gops.fb.dump_vpr_wpr_info() accesses both VPR and WPR registers.
Split this into two different HALs gops.fb.dump_vpr_info() and
gops.fb.dump_wpr_info()
Also unset HALs accessing VPR registers on dGPUs
We don't support VPR on dGPUs
Remove fb_mmu_vpr_info_r() register and all its accessors from
dGPU headers
Bug 2173122
Change-Id: I5b2712f8c5389e422a84c375a7e836add48bfd1c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1850947
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We don't support big page size beginning Pascal, so set HAL
gops.fb.set_mmu_page_size() to NULL on all those platforms
Also remove these accessors from corresponding platforms
fb_mmu_ctrl_use_pdb_big_page_size_v()
fb_mmu_ctrl_use_pdb_big_page_size_true_f()
fb_mmu_ctrl_use_pdb_big_page_size_false_f()
Bug 2173122
Change-Id: I7353412860a7a6f8a993ca9184a0dc3ca9d749af
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1850946
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA 21.2 states that we may not use reserved identifiers; since
all identifiers beginning with '_' are reserved by libc, the usage
of '__' as a prefix is disallowed.
Handle the 21.2 fixes for nvgpu_mem.c and mm.c; this deletes the
'__' prefixes and slightly renames the __nvgpu_aperture_mask()
function since there's a coherent version and a general version.
Change-Id: Iee871ad90db3f2622f9099bd9992eb994e0fbf34
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1813623
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Rule-15.6 requires that all if-else blocks be enclosed in braces,
including single statement blocks. Fix errors due to single statement
if blocks without braces by introducing the braces.
JIRA NVGPU-671
Change-Id: Icdeede22dd26fd70fae92aa791d35b115ef49e32
Signed-off-by: Srirangan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1797691
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changed the enum gmmu_pgsz_gk20a into macros and changed all the
instances of it.
The enum gmmu_pgsz_gk20a was being used in for loops, where it was
compared with an integer. This violates MISRA rule 10.4, which only
allows arithmetic operations on operands of the same essential type
category. Changing this enum into macro will fix this violation.
JIRA NVGPU-993
Change-Id: I6f18b08bc7548093d99e8229378415bcdec749e3
Signed-off-by: Amulya <Amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1795593
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In the current code, gk20a.h includes io.h which gets directly included
in a lot of other files. io.h contains methods which uses a struct
gk20a as a parameter leading to a circular dependency between io.h
and gk20a.h. This can be mitigated by removing io.h from gk20a.h as
part of larger effort to moving gk20a.h to nvgpu/gk20a.h
JIRA NVGPU-597
Change-Id: I93e504fa9371b88152737b342a75580c65e8f712
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1787316
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Switch all logging to nvgpu_log*(). gk20a_dbg* macros are
intentionally left there because of use from other repositories.
Because the new functions do not work without a pointer to struct
gk20a, and piping it just for logging is excessive, some log messages
are deleted.
Change-Id: I00e22e75fe4596a330bb0282ab4774b3639ee31e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1704148
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Also revert other changes related to IO coherence. This may be the
culprit in a recent dev-kernel lockdown.
Bug 2070609
Change-Id: Ida178aef161fadbc6db9512521ea51c702c1564b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665914
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Srikar Srimath Tirumala <srikars@nvidia.com>
When using a coherent DMA API wee must make sure to program
any aperture fields with the coherent aperture setting. To
do this the nvgpu_aperture_mask() function was modified to
take a third aperture mask argument, a coherent setting, so
that code can use this function to generate coherent aperture
settings.
The aperture choice is some what tricky: the default version
of this function uses the state of the DMA API to determine
what aperture to use for SYSMEM: either coherent or
non-coherent internally. Thus a kernel user need only specify
the normal nvgpu_mem struct and the correct mask should be
chosen. Due to many uses of nvgpu_mem structs not created
directly from the DMA API wrapper it's easier to translate
SYSMEM to SYSMEM_COH after creation.
However, the GMMU mapping code, will encounter buffers from
userspace with difference coerency attributes than the DMA
API. Thus the __nvgpu_aperture_mask() really respects the
aperture setting passed in regardless of the DMA API state.
This aperture setting is pulled from NVGPU_VM_MAP_IO_COHERENT
since this is either passed in from userspace or set by the
kernel when using coherent DMA. The aperture field in attrs
is upgraded to coh if this flag is set.
This change also adds a coherent sysmem mask everywhere that
it can. There's a couple places that do not have a coherent
register field defined yet. These need to eventually be
defined and added.
Lastly the aperture mask code has been mvoed from the Linux
vm.c code to the general vm.c code since this function has
no Linux dependencies.
Note: depends on https://git-master.nvidia.com/r/1664536 for
new register fields.
JIRA EVLR-2333
Change-Id: I4b347911ecb7c511738563fe6c34d0e6aa380d71
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1655220
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The current code does not properly calculate the indexes within the PDE
to access the proper entry, and it has a bug in assignement of the big
page entries. This change fixes the issue by:
(1) Passing a pointer to the level structure and dereferencing the
index offset to the next level.
(2) Changing the format of the address.
(3) Ensuring big pages are only selected if their address is set.
Bug 200364599
Change-Id: I46e32560ee341d8cfc08c077282dcb5549d2a140
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1610562
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Bhosale <dbhosale@nvidia.com>