Account NvMap allocated memory into both RSS and CG tracking to make
efficient OOM kill decisions during memory pressure.
NvMap allocates memory via kernel APIs like alloc_pages, the kernel
memory is not accounted on behalf of process who requests the
allocation. Hence in case OOM, the OOM killer never kills the process
who has allocated memory via NvMap even though this process might be
holding most of the memory.
Solve this issue using following approach:
- Use __GFP_ACCOUNT and __GFP_NORETRY flag
- __GFP_NORETRY will not let the current allocation flow to go into OOM
path, so that it will never trigger OOM.
- __GFP_ACCOUNT causes the allocation to be accounted to kmemcg. So any
allocation done by NvMap will be definitely accounted to kmemcg and
cgroups can be used to define memory limits.
- Add RSS counting for the process which allocates by NvMap, so that OOM
score for that process will get updated and OOM killer can pick this
process based upon the OOM score.
- Every process that has a reference to NvMap Handle would have the
memory size accounted into its RSS. On releasing the reference to
handle, the RSS would be reduced.
Bug 5222690
Change-Id: I3fa9b76ec9fc8d7f805111cb96e11e2ab1db42ce
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3427871
(cherry picked from commit 2ca91098aa)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3453101
Tested-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
In Linux v6.5, the 'vmas' parameter is removed from get_user_pages() by
commit 54d020692b34 ("mm/gup: remove unused vmas parameter from
get_user_pages()"). Update the conftest script test for
'get_user_pages' to test for this and use the generated definition in
the NVMAP driver accordingly.
Bug 4221847
Change-Id: I3b2b2b9937bba86a8924cf5852fe62d57cd5d3b6
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2990512
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add more checks in nvmap code so as to avoid any possible races.
- Update is_nvmap_id_ro and is_nvmap_dmabuf_fd_ro functions so that they
return error value during error conditions and also update their callers
to handle those error values.
- Move all trace statements from end of the function to before handle
refcount or dup count is decremented, this make sure we are not
dereferencing any freed handle/reference/dambuf.
- Increment ref's dup count wherever we feel data race is possible, and
decrement it accordingly towards end of function.
Bug 4253911
Change-Id: I50fc7cc98ebbf3c50025bc2f9ca32882138fb272
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2972602
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Compression carveout is not correct carveout name as it will be used
by gpu even for non-compression usecases. Hence rename it to gpu
carveout.
Bug 3956637
Change-Id: I802b91d58d9ca120e34655c21f56c0da8c8cf677
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2955536
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
Deferred dmabuf unmapping is being removed from kernel.
So, add similar support to cache sgt in NvMap.
During map_dma_buf() call, NvMap will create a mapping and an sgt
corresponding to it. It will also cache this sgt.
When unmap_dma_buf() is called for same sgt, NvMap will not unmap
the mappings. It will simply return from there.
Next time when the mapping request comes for same dmabuf, it will
look for existing sgt in cache and return it. This significantly
reduces mapping overhead for same buffer when it's mapped and unmapped
multiple times.
Free the sgt and unmap only when corresponding buffer is freed. When
all references from a buffer are removed, dmabuf_release() will be
called where sgt will be freed.
Bug 4064339
Change-Id: I7ed767ecaaac7aa44e6576e701b28537b84986ec
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2925224
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Add support for Serial Id feature which will be used by Nsight for
buffer tracking purpose. This feature expects a unique serial id per
buffer even if it is shared across multiple client processes.
Add following code:
- Create a new global counter field for serial id in nvmap device.
Initialize it to 0 when nvmap device is initialized.
- Introduce a new field for serial_id in nvmap_handle struct.
- When nvmap_handle is created, assign it's serial_id field with global
counter's value, and increment global counter.
- During NvRmMemQueryHandleParameters return this serial_id associated
with the handle.
- Do not decrement counter for serial_id even after freeing the handle.
Bug 4138373
Change-Id: Ic1fe22b082eefb352986f8fa44d4c38d186a366f
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2918510
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In NUMA system, user can request nvmap to allocate from any particular
node using NvRmMem APIs. Add code to request allocation from such
requested node. While allocating pages from page pool, check if those
pages belong to requested NUMA node. In some cases, we may not need to
use NUMA support e.g. in color pages allocation code, we don't need to
use NUMA, as page coloring is needed only for t19x chips.
JIRA TMM-5287
Bug 4043165
Change-Id: If8043278122068773a802d83723d287d32914e14
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2889707
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
CBC carveout is not the correct carveout name, rather compression
carveout is the correct name. Compresssion carveout is used to store the
gpu compressible buffers. While the comptags and other metadata related
to these buffers will be stored in CBC carveout which is a GSC carveout
not created/maintained by nvmap. Hence update carveout naming.
Bug 3956637
Change-Id: I50e01a6a8d8bda66960bdee378a12fde176b682a
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2888397
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
There is a potential data race for RO dma-buf in the following scenario:
------------------------------------------------------------------
Process 1 | Process 2 | Process 3 |
------------------------------------------------------------------|
AllocAttr handle H1 | | |
MemMap (H1) | | |
AllocAttr(H2) | | |
MemMap(H2) | | |
id1 = GetSciIpcId(H1)| | |
id2 = GetSciIpcId(H2)|H3=HandleFromSciIpcId | |
id3 = GetSciIpcId(H1)| (id1, RO) |H4=HandleFromSciIpcId|
MemUnmap(H2) |QueryHandlePararms(H3)|(id2, RO) |
MemUnmap(H1) |MemMap(H3) |QueryHandleParams(H4)|
HandleFree(H2) |MemUnmap(H3) |MemMap(H4) |
HandleFree(H1) |HandleFree(H3) |H5=HandleFromSciIpcId|
| |(id3, RO) |
| |QueryHandleParams(H5)|
| |MemMap(H5) |
| |MemUnmap(H4) |
| |MemUnmap(H5) |
| |HandleFree(H4) |
| |HandleFree(H5) |
-------------------------------------------------------------------
The race is happening between the HandleFree(H3) in process 2 and
HandleFromSciIpcId(id3, RO) in process 3. Process 2 tries to free the
H3, and function nvmap_free_handle decrements the RO dma-buf's counter,
so that it reaches 0, but nvmap_dmabuf_release is not called immediately
because of which the process 3 get's false value for the following check
if (is_ro && h->dmabuf_ro == NULL)
It results in calling nvmap_duplicate_handle and then meanwhile function
nvmap_dmabuf_release is called and it makes h->dmabuf_ro to NULL. Hence
get_dma_buf fails with null pointer dereference error.
Fix this issue with following approach:
- Before using dmabuf_ro, take the handle->lock, then check if it is not
NULL.
- If not NULL, then call get_file_rcu on the file associated with RO
dma-buf and check return value.
- If return value is false, then dma-buf's ref counter is zero and it is
going away. So wait until dmabuf_ro is set to NULL; and then create a
new dma-buf for RO.
- Otherwise, use the existing RO dma-buf and decrement the refcount
taken with get_file_rcu.
Bug 3741751
Change-Id: I8987efebc476a794b240ca968b7915b4263ba664
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2850394
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
On TOT, in case of carveouts, nvmap performs cache flush operation
during carveout creation, buffer allocation and buffer release. Due to
cache flush for entire carveout at creation time, nvmap takes ~430 ms
for probe. This is affecting boot KPIs.
Fix this by performing cache flush only at buffer allocation time,
instead of carveout creation and buffer release. This is reducing nvmap
probe time to ~0.69 ms.
Bug 3821631
Change-Id: I54da7dd179f8d30b8b038daf3eceafb355b2e789
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2802353
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Ashish Mhetre <amhetre@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>