Commit Graph

13 Commits

Author SHA1 Message Date
Ketan Patil
033c964778 video: tegra: nvmap: Remove use of add_mm_counter
add_mm_counter is not an exported function, so instead use
atomic_long_add_return/percpu_counter_add to directly modify RSS stat
counters.

Bug 5222690

Change-Id: I51a68d932aeb04f96e51a4a3c286ee5c8efc789a
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3446982
(cherry picked from commit 8e1a6b2dd1)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3453099
(cherry picked from commit e488812038)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3454643
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Tested-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
2025-09-23 10:55:05 -07:00
Ketan Patil
3b830eb8a7 video: tegra: nvmap: Account NvMap memory for OOM Decisions
Account NvMap allocated memory into both RSS and CG tracking to make
efficient OOM kill decisions during memory pressure.

NvMap allocates memory via kernel APIs like alloc_pages, the kernel
memory is not accounted on behalf of process who requests the
allocation. Hence in case OOM, the OOM killer never kills the process
who has allocated memory via NvMap even though this process might be
holding most of the memory.

Solve this issue using following approach:
- Use __GFP_ACCOUNT and __GFP_NORETRY flag
-  __GFP_NORETRY will not let the current allocation flow to go into OOM
path, so that it will never trigger OOM.
- __GFP_ACCOUNT causes the allocation to be accounted to kmemcg. So any
allocation done by NvMap will be definitely accounted to kmemcg and
cgroups can be used to define memory limits.
- Add RSS counting for the process which allocates by NvMap, so that OOM
score for that process will get updated and OOM killer can pick this
process based upon the OOM score.
- Every process that has a reference to NvMap Handle would have the
memory size accounted into its RSS. On releasing the reference to
handle, the RSS would be reduced.

Bug 5222690

Change-Id: I3fa9b76ec9fc8d7f805111cb96e11e2ab1db42ce
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3427871
(cherry picked from commit 2ca91098aa)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3453101
(cherry picked from commit cfe6242c8c)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3454642
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Tested-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
2025-09-23 10:54:56 -07:00
Laxman Dewangan
6b290b7f84 nvmap: Use conftest to find get_rcu_file() argument type
The argument of get_rcu_file() get changed to pointer
type of file handle form Linux 6.7 with
commit 0ede61d8589cc ("file: convert to SLAB_TYPESAFE_BY_RCU").

Use conftest to findout this new argument type.

Bug 4346767

Change-Id: I18943421dd4c2ed4f409ce071b182e68d3d393e6
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3028575
(cherry picked from commit ed5ae591c9)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3036792
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-12-18 09:11:43 -08:00
Laxman Dewangan
b5178e70d8 nvmap: Pass proper argument for get_file_rcu() for Linux 6.7
The function get_file_rcu() has modification in its
argument to take the pointer to pointer of file from
Linux 6.7 from below change
***
commit 0ede61d8589cc2d93aa78230d74ac58b5b8d0244
Author: Christian Brauner <brauner@kernel.org>

    file: convert to SLAB_TYPESAFE_BY_RCU
***

Add support for Linux 6.7.

Bug 4346767

Change-Id: I1e2e005900c7d2c57ac487b5f6ac5e1fcbfbafe7
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3020000
(cherry picked from commit 4ef0a332e8)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3036791
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-12-18 09:11:39 -08:00
Ketan Patil
5c59fead46 video: tegra: nvmap: Fix data race
Consider the following execution scenario, it can lead to a data race

Thread 1                    Thread 2
-------------------------------------------------------
NvRmMemHandleAllocAttr
NvRmMemGetFd
NvRmMemHandleFromFd         close(fd), NvRmMemHandleFree

When thread 1 is executing NvRmMemHandleFromFd while thread 2 is closing
the fd and freeing the handle, then following sequence will lead to
accessing a freed dma-buf and may lead to dereference issue.

- TH1: Get handle from fd and increment handle's refcount.
- TH2: Close the fd and start execution of NvRmMemHandleFree.
- TH2: Decrement ref's dup count, it will become 0, hence ref would be
freed and dma-buf as well, but as handle's refcount is incremented in
step 1, handle won't be freed.
- TH1: Resume HandleFromFd part, call to nvmap_duplicate_handle. Ref is
already freed, so generate new ref and increment dma-buf's count but as
dma-buf is freed already, accessing dma-buf will lead to dereferene
issue. Hence, we need to add a null check here and return error value in
such scenario.

Also, add check for return value of nvmap_handle_get, at missing places.

Bug 4214453

Change-Id: Ib6ef66b4a7126bef2ed1dbb48643445a4ded1bab
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2994962
(cherry picked from commit 7ba6a3abfe65f7cffeeb4004cf0868468121fc32)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2994440
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-10-13 15:38:16 -07:00
Ketan Patil
ebf51c43ae video: tegra: nvmap: Add more checks to avoid races
Add more checks in nvmap code so as to avoid any possible races.
- Update is_nvmap_id_ro and is_nvmap_dmabuf_fd_ro functions so that they
return error value during error conditions and also update their callers
to handle those error values.
- Move all trace statements from end of the function to before handle
refcount or dup count is decremented, this make sure we are not
dereferencing any freed handle/reference/dambuf.
- Increment ref's dup count wherever we feel data race is possible, and
decrement it accordingly towards end of function.

Bug 4253911

Change-Id: I50fc7cc98ebbf3c50025bc2f9ca32882138fb272
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2972602
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-09-06 11:58:16 -07:00
Ketan Patil
01db38f08e video: tegra: nvmap: Add support for Serial Id feature
Add support for Serial Id feature which will be used by Nsight for
buffer tracking purpose. This feature expects a unique serial id per
buffer even if it is shared across multiple client processes.
Add following code:
- Create a new global counter field for serial id in nvmap device.
Initialize it to 0 when nvmap device is initialized.
- Introduce a new field for serial_id in nvmap_handle struct.
- When nvmap_handle is created, assign it's serial_id field with global
counter's value, and increment global counter.
- During NvRmMemQueryHandleParameters return this serial_id associated
with the handle.
- Do not decrement counter for serial_id even after freeing the handle.

Bug 4138373

Change-Id: Ic1fe22b082eefb352986f8fa44d4c38d186a366f
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2918510
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2023-06-10 19:46:37 -07:00
Laxman Dewangan
e8b7a5ca26 nvmap: Use SPDX license GPL 2.0 format
Use SPDX license GPL-V2.0 format and change Nvidia
copyright year to include 2023.

Bug 4078035

Change-Id: I4db6577ddb806690f6ec04f5eaf1364578102d14
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2890635
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Ketan Patil <ketanp@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-22 09:59:20 -07:00
Ketan Patil
678dc695bb video: tegra: nvmap: Fix data race for RO dma-buf
When one process is trying to duplicate RO handle while other process is
trying to free the same RO handle, then race can occur and second
process can decrement the dma-buf's refcount and it may reach to 0. The
first process can then call get_dma_buf on it, leading to NULL pointer
dereference and ultimately to kernel panic. Fix this by taking an extra
dma-buf refcount before duplicating the handle and then decrease it once
duplication is completed.

Bug 3991243

Change-Id: I99901ce19d8a5d23c5192cb10a17efd2ebaf9d3a
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2865519
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-11 05:54:22 +00:00
Ketan Patil
9fdb78226d video: tegra: nvmap: Fix data race for RO dma-buf
There is a potential data race for RO dma-buf in the following scenario:
------------------------------------------------------------------
Process 1            | Process 2            | Process 3           |
------------------------------------------------------------------|
AllocAttr handle H1  |                      |                     |
MemMap (H1)          |                      |                     |
AllocAttr(H2)        |                      |                     |
MemMap(H2)           |                      |                     |
id1 = GetSciIpcId(H1)|                      |                     |
id2 = GetSciIpcId(H2)|H3=HandleFromSciIpcId |                     |
id3 = GetSciIpcId(H1)| (id1, RO)            |H4=HandleFromSciIpcId|
MemUnmap(H2)         |QueryHandlePararms(H3)|(id2, RO)            |
MemUnmap(H1)         |MemMap(H3)            |QueryHandleParams(H4)|
HandleFree(H2)       |MemUnmap(H3)          |MemMap(H4)           |
HandleFree(H1)       |HandleFree(H3)        |H5=HandleFromSciIpcId|
                     |                      |(id3, RO)            |
                     |                      |QueryHandleParams(H5)|
                     |                      |MemMap(H5)           |
                     |                      |MemUnmap(H4)         |
                     |                      |MemUnmap(H5)         |
                     |                      |HandleFree(H4)       |
                     |                      |HandleFree(H5)       |
-------------------------------------------------------------------

The race is happening between the HandleFree(H3) in process 2 and
HandleFromSciIpcId(id3, RO) in process 3. Process 2 tries to free the
H3, and function nvmap_free_handle decrements the RO dma-buf's counter,
so that it reaches 0, but nvmap_dmabuf_release is not called immediately
because of which the process 3 get's false value for the following check
if (is_ro && h->dmabuf_ro == NULL)
It results in calling nvmap_duplicate_handle and then meanwhile function
nvmap_dmabuf_release is called and it makes h->dmabuf_ro to NULL. Hence
get_dma_buf fails with null pointer dereference error.

Fix this issue with following approach:
- Before using dmabuf_ro, take the handle->lock, then check if it is not
NULL.
- If not NULL, then call get_file_rcu on the file associated with RO
dma-buf and check return value.
- If return value is false, then dma-buf's ref counter is zero and it is
going away. So wait until dmabuf_ro is set to NULL; and then create a
new dma-buf for RO.
- Otherwise, use the existing RO dma-buf and decrement the refcount
taken with get_file_rcu.

Bug 3741751

Change-Id: I8987efebc476a794b240ca968b7915b4263ba664
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2850394
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-11 05:54:22 +00:00
Ashish Mhetre
ae282d2c22 video: tegra: nvmap: Fix error pointer dereference
In nvmap_try_duplicate_by_ivmid(), the return pointer from
nvmap_duplicate_handle() is getting dereferenced without checking
whether the pointer is error or valid. This is causing kernel panic.
Fix this by checking if the return pointer is invalid then return error.

Bug 3766497

Change-Id: I010893c9ebda60e313c4f776044a123073399ef2
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2846180
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-11 05:54:22 +00:00
Puneet Saxena
c4a88d6515 tegra: nvmap: dmabuf fd for range from list of handles
It returns dmabuf fd from list of nvmap handles
for a range between offset and size.

The range should fall under total size of all of the handles.
Range may lie between sub set of handles.

Bug 3494980

Change-Id: I5c688f832a7a3bb0b6e3713ec6462224bb6fbfc5
Signed-off-by: Puneet Saxena <puneets@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2789431
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2023-04-11 05:54:22 +00:00
Laxman Dewangan
4cf8c80669 nvmap: Copy drivers and headers from kernel/nvidia
Copy the driver and header sources of the nvmap to
kernel/nvidia-oot from kernel/nvidia as part of
removing the dependency of kernel/nvidia for OOT drivers.

The latest (few) git history of the files copied are
	b7a355916 video: tegra: nvmap: Fix type casting issue
	2128c5433 video: tegra: nvmap: Fix type casting issues
	0cd082559 video: tegra: nvmap: Change peer vm id data type
	4bd7ece67 tegra: nvmap: mark ivm carveout pages occupied
	e86f3630a video: tegra: nvmap: Fix type casting issue
	c43a23e58 video: tegra: nvmap: Fix type casting issue
	ca1dda22e video: tegra: nvmap: Fix type casting issue
	1f567abfe video: tegra: nvmap: Fix wrap up condition
	29db4d31c video: tegra: nvmap: Remove unnecessary debugfs
	fe72f1413 video: tegra: nvmap: Remove get_drv_data() call
	3b0fc79e7 video: tegra: nvmap: Fix coverity defect
	3cc0ce41b video: tegra: nvmap: Fix coverity defect
	6da39e966 video: tegra: nvmap: Fix WARN_ON condition
	a16351ff1 video: tegra: nvmap: Remove dead code
	9993f2d2d video: tegra: nvmap: Update print level
	6066a2077 video: tegra: nvmap: Remove nvmap_debug_lru_allocations_show
	3cdf2b7ba video: tegra: nvmap: Add kernel version check
	716ded4fc video: tegra: nvmap: Initialize the return value
	9b6c1b4ab video: tegra: nvmap: Correct debugfs code
	33e70118b video: tegra: nvmap: Fix Cert-C error handling bug
	7b960ed79 video: tegra: nvmap: Fix Cert-C error handling bug
	945dc1471 video: tegra: nvmap: Fix Cert-C error handling bug
	31e572de2 video: tegra: nvmap: Fix Cert-C error handling bug
	1f25cbf68 video: tegra: nvmap: Fix Cert-C error handling bug
	fa5428107 video: tegra: nvmap: Remove nvmap_handle_get_from_fd
	df73f2208 video: tegra: nvmap: Protect kmap/kunmap code
	9842e7c6a video: tegra: nvmap: Remove t19x dma_buf map/unmap
	06dff1a8d video: tegra: nvmap: Remove unnecessary export symbols
	6f097f86b video: tegra: nvmap: Fix Cert-C error handling bug
	f14171608 video: tegra: nvmap: load nvmap for T23x compatible platforms
	266812814 video: tegra: nvmap: Get rid of NVMAP_CONFIG_KSTABLE_KERNEL
	1b38c0887 nvmap: Don't use NV_BUILD_KERNEL_OPTIONS
	0ab8dc032 video: tegra: nvmap: Reintroduce NVMAP_CONFIG_VPR_RESIZE
	cc8db9797 driver: platform: tegra: Separate out vpr code
	28955d95c video/tegra: nvmap: Enable build as OOT module
	876d1fbb8 video: tegra: nvmap: Remove IS_ENABLED check
	5ea30867a nvmap: Add support to build as module from OOT kernel
	a71ad020e video: tegra: nvmap: Protect tegra_vpr args under config
	e70061cc1 video: tegra: nvmap: Do not export cvnas_dev
	d2a26ff36 video: tegra: nvmap: Include missing header
	692e4f682 video: tegra: nvmap: Update page coloring algo
	2b9dbb911 video: tegra: nvmap: Check for return value
	de8de12b6 video: tegra: nvmap: Enable legacy init support
	65d478158 video: tegra: nvmap: Remove dependency of cvnas
	38bdd6f05 video: tegra: nvmap: Make nvmap as loadable module
	9668e410b video: tegra: nvmap: Enable handle as ID
	11c6cbd23 tegra: nvmap: Fix build for Linux v5.18
	fbd95c3ab linux: nvmap: change ivm_handle to u32
	eb1e2c302 video: tegra: nvmap: Fix NVSCIIPC support
	022689b29 tegra: nvmap: return error if handle as ID enabled but id is fd
	19e5106ed video: tegra: nvmap: Don't treat ivm as reserved mem carveouts

Bug 4038415

Change-Id: I7108aec3b8532fe79c9423c2835744b1213719e8
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
2023-04-11 05:47:21 +00:00