video: tegra: nvmap: Fix data race between create and destroy client

nvmap uses pid of group_leader task to indicate a client process. During
create_client operation, whenever any client with the same group_leader
pid already exists in clients list of nvmap_device, then nvmap
increments the count field of nvmap_client struct. Otherwise, create a
new nvmap_client. Both of the operations i.e. checking the list for
client and incrementing the counter happen inside lock. On the other
hand, during nvmap_release, first the counter is decremented and checked
if it's zero or not. If it's zero then the lock is taken and client is
removed from client list of nvmap_device. As both the operations i.e.
decrementing the counter value and removing client from list (if the
counter becomes 0) are not happening inside a lock, it's resulting into
the following data race scenario.
1) nvmap_release on existing client process 1
   - decrement client's counter
   - counter value has become zero
   - client is yet to be removed from the dev->clients list
   - context switch happen to __nvmap_create_client as another
namespace/thread with same with same group_leader pid is created.
2) __nvmap_create_client
   - as the client with same pid exists in dev->client list, it
increments counter value to 1, instead of creating a new client struct.
   - context switch happen to nvmap_release from step 1
3) nvmap_release
   - It calls destroy_client and remove the client from dev->client
list.
   - Completes rest of the operations in destroy_client and returns.
   - Context switch to remaining operations from step 2
4) nvmap_release
   - Now, when the nvmap_release will be called for the thread/namespace
which was created in step 2, then list_del operation would fail as the
client struct was already removed from dev->client list.

Fix the above issue by doing both operations i.e. decrementing the
counter value and removing the client struct from dev->client list in a
single lock.

Bug 4829958

Change-Id: I87ebbcb45b18114d0ec75520443bee010f88d59e
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3209794
(cherry picked from commit cc74d1fe1b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3207520
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
This commit is contained in:
Ketan Patil
2024-09-06 13:48:31 +00:00
committed by Jon Hunter
parent 5f010bd073
commit ebabca3f65

View File

@@ -245,12 +245,20 @@ static void destroy_client(struct nvmap_client *client)
if (!client)
return;
mutex_lock(&nvmap_dev->clients_lock);
/*
* count field tracks the number of namespaces within a process.
* Destroy the client only after all namespaces close the /dev/nvmap node.
*/
if (atomic_dec_return(&client->count)) {
mutex_unlock(&nvmap_dev->clients_lock);
return;
}
nvmap_id_array_exit(&client->id_array);
#ifdef NVMAP_CONFIG_HANDLE_AS_ID
client->ida = NULL;
#endif
mutex_lock(&nvmap_dev->clients_lock);
if (!IS_ERR_OR_NULL(nvmap_dev->handles_by_pid)) {
pid_t pid = nvmap_client_pid(client);
nvmap_pid_put_locked(nvmap_dev, pid);
@@ -317,13 +325,7 @@ static int nvmap_release(struct inode *inode, struct file *filp)
return 0;
trace_nvmap_release(priv, priv->name);
/*
* count field tracks the number of namespaces within a process.
* Destroy the client only after all namespaces close the /dev/nvmap node.
*/
if (!atomic_dec_return(&priv->count))
destroy_client(priv);
return 0;
}