Use the offset and memory handle to calculate the IOVA address
for descriptors. This allows a single user-space buffer to be
used for all descriptors.
Jira DLA-176
Change-Id: I141efa7fc8662be8aa4b5c3bd2ea7a369a90769a
Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-on: http://git-master/r/1220956
Some commands need a completion or error notification from
firmware before proceeding, for example when clearing resources
or reading a response from firmware.
Add a mechanism to wait for the command-complete or error
notification from firmware. This adds the limitation that only
one command can be handled at a time, as there is a single set
of registers for command send and response.
Add locking around command send so that only one command is
processed at a time.
Remove the wait-for-idle from the ping command and use the
waiting mechanism instead.
Clean up task resources if the task submit command fails.
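The serialize-then-wait pattern described above can be sketched in user space with pthreads (a minimal, hypothetical analogue: the lock stands in for the single command register set, the condition variable for the firmware completion/error notification; none of these names come from the driver):

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One lock serializes command submission (single register set);
 * a condition variable models the firmware completion/error IRQ. */
static pthread_mutex_t cmd_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cond = PTHREAD_COND_INITIALIZER;
static int done;
static int fw_status;

/* "Firmware" thread: completes the command and notifies the waiter. */
static void *fw_responder(void *arg)
{
    pthread_mutex_lock(&done_lock);
    fw_status = (int)(intptr_t)arg;   /* 0 = success, <0 = error */
    done = 1;
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_lock);
    return NULL;
}

/* Send one command, then block until completion or error arrives. */
int send_cmd_and_wait(int simulated_status)
{
    pthread_t fw;
    int status;

    pthread_mutex_lock(&cmd_lock);    /* only one command in flight */
    done = 0;
    pthread_create(&fw, NULL, fw_responder,
                   (void *)(intptr_t)simulated_status);

    pthread_mutex_lock(&done_lock);
    while (!done)                     /* wait for the notification */
        pthread_cond_wait(&done_cond, &done_lock);
    status = fw_status;
    pthread_mutex_unlock(&done_lock);

    pthread_join(fw, NULL);
    pthread_mutex_unlock(&cmd_lock);
    return status;
}
```

The outer lock is what enforces the one-command-at-a-time limitation noted above.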
Jira DLA-127
Jira DLA-176
Change-Id: I92246c080c730dcae514bcea93b78372799bda4a
Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-on: http://git-master/r/1219539
- pin mapped operation descriptor buffers during task submission
- get the operation descriptor handle from user space and pass its
  IOVA to the engine
- the pin API returns the IOVA for a given memory handle
- unpin operation descriptor buffers during task cleanup
Jira DLA-93
Change-Id: I78fb22301ab472685c3bae7c424d75140b814887
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1213761
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
GVS: Gerrit_Virtual_Submit
- add debug macro functions for different debug levels, e.g. info,
  function, register access
- add an option to print either to the console or to trace
- add debugfs nodes to set the debug level and a flag to choose
  trace
Jira DLA-134
Jira DLA-135
Jira DLA-136
Change-Id: I4cfbb463a2cf1a47d40dce911c86abb4542f957a
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1203575
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
- fix include file paths in the makefile
- fix the ioctl include header file path
- update comments in doxygen format
add support to submit tasks to the engine per the task list
management protocol
- maintain a list of tasks under the assigned queue, with
  reference counts
- allocate a task structure to maintain the list of fences and
  update them
- DMA-allocate the task descriptor and action list and update them
- submit tasks one by one and send the received fence back
  to the application
- register a syncpoint notifier with nvhost for fence completion
- on the fence-completion interrupt, clean up the task
Jira DLA-52
Change-Id: Ibe385f47dc9f17dda79cca3daf29b89218dc7289
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1191495
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
- as falcon may not be in a state to process any more requests in
  the power-off path, do not send the set region command
Jira DLA-19
Change-Id: I7ac858554f769b659d2738f7c8ed48b53cc8ec15
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1197453
- move queue-related DLA APIs to a new file so the task APIs can
  live alongside them
Jira DLA-19
Change-Id: I312e021314a3fb7d03dd31a557fb7cf6d6fc86ca
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1191494
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
- as the core driver code is growing, split the IOCTL-related APIs
  into a new source file.
Jira DLA-19
Change-Id: I42ce24300671392e6ac99fcdae12e2525f74e57e
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1191491
Update set region command as per new interface
Jira DLA-19
Change-Id: Ia171fb89b890f79b8df27785079a00cef7351003
Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Reviewed-on: http://git-master/r/1180574
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Fix below warning from sparse checks:
- nvdla/nvdla.c warning: symbol 'nvdla_queue_abort' was not
declared. Should it be static?
- pva/pva.c warning: symbol 'pva_queue_abort' was not declared.
Should it be static?
Bug 200088648
Change-Id: I084156f1b0605008fe9b1dbe534211a682257e2e
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1176532
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
- this adds the first IOCTL for NvDLA, for the ping cmd
- the ping cmd verifies that falcon and memory read/write are
  working
- through the IOCTL, a ping number is passed to falcon via the KMD
- for the CRC check, falcon updates the mailbox and writes the
  value back with a multiplier
Jira DLA-20
Change-Id: I9cd1bb57d42d00b03907d7cb45750dcec0b2df7b
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1170198
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Fix sparse warning:
nvdla.c:40:1: warning: symbol 'attrs' was not declared.
Should it be static?
Bug 200088648
Change-Id: Ic83c46c938fe82d1e8cbdd7c7e2337b39580cc88
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1167423
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
- this adds falcon interrupt support for the NvDLA driver
- register the device for the falcon interrupt
- allocate a dump data region and pass its DMA address to falcon
  after firmware load
- in the ISR, read dump data from the allocated dump region and
  print it to the console
- during engine power-off, free the dump region
- add the DLA KMD<->ucode interface header file for cmd
  communication
Jira DLA-45
Jira HOSTX-61
Change-Id: I2163c0e50ce8e2231e185d37bcd3ef8e979f7bdf
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1160994
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
GVS: Gerrit_Virtual_Submit
- add support to boot falcon through the nvhost_nvdla_finalize_poweron() PM API
- nvhost_nvdla_finalize_poweron() is called by nvhost as the
  restore-start dev op
- nvhost_flcn_finalize_poweron() is an API provided by the falcon
  framework to request firmware, parse the ucode, and boot falcon
- specify the firmware name in the NvDLA device data, which is
  required for request_firmware
- fix Kconfig to enable TEGRA_GRHOST_NVDLA by default
Jira DLA-16
Change-Id: I14791fc1f97c283ff9e9b1890183033bfc4087aa
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1147900
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
The private data element of the platform data is set during falcon
init, so remove setting it at probe time.
Jira DLA-33
Change-Id: I9807d4520757e8e708b674a1b8f4f95aa24ad526
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1156132
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
- NvDLA is a fixed-function accelerator engine for deep learning in
  Tegra. The engine supports various layers such as convolution,
  fully connected, activation, pooling, and normalization.
- This patch adds a minimal support stub for engine device
  initialization.
Jira DLA-5
Change-Id: Iecdd3963a77a2f20979ae412ff2f9388c57a26b1
Signed-off-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-on: http://git-master/r/1132605
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prashant Gaikwad <pgaikwad@nvidia.com>
Remove the dummy makefile to prepare the nvdla folder for driver
integration from kernel/nvidia to kernel/nvidia-oot.
Bug 4038415
Change-Id: I45d8fffc504ab9530718c1fa4f3960037e909f25
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Merge the nvmap driver from kernel/nvidia to kernel/nvidia-oot
to get rid of kernel/nvidia for OOT drivers.
Merge remote-tracking branch
'origin/dev/ldewangan/nvidia-nvmap-dev-main'
into oot-nvidia-nvmap-dev-main
Changes merged are:
3059baed0 video: tegra: nvmap: Resolve incorrect check
b20790d77 video: tegra: nvmap: Fix data race for RO dma-buf
fba5e4766 video: tegra: nvmap: Fix build for Linux v6.3
6fd5f8966 video: tegra: nvmap: Fix overflow condition
47077f96f video: tegra: nvmap: Update contig flag check
49628c223 video: tegra: nvmap: Fix data race for RO dma-buf
73e81759d video: tegra: nvmap: Fix error pointer dereference
2c51bb8eb video: tegra: nvmap: Fix kmemleak issues
a2e6ee293 tegra:nvmap: do not export symbol for init functions
4629804ea video: tegra: nvmap: Remove use of bitmap_allocate_region
ffb301fb9 video: tegra: nvmap: Fix kmemleak issue
86a33032e nvmap: Don't free pages while freeing subhandle
06bf4d6ac nvmap: Fix coherency issues while creating subhandle
064bf50f9 nvmap: Add traces for big pages in page pool
dcac1c0a9 nvmap: Use same nvmap client for namespaces in a process
3a59e1982 video: tegra: nvmap: Make NvMap load based on DT
d703b19fe nvmap: Fix type casting issue
d6b9b8d21 nvmap: Fix type casting issue
3dbdad35c nvmap: Fix type casting issue
eebe13973 tegra: nvmap: dmabuf fd for range from list of handles
699708288 video: tegra: nvmap: Remove unused code
9a13e1eb9 nvmap: Remove use of __dma_flush_area
cc519075b nvmap: Register nvmap device towards end of probe
66b9a3912 nvmap: Keep cache flush at allocation time only
34ec2845a tegra: nvmap: make sciipc_id as 64 bit
bed3861b1 tegra: nvmap: add _dma_*area prototype
0ae70ffd0 tegra: nvmap: fix build for kernel 6.0
d826f0508 video: tegra: nvmap: Fix type casting issue
003efb449 tegra: nvmap: replace _dma_* and __iomap
Bug 4038415
Change-Id: I43422655bf7b28f215902d4c01f21d99c579a97d
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
When one process tries to duplicate an RO handle while another
process is freeing the same RO handle, a race can occur: the second
process may decrement the dma-buf's refcount so that it reaches 0.
The first process can then call get_dma_buf on it, leading to a
NULL pointer dereference and ultimately a kernel panic. Fix this by
taking an extra dma-buf refcount before duplicating the handle and
dropping it once duplication is complete.
Bug 3991243
Change-Id: I99901ce19d8a5d23c5192cb10a17efd2ebaf9d3a
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2865519
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
When the carveout size is changed to 2 GB, mem->size << PAGE_SHIFT
overflows the int limit and wraps to a negative value. As a result,
one of the comparison conditions while freeing the bitmap is not
met and the bitmap is not freed; eventually the entire bitmap is
consumed even though empty bits are expected. Fix this by
typecasting the size to u64.
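The wraparound can be reproduced in isolation (an illustrative sketch, not the driver's code; the function names and the simplified types are invented, and the wrap result assumes the usual two's-complement conversion):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Buggy variant: the shift is done in 32-bit arithmetic, so a
 * 2 GB carveout (524288 pages * 4 KB) wraps to a negative value. */
int64_t size_in_bytes_buggy(int32_t pages)
{
    return (int32_t)((uint32_t)pages << PAGE_SHIFT);
}

/* Fixed variant, mirroring the commit's u64 cast: widen before
 * shifting so the byte size is computed in 64-bit arithmetic. */
int64_t size_in_bytes_fixed(int32_t pages)
{
    return (int64_t)((uint64_t)pages << PAGE_SHIFT);
}
```

With 524288 pages the buggy variant goes negative, which is exactly why the bitmap-free comparison stopped matching.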
Bug 3962552
Change-Id: Ieaf93a3a91062d3f630921259aa9b3935853e91c
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2861614
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
There is a potential data race for RO dma-buf in the following scenario:
----------------------------------------------------------------------------
Process 1             | Process 2                  | Process 3
----------------------|----------------------------|------------------------
AllocAttr handle H1   |                            |
MemMap(H1)            |                            |
AllocAttr(H2)         |                            |
MemMap(H2)            |                            |
id1 = GetSciIpcId(H1) |                            |
id2 = GetSciIpcId(H2) | H3 = HandleFromSciIpcId    |
id3 = GetSciIpcId(H1) |      (id1, RO)             | H4 = HandleFromSciIpcId
MemUnmap(H2)          | QueryHandleParams(H3)      |      (id2, RO)
MemUnmap(H1)          | MemMap(H3)                 | QueryHandleParams(H4)
HandleFree(H2)        | MemUnmap(H3)               | MemMap(H4)
HandleFree(H1)        | HandleFree(H3)             | H5 = HandleFromSciIpcId
                      |                            |      (id3, RO)
                      |                            | QueryHandleParams(H5)
                      |                            | MemMap(H5)
                      |                            | MemUnmap(H4)
                      |                            | MemUnmap(H5)
                      |                            | HandleFree(H4)
                      |                            | HandleFree(H5)
----------------------------------------------------------------------------
The race occurs between HandleFree(H3) in process 2 and
HandleFromSciIpcId(id3, RO) in process 3. Process 2 frees H3, and
nvmap_free_handle decrements the RO dma-buf's refcount so that it
reaches 0, but nvmap_dmabuf_release is not called immediately.
Process 3 therefore gets a false value for the following check:
if (is_ro && h->dmabuf_ro == NULL)
This results in calling nvmap_duplicate_handle; meanwhile
nvmap_dmabuf_release runs and sets h->dmabuf_ro to NULL, so
get_dma_buf fails with a NULL pointer dereference.
Fix this issue with the following approach:
- Before using dmabuf_ro, take handle->lock, then check that it is
  not NULL.
- If it is not NULL, call get_file_rcu on the file associated with
  the RO dma-buf and check the return value.
- If the return value is false, the dma-buf's refcount is zero and
  it is going away, so wait until dmabuf_ro is set to NULL and then
  create a new dma-buf for RO.
- Otherwise, use the existing RO dma-buf and drop the refcount
  taken with get_file_rcu.
Bug 3741751
Change-Id: I8987efebc476a794b240ca968b7915b4263ba664
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2850394
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
NvMap uses bitmap_allocate_region for allocations from the IVM
carveout. That function requires the size to be a power of 2, which
results in memory shortage. A better way to handle this is
bitmap_find_next_zero_area, which takes the bitmap size rather than
an order, followed by bitmap_set to mark the allocated bits.
Similarly, when freeing the buffer, use bitmap_clear instead of
bitmap_release_region.
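The benefit is that requests no longer round up to a power of 2. A toy user-space bitmap (not the kernel's bitmap API; one byte per bit for clarity, all names invented) shows the find-zero-area-then-set/clear pattern:

```c
#include <assert.h>
#include <string.h>
#include <stdint.h>

#define NBITS 64
static uint8_t bmp[NBITS];   /* one byte per bit, for clarity */

/* Find the first run of 'len' clear bits, mark it allocated, and
 * return its start index: the find_next_zero_area + set pattern.
 * A power-of-2 allocator would round a 3-bit request up to 4. */
int bmp_alloc(int len)
{
    int run = 0;
    for (int i = 0; i < NBITS; i++) {
        run = bmp[i] ? 0 : run + 1;
        if (run == len) {
            int start = i - len + 1;
            memset(&bmp[start], 1, len);   /* bitmap_set analogue */
            return start;
        }
    }
    return -1;                             /* no room */
}

/* bitmap_clear analogue: release exactly the bits that were taken. */
void bmp_free(int start, int len)
{
    memset(&bmp[start], 0, len);
}
```

Because allocations pack tightly, back-to-back 3-bit requests land at offsets 0 and 3 rather than 0 and 4, which is the memory-shortage fix the commit describes.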
Bug 3923812
Change-Id: I91005d16f678405f341c4fc620509f56af538e1c
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2839848
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
get_task_struct increments the refcount on the task struct, and
put_task_struct decrements it; the task_struct is not freed until
its refcount reaches 0. The missing put_task_struct in the nvmap
code was therefore causing a kmemleak; fix it by adding the missing
call. Also, mutex_unlock was missing in one return path; add it.
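The get/put pairing being restored here can be modeled in miniature (an illustrative sketch; the struct, function names, and leak counter are invented stand-ins for get_task_struct/put_task_struct, not kernel code):

```c
#include <assert.h>
#include <stdlib.h>

struct task { int refs; };

int live_objects;   /* stands in for what kmemleak would report */

struct task *task_new(void)
{
    struct task *t = malloc(sizeof(*t));
    t->refs = 1;
    live_objects++;
    return t;
}

void task_get(struct task *t) { t->refs++; }

void task_put(struct task *t)
{
    if (--t->refs == 0) {     /* freed only when the last ref drops */
        free(t);
        live_objects--;
    }
}

/* Balanced user: every get is paired with a put on the way out.
 * Dropping the task_put here is the bug class the commit fixes. */
void use_task(struct task *t)
{
    task_get(t);
    /* ... work on t ... */
    task_put(t);
}
```

Without the put, refs never returns to 0 and the object is leaked, which is exactly how the missing put_task_struct showed up as a kmemleak.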
Bug 3901618
Change-Id: I630eac19e628a549179a8ddaad86ad4d2c9b9a53
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2837383
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Ashish Mhetre <amhetre@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
This patch fixes the following issues:
- When the handle associated with a sub-buffer is freed,
  nvmap_page_pool_fill_lots is called and frees pages whose
  refcount is > 1 even though the main handle is not freed; add a
  check for the sub-handle.
- In the CPU unmap code, list_for_each_entry is used to iterate
  over the vma list while list_del is also called in the same loop,
  which can lead to undefined behavior; use
  list_for_each_entry_safe instead, which is safe against removal
  of list entries.
- The mutex of the sub-handle is not initialized; initialize it.
- Set an error value when handle creation fails.
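The list_for_each_entry_safe change amounts to saving the next pointer before the current node can be freed. A plain-C analogue on a toy singly-linked list (not the kernel's list.h; all names invented) makes the pattern concrete:

```c
#include <assert.h>
#include <stdlib.h>

struct node { int val; struct node *next; };

struct node *push(struct node *head, int val)
{
    struct node *n = malloc(sizeof(*n));
    n->val = val;
    n->next = head;
    return n;
}

/* Delete every matching node during iteration; returns new head.
 * 'next' is captured before free(), which is what the "_safe"
 * iterator does and what the unsafe loop in the bug did not. */
struct node *remove_val(struct node *head, int val)
{
    struct node **pp = &head;
    struct node *cur, *next;

    for (cur = head; cur; cur = next) {
        next = cur->next;          /* saved before cur may be freed */
        if (cur->val == val) {
            *pp = next;
            free(cur);             /* cur is now invalid */
        } else {
            pp = &cur->next;
        }
    }
    return head;
}

int list_len(struct node *head)
{
    int n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

Reading cur->next after free(cur), as the unsafe iterator effectively does when the loop body deletes entries, is the undefined behavior the commit removes.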
Bug 3494980
Change-Id: I0659d7f70b44814e87e3081702352e891d9191f7
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2824668
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Puneet Saxena <puneets@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Some libraries may be linked into different namespaces even though
they belong to the same process. Each namespace can then open the
/dev/nvmap node and call nvmap ioctls instead of reusing the
already opened /dev/nvmap FD. Because of this, memory handles
created by one namespace cannot be used by other namespaces, even
though they belong to the same process.
Fix this as follows:
- When the /dev/nvmap node is opened, check whether an nvmap client
  with the same pid already exists; if so, use that client,
  otherwise create a new one.
- Track the number of namespaces via a count field.
- When the /dev/nvmap node is closed, destroy the nvmap client only
  when the count reaches 0.
Bug 3689604
Change-Id: I4c91db36b88e78b7526dd63b006e562c8f66c7ae
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2819915
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-by: Ivan Raul Guadarrama <iguadarrama@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>