This change removes the check in the tsec comms part of dce
which blocks sending more than one command to tsec fw for the
same unit. This lets the display driver force-send a command
to tsec fw while a previous command is still in flight, to
inform the fw about certain events, e.g. hotplug. The tsec
driver should trust that the display driver does its own
checks when sending commands to tsec fw.
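A minimal sketch of the relaxed send path, assuming hypothetical
names (tsec_comms_send_cmd and tsec_fw_queue_cmd stand in for the
actual dce/tsec symbols):

  /* Hypothetical sketch: the per-unit "command in flight" gate is
   * dropped; the display driver is trusted to serialize correctly. */
  static int tsec_comms_send_cmd(struct tsec_unit *unit,
                                 struct tsec_cmd *cmd)
  {
          /* Previously: if (unit->cmd_in_flight) return -EBUSY; */
          return tsec_fw_queue_cmd(unit, cmd); /* real enqueue elided */
  }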
Bug 5008088
Change-Id: Id765c558a8350c501466685d3894a2c8349550eb
Signed-off-by: spatki <spatki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3288182
Reviewed-by: Nikesh Oswal <noswal@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Add a delay of a few microseconds between IVC channel reset
retries. This prevents the kernel log from being flooded if dce
bootstrapping takes some time for any reason.
Based on the logs below, DCE takes around 10-30 microseconds per
channel reset, so keep the delay at 10-20 microseconds (a sketch
of the retry loop follows the log excerpt).
20:26:46.637409: dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x5]
20:26:46.637421: message repeated 15 times: [ dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x5]]
---
20:26:46.637429: dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x1]
20:26:46.637458: message repeated 12 times: [ dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x1]]
----
20:26:46.637461: dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x2]
20:26:46.637471: message repeated 15 times: [ dce: dce_mailbox_set_full_interrupt:157 Intr bit set multiple times for MB : [0x2]]
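A minimal sketch of the retry loop with the added back-off; the
reset-wait condition is illustrative, while usleep_range() is the
actual kernel helper for short sleeps:

  #include <linux/delay.h>

  /* DCE needs ~10-30 us per channel reset, so sleep 10-20 us per
   * retry instead of spinning and flooding the kernel log. */
  while (!dce_ivc_channel_ready(d)) /* illustrative condition */
          usleep_range(10, 20);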
Jira TDS-6381
Change-Id: I0f8d3c55058019df5a52edd232eae93b3bf84276
Signed-off-by: Mahesh Kumar <mahkumar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3304216
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
- add support for zero copy SHA/GMAC operations
- add support to read zero copy nodes in DT
- support memory buffer map/unmap ioctl interfaces
- unmap all memory buffers when the FD corresponding
to the device node is closed (see the sketch after
this list).
- support only one open call at a time for zero
copy nodes.
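A minimal sketch of the FD-close cleanup, using hypothetical names
(zc_release, zc_unmap_all_bufs and zc_open_count are illustrative,
not the actual driver symbols):

  /* Unmap every buffer still mapped when the zero-copy device-node
   * FD is closed, and release the single-open slot. */
  static int zc_release(struct inode *inode, struct file *file)
  {
          struct zc_ctx *ctx = file->private_data;

          zc_unmap_all_bufs(ctx);        /* driver-specific unmap-all */
          atomic_set(&zc_open_count, 0); /* permit the next open() */
          kfree(ctx);
          return 0;
  }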
Bug 4999798
Change-Id: If110108a73b24ca9f523a8c67a47c02b922c3fd8
Signed-off-by: Nagaraj P N <nagarajp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3292084
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Leo Chiu <lchiu@nvidia.com>
Reviewed-by: Sandeep Trasi <strasi@nvidia.com>
There are debug/duplicate APIs like NvRmMemGetIVCId,
NvRmMemHandleFromIVCId, NvRmMemWrite etc. which don't have
corresponding requirements in DriveOS 7.0 Linux NSR. We have taken
sign-off from the stakeholders to confirm that they are not using
these APIs in the T264 Linux Prod NSR variant. However, some of them
are using these APIs in dev-nsr and did not agree to remove them from
dev-nsr, L4T, HOS etc., so we need to make sure that they do not
accidentally start using these APIs. Hence add the following DT-based
disabling support:
- Add a disable-debug-support property in the tegra-carveouts DT node
in the T264 prod nsr dts.
- Parse this node in nvmap and, if the above property is present,
BUG_ON in the ioctl functions corresponding to these APIs (see the
sketch after this list).
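A minimal sketch of the parse-and-trap flow; of_property_read_bool()
and BUG_ON() are the actual kernel interfaces, the surrounding names
are illustrative:

  #include <linux/of.h>

  static bool nvmap_debug_apis_disabled;

  /* Called once while parsing the tegra-carveouts node. */
  static void nvmap_parse_debug_support(struct device_node *np)
  {
          nvmap_debug_apis_disabled =
                  of_property_read_bool(np, "disable-debug-support");
  }

  /* In each ioctl backing a debug/duplicate API: */
  /*        BUG_ON(nvmap_debug_apis_disabled); */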
Bug 4980348
Change-Id: Icdd5aadf3197d0649b61d285f433fa65ea69e806
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3298507
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
Reviewed-by: Ashish Mhetre <amhetre@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
- Added a UID member to the nvsciipc_config_entry data
structure. This is needed for implementing
test_nvsciipc_cfgblob on Linux.
- Removed static from the ioctl function so an eBPF program
can attach to it.
- Added error-injection.h and the ALLOW_ERROR_INJECTION macro
to the ioctl so bpf_override_return() can be used (see the
sketch after this list).
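A minimal sketch of the error-injection hookup; linux/error-injection.h
and ALLOW_ERROR_INJECTION() are the actual kernel interfaces, while the
ioctl symbol name here is illustrative:

  #include <linux/error-injection.h>
  #include <linux/fs.h>

  /* Non-static so an eBPF program can attach to it by symbol name;
   * ALLOW_ERROR_INJECTION() whitelists it for bpf_override_return(). */
  long nvsciipc_dev_ioctl(struct file *filp, unsigned int cmd,
                          unsigned long arg)
  {
          return 0; /* real handler elided */
  }
  ALLOW_ERROR_INJECTION(nvsciipc_dev_ioctl, ERRNO);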
JIRA NVIPC-2817
Change-Id: Ic27156e321368041f41fbabff9e6375140fe1d0e
Signed-off-by: Suneel Kumar Pemmineti <spemmineti@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3301786
Tested-by: Joshua Cha <joshuac@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Simon Je <sje@nvidia.com>
When pva recovery is initiated due to a command or task submit
timeout, or initiated by pva-fw, queues are cleared and tasks are
returned with an error, among other cleanup activities, and pva is
reset and the fw rebooted. There could be multiple concurrent
attempts to recover the PVA engine. Additionally, task and command
submit may be at varying stages of execution. Hence:
- Skip recovery requests while recovery work is pending (see the
sketch after this list).
- Skip task removal in case of a timed-out or invalidated task
during task submit.
- Skip task submit to CCQ if the task was removed during abort.
- Guard against concurrency during recovery.
- Re-attempt pva reboot on failure during boot, except in recovery.
- Reset the driver PM state if module busy fails and a PM error is
set.
- Set the default FW trace mask to WARN+ERROR+BOOT.
- Set the default driver log mask to FW TRACING.
- Exit the ccq polling routine with a timeout if abort is active.
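A minimal sketch of the skip-while-pending guard, with illustrative
names; test_and_set_bit() and clear_bit() are the actual kernel
primitives:

  /* Only the first requester queues recovery work; concurrent
   * requesters see the bit already set and bail out. */
  static unsigned long pva_flags;
  #define PVA_RECOVERY_PENDING 0

  static void pva_request_recovery(struct pva *pva)
  {
          if (test_and_set_bit(PVA_RECOVERY_PENDING, &pva_flags))
                  return; /* recovery already in flight */
          schedule_work(&pva->recovery_work);
  }

  /* The recovery work function ends with:
   *        clear_bit(PVA_RECOVERY_PENDING, &pva_flags); */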
Bug 4944591
Change-Id: Id3a7388700ccada135b568c978176bb9f2c5f8a0
Signed-off-by: omar <onemri@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3284303
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Amruta Sai Anusha Bhamidipati <abhamidipati@nvidia.com>
In order to test the hypothesis that severe system loads are causing
erroneous timeouts on command submit and task submit, the polling
loop and event wait functions are changed to check for a true
timeout (see the sketch after this list). Also:
-- dump out FW traces on abort outside the ISR.
-- add a debugfs node to override the driver timeout in the mailbox
wait event and ccq wait event.
-- add a device info dump on abort.
-- dump queues.
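A minimal sketch of the true-timeout check; cmd_done() is a
placeholder, while ktime_get() and ktime_ms_delta() are the actual
kernel APIs:

  #include <linux/ktime.h>
  #include <linux/delay.h>

  /* Re-check elapsed wall-clock time so a wakeup delayed by heavy
   * system load is not misreported as a firmware timeout. */
  ktime_t start = ktime_get();

  while (!cmd_done(pva)) {
          if (ktime_ms_delta(ktime_get(), start) > timeout_ms)
                  return -ETIMEDOUT; /* genuine timeout */
          usleep_range(50, 100);
  }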
Bug 4944591
Change-Id: Iea78131016e0913d909f504272a6370bb37c35db
Signed-off-by: omar <onemri@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3259651
Reviewed-by: Amruta Sai Anusha Bhamidipati <abhamidipati@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Fix for: Sparse defects
Sparse stated that:
-symbol 'tegra_vpr1_dev' was not declared. Should it be static?
-symbol 'tegra_vpr_cma_dev' was not declared. Should it be static?
-symbol 'tegra_generic_cma_dev' was not declared. Should it be static?
-symbol 'tegra_vpr_dev' was not declared. Should it be static?
-symbol 'tegra_generic_dev' was not declared. Should it be static?
Make all of the above symbols static since they are used only in
nvmap_init.c.
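The fix is mechanical; for one of the symbols (shown here as a
struct device instance for illustration):

  /* Before: external linkage, so sparse asks for a declaration. */
  struct device tegra_vpr_dev;

  /* After: internal linkage, since only nvmap_init.c uses it. */
  static struct device tegra_vpr_dev;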
Bug 4513982
Change-Id: I4887d994d9ae852a4faa7da735c18d25b393187a
Signed-off-by: Surbhi Singh <surbhis@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3295831
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
In the current implementation, when NvRmMemQueryHeapParams is
called and multiple numa nodes are online:
1. For the iovmm carveout, numa_id is set to a garbage value,
and we call compute_memory_stat with it.
2. For the gpu carveout, we return values for
numa_id 0.
3. For other carveouts, we return params for the
first matching entry in nvmap_dev->heaps[i].
Correct this behavior as follows: regardless of carveout type,
return params for numa_id 0 when NvRmMemQueryHeapParams is called
and multiple numa nodes are online (see the sketch below).
In the long term, we need to disable NvRmMemQueryHeapParams when
multiple numa nodes are online; clients should use
NvRmMemQueryHeapParamsNuma instead.
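A minimal sketch of the corrected path, reusing compute_memory_stat
from the description; num_online_nodes() is the actual kernel macro
and the other names are illustrative:

  /* Legacy query: with several NUMA nodes online, always report
   * node 0, regardless of carveout type. */
  if (num_online_nodes() > 1)
          numa_id = 0;
  ret = compute_memory_stat(heap_type, numa_id, &params);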
Jira TMM-5970
Change-Id: Id49289e51eda187b1d676e5192583f320835c2f4
Signed-off-by: N V S Abhishek <nabhishek@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3290730
Reviewed-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
MEMSERV70-REQ-670 makes validation checks mandatory for input
flowing across an execution boundary. Hence add checks for input
flags in nvmap and make sure execution does not proceed if a flag
other than read or write is provided during handle duplication,
sciipc id creation, or handle creation from a sciipc id, even
though the same checks are already present at the libnvrm_mem
layer (see the sketch below).
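A minimal sketch of the flag check; the mask and flag names
(NVMAP_HANDLE_RO/NVMAP_HANDLE_RW) are hypothetical:

  /* Reject any access flag other than read or write before the
   * request crosses the execution boundary. */
  #define NVMAP_VALID_ACCESS_FLAGS (NVMAP_HANDLE_RO | NVMAP_HANDLE_RW)

  if (op.flags & ~NVMAP_VALID_ACCESS_FLAGS)
          return -EINVAL;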
JIRA TMM-5962
Change-Id: I1fc6ce6ec4435c50220d4e49a08de50320a8f574
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3295201
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
- use the TSC lock trigger interval param from dt to optimize.
As per HW suggestion, we should not trigger sync on every
PPS edge; TSC needs at least 2 PPS edges to align with the
PTP clock.
- save the platform-specific register offsets at drv init time
instead of checking the plat id every time in the monitoring
thread (see the sketch after this list).
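A minimal sketch of the init-time reads; of_property_read_u32() and
of_device_get_match_data() are the actual kernel APIs, while the
property and field names are illustrative:

  #include <linux/of.h>
  #include <linux/of_device.h>

  /* Read the lock trigger interval once at probe... */
  of_property_read_u32(np, "nvidia,tsc-lock-trigger-interval",
                       &tsc->lock_trigger_interval);

  /* ...and cache the platform-specific register offsets so the
   * monitoring thread never re-checks the plat id. */
  tsc->regs = of_device_get_match_data(&pdev->dev);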
Bug 5042311
Bug 4899241
Bug 5082436
Signed-off-by: Sheetal Tigadoli <stigadoli@nvidia.com>
Change-Id: I22befbc2a52c22ace1a8573b9a34a544ed1ae8f9
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3294329
(cherry picked from commit 209dc26eddd2cd5e9d88ea8c6eb603706cd3c3f0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3292322
Reviewed-by: Amlan Kundu <akundu@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Add a kernel module parameter to enable a mode where syncpoints
must be freed explicitly (using the free IOCTL) or they will be
left dangling. This ensures that, for particularly locked-down
configurations, a syncpoint will be left forever in an expected
state even if the process owning it dies -- for example, another
process will not be able to allocate it.
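A minimal sketch of such a parameter (the name explicit_syncpt_free
is illustrative); module_param() and MODULE_PARM_DESC() are the
actual kernel macros:

  #include <linux/moduleparam.h>

  /* Opt-in: syncpoints stay allocated until freed via the free
   * IOCTL, even if the owning process dies. */
  static bool explicit_syncpt_free;
  module_param(explicit_syncpt_free, bool, 0444);
  MODULE_PARM_DESC(explicit_syncpt_free,
                   "Syncpoints must be freed explicitly via IOCTL");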
Change-Id: I2f350c710775a296c70910df21e95737a36c6a45
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3284405
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Santosh BS <santoshb@nvidia.com>
On TOT, NvMap does a page-by-page cache flush, i.e. it takes the
virtual address of each page present in the buffer and then
performs a cache flush on it using dcache_by_line_op. This results
in very poor performance for larger buffers: ~70% of the time
taken by NvRmMemHandleAllocAttr is consumed in the cache flush.
Address this perf issue using a multithreaded cache flush (see the
sketch after this list):
- Use a threshold value of 32768 pages, which is derived from perf
experiments and from discussion with cuda about its usecases.
- When a cache flush of >= 32768 pages is requested, vmap the pages
to map them into contiguous VA space and create n kernel threads,
where n is the number of online CPUs.
- Divide the above VA range among the threads; each thread then
performs a cache flush on the VA range assigned to it.
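A minimal sketch of the fan-out; vmap(), num_online_cpus() and
kthread_run() are the actual kernel APIs, while flush_va_range()
stands in for the dcache_by_line_op-based flush:

  #include <linux/vmalloc.h>
  #include <linux/kthread.h>

  /* Map all pages into one contiguous kernel VA range... */
  void *va = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
  unsigned int i, nthreads = num_online_cpus();
  size_t chunk = ((size_t)nr_pages << PAGE_SHIFT) / nthreads;

  /* ...then hand each worker thread an equal slice to flush. */
  for (i = 0; i < nthreads; i++)
          kthread_run(flush_va_range, va + i * chunk,
                      "nvmap-flush/%u", i);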
This logic results in the following % improvement for alloc tests.
-------------------------------------
| Buffer Size in MB | % improvement |
|-------------------|---------------|
| 128               | 52            |
| 256               | 56            |
| 512               | 57            |
| 1024              | 58            |
| 1536              | 57            |
| 2048              | 58            |
| 2560              | 57            |
| 3072              | 58            |
| 3584              | 58            |
| 4096              | 58            |
| 4608              | 58            |
| 5120              | 58            |
-------------------------------------
Bug 4628529
Change-Id: I803ef5245ff9283fdc3afc497a6b642c97e89c06
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3187871
Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>