Problem:
- When a Downstream Port Containment (DPC) software trigger is issued, the LTR_EN bit in the Root Port (RP) is cleared as per PCIe spec.
- However, the LTR_EN bit of the RTL8126 endpoint (EP), which is expected to be reset, stays set, and the EP keeps sending Latency Tolerance Reports (LTR) to the RP.
- This behavior violates the PCIe spec, as LTR_EN is a non-sticky bit and should be cleared automatically on reset.
- As the RP has LTR disabled but the EP still sends LTR messages, it results in Unsupported Request (UR) errors on the RP.
- These UR errors trigger AER (Advanced Error Reporting) recovery, which includes a Secondary Bus Reset (SBR).
- The SBR causes the PCIe link to go down and come back up, but the EP again starts sending LTRs, leading to an infinite error-recovery loop.
Workaround:
- As a temporary fix, disable the LTR_EN bit in the RTL8126 EP during its probe.
- This prevents the EP from sending LTR messages, thereby avoiding UR errors and breaking the loop of AER recovery.
Impact:
- Disabling LTR prevents the EP from entering the L1.2 low power state.
- However, ASPM is currently not enabled in the system, so this workaround has no impact.
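The workaround boils down to clearing one bit in the endpoint's Device Control 2 register. A minimal sketch of that bit operation follows, modelled in plain C so it can run outside the kernel. The helper name rtl8126_disable_ltr is hypothetical and the mask mirrors the kernel's PCI_EXP_DEVCTL2_LTR_EN value. In a real probe path the same effect would presumably come from pcie_capability_clear_word() on PCI_EXP_DEVCTL2, which is an assumption about the driver and not taken from this change.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors PCI_EXP_DEVCTL2_LTR_EN: bit 10 of the Device Control 2 register. */
#define DEVCTL2_LTR_EN 0x0400u

/* Hypothetical helper: return the DEVCTL2 value with the LTR mechanism disabled. */
uint16_t rtl8126_disable_ltr(uint16_t devctl2)
{
	return (uint16_t)(devctl2 & ~DEVCTL2_LTR_EN);
}
```

All other Device Control 2 bits are left untouched, so the change is limited to LTR.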
Bug 4869463
Change-Id: Ibf7effaeb0f22e952645ef7bf6a18287264e1463
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3420019
Reviewed-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Reviewed-by: Ashutosh Jha <ajha@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
In rare, unexpected cases, HSB (Holoscan sensor bridge) sends an image
byte offset larger than the allocated image size (e.g. if HSB sends an
incorrect packet, is configured for a different image size, or a
packet is corrupted).
In such cases, we run into SMMU faults.
To mitigate this, a buffer size of two check was introduced so even
were this to happen, it would not cause SMMU errors.
However, the support for this in UMD is not complete.
Therefore, disable this check until UMD is able to comply with this
buffer constraint.
Jira L4T-7463
Change-Id: I2de31740284627ca117f1fa0a28bde2ef9a82785
Signed-off-by: Rakibul Hassan <rakibulh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3419644
Reviewed-by: Igor Mitsyanko <imitsyanko@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Narendra Kondapalli <nkondapalli@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Modify the CoE capture logic to make it more robust and error-proof:
- The RCE Rx queue size limit is 16, so there is no point in having a
32-element queue in the kernel.
- Pass the kernel's queue length to RCE when opening a channel so it can
be validated (so it does not exceed the RCE max depth).
- Validate image buffer IOVA addresses and buffer lengths before queuing
to RCE.
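The checks listed above can be sketched roughly as follows. This is an illustration under assumptions, not the driver code: struct coe_iova_window and both function names are hypothetical, and only the depth limit of 16 comes from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RCE Rx queue depth limit stated in the commit message. */
#define COE_RCE_MAX_QUEUE_DEPTH 16u

/* Hypothetical description of the IOVA window mapped for a channel. */
struct coe_iova_window {
	uint64_t base;
	uint64_t size;
};

/* The kernel queue must not be deeper than what RCE accepts. */
bool coe_queue_depth_ok(uint32_t depth)
{
	return depth > 0 && depth <= COE_RCE_MAX_QUEUE_DEPTH;
}

/* A buffer is valid if it is non-empty and lies fully inside the window. */
bool coe_buf_ok(const struct coe_iova_window *w, uint64_t iova, uint64_t len)
{
	uint64_t off;

	if (len == 0 || iova < w->base)
		return false;

	/* Compare against the remaining space so iova + len cannot overflow. */
	off = iova - w->base;
	return off < w->size && len <= w->size - off;
}
```

Comparing the length against the space remaining in the window, instead of computing iova + len, keeps the check safe for values near the top of the 64-bit range.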
Jira CT26X-1892
Change-Id: I199143fe726ebab05a1236d4b14b59f0528d65a8
Signed-off-by: Igor Mitsyanko <imitsyanko@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3419638
Reviewed-by: svcacv <svcacv@nvidia.com>
Tested-by: Raki Hassan <rakibulh@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Narendra Kondapalli <nkondapalli@nvidia.com>
Upstream commit aa7a9275ab81 ("PM: sleep: Suspend async parents after
suspending children") triggers a suspend issue on Tegra234 Jetson
Orin Nano boards because it reorders the suspend of devices that have
async suspend enabled with respect to some other devices. This commit
is present in Linux v6.16 kernels.
The same issue was observed with the cypd4226 Type-C controller used on
other Jetson platforms, where, because of its dependencies on other
devices, async suspend had to be disabled to fix the issue [0]. Fix
suspend for Tegra234 Jetson Orin Nano platforms by disabling 'async'
suspend for the fusb301 device. Note that it is safe to disable this for
all kernel versions.
[0] https://lore.kernel.org/lkml/6180608.lOV4Wx5bFT@rjwysocki.net/
JIRA LINQPJ14-73
Change-Id: If08932406c43bca2736164a2fdd96a5a4b9fa81c
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3404885
(cherry picked from commit 21686177a6d395701cc8f19088090142657899a0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3411825
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
This avoids a kernel hang when the DCE FW fails to respond.
A failed IPC call returns -ERESTARTSYS or -ETIMEDOUT, which is
handled by the calling functions:
1. tegra_dce_client_ipc_send_recv (EXPORT_SYMBOL)
This is an exported module symbol, and callers are responsible
for checking the return value.
2. DCE FSM event handler
An error return reverts the FSM to its previous state.
DCE_IPC_TIMEOUT_MS_MAX is set to 10000 [ms].
SHA computation time on an SC7 entry request can go up to 2 sec,
so the host tolerance time must be larger than this.
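As a rough illustration of the error mapping described above, the sketch below translates the result of an interruptible timed wait into the code returned to the IPC caller. It assumes the usual kernel convention for such waits (negative when interrupted, 0 on timeout, positive when the event arrived). The helper name is hypothetical, ERESTARTSYS is defined locally because it is a kernel-internal value, and ETIMEDOUT is the standard errno name for a timeout.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal errno, not exposed by userspace <errno.h>; defined here for the sketch. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Value given in the commit message, in milliseconds. */
#define DCE_IPC_TIMEOUT_MS_MAX 10000

/*
 * Hypothetical helper: map the result of an interruptible timed wait
 * (negative = interrupted, 0 = timed out, positive = time left)
 * to the error code handed back to the IPC caller.
 */
int dce_ipc_wait_result_to_errno(long wait_ret)
{
	if (wait_ret < 0)
		return -ERESTARTSYS;
	if (wait_ret == 0)
		return -ETIMEDOUT;
	return 0;
}
```

With a 10000 ms ceiling, the 2 sec SHA computation on SC7 entry fits well within the wait, while a firmware that never answers still releases the caller.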
Jira TDS-16567
https://nvbugspro.nvidia.com/bug/5335034
Change-Id: I5d77a9497f14f305d07b98e39a58fbcecafedf92
Signed-off-by: charliej <charliej@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3358620
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Mahesh Kumar <mahkumar@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Tested-by: Mahesh Kumar <mahkumar@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
(cherry picked from commit 6c2ab3c78ce7cba0e88455b263d51d1a88c03927)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3402917
This change adds the support for programming streamids to
allow tsec fw on t264 to access PA at a low privilege level.
It also includes the synchronization logic to communicate
with the fw regarding completion of stream id programming
so that the fw can go ahead and initialize itself.
In addition to this, the mailbox used for communicating init done
from tsec fw to ccplex is changed from NV_PTSEC_FALCON_MAILBOX0 to
NV_PTSEC_MAILBOX1 since CCPLEX does not have access to the former from
t26x onwards. Hence falcon based mailboxes are used for tsec-psc comms
and non-falcon ones for tsec-ccplex comms (stream id comms and init done).
Jira TSEC-14
Change-Id: I2871a52222cd69786a8cc3f53162a80486611bb5
Signed-off-by: Sahil Patki <spatki@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3366343
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
(cherry picked from commit db54fde9c4d786b22b7f8694753de3ec80649b17)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3400219
The tegra_dai_fixup function incorrectly updated the format mask
when the S24_LE format was requested and the DAI had a 32-bit sample
width, in case the stream on that DAI was already running.
S24_LE format uses 32-bit physical width on the DAI interface, but
the format mask should remain as S24_LE to maintain proper format
negotiation. The fix adds a check to skip format mask updates when
S24_LE is requested and DAI sample_bits is 32.
This resolves issues with RT5640 and other codecs that support
24-bit audio formats on Tegra platforms.
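The added check can be pictured as a small predicate. This is a sketch under assumptions: the enum values are illustrative stand-ins for the ALSA SNDRV_PCM_FORMAT_* constants and the function name is hypothetical, so it does not reproduce the actual tegra_dai_fixup code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the ALSA SNDRV_PCM_FORMAT_* values. */
enum pcm_format { FMT_S16_LE, FMT_S24_LE, FMT_S32_LE };

/*
 * Hypothetical predicate: should the fixup overwrite the format mask?
 * S24_LE already uses a 32-bit physical width, so a DAI configured for
 * 32 sample bits can carry it without renegotiating the format.
 */
bool dai_fixup_should_update_mask(enum pcm_format requested,
				  unsigned int dai_sample_bits)
{
	if (requested == FMT_S24_LE && dai_sample_bits == 32)
		return false;
	return true;
}
```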
Bug 5350165
Change-Id: Ie297a4176866c9bb3dbc9f40ac7b6d9051a879f6
Signed-off-by: Sheetal <sheetal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3396978
Reviewed-by: Sameer Pujar <spujar@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Mohan kumar <mkumard@nvidia.com>
When switching the governor to nvhost_podgov or switching back to other
governors, we need to take the devfreq lock to prevent a DVFS cycle
from being triggered from other paths.
The nvhost_pod_target_freq callback is called when a DVFS cycle is
triggered. However, the callback expects the governor data to be already
allocated and initialized. We need to synchronize the operations when we
switch the governor so that a DVFS cycle can only be triggered when the
governor data is ready.
Bug 5354161
Bug 5351714
Change-Id: Iaf8af8291ea09a7c2bfbdc5e1453bb976ee0987b
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3392341
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Android builds don't have CONFIG_NUMA enabled, hence
/sys/devices/system/node/node0/meminfo is not present on Android.
When nvscibuf calls QueryHeapParams to check for the presence of the
hugetlbfs-based carveout, error prints are seen due to the absence
of the above sysfs file. Hence, first check whether there are multiple
NUMA nodes or not. If not, use the /proc/meminfo file to retrieve the
hugetlbfs size; otherwise use the meminfo sysfs node of the
corresponding NUMA node.
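The selection logic described above might look like the sketch below. The function name and its parameters are hypothetical, and how the node count is obtained is left out. Only the two file paths come from this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pick the meminfo file to parse for the hugetlbfs size.
 * With a single node (or no NUMA support) fall back to /proc/meminfo.
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
int hugetlb_meminfo_path(int nr_numa_nodes, int node, char *buf, size_t len)
{
	int n;

	if (nr_numa_nodes > 1)
		n = snprintf(buf, len,
			     "/sys/devices/system/node/node%d/meminfo", node);
	else
		n = snprintf(buf, len, "/proc/meminfo");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because the sysfs path is only built when more than one node exists, a build without CONFIG_NUMA never touches the missing file.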
Bug 5200644
Change-Id: I5495de91726d323210807e86f22757b798226fca
Signed-off-by: Ketan Patil <ketanp@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3338255
Reviewed-by: Pritesh Raithatha <praithatha@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Jian-Min Liu <jianminl@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Change licensing of include/soc/tegra/camrtc-diag-messages.h and
include/soc/tegra/camrtc-diag.h from NVIDIA Proprietary to GPL-2.0-only,
as these files are used by GPL code. The license incompatibility is
resolved by ensuring all files maintain consistent licensing terms.
Bug 5278776
Change-Id: Ia42d64339458eb6f3320aea142f0360350614b8b
Signed-off-by: Mohit Ingale <mohiti@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3365826
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Frank Chen <frankc@nvidia.com>
Reviewed-by: Semi Malinen <smalinen@nvidia.com>
Reviewed-by: Ganesh Ram Savithri Sreenivas Murthy <ganeshrams@nvidia.com>
Clean up and modularize the driver code structure to separate
out HW dependent and independent parts to facilitate adding
support for new SoCs.
This patch:
- Restructures SoC specific code into separate files
- Adds function pointers to call HW specific sequences
- Adds a common header which is needed by all platforms
- Cleans up obsolete code such as memmap of phc regs,
xavier support, etc.
- Removes default value assumption for lock_threshold,
pps_freq, sync_trig_interval
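The function-pointer split mentioned above typically takes the shape of a per-SoC operations table. The sketch below shows that general pattern only. Every name in it (pps_soc_ops, pps_dev, soc_a_ops, pps_common_init) is hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

struct pps_dev;

/* Hypothetical per-SoC operations table filled in by each SoC file. */
struct pps_soc_ops {
	const char *name;
	int (*hw_init)(struct pps_dev *dev);
};

struct pps_dev {
	const struct pps_soc_ops *ops;
	int initialized;
};

/* SoC specific sequence; would live in its own file in a real driver. */
static int soc_a_hw_init(struct pps_dev *dev)
{
	dev->initialized = 1;
	return 0;
}

const struct pps_soc_ops soc_a_ops = {
	.name = "soc-a",
	.hw_init = soc_a_hw_init,
};

/* Common code: no SoC specific branches, only an indirect call. */
int pps_common_init(struct pps_dev *dev)
{
	if (!dev->ops || !dev->ops->hw_init)
		return -1;
	return dev->ops->hw_init(dev);
}
```

Adding a new SoC then means supplying another table, with the common path unchanged.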
Bug 5175333
Change-Id: I106e130fdaa1a166a4a2c9bbaeb3b924af90ab66
Signed-off-by: Sheetal Tigadoli <stigadoli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3321185
Reviewed-by: Kiran Kumar Bobbu <kbobbu@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
Previously, this option was disabled because the clang version
used was too old (clang-r370808, clang 10). This option has been
supported since clang-r416183b, clang 12. In order to avoid potential
build errors, this option is re-enabled.
Bug 5289423
Change-Id: I1d0fd5a3dfdff06e95eeca13f85a263922c6ecaf
Signed-off-by: Jian-Min Liu <jianminl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3371014
Reviewed-by: Ankita Garg <ankitag@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Fix race condition between host1x_syncpt_alloc()
and host1x_syncpt_put() by using kref_put_mutex()
instead of kref_put() + manual mutex locking.
This ensures no thread can acquire the
syncpt_mutex after the refcount drops to zero
but before syncpt_release acquires it.
This prevents races where syncpoints could
be allocated while still being cleaned up
from a previous release.
Remove explicit mutex locking in syncpt_release
as kref_put_mutex() handles this atomically.
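To show why the combined primitive closes the window, here is a userspace model of the kref_put_mutex() semantics using C11 atomics and a pthread mutex. It is a sketch of the idea, not the kernel implementation. The names ref, ref_put_mutex and syncpt_release_model are hypothetical, and the release callback dropping the lock is an assumption of this model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
	atomic_int count;
};

pthread_mutex_t syncpt_mutex = PTHREAD_MUTEX_INITIALIZER;
int released;

/* Decrement unless the count is 1; returns true if it decremented. */
static bool ref_dec_not_one(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old > 1) {
		if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Model of kref_put_mutex(): the final decrement only happens with the lock
 * held, and release() runs with the lock still held, so no other thread can
 * take the lock between "count hit zero" and the cleanup.
 * Returns 1 if the object was released, 0 otherwise.
 */
int ref_put_mutex(struct ref *r, void (*release)(struct ref *),
		  pthread_mutex_t *lock)
{
	if (ref_dec_not_one(r))
		return 0;

	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(&r->count, 1) != 1) {
		pthread_mutex_unlock(lock);
		return 0;
	}
	release(r);
	return 1;
}

/* Model release callback: entered with syncpt_mutex held, drops it when done. */
void syncpt_release_model(struct ref *r)
{
	(void)r;
	released++;
	pthread_mutex_unlock(&syncpt_mutex);
}
```

With a plain put followed by manual locking, an allocator could grab the mutex after the count reached zero but before the cleanup took it. In this model the zero transition itself happens under the mutex.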
Bug 5170956
Change-Id: I9e2348482d5c9646556576772f6b90fa7df3acd2
Signed-off-by: Mainak Sen <msen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3369121
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Enhance IOCTL handler to identify and
handle dma_fence_chain objects that might contain
host1x dma fences. This fixes issues
when userspace passes a dma_fence_chain
(created by DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT
operations) to HOST1X_IOCTL_FENCE_EXTRACT.
The updated code iteratively unwraps fence
chains until it finds a host1x_syncpt_fence or
reaches a fence it can't process. This ensures
proper operation with DRM-based applications
that use timeline syncobj features which internally
use dma_fence_chain.
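The iterative unwrapping described above can be modelled as a simple loop. This sketch uses a deliberately simplified, hypothetical fence type. The real dma_fence and dma_fence_chain structures and their accessors are different, so treat it as an outline of the control flow only.

```c
#include <assert.h>
#include <stddef.h>

enum fence_kind { FENCE_HOST1X_SYNCPT, FENCE_CHAIN, FENCE_OTHER };

/* Hypothetical, simplified fence: a chain node wraps another fence. */
struct fence {
	enum fence_kind kind;
	struct fence *contained; /* only meaningful for FENCE_CHAIN */
};

/*
 * Iteratively unwrap chain nodes. Returns the host1x syncpoint fence if one
 * is reached, or NULL when the walk ends at a fence it cannot process.
 */
struct fence *find_host1x_fence(struct fence *f)
{
	while (f) {
		if (f->kind == FENCE_HOST1X_SYNCPT)
			return f;
		if (f->kind != FENCE_CHAIN)
			return NULL;
		f = f->contained;
	}
	return NULL;
}
```

A loop is used instead of recursion so that deeply nested chains do not grow the stack.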
Bug 4983872
Change-Id: I3eef9d54e2c42180cb5c74236cd64f42a863b7ea
Signed-off-by: Mainak Sen <msen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3364940
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Leslin Varghese <lvarghese@nvidia.com>
Tested-by: Arunmozhikannan Soundarapandian <asoundarapan@nvidia.com>
Reviewed-by: Sourab Gupta <sourabg@nvidia.com>