In Orin and previous chipsets, engine timestamp counter
runs 32 times faster than CNTVCT register, so the actual
timestamp is obtained by right shifting with factor of 5.
In Thor onwards, this shift is not required is for VIC engine.
Jira HOSTX-5905
Change-Id: I69980fdfcf50b15db99b1fbad522aecd571a0f17
Signed-off-by: Mainak Sen <msen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3306825
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
The following error messages are sometimes observed on boot ...
tegra-nvjpg 15380000.nvjpg: failed to get icc write handle
tegra-nvdec 15480000.nvdec: failed to get icc write handle
tegra-nvjpg 15540000.nvjpg: failed to get icc write handle
tegra-nvenc 154c0000.nvenc: failed to get icc write handle
tegra-vic 15340000.vic: failed to get icc write handle
tegra-nvjpg 15380000.nvjpg: failed to get icc write handle
The above messages are harmless because the ICC core is returning
-EPROBE_DEFER to indicate that the ICC provider is not available. When
-EPROBE_DEFER is returned the kernel will attempt to probe the device
again and so print an error on -EPROBE_DEFER can be misleading. Fix the
above by using the function dev_err_probe() to print error messages
because this function will only print an error message if the error code
is not -EPROBE_DEFER.
Bug 3436156
Bug 4496044
Change-Id: I47b77e5a0f2bdb817a832daa305246c8803f456b
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3075237
(cherry picked from commit 61319aef4d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3075849
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
When BPMP BWMGR receives ICC avg_bw requests from different memory
clients, it will sum up all avg_bw values together and use it to
determine the final EMC frequency.
Transitioning to ICC peak_bw requests will cause BPMP to evaluate
the bandwidth requirements of each memory client individually,
selecting the maximum bandwidth request from among all the clients
to determine the final EMC frequency.
This modification prevents excessive memory bandwidth allocation
when multiple host1x clients collaborate for data processing.
Currently, host1x clients request the full theoretical 100% data
bandwidth, even though the system typically doesn't fully utilize
that amount during runtime.
To address the issue of insufficient memory bandwidth when multiple
host1x clients are used together, we can reduce the boost_up_threshold
value of cactmon DFS. This adjustment ensures that when actual memory
bandwidth utilization surpasses the specified boost-up bandwidth
threshold, EMC frequency will be further scaled by one step further
to alleviate the problem.
Bug 4328471
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: I826a374666f38718652c5cae449c0aadeabfbdb5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2996561
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
If probing of the VIC engine fails then the following crash is observed
...
tegra-vic 15340000.vic: devfreq_add_device: Unable to find governor for
the device
tegra-vic 15340000.vic: failed to init devfreq: -22
tegra-vic: probe of 15340000.vic failed with error -22
tegra-nvdec 15480000.nvdec: Adding to iommu group 45
list_add corruption. prev->next should be next
(ffff000094505f80), but was 0000000000000000. (prev=ffff000095245cf0).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:26!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
...
pc : __list_add_valid+0x80/0x84
lr : __list_add_valid+0x80/0x84
...
Call trace:
__list_add_valid+0x80/0x84
host1x_subdev_register+0x80/0x170 [host1x_next]
__host1x_client_register+0xac/0x160 [host1x_next]
nvdec_probe+0x1c8/0x58c [tegra_drm_next]
platform_probe+0x6c/0xe0
really_probe+0xc4/0x3dc
__driver_probe_device+0x114/0x190
driver_probe_device+0x44/0x114
__driver_attach+0xcc/0x1f0
bus_for_each_dev+0x78/0xd0
driver_attach+0x28/0x30
bus_add_driver+0xbc/0x220
driver_register+0x70/0x120
__platform_register_drivers+0x70/0x140
host1x_drm_init+0x58/0x1000 [tegra_drm_next]
do_one_initcall+0x48/0x2d0
do_init_module+0x5c/0x270
load_module+0xb08/0xc70
__do_sys_init_module+0x1dc/0x214
__arm64_sys_init_module+0x20/0x2c
invoke_syscall.constprop.0+0x7c/0xd0
el0_svc_common.constprop.0+0x140/0x150
do_el0_svc+0x38/0xa0
el0_svc+0x38/0x18c
el0t_64_sync_handler+0xb4/0x130
el0t_64_sync+0x17c/0x180
Code: aa0103e2 910fa000 aa0403e1 941b9509 (d4210000)
---[ end trace f8f20f933fe03c7e ]---
The problem is that some of the engines are not cleaning up correctly on
probe failure and so fix this by adding the necessary calls clean up if
probe fails.
Bug 4303860
Change-Id: Ife1e0ce7429458d187422162a298cc140f0e3bc4
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2986790
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Avoid using suspend_freq to set the Fmax for the device, since it will
re-enable the actmon watermark interrupts again if the suspend_freq is
not zero in the devfreq_suspend_device call.
Since we want to make sure device is running at Fmax but avoid using
suspend_freq, we can forcelly override the resume_freq with the
scaling_max_freq to reach the same effect.
Bug 4271562
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: Ia19a7ef9de14643abf1b174b6180e40a847de132
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2974074
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The suspend_freq is a fixed value for devfreq core, while the
resume_freq will be changed dynamically based on the last previous
updated frequency value of the device.
When device is put into suspend mode, devfreq core will update the
resume frequency with the suspend frequency. Therefore, when the device
is resumed back again, it will run at suspend_freq.
Forcelly set the suspend_freq as Fmax so that device will run at Fmax
when it is resumed back.
Bug 4269900
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: Ic6511613ae5d02831a66dd1c2a93f21c142bf3a7
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2973229
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The original resume_freq will be overridden when the device is put
into suspend mode. Therefore, the resume cycle won't always use the
max frequency of the device to set the actmon count weight.
Change the implementation to use scaling_max_freq to always use the
max frequency of the device to set the actmon count weight.
Bug 4252125
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: Iaaae8ada989cd1ed7dd728fb526ad4391a2f192c
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2967098
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Method invocation for updating count weight for nvdec may conflict
with a job being submitted through host1x.
Change the implementation to set count weight values for both actmon and
the engine once in runtime resume cycle to prevent possible conflicts.
Bug 3962196
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: I19d160abd90373721df78cdb107ca396a206a6e8
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2952692
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Interconnect support for multimedia engines is disable for Linux v6.2
kernels. Rather than always disabling interconnect support for these
kernels, we just need to ensure the necessary upstream kernel patches
are in place for Linux v6.2 kernels. Therefore, enable interconnect
support for all kernels.
Bug 4097374
Change-Id: Ibceb866ba239454a131543cdff90e68007b28942
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2937758
Reviewed-by: Johnny Liu <johnliu@nvidia.com>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
When the device is in suspend mode, There is no point to update
actmon count weight and send ICC request for the device.
If ICC request is sent for the device in suspend mode, in osidle state,
the default EMC frequency will be affected due to the accumulated ICC
bandwidth QoS constraint, and affect the overall osidle power.
Bug 4168788
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Change-Id: I8700b131b5f914d93feed9f8a9d792c2fdaa9b53
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2931905
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, engine drivers only enable runtime PM during the host1x
init callback. This can happen slightly later than the probe, which
can cause the power domain to intermittently not be turned off after
probe.
My hypothesis is that there is a race condition between the post-probe
power domain poweroff that is done from a queued work, and the
pm_runtime_enable call happening in the host1x init callback.
If the pm_runtime_enable call happens first, everything is OK and
the power off work can disable the power domain as PM runtime is
enabled and the device is runtime suspended. If power off work runs
first, PM runtime is still disabled for the device and the domain
must be kept powered.
Resolve the issue by moving the runtime PM enablement to the
probe function.
Bug 3982357
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I3e028bc03c8ff05d18e63039ac9e590c4557e268
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2859475
Reviewed-by: Santosh BS <santoshb@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Implement the get_streamid_offset and can_use_memory_ctx callbacks
required for supporting context isolation. Since old firmware on VIC
cannot support context isolation without hacks that we don't want to
implement, check the firmware binary to see if context isolation
should be enabled.
Bug 3724727
Change-Id: Ie4c952987f2a5ae660257194b43c18829ac707c4
NVDEC's TRANSCFG register is at a different offset than VIC.
This becomes a problem now when context isolation is enabled and
the reset value of the register is no longer sufficient.
Bug 3724727
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: Ic83043dcc49a3739fe310cfb5b143b42d9aa259b