The current submission opcode sequence first takes the engine MLOCK,
and then switches to HOST1X class to wait prefences. This is fine
while we only use a single channel per engine and there is no
virtualization, since jobs are serialized on that one channel anyway.
However, when that assumption doesn't hold, we are keeping the
engine locked while not running anything on it while waiting for
prefences to complete.
To resolve this, execute wait commands in the beginning of the job
outside the engine MLOCK. We still take the HOST1X MLOCK because
recent hardware requires register opcodes to be executed within some
MLOCK, but the hardware also allows unlimited channels to take the
HOST1X MLOCK at the same time.
Jira HOSTX-4687
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I783cbc7f1bbd7415fbf0e61163935186b2ba0a44
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2911124
Reviewed-by: Santosh BS <santoshb@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Add support for running as a guest system under a hypervisor, using
Host1x HW's virtualization capabilities.
In effect this involves not touching apertures other than the 'vm'
aperture, and channels and syncpoints other than those that are
assigned to the VM.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: Ideec5b0b9a692aa3ee6b4a0240c5755c983cb7bd
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2811837
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently all fences have a 30 second timeout to ensure they are
cleaned up if the fence never completes otherwise. However, this
one size fits all solution doesn't actually fit in every case,
such as syncpoint waiting where we want to be able to have timeouts
longer than 30 seconds. As such, we want to be able to give control
over fence cancellation to the caller (and maybe eventually get rid
of the internal timeout altogether).
Here we add this cancellation mechanism by essentially adding a
function for entering the timeout path by function call, and changing
the syncpoint wait function to use it.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I4600544afe21efdd3f7d06362bd124130ddec3db
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2786637
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Move from the old, complex intr handling code to a new implementation
based on dma_fences. While there is a fair bit of churn to get there,
the new implementation is much simpler and likely faster as well due
to allowing signaling directly from interrupt context.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I81c47fa1946679813f90e3fd8e1d1e9d6342143e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2786635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In anticipation of removal of the intr API, implement job tracking
using DMA fences instead. The main two things about this are
making cdma_update schedule the work since fence completion can
now be called from interrupt context, and some complication in
ensuring the callback is not running when we free the fence.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I25f7f5a6cad24a00563eed79e0e17b1df1eadcdc
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2786636
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
With the full-featured opcode sequence using MLOCKs, we need to also
unlock those MLOCKs in the event of a timeout. However, it turns out
that on Tegra186/Tegra194, by default, we don't need to do this;
furthermore, on Tegra234 it is much simpler to do; so only implement
this on Tegra234 for the time being.
Bug 3724727
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: Icc15ae705844cd26ae3f1d1146ff20f1d9b7a14d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2745956
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Brad Griffis <bgriffis@nvidia.com>
For new (Tegra186+) SoCs, use a new ('full-featured') job opcode
sequence that is compatible with virtualization. In particular,
the Host1x hardware in Tegra234 is more strict regarding the sequence,
requiring ACQUIRE_MLOCK-SETCLASS-SETSTREAMID opcodes to occur in
that sequence without gaps (except for SETPAYLOAD), so let's do it
properly in one go now.
Bug 3724727
Change-Id: Ifae148975457d2d275cfae25fcaf735e6529fbd3
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2745964
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Brad Griffis <bgriffis@nvidia.com>
On Tegra234, each Host1x VM has 8 interrupt lines. Each syncpoint
can be configured with which interrupt line should be used for
threshold interrupt, allowing for load balancing.
For now, to keep backwards compatibility, just set all syncpoints
to the first interrupt.
Bug 3724727
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I7058f9bd38bc59db83ee92613e4e733813db7a46
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2745954
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Brad Griffis <bgriffis@nvidia.com>
Add code to do stream ID switching at the beginning of a job. The
stream ID is switched to the stream ID specified by the context
passed in the job structure.
Before switching the stream ID, an OP_DONE wait is done on the
channel's engine to ensure that there is no residual ongoing
work that might do DMA using the new stream ID.
Bug 3724727
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I7705c33417e27d99bb265db13eb3c19960f5a363
Update the host1x driver to align with the latest upstream host1x
driver from Linux v5.19-rc1. Please note that the context bus support
is not included, because this needs to be built into the kernel.
Bug 3724727
Change-Id: Ic6fbe001462d160d1bb24f76038dd755c5550690
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2739538
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Syncpoint interrupts are not working as expected on Tegra194. The
problem is that the syncpoint interrupt threshold being used is the
global interrupt threshold and not the virtual interrupt threshold.
Fix this by using the virtual interrupt threshold which aligns with
downstream.
JIRA LS-34
Change-Id: I4a3191c12298d4f0af264fd1f89754171a3829e9
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2498820
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add the upstream host1x driver with the 'Host1x/Tegra UAPI' series [0]
applied. This driver will be built as an external module for use with
the NVGPU driver on upstream Linux kernels.
The following modifications have been made to the series posted upstream
1. Update the Makefile to always build the driver as a module
2. Remove the tests to see if CONFIG_DRM_TEGRA_STAGING is enabled
3. Rename the include/linux/host1x.h to include/linux/host1x-next.h to
avoid conflicts with upstream headers when building as an external
module.
4. Rename the include/uapi/linux/host1x.h to
include/uapi/linux/host1x-next.h to avoid conflicts with upstream
headers when building as an external module.
5. Rename the module that is built to be host1x-next.ko instead of
host1x.ko to avoid any depmod conflicts with the upstream driver.
[0] https://patchwork.ozlabs.org/project/linux-tegra/list/?series=215770
Bug 3156385
Change-Id: Ic60299546809097dd0e4a9a7157bce1491d9f794
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2435801
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Define host1x_dev_ops for T186
Add API load_streamid_regs() to load streamid registers
Set this API to host1x_dev_ops.load_regs so that we
set streamid registers during host1x_probe()
Add a static table which includes streamid mappings
for all the clients
Bug 1704301
Change-Id: I7aeefc43776472a7ccf868bfa18c810f3b80b52b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1023438
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Add host1x05 version and fill in the h/w details
in host1x05_info
Add all the hardware accessors for T186
(channel, sync, uclass accessors)
Add hardware support files for host1x channel,
syncpoints, cdma, pushbuffer, interrupt, and debug support
Keep gather filter disabled
Things working with this :
- cdma operations
- basic channel job submit path
- syncpoint support
- syncpoint interrupt mechanism
- debug dump
With this support, below tests pass for VIC client
$nvrm_channel channel_Basic
$nvrm_channel --module=vic
Bug 1704301
Change-Id: I7d97560cb1e3a57733fa0853936b0783c71b7060
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1021434
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>