Commit Graph

429 Commits

Author SHA1 Message Date
Konsta Hölttä
e8201d6ce3 gpu: nvgpu: decouple channel watchdog dependencies
The channel code needs the watchdog code and vice versa. Cut this
circular dependency with a few simplifications so that the watchdog
wouldn't depend on so much.

When calling watchdog APIs that cause stores or comparisons of channel
progress, provide a snapshot of the current progress instead of a whole
channel pointer. struct nvgpu_channel_wdt_state is added as an interface
for this to track gp_get and pb_get.

When periodically checking the watchdog state, make the channel code ask
whether a hang has been detected and abort the channel from within
channel code instead of asking the watchdog to abort the channel. The
debug dump verbosity flag is also moved back to the channel data.

Move the functionality to restart all channels' watchdogs to channel
code from watchdog code. Looping over active channels is not a good
feature for the watchdog; it's better for the channel handling to just
use the watchdog as a tracking tool.

Move a few unserviceable checks up in the stack to the callers of the
wdt code. They're a kludge but this will do for now and demonstrates
what needs to be eventually fixed.

This does not leave much code in the watchdog unit. Now the purpose of
the watchdog is to only isolate the logic to couple a timer and progress
snapshots with careful locking to start and stop the tracking.

Jira NVGPU-5582

Change-Id: I7c728542ff30d88b1414500210be3fbaf61e6e8a
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369820
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
fba96fdc09 gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.

Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:

  - The enum_type that engine_info uses is defined in engines.h and
    has a bit of SW abstraction - in particular the GRCE type. The only
    place this seemed to be actually relevant (the IOCTL providing device
    info to userspace) the GRCE engines can be worked out by comparing
    runlist ID.
  - Addition of masks based on intr_id and reset_id; those can be
    computed easily enough using BIT32() but this is an area that
    could be improved on.

This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.

JIRA NVGPU-5421

Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
a04525ece8 gpu: nvgpu: require deterministic for usermode
Deterministic mode has always been a requirement for usermode submit;
enforce it in the setup_bind path. Adjust tests to use the flag.

QNX uses NVGPU_SETUP_BIND_FLAGS_SUPPORT_DETERMINISTIC only if
CONFIG_NVGPU_IOCTL_NON_FUSA is set, so guard the check with that for
now.

Jira NVGPU-5582

Change-Id: Idedd01a3a24420b45195a472e8ca5c9f32f4ef46
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369818
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
3245d48736 gpu: nvgpu: forbid watchdog on deterministic mode
The channel watchdog feature has always been a blocker for deterministic
submits. Instead of waiting for a submit call to happen just to reject
it, nack already the setup_bind ioctl if deterministic is set and the
watchdog has not been disabled before. This can avoid confusion with
usermode submits where leaving the watchdog set would have worked but
the watchdog would never see updates from userspace.

Disallow also any other watchdog adjustments than disabling it when the
channel has been set up for deterministic mode.

Jira NVGPU-5582

Change-Id: I0ba4584bbc035197d952e5b562197c36aa483867
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369819
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
91515d1b47 gpu: nvgpu: unify joblist api names
Add the nvgpu_ prefix to the peek, add and delete functions to make them
consistent with the rest of the joblist functions. Rename the "prealloc
resources" alloc and free functions to joblist init and deinit; there
are many other resources that are also preallocated, and these handle
just the job tracking list.

NVGPU-5772

Change-Id: Ie5e6ba4f4b17465d626f36a0239bddb03a0a2fcb
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397395
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
345eae584d gpu: nvgpu: remove nvgpu_channel_joblist_is_empty
channel_joblist_peek() returns NULL if the list is empty.
nvgpu_channel_joblist_is_empty() has been used only together with that
function; remove it and check against NULL to see whether there are jobs
in flight.

This removes some duplication, simplifies the call sites slightly, and
gets rid of a Coverity nag about a possible NULL pointer from peek that
really isn't (when the emptiness was already checked).

Jira NVGPU-5772

Change-Id: I814e9c510d99b88e59539359992fb44d4e7ce2ea
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397394
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
27a64f2e23 gpu: nvgpu: enforce priv usage of fence
Add a "priv" fence struct type and use that in the fence type to
emphasize that the inner data is not meant to be seen.

The fence unit needs to have an outside-visible fence type so that
fences can be allocated directly as a struct field in job metadata for
performance and simplicity, so hiding the type entirely wouldn't work.

A couple of places need to touch the priv data directly in channel code.
Those can be thought to be technically fence unit's code scattered
outside the fence files, but they mean that the architecture is not
perfect yet.

Jira NVGPU-5773

Change-Id: Ifa3c95757ae31eef0e32f2605293e23e210b065f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395071
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
baaf25f8b0 gpu: nvgpu: decouple async and immediate cleanup
Split up nvgpu_channel_clean_up_jobs() on the clean_all parameter so
that there's one version for the asynchronous ("deferred") cleanup and
another for the synchronous deterministic cleanup that occurs in the
submit path.

Forking another version like this adds some repetition, but this lets us
look at both versions clearly in order to come up with a coherent plan.
For example, it might be feasible to have the light cleanup of pooled
items in also the nondeterministic path, and deferring heavy cleanup to
another, entirely separated job queue.

Jira NVGPU-5493

Change-Id: I5423fd474e5b8f7b273383f12302126f47076bd3
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346065
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
7aa852b31c gpu: nvgpu: emphasize fence syncpt/sema interfaces
Sometimes the syncpt-based fences are not used, and often the sema-based
fences are not used. Move code around to new files to make it easier to
see what happens and to allow leaving code out of the build easily.

Start using nvgpu_fence_ops::free again and move the fence release
there. The syncpt data is not refcounted, so it doesn't have this.

Jira NVGPU-5773

Change-Id: I991f91886c59cf2c2fbfd2e75305ba512b5d7371
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395069
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
e6c0d84683 gpu: nvgpu: allocate fences in job structs
As the submit job metadata has been simplified, the fence pool for job
tracking fences is now just complex code for very simple purposes, so
delete it. It's enough to hold the fence memory in the job struct itself
instead of having separately allocated objects with different lifetimes.

Each channel is using preallocated job arrays based on the prespecified
inflight job count. The fences are used for tracking job completion, and
a new job cannot be submitted before a previous wait has completed.
This means that even with a ringbuffer with space for only one job, the
previous job memory cannot get reclaimed by a new submit because the
submits are ordered.

Jira NVGPU-5773

Change-Id: I0c777df700aa7cfda6f971efa47aa72c5462b53a
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392704
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
969b901999 gpu: nvgpu: create device/context profiler dev nodes
Create new dev nodes for device and context profilers. Example of dev
nodes on iGPU
/dev/nvhost-prof-dev-gpu - device scope profiler
/dev/nvhost-prof-ctx-gpu - context scope profiler

Add below APIs to open/close above dev nodes :
nvgpu_prof_dev_fops_open()
nvgpu_prof_ctx_fops_open()
nvgpu_prof_fops_release()

Add common API nvgpu_prof_fops_ioctl() to handle IOCTL call on these
dev nodes. Add IOCTL NVGPU_PROFILER_IOCTL_BIND_CONTEXT to bind the TSG
to profiler objects.

Add nvgpu_tsg_get_from_file() to retrieve TSG struct pointer from
file descriptor. Also store profiler object pointer into TSG struct.

Enable NVGPU_SUPPORT_PROFILER_V2_DEVICE capability on gv11b and tu104.
Note that this is not yet enabled for vGPU.
Keep NVGPU_SUPPORT_PROFILER_V2_CONTEXT capabiity disabled since this
will take longer to support.

Add new IOCTL NVGPU_PROFILER_IOCTL_UNBIND_CONTEXT so that userspace can
explicitly unbind the context and release the resources before closing
the profiler descriptor.

Add context_init flag to profiler object for book keeping.

Bug 2510974
Jira NVGPU-5360

Change-Id: Ie07e0cfd5a9da9d80008f79c955c7ef93b4bc60f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2384354
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
d0714b40c1 gpu: nvgpu: Add engine reset profiling
This is a key part of the fifo recovery sequence.

JIRA NVGPU-5606

Change-Id: I8807884394834b912f25d7c535ee22f547988b2d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382590
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
1bcdc306a0 gpu: nvgpu: Add gv11b recovery profiling
Add some basic profiling to the gv11b recovery sequence. This captures
the high level events. Subsequent patches start to dig into the
subsections in more detail.

JIRA NVGPU-5606

Change-Id: I488a448ca1cbf961651588e24685e2a5b4420c44
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368302
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
ab2b0b5949 gpu: nvgpu: Set unserviceable flag early during RC
During recovery, we set ch->unserviceable at the end after we preempt
the TSG and reset the engines. It might be too late and user-space
might submit more work to the broken channel which is not desirable.
Move setting this unserviceable flag right at the start
of recovery sequence.
Another thread doing a submit can still read the unserviceable flag
just before it is set here, leaving that submit stuck if recovery
completes before the submit thread advances enough to set up a post
fence visible for other threads. This could be fixed with a big lock
or with a double check at the end of the submit code after the job
data has been made visible.
We still release the fences, semaphore and error notifier wait queues
at the end; so user-space would not trigger channel unbind while
channel is being recovered.

Also, change the handle_mmu_fault APIs to return void as the
debug_dump return value is not used in any of the caller APIs.

JIRA NVGPU-5843

Change-Id: Ib42c2816dd1dca542e4f630805411cab75fad90e
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385256
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Dinesh
d0087f3ad8 gpu: nvgpu: Support for runlist_max_supported
nvgpu_next needs support for max_runlist_supported by litter
value. So the function is changed to support.

JIRA NVGPU-5534

Change-Id: I097f6343295049532c46904316314dc82092a46b
Signed-off-by: Dinesh <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382882
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
881a6f35be gpu: nvgpu: Trigger quiesce on PBDMA preempt fail
During recovery, we preempt the faulty TSG from PBDMA and engines.
If the TSG preempt on PBDMA times out(timeout = 100ms), the PBDMA
might be hung state. We do not reset the HOST during recovery, so
stuck PBDMAs are unrecoverable.
Abort the recovery and trigger GPU to quiesce as there is no way
back.

Triggering Quiesce from recovery sequence should be fine as the only
redundant operation will be write to FIFO_RUNLIST_PREEMPT register.
The error notifiers will eventually be set by Quiesce thread.

Bug 2768005
JIRA NVGPU-4631

Change-Id: I914b9379aa8e48014e6ddace9abe47180a072863
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368187
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
359fc24aaf gpu: nvgpu: Rework engine management to work with vGPU
Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.

With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.

This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.

This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.

Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.

JIRA NVGPU-5421

Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
9fda0b2354 gpu: nvgpu: allow vpr channels when VPR supported
Currently, if VPR support is requested with nvgpu_channel_setup_bind(),
channel is marked as vpr independent of nvgpu VPR support.
Modify nvgpu_channel_setup_bind() to mark channel as vpr only if
nvgpu supports VPR, otherwise return error.

Bug 2046782
JIRA NVGPU-5302

Change-Id: I5f1717651b7bcff0597a6f0d9c746d50af7af0bf
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368411
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
ff41d97ab5 gpu: nvgpu: always prealloc jobs and fences
Unify the job metadata handling by deleting the parts that have handled
dynamically allocated job structs and fences. Now a channel can be in
one less mode than before which reduces branching in tricky places and
makes the submit/cleanup sequence easier to understand.

While preallocating all the resources upfront may increase average
memory consumption by some kilobytes, users of channels have to supply
the worst case numbers anyway and this preallocation has been already
done on deterministic channels.

Flip the channel_joblist_delete() call in nvgpu_channel_clean_up_jobs()
to be done after nvgpu_channel_free_job(). Deleting from the list (which
is a ringbuffer) makes it possible to reuse the job again, so the job
must be freed before that. The comment about using post_fence is no
longer valid; nvgpu_channel_abort() does not use fences.

This inverse order has not posed problems before because it's been buggy
only for deterministic channels, and such channels do not do the cleanup
asynchronously so no races are possible. With preallocated job list for
all channels, this would have become a problem.

Jira NVGPU-5492

Change-Id: I085066b0c9c2475e38be885a275d7be629725d64
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346064
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
7b8a08af7a gpu: nvgpu: check ch->wdt on wdt restart all channels
ch->wdt is not always initialized. For example it's not initialized on
gpu server, since the channel wdt is managed on client side.

Bug 2833924

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Idb06f7de6a15e093bbb08be16454777b9d7582b9
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361978
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
98264f7505 gpu: nvgpu: call gops.tsg.unbind_channel on fail path
When current context is busy, nvgpu_tsg_unbind_channel_common may fail
because of preemption failed. In such case, the .unbind_channel hal
still need to be called to notify vserver that the channel will be
removed from tsg in teardown path.

Bug 2833924

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I9996202485429b4d9cba0c2f985f8e55fcdd3f29
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361977
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
fbb6a5bc1c gpu: nvgpu: Remove fifo->pbdma_map
The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists.
This array allows other software to query what PBDMA(s) serves a given
runlist. The PBDMA map is read verbatim from an array of host registers.
These registers are stored in a kmalloc()'ed array.

This causes a problem for the device management code. The device
management initialization executes well before the rest of the FIFO
PBDMA initialization occurs. Thus, if the device management code
queries the PBDMA mapping for a given device/runlist, the mapping has
yet to be populated.

In the next patches in this series the engine management code is subsumed
into the device management code. In other words the device struct is
reused by the engine management and all host SW does is pull pointers to
the host managed devices from the device manager. This means that all
engine initialization that used to be done on top of the device
management needs to move to the device code.

So, long story short, the PBDMA map needs to be read from the registers
directly, instead of an array that gets allocated long after the device
code has run.

This patch removes the pbdma map array, deletes two HALs that managed
that, and instead provides a new HAL to query this map directly from
the registers so that the device code can use it.

JIRA NVGPU-5421

Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
194fac7f3c gpu: nvgpu: Remove clutter in engine code
Remove the get_mask_on_id() HAL and replace it's usage with the
global nvgpu_engine_get_mask_on_id() function. There's no need
to have this function as a HAL.

JIRA NVGPU-5420

Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
ca1f93bdd7 gpu: nvgpu: add user fence type
Decouple the fence information needed for providing submit postfences to
userspace by adding a separate type for that and using it to pass fence
data to ioctls.

The data in struct nvgpu_fence_type is used in various places:

- job tracking needs to know when a post fence is expired
- job submitters within the driver (vidmem clears) need to be able to
  wait for these fences
- userspace needs the fence as an id, value pair or as a file descriptor
  created from an os fence

To keep object lifetimes strict, start decoupling the os fence data out
of struct nvgpu_fence_type: delete nvgpu_fence_install_fd() and add
nvgpu_fence_extract_user() to return a struct nvgpu_user_fence that
contains only the necessary information. Storing the os fence in job
tracking metadata is legacy code and not useful. Passing the os fence
from where it's created through the whole submit path inside this
combined fence type has been convenient, though.

The internally stored cde job fence in dmabuf compression metadata is
still nvgpu_fence_type to keep this patch simple.

Jira NVGPU-5248

Change-Id: I75b7da676fb6aa083828f888c55571bbf7645ef3
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359064
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
160669a7bb gpu: nvgpu: return device from nvgpu_device_get()
Instead of copying the device contents into the passed pointer have
nvgpu_device_get() return a device pointer. This will let the engines.c
code move towards using the nvgpu_device type directly, instead of
maintaining its own version of an essentially identical struct.

JIRA NVGPU-5421

Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
70ce67df2d gpu: nvgpu: Add a generic profiler
Add a generic profiler based on the channel kickoff profiler. This
aims to provide a mechanism to allow engineers to (more) easily profile
arbitrary software paths within nvgpu.

Usage of this profiler is still primarily through debugfs. Next up is
a generic debugfs interface for this profiler in the Linux code.

The end goal for this is to profile the recovery code and generate
interesting statistics.

JIRA NVGPU-5606

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: I99783ec7e5143855845bde4e98760ff43350456d
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355319
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
319520ff57 gpu: nvgpu: Add a new device manager unit
This adds a new device management unit in the common code responsible
for facilitating the parsing of the GPU top device list and providing
that info to other units in nvgpu.

The basic idea is to read this list once from HW and store it in a
set of lists corresponding to each device type (graphics, LCE, etc).
Many of the HALs in top can be deleted and instead implemented using
common code parsing the SW representation.

Every time the driver queries the device list it does so using a
device type and instance ID. This is common code. The HAL is responsible
for populating the device list in such a way that the driver can
query it in a chip agnostic manner.

Also delete some of the unit tests for functions that no longer
exist. This code will require new unit tests in time; those should be
quite simple to write once unit testing is needed.

JIRA NVGPU-5421

Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6cbc174fc2 gpu: nvgpu: avoid channel wdt ifdefs
Implement empty stubs of the channel watchdog functions for when
watchdog is disabled from build. Add some forward declarations that were
missing. Now most call sites don't need #idefs for the build flag.

Add error checks for the wdt alloc failure.

Jira NVGPU-5494
Jira NVGPU-5493

Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
2a3bb9107f gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h>
top.h is a description of "devices" available on the GPU. As such
rename this header to device.h.

device.h will ultimately be a unit of actual C code that will rely
on the top HAL to fill a device list.

JIRA NVGPU-5421

Change-Id: If6e4a537d2209e429a678761a34713723da7a00a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
tkudav
957b19092f gpu: nvgpu: Enable Quiesce on all builds
Make Recovery and quiesce co-exist to support quiesce state
on unrecoverrable errors. Currently, the quiesce code is wrapped
under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from
recovery config, thereby enabling it on all builds.

On Linux, the hung_task checker(check_hung_uninterruptible_tasks()
in kernel/hung_task.c) complains that quiesce thread is stuck for
more than 120 seconds.

INFO: task sw-quiesce:1068 blocked for more than 120 seconds.

The wait time of more than 120 seconds is expected as quiesce
thread will wait until quiesce call is triggered on fatal
unrecoverable errors. However, the INFO print upsets the
kernel_warning_test(KWT) on Linux builds. To fix the failing
KWT, change the quiesce task to interruptible instead of
uninterruptible as checker only looks at uninterruptible tasks.

Bug 2919899
JIRA NVGPU-5479

Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
16fb7654a5 gpu: nvgpu: isolate channel watchdog unit
Move the definition of struct nvgpu_channel_wdt to watchdog.c. Adjust
users of it to access it via an unified interface instead of poking
directly at the channel internals.

Jira NVGPU-5494

Change-Id: Ie11826e6732a8b98e72c4f81dd06bd7e49848121
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345935
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
21e02878f4 gpu: nvgpu: move wdt code out of channel.c
Cut and paste the existing channel watchdog functions to another file
for better isolation of units.

Jira NVGPU-5494

Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
6b1302f23c gpu: nvgpu: Reduce linux debug log spew
Currently when nvgpu prints debug information for something like an MMU
fault the result includes a lot of usless boiler plate logging spew. In
some cases this can be helpful in identifying where the log message came
from in the nvgpu code base. However, for debug spews from faults, the
viewer of that info does not care which function printed the log (for
example).

Instead having a fast and readable debug dump is more valuable. So to
that end, add a special debug dump printing function that does not use
the normal log format. Instead, it prints only a breif prefix to use as
a grep search query. The new print out is listed below.

Since often the kernel logs are impressively long and obtuse, having a
clear debug search string can be helpful. With this log format, one can
simply do:

  $ grep __$CHIP__ kernel.log

And find any debug logs for the desired chip.

New log format - collected on a gv11b under L4T running
`nvgpu_submit_mmu_fault':

[   32.005793] nvgpu: 17000000.gv11b      gv11b_fb_mmu_fault_info_dump:311  [ERR]  [MMU FAULT] mmu engine id:  32, ch id:  511, fault addr: 0x1000, fault addr aperture: 0, fault type: invalid pde, access type: virt read,
[   32.006137] nvgpu: 17000000.gv11b      gv11b_fb_mmu_fault_info_dump:320  [ERR]  [MMU FAULT] protected mode: 0, client type: hub, client id:  host, gpc id if client type is gpc: 0,
[   32.006417] nvgpu: 17000000.gv11b                nvgpu_rc_mmu_fault:296  [ERR]  mmu fault id=0 id_type=1 act_eng_bitmask=00000000
[   32.007125] __gv11b__ Channel Status - chip gv11b
[   32.007128] __gv11b__ ---------------------------
[   32.007241] __gv11b__ 511-gv11b, TSG: 0, pid 955, refs: 2, deterministic:
[   32.007364] __gv11b__ channel status:  in use pending busy
[   32.007509] __gv11b__ RAMFC : TOP: 8000000000001000 PUT: 0000000000001030 GET: 0000000000001000 FETCH: 0000600000001000HEADER: 60400000 COUNT: 00000000SEMAPHORE: addr 0000000000000000payload 0000000000000000 execute 00000000
[   32.007601] __gv11b__
[   32.008696] __gv11b__
[   32.008700] __gv11b__ PBDMA Status - chip gv11b
[   32.008894] __gv11b__ -------------------------
[   32.013477] __gv11b__ pbdma 0:
[   32.017840] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.020992] __gv11b__   PBDMA_PUT 0000000000001030 PBDMA_GET 0000000000001000
[   32.029037] __gv11b__   GP_PUT    00000001  GP_GET  00000001  FETCH   00000001 HEADER 60400000
[   32.036386] __gv11b__   HDR       00000000  SHADOW0 00001000  SHADOW1 80003000
[   32.044787] __gv11b__ pbdma 1:
[   32.051964] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.055099] __gv11b__   PBDMA_PUT 0000000042003200 PBDMA_GET 00000050728bc914
[   32.062997] __gv11b__   GP_PUT    00000000  GP_GET  2080a000  FETCH   00000000 HEADER e1850010
[   32.070424] __gv11b__   HDR       00110000  SHADOW0 02000000  SHADOW1 10000004
[   32.078652] __gv11b__ pbdma 2:
[   32.085913] __gv11b__   id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[   32.088973] __gv11b__   PBDMA_PUT 00000021040c0004 PBDMA_GET 0000000140020000
[   32.096502] __gv11b__   GP_PUT    00000000  GP_GET  8080a440  FETCH   00000000 HEADER 61400040
[   32.103679] __gv11b__   HDR       14000010  SHADOW0 00000000  SHADOW1 00000400
[   32.112336] __gv11b__
[   32.119860] __gv11b__ gv11b eng 0:
[   32.122119] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.125807] __gv11b__
[   32.135954] __gv11b__ gv11b eng 1:
[   32.135958] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.139457] __gv11b__
[   32.149945] __gv11b__ gv11b eng 2:
[   32.149950] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.153543] __gv11b__
[   32.163598] __gv11b__ gv11b eng 3:
[   32.163601] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[   32.167278] __gv11b__
[   32.177076] __gv11b__
[   32.186145] nvgpu: 17000000.gv11b       nvgpu_tsg_set_ctx_mmu_error:492  [ERR]  TSG 0 generated a mmu fault
[   32.189443] nvgpu: 17000000.gv11b     nvgpu_set_err_notifier_locked:140  [ERR]  error notifier set to 31 for ch 511

JIRA NVGPU-5541

Change-Id: Iad60adfab5198ee11dd2ec595f2422ea541b7a2a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349166
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
5d06a59bc5 gpu: nvgpu: Cleanup uart and debugfs debug prints
The gk20a_debug_dump() function implicitly adds a newline since it
uses nvgpu_err() under the hood (for uart destined prints). For the
seq_file destined writes it does not so there is an annoying inconsistency.

Remove the newline that many of the gk20a_debug_dump() calls add and add
the newline to the (now) seq_printf() call. This reduces the length of
debug dump logs and speeds them up - UART is _very_ slow after all.

Also cleanup some formatting issues in the various debug prints I
happened to notice.

JIRA NVGPU-5541

Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
2077df9b1a gpu: nvgpu: use set_syncpt only with nvhost
nvgpu_channel_set_syncpt() is not useful if nvhost and thus syncpts are
missing and semaphores are used for synchronization. Require
CONFIG_TEGRA_GK20A_NVHOST to be set for the set_syncpt hal.

Jira NVGPU-5496

Change-Id: Ief8b4a0fb29af631817aba55c04181b1a360ce56
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2344064
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
b6bf13290e gpu: nvgpu: alloc correct prealloc buffer sizes
The trivial ringbuffer implementation in channel job list and priv cmd
buffers acts such that the buffer is full when the number of inserted
entries in it is one less than allocation size, similarly to the
hardware gpfifo. Take this into account when allocating the job tracking
resources: previously the allocation has been off-by-one too small.

Jira NVGPU-5492

Change-Id: If7bfd4919daa5b0328394ca289d5692c0d2b4f5f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342129
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
d916e85171 gpu: nvgpu: incr sync once submit is ready to go
Split out the max value increment and syncpt interrupt registration out
of nvgpu_channel_sync_incr*(). This API is called in the submit path to
prepare buffers and tracking resources, but later on in the submit path
errors can still occur so that the increment wouldn't happen (unless
artificially forced by sw).

The increment and irq registration cannot easily be undone and it makes
more sense to do these at the moment when the prepared job is finally
ready, so add a new nvgpu_channel_sync_mark_progress() API to be called
later in the submit path to signal that progress shall eventually happen
on the sync. Without this, the max value would stay too large after an
unsuccessful submit until the channel gets closed.

The sync object (syncpt or semaphore) is always exclusively owned by the
channel that allocated it, so nonatomically reading the max value first
in sync_incr() and incrementing it later in mark_progress() is racefree;
all submits per channel are serialized.

Change the channel syncpoint to client managed from host managed so that
nvhost-exported sync fences behave correctly with the temporary state
where the fence threshold is over the max value. Ideally we'd always
track nvgpu-owned syncpts' max values internally, but this is enough for
now.

Jira NVGPU-5491

Change-Id: Idf0bda7ac93d7f2f114cdeb497fe6b5369d21c95
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340465
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
a629b48013 gpu: nvgpu: split channel sema wakeup function
Extract the functionality to post semaphore signals to one channel into
a separate function for readability.

Jira NVGPU-5491

Change-Id: Ib5e8d34f42a64c253b3b3b8cb9e2c5dd2656fd1f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340466
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
23d6b36101 gpu: nvgpu: add dma_fence semaphore support
Support exporting and importing semaphore-based synchronization with the
stable dma-fence API. The "Android" sync fence API used until now is
deprecated.

The Android sync framework is still kept as the default.

Jira NVGPU-5353

Change-Id: I9e57947adeb4d2ef5d59135ed7d008553c44f97c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2336119
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
d44ed9d3a8 gpu: nvgpu: rollback gpfifo on error
Submitting new work may fail in the middle of writing the gpfifo
entries. Undo the increments on the gp_put shadow pointer in case of
error to avoid submitting wrong data during the next submit.

Jira NVGPU-5491

Change-Id: I064eaac8773b24da0a56db79ac6bfd07c008da03
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340464
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
f388b1f596 gpu: nvgpu: simplify cmdbuf construction in submit
Split out the wait cmd and incr cmd setup work in submit path to
separate functions to minimize cyclomatic complexity and to increase
readability.

Jira NVGPU-5491

Change-Id: I7dfabd2de287ae10aaae9fb8d4d85d752db8631c
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340463
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
3875c0825f gpu: nvgpu: avoid sema/channel dependencies
Move the per-channel hw semaphore object to be owned by the channel sync
(just like with syncpoints, too). Store just the channel ID in the hw
sema for debug prints to get rid of sema->channel dependencies. Make
nvgpu_semaphore_alloc() take a hw sema instead of a channel.

Fix up some channel-related documentation that has been incorrect.

Jira NVGPU-5353

Change-Id: I04d49da3aac50a4cea32e7393f48e6f85a80ca0d
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2339931
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
b8f398f6a7 gpu: nvgpu: clean up struct priv_cmd_entry
The valid flag is no longer useful as the lifetime of priv cmd entries
is clearer than before. Delete it. Delete also the stored gva that can
be calculated from the nvgpu_mem plus offset.

Jira NVGPU-4548

Change-Id: Ibf322acbb2ab1a454e9b644af24c02d291b75633
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
(cherry picked partially from commit
b9f6512e803873aaa92218dcbc090ff31a4f9c50)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332509
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
05df07945a gpu: nvgpu: avoid channel dependency in priv cmdbuf
The priv cmdbuf queue needs only the vm_gk20a of the channel that owns
it. Pass the vm to the queue constructor and have the channel code store
the queue to itself instead of poking at the channel from the queue
code. Adjust the cmdbuf queue api to take the queue, not the channel.

Move the inflight job fallback calculation to the channel code. The size
of the channel gpfifo isn't needed in the queue; just the job count is.

[not part of the cherry-pick: a bunch of MISRA mitigations.]

Jira NVGPU-4548

Change-Id: I4277dc67bb50380cb157f3aa3c5d57b162a8f0ba
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329659
(cherry picked from commit 83b2276f7bea563602eee20ce24b70ce70c8475a)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332508
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
991002c88b gpu: nvgpu: hide struct priv_cmd_entry
The type for entries allocated from the priv cmd queue is no longer
necessary to be visible for its users other than as an opaque handle,
except for a few minor debug prints. Make those prints output the entry
pointer value instead and move the struct definition to priv_cmdbuf.c.

Jira NVGPU-4548

Change-Id: Ia75ff41d840ac928561525a46d5973640e4b5f7e
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329658
(cherry picked from commit 3292cdadbc78ca129d1e0878c3947b0839487fc2)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332507
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
9bee2fe660 gpu: nvgpu: prealloc priv cmdbuf metadata
Move preallocation of priv cmdbuf metadata structs to the priv cmdbuf
level and do it always, not only on deterministic channels. This makes
job tracking simpler and loosens dependencies from jobs to cmdbuf
internals. The underlying dma memory for the cmdbuf data has always been
preallocated.

Rename the priv cmdbuf functions to have a consistent prefix.

Refactor the channel sync wait and incr ops to free any priv cmdbufs
they allocate. They have been depending on the caller to free their
resources even on error conditions, requiring the caller to know how
they work.

The error paths that could occur after a priv cmdbuf has been allocated
have likely been wrong for a long time. Usually the cmdbuf queue allows
allocating only from one end and freeing from only the other end, as
that's natural with the hardware job queue. However, in error conditions
the just recently allocated entries need to be put back. Improve the
interface for this.

[not part of the cherry-pick:] Delete the error prints about not enough
priv cmd buffer space. That is not an error. When obeying the
user-provided job sizes more strictly, momentarily running out of job
tracking resources is possible when the job cleanup thread does not
catch up quickly enough. In such a case the number of inflight jobs on
the hardware could be less than the maximum, but the inflight job count
that nvgpu sees via the consumed resources could reach the maximum.
Also remove the wrong translation to -EINVAL from err from one call to
nvgpu_priv_cmdbuf_alloc() - the -EAGAIN from the failed allocation is
important.

[not part of the cherry-pick: a bunch of MISRA mitigations.]

Jira NVGPU-4548

Change-Id: I09d02bd44d50a5451500d09605f906d74009a8a4
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329657
(cherry picked from commit 25412412f31436688c6b45684886f7552075da83)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332506
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6fc1e41150 gpu: nvgpu: split submit on deterministic
Avoid repetitive branching on the c->deterministic flag and on build
time flags by breaking the submit function on the runtime flag into two
functions of which one gets called.

In deterministic mode the job tracking conditions are simpler, there are
a few extra prechecks to guarantee deterministic latency and the
railgate corner case, and deferred cleanup is never done.

In nondeterministic mode job tracking has more conditions, a power
reference is taken for the job lifetime, and deferred cleanup is
assumed.

These two paths still share some common code. Split it to two more
functions to act as easy building blocks so that the main logic is
apparent.

Jira NVGPU-4548

Change-Id: I64f91dcf09acb16f409dc04a12ad1e144d0cce56
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333728
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
b077c6787d gpu: nvgpu: split sync and gpfifo work in submit
Make the big submit function somewhat shorter by splitting out the work
to do job allocation, sync command buffer creation and gpfifo writing
out to another function. To emphasize the difference between tracked and
fast submits, add two separate functions for those two cases.

Jira NVGPU-4548

Change-Id: I97432a3d70dd408dc5d7c520f2eb5aa9c76d5e41
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333727
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
dd2fb50a1a gpu: nvgpu: require deferred cleanup for aggressive sync destroy
Aggressive sync destroy is used on some platforms where the amount of
syncpoints is limited. It can cause sync objects to get allocated and
freed in the submit path and when jobs are cleaned up, so require
deferred cleanup. Allocations do not belong to job tracking in a
deterministic submit path.

Although this has been technically allowed before, deterministic
channels have likely not been a priority on those old platforms with
aggressive sync destroy set.

Update virtualized gp10b platform data to match on a gp10b-vgpu compat
string instead of gk20a-vgpu. gk20a (Tegra T124) hasn't been supported
for a long time. Delete the aggressive sync destroy field from this
platform. It's got enough syncpoints to not dynamically allocate them;
having this property set for gp10b-vgpu has likely been a mistake.

This is not a completely pure cherry-pick: also extend the gpu
characteristics to not advertise full deterministic submit support when
aggressive sync destroy is off. This platform flag cannot be adjusted by
the user unlike many other flags.

Jira NVGPU-4548

Change-Id: I283f546d48b79ac94b943d88e5dce55710858330
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322042
(cherry picked from commit b1ba2b997b2174e365bcb0782ef3e67260ff9e57)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328411
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
0b70fff5db gpu: nvgpu: fix job count calculation for non-pow2
The CIRC_SPACE and CIRC_CNT macros work as expected when the buffer size
is a power of two. The userspace-supplied number of inflight jobs is not
necessarily so. Compare the get and put pointers manually.

Jira NVGPU-4548

Change-Id: Ifa7bd6d78f82ec8efcac21fcca391053a2f6f311
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328572
(cherry picked from commit 33dffa1cfb142eea0f28474566c31b632eee04f5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331340
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00