Adjust documentation and validity checks in the fence functions for
simplicity.
Now that the cde code is using user fences cleanly, the
do-nothing-on-null action can cause unintended behaviour in new code
using nvgpu_fence_get and nvgpu_fence_put. It does not make sense to
call these with a null fence, so delete the checks.
Extend the documentation in nvgpu_fence_extract_user() for the os fence
lifetime to give a reason for the dup call.
Make nvgpu_fence_from_semaphore() and nvgpu_fence_from_syncpt() return
void. These fill a previously allocated object; the only failure would
have been a null object, but that never happens and is not acceptable
behaviour for callers so delete these null checks and fix types.
Jira NVGPU-5248
Change-Id: I9f82365d50ab5600374c8f7dd513691eac14a2f1
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359624
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Split out the max value increment and syncpt interrupt registration out
of nvgpu_channel_sync_incr*(). This API is called in the submit path to
prepare buffers and tracking resources, but later on in the submit path
errors can still occur so that the increment wouldn't happen (unless
artificially forced by sw).
The increment and irq registration cannot easily be undone and it makes
more sense to do these at the moment when the prepared job is finally
ready, so add a new nvgpu_channel_sync_mark_progress() API to be called
later in the submit path to signal that progress shall eventually happen
on the sync. Without this, the max value would stay too large after an
unsuccessful submit until the channel gets closed.
The sync object (syncpt or semaphore) is always exclusively owned by the
channel that allocated it, so nonatomically reading the max value first
in sync_incr() and incrementing it later in mark_progress() is racefree;
all submits per channel are serialized.
Change the channel syncpoint to client managed from host managed so that
nvhost-exported sync fences behave correctly with the temporary state
where the fence threshold is over the max value. Ideally we'd always
track nvgpu-owned syncpts' max values internally, but this is enough for
now.
Jira NVGPU-5491
Change-Id: Idf0bda7ac93d7f2f114cdeb497fe6b5369d21c95
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340465
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move preallocation of priv cmdbuf metadata structs to the priv cmdbuf
level and do it always, not only on deterministic channels. This makes
job tracking simpler and loosens dependencies from jobs to cmdbuf
internals. The underlying dma memory for the cmdbuf data has always been
preallocated.
Rename the priv cmdbuf functions to have a consistent prefix.
Refactor the channel sync wait and incr ops to free any priv cmdbufs
they allocate. They have been depending on the caller to free their
resources even on error conditions, requiring the caller to know how
they work.
The error paths that could occur after a priv cmdbuf has been allocated
have likely been wrong for a long time. Usually the cmdbuf queue allows
allocating only from one end and freeing from only the other end, as
that's natural with the hardware job queue. However, in error conditions
the just recently allocated entries need to be put back. Improve the
interface for this.
[not part of the cherry-pick:] Delete the error prints about not enough
priv cmd buffer space. That is not an error. When obeying the
user-provided job sizes more strictly, momentarily running out of job
tracking resources is possible when the job cleanup thread does not
catch up quickly enough. In such a case the number of inflight jobs on
the hardware could be less than the maximum, but the inflight job count
that nvgpu sees via the consumed resources could reach the maximum.
Also remove the wrong translation to -EINVAL from err from one call to
nvgpu_priv_cmdbuf_alloc() - the -EAGAIN from the failed allocation is
important.
[not part of the cherry-pick: a bunch of MISRA mitigations.]
Jira NVGPU-4548
Change-Id: I09d02bd44d50a5451500d09605f906d74009a8a4
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329657
(cherry picked from commit 25412412f31436688c6b45684886f7552075da83)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332506
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add an API to append data to a priv cmdbuf entry. Hold the write pointer
offset internally in the entry instead of having the user keep track of
where those words are written to.
This helps in eventually hiding struct priv_cmd_entry from users and
provides a more consistent interface in general. The wait and incr
commands are now slightly easier to read as well when they're just
arrays of data.
A syncfd-backed prefence may be composed of several individual fences.
Some of those (or even a fence backed by just one) may be already
expired, and currently the syncfd export design releases and nulls
semaphores when expired (see gk20a_sync_pt_has_signaled()) so for those
the wait cmdbuf is appended with zeros; the specific function is for
this purpose.
Jira NVGPU-4548
Change-Id: I1057f98c1b5b407460aa6e1dcba917da9c9aa9c9
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325099
(cherry picked from commit 6a00a65a86d8249cfeb06a05682abb4771949f19)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331336
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Instead of one HAL op with a boolean flag to decide whether to do one
thing or another entirely different thing, use two separate HAL ops for
filling priv cmd bufs with semaphore wait and semaphore increment
commands. It's already two ops for syncpoints, and explicit commands are
more readable than boolean flags.
Change offset into cmdbuf in sem wait HAL to be relative to the cmdbuf,
so the HAL adds the cmdbuf internal offset to it.
While at it, modify the syncpoint cmdbuf HAL ops' prototypes to be
consistent.
Jira NVGPU-4548
Change-Id: Ibac1fc5fe2ef113e4e16b56358ecfa8904464c82
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323319
(cherry picked from commit 08c1fa38c0fe4effe6ff7a992af55f46e03e77d0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328409
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
A corner case has existed since ancient times for syncpoint-backed
prefences to not cause a gpu wait if the fence is found to be completed
in the submit path. This adds some unnecessary complexity, so don't
check for completion in software. Let the gpu "wait" for these
known-to-be-trivial waits too. Necessary priv cmdbuf space has been
allocated anyway.
Originally nvhost had 16-bit fences which would wrap around relatively
quickly, so waiting for an old fence could have looked like waiting for
a fence that will expire long in the future. With 32-bit thresholds,
this hasn't been the case for several Tegra generations anymore, and
nvhost doesn't ignore waits like this either.
The wait priv cmdbuf in submit path can still be missing even with a
prefence supplied because the Android sync framework supports sync fds
that contain zero fences inside; this can happen at least when merging
fences that have all been expired. In such conditions the wait cmdbuf
wouldn't even get allocated.
[this is squashed with commit 8b3b0cb12d118 (gpu: nvgpu: allow no wait
cmd with valid input fence) from
https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325677]
Jira NVGPU-4548
Change-Id: Ie81fd8735c2614d0fedb7242dc9869d0961610eb
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321762
(cherry picked from commit 8f3dac44934eb727b1bf4fb853f019cf4c15a5cd)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324254
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Refactor user managed syncpoints out of the channel sync infrastructure
that deals with jobs submitted via the kernel api. The user syncpt only
needs to expose the id and gpu address of the reserved syncpoint. None
of the rest (fences, priv cmdbufs) is needed for that, so it hasn't been
ideal to couple with the user-allocated syncpts.
With user syncpts now provided by channel_user_syncpt, remove the
user_managed flag from the kernel sync api.
This allows moving all the kernel submit sync code to be conditionally
compiled in only when needed, and separates the user sync functionality
in a more clear way from the rest with a minimal API.
[this is squashed with commit 5111caea601a (gpu: nvgpu: guard user
syncpt with nvhost config) from
https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325009]
Jira NVGPU-4548
Change-Id: I99259fc9cbd30bbd478ed86acffcce12768502d3
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321768
(cherry picked from commit 1095ad353f5f1cf7ca180d0701bc02a607404f5e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319629
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following functions belong to the path of kernel_mode submit and
the flag CONFIG_NVGPU_KERNEL_MODE_SUBMIT is used to compile these out
of safety builds.
channel_gk20a_alloc_priv_cmdbuf
channel_gk20a_free_prealloc_resources
channel_gk20a_joblist_add
channel_gk20a_joblist_delete
channel_gk20a_joblist_peek
channel_gk20a_prealloc_resources
nvgpu_channel
nvgpu_channel_add_job
nvgpu_channel_alloc_job
nvgpu_channel_alloc_priv_cmdbuf
nvgpu_channel_clean_up_jobs
nvgpu_channel_free_job
nvgpu_channel_free_priv_cmd_entry
nvgpu_channel_free_priv_cmd_q
nvgpu_channel_from_worker_item
nvgpu_channel_get_gpfifo_free_count
nvgpu_channel_is_prealloc_enabled
nvgpu_channel_joblist_is_empty
nvgpu_channel_joblist_lock
nvgpu_channel_joblist_unlock
nvgpu_channel_kernelmode_deinit
nvgpu_channel_poll_wdt
nvgpu_channel_set_syncpt
nvgpu_channel_setup_kernelmode
nvgpu_channel_sync_get_ref
nvgpu_channel_sync_incr
nvgpu_channel_sync_incr_user
nvgpu_channel_sync_put_ref_and_check
nvgpu_channel_sync_wait_fence_fd
nvgpu_channel_update
nvgpu_channel_update_gpfifo_get_and_get_free_count
nvgpu_channel_update_priv_cmd_q_and_free_entry
nvgpu_channel_wdt_continue
nvgpu_channel_wdt_handler
nvgpu_channel_wdt_init
nvgpu_channel_wdt_restart_all_channels
nvgpu_channel_wdt_restart_all_channels
nvgpu_channel_wdt_rewind
nvgpu_channel_wdt_start
nvgpu_channel_wdt_stop
nvgpu_channel_worker_deinit
nvgpu_channel_worker_from_worker
nvgpu_channel_worker_init
nvgpu_channel_worker_poll_init
nvgpu_channel_worker_poll_wakeup_post_process_item
nvgpu_channel_worker_poll_wakeup_process_item
nvgpu_submit_channel_gpfifo_kernel
nvgpu_submit_channel_gpfifo_user
gk20a_userd_gp_get
gk20a_userd_pb_get
gk20a_userd_gp_put
nvgpu_fence_alloc
The following members of struct nvgpu_channel are compiled out of
safety build.
struct gpfifo_desc gpfifo;
struct priv_cmd_queue priv_cmd_q;
struct nvgpu_channel_sync *sync;
struct nvgpu_list_node worker_item;
struct nvgpu_channel_wdt wdt;
The following files are compiled out of safety build.
common/fifo/submit.c
common/sync/channe1_sync_semaphore.c
hal/fifo/userd_gv11b.c
Jira NVGPU-3479
Change-Id: If46c936477c6698f4bec3cab93906aaacb0ceabf
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127212
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Renamed gk20a_channel_* APIs to nvgpu_channel_* APIs.
Removed unused channel API int gk20a_wait_channel_idle
Renamed nvgpu_channel_free_usermode_buffers in os/linux-channel.c to
nvgpu_os_channel_free_usermode_buffers to avoid conflicts with the API
with the same name in channel unit.
Jira NVGPU-3248
Change-Id: I21379bd79e64da7e987ddaf5d19ff3804348acca
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2121902
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There are many miscellaneous HALs for various MM related functionality.
This patch aims to migrate all the remaining MM code from the <chip>/
mm_<chip>.[ch] files in HAL files under hal/.
Much of this is fairly straightforward copy/paste and updates to the
HAL init files.
The exception to that is the move of the left over gv11b MMU fault
handling code in mm_gv11b.c. Having both a hal/mm/mm/mm_gv11b.c and
a gv11b/mm_gv11b.c file causes tmake to choke so the gv11b/mm_gv11b.c
file was moved to gv11b/mmu_fault_gv11b.c. This will be cleaned up in
a subsequent patch.
JIRA NVGPU-2042
Change-Id: I12896de865d890a61afbcb71159cff486119ffb8
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2109050
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make the APIs nvgpu_channel_sync_get_syncpt_id() and
channel_sync_syncpt_get_id() return u32s rather than converting to
ints and back.
Also define FIFO_INVAL_SYNCPT_ID to use for invalid syncpt IDs rather
than using magic numbers.
JIRA NVGPU-1008
Change-Id: I4dde6b15fd3708fb0126b46c6fea8ac1b447c7ce
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2014821
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
sync cmbbuf specific ops pointers are moved into a new struct sync_ops
under the parent struct gpu_ops. The HAL assignments to the gk20a and
gv11b versions are updated to match the new struct type.
Jira NVGPU-1308
Change-Id: I1d9832ed5e938cb65747f0f6d34088552f75e2bc
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1975919
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The container_of() macro used in nvgpu produces the following set
of MISRA required rule violations:
* Rule 11.3 : A cast shall not be performed between a pointer to
object type and a pointer to a different object to type.
* Rule 11.8 : A cast shall not remove any const or volatile
qualification from the type pointed to be a pointer.
* Rule 20.7 : Expressions resulting from the expansion of macro
parameters shall be enclosed in parentheses.
Using the same modified implementation of container_of() as that
used in the nvgpu_list_node/nvgpu_rbtree_node routines eliminates
the Rule 11.8 and Rule 20.7 violations and exchanges the Rule 11.3
violation with an advisory Rule 11.4 violation.
This patch uses that same equivalent implementation in two new
(static) functions that are used to replace references to
container_of() in common channel sync code:
* nvgpu_channel_sync_semaphore_from_ops
* nvgpu_channel_sync_syncpt_from_ops
It should be noted that replacement functions still contain
potentially dangerous (and non-MISRA compliant code) and that it is
expected that deviation requests will be filed for the new advisory
rule violations accordingly.
JIRA NVGPU-782
Change-Id: Ib4279c7d28824ce5888838580a54b573323f7152
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1993787
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
struct nvgpu_channel_sync is moved to a private header i.e.
channel_sync_priv.h present in common/sync/. All accesses to callback
functions inside the struct nvgpu_channel_sync in NVGPU driver is replaced by
the public channel_sync specific APIs.
Jira NVGPU-1093
Change-Id: I52d57b3d458993203a3ac6b160fb569effbe5a66
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929783
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add the following syncpt specific APIs
nvgpu_channel_sync_get_syncpt_id
nvgpu_channel_sync_get_syncpt_address
nvgpu_channel_sync_wait_syncpt
nvgpu_channel_sync_to_syncpt
nvgpu_channel_sync_syncpt_create
Definition of struct nvgpu_channel_sync_syncpt and syncpoint
specific static implementations of the channel_sync callbacks
as well as definitions of the public APIs are moved to a
separate execution unit i.e. channel_sync_syncpt.c
Jira NVGPU-1093
Change-Id: Ib0163c6b9bc6dfc2ab2a2b7a5fa5027be13316e2
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929780
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>