BIT() is defined to return a 64-bit value. We use it to build the log
mask values, but the functions that accept a log mask take only a u32
parameter.
Use u64 as the log mask parameter type in the logging functions so the
widths match.
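Roughly, the signature change looks like this (the function name is
illustrative, not necessarily the exact nvgpu helper):

  /* Before: a mask built with BIT(n) for n >= 32 is truncated. */
  void nvgpu_log(struct gk20a *g, u32 log_mask, const char *fmt, ...);

  /* After: the parameter width matches BIT()'s 64-bit result. */
  void nvgpu_log(struct gk20a *g, u64 log_mask, const char *fmt, ...);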
Change-Id: I6f0803a7d04ee6a2ee725b5defc4cc14b5b7acf5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683818
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
If golden context creation happens before any GPU railgate, channel
creation always works. But if a GPU railgate occurs after GPU finalize
poweron and before golden context creation, then golden context
creation fails during the first channel creation with a watchdog
timeout from ctxsw, because the ctxsw state is invalid.
To fix this, if the golden context has not been created yet, always
query the ctxsw image sizes during finalize poweron; this puts the
ctxsw hardware into the correct state before golden context creation.
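A rough sketch of the finalize-poweron check (field and HAL names
assumed for illustration):

  /* Re-query ctxsw image sizes until the golden context exists, so
     the ctxsw HW is back in a valid state after a railgate cycle. */
  if (!g->gr.ctx_vars.golden_image_initialized)
      err = g->ops.gr.init_ctx_state(g);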
Bug 2051863
Change-Id: I81d221100a099b12bad3adc2d252de4621c335a5
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1682265
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gk20a_create_priv_addr_table() and
gv11b_gr_egpc_etpc_priv_addr_table(), we create a table of unicast
addresses from broadcast addresses.
For GPC broadcast addresses like NV_PGRAPH_PRI_EGPCS_ETPC6_SM_*, we
generate the table assuming every GPC has 7 TPCs. This is incorrect in
some cases, e.g. on GV100 where GPC0/1 have only 6 TPCs, so we end up
generating registers that do not exist.
Fix this by explicitly checking the number of TPCs and ensuring that
each generated address belongs to a valid TPC.
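A rough sketch of the added bounds check (the per-GPC TPC count field
and address helpers are assumed for illustration):

  /* Generate unicast addresses only for TPCs that exist in this GPC;
     on GV100, GPC0/1 have 6 TPCs while the others have 7. */
  for (tpc_num = 0; tpc_num < g->gr.gpc_tpc_count[gpc_num]; tpc_num++)
      priv_addr_table[t++] = pri_tpc_addr(g, pri_tpccs_addr_mask(addr),
                                          gpc_num, tpc_num);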
Bug 200400376
Jira NVGPU-564
Change-Id: I65d7d6cd7f0bf16171eb54ed71f1f3840ade3495
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1686806
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When compiling regops_gk20a.c, clang emits the following warning:
../drivers/gpu/nvgpu/gk20a/regops_gk20a.c:464:30: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality]
if (unlikely(skip_read_lo == false)) {
~~~~~~~~~~~~~^~~~~~~~
../drivers/gpu/nvgpu/gk20a/regops_gk20a.c:464:30: note: remove extraneous parentheses around the comparison to silence this warning
if (unlikely(skip_read_lo == false)) {
~ ^ ~
../drivers/gpu/nvgpu/gk20a/regops_gk20a.c:464:30: note: use '=' to turn this equality comparison into an assignment
if (unlikely(skip_read_lo == false)) {
^~
=
1 error generated.
The code is in fact fine: the extra parentheses come from the
expansion of the unlikely() macro itself. The warning is simple enough
to work around by just deleting the unlikely() call; we don't do
anything with that hint anyway.
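Before/after, abbreviated:

  /* Before: unlikely()'s expansion adds the parentheses that trip
     -Wparentheses-equality. */
  if (unlikely(skip_read_lo == false)) {

  /* After: drop the hint; the comparison itself is unchanged. */
  if (skip_read_lo == false) {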
JIRA NVGPU-525
Change-Id: I674855ad08daf65ac6d79ceab7d4f56f637d4437
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673818
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch declares globally a few channel/fifo HAL functions required
for QNX code compilation (as they are referenced elsewhere in the QNX
code). This is required as part of bringing the nvgpu channel/FIFO HAL
into QNX.
Jira VQRM-3058
Change-Id: Ia176535b64de981d2f7ddb20f62015a0da74fd2a
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1662411
GVS: Gerrit_Virtual_Submit
Tested-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Dump timeout save0 and save1 even though they can be unreliable when
fecs_tgt is set in save0. This is good to have for debug purposes.
- Add a priv_ring HAL for decode_error_code.
- Decode the FECS error code for supported error types.
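A rough sketch of the new hook (the gp10b function name is assumed for
illustration):

  /* Wired per chip so the ISR can translate FECS error codes into
     readable messages for the supported error types. */
  g->ops.priv_ring.decode_error_code = gp10b_priv_ring_decode_error_code;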
Bug 1998067
Change-Id: I60cb6902d099df4a7df45fa624e44d9e0d46360f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1683014
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Current code does not compute priv error register offsets
properly. This leads to invalid decoding of priv errors, and
can also trigger additional priv errors.
- add GPU_LIT_GPC_PRIV_STRIDE define
- return proj_gpc_priv_stride for GPU_LIT_GPC_PRIV_STRIDE in HALs
- use GPU_LIT_GPC_PRIV_STRIDE instead of GPU_LIT_GPC_STRIDE in
g->ops.priv_ring.isr() to compute priv error register offsets.
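A rough sketch of the corrected offset computation (the register
accessor name is illustrative):

  /* GPC priv error registers are spaced by the priv stride, which is
     not the same as the general GPC stride. */
  u32 gpc_priv_stride = nvgpu_get_litter_value(g, GPU_LIT_GPC_PRIV_STRIDE);
  u32 error_info = gk20a_readl(g, pri_gpc0_priv_error_info_r() +
                               gpc * gpc_priv_stride);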
Bug 2093058
Change-Id: Ia7c36ccba0441126784bb0e00452f2cf1196ef71
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1682118
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Instead of looping over all jobs and releasing their semaphores
separately, do just one semaphore release. All the jobs use the same
sema index, and its final, maximum value is known.
Also move this resetting into ch->sync->set_min_eq_max() to be
consistent with syncpoints.
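Roughly (the reset helper comes from the companion semaphore change;
placement assumed):

  /* In set_min_eq_max(): one reset to the known final value releases
     everything, since all jobs share the same sema index. */
  nvgpu_semaphore_reset(hw_sema);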
Change-Id: I03601aae67db0a65750c8df6b43387c042d383bd
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1680362
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Semaphores don't need to be released from the CPU anymore, so clarify
the code by deleting nvgpu_semaphore_release() and refactoring
__nvgpu_semaphore_release() into nvgpu_semaphore_reset(), which only
"fast-forwards" the semaphore to a later value.
While doing this, the meaning of nvgpu_semaphore_incr() changes, so
rename it to nvgpu_semaphore_prepare(). Now it's only used to prepare an
nvgpu_semaphore for a value that the HW will increment the sema to.
Also change the BUG_ON that guards sema double-inits into just a WARN_ON.
Change-Id: I6f6df368ec5436cc97a229697742b6a4115dca51
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1680361
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Accept submits on deterministic channels even when the prefence is a
syncfd, but only if it contains just one fence. Because
NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE is shared between pre- and
postfences, a postfence (SUBMIT_GPFIFO_FLAGS_FENCE_GET) is still not
allowed at the same time, though.
The sync framework is problematic for deterministic channels due to
certain allocations that are not controlled by nvgpu. However, that
only applies to postfences, yet we have been disallowing
FLAGS_SYNC_FENCE for deterministic channels even when no postfence is
needed.
Bug 200390539
Change-Id: I099bbadc11cc2f093fb2c585f3bd909143238d57
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1680271
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The MAX/threshold value of a user-managed syncpoint is not tracked by
nvgpu, so if a channel is reset by nvgpu there could still be waiters
waiting on some user syncpoint fence.
Fix this by setting a large, safe value on the user-managed syncpoint
when aborting the channel and when closing the channel.
We currently increment the current value by 0x10000, which should be
sufficient to release any pending waiter.
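A rough sketch of the abort/close path (helper and field names assumed
for illustration):

  /* Push the syncpoint value 0x10000 past its current value so any
     waiter on a user fence threshold is released. */
  if (ch->user_sync)
      nvgpu_nvhost_syncpt_set_safe_state(g->nvhost_dev, syncpt_id);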
Bug 200326065
Jira NVGPU-179
Change-Id: Ie6432369bb4c21bd922c14b8d5a74c1477116f0b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1678768
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The GV100 ucode has been changed to expect the
LIST_nv_perf_pma_ctx_reg list in the ctxsw buffer to be 256-byte
aligned, but the same change has not been applied to the other chips'
ucodes.
Add a new HAL (*add_ctxsw_reg_perf_pma) to configure the PMA register
list, and define a common HAL gr_gk20a_add_ctxsw_reg_perf_pma() for
all chips except GV100.
Define a separate HAL for GV100, gr_gv100_add_ctxsw_reg_perf_pma(),
and fix the required alignment in that function.
Bug 1998067
Change-Id: Ie172fe90e2cdbac2509f2ece953cd8552e66fc56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676655
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For LIST_nv_pm_fbpa_ctx_regs, we currently call
add_ctxsw_buffer_map_entries_subunits() to add the registers
corresponding to all the FBPAs.
But when configuring the total number of registers, we do not account
for floorswept FBPAs, and that misaligns all subsequent lists on
GV100.
Fix this by reading the disabled/floorswept FBPAs from fuses and
considering only the active FBPAs on GV100.
Add a new HAL (*add_ctxsw_reg_pm_fbpa) to support this setting and
define a common HAL gr_gk20a_add_ctxsw_reg_pm_fbpa() for all chips
except GV100.
Define a GV100-specific gr_gv100_add_ctxsw_reg_pm_fbpa() with the
above-mentioned implementation to account for floorsweeping.
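A rough sketch of the GV100 counting logic (the fuse accessor is
illustrative):

  /* Count only FBPAs that are not floorswept, so the register list
     length matches what the ucode's context image expects. */
  u32 fbpa_mask = ~g->ops.fuse.fuse_status_opt_fbpa(g) &
                  ((1U << num_fbpas) - 1U);
  u32 active_fbpa_count = hweight32(fbpa_mask);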
Bug 1998067
Change-Id: Id560551bb0b8142791c117b6d27864566c90b489
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676654
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Delete the proxy waiter for non-semaphore-backed syncfds in the sema
wait path to simplify the code, to remove dependencies on the sync
framework (and thus Linux), and to support upcoming refactorings. This
feature has never actually been used for foreign fences.
Jira NVGPU-43
Jira NVGPU-66
Change-Id: I2b539aefd2d096a7bf5f40e61d48de7a9b3dccae
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665119
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The pre-fence wait for semaphores in the submit path has supported a
fast path for fences that have only one underlying semaphore. The fast
path just inserts the wait on this sema to the pushbuffer directly. For
other fences, the path has been using a CPU wait indirection, signaling
another semaphore when we get the CPU-side callback.
Instead of only supporting prefences with one sema, unroll all the
individual semaphores and insert a wait for each into the pushbuffer,
like we've already been doing with syncpoints. Now all sema-backed
syncs get the fast path. This simplifies the logic and makes it more
explicit that only foreign fences need the CPU wait.
There is no need to hold references to the sync fence or the semas
inside: this submitted job only needs the global read-only sema mapping
that is guaranteed to stay alive while the VM of this channel stays
alive, and the job does not outlive this channel.
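A rough sketch of the unrolled wait emission (helper names assumed for
illustration):

  /* One HW acquire per underlying sema, as is already done for
     syncpoints; no CPU-side callback indirection is needed. */
  for (i = 0; i < num_fences; i++) {
      struct nvgpu_semaphore *s = sema_from_fence_pt(pts[i]);
      add_sema_cmd(g, c, s, wait_cmd, pos, true /* acquire */, false);
  }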
Jira NVGPU-43
Jira NVGPU-66
Jira NVGPU-513
Change-Id: I7cfbb510001d998a864aed8d6afd1582b9adb80d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1636345
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Created volt ops under pmu_ver to support volt_set_voltage,
  volt_get_voltage & volt_send_load_cmd_to_pmu.
- Renamed the gp10x volt load, set_voltage & get_voltage method names.
- Added new volt load, set_voltage & get_voltage methods for gv10x
  using RPC, and added code to handle the ack in pmu_rpc_handler()
  along with struct rail_list changes.
- Updated the volt ops of gp106 & gv100 to point to the respective
  methods.
- Added members volt_dev_idx_ipc_vmin & volt_scale_exp_pwr_equ_idx to
  "struct nv_pmu_volt_volt_rail_boardobj_set" & "struct voltage_rail",
  and made changes to update the members as needed.
- Added member volt_scale_exp_pwr_equ_idx to
  "struct vbios_voltage_rail_table_1x_entry" to read the value from
  the VBIOS table & update the rail boardobj set interface.
- Added defines for the volt RPCs "NV_PMU_RPC_ID_VOLT_*".
- Defined structs for volt load, set_voltage & get_voltage to execute
  the volt RPCs.
Change-Id: I4a41adcf7536468beaa8a73f551b1d608aabd161
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1659728
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Updated nvgpu_pmu_rpc_execute() with a new parameter "bool
  is_copy_back" to support copying the processed RPC request back from
  the PMU to the caller when true; this blocks the method until an ACK
  is received from the PMU for the requested RPC.
- Added "struct rpc_handler_payload" to hold the info required by the
  RPC handler, like the RPC buffer address, and to clear the memory if
  copy-back is not requested.
- Added define PMU_RPC_EXECUTE_CPB to support copying the processed
  RPC request back from the PMU to the caller.
- Updated the RPC callback handler support: created memory & assigned
  a default handler if a callback is not requested, else used the
  callback parameter data for the request to the PMU.
- Added define PMU_RPC_EXECUTE_CB to support callbacks.
- Updated pmu_wait_message_cond() to restrict the condition check to
  8 bits instead of 32 bits.
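A rough usage sketch of the copy-back variant (the RPC unit and
argument names are illustrative):

  /* Blocks until the PMU acks, then copies the processed RPC back
     into 'rpc' so the caller can read the results. */
  PMU_RPC_EXECUTE_CPB(status, pmu, VOLT, VOLT_GET_VOLTAGE, &rpc, 0);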
Change-Id: Ic05289b074954979fd0102daf5ab806bf1f07b62
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664962
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
struct nvgpu_semaphore represents (mainly) a threshold value that a sema
at some index will get and struct nvgpu_semaphore_int (aka "hw_sema")
represents the allocation (and write access) of a semaphore index and
the next value that the sema at that index can have. The threshold
object doesn't need a pointer to the sema allocation that is not even
guaranteed to exist for the whole threshold lifetime, so replace the
pointer by the position of the sema in the sema pool.
This requires some modifications to pass a hw sema around explicitly
because it now represents write access more explicitly.
Also delete the index field of semaphore_int, because it can be
derived directly from the offset of the sema location and is thus
unnecessary.
Jira NVGPU-512
Change-Id: I40be523fd68327e2f9928f10de4f771fe24d49ee
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658102
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add an __nvgpu_sgl_phys function that can be used to implement
IPA-to-PA translation in a subsequent change.
Adapt existing function prototypes to add a pointer to the GPU
context, as we will need it to check whether IPA-to-PA translation is
needed.
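A rough sketch of the new helper in the Linux backend (exact body
assumed):

  u64 __nvgpu_sgl_phys(struct gk20a *g, struct nvgpu_sgl *sgl)
  {
      /* 'g' is the context a follow-up change will use to decide
         whether IPA-to-PA translation applies; pass through for now. */
      return sg_phys((struct scatterlist *)sgl);
  }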
JIRA EVLR-2442
Bug 200392719
Change-Id: I5a734c958c8277d1bf673c020dafb31263f142d6
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1673142
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replace the padding in nvgpu_channel_wdt_args with a timeout value in
milliseconds, and add NVGPU_IOCTL_CHANNEL_WDT_FLAG_SET_TIMEOUT to
signify the existence of this new field. When the new flag is included
in the value of wdt_status, the field is used to set a per-channel
timeout to override the per-GPU default.
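A rough sketch of the updated args struct (field names assumed):

  struct nvgpu_channel_wdt_args {
      __u32 wdt_status; /* ENABLE/DISABLE, plus the new SET_TIMEOUT
                           and DISABLE_DUMP flags */
      __u32 timeout_ms; /* replaces the old padding; used when
                           SET_TIMEOUT is set */
  };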
Add NVGPU_IOCTL_CHANNEL_WDT_FLAG_DISABLE_DUMP to disable the long debug
dump when a timed-out channel gets recovered by the watchdog. Printing
the dump to the serial console easily takes several seconds. (Note
that NVGPU_IOCTL_CHANNEL_SET_TIMEOUT_EX separately has
NVGPU_TIMEOUT_FLAG_DISABLE_DUMP for ctxsw timeouts as well.)
The behaviour of NVGPU_IOCTL_CHANNEL_WDT is changed so that either
NVGPU_IOCTL_CHANNEL_ENABLE_WDT or NVGPU_IOCTL_CHANNEL_DISABLE_WDT has to
be set. The old behaviour was that other values were silently ignored.
The usage of the global default debugfs-controlled ch_wdt_timeout_ms is
changed so that its value takes effect only for newly opened channels
instead of in real time. Also, a zero value no longer means that the
watchdog is disabled; there is a separate flag for that after all.
gk20a_fifo_recover_tsg used to ignore the value of "verbose" when no
engines were found. Correct this.
Bug 1982826
Bug 1985845
Jira NVGPU-73
Change-Id: Iea6213a646a66cb7c631ed7d7c91d8c2ba8a92a4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1510898
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Created ops for the boardobj methods below to support the gp10x &
  gv10x branch boardobj changes, and defined the gv10x methods with a
  _v1 postfix and the below names:
    boardobjgrp_pmucmd_construct_impl
    boardobjgrp_pmuset_impl
    boardobjgrp_pmugetstatus_impl
    is_boardobjgrp_pmucmd_id_valid
- These ops are assigned to the respective chip based on the PMU
  version.
- Modified BOARDOBJGRP_PMU_CMD_GRP_SET_CONSTRUCT &
  BOARDOBJGRP_PMU_CMD_GRP_GET_STATUS_CONSTRUCT to support the gp10x &
  gv10x branch changes.
- Updated struct boardobjgrp_pmu_cmd to include the members needed
  for the gv10x boardobj changes.
- Created "struct nv_pmu_rpc_struct_board_obj_grp_cmd" to execute
  BOARD_OBJ_GRP_CMD using RPC.
- Defined method boardobjgrp_pmucmdsend_rpc() to send
  BOARD_OBJ_GRP_CMD to the PMU.
Change-Id: If2551bdda80e897e7b21d2966881586f3bbc7a9b
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656511
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Added ops "pmu.alloc_super_surface" to create
memory space for pmu super surface
- Defined method nvgpu_pmu_sysmem_surface_alloc()
to allocate pmu super surface memory & assigned
to "pmu.alloc_super_surface" for gv100
- "pmu.alloc_super_surface" set to NULL for gp106
- Memory space of size "struct nv_pmu_super_surface"
is allocated during pmu sw init setup if
"pmu.alloc_super_surface" is not NULL &
free if error occur.
- Added ops "pmu_ver.config_pmu_cmdline_args_super_surface"
to describe PMU super surface details to PMU ucode
as part of pmu command line args command if
"pmu.alloc_super_surface" is not NULL.
- Updated pmu_cmdline_args_v6 to include member
"struct flcn_mem_desc_v0 super_surface"
- Free allocated memory for PMU super surface in
nvgpu_remove_pmu_support() method
- Added "struct nvgpu_mem super_surface_buf" to "nvgpu_pmu" struct
- Created header file "gpmu_super_surf_if.h" to include interface
about pmu super surface, added "struct nv_pmu_super_surface"
to hold super surface members along with rsvd[x] dummy space
to sync members offset with PMU super surface members.
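A rough sketch of the init-time allocation (the allocator signature is
assumed; names come from the list above):

  if (g->ops.pmu.alloc_super_surface) {
      err = nvgpu_pmu_sysmem_surface_alloc(g, &pmu->super_surface_buf,
              sizeof(struct nv_pmu_super_surface));
      if (err)
          return err;
  }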
Change-Id: I2b28912bf4d86a8cc72884e3b023f21c73fb3503
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656571
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When adding a sema wait to a pushbuffer, verify that the sema
threshold has been incremented from the original value by reading the
'incremented' field instead of 'value' (which is set to nonzero by
nvgpu_semaphore_incr()). The value could be 0 even after an increment
if new semas weren't reset to 0.
Jira NVGPU-514
Change-Id: I295451fbc7eb9e597aea12d73074e99f74a6a899
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658100
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>