Currently after sending change seq RPC, nvgpu waits for a fixed time
of 20ms.
This CL replaces this with pmu_wait_message_cond, which will return
immediately after getting change seq completion event.
Also added debug fs node to get the change seq execution time.
Bug 200545366
Change-Id: Iba283f65d4949858be9cbff88de4d21a8c92ff81
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2202423
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
golden image size will be set when memory allocated.
See function:
- nvgpu_gr_obj_ctx_init
If golden image size is 0, gr_golden_image should be a NULL
pointer in most cases. So add NULL pointer checking in
tpc_pg_mask_store to avoid NULL pointer exception.
Bug 2403210
Change-Id: I14df5cd94d7a4418c3089c5f84b6eab93c485ba6
Signed-off-by: Sunny Li <sunnyl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2161280
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch addresses misra violations due to SDL error reporting
callbacks. In particular, it addresses the following misra violation:
- misra_c_2012_directive_4_7_violation: Calling function
"nvgpu_report_*_err()" which returns error information without testing
the error information.
JIRA NVGPU-4025
Change-Id: Ia10b6b3fd9c127a8c5189c3b6ba316f243cedf04
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196895
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
ifdef function prototypes with CONFIG_* defines. This fixes MISRA rule
8.6 violations which complain about undefined functions.
Also moved nvgpu_channel_get_from_file prototype to ioctl_channel.h &
nvgpu_probe to driver_common.h as those are linux specific. Define
nvgpu_init_soc_vars in posix/soc.c as it is implemented in QNX.
JIRA NVGPU-3873
Change-Id: I5d2b238e1b5d1318867cd2416ac5f03cc6ab7c6a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196794
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In some cases, we would get deadlock issue due to there are two locks
acquisition on common clk driver's lock and nvgpu driver's locks. At
the bug, inconsistent lock ordering problem will come with one thread
gets "nvgpu lock -> clk lock" and the other thread gets "clk lock ->
nvgpu lock".
Slove the latter path with one-time initializing clk_parent entry
and use cached data afterward.
Bug 2555115
Change-Id: I31c5c2728f406307e7cfd4e555f4db0c163234d8
Signed-off-by: Jeremy Ho <jeremyh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2146727
(cherry picked from commit 42c2bdfb9f)
Reviewed-on: https://git-master.nvidia.com/r/2160290
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add macros for whitelisting coverity violations. These macros use pragma
directives. The pragma directives and whitelisting macros are only
enabled when a coverity scan is being run.
The whitelisting macros have been added to a new header called
static_analysis.h. The contents of safe_ops.h (CERT C safe ops) have
been moved into static_analysis.h because this will be the new header
for static analysis related macros/defines/etc.
JIRA NVGPU-3820
Change-Id: I9c63f20f670880b420415535738034619314b7c3
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180600
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add posix support for nvgpu_request_firmware and
nvgpu_release_firmware calls.
In x86, needed firmware are copied under userspace/firmware
directory.For jetson, firmware files will be copied under
nvgpu_unit/firmware directory.
Update Makefile.tmk to copy firmware in systemimage under
nvgpu_unit/firmware directory.
Jira NVGPU-3582
Bug 2693908
Change-Id: I5f5e5819dc5501e587bc8afc0a3944c18a8e9bef
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2189493
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This reverts commit 2a7e6a1111c2e52df2eae22fd084f0c955ed0759.
Bug 2693908
Change-Id: Id9ed7a6b18929cf1b319a54aca227c7c36515f26
Signed-off-by: Bo Yan <byan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2189199
Add posix support for nvgpu_request_firmware and
nvgpu_release_firmware calls.
In x86, needed firmware are copied under userspace/firmware
directory. For jetson, firmware files will be copied under
nvgpu_unit/firmware directory.
Update Makefile.tmk to copy firmware under systemimage under
nvgpu_unit/firmware directory
Jira NVGPU-3582
Change-Id: I9ce729af797e59c8d41a1aa4ee964d7d9b8b666e
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2181572
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt.
If channel is enabled multiple times by userspace, then ref count is
updated accordingly. There is an expectation that enable/disable
calls are balanced for setting channel's mmu debug mode.
When unbinding the channel, decrease refcnt for the channel until it
reaches 0.
Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it
can be retrieved from ch.
Bug 2515097
Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2184702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch adds nvgpu API in linux and posix to query vpr resize.
The new API nvgpu_is_vpr_resize_enabled() is used in
nvgpu_submit_channel_gpfifo().
Previously, if non-deterministic channel has timeout disabled and
GPU cannot railgate on some platform, then channel doesn't power ref
count and results in video freeze. To resolve non-determinstic channel
job tracking needs to be enabled if vpr resize is supported or if GPU
can railgate.
Bug 200532122
Change-Id: Icfbff6253762b195b2f5955749343974b1a7a269
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2171093
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For safety build, nvgpu driver should enter SW quiesce state
in case an uncorrectable error has occurred. In this state, any
activity on the GPU should be prevented, without powering off the GPU.
Also, a minimal set of operations should be used to enter SW quiesce
state.
Entering SW quiesce state does the following:
- set sw_quiesce_pending: when this flag is set, interrupt
handlers exit after masking interrupts. This should help mitigate
an interrupt storm.
- wake up thread to complete quiescing.
The thread performs the following:
- set NVGPU_DRIVER_IS_DYING to prevent allocation of new resources
- disable interrupts
- disable fifo scheduling
- preempt all runlists
- set error notifier for all active channels
Note: for channels with usermode submit enabled, userspace can
still ring doorbell, but this will not trigger any work on
engines since fifo scheduling is disabled.
Jira NVGPU-3493
Change-Id: I639a32da754d8833f54dcec1fa23135721d8d89a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2172391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Add doxygen documentation.
- Remove unused fields of nvgpu_tsg struct:
-- timeslice_timeout
-- timeslice_scale
- Remove unused functions:
-- nvgpu_tsg_set_runlist_interleave
- nvgpu_tsg_post_event_id is not supported in safety build.
This function is moved under CONFIG_NVGPU_CHANNEL_TSG_CONTROL
compiler flag.
- Below functions are moved under CONFIG_NVGPU_KERNEL_MODE_SUBMIT
nvgpu_tsg_ctxsw_timeout_debug_dump_state
nvgpu_tsg_set_ctxsw_timeout_accumulated_ms
- Rename
gk20a_is_channel_active -> nvgpu_tsg_is_channel_active
release_used_tsg -> nvgpu_tsg_release_used_tsg
- nvgpu_tsg_unbind_channel_common declared static
- Fix build issue when CONFIG_NVGPU_CHANNEL_TSG_CONTROL is disabled
Remove CONFIG_NVGPU_CHANNEL_TSG_CONTROL for
nvgpu_gr_setup_set_preemption_mode as it is needed in safety build.
By default compute preemption mode will be set to WFI. CUDA will
change it to CTA during context init time.
JIRA NVGPU-3595
Change-Id: I8ff6cabc8b892c691d951c37cdc0721e820a0297
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2151489
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
test_falcon_halt failed as nvgpu_timeout_expired returned -ETIMEDOUT when
time equal to timeout is reached and nvgpu_timeout_peek_expired returns
false when time is equal or less and true when time is greater than
timeout, leading to inconsistent return value.
Update nvgpu_timeout_expired_msg_cpu logic that is used by former.
JIRA NVGPU-3946
Change-Id: I365063cc12a584833c08ca710bb795c0e9d814cd
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180233
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changes added to support "rmmod nvgpu" in dgpu simulation after gpu
poweron.
nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec
engines in simulation as simulation doesnt support timeout.
These engines are not valid engines in nvgpu engine list.
Add nvgpu_engine_check_valid_id before checking engine status.
Simulation crash on accessing 0xb81604 top interrupt register.
Add func_priv_cpu_intr_top__size_1_v() function to get the supported
size than using default MAX_INTR_TOP_REGS.
nvlink is not supprted in dgpu simulation. Avoid warning for
-ENODEV return.
Avoid register read following gpu power off completion.
Bug 2498574
Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179930
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
CE app functionality from nvgpu is non-safe for igpu. CE engines init
/reset/cg related functionality is required in safety. Hence move the
CE app logic under CONFIG_NVGPU_DGPU flag and update the sources
accordingly.
JIRA NVGPU-3814
Change-Id: I37aa00b1184baccd5fe569ec315be60ac42dac9b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2168956
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For dGPU with PCIE interface do not have a thermal alert pin.
Only platforms where dGPU is used with SXM interface have the
thermal alert pin.
This change makes sure that if nvgpu-therm-gpio DT entry is
is missing we don't fail probe but continue with GPU
initialization without enabling thermal alert feature.
Bug 200542024
Change-Id: Iaf3aec9b66695a45daf86ecfdeec398b66f96bfd
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2173495
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- In GV11B, read fuse_status_opt_tpc_gpc register
to read which TPCs are floorswept.
- The driver will also read sysfs node: tpc_pg_mask
- Based on these two values "can_tpc_powergate" will
be set to true or false and mask will be used to write to
fuse_ctrl_opt_tpc_gpc register to powergate the TPC.
- can_tpc_powergate = true indicates that the mask value
sent from userspace is valid and can be used to power gate
the desired TPC
- can_tpc_powergate = false indicates that the mask value
sent from userspace is not valid and cannot be used to
power gate the desired TPC.
Bug 200532639
Change-Id: Ib0806e4c96305a13b3574e8063ad8e16770aa7cd
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2170736
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There was a header file circular dependency that was preventing
including some files. For example, for utils.h to include safe_ops.h
would include bug.h which included log.h which included bitops.h which
included utils.h. To break this loop, the macro nvgpu_do_assert_print()
into a function in a new file assert.c. With this change, log.h is no
longer required in bug.h.
This change also required adding a few includes in C files that were
picking up definitions through the chain above.
JIRA NVGPU-3868
Change-Id: Icf95677bb36e4aa034cba25594cf71f2d028c289
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2168528
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rule 2.2 doesn't allow unused variable assignments. The reason is
presence of unused variable assignments may indicate error in program's
logic.
Rule 21.x doesn't allow reserved identifier or macro names starting with
'_' to be reused or defined.
Jira NVGPU-3864
Change-Id: I8ee31c0ee522cd4de00b317b0b4463868ac958ef
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2163723
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Added PCIE device info for TU104-QS chip & marked as FUSA SKU
using device flag
-is_fusa_sku flag will be set if device flag has FUSA SKU flag set
& this will be checked in driver to execute functionality
specific to FUSA SKU
JIRA NVGPU-3727
Change-Id: I49ea357133ce0b9bbf52dae72afcf8139ab01346
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2161163
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Initialize the clock counters for GPCCLK, XBARCLK, SYSCLK.
This INIT was done in PMU before, but now disabled from TU10A profile.
Hence the initialization is moved into nvgpu.
This patch does the following.
1. Move clock files from GV100 to TU104.
2. Add the Counter HW Registers.
3. Initialize the counter registers for gpc, xbar and sysclk.
4. Change the debug fs node from gv100 to tu104.
5. Update in yaml file with new file names.
Bug 200536091
Change-Id: I436019a18f5c4c73979977666d0c04ce4c569047
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2155298
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This change switches nvgpu_timeout_peek_expired() to return a bool
instead of an int to remove advisory rule MISRA 10.5 violations.
MISRA 10.5 states that the value of an expression should not be
cast to an inappropriate essential type.
JIRA NVGPU-3798
Change-Id: I5cf9badaf07493e11a639e47ae4cf221700134ff
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2155617
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There was a name clash between the nvgpu_set_error_notifier*() APIs and
the SET_ERROR_NOTIFIER IOCTL. Therefore, the APIs were renamed from
nvgpu_set_error_notifier*() to nvgpu_set_err_notifier*(). This rename
was done to fix MISRA 5.x errors.
JIRA NVGPU-1633
Change-Id: I06af551a664b0706f106e853f1ea8733894f11bd
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2159813
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This change switches nvgpu_nvhost_syncpt_is_expired_ext()
to return a bool instead of an int to remove advisory rule
MISRA 10.5 violations.
MISRA 10.5 states that the value of an expression should not be
cast to an inappropriate essential type.
JIRA NVGPU-3798
Change-Id: Ie0772ac7167a3c999129de0dc7f22cd96faa28fc
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2159801
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch adds SWUD (SW Unit Design) document for SDL unit. In addition,
it re-names err_type to err_id in error reporting APIs related to ECC, GR,
PRI and MMU, to keep the name consistent with other APIs.
JIRA NVGPU-3758
Change-Id: I968218574aa78144497fc12bd6dab20d1be7aa1c
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2151092
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>