A call to exit the PMU state machine/kthread must
be prioritized over any other state change.
It was possible to set the state as PMU_STATE_EXIT,
signal the kthread and overwrite the state before
the kthread has had the chance to exit its loop.
This may lead to a "lost" signal, resulting in
indefinite wait during the destroy sequence.
Faulting sequence:
1. pmu_state = PMU_STATE_EXIT in nvgpu_pmu_destroy()
2. cond_signal()
3. pmu_state = PMU_STATE_LOADING_PG_BUF
4. PMU kthread wakes up
5. PMU kthread processes PMU_STATE_LOADING_PG_BUF
6. PMU kthread sleeps
7. nvgpu_pmu_destroy() waits indefinitely
This patch adds a sticky flag to indicate PMU_STATE_EXIT,
irrespective of any subsequent changes to pmu_state.
The PMU PG init kthread may wait on a call to
NVGPU_COND_WAIT_INTERRUPTIBLE, which requires a
corresponding call to nvgpu_cond_signal_interruptible()
as the core kernel code requires this task mask to
wake-up an interruptible task.
Bug 2658750
Bug 200532122
Change-Id: I61beae80673486f83bf60c703a8af88b066a1c36
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2181926
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Change the directly called init functions to function pointers in the
HAL. This makes it more consistent. This also allows for writing more
comprehensive unit tests for nvgpu.common.init.
JIRA NVGPU-2239
Change-Id: I05d739a8f8a2e7d385322d93154206eb0bfddc10
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2173920
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For getting the buffer size qnx issues a devctl to nvmap which can fail
as well. So, check the size that is returned by nvgpu_os_buf_get_size.
If 0 size is returned then return -EINVAL to the caller.
Jira NVGPU-3891
Change-Id: Id13e7612b044e9228d78469ab4e43961a6877ce8
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2174458
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename reserved identifier __here to label_here to fix
around 180 DCL37-C violations - "The reserved identifier "__here",
which is reserved for use as identifiers with file scope in both
the ordinary and tag name spaces, is declared.
JIRA NVGPU-3908
Change-Id: Iecaaa03674800bec919d71dc5fd226d362fc6f56
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2178457
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Advisory Rule 2.7 states that there should be no unused
parameters in functions.
This patch removes the unused 'post_event', 'fault_ch' and
'hww_global_esr' parameters from the following:
* gv11b_gr_intr_handle_l1_tag_exception()
* gv11b_gr_intr_handle_lrf_exception()
* gv11b_gr_intr_handle_cbu_exception()
* gv11b_gr_intr_handle_l1_data_exception()
* gv11b_gr_intr_handle_icache_exception()
These changes allowed the same parameters to be removed from the
the gr.intr.handle_tpc_sm_ecc_exception() interface.
Jira NVGPU-3178
Change-Id: I4d5dcbf2a5325e38782cdac67f9dd0b223fa1a18
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2171220
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
test_falcon_halt failed as nvgpu_timeout_expired returned -ETIMEDOUT when
time equal to timeout is reached and nvgpu_timeout_peek_expired returns
false when time is equal or less and true when time is greater than
timeout, leading to inconsistent return value.
Update nvgpu_timeout_expired_msg_cpu logic that is used by former.
JIRA NVGPU-3946
Change-Id: I365063cc12a584833c08ca710bb795c0e9d814cd
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180233
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- This patch fixes disable fecs trace logic.
- If user does trace disable twice, enable_count will become
negative and when user tries to re-enable it, fecs trace
will not be enabled.
Bug 2672760
Change-Id: I895e48549970fa2faf06d5a531a423489f494cf4
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2171015
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changes added to support "rmmod nvgpu" in dgpu simulation after gpu
poweron.
nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec
engines in simulation as simulation doesnt support timeout.
These engines are not valid engines in nvgpu engine list.
Add nvgpu_engine_check_valid_id before checking engine status.
Simulation crash on accessing 0xb81604 top interrupt register.
Add func_priv_cpu_intr_top__size_1_v() function to get the supported
size than using default MAX_INTR_TOP_REGS.
nvlink is not supprted in dgpu simulation. Avoid warning for
-ENODEV return.
Avoid register read following gpu power off completion.
Bug 2498574
Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179930
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Advisory Rule 2.7 states that there should be no unused
parameters in functions.
This patch removes unused function parameters from the following:
* acr_hs_bl_exec() -> remove 'acr' param
Jira NVGPU-3178
Change-Id: I46197964aa832bae24ea2fcbc8eeea1cac7f8909
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179495
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_err() macro with a nvgpu_readl() call results in
a volatile access. This violates PRE31-C rule - "Using an
unsafe function-like macro with side effect in argument
nvgpu_readl()" due to side effect of a volatile access.
Fix this by moving nvgpu_readl() calls before nvgpu_err().
The messages log VPR and WPR address info. There are no
known attacks using this info. So it shall be safe to
reveal address info.
JIRA NVGPU-3908
Change-Id: I487a0c0858fe9a36cc81852cedd7757aab277c6a
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2178416
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_err() macro with nvgpu_readl() call results in a
volatile access. This violates PRE31-C rule - "Using an unsafe
function-like macro with side effect in argument nvgpu_readl()"
due to side effect of a volatile access.
Fix this by moving nvgpu_readl() calls before nvgpu_err().
JIRA NVGPU-3908
Change-Id: I927d515c4b24cd4cfca16691918f327e06894c5a
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2178415
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
CTS test dEQP-VK.api.object_management.max_concurrent.device_group
crashes with invalid userspace memory access.
Currently, nvgpu_submit_prepare_syncs() races with
nvgpu_channel_clean_up_jobs() and this race condition is exposed when
aggressive_sync_destroy_thresh is set to non-zero value.
nvgpu_submit_prepare_syncs() gets ref for c->sync to submit job and
releases channel sync_lock. Meanwhile, nvgpu_worker_poll_work()
triggers nvgpu_channel_clean_up_jobs(), which destroys ref'd c->sync
pointer.
This patch protects channel's sync pointer by holding channel sync_lock
during complete execution of nvgpu_submit_prepare_syncs().
Bug 2613870
Change-Id: I6f3d48aff361d1cb38c30d2ce5de276d0c55fb6f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176929
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Advisory Rule 2.7 states that there should be no unused
parameters in functions.
This patch removes unused function parameters from the following:
* release_as_share_id() -> remove 'id' param
* nvgpu_pd_cache_look_up() -> remove 'g' param
* nvgpu_vm_get_pte_size_fixed_map() -> remove 'size' param
Jira NVGPU-3178
Change-Id: Id2c3b5378bba9bdc0312742fd8393fb3ec67c4df
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2178650
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
CE app functionality from nvgpu is non-safe for igpu. CE engines init
/reset/cg related functionality is required in safety. Hence move the
CE app logic under CONFIG_NVGPU_DGPU flag and update the sources
accordingly.
JIRA NVGPU-3814
Change-Id: I37aa00b1184baccd5fe569ec315be60ac42dac9b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2168956
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add support for the fuse and gv11b_gr related
registers and with initialized values need for
gr_config tests.
Add unit tests covering following functions
nvgpu_gr_alloc
nvgpu_gr_config_init
nvgpu_gr_config_deinit
nvgpu_gr_free
Most of the gr_config.h get and set functions
will be used in config_check_set_get test
Jira NVGPU-3582
Change-Id: I69637b3047d844559ac5b44ffd366551af8ea12c
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176369
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Advisory Rule 2.7 states that there should be no unused
parameters in functions.
This patch removes unused function parameters from the following:
* nvgpu_channel_ctxsw_timeout_debug_dump_state()
* nvgpu_channel_destroy()
* nvgpu_tsg_destroy()
* nvgpu_rc_pdbma_fault()
Jira NVGPU-3178
Change-Id: I12ad0d287fd7980533663a9776428ef5d4fd1fb9
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2176066
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For dGPU with PCIE interface do not have a thermal alert pin.
Only platforms where dGPU is used with SXM interface have the
thermal alert pin.
This change makes sure that if nvgpu-therm-gpio DT entry is
is missing we don't fail probe but continue with GPU
initialization without enabling thermal alert feature.
Bug 200542024
Change-Id: Iaf3aec9b66695a45daf86ecfdeec398b66f96bfd
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2173495
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
utf_falcons in libfalcon_utf.so are used by multiple unit tests and
since unit tests are run out of order it might lead to inconsistent
access to these structs.
Move these to falcon unit test. Updated the logic in the falcon utf
shared library to only export the interfaces to read/write to flcn
registers/memory.
JIRA NVGPU-2159
Change-Id: I3a1f80cce205747de4085802199ecc355bb95d6f
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2174187
GVS: Gerrit_Virtual_Submit
Reviewed-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>