nvgpu_engine_is_valid_runlist_id already iterates the list of
active engines, therefore the engine_id is already known to
be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.
Also make sure to reset num_engines in nvgpu_cleanup_sw, to avoid
accessing uninitialized active_engines_list in unit test corner
cases (targetting init/remove support).
Jira NVGPU-4511
Change-Id: Ia6b904a7f3ca46e5097f06770b4caad317ec967b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2263618
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below functions in common.gr.obj_ctx subunit include unnecessary
asserts to ensure value is not truncated when parsing into U32 size.
nvgpu_gr_obj_ctx_gr_ctx_alloc()
Make use of nvgpu_safe_cast_u64_to_u32() and remove unnecessary
asserts
Jira NVGPU-4778
Change-Id: Ic06c2f4131b3bba35222f7de5441f82ecee6d83d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2277158
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch removes dependency between pmu.clk and hal.clk.
Below is the implementation.
*Init the clk domains inside pmu.clk unit.
*Set this in a member variable and send it to hal.clk unit
*Use this to determine the valid clock domains and read the
monitor registers for each valid domains.
*Return the domain with error code.
*With this all the clk domain data is removed from hal.clk and
only pmu.clk uses it as it owns it.
JIRA NVGPU-4491
Change-Id: Ie57b2472cfaacfd2ce43ac7f95702bd95fb8bbaa
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2272416
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute on
different hardware resources. This feature proposes modifications
in the software to modify the virtual SM id to TPC mapping across
the mission and redundant contexts. The virtual SM identifier to TPC
mapping is done by nvgpu when setting up the patch context.
The recommendation for the redundant setting is to offset the
assignment by one TPC, and not by one GPC. This will ensure that both
GPC and TPC diversity. The SM and Quadrant diversity will happen
naturally. For kernels with few CTAs, the diversity is guaranteed
to be 100%. In case of completely random CTA allocation,
e.g. large number of CTAs in the waiting queue, the diversity is
1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104.
Added NvGpu CFLAGS to enable/disable the SM diversity support
"CONFIG_NVGPU_SM_DIVERSITY".
This support is only enabled on gv11b and tu104 QNX non safety build.
JIRA NVGPU-4685
Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_engine_get_gr_runlist_id gets the first instance of
active GR engine using nvgpu_engine_get_ids. Therefore the
engine_id is already known to be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.
Jira NVGPU-4511
Change-Id: Ifcc0851e3d14d862e2ed7b21ea57f17a66eca9dd
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2263617
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
event_id_list and event_id_list_locks fields are only
needed in nvgpu_tsg when CONFIG_NVGPU_CHANNEL_TSG_CONTROL
is defined.
Conditionally compile those fields and related code,
so that they are removed from safety build.
Jira NVGPU-4376
Change-Id: I8678aa1b8cd4166aa37bcb42cda1eb9c703fd32f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2273261
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Advisory Directive 4.5 states that identifiers in the same
name space with overlapping visibility should be typographically
unambiguous.
The presence of both the roundup(x,y) and round_up(x,y) macros in
the posix utils.h header incurs a violation of this rule.
These macros were added to keep in sync with the linux kernel variants.
However, there is a key distinction between how these two macros
work in the linux kernel; roundup(x,y) can handle any y alignment while
round_up(x,y) is intended to work only when y is a power-of-two.
Passing a non-power-of-two alignment to round_up(x,y) results in an
incorrect value being returned (silently).
Because all current uses of roundup(x,y) and round_up(x,y) in
nvgpu specify a y value that is a power-of-two and the underlying
posix macro implementations assume as much, it is best to remove
roundup(x,y) from nvgpu altogether to avoid any confusion.
So this change converts all uses of roundup(x,y) to round_up(x,y).
Jira NVGPU-3178
Change-Id: I0ee974d3e088fa704e251a38f6b7ada5a7600aec
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2271385
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Created ucode_perf_change_seq_inf.h and moved all
change_seq interface structs and MACROs
-Moved nvgpu_clk_set_req_fll_clk_ps35 from clk unit
to change_seq unit
-Removed MACROs and includes which are not needed
NVGPU-4448
Change-Id: I04ab32cbc9a1fc827f3360a8ea0f367019981823
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2266051
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Created ucode_perf_pstate_inf.h and moved all
pstate interface structs and MACROs.
-Created nvgpu_perf_pstate_get_lpwr_index for getting
lpwr index
-Created nvgpu_clk_domain_get_from_index for getting
clk_domain from index
-Removed pstate_get_status code which is not needed
for tu10a profile
-Removed MACROs and includes which are not needed
NVGPU-4448
Change-Id: I516816a1d92a60a91ea479cb9c334d332d3d7a89
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2264716
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Created ucode_perf_vfe_inf.h and moved all VFE
interface structs and MACROs into this header
-Created nvgpu_clk_fll_get_fmargin_idx to get
freq margin index
-Created nvgpu_vfe_var_get_s_param to read s_param
-Removed MACROs and header includes which are
not needed
NVGPU-4448
Change-Id: I89f946d555bcbc7823665d2a5a761049f7a5e963
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2260150
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Passing branch of nvgpu_timeout_peek_expired was not covered due to jump
over it in nvgpu_falcon_mem_scrub_wait. Remove that jump to cover the
branch.
Add unit test for covering the error handling in case of read from
DMEM control register returns invalid data using fault injection.
JIRA NVGPU-4814
Change-Id: I9f99186bd2b1c5f39ead130d3161d3e7fa622ac4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2272937
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Advisory Directive 4.5 states that identifiers in the same
name space with overlapping visibility should be typographically
unambiguous.
In both the nvgpu_locate_pte_last_level() and gp10b_get_pde0_pgz()
routines this violation is raised because a variable used as
a for-loop index ('i') is ambiguous with a parameter that points
to a struct gk20a_mmu_level ('l').
To fix these violations the loop index 'i' is renamed to 'idx'.
Jira NVGPU-3178
Change-Id: I2cec904201075b48ab6ccfbd0ff6d7e9dcac2867
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2271456
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 11.2 doesn't allow conversions of a pointer from or to an
incomplete type. These type of conversions may result in a pointer
aligned incorrectly and may further result in undefined behavior.
This patch addresses rule 11.2 violations related to pointers to and
from struct nvgpu_sgl. This patch replaces struct nvgpu_sgl pointers by
void pointers.
Jira NVGPU-3736
Change-Id: I8fd5766eacace596f2761b308bce79f22f2cb207
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2267876
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Preempt TSG occurs in non-mission mode, when unbinding channel
from TSG, or aborting TSG. Should a preempt not complete on
engine, we expect other HW safety mechanisms such as FECS
watchdog to detect issues that prevented saving current context.
Add BUG_ON when attempting to recover from preempt timeout,
to make sure we got such error, and sw_quiesce has been
requested.
Jira NVGPU-4230
Change-Id: Ia26a61e703f74eb28d29e72e75664ca4ec97a586
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265082
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
As a part of refactoring, we need to move the volt unit from perf to pmu
as it belongs there and also move the arbitor specific functions under
CLK_ARB as they will be removed from safety build.
This patch does the following
*Move volt struct from perf to pmu
*Move volt setup from pmu_pstate to volt
*Move clk freq related functions into CLK_ARB
NVGPU-4491
NVGPU-4492
Change-Id: I7180cd12bbf91cc4d2e79b6e2d71c16e494c8ff0
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268215
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new unit test to cover gops.gr.init.ecc_scrub_reg HAL function
gops.gr.init.ecc_scrub_reg HAL can generate TIMEOUT errors which are
not returned to caller currently. Update this HAL to return int value
for error propagation.
Jira NVGPU-4458
Change-Id: I98f4d5af2ef17cc4301951fec4d660638c8ef72c
Signed-off-by: dnibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265456
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Runlist update occurs in non-mission mode, when
adding/removing channel/TSGs. The pending bit
is a debug only feature. As a result logging a
warning is sufficient.
We expect other HW safety mechanisms such as
PBDMA timeout to detect issues that caused pending
to not clear. It's possible bad base address could
cause some MMU faults too.
Worst case we rely on the application level task
monitor to detect the GPU tasks are not completing
on time.
Jira NVGPU-4322
Change-Id: I7233770349db5dfad6904170a1e9a2d5eada70b2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265094
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
fbpa related functions are not supported on igpu safety. Don't
compile them if CONFIG_NVGPU_DGPU is not set.
Also compile out fb and ramin hals that are dgpu specific.
Update the tests for the same.
JIRA NVGPU-4529
Change-Id: I1cd976c3bd17707c0d174a62cf753590512c3a37
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265402
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>