nvgpu_bios_sw_init was not freeing g->bios on init failure
(for instance when board does not complete devinit).
As a result, the function was skipping init at the next
nvgpu_finalize_poweron attempt. This was resulting in a
crash later on, when attempting to parse VBIOS data.
Changed nvgpu_bios_sw_init to free g->bios and set it to
NULL, in case init fails.
Bug 2834853
Change-Id: I87bc18b89674141f713c4043a231a437669c9c94
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2291246
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Current volt unit has multiple header files under pmuif folder.
This has combination of public struct which is accessed outside the
unit and private struct which is accessed within volt unit.
This patch segregates them based on their accessibility.
All private items are moved into ucode_volt_inf.h from pmuif which only
volt can access.
All public items are moved into include/volt.h which other units can
access
This will help in documentation of items for public items.
NVGPU-4492
Change-Id: Id40bf4922408a55f1e67d071be726839ac57718f
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2289114
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Address following issues uncovered during inspection:
1. Remove the doxygen comment from nvgpu_wait_for_deferred_interrupts
definition.
2. Use NVGPU_MC_INTR_STALLING instead of hardcoding the index.
3. Define doxygen groups NVGPU_MC_UNIT_ENUMS,
NVGPU_MC_INTR_TYPE_DEFINES, NVGPU_MC_INTR_UNIT_DEFINES and
NVGPU_MC_INTR_ENABLE_DEFINES.
4. Update the doxygen comments.
5. Fix the cleanup, typo in the description of the test
test_wait_for_deferred_interrupts.
JIRA NVGPU-4795
Change-Id: Ifc6756832aabf9dd42ee174eb1373495e6d38c86
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287627
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Add test scenarios for achieving branch coverage
for failure of dynamic memory allocation while
preparing ucode blob.
- Add more branch coverage for nvgpu_acr_bootstrap_hs_acr()
- Move GR reg space required for ACR tests to ACR unit test
itself to remove dependency on GR unit
JIRA NVGPU-4319
Change-Id: I770a696a1681eb05243c7168878793a30cd59c13
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2286257
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- "include/trace/events/gk20a.h" file was having GPL2 license
(which should not used for QNX code). This file was used for
compiling linux userspace driver("libnvgpu-drv.so") and was used for
unit testing on QNX.
- This patch removes stubs in "include/trace/events/gk20a.h" file.
(which were used for linux userspace driver.)
- For QNX driver, "nvgpu_rmos/trace/events/gk20a.h" was used.
This patch moves that file to "include/nvgpu/posix/trace_gk20a.h" and
does relevant license change. This same file will be used for linux
userspace driver.
- This patch also creates a new file "include/nvgpu/trace.h" which
selects proper trace file depending on the config.
Bug 2802414
Change-Id: Icdfb251e5698073f986753a969e804161af3ecc5
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2286388
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Advisory Rule 2.5 states that a project should not contain unused
macro declarations.
While most of the violations in the nvgpu driver are due to unused
macros from hw headers, devinit-related headers, etc. there is a small
number that are due to things like:
* macros not being used when they could/should be
* macros in C files that are really not referenced
* CPP build flag mismatches
This change eliminates such violations from the following:
* replace constants with existing macros in timeout conversion code
* wrap nvgpu_gmmu_dbg macro #defines in #ifdef CONFIG_NVGPU_TRACE/#endif
* wrap MAX_MC_INTR_REGS #define in #ifdef CONFIG_NVGPU_NON_FUSA/#endif
* remove unused FECS_MAILBOX_0_ACK_RESTORE from runlist code
* wrap BACKTRACE_MAXSIZE macro with #ifndef _QNX_SOURCE/#endif
Jira NVGPU-3178
Change-Id: I2bc72f706d7af3f8e7b062126e8543d0dc8ac250
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284419
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Xavier Chip Product POR was updated to 20G only. No more qual work
happening for 16G. So we do not plan to support 16G. Now that we have
a single speed left, remove the code added to support nvlink speed from
VBIOS as it is redundant.
JIRA NVGPU-2964
Change-Id: Icd71ebb8271240818e36d40bf73c60f0c5beb6bf
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284175
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Created perf.h file and moved all private functions
and structures into it
-Created single sw_setup/pmu_setup for whole perf
unit
-Changed public function and structure names as per
standard format
-Deleted lpwr unit specific file from make file as
it is no longer used
-Removed support_vfe and support_changeseq flags as
it is no longer used
-Removed clk_set_boot_fll_clks_per_clk_domain function
as it is no longer used for tu10a
-Removed perf unit headers from pmuif folder
NVGPU-4448
Change-Id: Ia29e5b5a1a960b5474a929d8797542bf6c0eccf1
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2283587
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
ACR ucode is encrypted using different keys for prod/dbg boards.
This change adds a check to select ACR ucode based on board type.
Note: This support is added only for t19x.
This patch also enables the prints "DEBUG MODE" indicative of board/
acr_ucode signature type and sctl and cpuctl reg values.
Bug 2350733
Bug 2672832
Bug 2672836
JIRA NVGPU-4001
Change-Id: I936b811b5836152206b11ec615ee75d201939968
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2268880
Reviewed-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
As a part of refactoring, we need to move the volt functions from
pmu_pstate.c to volt.c as it belongs there and also move the
arbitor specific functions under CLK_ARB as they will be removed
from safety build.
This patch does the following
*Move volt setup from pmu_pstate to volt
*Move clk freq related functions into CLK_ARB
*Replace pmu.h with nvgpu_mem.h in boardobj.h
*Rename obj_volt to nvgpu_pmu_volt
NVGPU-4491
NVGPU-4492
Change-Id: I9abc96f695fce41893311982a80dc3656aaa64d6
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2282361
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
The atomic counter in interrupt handler can overflow and result in
calling of BUG() which will crash the process. The equivalent
functionality can be implemented with just setting an atomic variable at
start of handler and resetting at end of handler. The wait can be longer
in case there is constant interrupts coming but ultimately it will end.
Generally the wait path is not time critical so it should not be an
issue. Also, fix the unit tests for mc.
Change-Id: I9b8a236f72e057e89a969d2e98d4d3f9be81b379
Signed-off-by: shashank singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2247819
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Update SW quiesce as follows:
- After waking up sw_quiesce_thread, nvgpu_sw_quiesce
masks interrupts, then disables and preempts runlists
without lock. There could be still a concurrent thread
that would re-enable the runlist by accident. This is
very unlikely and would mean we are not in mission mode
anyway.
- In sw_quiesce_thread, wait NVGPU_SW_QUIESCE_TIMEOUT_MS,
to leave some time for interrupt handler to set error
notifier (in case of HW error interrupt). Then disable
and preempt runlists, and set error notifier for remaining
channels before exiting the process.
Also modified nvgpu_can_busy to return false in case
SW quiesce is pending. This will make subsequent
devctl to fail.
Jira NVGPU-4512
Change-Id: I36dd554485f3b9b08f740f352f737ac4baa28746
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2266389
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch adds boundary value check for common.fifo parameters as
listed below.
1. nvgpu_channel_setup_bind() includes a condition to check that value
of num_gpfifo_entries does not exceed 2^31. Otherwise prints message and
returns error.
2. nvgpu_tsg_bind_channel() includes a condition to check if channel
subctx had ASYNC id. If true, runqueue selector is set to 1 and 0
otherwise. This check is to be moved from devctl to common.fifo.
Jira NVGPU-4817
Change-Id: Id1c9253945859c245e584b5c42b3285a6b620055
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2278613
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_engine_is_valid_runlist_id already iterates the list of
active engines, therefore the engine_id is already known to
be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.
Also make sure to reset num_engines in nvgpu_cleanup_sw, to avoid
accessing uninitialized active_engines_list in unit test corner
cases (targetting init/remove support).
Jira NVGPU-4511
Change-Id: Ia6b904a7f3ca46e5097f06770b4caad317ec967b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2263618
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below functions in common.gr.obj_ctx subunit include unnecessary
asserts to ensure value is not truncated when parsing into U32 size.
nvgpu_gr_obj_ctx_gr_ctx_alloc()
Make use of nvgpu_safe_cast_u64_to_u32() and remove unnecessary
asserts
Jira NVGPU-4778
Change-Id: Ic06c2f4131b3bba35222f7de5441f82ecee6d83d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2277158
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch removes dependency between pmu.clk and hal.clk.
Below is the implementation.
*Init the clk domains inside pmu.clk unit.
*Set this in a member variable and send it to hal.clk unit
*Use this to determine the valid clock domains and read the
monitor registers for each valid domains.
*Return the domain with error code.
*With this all the clk domain data is removed from hal.clk and
only pmu.clk uses it as it owns it.
JIRA NVGPU-4491
Change-Id: Ie57b2472cfaacfd2ce43ac7f95702bd95fb8bbaa
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2272416
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute on
different hardware resources. This feature proposes modifications
in the software to modify the virtual SM id to TPC mapping across
the mission and redundant contexts. The virtual SM identifier to TPC
mapping is done by nvgpu when setting up the patch context.
The recommendation for the redundant setting is to offset the
assignment by one TPC, and not by one GPC. This will ensure that both
GPC and TPC diversity. The SM and Quadrant diversity will happen
naturally. For kernels with few CTAs, the diversity is guaranteed
to be 100%. In case of completely random CTA allocation,
e.g. large number of CTAs in the waiting queue, the diversity is
1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104.
Added NvGpu CFLAGS to enable/disable the SM diversity support
"CONFIG_NVGPU_SM_DIVERSITY".
This support is only enabled on gv11b and tu104 QNX non safety build.
JIRA NVGPU-4685
Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>