Safety build does not support vidmem. This patch compiles out vidmem
related changes - vidmem, dma alloc, cbc/acr/pmu alloc based on
vidmem and corresponding tests like pramin, page allocator &
gmmu_map_unmap_vidmem..
As vidmem is applicable only in case of DGPUs the code is compiled
out using CONFIG_NVGPU_DGPU.
JIRA NVGPU-3524
Change-Id: Ic623801112484ffc071195e828ab9f290f945d4d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2132773
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove below hals since the corresponding functions are same on all
platforms and they are h/w independent
g->ops.gr.enable_ctxsw()
g->ops.gr.disable_ctxsw()
g->ops.gr.reset()
Call the functions directly at all places
Remove CONFIG_NVGPU_DEBUGGER from places where these functions are
called since they are not debugger dependent
This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery
sequence intact
Jira NVGPU-3506
Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2137255
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In TU104 tsense is not calibrated and tsosc needs to be used.
Tsosc is the POR for TU104.
This patch does the following
Remove the GP106 related thermal header and Add TU104 therm.
Rename the files from therm_gp106 to therm_tu104.
Update the debug fs interface to reflect the same.
Update the yaml files.
Bug 200526122
Change-Id: I73fd7d4c516426b5c6b84762480be2d6d572d5a7
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2135139
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-compile out nvgpu_pmu members which are not required for
safety buid & modified source as required to support same.
-compile out PMU headers include which are not required for
safety code
-Removed unnecessary PMU header includes from some files
JIRA NVGPU-3418
Change-Id: I5364b1b16c46637d229e82745dd2846cb6335a72
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2128228
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Hal API g->ops.gr.halt_pipe() is defined in unsafe unit hal.gr.gr
It is called from safe unit, and it calls into API
g->ops.gr.falcon.ctrl_ctxsw() which is also safe
Hence get rid of unsafe API g->ops.gr.halt_pipe().
Caller now directly calls hal.gr.falcon API to halt pipe
Jira NVGPU-3506
Change-Id: I5439cb79431795fc7c22384832cf632d6db03316
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127755
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move some interrupt handling hals from hal.gr.gr unit to hal.gr.intr
unit as below
g->ops.gr.intr.set_hww_esr_report_mask()
g->ops.gr.intr.handle_tpc_sm_ecc_exception()
g->ops.gr.intr.get_esr_sm_sel()
g->ops.gr.intr.clear_sm_hww()
g->ops.gr.intr.handle_ssync_hww()
g->ops.gr.intr.log_mme_exception()
g->ops.gr.intr.record_sm_error_state()
g->ops.gr.intr.get_sm_hww_global_esr()
g->ops.gr.intr.get_sm_hww_warp_esr()
g->ops.gr.intr.get_sm_no_lock_down_hww_global_esr_mask()
g->ops.gr.intr.get_sm_hww_warp_esr_pc()
g->ops.gr.intr.tpc_enabled_exceptions()
g->ops.gr.intr.get_ctxsw_checksum_mismatch_mailbox_val()
Rename gv11b_gr_sm_offset() to nvgpu_gr_sm_offset() and move to
common.gr.gr unit
All of above functions and hals will be needed in safety build
Jira NVGPU-3506
Change-Id: I278d528e4b6176b62ff44eb39ef18ef28d37c401
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127753
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This flag is added to compile out below features from
safety build
-set_preemption_mode
-channel_enable
-channel_disable
-channel_preempt
-channel_force_reset
-tsg_enable
-tsg_disable
-tsg_preempt
-tsg_event_id_ctrl
-post_event_id
JIRA NVGPU-3516
Change-Id: I935841db766f192f62598240c0e245a2959555be
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2126829
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch does the following:
(1) Removes the usage of hw headers in SDL unit. For this purpose, it moves
the initialization required for errors that can be injected using hw
support, error injection function. Further, it passes the required
information to SDL via hal layers.
(2) Renames (i) PWR as PMU, (ii) nvgpu_report_ecc_parity_err to
nvgpu_report_ecc_err.
Jira NVGPU-3235
Change-Id: I69290af78c09fbb5b792058e7bc6cc8b6ba340c9
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2112837
Reviewed-by: Raghuram Kothakota <rkothakota@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
renamed NVGPU_LS_PMU to NVGPU_FEATURE_LS_PMU to follow
nvgpu naming standard
Compile out LS PMU files when PMU RTOS support is
disabled for safety build by setting NVGPU_LS_PMU
build flag to 0
JIRA NVGPU-3418
Change-Id: Ib09924ac25657e932723c10be573f2f701cb7bea
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127794
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Compile out PMU RTOS specific PMU HAL code when
PMU RTOS support is disabled for safety build by setting
NVGPU_LS_PMU build flag to 0
Added new functions for gv11b PMU HAL to easy compile out
other PMU HAL files.
Replaced all gk20a_writel/readl calls with nvgpu_writel/readl
calls in hal/pmu/pmu_gv11b.c files
JIRA NVGPU-3418
Change-Id: I7c315349aa95721990dc7b1570383669bcb6221f
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2117691
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
pmu_gk20a.c includes hw_mc_gk20a.h other than hw_pwr_gk20a.h
to access & configure pmu interrupt, this breaks single hw header
for HAL file.
Moved PMU interrupt enable to MC unit by creating/modifying current
mc ops intr_unit_config to intr_pmu_unit_config to configure PMU
interrupt specifically as this ops is only used by PMU unit
JIRA NVGPU-3239
Change-Id: I2514f17197708047b46ea712cf4569a5b3bfab2a
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2126420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PG189 has multiple sensors which can provide interrupt when board
temperature reaches programmed threshold.
This Interrupt is implemented in nvgpu and provide events via clk_arb.
Support is enabled for TU104 with NVGPU_SUPPORT_DGPU_THERMAL_ALERT flag.
Board specific config is added in DT which will be parsed by nvgpu.
Nvgpu does the following.
1.Read gpio line number, interrupt type, and event delay from DT.
2.Call kernel methods and register the interrupt with kernel.
3.Create work queue which will process the interrupt in process context.
4.When interrupt occurs disable interrupt, add work to work queue.
5.In work queue post events and sleep for delay time then enable
Interrupt
Bug 2492512
Change-Id: Ic5694fe366ca492f8afe8a67de4350e9a51af2af
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2119411
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some functions are not accessing hardware directly
but are being called using HAL ops: For example
.init = gv100_bios_init,
.preos_wait_for_halt = gv100_bios_preos_wait_for_halt,
.preos_reload_check = gv100_bios_preos_reload_check,
.devinit = gp106_bios_devinit,
.preos = gp106_bios_preos,
.verify_devinit = NULL,
This was being called as:
g->ops.bios.init(g)
g->ops.bios.preos_wait_for_halt(g)
g->ops.bios.preos_reload_check(g)
g->ops.bios.preos(g)
g->ops.bios.devinit(g)
g->ops.bios.verify_devinit(g)
Change the function access by using sw ops, like:
Create new function: nvgpu_bios_sw_init()
and based on hardware chip call the chip specific
bios sw init function: nvgpu_gv100_bios_sw_init()
and nvgpu_tu104_bios_sw_init()to assign the sw
ops
JIRA NVGPU-2071
Change-Id: Ibfcd9b225a7bc184737abdd94c2e54190fcd90a0
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2108526
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The HS ucode runs on PMU with all interrupts disabled. So it will not be
able to detect any data corruption introduced in the IMEM or DMEM due to bit
flips. In order to mitigate this issue validate the integrity of IMEM and DMEM
at the end of HS ucode bootstrap and fail the boot incase of any un-corrected
errors.
Jira NVGPU-3555
Change-Id: Icd9a2bf2c29470629be8524c9b99f90e3036abdc
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2124107
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
pbdma fault recovery function reads pbdma status info to retrieve
channel id, tsg id and engine id. pbdma interrupts can only be cleared
after that information has been read otherwise because pbdma exits
from stall state, channel/tsg/engine could have changed and fault
recovery function reads information different from that when interrupt
is issued.
Bug 2123866
Change-Id: Ia0e0462ae02ec89a333c81bd933a74fbae8ae1e7
Signed-off-by: Peng Liu <pengliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2123774
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Since dGPU support is not required for initial safety release, compile
out dGPU sw and hal implementations except below files that are used
by gv11b currently: acr_sw_gv100.c, engine_status_gv100.c, gr_gv100.c
gr_config_gv100.c and hwpm_map_gv100.c.
JIRA NVGPU-3062
Change-Id: I8a6bc8b235e7e5eac5b0e76147b8bd12f9abbd2d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2119586
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
NV_PGPC_PRI_MMU_DEBUG_CTRL is now context switched in gv11b
FECS ucode. Enable NVGPU_SUPPORT_SET_CTX_MMU_DEBUG_MODE, so that
userspace can use NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE
ioctl for gv11b.
Bug 2515097
Change-Id: Ia9fb36cffc9e67cf96c31c50ffa4c59997258ce2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2115019
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE ioctl to set MMU
debug mode for a given context.
Added gr.set_mmu_debug_mode HAL to change NV_PGPC_PRI_MMU_DEBUG_CTRL
for a given channel. HAL implementation for native case is
gm20b_gr_set_mmu_debug_mode. It internally uses regops, which directly
writes to the register if the context is resident, or writes to
gr context otherwise.
Added NVGPU_SUPPORT_SET_CTX_MMU_DEBUG_MODE to enable the feature.
NV_PGPC_PRI_MMU_DEBUG_CTRL has to be context switched in FECS ucode,
so the feature is only enabled on TU104 for now.
Bug 2515097
Change-Id: Ib4efaf06fc47a8539b4474f94c68c20ce225263f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2110720
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some functions are not accessing hardware directly
but are being called using HAL ops: For example
g->ops.pmu.pmu_elpg_statistics,
g->ops.pmu.pmu_pg_init_param,
g->ops.pmu.pmu_pg_supported_engines_list,
g->ops.pmu.pmu_pg_engines_feature_list,
g->ops.pmu.pmu_is_lpwr_feature_supported,
g->ops.pmu.pmu_lpwr_enable_pg,
g->ops.pmu.pmu_lpwr_disable_pg,
g->ops.pmu.pmu_pg_param_post_init,
g->ops.pmu.save_zbc
Change the function access by using sw ops, like:
Create new functions:
int nvgpu_pmu_elpg_statistics(struct gk20a *g, u32 pg_engine_id,
struct pmu_pg_stats_data *pg_stat_data);
void nvgpu_pmu_save_zbc(struct gk20a *g, u32 entries);
bool nvgpu_pmu_is_lpwr_feature_supported(struct gk20a *g,
u32 feature_id);
JIRA NVGPU-3209
Change-Id: I6db9b43c7c4a5054720a72487302b740b091044d
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2110963
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some functions are not accessing hardware directly
but are being called using HAL ops: For example
.pmu_init_perfmon = nvgpu_pmu_init_perfmon_rpc,
.pmu_perfmon_start_sampling = nvgpu_pmu_perfmon_start_sampling_rpc,
.pmu_perfmon_stop_sampling = nvgpu_pmu_perfmon_stop_sampling_rpc,
.pmu_perfmon_get_samples_rpc = nvgpu_pmu_perfmon_get_samples_rpc,
These were being called by:
g->ops.pmu.pmu_init_perfmon,
g->ops.pmu.pmu_perfmon_start_sampling,
g->ops.pmu.pmu_perfmon_stop_sampling,
g->ops.pmu.pmu_perfmon_get_samples_rpc
Change the function access by using sw ops, like:
Create new functions:
int nvgpu_pmu_perfmon_init(struct gk20a *g,
struct nvgpu_pmu *pmu, struct nvgpu_pmu_perfmon *perfmon);
int nvgpu_pmu_start_sampling_perfmon(struct gk20a *g,
struct nvgpu_pmu *pmu, struct nvgpu_pmu_perfmon *perfmon);
int nvgpu_pmu_stop_sampling_perfmon(struct gk20a *g,
struct nvgpu_pmu *pmu, struct nvgpu_pmu_perfmon *perfmon);
int nvgpu_pmu_get_samples_rpc_perfmon(struct gk20a *g,
struct nvgpu_pmu *pmu, struct nvgpu_pmu_perfmon *perfmon);
and based on hardware chip call the chip specific
perfmon sw init function: nvgpu_gv11b_perfmon_sw_init() and
nvgpu_gv100_perfmon_sw_init() and assign the sw ops for perfmon
JIRA NVGPU-3210
Change-Id: I2470863f87a7969e3c0454fa48761499b08d445c
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2109899
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- To enable FECS trace support, nvgpu should set the MSB
of the read pointer (MAILBOX1).
- The ucode will check if the feature is enabled/disabled
before writing a record into the circular buffer. If the
feature is disabled, it will not write the record.
- If the feature is enabled and the buffer is not allocated,
HW will throw a page fault error.
Bug 2459186
Change-Id: I71daf6fdb1bb67974f6e51e091f868cb08d3b0bf
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2111028
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>