g->ops.gr.enable_cde_in_fecs and g->ops.gr.update_boosted_ctx
are no longer required since we can directly call
g->ops.gr.ctxsw_prog.set_cde_enabled and
g->ops.gr.ctxsw_prog.set_pmu_options_boost_clock_frequencies
respectively
remove those functions and the ops
Jira NVGPU-1526
Change-Id: Idb0ad5f634e78aac44ec325ba2b7f59c612b29e8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1972184
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GSPLITE falcon base address was being set without invoking hal api.
This patch defines gpu_ops.gsp.falcon_base_addr hal api to get this
base address.
JIRA NVGPU-1587
Change-Id: Id187b34d022f90c09b8762cdab7769323b607cc0
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969432
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GPCCS falcon base address was being set without invoking hal api. Remove
FALCON_GPCCS_BASE. This patch defines gpu_ops.gr.gpccs_falcon_base_addr
hal api to get this base address.
JIRA NVGPU-1587
Change-Id: Icfa7a26d1bb2d67c81f05a43f6ce906f59706b3d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969431
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
FECS falcon base address was being set without invoking hal api. Remove
FALCON_FECS_BASE. This patch defines gpu_ops.gr.fecs_falcon_base_addr hal
api to get this base address.
JIRA NVGPU-1587
Change-Id: I9c8e60be4ee81a154020c982893725a12ebb72ef
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969430
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
SEC2 falcon base address was being set without invoking hal api. Remove
FALCON_SEC_BASE. This patch defines gpu_ops.sec2.falcon_base_addr hal api
to get this base address.
Also, don't initialize the base for non-supported falcons.
JIRA NVGPU-1587
Change-Id: Iad19a9987416076cf9090d30a48ff83369cf73c2
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969429
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PMU falcon base address was being set without invoking hal api. Remove
FALCON_PWR_BASE. This patch defines gpu_ops.pmu.falcon_base_addr hal api
to get this base address.
JIRA NVGPU-1587
Change-Id: I5c3f27e89bdcc775025bc8d4fa9cf0af11ceb002
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969428
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
NVDEC falcon base address was being set without invoking hal api. Remove
FALCON_NVDEC_BASE. This patch defines gpu_ops.fb.falcon_base_addr hal api
to get this base address. Currently gp106 and tu104 have these
implemented. gv100 uses the gp106 hal interface.
Also, don't initialize the base for non-supported falcons.
JIRA NVGPU-1587
Change-Id: I0be759b8462ede9b85690a70431480afdee9602c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1969427
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PMU counters #0 and #4 are used to count total cycles and busy cycles.
These counts are used by podgov to estimate GPU load.
PMU idle intr status register is used to monitor overflow. Overflow
rarely occurs because frequency governor reads and resets the counters
at a high cadence. When overflow occurs, 100% work load is reported to
frequency governor.
Bug 1963732
Change-Id: I046480ebde162e6eda24577932b96cfd91b77c69
Signed-off-by: Peng Liu <pengliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1939547
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We had to force allocation of physically contiguous memory for
USERD in nvlink case, as a channel's USERD address is computed as
an offset from fifo->userd address, and nvlink bypasses SMMU.
With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.
PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous, to setup the BAR1 inst block.
- Add slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to related nvgpu_mem descriptor
- ch->userd_offset is the offset from the beginning of the slab
- Pre-allocate GPU VAs for the whole BAR1
- Add g->ops.mm.bar1_map() method
- gk20a_mm_bar1_map() uses fixed mapping in BAR1 region
- vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1
- TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.
Bug 2422486
Bug 200474793
Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add separate new unit gr/ctxsw_prog that provides interface to access
h/w header files hw_ctxsw_prog_*.h
Add below chip specific files that access above h/w unit and provide
interface through g->ops.gr.ctxsw_prog.*() HAL for rest of the units
common/gr/ctxsw_prog/ctxsw_prog_gm20b.c
common/gr/ctxsw_prog/ctxsw_prog_gp10b.c
common/gr/ctxsw_prog/ctxsw_prog_gv11b.c
Remove all the h/w header includes from rest of the units and code.
Remove direct calls to h/w headers ctxsw_prog_*() and use HALs
g->ops.gr.ctxsw_prog.*() instead
In gr_gk20a_find_priv_offset_in_ext_buffer(), h/w header
ctxsw_prog_extended_num_smpc_quadrants_v() is only defined on gk20a
And since we don't support gk20a remove corresponding code
Add missing h/w header ctxsw_prog_main_image_pm_mode_ctxsw_f() for
some chips
Add new h/w header ctxsw_prog_gpccs_header_stride_v()
Jira NVGPU-1526
Change-Id: I170f5c0da26ada833f94f5479ff299c0db56a732
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1966111
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Split out the code to check which engines on a particular runlist are
busy from gk20a_fifo_runlist_reset_engines() and make it a HAL op.
Resetting engines is common across chips but status is read from
registers.
Jira NVGPU-1309
Change-Id: I7a63a2942a9e210481822eaf85795fc17dad0dc5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1961822
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Updated PMU version number to sync with
p4 cl #:25133717
-As LS falcon's bootstrap is taken care by SEC2 RTOS
so, removed ACRLIB from PMU ucode & disabled WPR
init from PMU by setting ops .init_wpr_region to NULL
-Adding dummy bytes to PMU supersurface member therm
data structure to match with tu10x ucode supersurface
change sequence offset.
-PMU ucode update to enable ECC interrupt
-Enable ECC interrupt in Falcon interrupt source
-Enable routing of ECC interrupt to HOST.
JIRA NVGPU-1150
Change-Id: Ib49f9bf811dc2a01252461c16a44869e07412005
Reviewed-on: https://git-master.nvidia.com/r/1929895
Signed-off-by: Vaikundanathan S <vaikuns@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1957846
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Different SKUs may require different nvlink speed and hence the
nvlink speed value should come from VBIOS. The initpll number
corresponding to speed is present in VBIOS Low Power Nvlink table
header. Parse this data from VBIOS and set corresponding nvlink
speed and minion initpll DLCMD as default.
We can no longer update the GV100 VBIOS with necessary nvlink speed
value. Hence the hardcoding stays for GV100.
The nvlink speed should match across the endpoints. So in speed_config
fops, communicate the speed to nvlink core-driver for co-ordination
with Tegra endpoint.
Bug 2418403
Change-Id: Ib6f60951d4ca1c275968707d4cc6d738ba3a3f08
Signed-off-by: tkudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1938046
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
All the netlist parsing code is currently under GR unit, but netlist
ucode parsing does not really have any logical dependency to GR
Hence separate out a new unit common/netlist/ that parses the netlist
image and stores/exposes its content through netlist_vars structure
Structure nvgpu_netlist_vars is added to structure gk20a
Move netlist parsing code to common/netlist/netlist.c and chip
specific files to common/netlist/netlist_<chip>.c
Move simulation netlist parsing to common/netlist/netlist_sim.c
Rename g.ops.gr_ctx HAL to g.ops.netlist
Rename all the exported structures to be in the form of nvgpu_*
Rename all exported functions to be in the form of nvgpu_netlist_*()
Add netlist initialization to GPU boot path, and add deinitialization
to GPU remove path
Jira NVGPU-1317
Change-Id: I9af86e3b3230a89db5260cc8ed96ff5f72938c9a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1936454
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a/gr_ctx_gk20a.c we right now directly read the GR register
gr_fecs_ctx_state_store_major_rev_id_r() which adds the dependency
to GR h/w header
Add a new HAL g.ops.gr.get_fecs_ctx_state_store_major_rev_id() to
read this register and use this instead
Also remove h/w header from gr_ctx_gk20a.c
Jira NVGPU-1317
Change-Id: Iab64fbfacff4d7ce4f3b61ca90b00ddc77e29551
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1936453
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reduce the size of memory allocations in the channel debug dump by
capturing only the necessary values from the instance block. This also
simplifies the allocation path slightly with the downside of having to
add a capture_channel_ram_dump HAL for reading the interesting parts
explicitly beforehand to the now smaller staging buffer.
Also rename struct ch_state to struct nvgpu_channel_dump_info.
Jira NVGPU-886
Change-Id: I5d7518d9d474b0b728b183383bc83d89ecf91b98
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1928207
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
L2 interrupt is processed by first reading from MC which L2 triggered
the interrupt and then calling a function per L2 slice to get the
details. Move the outer loop to MC unit, and the inner loop and L2
accesses to LTC unit.
JIRA NVGPU-954
Change-Id: I69b7bb82e4574b0519cdcd73b94d7d3e3fa6ef9e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1851328
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename gk20a/dbg_gpu_gk20a.c to common/debugger.c and make it a
separate common unit
Also rename gk20a/dbg_gpu_gk20a.h to include/nvgpu/debugger.h
We had two different HALs for debugger - gops.debugger and
gops.dbg_session_ops
Combine them into one single HAL gops.debugger and remove
gops.dbg_session_ops
Rename all exported APIs from debugger.h to be in the form of
nvgpu_*()
Jira NVGPU-1013
Change-Id: I136dc7786e3b2065921eb03b99f16049212f3cd2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1920075
Reviewed-by: Sachin Jadhav <sachinj@nvidia.com>
Tested-by: Sachin Jadhav <sachinj@nvidia.com>
MISRA rule 14.4 doesn't allow the usage of integer types as booleans
in the controlling expression of an if statement or an iteration
statement.
Fix violations where the integer variables err, ret, status are used
as booleans in the controlling expression of if and loop statements.
JIRA NVGPU-1019
Change-Id: I8c9ad786a741b78293d0ebc4e1c33d4d0fc8f9b4
Signed-off-by: Amurthyreddy <amurthyreddy@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1921260
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Added method nvgpu_tu104_acr_ahesasc_sw_init()
to set ACR-AHESASC properties.
-Added method nvgpu_tu104_acr_asb_sw_init() to
set ACR-ASB properties.
-Modified method nvgpu_tu104_acr_sw_init() to
call ACR AHESASC/ASB init & set bootstrap_owner
to LSF_FALCON_ID_GSPLITE by removing older support
of default ACR executing on SEC2.
-Added method tu104_bootstrap_hs_acr to execute
ACR AHESASC & ASB ucode.
-Execute ACR-AHESASC(ACR hub encryption setter and
signature checker) on SEC2 falcon to copy ucode
blob from non-wpr to wpr & lockdown wpr then
perform signature verification of LS falcon ucode
whitout doing any LS flacon bootstrap.
-Once first stage of ACR is successful then execute
ACR-ASB(ACR SEC2 booter) on GSP falcon to bootstrap
SEC2-RTOS on sec2 falcon to perform PMU & GR
falcons bootstrap.
-Enable SEC2 RTOS support by setting
NVGPU_SUPPORT_SEC2_RTOS to true
-Added tu104 ACR remove support to clear
allocated space
JIRA NVGPUT-134
Change-Id: I2d1777af83feda5e8f6845876177cce062c43ace
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1918937
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new separate unit common/perf/cyclestats_snapshot.c and add
corresponding header file include/nvgpu/cyclestats_snapshot.h
This unit is h/w independent and simply calls gops.perf.* HALs
exposed by perf unit to do the h/w configurations
Also remove gv11b/css_gr_gv11b.* files as h/w specific sequence
implemented in them is already moved to perf unit
Rename all cyclestats_snapshot HALs in the form nvgpu_css_*()
Jira NVGPU-1103
Change-Id: I303f6becb313ac918e06c495a5fe299947a1f0b1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1916652
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add Turing specific common, unit, hardware header files
Make all the Makefile and Makefile.sources changes to compile
all Turing specific code
Bug 200454999
Change-Id: I62ebff5c078b4b8817fc83ea0e4ee3cfffe668dc
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1917983
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>