- "include/trace/events/gk20a.h" file was having GPL2 license
(which should not used for QNX code). This file was used for
compiling linux userspace driver("libnvgpu-drv.so") and was used for
unit testing on QNX.
- This patch removes stubs in "include/trace/events/gk20a.h" file.
(which were used for linux userspace driver.)
- For QNX driver, "nvgpu_rmos/trace/events/gk20a.h" was used.
This patch moves that file to "include/nvgpu/posix/trace_gk20a.h" and
does relevant license change. This same file will be used for linux
userspace driver.
- This patch also creates a new file "include/nvgpu/trace.h" which
selects proper trace file depending on the config.
Bug 2802414
Change-Id: Icdfb251e5698073f986753a969e804161af3ecc5
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2286388
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The devinit executes in parallel with PCIE link training
to reduce exit latency. Therefore, all PCIE settings that
normally occur during devinit after the PCIE link is up are
deferred until nvgpu has resumed control.
Bug 2661545
Change-Id: Ifdd4f645b2e1791d93567cc34d6ab0691a25d101
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2210625
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
IRQs were not enabled before nvgpu_finalize_poweron, so debugging early
init issues such as MMU fault, invalid PRIV ring or bus access etc.
triggered during nvgpu power-on was cumbersome. Hence, Enable the
IRQs before nvgpu_finalize_poweron is called.
In HUB (MMU fault) ISR, MMU fault handling is only limited to snapped
in priv reg in case of fault during nvgpu power-on.
In HUB (MMU fault) ISR, access to fault buffers is synchronized as
nvgpu driver reads the fault buffer registers before proceeding
with fault handling. However, additional MMU fault handling
needs to be synchronized with GR/FIFO/quiesce/recovery setup
through nvgpu power-on state.
JIRA NVGPU-1592
Change-Id: I8a5f2fcd79cb7ad8e215359e7a9fad50bfd46d67
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2203861
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
IRQs can get triggered during nvgpu power-on due to MMU fault, invalid
PRIV ring or bus access etc. Handlers for those IRQs can't access the
full state related to the IRQ unless nvgpu is fully powered on.
In order to let the IRQ handlers know about the nvgpu power-on state
gk20a.power_on_state variable has to be protected through spinlock
to avoid the deadlock due to usage of earlier power_lock mutex.
Further the IRQs need to be disabled on local CPU while updating the
power state variable hence use spin_lock_irqsave and spin_unlock_-
irqrestore APIs for protecting the access.
JIRA NVGPU-1592
Change-Id: If5d1b5e2617ad90a68faa56ff47f62bb3f0b232b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2203860
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For safety build, nvgpu driver should enter SW quiesce state
in case an uncorrectable error has occurred. In this state, any
activity on the GPU should be prevented, without powering off the GPU.
Also, a minimal set of operations should be used to enter SW quiesce
state.
Entering SW quiesce state does the following:
- set sw_quiesce_pending: when this flag is set, interrupt
handlers exit after masking interrupts. This should help mitigate
an interrupt storm.
- wake up thread to complete quiescing.
The thread performs the following:
- set NVGPU_DRIVER_IS_DYING to prevent allocation of new resources
- disable interrupts
- disable fifo scheduling
- preempt all runlists
- set error notifier for all active channels
Note: for channels with usermode submit enabled, userspace can
still ring doorbell, but this will not trigger any work on
engines since fifo scheduling is disabled.
Jira NVGPU-3493
Change-Id: I639a32da754d8833f54dcec1fa23135721d8d89a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2172391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
CE app functionality from nvgpu is non-safe for igpu. CE engines init
/reset/cg related functionality is required in safety. Hence move the
CE app logic under CONFIG_NVGPU_DGPU flag and update the sources
accordingly.
JIRA NVGPU-3814
Change-Id: I37aa00b1184baccd5fe569ec315be60ac42dac9b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2168956
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For dGPU with PCIE interface do not have a thermal alert pin.
Only platforms where dGPU is used with SXM interface have the
thermal alert pin.
This change makes sure that if nvgpu-therm-gpio DT entry is
is missing we don't fail probe but continue with GPU
initialization without enabling thermal alert feature.
Bug 200542024
Change-Id: Iaf3aec9b66695a45daf86ecfdeec398b66f96bfd
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2173495
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- In GV11B, read fuse_status_opt_tpc_gpc register
to read which TPCs are floorswept.
- The driver will also read sysfs node: tpc_pg_mask
- Based on these two values "can_tpc_powergate" will
be set to true or false and mask will be used to write to
fuse_ctrl_opt_tpc_gpc register to powergate the TPC.
- can_tpc_powergate = true indicates that the mask value
sent from userspace is valid and can be used to power gate
the desired TPC
- can_tpc_powergate = false indicates that the mask value
sent from userspace is not valid and cannot be used to
power gate the desired TPC.
Bug 200532639
Change-Id: Ib0806e4c96305a13b3574e8063ad8e16770aa7cd
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2170736
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PG189 has multiple sensors which can provide interrupt when board
temperature reaches programmed threshold.
This Interrupt is implemented in nvgpu and provide events via clk_arb.
Support is enabled for TU104 with NVGPU_SUPPORT_DGPU_THERMAL_ALERT flag.
Board specific config is added in DT which will be parsed by nvgpu.
Nvgpu does the following.
1.Read gpio line number, interrupt type, and event delay from DT.
2.Call kernel methods and register the interrupt with kernel.
3.Create work queue which will process the interrupt in process context.
4.When interrupt occurs disable interrupt, add work to work queue.
5.In work queue post events and sleep for delay time then enable
Interrupt
Bug 2492512
Change-Id: Ic5694fe366ca492f8afe8a67de4350e9a51af2af
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2119411
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix 8.2 violation for not specifying parameter name in prototype of
secure_alloc().
Fix 21.3 & 21.8 violations for using reserved names "free" and "exit."
Fix 8.6 and 21.2 violations for __gk20a_do_idle() and
__gk20a_do_unidle() by renaming the functions and wrapping them in a
missing #ifdef CONFIG_PM.
Fix 5.7 violation for reusing "class" as parameter name when already
defined as a struct.
JIRA NVGPU-3343
Change-Id: I976e95a32868fa0a657f4baf0845a32bd7aceb9e
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2117913
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
create a new unit common.fbp which initializes fbp support and provides
APIs to retrieve fbp data.
Create private header with below data
struct nvgpu_fbp {
u32 num_fbps;
u32 max_fbps_count;
u32 fbp_en_mask;
u32 *fbp_rop_l2_en_mask;
};
Expose below public APIs to initialize/remove fbp support:
nvgpu_fbp_init_support()
nvgpu_fbp_remove_support()
vgpu_fbp_init_support() for vGPU
Expose below APIs to retrieve fbp data
nvgpu_fbp_get_num_fbps()
nvgpu_fbp_get_max_fbps_count()
nvgpu_fbp_get_fbp_en_mask()
nvgpu_fbp_get_rop_l2_en_mask()
Use above APIs to retrieve fbp data in all the code.
Remove corresponding fields from struct nvgpu_gr since they are no
longer referred from that structure
Jira NVGPU-3124
Change-Id: I027caf4874b1f6154219f01902020dec4d7b0cb1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2108617
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
cyclestats_snapshot data and lock is right now stored in struct nvgpu_gr
Use case itself is not specific to GR engine but in general it applies
to other units outside of GR too.
Hence it makes sense to move both data and lock to struct gk20a instead
of keeping them in struct nvgpu_gr
Update all cyclestats_snapshot code to refer data/lock from struct gk20a
Remove gr_priv.h header include from cyclestats_snapshot.c
Some of the functions were mistakenly declared in gr_gk20a.h.
Move them to cyclestats_snapshot.h and rename them to form nvgpu_css_*()
Jira NVGPU-1103
Change-Id: I3fb32fe96f0ca6613f4640c8bd227b9e0e02dca3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2104848
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Description:
Present p-state unit handle both pstate boardobj and
initializing all the units. As part of restructuring,
the pstate unit is separated into two units:
1) Perf_pstate: This unit will handle pstate boardobjs.
2) Pmu_pstate: This unit will initialize all the units
which supoort performance states.
Changes:
1) Created pmu_pstate unit.
2) Pstate boardobjs are moved under perf_pstate which
is under perf unit.
NVGPU-1958
Change-Id: I2c428adfe6de4992c9eeda0d4356d30290f6e8a4
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2096339
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Removed unused struct from gr_gk20a.h
Change static allocation for struct gr_gk20a to dynamic type.
Change all the files that being affected by that change.
Call gr allocation from corresponding init_support functions, which
are part of the probe functions.
nvgpu_pci_init_support in pci.c
vgpu_init_support in vgpu_linux.c
gk20a_init_support in module.c
Call gr free before the gk20a free call in nvgpu_free_gk20a.
Rename struct gr_gk20a to struct nvgpu_gr
JIRA NVGPU-3132
Change-Id: Ief5e664521f141c7378c4044ed0df5f03ba06fca
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2095798
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Moved the following HALs from fifo to usermode
- fifo.ring_channel_doorbell -> usermode.ring_doorbell
- fifo.doorbell_token -> usermode.doorbell_token
- fifo.usermode_base -> usermode.base
Created the following HAL
- usermode.setup_hw
Jira NVGPU-2978
Change-Id: I856ea24c126fa22d2f3fe860d4b14087c6d7330b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2094813
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently ACR header files are part of "include/nvgpu/acr/" folder &
ACR interfaces are not used by any other UNIT which allows headers to
keep restricted to ACR unit, as ACR can be divided into two stage
process like blob preparation & bootstrap, so moved header files from
of "include/nvgpu/acr/" to "nvgpu/common/acr/" to respective blob/
bootstrap/acr header files along with its dependent interfaces, this
allows interfaces restricted to header file based on operation it does.
With this any access to ACR must go through provided public functions,
this header move change caused large code modification & required to
make it with big single CL to avoid build break.
JIRA NVGPU-2907
Change-Id: Idb24b17a35f7c7a85efe923c4e26edfd42b028e3
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2071393
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We have 3 header files for FECS tracing support
include/nvgpu/gr/fecs_trace.h : common header
include/nvgpu/ctxsw_trace.h : header that includes both common and
os-specific functions
os/linux/ctxsw_trace.h : linux specific header
Remove the second header since it is not needed.
Move all structures that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all function declarations that are needed in common code to
include/nvgpu/gr/fecs_trace.h
Move all linux specific declarations in os/linux/ctxsw_trace.h and
rename this file as os/linux/fecs_trace_linux.h
Also rename os/linux/ctxsw_trace.c to os/linux/fecs_trace_linux.c
Jira NVGPU-1880
Change-Id: I05cc4489c4b6a64880b7d59c02b22cd2244d5e22
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2070766
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add fifo sub-unit to common.fifo to handle init/deinit code
and global support functions.
Split init into:
- nvgpu_channel_setup_sw
- nvgpu_tsg_setup_sw
- nvgpu_fifo_setup_sw
- nvgpu_runlist_setup_sw
- nvgpu_engine_setup_sw
- nvgpu_userd_setup_sw
- nvgpu_pbdma_setup_sw
Split de-init into
- nvgpu_channel_cleanup_sw
- nvgpu_tsg_cleanup_sw
- nvgpu_fifo_cleanup_sw
- nvgpu_runlist_cleanup_sw
- nvgpu_engine_cleanup_sw
- nvgpu_userd_cleanup_sw
- nvgpu_pbdma_cleanup_sw
Added the following HALs
- runlist.length_max
- fifo.init_pbdma_info
- fifo.userd_entry_size
Last 2 HALs should be moved resp. to pbdma and userd sub-units,
when available.
Added vgpu implementation of above hals
- vgpu_runlist_length_max
- vgpu_userd_entry_size
- vgpu_channel_count
Use hals in vgpu_fifo_setup_sw.
Jira NVGPU-1306
Change-Id: I954f56be724eee280d7b5f171b1790d33c810470
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2029620
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The corresponding HAL pointer for gk20a_fifo_wait_engine_idle is not
being invoked anywhere and hence they are removed from the code.
The function gk20a_fifo_wait_engine_idle belongs to engine unit and is
only called in a non-safe build, hence its moved to engine unit and is
restricted by a non-safe build flag NVGPU_ENGINE
Also, gk20a_fifo_wait_engine_idle is renamed to nvgpu_engine_wait_for_idle
Jira NVGPU-1315
Change-Id: Ie550c7e46a4284dfe368859d828b1994df34185f
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2033631
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following functions belong to engine unit and are moved
gk20a_fifo_enable_engine_activity
gk20a_fifo_enable_all_engine_activity
gk20a_fifo_disable_engine_activity
gk20a_fifo_disable_all_engine_activity
These are renamed by replacing gk20a_fifo with nvgpu_engine as prefix.
These functions are only invoked by linux build and not required for
safety build and hence they are defined when
-DNVGPU_ENGINE is enabled.
Jira NVGPU-1315
Change-Id: I39d820879bb55b40e754526c657d794930a4b6a1
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2032606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a_ctxsw_trace_init() is defined in linux specific code, but it is
right now being called from common functions gk20a_finalize_poweron()
and vgpu_finalize_poweron_common() in common code
Move this call to linux specific common function
nvgpu_finalize_poweron_linux() instead
Jira NVGPU-1880
Change-Id: I2548683f1a0378194ff5dd84c94936ccf0f9d17b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2069474
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename __nvgpu_set_enabled() to nvgpu_set_enabled(). The original
double underscore was present to indicate that this function is a
function with potentially unintended side effects (enabling a feature
has wide ranging impact).
To not lose this documentation a comment was added to convey that this
function must be used with care.
JIRA NVGPU-1029
Change-Id: I8bfc6fa4c17743f9f8056cb6a7a0f66229ca2583
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1989434
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The PMU pstate deinit was invoked part of gpu power off. This frees and clears
the pmgr_pmu struct which causes the pmu remove support to crash when it
tries to access the pmgr_pmu object for freeing up the pmu board objects.
Deferred pstate deinit to nvgpu driver removal as there is no reason for it be
invoked part of prepare poweroff sequence.
JIRA NVGPU-1618
Change-Id: I2eb52000f0732d0abed54946e0843367b119d443
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1971225
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Instead of just the base address of the main register range, store
(also) the base address of usermode area. All regs may not be always
available; on vgpu guests we have only the usermode regs.
Store the usermode addr we get from a platform resource directly in
gv11b_vgpu_probe() for vgpu. In that case the main reg addr is unset.
The base address is computed in gk20a_pm_finalize_poweron() for native
environments; when the reg addr is read from a resource, the chip is
still unknown and as such the HAL op for reading the usermode base
offset is unavailable.
Bug 200145225
Bug 200467197
Change-Id: I8855bb54a6456eb63b69559c84398f7eeaec3513
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1951524
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a mmap callback on the control device node for mapping the usermode
register region to userspace. Each such mapping is removed when the GPU
railgates, and brought back again on unrailgate.
The mapping offset must be 0 and its size must be 4 KB.
Bug 200145225
Change-Id: Ie8d3758da745b958376292691d7d1d02a24e7815
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1795819
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Usermode submit needs to access the usermode region of registers from
userspace. Store the start address of register resource in struct
nvgpu_os_linux to be used in remap to userspace.
Bug 200145225
Change-Id: I3796b6bf67942af0cc16c86accb82a013032bfc8
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1811838
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>