Remove current mc hals
- mc.reset()
- mc.enable()
- mc.disable()
- mc.reset_mask()
- mc.reset_engine()
- mc.reset_engine_enable()
Add new mc hals
- mc.enable_units(g, units, enable)
> enable/disable given unit(s)
- mc.enable_dev(g, dev, enable)
> enable/disable engine represented by given device pointer
- mc.enable_devtype(g, devtype)
> enable/disable all engines of given devtype
Move common mc intr functions to common/mc/mc_intr.c.
Add below common mc functions
- nvgpu_mc_reset_units(g, units)
> reset given logical OR of nvgpu unit bitmap
- nvgpu_mc_reset_dev(g, dev)
> reset given single engine via dev
> if engine is graphics, reset gpcs for nvgpu_next
- nvgpu_mc_reset_devtype(g, devtype)
> reset all engines of given devtype
> if devtype is graphics, reset gpcs for nvgpu_next
Bug 200648985
Bug 3109773
Change-Id: Idc67a14a0a7cde83de44fbfbec13007fead3ed5c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408523
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.
Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:
- The enum_type that engine_info uses is defined in engines.h and
has a bit of SW abstraction - in particular the GRCE type. The only
place this seemed to be actually relevant (the IOCTL providing device
info to userspace) the GRCE engines can be worked out by comparing
runlist ID.
- Addition of masks based on intr_id and reset_id; those can be
computed easily enough using BIT32() but this is an area that
could be improved on.
This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.
JIRA NVGPU-5421
Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Nvgpu does not support nested interrupts and as a result priv/pbus
interrupt do not reach cpu while other interrupts on intr_0 (stall)
tree are being processed. This issue is not specific to priv/pbus
but since pbus errors are critical, it is important to detect it
early on.
Below is the snippet from one of the failing logs where nvgpu
is doing recovery to process gr interrupt.
Right after GR engine is reset (PGRAPH of PMC_ENABLE), failing priv
accesses should have triggered pbus interrupt but it does not reach cpu
until gr interrupt is handled. Any interrupt that requires recovery will
take longer to finish isr as recovery is done as part of isr.
Also intr_0 (stall) interrupts are paused while stall interrupt is being
processed.
gm20b_gr_falcon_bind_instblk:147 [ERR] arbiter idle timeout, status: badf1020
gm20b_gr_falcon_wait_for_fecs_arb_idle:125 [ERR] arbiter idle timeout, fecs ctxsw status: 0xbadf1020
Fix to detect pbus intr while other stall interrupts are being processed
is to move pbus intr enable/disable/clear/handle to nonstall (intr_1)
tree. Configure pbus_intr_en_1 to route pbus to nostall tree.
Priv interrupts cannot be moved to nonstall (intr_1) tree due
to h/w not supporting this.
In Turing, moving pbus intr to nonstall is not feasible as mc_intr(1)
tree is deprecated. Add Turing specific stall intr handler hals with
original logic to route pbus intr to mc_intr(0).
JIRA NVGPU-25
Bug 200603566
Change-Id: I36fc376800802f20a0ea581b4f787bcc6c73ec7e
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354192
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
MISRA Advisory Rule 2.5 states that a project should not contain unused
macro declarations.
While most of the violations in the nvgpu driver are due to unused
macros from hw headers, devinit-related headers, etc. there is a small
number that are due to things like:
* macros not being used when they could/should be
* macros in C files that are really not referenced
* CPP build flag mismatches
This change eliminates such violations from the following:
* replace constants with existing macros in timeout conversion code
* wrap nvgpu_gmmu_dbg macro #defines in #ifdef CONFIG_NVGPU_TRACE/#endif
* wrap MAX_MC_INTR_REGS #define in #ifdef CONFIG_NVGPU_NON_FUSA/#endif
* remove unused FECS_MAILBOX_0_ACK_RESTORE from runlist code
* wrap BACKTRACE_MAXSIZE macro with #ifndef _QNX_SOURCE/#endif
Jira NVGPU-3178
Change-Id: I2bc72f706d7af3f8e7b062126e8543d0dc8ac250
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284419
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Cleanup issues with traceability for common.mc:
- Move these declarations under macros or @cond as they are either
non-fusa or private functions to the unit:
- gm20b_mc_is_enabled
- mc_gp10b_log_pending_intrs
- mc_gp10b_ltc_isr
- gv11b_mc_is_intr_hub_pending
- Fix typo in SWUTS for gv11b_mc_is_stall_and_eng_intr_pending
JIRA NVGPU-4818
Change-Id: I53a332627772e4d793430159ac1924c8f9ce8c1c
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280640
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.
For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.
JIRA NVGPU-4336
Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.
JIRA NVGPU-2524
Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rule 8.6 requires each identifier with external linkage to have exactly
one external definitions.
Rule 10.x necessitates operands to have essential type; left and right
operands should be of same width and type.
Rule 17.7 requires function return value to be checked for error
information.
This patch fixes above mentioned errors in nvgpu.hal.mc.
Jira NVGPU-3855
Change-Id: I5440392de5d55dc98ed2002273af8a44a596cd3a
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2162145
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
pmu_gk20a.c includes hw_mc_gk20a.h other than hw_pwr_gk20a.h
to access & configure pmu interrupt, this breaks single hw header
for HAL file.
Moved PMU interrupt enable to MC unit by creating/modifying current
mc ops intr_unit_config to intr_pmu_unit_config to configure PMU
interrupt specifically as this ops is only used by PMU unit
JIRA NVGPU-3239
Change-Id: I2514f17197708047b46ea712cf4569a5b3bfab2a
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2126420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move chip specific mc code from common/mc to hal/mc.
Replace gk20a_readl/writel with nvgpu_readl/writel
Replace 0xFFFFFFFFU with U32_MAX hash define
Change local variable names to fix checkpatch errors/warnings
Change BUG to WARN
Move defines to header files
Create new defines for hard coded delays
JIRA NVGPU-2041
Change-Id: I3594121a81da37ef58c47e87c45e96441e4cf8c7
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2085268
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>