Commit Graph

66 Commits

Author SHA1 Message Date
atanand
f43897c940 gpu: nvgpu: GA10X_NEXT pulling GR1 out of reset
This patch is to enable GR1 before resetting GR0
which is not visible to the driver.

Bug 3690950

Change-Id: I8a1907349f5a4354c6b7f95f9904b52738f51f00
Signed-off-by: atanand <atanand@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2758161
(cherry picked from commit 48d925cacf373a97dbdb031a109b83be3bfe2972)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2765635
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-09-08 21:05:01 -07:00
Kishan
282be07995 gpu: nvgpu: Fix leaf and top interrupt disabling logic
There are 2 issues here:
1. top_en register is being masked for each leaf level
interrupt disable operation. top_en bit should be disabled
as part of top level stall operation only.
2. Wrong mask is being calculated to disable the leaf_en bits
for a unit which inturn affects the entire subtree.
Subtree_mask_restore for a subtree stores the last state
of interrupts that are enabled. As part of disable operation,
we only need to update subtree_mask_restore and not reupdate
subtree_mask for that subtree. Same logic applies to enable
operation.

Renamed the apis to better reflect their operation. The
interrupt disabling is done at unit level and not subtree level.

Bug 3712884

Change-Id: Id840c77f612021a303cfe0e8dca69386bc570273
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2752541
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Deepak Goyal <dgoyal@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-08-05 16:38:38 -07:00
Jinesh Parakh
622fe70dab gpu: nvgpu: Fix Bad bit shift Coverity issues
Fixed following Coverity Defects:
ioctl_as.c : Bad bit shift operation
mc_tu104.c : Bad bit shift operation
vm.c : Bad bit shift operation
vm_remap.c : Bad bit shift operation

A new linux header file for ilog2 is created.
The files which used the old ilog2 function
have been changed to use the new nvgpu_ilog2
function.

CID 9847922
CID 9869507
CID 9859508
CID 10112314
CID 10127813
CID 10127899
CID 10128004

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: Ia201eea7cc426c3d6581e1e5ae3b882dbab3b490
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2700994
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-28 04:08:45 -07:00
Tejal Kudav
b80b2bdab8 gpu: nvgpu: Add CE interrupt handling
a. LAUNCH_ERR
    - Userspace error.
    - Triggered due to faulty launch.
    - Handle using recovery to reset CE engine and teardown the
      faulty channel.

b. An INVALID_CONFIG -
    - Triggered when LCE is mapped to floorswept PCE.
    - On iGPU, we use the default PCE 2 LCE  HW mapping.
      The default mapping can be read from NV_CE_PCE2LCE_CONFIG
      INIT value in CE refmanual.
    - NvGPU driver configures the mapping on dGPUs (currently only on
      Turing).
    - So, this interrupt can only be triggered if there is
      kernel or HW error
    - Recovery ( which is killing the context + engine reset) will
      not help resolve this error.
    - Trigger Quiesce as part of handling.

c. A MTHD_BUFFER_FAULT -
    - NvGPU driver allocates fault buffers for all TSGs or contexts,
      maps them in BAR2 VA space and writes the VA into channel
      instance block.
    - Can be triggered only due to kernel bug
    - Recovery will not help, need quiesce

d. FBUF_CRC_FAIL
    - Triggered when the CRC entry read from the method fault buffer
      does not match the computed CRC from the methods contained in
      the buffer.
    - This indicates memory corruption and is a fatal interrupt which
      at least requires the LCE to be reset before operations can
      start again, if not the entire GPU.
    - Better to quiesce on memory corruption
      CE Engine reset (via recovery) will not help.

e. FBUF_MAGIC_CHK_FAIL
    - Triggered when the MAGIC_NUM entry read from the method fault
      buf does not match NV_CE_MTHD_BUFFER_GLOBAL_HDR_MAGIC_NUM_VAL
    - This indicates memory corruption and is a fatal interrupt
    - Better to quiesce on memory corruption

f. STALLING_DEBUG
    - Only triggered with SW write for debug purposes
    - Debug interrupt, currently ignored

Move launch error handling from GP10b to GV11b HAL as -
1. LAUNCHERR_REPORT errcode METHOD_BUFFER_ACCESS_FAULT is not
   defined on Pascal
2. We do not support GP10b on dev-main ToT

JIRA NVGPU-8102

Change-Id: Idc84119bc23b5e85f3479fe62cc8720e98b627a5
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678893
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-14 17:12:14 -07:00
srajum
390df709ca gpu: nvgpu: fixing static analysis violation
- MISRA Rule 17.7
  The value returned by a function having non-void return type
  shall be used

JIRA NVGPU-5955

Change-Id: I59539042d05afa9e74272fc8645b2fe1fa8e42aa
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572085
(cherry picked from commit bd75b6196b7fac67fbf7a458e6bed9e3c7076ee8)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678671
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-10 16:03:28 -08:00
Tejal Kudav
3fe70bf86e gpu: nvgpu: Update CE Intr code as per Orin HSIs
Below CE interrupts do not have any users(usecases) on safety build;
disable them only on safety build.
   1. BLOCKPIPE stall intr: Not used by GFX(VKSC) and CUDA on safety.
   2. NONBLOCK_PIPE nonstall intr: Non-stall intrs are not supported
          on safety build. Also, this one is not used by GFX(VKSC)
          and CUDA.
   3. STALLING_DEBUG intr: Added in Orin tree. It is only needed for
          debugging. Disable on safety build as there is no current
          usage in driver.
   4. POISON_ERROR intr: Poison is a fault containment and not
	  supported on GA10b.
   5. INVALID_CONFIG intr: Floor sweeping not supported on functional
          safety SKU.

Bug 3548082

Change-Id: I8d97ccb38f138b2c04a780e1c255a64d28723405
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671927
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-08 11:41:26 -08:00
Shashank Singh
19a3b86f06 gpu: nvgpu: remove unused code from common.nvgpu on safety build
- remove unused code from common.nvgpu unit on safety build. Also,
remove the code which uses them in other places.
- document use of compiler intrinsics as mandated in code inspection
  checklist.

Jira NVGPU-6876

Change-Id: Ifd16dd197d297f56a517ca155da4ed145015204c
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561584
(cherry picked from commit 900391071e9a7d0448cbc1bb6ed57677459712a4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561583
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-17 04:58:32 -08:00
Ramesh Mylavarapu
9302b2efee gpu: nvgpu: gsp units separation
Separated gsp unit into three unit:
- GSP unit which holds the core functionality of GSP RISCV core,
  bootstrap, interrupt, etc.
- GSP Scheduler to hold the cmd/msg management, IPC, etc.
- GSP Test to hold stress test ucode specific support.

NVGPU-7492

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I12340dc776d610502f28c8574843afc7481c0871
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2660619
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-09 00:38:21 -08:00
Richard Zhao
e81a36e56a gpu: nvgpu: hal: fix compile error of new compile flags
It's preparing to add bellow CFLAGS:
    -Werror -Wall -Wextra \
    -Wmissing-braces -Wpointer-arith -Wundef \
    -Wconversion -Wsign-conversion \
    -Wformat-security \
    -Wmissing-declarations -Wredundant-decls -Wimplicit-fallthrough

Jira GVSCI-11640

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia16ef186da1e97badff9dd0bf8cbd6700dd77b15
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555057
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-13 12:36:19 -08:00
Divya
4c2e2bb0b9 gpu: nvgpu: Reduce allow/disallow in stall_isr
With the existing implemenatation ELPG was disabled and
enabled once for handling stall isr and then again ELPG is
disabled and aenabled for writing to gr retrigger register.
This increased number of ELPG cycles and degraded perf of various
graphics test with ELPG enabled.

This change now disables ELPG, then handles stall_isr and
write to gr retrigger register and then enables ELPG.
Thus, number of ELPG cycles are reduced.

Bug 3451615

Change-Id: Iadac0c7b01eb711878280cd1503ba0f26000937c
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2638175
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-12-10 13:27:16 -08:00
Deepak Nibade
3d9c67a0e7 gpu: nvgpu: enable Orin support in safety build
Most of the Orin chip specific code is compiled out of safety build
with CONFIG_NVGPU_NON_FUSA and CONFIG_NVGPU_HAL_NON_FUSA. Remove the
config protection from Orin/GA10B specific code. Currently all code
is enabled. Code not required in safety will be compiled out later
in separate activity.

Other noteworthy changes in this patch related to safety build:

- In ga10b_ce_request_idle(), add a log print to dump num_pce so that
  compiler does not complain about unused variable num_pce.
- In ga10b_fifo_ctxsw_timeout_isr(), protect variables active_eng_id and
  recover under CONFIG_NVGPU_KERNEL_MODE_SUBMIT to fix compilation
  errors of unused variables.
- Compile out HAL gops.pbdma.force_ce_split() from safety since this HAL
  is GA100 specific and not required for GA10B.
- Compile out gr_ga100_process_context_buffer_priv_segment() with
  CONFIG_NVGPU_DEBUGGER.
- Compile out VAB support with CONFIG_NVGPU_HAL_NON_FUSA.
- In ga10b_gr_intr_handle_sw_method(), protect left_shift_by_2 variable
  with appropriate configs to fix unused variable compilation error.
- In ga10b_intr_isr_stall_host2soc_3(), compile ELPG function calls
  with CONFIG_NVGPU_POWER_PG.
- In ga10b_pmu_handle_swgen1_irq(), move whole function body under
  CONFIG_NVGPU_FALCON_DEBUG to fix unused variable compilation errors.
- Add below TU104 specific files in safety build since some of the code
  in those files is required for GA10B. Unnecessary code will be
  compiled out later on.
	hal/gr/init/gr_init_tu104.c
	hal/class/class_tu104.c
	hal/mc/mc_tu104.c
	hal/fifo/usermode_tu104.c
	hal/gr/falcon/gr_falcon_tu104.c
- Compile out GA10B specific debugger/profiler related files from
  safety build.
- Disable CONFIG_NVGPU_FALCON_DEBUG from safety debug build temporarily
  to work around compilation errors seen with keeping this config
  enabled. Config will be re-enabled in safety debug build later.

Jira NVGPU-7276

Change-Id: I35f2489830ac083d52504ca411c3f1d96e72fc48
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2627048
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-11-26 08:46:47 -08:00
Sagar Kamble
e692cd9913 gpu: nvgpu: fix spurious stall intr log
Below sequence leads to the condition where nvgpu logs error
about spurious stall intr occurence:

1. PMU interrupt is set after ELPG command is sent to PMU.
2. Stalling irqhandler sees the interrupt and schedules
   the stalling thread.
3. PMU isr gets executed from nvgpu_pmu_wait_fw_ack_status
   just before the stalling irq thread is run. Due to this,
   top level interrupt gets cleared.
4. When stalling irq thread gets executed it sees no
   interrupt and logs as spurious interrupt.

This condition is not actually about "spurious interrupt",
hence change error log to gpu_dbg_intr log type and
rephrase it.

Bug 200780211

Change-Id: Idab62f61007012f7022a836473562795c24821ef
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2628275
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-23 07:04:00 -08:00
Sagar Kamble
cdfbd4313b gpu: nvgpu: add BVEC tests for common.mc unit
Add BVEC tests for following common.mc unit APIs:
1. nvgpu_mc_intr_stall_unit_config
2. nvgpu_mc_intr_nonstall_unit_config
3. mc.reset_mask

Changed the WARN to nvgpu_err in mc.reset_mask. Invalid inputs are
handled properly.

Updated the MC unit test logic w.r.t mc_intr_en_r, mc_intr_en_set_r
and mc_intr_en_clear_r semantics.

JIRA NVGPU-6399

Change-Id: I6a3ae42ac37cd6b586f6c71de338595e6cb04a37
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542591
(cherry picked from commit b9908c979e8964a216141cc6ed475c7de2f2cc0b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623631
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-14 01:30:25 -08:00
Konsta Hölttä
f4ec400d5f gpu: nvgpu: simplify nvgpu_timeout_init
nvgpu_timeout_init() returns an error code only when the flags parameter
is invalid. There are very few possible values for flags, so extract the
two most common cases - cpu clock based and a retry based timeout - to
functions that cannot fail and thus return nothing. Adjust all callers
to use those, simplfying error handling quite a bit.

Change-Id: I985fe7fa988ebbae25601d15cf57fd48eda0c677
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2613833
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-26 13:47:32 -07:00
Divya Singhatwaria
9984b59a00 gpu: nvgpu: Add ELPG protected call for GR and CE intr
- Accessing any PGRAPH registers in GR intr retrigger
  ISR routine when ELPG is engaged causes idle snap.
- This idle snap is caught when nvgpu_submit_illegal_class
  test is run.
- To avoid access to PGRAPH registers when ELPG is engaged
  add elpg protected call for GR intr retrigger and CE ISR
  and retrigger HALs

Bug 200777033

Change-Id: Ieef4a423faf79f09476d696c3078b113750548bb
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2586449
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-24 07:11:38 -07:00
Ramesh Mylavarapu
ffd0d3962f gpu: nvgpu: gsp: gsp isr and debug trace support
- Created GSP NVRISCV interrupt handle and
  respective functions and register reads.
- Created Debug trace support for GSP firmware.

NVGPU-7084

Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I2728150c4db00403aa6e3c043bc19c51677dd9cf
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589430
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-07 05:37:51 -07:00
Vedashree Vidwans
43980bfe06 gpu: nvgpu: remove nvgpu_is_bpmp_running usage
BPMP driver doesn't support any API to check whether bpmp is running.
Remove use of nvgpu_is_bpmp_running.

Bug 200720732

Change-Id: Id266e65d4af598dd056cbdbaa219d0d53b7b3fb3
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556448
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-15 10:06:42 -07:00
tkudav
0526e7eaa9 gpu: nvgpu: Create CIC-mon and CIC-rm subunits
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.

CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).

Split the CIC APIs and data-members into above two subunits.

JIRA NVGPU-6899

Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 09:57:56 -07:00
Antony Clince Alex
d2919409e9 gpu: nvgpu: rename/collpase nvgpu_next functions and structs
Replace all nvgpu_next functions/structs either by 1) collapsing them
into nvgpu legacy functions/structs 2) renaming them as follows:
- nvgpu_next_*() => nvgpu_(ga10b/ga100)_*()
- nvgpu_next_*() => (ga10b/ga100)_*()
- nvgpu_next_*() => nvgpu_*() [only if this doesn't cause collision]
- nvgpu_next_*() = > nvgpu_*_extra()

Create hal.sim unit and move Ampere+ SIM code into it.

Jira NVGPU-4771

Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:58 -07:00
Sagar Kadamati
3e43f92f21 gpu: nvgpu: add ga10b & ga100 sources
Mass copy ga10b & ga100 sources from nvgpu-next repo.
TOP COMMIT-ID: 98f530e6924c844a1bf46816933a7fe015f3cce1

Jira NVGPU-4771

Change-Id: Ibf7102e9208133f8ef3bd3a98381138d5396d831
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524817
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-17 12:56:16 -07:00
Tejal Kudav
9f43914933 gpu: nvgpu: Move Intr handling common code to CIC
CIC (Central Interrupt controller) will be responsible for the
interrupt handling. common.cic unit is the placeholder for all
interrupt related code. Move interrupt related defines and
Public APIs present in common.mc to common.cic.
Note: The common.mc interrupts related struct definitions are
not moved as part of this patch.

Adapt the code to use interrupt handling related defines and public
APIs migrated from common.mc to common.cic

JIRA NVGPU-6899

Change-Id: I747e2b556c0dd66d58d74ee5bb36768b9370d276
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535618
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-31 19:37:31 -07:00
Vedashree Vidwans
2fa0e6fca1 gpu: nvgpu: resolve misra 5.7 violation
Rule 5.7 doesn't allow an identifier to be reused. This patch renames
variable "ops" to resolve this violation.

Jira NVGPU-6272

Change-Id: Ica8364a21d28dcf02d935a1b212e334b3f48c98a
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2471047
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-01-16 03:19:41 -08:00
Vedashree Vidwans
e0dd79cd43 gpu: nvgpu: rearch mc reset and enable hals
Remove current mc hals
- mc.reset()
- mc.enable()
- mc.disable()
- mc.reset_mask()
- mc.reset_engine()
- mc.reset_engine_enable()

Add new mc hals
- mc.enable_units(g, units, enable)
  > enable/disable given unit(s)
- mc.enable_dev(g, dev, enable)
  > enable/disable engine represented by given device pointer
- mc.enable_devtype(g, devtype)
  > enable/disable all engines of given devtype

Move common mc intr functions to common/mc/mc_intr.c.
Add below common mc functions
- nvgpu_mc_reset_units(g, units)
  > reset given logical OR of nvgpu unit bitmap
- nvgpu_mc_reset_dev(g, dev)
  > reset given single engine via dev
  > if engine is graphics, reset gpcs for nvgpu_next
- nvgpu_mc_reset_devtype(g, devtype)
  > reset all engines of given devtype
  > if devtype is graphics, reset gpcs for nvgpu_next

Bug 200648985
Bug 3109773

Change-Id: Idc67a14a0a7cde83de44fbfbec13007fead3ed5c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408523
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
38ce6fa717 gpu: nvgpu: change unnamed structs to named structs
Following changes are made in this patch.
1) Change unnamed structs within gpu_ops to named structs
with the prefix gops_*.

2) Each named struct gops_ are moved into a separate gops specific file
under include/nvgpu/gops/

3) struct gpu_ops is moved into a separate file include/nvgpu/gpu_ops.h
and all other dependent struct gops_* are included in this header.

4) Direct references to include/nvgpu/gops are removed from files as its enough
to include gk20a.h.

Change-Id: Ieb22cb853be567e3bef14f5f8a04674eebd902ea
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398776
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
fba96fdc09 gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.

Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:

  - The enum_type that engine_info uses is defined in engines.h and
    has a bit of SW abstraction - in particular the GRCE type. The only
    place this seemed to be actually relevant (the IOCTL providing device
    info to userspace) the GRCE engines can be worked out by comparing
    runlist ID.
  - Addition of masks based on intr_id and reset_id; those can be
    computed easily enough using BIT32() but this is an area that
    could be improved on.

This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.

JIRA NVGPU-5421

Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
fb95b7efa7 gpu: nvgpu: move nvgpu_func io functions to common
- Move nvgpu_func_writel and nvgpu_func_readl to common io file.
- Add func.get_full_phys_offset() hal to gk20a_gops structure.
- Add tu104_func_get_full_phys_offset() for tu104.

JIRA NVGPU-5363

Change-Id: I2aa13862a37f48321510882053256e16ef3f7377
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2383483
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
db30ea3362 gpu: nvgpu: move mc_intr_pbus from stall (intr_0) to nonstall (intr_1) tree
Nvgpu does not support nested interrupts and as a result priv/pbus
interrupt do not reach cpu while other interrupts on intr_0 (stall)
tree are being processed. This issue is not specific to priv/pbus
but since pbus errors are critical, it is important to detect it
early on.

Below is the snippet from one of the failing logs where nvgpu
is doing recovery to process gr interrupt.
Right after GR engine is reset (PGRAPH of PMC_ENABLE), failing priv
accesses should have triggered pbus interrupt but it does not reach cpu
until gr interrupt is handled. Any interrupt that requires recovery will
take longer to finish isr as recovery is done as part of isr.
Also intr_0 (stall) interrupts are paused while stall interrupt is being
processed.

gm20b_gr_falcon_bind_instblk:147  [ERR]  arbiter idle timeout, status: badf1020
gm20b_gr_falcon_wait_for_fecs_arb_idle:125  [ERR]  arbiter idle timeout, fecs ctxsw status: 0xbadf1020

Fix to detect pbus intr while other stall interrupts are being processed
is to move pbus intr enable/disable/clear/handle to nonstall (intr_1)
tree. Configure pbus_intr_en_1 to route pbus to nostall tree.
Priv interrupts cannot be moved to nonstall (intr_1) tree due
to h/w not supporting this.

In Turing, moving pbus intr to nonstall is not feasible as mc_intr(1)
tree is deprecated. Add Turing specific stall intr handler hals with
original logic to route pbus intr to mc_intr(0).

JIRA NVGPU-25
Bug 200603566

Change-Id: I36fc376800802f20a0ea581b4f787bcc6c73ec7e
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2354192
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seema Khowala
1c40ebe9b1 gpu: nvgpu: handle pbus and priv intr first
Handle pbus and priv intr before handling other stall
interrupts. These should be treated as high priority
interrupts.

JIRA NVGPU-25
Bug 200603566

Change-Id: I707119c8751a5621958777ffb64300db28426dfb
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2350773
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
aff5497907 gpu: nvgpu: add intr_unit_bitmask i/p param for fb.intr.isr
tu104 onwards, fb interrupt status/enable/disable moved from
fb_niso_intr_* reg to fb_*vector* registers.
At the top level, fb interrupt status/enable/disable is done
using hub_intr bit in mc_intr registers.

Starting nvgpu-next, this has changed.

JIRA NVGPU-5032

Change-Id: Ib54170b055b83e2696312c811c2e3ba678749359
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330867
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
28ccd63f69 gpu: nvgpu: enable CONFIG_NVGPU_LS_PMU for safety
Enable CONFIG_NVGPU_LS_PMU for dGPU safety build.
Add missing #ifdefs for CONFIG_NVGPU_POWER_PG and
CONFIG_NVGPU_CLK_ARB which are not defined for safety build.

Moved gm20b_mc_is_enabled to fusa code.

NVGPU_UNIT_PWR is only defined when CONFIG_NVGPU_HAL_NON_FUSA
is defined. Added #ifdefs to compile out gk20a_pmu functions
that are using it.

Jira NVGPU-4661

Change-Id: Ieb552f9374bad6f3dad777322f118931e0bc94ec
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2317085
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Thomas Fleury
f43d5df83a gpu: nvgpu: build dGPU in safety
Enable build flags for dGPU in safety, when
NVGPU_FORCE_DGPU_SAFETY_PROFILE is set.

Use libnvgpu-dgpu_safe.exports for dGPU safety build.

Add build flags for tu104 HAL initialization (to solve
undefined symbols in safety build).

Temporarily add non-fusa files needed to build dGPU in safety.
related functions will have to move to fusa files.

Jira NVGPU-4611

Change-Id: I41db0c039c7f15d9191cdb811b4906e779d5cc88
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2310276
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kamble
992aaebfc4 gpu: nvgpu: mc: fix the header guards for hal files
Header guards in gp10b, gv11b and gv100 MC hal files were not as per
naming convention. Fix those.

JIRA NVGPU-4795

Change-Id: Ifc8c162e43242a5d221e5685ceecb02b76944a96
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2288031
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Sagar Kamble
57f3968cb9 gpu: nvgpu: mc: address code inspection gaps
Address following issues uncovered during inspection:
1. Remove the doxygen comment from nvgpu_wait_for_deferred_interrupts
   definition.
2. Use NVGPU_MC_INTR_STALLING instead of hardcoding the index.
3. Define doxygen groups NVGPU_MC_UNIT_ENUMS,
   NVGPU_MC_INTR_TYPE_DEFINES, NVGPU_MC_INTR_UNIT_DEFINES and
   NVGPU_MC_INTR_ENABLE_DEFINES.
4. Update the doxygen comments.
5. Fix the cleanup, typo in the description of the test
   test_wait_for_deferred_interrupts.

JIRA NVGPU-4795

Change-Id: Ifc6756832aabf9dd42ee174eb1373495e6d38c86
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287627
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kamble
ad6a1b141f gpu: nvgpu: remove unneeded headers inclusion
Remove the header inclusions that are not needed in falcon and mc
units.

JIRA NVGPU-4795
JIRA NVGPU-4787

Change-Id: I39f639b01f321b7b4e969d839daef8efe22617e9
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287718
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Scott Long
21f8b366cd gpu: nvgpu: fix misra 2.5 violations
MISRA Advisory Rule 2.5 states that a project should not contain unused
macro declarations.

While most of the violations in the nvgpu driver are due to unused
macros from hw headers, devinit-related headers, etc. there is a small
number that are due to things like:

 * macros not being used when they could/should be
 * macros in C files that are really not referenced
 * CPP build flag mismatches

This change eliminates such violations from the following:

 * replace constants with existing macros in timeout conversion code
 * wrap nvgpu_gmmu_dbg macro #defines in #ifdef CONFIG_NVGPU_TRACE/#endif
 * wrap MAX_MC_INTR_REGS #define in #ifdef CONFIG_NVGPU_NON_FUSA/#endif
 * remove unused FECS_MAILBOX_0_ACK_RESTORE from runlist code
 * wrap BACKTRACE_MAXSIZE macro with #ifndef _QNX_SOURCE/#endif

Jira NVGPU-3178

Change-Id: I2bc72f706d7af3f8e7b062126e8543d0dc8ac250
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284419
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Sagar Kamble
b21b9bdf5d gpu: nvgpu: update debug prints in ltc isr
It was observed that some of the informational debug prints in the
LTC ISRs were printed through nvgpu_err. They should be instead
printed through nvgpu_log under flag gpu_dbg_intr.

JIRA NVGPU-4830

Change-Id: I04cf4e70a7c810014d8a6ee5e901c2b4cf2c58ac
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2285088
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Philip Elcan
2b86f65477 gpu: nvgpu: mc: cleanup SWVR traceability
Cleanup issues with traceability for common.mc:
- Move these declarations under macros or @cond as they are either
  non-fusa or private functions to the unit:
  - gm20b_mc_is_enabled
  - mc_gp10b_log_pending_intrs
  - mc_gp10b_ltc_isr
  - gv11b_mc_is_intr_hub_pending
- Fix typo in SWUTS for gv11b_mc_is_stall_and_eng_intr_pending

JIRA NVGPU-4818

Change-Id: I53a332627772e4d793430159ac1924c8f9ce8c1c
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280640
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:10:29 -06:00
Thomas Fleury
6fa5da61d7 gpu: nvgpu: use engine_id to access engine_info
Generalize use of "engine_id" variable name to index f->engine_info.

Jira NVGPU-4511

Change-Id: Ie3bc2c701dc3bab833d6ac134273dd6a102528c2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262219
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
66b68edd6b gpu: nvgpu: iterator name for active_engines
Some functions used engine_id or eng_id to index active_engines_list,
which could get confusing when used in conjunction with similar
variable as active_engine_id or act_eng_id.

Use generic iterator name i or j instead, to make it clear that
f->active_engines_list is NOT indexed by engine id.

Jira NVGPU-4511

Change-Id: I07a6bf00dfb6d4e608b10f2f79e38a70e557428c
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262218
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
fba516ffae gpu: nvgpu: enable PMU ECC interrupt early
PMU IRQs were not enabled assuming entire functionality for LS PMU.
Debugging early init issues of PMU falcon ECC errors triggered
during nvgpu power-on will be cumbersome if interrupts are not
enabled early. FMEA analysis of the nvgpu init path also
requires this interrupt be enabled earlier.

Hence, Enable the PMU ECC IRQ early during nvgpu_finalize_poweron.
pmu_enable_irq is updated to enable interrupts differently for
safety and non-safety. PMU interrupts disabling is moved out
of nvgpu_pmu_destroy to nvgpu_prepare_poweroff. Prepared new
wrapper API nvgpu_pmu_enable_irq.

PMU ECC init and isr mutex init is moved to the beginning of
nvgpu_pmu_early_init as for safety, ls pmu code path is
disabled. Fixed the pmu_early_init dependent and mc
interrupt related unit tests.

Update the doxygen for changed functions.

JIRA NVGPU-4439

Change-Id: I1a1e792d2ad2cc7a926c8c1456d4d0d6d1f14d1a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2251732
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
ae4f219c65 mc remove non-fusa HAL used only by PMU
Compile out the MC HAL is_enabled() with the macro CONFIG_NVGPU_LS_PMU
since it is only used by the PMU unit.

JIRA NVGPU-2224

Change-Id: I242ba923bfce62674107089157a6103aee6a2f93
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2258703
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
e515ba7098 gpu: nvgpu: mc: remove non-fusa HALs
Compile out the function mc_gp10b_log_pending_intrs() with the
CONFIG_NVGPU_NON_FUSA macro since it isn't used in the FUSA build.

JIRA NVGPU-2224

Change-Id: I5b2d465142be7d1bddf114c7374ced7b297c33d0
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2257127
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
a8c9c800cd gpu: nvgpu: reorganization of MC interrupts control
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.

For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.

JIRA NVGPU-4336

Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
f2b49f1c40 gpu: nvgpu: add doxygen for common MC unit
Add doxygen details about Master Control (MC) common unit. Moved the
interrupt handling related variables to new structure nvgpu_mc.

JIRA NVGPU-2524

Change-Id: I61fb4ba325d9bd71e9505af01cd5a82e4e205833
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226019
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
b26acdeb87 gpu: nvgpu: move mc_boot_0 function to hals and rename to get_chip_details
This function gets the GPU chip architecture, implementation and
revision information by reading the MC boot register, hence it
is more suited to be located in HAL files.
test_check_gpu_state is now being run after test_hal_init as the
gops.mc needs to be initialized for test_check_gpu_state subtest.

JIRA NVGPU-2524

Change-Id: I85355af11d3505a9eb4f10a3fe4e6d9b56285047
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226018
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
2edf3db10a gpu: nvgpu: move mc gpu_ops out of gk20a.h and add doxygen comments for HALs
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.

JIRA NVGPU-2524

Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
9169e8c048 gpu: nvgpu: mc: move mc declarations to mc.h
Move declarations that belong to mc from gk20a.h to mc.h where they
belong.

JIRA NVGPU-2532

Change-Id: I91934ff60e2735c61d16459c04507fed6e1c96d7
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2214421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
06fd513e1e gpu: nvgpu: move common.unit into common.mc
nvgpu.common.unit was just an enum used for passing to nvgpu.common.mc
APIs. So, move the enum into mc.h, and replace the include of unit.h
with mc.h where appropriate. And update the yaml arch.

JIRA NVGPU-4144

Change-Id: I210ea4d3b49cd494e43add1b52f3fbcdb020a1e3
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2216106
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
065f98f669 gpu: nvgpu: init: add return for all init APIs
This adds return values for all init APIs. This make all the init APIs
have the same signature. This is a prerequisite to making a table of
init functions.

JIRA NVGPU-3980

Change-Id: I5b71fd06ad248092af133ffe908e2930acb6d2b0
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2202973
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Vedashree Vidwans
21dfd4e441 gpu: nvgpu: mm: hal code complexity cleanup hal.mc
This patch divides complex code segments into smaller functions to
reduce code complexity in hal mc code.

Jira NVGPU-4065

Change-Id: I90c37da2ce100bdbb5f11b744a42c00c7fc33988
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2204234
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00