Commit Graph

92 Commits

Author SHA1 Message Date
Rajesh Devaraj
3b2b225c73 gpu: nvgpu: update pmu_early_init
Move the setting of power features related enable flags to separate
static function. Invoke this function when PMU is not supported.

JIRA NVGPU-9283

Change-Id: I429504c09d40c2cb115fce7550555f06b1e384ed
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2817658
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Ramalingam C <ramalingamc@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-07 01:51:30 -08:00
Dinesh T
e4cf52123f gpu: nvgpu: Add ce halt function
This is adding CE halt fuction to reset CE properly
by setting stall req and waiting for stallack.

Bug 200641946

Change-Id: I501ccf68a4f6fe95911e73fa2eb65bde93a9f3e9
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678366
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-11 20:44:38 -08:00
Sagar Kamble
29a0a146ac gpu: nvgpu: fix coverity defects
Fix following coverity defects:
  ioctl_prof.c resource leak
  ioctl_dbg.c logically dead code
  global_ctx.c identical code for branches
  therm_dev.c resource leak
  pmu_pstate.c unused value
  nvgpu_mem.c dead default in switch
  tsg.c Dereference before null check
  nvlink_gv100.c logically dead code
  nvlink.c Out-of-bounds write
  fifo_vgpu.c Dereference null return value
  pmu_pg.c Dereference before null check
  fw_ver_ops.c Identical code for different branches
  boardobjgrp.c Dereference after null check
  boardobjgrp.c Dereference before null check
  boardobjgrp.c Dereference after null check
  engines.c Dereference before null check
  nvgpu_init.c Unused value

CID 10127875
CID 10127820
CID 10063535
CID 10059311
CID 10127863
CID 9875900
CID 9865875
CID 9858045
CID 9852644
CID 9852635
CID 9852232
CID 9847593
CID 9847051
CID 9846056
CID 9846055
CID 9846054
CID 9842821

Bug 3460991

Change-Id: I91c215a545d07eb0e5b236849d5a8440ed6fe18d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2657444
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-28 04:50:12 -08:00
Deepak Nibade
3d9c67a0e7 gpu: nvgpu: enable Orin support in safety build
Most of the Orin chip specific code is compiled out of safety build
with CONFIG_NVGPU_NON_FUSA and CONFIG_NVGPU_HAL_NON_FUSA. Remove the
config protection from Orin/GA10B specific code. Currently all code
is enabled. Code not required in safety will be compiled out later
in separate activity.

Other noteworthy changes in this patch related to safety build:

- In ga10b_ce_request_idle(), add a log print to dump num_pce so that
  compiler does not complain about unused variable num_pce.
- In ga10b_fifo_ctxsw_timeout_isr(), protect variables active_eng_id and
  recover under CONFIG_NVGPU_KERNEL_MODE_SUBMIT to fix compilation
  errors of unused variables.
- Compile out HAL gops.pbdma.force_ce_split() from safety since this HAL
  is GA100 specific and not required for GA10B.
- Compile out gr_ga100_process_context_buffer_priv_segment() with
  CONFIG_NVGPU_DEBUGGER.
- Compile out VAB support with CONFIG_NVGPU_HAL_NON_FUSA.
- In ga10b_gr_intr_handle_sw_method(), protect left_shift_by_2 variable
  with appropriate configs to fix unused variable compilation error.
- In ga10b_intr_isr_stall_host2soc_3(), compile ELPG function calls
  with CONFIG_NVGPU_POWER_PG.
- In ga10b_pmu_handle_swgen1_irq(), move whole function body under
  CONFIG_NVGPU_FALCON_DEBUG to fix unused variable compilation errors.
- Add below TU104 specific files in safety build since some of the code
  in those files is required for GA10B. Unnecessary code will be
  compiled out later on.
	hal/gr/init/gr_init_tu104.c
	hal/class/class_tu104.c
	hal/mc/mc_tu104.c
	hal/fifo/usermode_tu104.c
	hal/gr/falcon/gr_falcon_tu104.c
- Compile out GA10B specific debugger/profiler related files from
  safety build.
- Disable CONFIG_NVGPU_FALCON_DEBUG from safety debug build temporarily
  to work around compilation errors seen with keeping this config
  enabled. Config will be re-enabled in safety debug build later.

Jira NVGPU-7276

Change-Id: I35f2489830ac083d52504ca411c3f1d96e72fc48
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2627048
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-11-26 08:46:47 -08:00
Antony Clince Alex
cce1d7ad84 gpu: nvgpu: update device management framework to remove unusable engines
On certain platforms, not all copy engine instances are usable. The user
shouldn't submit any work to these engines. To enforce this, remove
these engines from active/host_engine list, this should ensure that these
engines do not get advertised to userspace. In order to accomplish this
introduce the following functions:
- nvgpu_engine_remove_one_dev: This function removes the specified device
  entry from following device lists: fifo->host_engines, fifo->active_engines,
  runlist->rl_dev_list, runlist->eng_bitmask.

Replace iteration over LCE device type entries using
nvgpu_device_for_each(g, dev, NVGPU_DEVTYPE_LCE), along with this introduce
macro nvgpu_device_for_each_safe.

Introduce gpu_dbg_ce flag for CE debugging.

Bug 3370462

Change-Id: I2e21f18363c6e53630d129da241c8fece106cd33
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2616711
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-18 09:18:55 -08:00
Konsta Hölttä
f4ec400d5f gpu: nvgpu: simplify nvgpu_timeout_init
nvgpu_timeout_init() returns an error code only when the flags parameter
is invalid. There are very few possible values for flags, so extract the
two most common cases - cpu clock based and a retry based timeout - to
functions that cannot fail and thus return nothing. Adjust all callers
to use those, simplfying error handling quite a bit.

Change-Id: I985fe7fa988ebbae25601d15cf57fd48eda0c677
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2613833
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-26 13:47:32 -07:00
Divya Singhatwaria
77e3a8c5e4 gpu: nvgpu: ga10b: Add request_idle ce ops
Issue observed:
- In GA10B, it was observed that after recovery happens
  ELPG does not engage.
- It was because, after CE reset, when nvgpu_submit_twod test
  was run to engage ELPG, IDLE_FLIPPED_PWR_OFF signal was asserted.
- This means that when ELPG was engaged (engine is in PWR_OFF),
  some idle signal flips (becomes non-idle) and this causes
  IDLE_SNAP. After IDLE_SNAP is hit, ELPG will not engage further.
- After debugging from WAVES, it was observed that:
  LCE0/LCE1 are not done with the reset sequence.
- The state of these LCE is RESET0. A pri request (pri read
  to NV_CE_PCE_MAP register in CE) is seen that kicks it out of
  RESET0. After this state, it goes through few states to update
  some internal states (states RESET1/RESET2/PCE_MAP etc) and then
  eventually settles down to IDLE state.

Solution:
- Read ce_pce_map_r register in recovery sequence (after ce reset).
- It is observed that when this read is added recovery is complete
  and post that when nvgpu_submit_two test is executed, ELPG is engaging.
- This means that a pri read is needed after CE reset so that it settles
  to idle state properly and post that ELPG can engage properly.

Bug 200734258

Change-Id: I5bb84921ca62a740fde81ffe6c29ccde4ebb341b
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554493
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-15 10:05:02 -07:00
Antony Clince Alex
d2919409e9 gpu: nvgpu: rename/collpase nvgpu_next functions and structs
Replace all nvgpu_next functions/structs either by 1) collapsing them
into nvgpu legacy functions/structs 2) renaming them as follows:
- nvgpu_next_*() => nvgpu_(ga10b/ga100)_*()
- nvgpu_next_*() => (ga10b/ga100)_*()
- nvgpu_next_*() => nvgpu_*() [only if this doesn't cause collision]
- nvgpu_next_*() = > nvgpu_*_extra()

Create hal.sim unit and move Ampere+ SIM code into it.

Jira NVGPU-4771

Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:58 -07:00
Antony Clince Alex
f9cac0c64d gpu: nvgpu: remove nvgpu_next files
Remove all nvgpu_next files and move the code into corresponding
nvgpu files.

Merge nvgpu-next-*.yaml into nvgpu-.yaml files.

Jira NVGPU-4771

Change-Id: I595311be3c7bbb4f6314811e68712ff01763801e
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547557
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:53 -07:00
Antony Clince Alex
c7d43f5292 gpu: nvgpu: remove usage of CONFIG_NVGPU_NEXT
The CONFIG_NVGPU_NEXT config is no longer required now that ga10b and
ga100 sources have been collapsed. However, the ga100, ga10b sources
are not safety certified, so mark them as NON_FUSA by replacing
CONFIG_NVGPU_NEXT with CONFIG_NVGPU_NON_FUSA.

Move CONFIG_NVGPU_MIG to Makefile.linux.config and enable MIG support
by default on standard build.

Jira NVGPU-4771

Change-Id: Idc5861fe71d9d510766cf242c6858e2faf97d7d0
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547092
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:47 -07:00
Antony Clince Alex
2d5d8e882f gpu: nvgpu: fix ce interrupt mask update
The CE interrupt mask update should not be skipped if the driver doesn't
implement stall or non-stall interrupt handlers. At present, the mask update is
skipped if either is not implemented causing the other to remain disabled which
is not correct.

Update nvgpu_ce_engine_interrupt_mask to always return engine interrupt mask.

Bug 200709761

Change-Id: I503338e3f4d53c1e0b85b0974d862f7b88545ef2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2506292
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-03-29 19:02:16 -07:00
Alex Waterman
11d3785faf gpu: nvgpu: Rename struct nvgpu_runlist_info, fields in fifo
Rename struct nvgpu_runlist_info to struct nvgpu_runlist; the
info is not necessary. struct nvgpu_runlist is soon to be a
first class object among the nvgpu object model.

Also rename the fields runlist_info and active_runlist_info to
simply runlists and active_runlists respectively. Again the info
text is just not necessary and somewhat misleading. These structs
_are_ the runlist representations in SW; they are not merely
informational.

Also add an rl_dbg() macro to print debug info specific to
runlist management and some debug prints specifying the runlist
topology for the running chip.

Change-Id: Id9fcbdd1a7227cb5f8c75cca4abbff94fe048e49
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470303
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-01-20 21:56:33 -08:00
Antony Clince Alex
c36af00e55 gpu: nvgpu: fix lookup of engine_id from mmu_fault_id
The function "nvgpu_engine_mmu_fault_id_to_eng_id_and_veid" updates only
the veid field and leaves the engine_id as invalid. This can cause the
recovery to be skipped in certain instances of MMUFAULT; For example,
the MMUFAULT when a unbind is done on a channel which is currently active
on the engine. In this case, the ch_id associated with the fault is -1 and
the function "gv11b_mm_mmu_fault_handle_non_replayable" will not set the
rc_type correctly causing recovery to be skipped and leaving the engine in
a bad state.

Bug 3163660

Change-Id: Ic99c47771a4002c153ac77ab0473b11d01cfd54a
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2457259
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:48 -06:00
Lakshmanan M
883c12529a gpu: nvgpu: Add multi GR reset support for MIG
* Added multi GR reset/recovery support for MIG.
* Added a api to get the gr engine id using gr instance id.

JIRA NVGPU-5650
JIRA NVGPU-5653

Change-Id: I12ece75a4c33f0944f404121b54879e814dda6df
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2443644
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:48 -06:00
Vedashree Vidwans
c0b9ae2f17 gpu: nvgpu: enable gr_reset in recovery on sim platform
HALT_PIPELINE method is supported on nvgpu-next simulation platform.
Send HALT_PIPELINE followed by gr reset during recovery for all types of
platforms including simulation platform.

Bug 3109773

Change-Id: Ib830075bb9414fa1765c762a652e63cddbe6a141
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406719
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
b49c892f81 gpu: nvgpu: Add multi GR reset support
Added multi GR reset support for MIG.

JIRA NVGPU-5653

Change-Id: I36c0473d4ba0e5bdd2dc07204b7c516ce9860b5e
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2416069
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
e0dd79cd43 gpu: nvgpu: rearch mc reset and enable hals
Remove current mc hals
- mc.reset()
- mc.enable()
- mc.disable()
- mc.reset_mask()
- mc.reset_engine()
- mc.reset_engine_enable()

Add new mc hals
- mc.enable_units(g, units, enable)
  > enable/disable given unit(s)
- mc.enable_dev(g, dev, enable)
  > enable/disable engine represented by given device pointer
- mc.enable_devtype(g, devtype)
  > enable/disable all engines of given devtype

Move common mc intr functions to common/mc/mc_intr.c.
Add below common mc functions
- nvgpu_mc_reset_units(g, units)
  > reset given logical OR of nvgpu unit bitmap
- nvgpu_mc_reset_dev(g, dev)
  > reset given single engine via dev
  > if engine is graphics, reset gpcs for nvgpu_next
- nvgpu_mc_reset_devtype(g, devtype)
  > reset all engines of given devtype
  > if devtype is graphics, reset gpcs for nvgpu_next

Bug 200648985
Bug 3109773

Change-Id: Idc67a14a0a7cde83de44fbfbec13007fead3ed5c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408523
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
2b48aa5b0c gpu: nvgpu: Add device for_each macro
Add a macro to iterate over a device list; it is just a wrapper to
the nvgpu_list_for_each() macro. It lets code iterate over the
list of detected devices without being aware of the underlying
instance IDs.

This also removes the need to do a separate nvgpu_device_get()
and subsequent NULL checking. This will reduce overhead for
unit testing!

Change-Id: If41dbee30a743d29ab62ce930a819160265b9351
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404914
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Tejal Kudav
b269aae9f2 gpu: nvgpu: correct usage of pbdma_id
The pbdma_id field stored in struct nvgpu_device is bitmask and not
bit position as implied by the name. This field is incorrectly used as
bit position in nvgpu_engine_disable_activity(), causing PRI timeout
errors during iGPU and dGPU shutdown path.

PRI timeout errors-
nvgpu: 17000000.gv11b                  gk20a_ptimer_isr:54   [ERR]
PRI timeout: ADR 0x0000308c READ  DATA 0x00000000

Here the pbdma_id stored in struct nvgpu_device for runlist_0 on
gv11b is 0x3(bitmask corresponding to PBDMA_0 and PBDMA_1).
nvgpu_engine_disable_activity() interprets this as PBDMA_3 and adds
incorrect offset to access PBDMA_STATUS register, causing PRI error.

Modify nvgpu_engine_disable_activity() to treat pbdma_id as bitmask
and loop through set bits.

JIRA NVGPU-5991

Change-Id: Iaffb974cddaa375a329e70f3b5903b9ef2a222c4
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397954
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
717921a274 gpu: nvgpu: return intr mask of all GR engine instances
nvgpu_gr_engine_interrupt_mask() earlier returned mask of all GR engine
instance interrupts. During device refactor series, this got changed to
return interrupt of only first instance.

Change this again to return interrupt mask of all the GR engine
instances since common.mc unit does not yet support APIs to enable
interrupt of individual GR instance.

Update nvgpu_gr_get_syspipe_id() API to take gr_instance_id as parameter
instead of struct nvgpu_gr pointer. Definition of struct nvgpu_gr is not
available outside of common.gr unit.

Jira NVGPU-5648

Change-Id: I5320d1515eea6054150dc14706a16475bd650da7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2405409
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Debarshi Dutta
38ce6fa717 gpu: nvgpu: change unnamed structs to named structs
Following changes are made in this patch.
1) Change unnamed structs within gpu_ops to named structs
with the prefix gops_*.

2) Each named struct gops_ are moved into a separate gops specific file
under include/nvgpu/gops/

3) struct gpu_ops is moved into a separate file include/nvgpu/gpu_ops.h
and all other dependent struct gops_* are included in this header.

4) Direct references to include/nvgpu/gops are removed from files as its enough
to include gk20a.h.

Change-Id: Ieb22cb853be567e3bef14f5f8a04674eebd902ea
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398776
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
fba96fdc09 gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.

Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:

  - The enum_type that engine_info uses is defined in engines.h and
    has a bit of SW abstraction - in particular the GRCE type. The only
    place this seemed to be actually relevant (the IOCTL providing device
    info to userspace) the GRCE engines can be worked out by comparing
    runlist ID.
  - Addition of masks based on intr_id and reset_id; those can be
    computed easily enough using BIT32() but this is an area that
    could be improved on.

This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.

JIRA NVGPU-5421

Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
d0714b40c1 gpu: nvgpu: Add engine reset profiling
This is a key part of the fifo recovery sequence.

JIRA NVGPU-5606

Change-Id: I8807884394834b912f25d7c535ee22f547988b2d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382590
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
359fc24aaf gpu: nvgpu: Rework engine management to work with vGPU
Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.

With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.

This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.

This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.

Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.

JIRA NVGPU-5421

Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
fbb6a5bc1c gpu: nvgpu: Remove fifo->pbdma_map
The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists.
This array allows other software to query what PBDMA(s) serves a given
runlist. The PBDMA map is read verbatim from an array of host registers.
These registers are stored in a kmalloc()'ed array.

This causes a problem for the device management code. The device
management initialization executes well before the rest of the FIFO
PBDMA initialization occurs. Thus, if the device management code
queries the PBDMA mapping for a given device/runlist, the mapping has
yet to be populated.

In the next patches in this series the engine management code is subsumed
into the device management code. In other words the device struct is
reused by the engine management and all host SW does is pull pointers to
the host managed devices from the device manager. This means that all
engine initialization that used to be done on top of the device
management needs to move to the device code.

So, long story short, the PBDMA map needs to be read from the registers
directly, instead of an array that gets allocated long after the device
code has run.

This patch removes the pbdma map array, deletes two HALs that managed
that, and instead provides a new HAL to query this map directly from
the registers so that the device code can use it.

JIRA NVGPU-5421

Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
160669a7bb gpu: nvgpu: return device from nvgpu_device_get()
Instead of copying the device contents into the passed pointer have
nvgpu_device_get() return a device pointer. This will let the engines.c
code move towards using the nvgpu_device type directly, instead of
maintaining its own version of an essentially identical struct.

JIRA NVGPU-5421

Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
319520ff57 gpu: nvgpu: Add a new device manager unit
This adds a new device management unit in the common code responsible
for facilitating the parsing of the GPU top device list and providing
that info to other units in nvgpu.

The basic idea is to read this list once from HW and store it in a
set of lists corresponding to each device type (graphics, LCE, etc).
Many of the HALs in top can be deleted and instead implemented using
common code parsing the SW representation.

Every time the driver queries the device list it does so using a
device type and instance ID. This is common code. The HAL is responsible
for populating the device list in such a way that the driver can
query it in a chip agnostic manner.

Also delete some of the unit tests for functions that no longer
exist. This code will require new unit tests in time; those should be
quite simple to write once unit testing is needed.

JIRA NVGPU-5421

Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
2a3bb9107f gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h>
top.h is a description of "devices" available on the GPU. As such
rename this header to device.h.

device.h will ultimately be a unit of actual C code that will rely
on the top HAL to fill a device list.

JIRA NVGPU-5421

Change-Id: If6e4a537d2209e429a678761a34713723da7a00a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
2f6be2735e gpu: nvgpu: remove nvgpu-next gr init
nvgpu-next gr init is handled within nvgpu-next
hals. So remove references to NVGPU_NEXT_INIT_GR_INFO from
nvgpu hals.

JIRA NVGPU-4979

Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Change-Id: I2e493220f855a7ff2f940cf07b1fc0b876601df5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319102
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
44f12288ad gpu: nvgpu: add mc.reset_engine hal for nvgpu-next
Engine reset process has changed for nvgpu-next. Add mc.reset_engine
gops for nvgpu-next.
Modify engine reset functions to use mc.reset_engine hal.

Jira NVGPU-5145

Change-Id: I176800212042eaef71c8cbd4bc499805c5af0e60
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2312485
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
31ba194a85 gpu: nvgpu: extend engine_info for nvgpu-next
Extend engine_info for nvgpu-next.

JIRA NVGPU-4970

Change-Id: I0e8e5ae9361776a48972ae6d0cee84ece19d7590
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2291811
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
5457172924 gpu: nvgpu: fifo: modify bit shift operand sign
Bit shift operands should be positive to ensure correct behavior. This
patch modifies bit shift operands to be unsigned.

Jira NVGPU-4817

Change-Id: I2fc6510986cee0fbd79743164f25b05231b4da92
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2285810
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
55510f266d gpu: nvgpu: unit: improve coverage for engines
Improve branch coverage for the following functions:
- nvgpu_engine_get_active_eng_info
- nvgpu_engine_get_ids
- nvgpu_ce_engine_interrupt_mask
- nvgpu_engine_get_gr_runlist_id

Add unit tests for the following functions:
-_nvgpu_engine_get_fast_ce_runlist_id
- nvgpu_engine_is_valid_runlist_id
- nvgpu_engine_id_to_mmu_fault_id
- nvgpu_engine_mmu_fault_id_to_engine_id
- nvgpu_engine_get_mask_on_id
- nvgpu_engine_get_id_and_type
- nvgpu_engine_find_busy_doing_ctxsw
- nvgpu_engine_get_runlist_busy_engines
- nvgpu_engine_mmu_fault_id_to_veid
- nvgpu_engine_mmu_fault_id_to_eng_id_and_veid
- nvgpu_engine_mmu_fault_id_to_eng_ve_pbdma_id

Jira NVGPU-4511

Change-Id: Ib340df17468ff3447e271a86af9a47a067f6ad11
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2262222
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
45b99f67b2 gpu: nvgpu: remove dead code for runlist_id check
nvgpu_engine_is_valid_runlist_id already iterates the list of
active engines, therefore the engine_id is already known to
be valid.

Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.

Also make sure to reset num_engines in nvgpu_cleanup_sw, to avoid
accessing uninitialized active_engines_list in unit test corner
cases (targetting init/remove support).

Jira NVGPU-4511

Change-Id: Ia6b904a7f3ca46e5097f06770b4caad317ec967b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2263618
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Scott Long
3b4b418330 gpu: nvgpu: fifo: misra 12.1 fixes
MISRA Advisory Rule states that the precedence of operators within
expressions should be made explicit.

This change removes the Advisory Rule 12.1 violations from fifo code.

Jira NVGPU-3178

Change-Id: I487d039c5be8024b21ec87d520d86763f9338d2a
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2276793
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
2088dc5d85 gpu: nvgpu: remove dead code for get gr runlist_id
nvgpu_engine_get_gr_runlist_id gets the first instance of
active GR engine using nvgpu_engine_get_ids. Therefore the
engine_id is already known to be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.

Jira NVGPU-4511

Change-Id: Ifcc0851e3d14d862e2ed7b21ea57f17a66eca9dd
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2263617
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
ca17622b7e gpu: nvgpu: set invalid veid for non GR engines
In nvgpu_engine_mmu_fault_id_to_eng_id_and_veid, set veid to
invalid for non-GR engines.

Jira NVGPU-4511

Change-Id: I2cec7898f8f7dec15224fdf70c444c0dd6de8a16
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262220
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
6fa5da61d7 gpu: nvgpu: use engine_id to access engine_info
Generalize use of "engine_id" variable name to index f->engine_info.

Jira NVGPU-4511

Change-Id: Ie3bc2c701dc3bab833d6ac134273dd6a102528c2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262219
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
66b68edd6b gpu: nvgpu: iterator name for active_engines
Some functions used engine_id or eng_id to index active_engines_list,
which could get confusing when used in conjunction with similar
variable as active_engine_id or act_eng_id.

Use generic iterator name i or j instead, to make it clear that
f->active_engines_list is NOT indexed by engine id.

Jira NVGPU-4511

Change-Id: I07a6bf00dfb6d4e608b10f2f79e38a70e557428c
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2262218
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Vedashree Vidwans
71040ef04f gpu: nvgpu: unit: mm: mmu_fault gv11b_fusa UT
This unit test covers most of the nvgpu.hal.mm.mmu_fault.gv11b_fusa
module lines and almost all branches.

Jira NVGPU-2218

Change-Id: I7c95876a0b1b4bb4b86eb15e21ca0da747d06162
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2258545
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
a560a378a1 gpu: nvgpu: ce: fifo: fix CE interrupt mask
Fix bug where the CE mask includes other engine types besides just CEs
in nvgpu_ce_engine_interrupt_mask().

The intent of this API is to return mask of CE interrupts. However, the
if clause in the for loop is only excluding engine interrupts if the CE
stall or non-stall ISR is NULL. So, it does not distinquish between CE
or GR engine interrupts if the CE ISR is non-null.

Since the expectation is to not return CE interrupts if the ISRs are
NULL, just return a 0 mask if either ISR is NULL without having to
bother with the loop.

If the ISRs are set in the CE HAL, within the loop, only add interrupts
to the mask returned if the engine type is actually a CE engine (i.e. do
not include GR engine interrupts).

JIRA NVGPU-2224

Change-Id: Ic0048b00f16590fec50bb0858bd3f4498a00650d
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2256269
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
945e9ebee2 gpu: nvgpu: checks in nvgpu_engine_init_info
Return error in nvgpu_engine_init_info if g->ops.top.get_device_info
is NULL. In particular, do not attempt to init CE info.

Jira NVGPU-3693

Change-Id: I521cb43233a48b6e765ffd0b1feee81a30dbd739
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2242699
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
a8c9c800cd gpu: nvgpu: reorganization of MC interrupts control
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.

For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.

JIRA NVGPU-4336

Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
2edf3db10a gpu: nvgpu: move mc gpu_ops out of gk20a.h and add doxygen comments for HALs
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.

JIRA NVGPU-2524

Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Adeel Raza
252ddc4f05 gpu: nvgpu: add coverity whitelisting support
Add macros for whitelisting coverity violations. These macros use pragma
directives. The pragma directives and whitelisting macros are only
enabled when a coverity scan is being run.

The whitelisting macros have been added to a new header called
static_analysis.h. The contents of safe_ops.h (CERT C safe ops) have
been moved into static_analysis.h because this will be the new header
for static analysis related macros/defines/etc.

JIRA NVGPU-3820

Change-Id: I9c63f20f670880b420415535738034619314b7c3
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180600
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
vinodg
087d4d3df4 gpu: nvgpu: rmmod support in dgpu simulation
Changes added to support "rmmod nvgpu" in dgpu simulation after gpu
poweron.

nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec
engines in simulation as simulation doesnt support timeout.
These engines are not valid engines in nvgpu engine list.
Add nvgpu_engine_check_valid_id before checking engine status.

Simulation crash on accessing 0xb81604 top interrupt register.
Add func_priv_cpu_intr_top__size_1_v() function to get the supported
size than using default MAX_INTR_TOP_REGS.

nvlink is not supprted in dgpu simulation. Avoid warning for
-ENODEV return.

Avoid register read following gpu power off completion.

Bug 2498574

Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179930
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-21 23:38:56 -07:00
Thomas Fleury
9836420185 gpu: nvgpu: no engine reset when recovery is disabled
Compile out nvgpu_engine_reset and nvgpu_gr_reset when
CONFIG_NVGPU_RECOVERY is not defined.

Jira NVGPU-3886

Change-Id: I7ff67cf3680dfff2130e2a9e16d68b5a3f684bd4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2175430
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-15 13:26:09 -07:00
Debarshi Dutta
5980d4c44f gpu: nvgpu: fix cert-c issues in common.fifo unit
Fix cert-c issues that violate the following rule for common/fifo/*
INT30-C: Unsigned integer operation may wrap.

Jira NVGPU-3881

Change-Id: Ifd1994960774cc0e190610c67d0e3f4334b73cf0
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2166535
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-07 04:06:07 -07:00
Vedashree Vidwans
40a58a7c06 gpu: nvgpu: fix MISRA errors common.fifo.engines
Rule 15.7 needs if-elseif constructs to be terminated with else
statement.
Rule 17.7 requires function return value to be checked for error
information.
This patch fixes MISRA errors mentioned above.

Jira NVGPU-3809

Change-Id: I201497c3fb679514268d06ff2b3e9cf378591c5b
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2153554
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-07-16 16:16:26 -07:00
Seema Khowala
e0419c4199 gpu: nvgpu: FIFO SWUD
- Add template for FIFO SWUD (SW Unit Design Document).
- Add doxygen documentation in top.h, fifo.h and engines.h
- Removed
 -- nvgpu_pbdma_exception_info
 -- nvgpu_engine_exception_info

JIRA NVGPU-3589

Change-Id: Ie4e80e75bc13d6cefe1835e5f176f313456f2351
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2134671
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-07-11 08:56:04 -07:00