Commit Graph

23 Commits

Author SHA1 Message Date
Shashank Singh
3aec79d242 gpu: nvgpu: add check for valid engine id
-Check validity of engine-id when iterating through all engines and
passing the engine-id as an argument to other function(s).
-Skip test test_gv100_dump_engine_status which fails due to this change.

Bug 200660469

Change-Id: I64ebb1a0297f605dd3cba7ef73954ff5594828bc
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2424655
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
0f501d806f gpu: nvgpu: Unit test fixes and staging for device rework
Fix up what unit tests can be easily fixed up. Stage everything else.

In short the unit test code is _incredibly_ fragile since it's designed
to hit every branch, positive and negative, in the code. However, the
result of that is unit tests that are painful to modify.

A lot of unit tests are also extremely opaque and rely on internal
nvgpu behavior. This patch will be updated with fixes as I make them.

Or, alternatively, it may be worth just temporarily disabling unit
tests on dev-main. We'll have a _lot_ of work for Orin that will
essentially gut the gr, host, and interrupt code. If we retain the
unit test code for this, it may end up being backgreaking.

JIRA NVGPU-5421

Change-Id: I8055fc72521f6a3a8a0d8f07fbe50c649a675016
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347274
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
359fc24aaf gpu: nvgpu: Rework engine management to work with vGPU
Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.

With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.

This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.

This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.

Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.

JIRA NVGPU-5421

Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
fbb6a5bc1c gpu: nvgpu: Remove fifo->pbdma_map
The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists.
This array allows other software to query what PBDMA(s) serves a given
runlist. The PBDMA map is read verbatim from an array of host registers.
These registers are stored in a kmalloc()'ed array.

This causes a problem for the device management code. The device
management initialization executes well before the rest of the FIFO
PBDMA initialization occurs. Thus, if the device management code
queries the PBDMA mapping for a given device/runlist, the mapping has
yet to be populated.

In the next patches in this series the engine management code is subsumed
into the device management code. In other words the device struct is
reused by the engine management and all host SW does is pull pointers to
the host managed devices from the device manager. This means that all
engine initialization that used to be done on top of the device
management needs to move to the device code.

So, long story short, the PBDMA map needs to be read from the registers
directly, instead of an array that gets allocated long after the device
code has run.

This patch removes the pbdma map array, deletes two HALs that managed
that, and instead provides a new HAL to query this map directly from
the registers so that the device code can use it.

JIRA NVGPU-5421

Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
59eb714c48 unit: Disable some unit tests for device work
Fix what unit tests can be easily fixed, but disable some others. It's
not clear why the MM related tests started failing - there's really zero
reason for this. The list of disable tests are primarily engine related
but there are some others that get inflenced by the device and engine
structure.

  test_poweroff.init_poweroff=2
  test_is_stall_and_eng_intr_pending.intr_is_stall_and_eng_intr_pending=2
  test_isr_nonstall.isr_nonstall=2
  test_isr_stall.isr_stall=2
  test_engine_enum_from_type.enum_from_type=2
  test_engine_find_busy_doing_ctxsw.find_busy_doing_ctxsw=2
  test_engine_get_active_eng_info.get_active_eng_info=2
  test_engine_get_fast_ce_runlist_id.get_fast_ce_runlist_id=2
  test_engine_get_gr_runlist_id.get_gr_runlist_id=2
  test_engine_get_mask_on_id.get_mask_on_id=2
  test_engine_get_runlist_busy_engines.get_runlist_busy_engines=2
  test_engine_ids.ids=2
  test_engine_init_info.init_info=2
  test_engine_interrupt_mask.interrupt_mask=2
  test_engine_is_valid_runlist_id.is_valid_runlist_id=2
  test_engine_mmu_fault_id.mmu_fault_id=2
  test_engine_mmu_fault_id_veid.mmu_fault_id_veid=2
  test_engine_setup_sw.setup_sw=2
  test_engine_status.status=2
  test_fifo_init_support.init_support=2
  test_fifo_remove_support.remove_support=2
  test_gp10b_engine_init_ce_info.engine_init_ce_info=2
  test_nvgpu_mem_iommu_translate.mem_iommu_translate=2
  test_nvgpu_mem_phys_ops.nvgpu_mem_phys_ops=2

And delete unit tests for functions that no longer exist:

  test_device_info_parse_enum.top_device_info_parse_enum
  test_get_device_info.top_get_device_info
  test_get_num_engine_type_entries.top_get_num_engine_type_entries
  test_is_engine_ce.top_is_engine_ce
  test_is_engine_gr.top_is_engine_gr

JIRA NVGPU-5421

Change-Id: I343c0b1ea44c472b22356c896672153fc889ffc0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355300
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
319520ff57 gpu: nvgpu: Add a new device manager unit
This adds a new device management unit in the common code responsible
for facilitating the parsing of the GPU top device list and providing
that info to other units in nvgpu.

The basic idea is to read this list once from HW and store it in a
set of lists corresponding to each device type (graphics, LCE, etc).
Many of the HALs in top can be deleted and instead implemented using
common code parsing the SW representation.

Every time the driver queries the device list it does so using a
device type and instance ID. This is common code. The HAL is responsible
for populating the device list in such a way that the driver can
query it in a chip agnostic manner.

Also delete some of the unit tests for functions that no longer
exist. This code will require new unit tests in time; those should be
quite simple to write once unit testing is needed.

JIRA NVGPU-5421

Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
2a3bb9107f gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h>
top.h is a description of "devices" available on the GPU. As such
rename this header to device.h.

device.h will ultimately be a unit of actual C code that will rely
on the top HAL to fill a device list.

JIRA NVGPU-5421

Change-Id: If6e4a537d2209e429a678761a34713723da7a00a
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
76295a5aeb gpu: nvgpu: redundant dependency on driver
Makefile.units.common.tmk already specifies dependency
on nvgpu driver interface.

Remove redundant dependency in units makefiles.

Jira NVGPU-5217

Change-Id: I94cbe707c25f41f0e61915c243fd55fd4bda9ccf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322205
(cherry picked from commit d9bdd8f589c121802c74da53945baa677578f71c)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325907
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
75175d8081 gpu: nvgpu: unit: more traceability for fifo
Add/fix 'Targets:' statements for:
- nvgpu_engine_status_get_ctx_id_type
- nvgpu_engine_status_get_next_ctx_id_type
- gm20b_pbdma_reset_method
- gm20b_pbdma_restartable_0_intr_descs
- gm20b_pbdma_disable_and_clear_all_intr
- gm20b_pbdma_clear_all_intr
- gm20b_pbdma_get_fc_target
- nvgpu_bug_exit
- nvgpu_bug_cb_from_node

Jira NVGPU-4890

Change-Id: I984542aa7daace85843fe6858d7c27b310046b28
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2284481
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
55510f266d gpu: nvgpu: unit: improve coverage for engines
Improve branch coverage for the following functions:
- nvgpu_engine_get_active_eng_info
- nvgpu_engine_get_ids
- nvgpu_ce_engine_interrupt_mask
- nvgpu_engine_get_gr_runlist_id

Add unit tests for the following functions:
-_nvgpu_engine_get_fast_ce_runlist_id
- nvgpu_engine_is_valid_runlist_id
- nvgpu_engine_id_to_mmu_fault_id
- nvgpu_engine_mmu_fault_id_to_engine_id
- nvgpu_engine_get_mask_on_id
- nvgpu_engine_get_id_and_type
- nvgpu_engine_find_busy_doing_ctxsw
- nvgpu_engine_get_runlist_busy_engines
- nvgpu_engine_mmu_fault_id_to_veid
- nvgpu_engine_mmu_fault_id_to_eng_id_and_veid
- nvgpu_engine_mmu_fault_id_to_eng_ve_pbdma_id

Jira NVGPU-4511

Change-Id: Ib340df17468ff3447e271a86af9a47a067f6ad11
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2262222
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Vedashree Vidwans
652cff2cd0 gpu: nvgpu: unit: fifo: move assert to unit_assert
unit_assert macro is provided to check a condition and execute bail_out
action given as a second argument.
Currently, in fifo unit, unit_assert() is redefined as assert with
common bail_out action. However, name assert() creates confusion with
linux assert macro. So, this patch removes redefined assert macro and
replaces with unit_assert.

Jira NVGPU-4684

Change-Id: I3a880f965a191f16efdabced5e23723e66ecaf3c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2276863
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
2540a98aa4 gpu: nvgpu: unit: update 'Targets' for fifo units
This patch adds a number of missing 'Targets' fields in the SWUTS of
various fifo units, fixes some missing ones and adds gops based
targets.

Jira NVGPU-4376

Change-Id: I445196e7092b01853786f40b860b29abc5d68371
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2276680
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Vedashree Vidwans
0ca906a6ad gpu: nvgpu: unit: fifo: fifo unit test
This unit test covers most of the nvgpu.common.fifo.fifo module lines
and almost all branches.

Jira NVGPU-3697

Change-Id: I5722277a3e1630a902f63b707eb3de1c4e1876b0
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2237796
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
a5470fab90 gpu: nvgpu: unit: branch coverage for gp10b engine HAL
Add remaining branch coverage for:
- gp10b_engine_init_ce_info (invalid enum read from dev info).

Jira NVGPU-4673

Change-Id: Ibeb673374f547d18a9897eb9dedc7502345461b2
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265793
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
8ea850ccb6 gpu: nvgpu: unit: branch coverage for gv100 engine HAL
Add remaining branch coverage for:
- gv100_dump_engine_status (case w/ no engine)

Jira NVGPU-4673

Change-Id: I1d1eb61752d00a427e79f92f470ff072253791e1
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265792
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Nicolas Benech
b682091b13 gpu: nvgpu: SWUTS: clean up test types
Apply the following changes to test types:
* "Init" --> "Other (setup)"
* "Coverage" --> Removed since it's implied for all tests
* "Feature based" --> "Feature"
* "Boundary Value analysis" and "Boundary values based" --> "Boundary values"
* "Error guessing based" --> "Error guessing"

JIRA NVGPU-3510

Change-Id: I3a9c0c59e6ad806f3479caa5e9a62f4d89f76923
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2265670
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
2846c26c1e gpu: nvgpu: unit: enable engine tests on target
Add the following tests to target makefile:
- engine
- engine/gm20b
- engine/gp10b
- engine/gv100
- engine/gv11b

Fix build issues for unit tests on QNX safety.
Update export files to fix link issues.
Update list of required tests in JSON file.

Jira NVGPU-3695

Change-Id: I373c6c8575ed4cbf6c5597502f2ca6ec2f078ca4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253506
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
e211535142 gpu: nvgpu: unit: add tests for gm20b engine HAL
Add unit test for the following HAL:
- gm20b_read_engine_status_info

Jira NVGPU-3695

Change-Id: I8752e3ac83fb647704ad5547d650574d2b5a95c7
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253505
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
94056dedf5 gpu: nvgpu: unit: add tests for gp10b engine HAL
Add unit test for the following HAL:
- gp10b_engine_init_ce_info

Jira NVGPU-3695

Change-Id: Id63818b9b2478408f84489cf70c496f4a5645a47
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253504
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
2c83d780b0 gpu: nvgpu: unit: add tests for gv11b engine HAL
Add unit test for the following HAL:
- gv11b_is_fault_engine_subid_gpc

Jira NVGPU-3695

Change-Id: I968fe6d189f311c9b9627f6b59998ef3cbd0b3f5
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253503
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
71b7e85b50 gpu: nvgpu: unit: add tests for gv100 engine HAL
Add unit tests for the following HALs:
- gv100_read_engine_status_info
- gv100_dump_engine_status

Jira NVGPU-3695

Change-Id: I66ba608802431d0f7ac2471d9ce726817f32c73e
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2253502
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
8fa2fcd9da gpu: nvgpu: unit: use gr/ce engine interrupt mask
nvgpu_engine_interrupt_mask has been split into
nvgpu_gr_engine_interrupt_mask and nvgpu_gr_engine_interrupt_mask.
Update test_engine_interrupt_mask to combine them.

Jira NVGPU-3693

Change-Id: I1e09ff3efd83415120773da75f8b512a481ee14c
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2249913
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Thomas Fleury
27cc81dafa gpu: nvgpu: unit: add tests for common.fifo.engine
Add unit tests for:
- nvgpu_engine_setup_sw
- nvgpu_engine_cleanup_sw
- nvgpu_engine_init_info
- nvgpu_engine_get_ids
- nvgpu_engine_check_valid_id
- nvgpu_engine_get_gr_id
- nvgpu_engine_get_active_eng_info
- nvgpu_engine_enum_from_type
- nvgpu_engine_interrupt_mask
- nvgpu_engine_act_interrupt_mask
- nvgpu_engine_get_all_ce_reset_mask

Jira NVGPU-3693

Change-Id: I8cbaea1918b284ac09a502225fa674f84c08eb8e
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2242701
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00