Commit Graph

117 Commits

Author SHA1 Message Date
Sagar Kamble
72c3bce602 gpu: nvgpu: compile out non-safe ctxsw_prog hals
Following two hals are non-safe. Compile them under
CONFIG_NVGPU_HAL_NON_FUSA:
1. init_ctxsw_hdr_data
2. disable_verif_features

JIRA NVGPU-5358

Change-Id: I751c4655dc628f7ab66ed3a779268a6a88f9a1e3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581361
(cherry picked from commit abf16c6a01109d174879609c10354f06739fb6dc)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581842
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:12 -07:00
Sagar Kamble
62b04331de gpu: nvgpu: compile out priv_access_map config/addr hals
These hals are non-safe. Compile them out with
CONFIG_NVGPU_SET_FALCON_ACCESS_MAP.

JIRA NVGPU-5358

Change-Id: I75b46e201fa132e09fee15679a402d24bbf9b2ab
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581360
(cherry picked from commit d048333ef391019b2618abf7d09c8fe2042f8ee0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581841
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:00 -07:00
Debarshi Dutta
791dc18666 gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields
Add Setter and Getter methods for accessing tsg->sm_error_states.
Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state.
This renders it unnecessary to add BVEC for above fields for the struct
in multiple locations. The current design ensures that only a constant
pointer is obtained from the owner unit i.e. FIFO.

The following new methods are added. Both unit tests and BVEC tests
are added for them as well.

nvgpu_tsg_store_sm_error_state
nvgpu_tsg_get_sm_error_state

Jira NVGPU-6947

Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-13 20:57:09 -07:00
Tejal Kudav
b33079d47e gpu: nvgpu: Move intr data members from MC to CIC
Move interrupt specific data-members from common.mc to common.cic
Some of these data members like sw_irq_stall_last_handled_cond need
To be initialized much earlier during the OS specific init/probe stage.
Also, some more members from struct nvgpu_interrupts(like stall_size,
stall_lines[]), which will soon be moved to CIC will also need to be
initialized early during the OS specific probe stage.
However, the chip specific LUT can only be initialized after the
hal_init stage where the HALs are all initialized.
Split the CIC init to accommodate the above initialization requirements.

JIRA NVGPU-6899

Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 18:06:28 -07:00
tkudav
c418946cb6 gpu: nvgpu: BVEC test for common.gr
Add BVEC test for nvgpu_gr_config_get_sm_info()

JIRA NVGPU-6951

Signed-off-by: tkudav <tkudav@nvidia.com>
Change-Id: I06182c68c041063556edd723f38fe1552ded7af0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554616
(cherry picked from commit f56884a5ea182695879a92412e8e5d94860ee606)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560011
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-16 16:48:26 -07:00
Deepak Nibade
4edf952e3e gpu: nvgpu: fix rule 5.1 misra violations in common.gr
Fix rule 5.1 misra violations in common.gr by renaming below functions :

nvgpu_gr_config_get_gpc_tpc_mask_base ->
  nvgpu_gr_config_get_base_mask_gpc_tpc

nvgpu_gr_config_get_gpc_tpc_count_base ->
  nvgpu_gr_config_get_base_count_gpc_tpc

gm20b_ctxsw_prog_set_priv_access_map_config_mode ->
  gm20b_ctxsw_prog_set_config_mode_priv_access_map

gm20b_ctxsw_prog_set_priv_access_map_addr ->
  gm20b_ctxsw_prog_set_addr_priv_access_map

gm20b_gr_falcon_read_fecs_ctxsw_mailbox ->
  gm20b_gr_falcon_read_mailbox_fecs_ctxsw

gm20b_gr_falcon_read_fecs_ctxsw_status0 ->
  gm20b_gr_falcon_read_status0_fecs_ctxsw

gm20b_gr_falcon_read_fecs_ctxsw_status1 ->
  gm20b_gr_falcon_read_status1_fecs_ctxsw

gv11b_gr_intr_get_sm_hww_warp_esr_pc ->
  gv11b_gr_intr_get_warp_esr_pc_sm_hww

gv11b_gr_intr_get_sm_hww_warp_esr ->
  gv11b_gr_intr_get_warp_esr_sm_hww

Jira NVGPU-6779

Change-Id: Icbe23a7b022373785968fc417ee247e2d80cfcc6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554521
(cherry picked from commit 1432650774506f2a7e45f70b084f498736d0d0c5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555330
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-13 09:20:41 -07:00
tkudav
4856e709f8 gpu: nvgpu: BVEC test for common.gr
Add BVEC test for below APIs:
nvgpu_gr_setup_alloc_obj_ctx
nvgpu_gr_setup_set_preemption_mode

JIRA NVGPU-6389

Signed-off-by: tkudav <tkudav@nvidia.com>
Change-Id: Ib42431e1fb85e42b2767fa6ba2212c3ec578f487
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548282
(cherry picked from commit 7d59394a11e680b8118684e38d7fa5de2be20da9)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555075
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-07 12:26:13 -07:00
tkudav
0526e7eaa9 gpu: nvgpu: Create CIC-mon and CIC-rm subunits
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.

CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).

Split the CIC APIs and data-members into above two subunits.

JIRA NVGPU-6899

Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 09:57:56 -07:00
Deepak Nibade
8ccf9820ba gpu: nvgpu: check for valid sm_id in nvgpu_gr_config_get_sm_info
Check if requested sm_id is valid in nvgpu_gr_config_get_sm_info()
function. Also update doxygen documentation for same.

Also, ensure SM count is set using nvgpu_gr_config_set_sm_info() before
usig nvgpu_gr_config_get_sm_info() to retrieve it.

Update unit test test_gr_config_set_get to set valid SM count instead of
random number. With random number it is possible that SM count is set
higher than size of SM info struct. This could result into test process
crash.

Change-Id: I4292977b7e880752c65001cbd594e0617fe135f5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2549882
(cherry picked from commit ee9767cac1a27ffbc99f707c1aa158b8216d757f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551983
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-01 06:51:05 -07:00
Lakshmanan M
7d473f4dcc gpu: nvgpu: Expose logical mask for MIG
1) Expose logical mask instead of physical mask when MIG is enabled.
   For legacy, NvGpu expose physical mask.
2) Added fb related info in struct nvgpu_gpu_instance().
4) Added utility api to get the logical id for a given local id
   nvgpu_grmgr_get_gr_gpc_logical_id()
5) Added grmgr api to get max_gpc_count
   nvgpu_grmgr_get_max_gpc_count().
5) Added grmgr's fbp api to get num_fbps and its enable masks.
   nvgpu_grmgr_get_num_fbps()
   nvgpu_grmgr_get_fbp_en_mask()
   nvgpu_grmgr_get_fbp_rop_l2_en_mask()
6) Used grmgr's fbp apis in ioctl_ctrl.c
7) Moved fbp_init_support() in nvgpu_early_init()
8) Added nvgpu_assert handling in grmgr.c
9) Added vgpu hal for get_max_gpc_count().

JIRA NVGPU-5656

Change-Id: I90ac2ad99be608001e7d5d754f6242ad26c70cdb
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538508
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-10 03:05:21 -07:00
Tejal Kudav
e0a1fcf5f5 gpu: nvgpu: Add Central Intr Controller unit
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.

This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.

New APIs are exposed by CIC unit to access its internal data like:
  1. Struct err_desc - the static err handling /injection data per
                       error id
  2. Num_hw_modules  - the number of error reporting HW units
                       supported by CIC

Init and deinit of CIC unit:
  1. CIC unit should be initialized earlyon during boot so that it
     is available for any interrupt handling.
  2. Initialize CIC just before the interrupts are enabled during
     boot.
  3. Similarly, CIC is disabled late during deinit cycle; right
     after the interrupts are masked.

LUT:
  1. LUT is currently used only for reporting error to safety
     services in gv11b QNX safety build.
  2. This error handling policy LUT currently has only two levels
     of handing - correctable and quiecse.
  3. Once, the error handling policy decision is moved from leaf
     unit nodes to CIC, LUT will be updated to have additional levels
     like fast recovery and full recovery.
  4. Also, then a separate LUT will be added for each platform/build.
  5. In current framework, the LUT is set to NULL for all
     configurations except gv11b.

report_err() ops is added to report error to safety services.
This ops is only effective for gv11b qnx build; and set to NULL for
other configurations.

NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754

Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-25 14:28:04 -07:00
Debarshi Dutta
096f4ef055 gpu: nvgpu: fix l2_flush errors during rmmod
The function gk20a_mm_l2_flush incorrectly returns an error value
when it skips l2_flush when hardware is powered off.
This causes the following prints to occur even when the behavior is expected.

gv11b_mm_l2_flush:43 [ERR] gk20a_mm_l2_flush failed
nvgpu_gmmu_unmap_locked:1043 [ERR] gk20a_mm_l2_flush[1] failed

The above errors occur from the following paths
1) gk20a_remove -> gk20a_free_cb -> gk20a_remove_support ->
	nvgpu_pmu_remove_support -> nvgpu_pmu_pg_deinit ->
	nvgpu_dma_unmap_free

2) gk20a_remove -> gk20a_free_cb -> gk20a_remove_support ->
	nvgpu_remove_mm_support -> gv11b_mm_mmu_fault_info_mem_destroy ->
        nvgpu_dma_unmap_free

Since, these do not belong in the Poweron/Poweroff path, its okay to
skip flushing them when the hardware has powered off.

Fixed the userspace tests by allocating g->mm.bar1.vm to prevent NULL access
in gv11b_mm_l2_flush->tlb_invalidate.

Jira LS-77

Change-Id: I3ca71f5118daf4b2eeacfe5bf83d94317f29d446
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2523751
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-05-10 10:06:24 -07:00
Deepak Nibade
c08719cb0b gpu: nvgpu: move graphics specific HALs to fusa files
All graphics code is under CONFIG_NVGPU_GRAPHICS and all the HALs are
in non-fusa files. In order to support graphics in safety,
CONFIG_NVGPU_GRAPHICS needs to be enabled. But since most of the HALs
are in non-fusa files, this causes huge compilation problem.

Fix this by moving all graphics specific HALs used on gv11b to fusa
files. Graphics specific HALs not used on gv11b remain in non-fusa files
and need not be protected with GRAPHICS config.

Protect call to nvgpu_pmu_save_zbc() also with config
CONFIG_NVGPU_POWER_PG, since it is implemented under that config.

Delete hal/ltc/ltc_gv11b.c since sole function in this file is moved to
fusa file.

Enable nvgpu_writel_loop() in safety build since it is needed for now.
This will be revisited later once requirements are clearer.

Move below CTXSW methods under CONFIG_NVGPU_NON_FUSA for now. Safety
CTXSW ucode does not support these methods. These too will be revisited
later once requirements are clearer.
NVGPU_GR_FALCON_METHOD_PREEMPT_IMAGE_SIZE
NVGPU_GR_FALCON_METHOD_CTXSW_DISCOVER_ZCULL_IMAGE_SIZE

Jira NVGPU-6460

Change-Id: Ia095a04a9ba67126068aa7193f491ea27477f882
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2513675
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-22 14:59:38 -07:00
Deepak Nibade
bb43f11a61 gpu: nvgpu: update common.gr doxygen
Add below updates to common.gr doxygen:

- Add doxygen comments for APIs that are mentioned in RM SWAD and in
  RM-common.gr traceability document.
- Comment about valid ranges for input parameters of bunch of functions.
- Add nvgpu_assert() to ensure correct value is passed as input
  parameter to number of functions.
- Add references to relevant functions with @see.
- Update Targets field for unit tests to cover newly doxygenated
  functions.
- Update unit test test_gr_init_hal_pd_skip_table_gpc to take care of
  new asserts added into some APIs.

Jira NVGPU-6180

Change-Id: Ie889bed96b6428b1fd86dcf30b322944464e9d12
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2469397
(cherry picked from commit 5d7d7e9ce1c4efe836ab842d7962a3aee4e8972f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2469394
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-02-02 23:34:27 -08:00
Alex Waterman
d925e33e8b userspace: Prune unit tests for new runlist code
Remove and prune the now broken tests related to the runlist updates.

JIRA NVGPU-6425

Change-Id: I76e03c943ceae261e35958aa64717b5590a19c0e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2474334
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-01-29 09:51:50 -08:00
Sagar Kamble
cf287a4ef5 gpu: nvgpu: retry tsg unbind if NEXT is set
The NEXT bit can remain set for the channel if timeslice expires before
scheduler clears it. Due to this nvgpu fails TSG unbind and in turn
nvrm_gpu fails channel close. In this case, checking the channel hw
state after some time can help see NEXT bit cleared by scheduler.

Reenable the tsg and return -EAGAIN to nvrm_gpu for it to retry again.

Bug 3144960

Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-01-18 23:11:57 -08:00
Deepak Nibade
d584294545 gpu: nvgpu: set preemption mode for specific GR instance
Pass gr_instance_id to function nvgpu_gr_setup_set_preemption_mode()
which picks up correct nvgpu_gr struct pointer based on instance id.

nvgpu_gr_get_cur_instance_ptr() is not needed in this special case
since there is no PGRAPH register programming required to set preemption
mode. All writes/updates are done on context image.

Also fix unit tests accordingly to always select 0th GR instance.

Jira NVGPU-5648

Change-Id: I46eff816d5a4afe784bf75b64ee9d698c77eb64a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2435468
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:48 -06:00
tkudav
2ca4f145e4 gpu: nvgpu: Fix HAL checker pointed mismatches
Add new HALs for register field definition/value changes in
GV11B as compared to Pascal. Update the HALs for recent
chips too if applicable.

Bug 200604892

Change-Id: I14ee9440859007e86a1ffa937df399a31e2628bd
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2437564
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
a252cc244a gpu: nvgpu: modify alloc_as ioctl to accept mem size
- Modify NVGPU_GPU_IOCTL_ALLOC_AS and struct nvgpu_alloc_as_args to
accept start address and size of user memory. This allows configurable
address space allocation.
- Modify gk20a_as_alloc_share() and gk20a_vm_alloc_share() to receive
va_range_start and va_range_end values.
- gk20a_vm_alloc_share() initializes vm with low_hole = va_range_start,
and user vma size = (va_range_end - va_range_start).
- Modify nvgpu_as_alloc_space_args and nvgpu_as_free_space_args to
accept 64 bit number of pages.

Bug 2043269
JIRA NVGPU-5302

Change-Id: I243995adf5b7e0e84d6b36abe3b35a5ccabd7a37
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385496
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Sami Kiminki <skiminki@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
00d1e10ff2 gpu: nvgpu: accept small_big_split in vm_init
Currently, when unified address space is not requested, nvgpu_vm_init
splits user vm at a fixed address of 56G.
Modify nvgpu_vm_init to allow user to specify small big page vm split.

JIRA NVGPU-5302

Change-Id: I6ed33a4dc080f10a723cb9bd486f0d36c0cee0e9
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2428326
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Sami Kiminki <skiminki@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Peter Daifuku
a331fd4b3a gpu: nvgpu: pd_cache enablement for >4k allocations in qnx
Mapping of large buffers to GMMU end up needing many
pages for the PTE tables. Allocating these one by one
can end up being a performance bottleneck, particularly
in the virtualized case.

This is adding the following changes:

 - As the TLB invalidation doesn't have access to mem_off,
   allow top-level allocation by alloc_cache_direct().
 - Define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab
   for the PD cache, effectively set to 64K bytes
 - Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE
   When freeing up cached entries, avoid prefetch errors by
   invalidating the entry (memset to 0).
 - Try to fall back to direct allocation of smaller chunk for
   contiguous allocation failures.
 - Unit test changes.

Bug 200649243

Change-Id: I0a667af0ba01d9147c703e64fc970880e52a8fbc
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404371
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
c8b2bd7a03 gpu: nvgpu: check default and valid preemption modes
APIs to set preemption modes right now have config based code to set
default preemption modes or to check if given preemption mode is valid
or not. This makes code unreadable and complex.

Rework nvgpu_gr_obj_ctx_init_ctxsw_preemption_mode() so that it checks
for initial preemption modes in the beginning. If no preemption mode is
passed while allocating context, get default preemption modes with
gops.gr.init.get_default_preemption_modes() and use them.

Rework nvgpu_gr_ctx_check_valid_preemption_mode() so that it is more
readable. Use gops.gr.init.get_supported_preemption_modes() to validate
incoming preemption modes against supported preemption modes.

Log preemption modes getting set in
nvgpu_gr_obj_ctx_set_ctxsw_preemption_mode().

Disable failing unit test. It will need rework according to new code.

Jira NVGPU-5648

Change-Id: Ie1a3e1aeae7826a123e104d9d016f181bea3b271
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2419034
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
d2bb5df3c7 gpu: nvgpu: remove NVGPU_GR_NUM_INSTANCES
common.gr defined a temporary macro NVGPU_GR_NUM_INSTANCES to enable or
disable multiple GR instances from common.gr unit.
Multiple GR instance boot is now verified, so we can remove this
temporary solution.

Note that nvgpu_grmgr_get_num_gr_instances() will return more than 1
instance only if NVGPU_SUPPORT_MIG is enabled.

Update unit tests to set number of syspipes to 1 to allow enumeration
of GR instance by grmgr.

Jira NVGPU-5648

Change-Id: I795901ae516843ae7b6c1794dae0f023a213ab1d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2418377
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
srajum
6aec282dc1 Revert "gpu: nvgpu: Fix for unit test failures"
This reverts commit 0e353e0022da6064a2c0f71ed43a2a76ceec1a97.

- created unit test change to exercise change with "23293fef"
  but there was issues with that change and now made correponding 
  driver change and no longer this unit test change required.

JIRA NVGPU-6051

Change-Id: Id8131ad027069062435947d79d627b23470a7199
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2415023
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
49c9f0c137 gpu: nvgpu: accept user vma size in vm init
Modify nvgpu_vm_init to accept low_hole, user_reserved and
kernel_reserved. This will simplify argument limit checks and make code
more legible.

JIRA NVGPU-5302

Change-Id: I62773dd7b06264a3b6cb8896239b24c49fa69f9b
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394901
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seeta Rama Raju
1bd0261cbe gpu: nvgpu: Fix for unit test failures
JIRA NVGPU-6051

Change-Id: Ic061594096ef49f7984cde4405f4934ded220e91
Signed-off-by: Seeta Rama Raju <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2411562
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
bafeea3530 gpu: nvgpu: setup HW for each GR instance
Get number of SMs from GR instance specific nvgpu_gr_config pointer
instead of global SM count in below functions :
nvgpu_gr_fs_state_init()
gv11b_gr_init_sm_id_config()

Update nvgpu_gr_config_get_gpc_skip_mask() to return 0 in case gpc_index
is greater than available gpc_count. This is not MIG specific, but based
on code review possible even today for existing chips.
See gm20b_gr_init_pd_skip_table_gpc()

Update nvgpu_gr_get_override_ecc_val() to return GR instance specific
value.

Execute gr_init_setup_hw() for each GR instance.

Disable below failing unit tests:
nvgpu_gr_fs_state.test_gr_fs_state_error_injection
nvgpu_gr_init.test_gr_init_hal_config_error_injection

Jira NVGPU-5648

Change-Id: Ie8f1c0c304c634756786d85facf336a5c9ae8195
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410702
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
6745b0685e gpu: nvgpu: support resetting each GR instance
Add a new header file <nvgpu/gr/gr_instances.h> that supports below
macros to execute various functions for GR instances

1) nvgpu_gr_exec_for_each_instance
   Execute a function for each GR instance by configuring GR remap
   window for that instance. Function being executed returns void.

2) nvgpu_gr_exec_with_ret_for_each_instance
   Execute a function for each GR instance by configuring GR remap
   window for that instance. Function being executed returns an error.

3) nvgpu_gr_exec_for_all_instances
   Execute a function for all GR instances at once. For this GR remap
   window needs to be disabled temporarily.

If CONFIG_NVGPU_MIG is disabled, all above macros will turn into simple
funciton calls.
If CONFIG_NVGPU_MIG is disabled or if runtime flag  NVGPU_SUPPORT_MIG is
disabled, all above macros will turn into simple function calls that
configure single GR instance.

Separate out GR engine reset code into new API gr_reset_engine() and
execute it with nvgpu_gr_exec_with_ret_for_each_instance().

PROD values need to be loaded in legacy mode, hence call
nvgpu_cg_init_gr_load_gating_prod() inside
nvgpu_gr_exec_for_all_instances().

Rename gr_init_prepare_hw() to more appropriate
gr_reset_hw_and_load_prod()

Moe gops.gr.init.fifo_access() call to gr_init_reset_enable_hw().

Add new API nvgpu_grmgr_get_gr_syspipe_id() to query GR instance syspipe
id from common.grmgr unit. Add nvgpu_gr_get_syspipe_id() that returns
same value stored in nvgpu_gr struct.

Add cur_gr_instance field to struct nvgpu_gr to track current GR
instance being programmed under remap window.

Jira NVGPU-5648

Change-Id: I86920303427a6e6547ebf195daa37438365bb38e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2403550
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
a2809088eb gpu: nvgpu: remove unnecessary hal gops.gr.gr_enable_hw()
gops.gr.gr_enable_hw() is a common function and not referred on vGPU.
Remove HAL pointer and directly use nvgpu_gr_enable_hw() instead.

Jira NVGPU-5648

Change-Id: Id031024ed01f9d890cffb5902cc433800810b219
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2403548
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
8cccb49bd2 gpu: nvgpu: collapse nvgpu_gr_prepare_sw into nvgpu_gr_alloc
common.gr unit exports a separate API nvgpu_gr_prepare_sw to
initialize some SW pieces required for nvgpu_gr_enable_hw().
A separate API is really unnecessary since same initialization
can be performed in nvgpu_gr_alloc().

Remove nvgpu_gr_prepare_sw() and HAL gops.gr.gr_prepare_sw().
Initialize falcon and interrupt structures in loop from
nvgpu_gr_alloc().

Move nvgpu_netlist_init_ctx_vars() from nvgpu_gr_prepare_sw() to
common init path since netlist parsing need not be done from
common.gr unit. It just needs to happen before nvgpu_gr_enable_hw().

Also, trigger nvgpu_gr_free() from gr_remove_support() instead
of OS specific paths. Also remove nvgpu_gr_free() calls from
probe error paths since nvgpu_gr_alloc is no longer called in
probe path.

Move interrupt and falcon data structure free calls to nvgpu_gr_free().

Also remove corresponding unit testing code that tests
nvgpu_gr_prepare_sw() specifically.
Update some unit tests to initialize ecc counters and netlist.
Disable some unit tests that fail for reasons unknown.

Jira NVGPU-5648

Change-Id: I82ec8160f76530bc40e0c11a9f26ba1c8f9cf643
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400166
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
cfa360f5b8 gpu: nvgpu: allocate struct nvgpu_gr based on enumerated gr count
Add new API nvgpu_grmgr_get_num_gr_instances() that returns number of
GR instance enumerated by GR manager. This just returns number of sys
pipes enabled since it is same as number of GR instances.

For consistency until common.gr supports multiple GR instances
completely, add a temporary macro NVGPU_GR_NUM_INSTANCES and set it
to 1. If this macro is changed to 0 (for local MIG testing), fall
back to use nvgpu_grmgr_get_num_gr_instances() to get enumerated number
of GR instances.

Use a for loop to initialize other variables of struct nvgpu_gr.

Remove unnecessary NULL check in nvgpu_gr_alloc() since struct gk20a
pointer can never be NULL in this path. Also remove corresponding unit
test code.

Jira NVGPU-5648

Change-Id: Id151d634a23235381229044f2a9af89e390886f2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400151
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
1278204c28 gpu: nvgpu: modify unit tests for non-safety build
Modify unit tests to successfully compile with non-safety build.

JIRA NVGPU-5363

Change-Id: Ib869880372972895861db246ff06b5373756e0fe
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369659
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
0f501d806f gpu: nvgpu: Unit test fixes and staging for device rework
Fix up what unit tests can be easily fixed up. Stage everything else.

In short the unit test code is _incredibly_ fragile since it's designed
to hit every branch, positive and negative, in the code. However, the
result of that is unit tests that are painful to modify.

A lot of unit tests are also extremely opaque and rely on internal
nvgpu behavior. This patch will be updated with fixes as I make them.

Or, alternatively, it may be worth just temporarily disabling unit
tests on dev-main. We'll have a _lot_ of work for Orin that will
essentially gut the gr, host, and interrupt code. If we retain the
unit test code for this, it may end up being backgreaking.

JIRA NVGPU-5421

Change-Id: I8055fc72521f6a3a8a0d8f07fbe50c649a675016
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347274
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
010f818596 gpu: nvgpu: initialize gr struct in poweron path
struct nvgpu_gr is right now initialized during probe and from OS
specific code. To support multiple instances of graphics engine,
nvgpu needs to initialize nvgpu_gr after number of engine instances
have been enumerated in poweron path.
Hence move nvgpu_gr_alloc() to poweron path and after gr manager has
been initialized.

Some of the members of nvgpu_gr are initialized in probe path and they
too are in OS specific code. Move them to common code in
nvgpu_gr_alloc()

Add field fecs_feature_override_ecc_val to struct gk20a to store the
override flag read from device tree. This flag is later copied to
nvgpu_gr in poweron path.

Update tpc_pg_mask_store() to check for g->gr being NULL before
accessing golden image pointer.
Update tpc_fs_mask_store() to return error if g->gr is not initialized.
This path needs nvgpu_gr struct initialized. Also fix the incorrect
NULL pointer check in tpc_fs_mask_store() which breaks the write path
to this sysfs.

Jira NVGPU-5648

Change-Id: Ifa2f66f3663dc2f7c8891cb03b25e997e148ab06
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397259
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
b91b1f06e1 gpu: nvgpu: check and handle all bits set in fecs_host_intr_status
Check all the bits set in fecs_host_intr_status h/w register.
Read fecs_host_intr_status before calling handle_fecs_error
and store this info in isr_data.

JIRA NVGPU-5502

Change-Id: I198b11aa62e394706007d6dc034fe0ac8da2bcb5
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2343684
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
smadhavan
3b560b5757 gpu: nvgpu: set gr.falcon.bind_instblk ops to NULL
While booting LS falcons, gr.falcon.bind_instblk gops is
used to bind WPR VA to gr falcon. Only FECS_METHOD must be
used to bind instblks. But at this point FECS falcon is not loaded
and running. Hence FECS_METHOD cannot be used to bind this instblk.

Besides that, this code is not required
for successful falcon boot and functioning of chips other
than gm20b.

JIRA NVGPU-5323

Change-Id: I148ccc77d65d5f01adbba6261369e7a292dccfc3
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369736
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Rajesh Devaraj
b8c6ad3f5f gpu: nvgpu: remove service IDs
This patch removes the reporting of _ECC_CORRECTED errors which are
not applicable to GV11B. Specifically, this patch removes the code
related to the  reporting of the following service IDs:

NVGUARD_SERVICE_IGPU_SM_SWERR_LRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_CBU_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_PMU_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GPCCS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_FECS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GCC_SWERR_L15_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_FA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_L2TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PTE_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PDE0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_MISS_FIFO_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_S2R_PIXPRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_TSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_RSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_DSTG_BE_ECC_CORRECTED

Bug 200616002

Change-Id: I199c396f9f6a6be007bd6d3c556199b5a73c3c91
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349587
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
59eb714c48 unit: Disable some unit tests for device work
Fix what unit tests can be easily fixed, but disable some others. It's
not clear why the MM related tests started failing - there's really zero
reason for this. The list of disable tests are primarily engine related
but there are some others that get inflenced by the device and engine
structure.

  test_poweroff.init_poweroff=2
  test_is_stall_and_eng_intr_pending.intr_is_stall_and_eng_intr_pending=2
  test_isr_nonstall.isr_nonstall=2
  test_isr_stall.isr_stall=2
  test_engine_enum_from_type.enum_from_type=2
  test_engine_find_busy_doing_ctxsw.find_busy_doing_ctxsw=2
  test_engine_get_active_eng_info.get_active_eng_info=2
  test_engine_get_fast_ce_runlist_id.get_fast_ce_runlist_id=2
  test_engine_get_gr_runlist_id.get_gr_runlist_id=2
  test_engine_get_mask_on_id.get_mask_on_id=2
  test_engine_get_runlist_busy_engines.get_runlist_busy_engines=2
  test_engine_ids.ids=2
  test_engine_init_info.init_info=2
  test_engine_interrupt_mask.interrupt_mask=2
  test_engine_is_valid_runlist_id.is_valid_runlist_id=2
  test_engine_mmu_fault_id.mmu_fault_id=2
  test_engine_mmu_fault_id_veid.mmu_fault_id_veid=2
  test_engine_setup_sw.setup_sw=2
  test_engine_status.status=2
  test_fifo_init_support.init_support=2
  test_fifo_remove_support.remove_support=2
  test_gp10b_engine_init_ce_info.engine_init_ce_info=2
  test_nvgpu_mem_iommu_translate.mem_iommu_translate=2
  test_nvgpu_mem_phys_ops.nvgpu_mem_phys_ops=2

And delete unit tests for functions that no longer exist:

  test_device_info_parse_enum.top_device_info_parse_enum
  test_get_device_info.top_get_device_info
  test_get_num_engine_type_entries.top_get_num_engine_type_entries
  test_is_engine_ce.top_is_engine_ce
  test_is_engine_gr.top_is_engine_gr

JIRA NVGPU-5421

Change-Id: I343c0b1ea44c472b22356c896672153fc889ffc0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355300
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
5f0fdf085c nvgpu: unit: Add new mock register framework
Many tests used various incarnations of the mock register framework.
This was based on a dump of gv11b registers. Tests that greatly
benefitted from having generally sane register values all rely
heavily on this framework.

However, every test essentially did their own thing. This was not
efficient and has caused a some issues in cleaning up the device and
host code.

Therefore introduce a much leaner and simplified register framework.
All unit tests now automatically get a good subset of the gv11b
registers auto-populated. As part of this also populate the HAL with
a nvgpu_detect_chip() call. Many tests can now _probably_ have all
their HAL init (except dummy HAL stuff) deleted. But this does
require a few fixups here and there to set HALs to NULL where tests
expect HALs to be NULL by default.

Where necessary HALs are cleared with a memset to prevent unwanted
code from executing.

Overall, this imposes a far smaller burden on tests to initialize
their environments.

Something to consider for the future, though, is how to handle
supporting multiple chips in the unit test world.

JIRA NVGPU-5422

Change-Id: Icf1a63f728e9c5671ee0fdb726c235ffbd2843e2
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335334
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Thomas Fleury
76295a5aeb gpu: nvgpu: redundant dependency on driver
Makefile.units.common.tmk already specifies dependency
on nvgpu driver interface.

Remove redundant dependency in units makefiles.

Jira NVGPU-5217

Change-Id: I94cbe707c25f41f0e61915c243fd55fd4bda9ccf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322205
(cherry picked from commit d9bdd8f589c121802c74da53945baa677578f71c)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325907
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Terje Bergstrom
7a71aba234 gpu: nvgpu: unit: Fix header guards
Fix cases where header guard #ifdef and #define had a mismatch.

Change-Id: I74aec2736c467f79e9786880d3e3847ee86a2466
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2318388
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
sagar
8c04d2f000 gpu: nvgpu: skip classes in obj_alloc
Currently, we are performing obj ctx alloction for bellow classes

 1. VOLTA_COMPUTE_A
 2. VOLTA_DMA_COPY_A
 3. VOLTA_CHANNEL_GPFIFO_A

In safety, we use Async CE but not GRCE.
So allocating obj context only for COMPUTE_A and return success(0) for
all other valid classes, after setting class in the channel struct.

Jira NVGPU-4378

Change-Id: Ie99872e062cc66f9ddf699397a13df85c3d8d59e
Signed-off-by: sagar <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287486
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
vinodg
740f26cee5 gpu: nvgpu: fix branch coverage for gr unit
Fix branch coverage to the change added in
driver code for checking su_coalesce and lg_coalesce
hal pointers as valid.

Jira NVGPU-4868

Change-Id: I88b34226051697c941811c40b3a0f7928f3b1e2a
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2291208
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
vinodg
7852d452ee gpu: nvgpu: Add test to cover FECS watchdog timeout
Add unit test to cover the FECS watch timeout method
in gm20b.
Correct the file and function name to gm20b from gk20a.

Bug 200586923

Change-Id: I447e26c7d898f3967ad2de7a7e4a7457264941b5
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2290643
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seema Khowala
c8050eabec gpu: nvgpu: gr: fix return value of *handle_sw_method*
Currently chip specific functions for handle_sw_method
does not return -EINVAL if class does not match as
expected. Fix it by setting default return value to
-EINVAL and updating return value to 0 for successful
class/method matches.

JIRA NVGPU-4909

Change-Id: Ifb3aa35215171ddbc64d6f1d23f8944c9fe44b2d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2285848
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
vinodg
719186cca0 gpu: nvgpu: fix error in gr.falcon header documentation
Remove # from function name in target section.
Fix missing comma to separate function names.
Add missing gops_gr_falcon hal function to target section.

Jira NVGPU-4888

Change-Id: I8ef8a035767c06c41c1daf60c295249c5a50c7fc
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2283719
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:10:29 -06:00
vinodg
7f73c5fc20 gpu: nvgpu: add missing gops_gr hal to target section.
Add missing gops_gr_setup, gops_gr_init, obj_ctx, global_ctx,
fs_state, gops_gr_falcon hal function  to the the target section.
Remove # from the target function name to keep consistency.

Jira NVGPU-4888

Change-Id: Ife9f2435d0e52cec490cfdf1809cc86809832cf2
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280202
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
vinodg
2ea4dfb5c7 gpu: nvgpu: add gr.falcon hal function to target section
Add gr.falcon hal functions being used from gr.falcon
unit test to the target section.

Jira NVGPU-4888

Change-Id: Id90ac8babfae95805421e4b9aded6a055e10e85b
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280795
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:10:29 -06:00
vinodg
0003c7d13e gpu: nvgpu: add gr.config hal functions to target section
Add gr.config hal functions to target section
Remove # from function names in target section

Jira NVGPU-4888

Change-Id: Ia8d8e91e731ad477ec330d79af69f60c0990fa70
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280794
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:10:29 -06:00
vinodg
d0a029eadf gpu: nvgpu: add missing gr.falcon hal header to SWUTS
add nvgpu-gr-falcon-gk20a.h to the SWUTS sources.

Jira NVGPU-4888

Change-Id: I4b407a5321ea67b3edcec7df7f3c07eb1e81c395
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2280793
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00