Commit Graph

154 Commits

Author SHA1 Message Date
Rajesh Devaraj
01d3ed09b0 gpu: nvgpu: update get_access_map
This patch re-names the variables used in gr_init_get_access_map API:
whitelist - gr_access_map
num_entries - gr_access_map_num_entries
wl_addr_*[] - gr_access_map_*[]

JIRA NVGPU-9849

Change-Id: I3a0a59410af8983867af5bc2f9ff200e56e190c4
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2891567
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2023-05-05 02:31:10 -07:00
Martin Radev
84bb919909 gpu: nvgpu: Setup GFX-capable TPCs
This patch dispatches to the appropriate HAL to
select the GFX-capable TPCs.

Bug 3944931

Change-Id: Ifb7338bea2cd59581133b7a2ba723f5d8bfa507c
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2891725
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2023-05-04 11:38:54 -07:00
prsethi
c49ac865de gpu: nvgpu: init golden ctx image during nvgpu poweron
Safety build temporal requirement is that on FECS power up it should go
through entire initialization methods.
init_golden_image callback is being called from devctl/ioctl path and
triggers FECS method 10 and 11. As these methods are part of APP init,
not being called during resume and causing quiesce on safety build.
To fix this issue, calling the callback from poweron API.

Bug 4082813
Bug 4037712

Change-Id: I2d27203d3cb4326ae7d8bd6025693fd61d5237df
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2893218
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2023-05-04 03:14:19 -07:00
Ramalingam C
af48120169 gpu: nvgpu: configure CWD sm id regs before ucode load
GPCCS ucode is expecting the SM ID programmed in GPM and CWD registers
to be in sync. So create a hal called gr.init.load_sm_id_config()
for the sm_id programming for the CWD registers and invoke them before
the ucode load.

Initialize this hal only for the required GPUs.

JIRA NVGPU-9757

Change-Id: Ib0984fd6326c37e0c2a06123041032575a23ec04
Signed-off-by: Ramalingam C <ramalingamc@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2864999
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-06 10:09:14 -07:00
Richard Zhao
7cd377568f gpu: nvgpu: vgpu: init ctx buffers for vf driver
VF needs to allocate gr ctx buffers on gr init, since VF will manage the
gr ctx.

Jira GVSCI-15769

Change-Id: Ifd09e6b09306c0fd36bddc60caa3d0d56f2b29cb
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2863434
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-03-13 04:55:35 -07:00
Divya
3e6d61b177 gpu: nvgpu: wake up gr wait wq in rmmod path
- The pmu_pg_task thread remains alive in the background
  during railgate and rail-ungate.
- During rail-ungate, the PG task thread starts again and
  executes PG-related tasks.
- It comes in pmu_pg_init_powergating() and waits for GR
  initialization. Here it waits for gr to be initialized.
- In parallel, the main GPU thread works on rmmod (from
  gpu_module_reload test).
- By this time, the main gpu thread has started rmmod and
  gr->initialized can be set to false, thus causing an uninterruptible
  wait for pmu_pg_task thread.
- To solve this, wake gr wait wq in rmmod path when
  NVGPU_DRIVER_IS_DYING and NVGPU_KERNEL_IS_DYING flgas are set.

Bug 3806514

Change-Id: Id78d92f30b75aba1aee22398cc86a3acebd50ef6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2798003
(cherry picked from commit d9345065bcb6d9ff497c127fa4cd52077f4ecfa4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2807245
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-21 04:19:18 -08:00
Dinesh T
68976fbd22 gpu: nvgpu: gv11b+: set live pes mask
This change is reading the live pes from the register
"gr_gpc0_gpm_pd_live_physical_pes_r" and set it to
"gr_gpc0_swdx_pes_mask_r".

Every PES needs at least a TPC to work. If any of the TPCs
are floorswept,the live PES mask is read from
"gr_gpc0_gpm_pd_live_physical_pes_r" and  the corresponding
active PES mask is updated in "gr_gpc0_swdx_pes_mask_r".

Bug 3677421

Change-Id: I899ac41c4a82beb3ce75c84ad57dcad262a49ba1
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2736560
(cherry picked from commit 85f2ceb3db6eeef925b49553f445d8cc31ec39da)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2759135
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-08-12 11:05:35 -07:00
Sagar Kamble
f95cb5f4f8 gpu: nvgpu: maintain ctx buffers mappings separately from ctx mems
In order to maintain separate mappings of GR TSG and global context
buffers for different subcontexts, we need to separate the memory
struct and the mapping struct for the buffers. This patch moves
the mappings of all GR ctx buffers to new structure
nvgpu_gr_ctx_mappings.

This will be instantiated per subcontext in the upcoming patches.

Summary of changes:
  1. Various context buffers were allocated and mapped separately.
     All TSG context buffers are now stored in gr_ctx->mem[] array
     since allocation and mapping is unified for them.
  2. Mapping/unmapping and querying the GPU VA of the context
     buffers is now handled in ctx_mappings unit. Structure
     nvgpu_gr_ctx_mappings in nvgpu_gr_ctx holds the maps.
     On ALLOC_OBJ_CTX this struct is instantiated and deleted
     on free_gr_ctx.
  3. Introduce mapping flags for TSG and global context buffers.
     This is to map different buffers with different caching
     attribute. Map all buffers as cacheable except
     PRIV_ACCESS_MAP, RTV_CIRCULAR_BUFFER, FECS_TRACE, GR CTX
     and PATCH ctx buffers. Map all buffers as privileged.
  4. Wherever VM or GPU VA is passed in the obj_ctx allocation
     functions, they are now replaced by nvgpu_gr_ctx_mappings.
  5. free_gr_ctx API need not accept the VM as mappings struct
     will hold the VM. mappings struct will be kept in gr_ctx.
  6. Move preemption buffers allocation logic out of
     nvgpu_gr_obj_ctx_set_graphics_preemption_mode.
  7. set_preemption_mode and gr_gk20a_update_hwpm_ctxsw_mode
     functions need update to ensure buffers are allocated
     and mapped.
  8. Keep the unit tests and documentation updated.

With these changes there is clear seggregation of allocation and
mapping of GR context buffers. This will simplify further change
to add multiple address spaces support. With multiple address
spaces in a TSG, subcontexts created after first subcontext
just need to map the buffers.

Bug 3677982

Change-Id: I3cd5f1311dd85aad1cf547da8fa45293fb7a7cb3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2712222
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-15 07:10:11 -07:00
Sagar Kamble
d3b417ce2c gpu: nvgpu: address priv_ring unit code inspection gaps
1. Hardcoded constants are defined using #define are converted to
   const.
2. set_ppriv_timeout_settings HAL is not applicable from gm20b.
   Hence remove it completely.

JIRA NVGPU-6903

Change-Id: Ic096c5dc87aa45db0aa05482947cd032ae72bdd4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552581
(cherry picked from commit c5fb38a54208330f24754fed33d7242903dbac59)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-05-17 08:40:46 -07:00
Sagar Kamble
9d6269ce7f gpu: nvgpu: assert gr dev is non-NULL
nvgpu_device_get can return NULL if supplied invalid ID or instance
ID. We expect GR device struct to be non-NULL there hence just
assert that it is indeed non-NULL in gr_reset_engine and
ga10b_grmgr_init_gr_manager.

CID 224133
CID 250232
Bug 3512546

Change-Id: Id09a1c436a8e49b921111b940d3d013bd66bff7a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707018
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-05-07 23:24:39 -07:00
Antony Clince Alex
9e0fd1a093 gpu: nvgpu: gr: update gr suspend
Update GR suspend routine to clear GR falcon "coldboot_bootstrap_done"
flag, this is needed because GPU power rails are turned off during
suspend cycle due to which GR falcons need to be bootstrapped again
during resume.

Function "nvgpu_gr_falcon_suspend" is added to clear the above mentioned
flag.

Bug 3497398
Bug 3514055

Change-Id: If852a2c09f05c096f287b845c56d8b4f335ec8e7
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2670554
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-28 23:47:06 -07:00
Rajesh Devaraj
c5822b0d98 gpu: nvgpu: add error prints for errors reported to sdl
In Drive 6.0, only error IDs are reported to Safety_Services. The
additional debug/error information is printed using nvgpu_err().

JIRA NVGPU-8094
Bug 3491596

Change-Id: Ie90f3e1453e6a796d5c76373c11f8a5a188ac590
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2684289
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-22 17:55:10 -07:00
Shashank Singh
5ec241a1d8 gpu: nvgpu: remove non stall intr from top handler for safety
On safety nonstall interrupt is not used and should be compiled out to
rule out any chance of interference with safety code. Remove top handler
support of nonstall interrupt for safety which is currently not
applicable to linux.

Jira NVGPU-7066
Jira NVGPU-4078

Change-Id: I278efc8da6ddd0f22c128af6630cfd1b20ba4784
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589006
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671586
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-21 02:31:38 -08:00
Rajesh Devaraj
0699220b85 gpu: nvgpu: compile-out unused apis from safety build
This patch does the following changes:
- Compiles-out unused error reporting APIs and the related
  data structures from safety build. For this purpose, it
  introduces the new flag: CONFIG_NVGPU_INTR_DEBUG
- Updates nvgpu_report_err_to_sdl() API with one more argument,
  hw_unit_id. This aids in finding whether an error to be reported
  is corrected or uncorrected from LUT.
- Triggers SW quiesce, if an uncorrected error is reported to
  Safety_Services, in safety build.
- Renames files in cic folder by replacing gv11b with ga10b,
  since error reporting for gv11b is not supported in dev-main.

JIRA NVGPU-8002

Change-Id: Ic01e73b0208252abba1f615a2c98d770cdf41ca4
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668466
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-14 22:00:33 -08:00
Rajesh Devaraj
7dc013d242 gpu: nvgpu: merge error reporting apis
In DRIVE 6.0, NvGPU is allowed to report only 32-bit metadata to
Safety_Services. So, there is no need to have distinct APIs for
reporting errors from units like GR, MM, FIFO to SDL unit. All
these error reporting APIs will be replaced with a single API. To
meet this objective, this patch does the following changes:
- Replaces nvgpu_report_*_err with nvgpu_report_err_to_sdl.
- Removes the reporting of error messages.
- Replaces nvgpu_log() with nvgpu_err(), for error reporting.
- Removes error reporting to Safety_Services from nvgpu_report_*_err.

However, nvgpu_report_*_err APIs and their related files are not
removed. During the creation of nvgpu-mon, they will be moved under
nvgpu-rm, in debug builds.

Note:
- There will be a follow-up patch to fix error IDs.
- As discussed in https://nvbugs/3491596 (comment #12), the high
level expectation is to report only errors.

JIRA NVGPU-7450

Change-Id: I428f2a9043086462754ac36a15edf6094985316f
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662590
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-09 00:41:18 -08:00
Richard Zhao
9ab1271269 gpu: nvgpu: common: fix compile error of new compile flags
It's preparing to add bellow CFLAGS:
    -Werror -Wall -Wextra \
    -Wmissing-braces -Wpointer-arith -Wundef \
    -Wconversion -Wsign-conversion \
    -Wformat-security \
    -Wmissing-declarations -Wredundant-decls -Wimplicit-fallthrough

Jira GVSCI-11640

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia8f508c65071aa4775d71b8ee5dbf88a33b5cbd5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555056
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-13 12:36:14 -08:00
Dinesh T
ad09e3e3cc gpu: nvgpu: Enable sm_l1tag_surface_cut_collector
This is enabling sm_l1tag_surface_cut_collector at gpu boot.
This is done with adding new hal "set_sm_l1tag_surface_collector"
that sets l1tag_surface_cut_collector in the sm_l1tag_ctrl
register.

Bug 2557724

Change-Id: I869e3bfa563db204259e7a464657229632f182d9
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2634878
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-06 04:36:56 -08:00
Deepak Nibade
3d9c67a0e7 gpu: nvgpu: enable Orin support in safety build
Most of the Orin chip specific code is compiled out of safety build
with CONFIG_NVGPU_NON_FUSA and CONFIG_NVGPU_HAL_NON_FUSA. Remove the
config protection from Orin/GA10B specific code. Currently all code
is enabled. Code not required in safety will be compiled out later
in separate activity.

Other noteworthy changes in this patch related to safety build:

- In ga10b_ce_request_idle(), add a log print to dump num_pce so that
  compiler does not complain about unused variable num_pce.
- In ga10b_fifo_ctxsw_timeout_isr(), protect variables active_eng_id and
  recover under CONFIG_NVGPU_KERNEL_MODE_SUBMIT to fix compilation
  errors of unused variables.
- Compile out HAL gops.pbdma.force_ce_split() from safety since this HAL
  is GA100 specific and not required for GA10B.
- Compile out gr_ga100_process_context_buffer_priv_segment() with
  CONFIG_NVGPU_DEBUGGER.
- Compile out VAB support with CONFIG_NVGPU_HAL_NON_FUSA.
- In ga10b_gr_intr_handle_sw_method(), protect left_shift_by_2 variable
  with appropriate configs to fix unused variable compilation error.
- In ga10b_intr_isr_stall_host2soc_3(), compile ELPG function calls
  with CONFIG_NVGPU_POWER_PG.
- In ga10b_pmu_handle_swgen1_irq(), move whole function body under
  CONFIG_NVGPU_FALCON_DEBUG to fix unused variable compilation errors.
- Add below TU104 specific files in safety build since some of the code
  in those files is required for GA10B. Unnecessary code will be
  compiled out later on.
	hal/gr/init/gr_init_tu104.c
	hal/class/class_tu104.c
	hal/mc/mc_tu104.c
	hal/fifo/usermode_tu104.c
	hal/gr/falcon/gr_falcon_tu104.c
- Compile out GA10B specific debugger/profiler related files from
  safety build.
- Disable CONFIG_NVGPU_FALCON_DEBUG from safety debug build temporarily
  to work around compilation errors seen with keeping this config
  enabled. Config will be re-enabled in safety debug build later.

Jira NVGPU-7276

Change-Id: I35f2489830ac083d52504ca411c3f1d96e72fc48
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2627048
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-11-26 08:46:47 -08:00
Debarshi Dutta
2e3c3aada6 gpu: nvgpu: fix deinit of GR
Existing implementation of GR de-init doesn't account for multiple
instances of struct nvgpu_gr. As a fix, below changes are added.

1) nvgpu_gr_free is unified for VGPU as well as native.
2) All the GR instances are freed.
3) Appropriate NULL checks are added when freeing GR memories.
4) 2D, 3D, I2M and ZBC etc are explicitely disabled when MIG is set.
5) In ioctl_ctrl, checks are added to not return error when zbc is NULL
   for VGPU as requests are rerouted to RMserver.

Jira NVGPU-6920

Change-Id: Icaa40f88f523c2cdbfe3a4fd6a55681ea7a83d12
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2578500
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: Antony Clince Alex <aalex@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-23 05:27:45 -07:00
Sagar Kamble
40064ef1ec gpu: nvgpu: fix ecc counter free
ECC counter structures are freed without removing the node from the
stats_list. This can lead to invalid access due to dangling pointers.

Update the ecc counter free logic to set them to NULL upon free, to
remove them from stats_list and free them by validation.

Also updated some of the ecc init paths where error was not propa-
gated to callers and full ecc counters deallocation was not done.

Now, calling unit ecc_free from any context (with counters alloc-
ated or not) is harmless as requisite checks are in place.

bug 3326612
bug 3345977

Change-Id: I05eb6ed226cff9197ad37776912da9dcb7e0716d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2565264
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:08 -07:00
Lakshmanan M
e9872a0d91 gpu: nvgpu: Skip graphics unit access when MIG is enabled
This CL covers the following modifications,
1) Added logic to skip the graphics unit specific sw context load
   register write during context creation when MIG is enabled.
2) Added logic to skip the graphics unit specific sw method
   register write when MIG is enabled.
3) Added logic to skip the graphics unit specific slcg and blcg gr
   register write when MIG is enabled.
4) Fixed some priv errors observed during MIG boot.
5) Added MIG Physical support for GPU count < 1.
6) Host clk register access is not allowed for GA100.
   So skipped to access host clk register.
7) Added utiliy api - nvgpu_gr_exec_with_ret_for_all_instances()
8) Added gr_pri_mme_shadow_ram_index_nvclass_v() reg field
   to identify the sw method class number.

Bug 200649233

Change-Id: Ie434226f007ee5df75a506fedeeb10c3d6e227a3
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2549811
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 16:41:51 -07:00
tkudav
0526e7eaa9 gpu: nvgpu: Create CIC-mon and CIC-rm subunits
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.

CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).

Split the CIC APIs and data-members into above two subunits.

JIRA NVGPU-6899

Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-02 09:57:56 -07:00
Antony Clince Alex
d2919409e9 gpu: nvgpu: rename/collpase nvgpu_next functions and structs
Replace all nvgpu_next functions/structs either by 1) collapsing them
into nvgpu legacy functions/structs 2) renaming them as follows:
- nvgpu_next_*() => nvgpu_(ga10b/ga100)_*()
- nvgpu_next_*() => (ga10b/ga100)_*()
- nvgpu_next_*() => nvgpu_*() [only if this doesn't cause collision]
- nvgpu_next_*() = > nvgpu_*_extra()

Create hal.sim unit and move Ampere+ SIM code into it.

Jira NVGPU-4771

Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:58 -07:00
Antony Clince Alex
f9cac0c64d gpu: nvgpu: remove nvgpu_next files
Remove all nvgpu_next files and move the code into corresponding
nvgpu files.

Merge nvgpu-next-*.yaml into nvgpu-.yaml files.

Jira NVGPU-4771

Change-Id: I595311be3c7bbb4f6314811e68712ff01763801e
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547557
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:53 -07:00
Antony Clince Alex
c7d43f5292 gpu: nvgpu: remove usage of CONFIG_NVGPU_NEXT
The CONFIG_NVGPU_NEXT config is no longer required now that ga10b and
ga100 sources have been collapsed. However, the ga100, ga10b sources
are not safety certified, so mark them as NON_FUSA by replacing
CONFIG_NVGPU_NEXT with CONFIG_NVGPU_NON_FUSA.

Move CONFIG_NVGPU_MIG to Makefile.linux.config and enable MIG support
by default on standard build.

Jira NVGPU-4771

Change-Id: Idc5861fe71d9d510766cf242c6858e2faf97d7d0
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547092
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-06-27 05:02:47 -07:00
Lakshmanan M
df87591b7d gpu: nvgpu: Add multi gr handling for debugger and profiler
1) Added multi gr handling for dbg_ioctl apis.
2) Added nvgpu_assert() in gr_instances.h (for legacy mode).
3) Added multi gr handling for prof_ioctl apis.
4) Added multi gr handling for profiler.
5) Added multi gr handling for ctxsw enable/disable apis.
6) Updated update_hwpm_ctxsw_mode() HAL for multi gr handling.

JIRA NVGPU-5656

Change-Id: I3024d5e6d39bba7a1ae54c5e88c061ce9133e710
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538761
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-06-04 18:07:47 -07:00
Tejal Kudav
9f43914933 gpu: nvgpu: Move Intr handling common code to CIC
CIC (Central Interrupt controller) will be responsible for the
interrupt handling. common.cic unit is the placeholder for all
interrupt related code. Move interrupt related defines and
Public APIs present in common.mc to common.cic.
Note: The common.mc interrupts related struct definitions are
not moved as part of this patch.

Adapt the code to use interrupt handling related defines and public
APIs migrated from common.mc to common.cic

JIRA NVGPU-6899

Change-Id: I747e2b556c0dd66d58d74ee5bb36768b9370d276
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535618
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-31 19:37:31 -07:00
Deepak Nibade
9034b1676e gpu: nvgpu: compile out GFxP support in safety
GFxP preemption for graphics contexts is not supported in safety.
But the support was enabled along with CONFIG_NVGPU_GRAPHICS since GFxP
preemption was protected under same config.

Add a separate config CONFIG_NVGPU_GFXP to protect all GFxP specific
code, enum values, and HALs.

Disable the config in safety profile.

Jira NVGPU-6893

Change-Id: Iebb5f754a1025dfa6e05a94704bdb8a7123b599a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534986
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-28 15:17:36 -07:00
Deepak Nibade
cebefd7ea2 gpu: nvgpu: move RTV CB code to GRAPHICS config
Some of the RTV circular buffer programming is under GRAPHICS config and
some is under DGPU config. For nvgpu-next, RTV circular buffer is
required even for iGPU so keeping the code under DGPU config does not
make sense.
Move all the code from DGPU config to GRAPHICS config.

Bug 3159973

Change-Id: I8438cc0e25354d27701df2fe44762306a731d8cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-05-06 06:10:58 -07:00
Debarshi Dutta
0a25376965 gpu: nvgpu: disable access to PE unit when MIG is enabled
PE unit belongs to GR pipeline but not compute.
Hence disabled access to the PE register in the GR Boot flow
to prevent following PRIV error when SMC mode is enabled.

PRI timeout: ADR 0x00503018 READ  DATA 0x00000000
FECS_ERRCODE 0xbadf1100
[Error Type]: decode error

Jira NVGPU-6699

Change-Id: Ia6f58258611a010252c7ead46b1b48cbf1b64001
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2514894
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-04-19 16:19:09 -07:00
Deepak Nibade
bb43f11a61 gpu: nvgpu: update common.gr doxygen
Add below updates to common.gr doxygen:

- Add doxygen comments for APIs that are mentioned in RM SWAD and in
  RM-common.gr traceability document.
- Comment about valid ranges for input parameters of bunch of functions.
- Add nvgpu_assert() to ensure correct value is passed as input
  parameter to number of functions.
- Add references to relevant functions with @see.
- Update Targets field for unit tests to cover newly doxygenated
  functions.
- Update unit test test_gr_init_hal_pd_skip_table_gpc to take care of
  new asserts added into some APIs.

Jira NVGPU-6180

Change-Id: Ie889bed96b6428b1fd86dcf30b322944464e9d12
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2469397
(cherry picked from commit 5d7d7e9ce1c4efe836ab842d7962a3aee4e8972f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2469394
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-02-02 23:34:27 -08:00
Lakshmanan M
883c12529a gpu: nvgpu: Add multi GR reset support for MIG
* Added multi GR reset/recovery support for MIG.
* Added a api to get the gr engine id using gr instance id.

JIRA NVGPU-5650
JIRA NVGPU-5653

Change-Id: I12ece75a4c33f0944f404121b54879e814dda6df
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2443644
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:48 -06:00
Vedashree Vidwans
c0b9ae2f17 gpu: nvgpu: enable gr_reset in recovery on sim platform
HALT_PIPELINE method is supported on nvgpu-next simulation platform.
Send HALT_PIPELINE followed by gr reset during recovery for all types of
platforms including simulation platform.

Bug 3109773

Change-Id: Ib830075bb9414fa1765c762a652e63cddbe6a141
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406719
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Antony Clince Alex
c36752fe3d gpu: nvgpu: sim: make ring buffer independent of PAGE_SIZE
The simulator ring buffer DMA interface supports buffers of the following sizes:
4, 8, 12 and 16K. At present, it is configured to 4K and it  happens to match
with the kernel PAGE_SIZE, which is used to wrap back the GET/PUT pointers once
4K is reached. However, this is not always true; for instance, take 64K pages.
Hence, replace PAGE_SIZE with SIM_BFR_SIZE.

Introduce macro NVGPU_CPU_PAGE_SIZE which aliases to PAGE_SIZE and replace
latter with former.

Bug 200658101
Jira NVGPU-6018

Change-Id: I83cc62b87291734015c51f3e5a98173549e065de
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2420728
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
995731171b gpu: nvgpu: Do not reset PERFMON and BLG when MIG is enabled
Do not reset PERFMON and BLG when MIG is enabled as
PERFMON is a global engine which is shared by all syspipes.
Individual PERF counters can be reset during gr syspipe reset.

JIRA NVGPU-5650

Change-Id: I4a7fc9b6c62e94ee65779068ca257cb8e01c8cee
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2424604
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
2ecb5feaad gpu: nvgpu: Skip graphics CB programming for MIG
Added logic to skip the following graphics CB allocation, map and
programming sequence when MIG is enabled.

Global CB:
1) NVGPU_GR_GLOBAL_CTX_CIRCULAR
2) NVGPU_GR_GLOBAL_CTX_PAGEPOOL
3) NVGPU_GR_GLOBAL_CTX_ATTRIBUTE
4) NVGPU_GR_GLOBAL_CTX_CIRCULAR_VPR
5) NVGPU_GR_GLOBAL_CTX_PAGEPOOL_VPR
6) NVGPU_GR_GLOBAL_CTX_ATTRIBUTE_VPR
7) NVGPU_GR_GLOBAL_CTX_RTV_CIRCULAR_BUFFER

CTX CB:
1) NVGPU_GR_CTX_CIRCULAR_VA
2) NVGPU_GR_CTX_PAGEPOOL_VA
3) NVGPU_GR_CTX_ATTRIBUTE_VA
4) NVGPU_GR_CTX_RTV_CIRCULAR_BUFFER_VA

JIRA NVGPU-5650

Change-Id: I38c2859ce57ad76c58a772fdf9f589f2106149af
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2423450
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
94bc3a8135 gpu: nvgpu: rearch zbc code and update hals
Update nvgpu_gr_zbc as:
struct nvgpu_gr_zbc {
   struct nvgpu_mutex zbc_lock;	/* Lock to access zbc table */
   struct zbc_color_table *zbc_col_tbl; /* SW zbc color table pointer */
   struct zbc_depth_table *zbc_dep_tbl; /* SW zbc depth table pointer */
   struct zbc_stencil_table *zbc_s_tbl; /* SW zbc stencil table pointer */
   u32 min_color_index;	/* Minimum valid color table index */
   u32 min_depth_index;	/* Minimum valid depth table index */
   u32 min_stencil_index;	/* Minimum valid stencil table index */
   u32 max_color_index;	/* Maximum valid color table index */
   u32 max_depth_index;	/* Maximum valid depth table index */
   u32 max_stencil_index;	/* Maximum valid stencil table index */
   u32 max_used_color_index; /* Max used color table index */
   u32 max_used_depth_index; /* Max used depth table index */
   u32 max_used_stencil_index; /* Max used stencil table index */
};

Add global struct nvgpu_gr_zbc_table_indices
struct nvgpu_gr_zbc_table_indices {
       u32 min_color_index;
       u32 min_depth_index;
       u32 min_stencil_index;
       u32 max_color_index;
       u32 max_depth_index;
       u32 max_stencil_index;
};

Currently, hw zbc table registers are written during both
gr_init_setup_sw() and gr_init_setup_hw().
- Modify nvgpu_gr_zbc_load_default_table() to
nvgpu_gr_zbc_load_default_sw_table() to only update sw copy of zbc table
during gr_init_setup_sw().
- Modify nvgpu_gr_zbc_load_table() to write zbc values stored in sw zbc
table to hw registers.

Re-structure zbc function as per zbc type i.e. color, depth and stencil.

Add gr.zbc.init_table_indices() hal to initialize zbc indices. Valid ZBC
table indices start from 1. HW indices start from 0 for color, depth and
stencil tables. Note that the corresponding format registers follow ZBC
index range starting at 1.
- void (*init_table_indices)(struct gk20a *g,
	struct nvgpu_gr_zbc_table_indices *zbc_indices);
- Add corresponding functions for legacy chips
- Add zbc color, depth and stencil table size hw defines
- Remove ltc.zbc_table_size() hal
- Update ltc.set_zbc_s_entry(), ltc.set_zbc_color_entry and
ltc.set_zbc_depth_entry() accordingly.

Bug 3122410
Bug 3122649

Change-Id: Ib799991ad35c6613534c0a6eb07f3bf24e600dc5
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2417620
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
d2bb5df3c7 gpu: nvgpu: remove NVGPU_GR_NUM_INSTANCES
common.gr defined a temporary macro NVGPU_GR_NUM_INSTANCES to enable or
disable multiple GR instances from common.gr unit.
Multiple GR instance boot is now verified, so we can remove this
temporary solution.

Note that nvgpu_grmgr_get_num_gr_instances() will return more than 1
instance only if NVGPU_SUPPORT_MIG is enabled.

Update unit tests to set number of syspipes to 1 to allow enumeration
of GR instance by grmgr.

Jira NVGPU-5648

Change-Id: I795901ae516843ae7b6c1794dae0f023a213ab1d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2418377
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
ebb66b5d50 gpu: nvgpu: add macros to get current GR instance
Add macros to get current GR instance id and the pointer
nvgpu_gr_get_cur_instance_ptr()
nvgpu_gr_get_cur_instance_id()

This approach makes sure that the caller is getting GR instance pointer
under mutex g->mig.gr_syspipe_lock in MIG mode. Trying to access
current GR instance outside of this lock in MIG mode dumps a warning.

Return 0th instance in case MIG mode is disabled.

Use these macros in nvgpu instead of direct reference to
g->mig.cur_gr_instance.

Store instance id in struct nvgpu_gr. This is to retrieve GR instance
id in functions where struct nvgpu_gr pointer is already available.

Jira NVGPU-5648

Change-Id: Ibfef6a22371bfdccfdc2a7d636b0a3e8d0eff6d9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2413140
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
6a69ea235e gpu: nvgpu: disable graphics specific init functions in MIG mode
MIG mode does not support graphics, ELPG, and use cases like TPC
floorsweeping. Skip all such initialization functions in common.gr
unit if MIG mode is enabled.

Set can_elpg to false if MIG mode is enabled.

Jira NVGPU-5648

Change-Id: I03656dc6289e49a21ec7783430db9c8564c6bf1f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2411741
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
7a937a6190 gpu: nvgpu: add debug logs for common.gr debugging
Add separate flag gpu_dbg_gr to enable common.gr specific debugging.
Add this flag to all the existing debug logs that use gpu_dbg_fn or
gpu_dbg_info for debugging. Also add many other debugging logs that
might be helpful in debugging.

Removing debug log in gv11b_gr_init_get_nonpes_aware_tpc() as it dumps
too much data that does not seem useful.

Batch all interrupt enable functions in gr_init_setup_hw() together for
readability.

Jira NVGPU-5648

Change-Id: I0b857650122cdb1f974b452d28c26e7f142baf61
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2411740
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
e0dd79cd43 gpu: nvgpu: rearch mc reset and enable hals
Remove current mc hals
- mc.reset()
- mc.enable()
- mc.disable()
- mc.reset_mask()
- mc.reset_engine()
- mc.reset_engine_enable()

Add new mc hals
- mc.enable_units(g, units, enable)
  > enable/disable given unit(s)
- mc.enable_dev(g, dev, enable)
  > enable/disable engine represented by given device pointer
- mc.enable_devtype(g, devtype)
  > enable/disable all engines of given devtype

Move common mc intr functions to common/mc/mc_intr.c.
Add below common mc functions
- nvgpu_mc_reset_units(g, units)
  > reset given logical OR of nvgpu unit bitmap
- nvgpu_mc_reset_dev(g, dev)
  > reset given single engine via dev
  > if engine is graphics, reset gpcs for nvgpu_next
- nvgpu_mc_reset_devtype(g, devtype)
  > reset all engines of given devtype
  > if devtype is graphics, reset gpcs for nvgpu_next

Bug 200648985
Bug 3109773

Change-Id: Idc67a14a0a7cde83de44fbfbec13007fead3ed5c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408523
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
e6e7561084 gpu: nvgpu: execute nvgpu_gr_init_support for each GR instance
nvgpu_gr_init_support() right now executes each of its function for each
GR instance separately. Instead of looping for each function, move the
GR engine initialization sequence to a separate gr_init_support_impl()
and execute this function for each instance.

Update below functions to take nvgpu_gr pointer as parameter. These
functions need not worry about GR instance, instead they'll just operate
on provided instance pointer.
gr_init_setup_hw
gr_init_config
gr_init_setup_sw
gr_init_sm_id_config_early
gr_init_ctxsw_falcon_support

Add new static function gr_init_support_finalize() to set the ready
status and invoke waiters. Execute this per GR instance.

gr_init_ecc_init() and nvgpu_cg_elcg_enable_no_wait() are not needed to
be run per instance.
gr_init_ecc_init() will be later updated to allocate meta data for all
instances

Jira NVGPU-5648

Change-Id: Ia6860f2bdfe0080aebf8930266d3f51bfd805e36
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410703
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
bafeea3530 gpu: nvgpu: setup HW for each GR instance
Get number of SMs from GR instance specific nvgpu_gr_config pointer
instead of global SM count in below functions :
nvgpu_gr_fs_state_init()
gv11b_gr_init_sm_id_config()

Update nvgpu_gr_config_get_gpc_skip_mask() to return 0 in case gpc_index
is greater than available gpc_count. This is not MIG specific, but based
on code review possible even today for existing chips.
See gm20b_gr_init_pd_skip_table_gpc()

Update nvgpu_gr_get_override_ecc_val() to return GR instance specific
value.

Execute gr_init_setup_hw() for each GR instance.

Disable below failing unit tests:
nvgpu_gr_fs_state.test_gr_fs_state_error_injection
nvgpu_gr_init.test_gr_init_hal_config_error_injection

Jira NVGPU-5648

Change-Id: Ie8f1c0c304c634756786d85facf336a5c9ae8195
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410702
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
3df2ed4f82 gpu: nvgpu: setup SW for each GR instance
Execute gr_init_setup_sw() for each GR instance.
Update all of the functions called from this function to receive
nvgpu_gr pointer explicitly.

Separate out nvgpu_gr_zbc_init() call to gr_init_setup_sw() and rename
gr_init_ctx_and_map_zbc() to gr_init_ctx_bufs() for more clarity.

Call gr_init_ecc_init() from nvgpu_gr_init_support() since this does not
need to be executed per GR instance.

Initialize mutex etc in nvgpu_gr_alloc() for consistency.

Jira NVGPU-5648

Change-Id: I8e990e11458c05c1b53a4d6710cc2ec3545762a8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410701
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
83691e088f gpu: nvgpu: initialize ctx state for each GR instance
Execute nvgpu_gr_init_ctx_state() for each GR instance. Move it under
gr_init_ctxsw_falcon_support() which is already executed for each
instance.

Update the API to accept struct nvgpu_gr pointer for convenience. API
does not need to know about other instances.

For reset path, continue using g->gr instead of specific instance.
This will be revisited when entire reset path is refactored.

Jira NVGPU-5648

Change-Id: I8879bf3b44bb01f6b8053f1aecbd550f49837520
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2409535
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
b6c72410bb gpu: nvgpu: execute CTXSW ucode initialization per GR instance
Move CTXSW ucode initialization to separate static API
gr_init_ctxsw_falcon_support() and execute this per GR instance with
nvgpu_gr_exec_with_ret_for_each_instance()

Jira NVGPU-5648

Change-Id: I6e0fa72bd568eaac027bb12edcdf90255336f0a1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2409532
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
43242fa878 gpu: nvgpu: init ctxsw state after gr reset
Ctxsw state will be lost after gr reset. After gr reset
in recovery sequence, re-initialize ctxsw state to send
below fecs methods:
gr_fecs_method_push_adr_discover_image_size_v()
gr_fecs_method_push_adr_discover_pm_image_size_v()
gr_fecs_method_push_adr_discover_zcull_image_size_v()
gr_fecs_method_push_adr_discover_preemption_image_size_v()

Without these methods sent to ctxsw, fecs will generate
host error interrupts indicating mismatches in ctxsw
image. Above fecs methods needs to be sent even if they
are already sent during golden context creation.

Bug 3109773

Change-Id: I2aeb92da8fa1961903ab95ef90f47906a1bb32c4
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406685
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
da43acf639 gpu: nvgpu: execute early SM id config for each instance
Execute gops.gr.init.sm_id_config_early() for each GR instance with
nvgpu_gr_exec_with_ret_for_each_instance()

Jira NVGPU-5648

Change-Id: I7023ed5c7d65d43eb7bb8384617464a39c846f56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408419
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
fc12a284bf gpu: nvgpu: initialize per GR instance config
Expose below two new APIs from common.grmgr unit
nvgpu_grmgr_get_gr_num_gpcs() - get per instance number of GPCs
nvgpu_grmgr_get_gr_gpc_phys_id() - get physical GPC id for MIG engine
local id in corresponding instance

Execute gr_init_config() for each GR instance.
Add gr_config_init_mig_gpcs() to initialize GPC data in case MIG is
enabled. Separate out gr_config_init_gpcs() for legacy GPC data
initialization.

These functions will inititialize below data in struct nvgpu_gr_config:
max_gpc_count
gpc_count
gpc_mask
gpc_tpc_mask[gpc_count]
max_tpc_per_gpc_count

Rest of the values in struct nvgpu_gr_config are either based on above
values, or read from HW after setting GPC PRI window.

In gr_config_alloc_struct_mem(), rename total_gpc_cnt to total_tpc_cnt
since it represents total TPC count and not GPC. Remove use of temp3
variable since it does not give any idea on usage.

Jira NVGPU-5648

Change-Id: I646cac2ddc312e72b241b1b2a0e51a5cce141535
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406390
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00