Commit Graph

41 Commits

Author SHA1 Message Date
Lakshmanan M
2a6fcec078 gpu: nvgpu: add gr manager ops-2 and mig infra-2
This CL covers the code changes related to following support,
 - Enabled gr manager ops.
 - Added gr manager init/remove support.
 - Refactor in gpu instance config infra.
 - Refactor in gr syspipe gpcs config infra.

JIRA NVGPU-5645
JIRA NVGPU-5646

Change-Id: Ib2fab2796d76fe105fc5a08f2c5f9bfa36317f7c
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2393550
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
3f81f1952d gpu: nvgpu: vgpu: fix NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS crash
vgpu currently does not support suspend gpu context and stall
the whole gpu, because of safety concerns. So vgpu does not set
HALs that are related to on-gpu context.

This change unset gops.gr.clear_sm_errors. And the ioctl
NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS will return -ENOSYS.

Bug 200469468

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ie578495e175ad898994fe1c4184a0243d5541cd3
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395598
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Lakshmanan M
c99afa1766 gpu: nvgpu: add gr manager and mig infra
This CL covers the code changes related to following support,
 - Added gr manager infra.
 - Added grmgr_gops infra.
 - Added mig infra.
 - Added log mask for MIG verbose support.

JIRA NVGPU-5645
JIRA NVGPU-5646

Change-Id: Iec356e08e6cfee86ad9f59fdf6cfee9c38231359
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385111
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
9d723a5f1f gpu: nvgpu: add knob to control fecs_trace feature
Currently, NVGPU_SUPPORT_FECS_CTXSW_TRACE enabled flag is set to true
when fecs_trace s/w setup is executed successfully. Sometimes,
fecs_trace is required to be disabled for debugging. This change will
help disable/enable fecs_trace feature by modifying one of the enabled
flags.
Enable NVGPU_SUPPORT_FECS_CTXSW_TRACE during chip specific hal init.
Control fec_trace init and ctxsw dev open depending on
NVGPU_SUPPORT_FECS_CTXSW_TRACE flag status.

JIRA NVGPU-5616

Change-Id: Id0754a5af7cd95a67a1f0ae5de36115d44e1111b
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357501
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
f34711d3de gpu: nvgpu: split perfbuf initialization
gk20a_perfbuf_map() allocates perfbuf VM, maps the user buffer into new
VM, and then triggers gops.perfbuf.perfbuf_enable(). This HAL then does
following :
- Allocate perfbuf instance block
- Initialize perfbuf instance block
- Reset stream buffer
- Program instance block address in PMA registers
- Program user buffer address into PMA registers

New profiler interface will have it's own API to setup PMA strem, and
it requires above setup to be done in two phases of perfbuf
initialization and then user buffer setup.

Split above functionalities into below functions
- nvgpu_perfbuf_init_vm()
  - Allocate perfbuf VM
  - Call gops.perfbuf.init_inst_block() to initialize perfbuf instance
    block

- gops.perfbuf.init_inst_block()
  - Allocate perfbuf instance block
  - Initialize perfbuf instance block
  - Program instance block address in PMA registers using
    gops.perf.init_inst_block()
  - In case of vGPU, trigger TEGRA_VGPU_CMD_PERFBUF_INST_BLOCK_MGT
    command to gpu server

- gops.perf.init_inst_block()
  - Reset stream buffer
  - Program user buffer address into PMA registers

Also add corresponding cleanup functions as below :
gops.perf.deinit_inst_block()
gops.perfbuf.deinit_inst_block()
nvgpu_perfbuf_deinit_vm()

Bug 2510974
Jira NVGPU-5360

Change-Id: I486370f21012cbb7fea84fe46fb16db95bc16790
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372984
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
359fc24aaf gpu: nvgpu: Rework engine management to work with vGPU
Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.

With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.

This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.

This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.

Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.

JIRA NVGPU-5421

Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
04a179a161 gpu: nvgpu: del gr.get_lrf_tex_ltc_dram_override
Delete unused gr gops get_lrf_tex_ltc_dram_override().

Jira NVGPU-5755

Change-Id: Ic8f8e8de8066325109c0284f0f620accdd81db7b
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368974
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
08308bc936 gpu: nvgpu: rework pm resource reservation system
Current PM resource reservation system is limited to HWPM resources
only. And reservation tracking is done using boolean variables.

New upcoming profiler support requires reservation for all the PM
resources like SMPC and PMA stream. Using boolean variables is
not scalable and confusing. Plus the variables have to be replicated
on gpu server in case of virtualization.

Remove flag tracking mechanism and use list based approach to track
all PM reservations. Also, current HALs are defined on debugger object.
Implement new HALs in new pm_reservation object since it is really an
independent functionality.

Add new source file common/profiler/pm_reservation.c which implements
functions to reserve/release resources and to check if any resource
is reserved or not.
Add common/vgpu/pm_reservation_vgpu.c for vGPU which simply forwards
the request to gpu server.

Define new HAL object gops.pm_reservation and assign above functions
to below respective HALs :
g->ops.pm_reservation.acquire()
g->ops.pm_reservation.release()
g->ops.pm_reservation.release_all_per_vmid()

Last HAL above is only used for gpu server cleanup of guest OS.

Add below new common profiler functions that act as APIs to reserve/
release resources for rest of the units in nvgpu.
nvgpu_profiler_pm_resource_reserve()
nvgpu_profiler_pm_resource_release()

Initialize the meta data required for reservtion system in
nvgpu_pm_reservation_init() and call it during nvgpu_finalize_poweron.
Clean up the meta data before releasing struct gk20a.

Delete below HALs :
g->ops.debugger.check_and_set_global_reservation()
g->ops.debugger.check_and_set_context_reservation()
g->ops.debugger.release_profiler_reservation()

Bug 2510974
Jira NVGPU-5360

Change-Id: I4d9f89c58c791b3b2e63099a8a603462e5319222
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2367224
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
1ff79b1d2c gpu: nvgpu: remove support for quad reg_op
quad type reg_ops were only needed on Kepler, and not for any other chip
beginning Maxweel.

HAL g->ops.gr.access_smpc_reg() was incorrectly set for Volta and Turing
whereas it was only applicable to Kepler. Delete it.

There is no register in the quad type whitelist since the type itself is
not supported anymore. Remove the empty whitelists for all chips and
also delete below HALs:
g->ops.regops.get_qctl_whitelist()
g->ops.regops.get_qctl_whitelist_count()

hal/regops/regops_gv100.* files are not used anymore. Delete the files
instead of just deleting quad HALs in these files.

Bug 200628391

Change-Id: I4dcc04bef5c24eb4d63d913f492a8c00543163a2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366035
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
194fac7f3c gpu: nvgpu: Remove clutter in engine code
Remove the get_mask_on_id() HAL and replace it's usage with the
global nvgpu_engine_get_mask_on_id() function. There's no need
to have this function as a HAL.

JIRA NVGPU-5420

Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
2d94863cae gpu: nvgpu: move is_tpc_addr and get_tpc_num to common
gr.is_tpc_addr() and gr.get_tpc_num() are chip agnostic hals. Move these
hals to common code.

Jira NVGPU-5504

Change-Id: I50fa7ac876c8667de42df1830bd412b412538508
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349272
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Richard Zhao
6d922dd9b7 gpu: nvgpu: vgpu: remove debugfs node dump_ctxsw_stats_on_channel_close
It could cause kernel debug since vgpu cannot dump gr_ctx content.
Also set .dump_ctxsw_stats null in vgpu hal.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia9ec99d464be72e2be26df25c572e671e10c18a5
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349295
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Richard Zhao
cef1780e05 gpu: nvgpu: vgpu: remove ce_app support
Kernel oops on dump ce_app debugfs nodes. ce_app is only used by dGPU
which vgpu does not support currently. This patch removes hal setup and
debugfs setup for ce_app.

Bug 2848790

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ia60a06a27b2d2ceda96ca567cda9e9a01e023c4b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349294
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
cd7194cbc0 gpu: nvgpu: modify gmmu page table entry functions
Move below chip agnostic gmmu pte functions to common/mm/gmmu/pte.c.
- gmmu_aperture_mask()
- pte_dbg_print()

Default big page size for all chips is 64K. So, move
gp10b_mm_get_default_big_page_size() to common file and rename as
nvgpu_gmmu_default_big_page_size().

Modify gv11b_gpu_phys_addr() to use get_iommu_bit() hal.

JIRA NVGPU-4666

Change-Id: I512c42723faf2d03e5b367879c9c385dcf52cdc2
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329560
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
4acf78dff3 gpu: nvgpu: guard sync cmd hals properly
Make the syncpt and sema wait and incr command HAL ops consistent. Add
CONFIG_NVGPU_SW_SEMAPHORE guards for the semaphore ops. The syncpoint
ops already have CONFIG_TEGRA_GK20A_NVHOST around them.

Delete the dummy syncpt ops. They are not used; the ops are only needed
when the real versions exist.

Jira NVGPU-4548

Change-Id: I30315a67169b31b1d63a0a1a0a4492688db4a2bc
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325100
(cherry picked from commit ed13b286c5fbdbc008ec59172d98ac79e9f2e733)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331337
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
6202ead057 gpu: nvgpu: split sema sync hal to wait and incr
Instead of one HAL op with a boolean flag to decide whether to do one
thing or another entirely different thing, use two separate HAL ops for
filling priv cmd bufs with semaphore wait and semaphore increment
commands. It's already two ops for syncpoints, and explicit commands are
more readable than boolean flags.

Change offset into cmdbuf in sem wait HAL to be relative to the cmdbuf,
so the HAL adds the cmdbuf internal offset to it.

While at it, modify the syncpoint cmdbuf HAL ops' prototypes to be
consistent.

Jira NVGPU-4548

Change-Id: Ibac1fc5fe2ef113e4e16b56358ecfa8904464c82
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323319
(cherry picked from commit 08c1fa38c0fe4effe6ff7a992af55f46e03e77d0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328409
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vinod G
340ea241cb gpu: nvgpu: remove channel debug_dump hal
Channel debug_dump hal function does not involve
any register related code.

Move gv11b_channel_debug_dump hal function to
common code nvgpu_channel_info_debug_dump function.

Check gpu hw version to limit instance variables
dump that differs between socs.

Add new hal pointer syncpt_debug_dump for pbdma.

Jira NVGPU-5109

Signed-off-by: Vinod G <vinodg@nvidia.com>
Change-Id: Icfca837ce8e4117387cffa6fadf6c094c7da5946
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321016
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
f483304238 gpu: nvgpu: add prerequisite for syncpoint-shim support
add check for nvgpu_has_syncpoints() before enabling syncpoint-shim and
usermode_syncpoint support. Syncpoint shim cannot exist without
syncpoint support in the first place.

Bug 200551105

Change-Id: I2a9c6d23c72a25bcac4a2a8737ed0bad14cd4d8f
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323208
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Seshendra Gadagottu
675fb39ca0 gpu: nvgpu: add runlist.init_enginfo hal
Add runlist.init_enginfo hal to initialize
runlist's engine info. nvgpu-next has it's own
implementation for init_enginfo hal, so removed
NVGPU_NEXT_INIT_RUNLIST_ENGINFO from nvgpu hals.

JIRA NVGPU-4979

Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Change-Id: Ie35a88c6ba3c7c741124386f7c643b36b42d4143
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319103
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Sagar Kamble
59c6947fc6 gpu: nvgpu: add CONFIG_NVGPU_TEGRA_FUSE
Encapsulate the tegra fuse functionality under the config flag
CONFIG_NVGPU_TEGRA_FUSE.

Bug 2834141

Change-Id: I54c9e82360e8a24008ea14eb55af80f81d325cdc
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2306432
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
1c991a58af gpu: nvgpu: Add SM diversity support
To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute on
different hardware resources. This feature proposes modifications
in the software to modify the virtual SM id to TPC mapping across
the mission and redundant contexts. The virtual SM identifier to TPC
mapping is done by nvgpu when setting up the patch context.

The recommendation for the redundant setting is to offset the
assignment by one TPC, and not by one GPC. This will ensure that both
GPC and TPC diversity. The SM and Quadrant diversity will happen
naturally. For kernels with few CTAs, the diversity is guaranteed
to be 100%. In case of completely random CTA allocation,
e.g. large number of CTAs in the waiting queue, the diversity is
1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104.

Added NvGpu CFLAGS to enable/disable the SM diversity support
"CONFIG_NVGPU_SM_DIVERSITY".

This support is only enabled on gv11b and tu104 QNX non safety build.

JIRA NVGPU-4685

Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
cb117411ca gpu: nvgpu: cg: update the gating reglist hals
pwr_csb slcg, blcg gating registers are covered by pmu slcg/blcg hence
its load functions are not used. Hence, delete the generated data and
functions. slcg, blcg ctxsw_firmware and pg_gr gating reglists are
null hence delete the generated data and functions.

JIRA NVGPU-2175

Change-Id: Ib04d9845331c9a287666d3b8c974e1d3b66a7677
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2263272
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Lakshmanan M
a52ee77837 gpu: nvgpu: Add SM diversity gpu characteristic flag
To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute
on different hardware resources.
This feature requires a change in software to make it possible
to modify the virtual SM id to TPC mapping across mission and
redundant contexts.

This CL adds only SM diversity flags which are exposed to
its clients through ioctl/devctl interfaces.
Actual virtual SM id to TPC mapping implementation
will be part of upcoming patch sets.

Added NvGpu CFLAGS to identify the safety build
"CONFIG_NVGPU_BUILD_CONFIGURATION_IS_SAFETY"

JIRA NVGPU-4133

Change-Id: I5a18256780e6726e399e39c1c8d155d2ef07d7bd
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2250461
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Deepak Nibade
4554b4654a gpu: nvgpu: make gops.gr.init.fs_state return void
This HAL function does not return any real error at all.
So just change the return type to void.

In case of vGPU, this function only calls another HAL
gops.gr.config.init_sm_id_table(). So unset gops.gr.init.fs_state()
for vGPU, and call gops.gr.config.init_sm_id_table() directly.

Jira NVGPU-4373

Change-Id: I06a80520e9be50a0703608a79187c553b33aa582
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2247844
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
a8c9c800cd gpu: nvgpu: reorganization of MC interrupts control
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.

For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.

JIRA NVGPU-4336

Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
rmylavarapu
692a442e9d nvgpu: gpu: Remove freq_controller support.
Removed Freq_controller support as it is no longer
supported in auto profile.

NVGPU-4284

Change-Id: I276048e44cb8a33f303517da91cb6ea0f1612695
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2211457
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
b26acdeb87 gpu: nvgpu: move mc_boot_0 function to hals and rename to get_chip_details
This function gets the GPU chip architecture, implementation and
revision information by reading the MC boot register, hence it
is more suited to be located in HAL files.
test_check_gpu_state is now being run after test_hal_init as the
gops.mc needs to be initialized for test_check_gpu_state subtest.

JIRA NVGPU-2524

Change-Id: I85355af11d3505a9eb4f10a3fe4e6d9b56285047
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226018
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
2edf3db10a gpu: nvgpu: move mc gpu_ops out of gk20a.h and add doxygen comments for HALs
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.

JIRA NVGPU-2524

Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Philip Elcan
cf8707e2b3 gpu: nvgpu: mm: add hal to get max page table levels
Add a HAL API to get the maximum page table levels for the current
hardware.

JIRA NVGPU-3489

Change-Id: I1635ca576f3db461afb8e4e46db1e8912bcfdcd6
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2224449
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Sagar Kamble
6fe794bc98 gpu: nvgpu: prepare ce_app.h header
In preparation for SWUD of CG unit, separate CE app related APIs
into separate header ce_app.h.

JIRA NVGPU-4143

Change-Id: I9be8a4f2eee3aaf3af71f5843f957052064d9651
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2221660
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:10:29 -06:00
Peter Daifuku
77e3704d3d nvgpu: vgpu: no debugfs entries that rely on PMU
When virtualized, the guest OS has no direct access to
PMU functionality:

- Don't create debugfs entries that rely on PMU access
- Clean up PMU vgpu HAL entries that imply that PMU access
  is supported

Bug 200543218

Change-Id: I12730b600802448a240f3de042760041d3ae7d29
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2213650
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Philip Elcan
065f98f669 gpu: nvgpu: init: add return for all init APIs
This adds return values for all init APIs. This make all the init APIs
have the same signature. This is a prerequisite to making a table of
init functions.

JIRA NVGPU-3980

Change-Id: I5b71fd06ad248092af133ffe908e2930acb6d2b0
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2202973
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Philip Elcan
19dd64930d gpu: nvgpu: pmu: move rtos init to func ptr
This moves the nvgpu_pmu_rtos_init() to a HAL function pointer which
makes it consistent with the other init APIs.

JIRA NVGPU-3980

Change-Id: I562e264deaec76f2a45026a07f24d35b291b1930
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2202969
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Richard Zhao
1ad0bf9098 gpu: nvgpu: vgpu: add mmu_debug_mode support
Added two new IVC commands that set gr and fb mmu debug mode.

Bug 2586624

Change-Id: I358fb04713a9754fb209c0a90d02130dd4a1caf6
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2204980
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Deepak Nibade
99f775b622 gpu: nvgpu: compile out ctxsw stats dump in safety
CTXSW stats dump is only enabled on Linux and only through DEBUG FS.
Hence add CONFIG_DEBUG_FS compile time flag to remove corresponding
HALs in safety build.

Jira NVGPU-4028

Change-Id: I37088e1572c51ca35b651c56a4cb907eda5c9004
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2201371
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Mahantesh Kumbar
5eeb751d58 gpu: nvgpu: Move PMU RTOS functions out from pmu.c
Moved PMU RTOS functions to new file from pmu.c to make clear
separation of PMU unit init & PMU RTOS init.

JIRA NVGPU-2457

Change-Id: I694bf561517b4b55f9396be8e132dc0da5cb29e6
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2199543
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:05:52 -06:00
Thomas Fleury
b8465d479d gpu: nvgpu: sw quiesce when recovery is disabled
When CONFIG_NVGPU_RECOVERY is disabled, warn if recovery function
is entered with sw_quiesce_pending false.

Jira NVGPU-3871

Change-Id: Ic8e878ff6637c07f80b1a3542355ec51f729fe12
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2175446
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:01:38 -06:00
Thomas Fleury
8057514a9f gpu: nvgpu: set FB/HSMMU debug mode
Set NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and NV_PFB_PRI_MMU_DEBUG_CTRL
in addition to NV_PGRAPH_PRI_GPCS_MMU_DEBUG_CTRL, in
NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE

Bug 2515097

Change-Id: I1763b43e79fac3edb68a35980683d58bfa89519f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2115785
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-28 16:54:26 -07:00
Philip Elcan
52f80de033 gpu: nvgpu: init: make init functions pointers
Change the directly called init functions to function pointers in the
HAL. This makes it more consistent. This also allows for writing more
comprehensive unit tests for nvgpu.common.init.

JIRA NVGPU-2239

Change-Id: I05d739a8f8a2e7d385322d93154206eb0bfddc10
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2173920
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-25 21:55:57 -07:00
Debarshi Dutta
48c00bbea9 gpu: nvgpu: rename channel functions
This patch makes the following changes

1) rename public channel functions to use nvgpu_channel prefix
2) rename static channel functions to use channel prefix

Jira NVGPU-3248

Change-Id: Ib556a0d6ac24dc0882bfd3b8c68b9d2854834030
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2150729
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-08-01 04:37:31 -07:00
Aparna Das
5e877f2985 gpu: nvgpu: vgpu: move vgpu hal files out of common
Move vgpu hal files out of nvgpu common to hal.

Jira GVSCI-1339

Change-Id: Ibf2e987a88a1bf1e5790ed746b927c52b354f790
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2162259
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2019-07-29 16:28:44 -07:00