Replace all nvgpu_next functions/structs either by 1) collapsing them
into nvgpu legacy functions/structs 2) renaming them as follows:
- nvgpu_next_*() => nvgpu_(ga10b/ga100)_*()
- nvgpu_next_*() => (ga10b/ga100)_*()
- nvgpu_next_*() => nvgpu_*() [only if this doesn't cause collision]
- nvgpu_next_*() = > nvgpu_*_extra()
Create hal.sim unit and move Ampere+ SIM code into it.
Jira NVGPU-4771
Change-Id: I215594a0d0df4bd663bd875a0d0db47bcb9ff6a2
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548056
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The CONFIG_NVGPU_NEXT config is no longer required now that ga10b and
ga100 sources have been collapsed. However, the ga100, ga10b sources
are not safety certified, so mark them as NON_FUSA by replacing
CONFIG_NVGPU_NEXT with CONFIG_NVGPU_NON_FUSA.
Move CONFIG_NVGPU_MIG to Makefile.linux.config and enable MIG support
by default on standard build.
Jira NVGPU-4771
Change-Id: Idc5861fe71d9d510766cf242c6858e2faf97d7d0
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2547092
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add api to translate dmabuf's fmode_t to gk20a_mem_rw_flag
for read only/read write mapping selection.
By default dmabuf fd mapping permission should be a maximum
access permission associated to a particual dmabuf fd.
Remove bit flag MAP_ACCESS_NO_WRITE and add 2 bit values for
user access requests NVGPU_VM_MAP_ACCESS_DEFAULT|READ_ONLY|
READ_WRITE.
To unify map access type handling in Linux and QNX move the
parameter NVGPU_VM_MAP_ACCESS_* check to common function
nvgpu_vm_map.
Set MAP_ACCESS_TYPE enabled flag in common characteristics
init function as it is supported for Linux and QNX.
Bug 200717195
Bug 3250920
Change-Id: I1a249f7c52bda099390dd4f371b005e1a7cef62f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2507150
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Query the num_perfmon requires golden context to be ready. Accessing
golden context might require gr_instance_id, specific to a GR engine.
On TOT, get_num_hwpm_perfmon() called from perfmon HAL which might
require to call nvgpu_gr_exec_with_err_for_instance().
It internally calls nvgpu_grmgr_config_gr_remap_window() to change
gr_window_remap register points to a current gr_instance_id for MIG.
This approach indirectly mandates to call
nvgpu_gr_exec_with_err_for_instance() which can be
completely avoided. get_num_hwpm_perfmon() is just a query call
which can be moved after the golden context creation.
Using this logic, we can avoid unnecessary invocation of
nvgpu_gr_exec_with_err_for_instance() during perform specific
HAL accesses.
1) Moved get_num_hwpm_perfmon() after golden context creation.
2) Added nvgpu_assert() if (g->num_sys_perfmon == 0U).
JIRA NVGPU-5656
Change-Id: I59a6ab4df93763adbc0765fa5e4d1712b2477521
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542438
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
1) Expose logical mask instead of physical mask when MIG is enabled.
For legacy, NvGpu expose physical mask.
2) Added fb related info in struct nvgpu_gpu_instance().
4) Added utility api to get the logical id for a given local id
nvgpu_grmgr_get_gr_gpc_logical_id()
5) Added grmgr api to get max_gpc_count
nvgpu_grmgr_get_max_gpc_count().
5) Added grmgr's fbp api to get num_fbps and its enable masks.
nvgpu_grmgr_get_num_fbps()
nvgpu_grmgr_get_fbp_en_mask()
nvgpu_grmgr_get_fbp_rop_l2_en_mask()
6) Used grmgr's fbp apis in ioctl_ctrl.c
7) Moved fbp_init_support() in nvgpu_early_init()
8) Added nvgpu_assert handling in grmgr.c
9) Added vgpu hal for get_max_gpc_count().
JIRA NVGPU-5656
Change-Id: I90ac2ad99be608001e7d5d754f6242ad26c70cdb
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538508
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
HAL .exec_regops used to first validate regops then execute it, now
moving it to only execute the regops.
- It helps B0CC on HV. On server side it does not track profiler object,
but regops validation uses the profiler, so moving validation to client
side.
- The change also remove ctx_buffer_offset checking in
validate_reg_op_offset. The offset already checked again whitelists
which have be verified when update whitelist. Also vgpu does not have
information of ctx and golden image.
- Added function nvgpu_regops_exec to cover both regops validation and
execution.
Jira GVSCI-10351
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I434e027290e263a8a64a25a55500f7294038c9c4
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534252
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
If user calls IOCTL to allocate object context for two channels in same
TSG in parallel, nvgpu_gr_setup_alloc_obj_ctx() could end up racing and
trying to allocate object context for both channels at the same time.
This could result in corrupting object context.
Fix this by introducing per-TSG mutex ctx_init_lock to serialize context
initialization for all channels within TSG.
In ideal scenario nvrm_gpu is the only caller of all the IOCTLs, and
nvrm_gpu makes sure to initialize object context for each channel in
serial order. Because of this new lock does not cause any contention.
Jira NVGPU-6431
Change-Id: Ibb1cbb4878748929bb7f23e8666c283c39ecbf5a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538333
(cherry picked from commit 8be447838dc1ecbd5637eb6bd13b8f338eaf33cd)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538773
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
CIC (Central Interrupt controller) will be responsible for the
interrupt handling. common.cic unit is the placeholder for all
interrupt related code. Move interrupt related defines and
Public APIs present in common.mc to common.cic.
Note: The common.mc interrupts related struct definitions are
not moved as part of this patch.
Adapt the code to use interrupt handling related defines and public
APIs migrated from common.mc to common.cic
JIRA NVGPU-6899
Change-Id: I747e2b556c0dd66d58d74ee5bb36768b9370d276
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2535618
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GFxP preemption for graphics contexts is not supported in safety.
But the support was enabled along with CONFIG_NVGPU_GRAPHICS since GFxP
preemption was protected under same config.
Add a separate config CONFIG_NVGPU_GFXP to protect all GFxP specific
code, enum values, and HALs.
Disable the config in safety profile.
Jira NVGPU-6893
Change-Id: Iebb5f754a1025dfa6e05a94704bdb8a7123b599a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534986
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.
This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.
New APIs are exposed by CIC unit to access its internal data like:
1. Struct err_desc - the static err handling /injection data per
error id
2. Num_hw_modules - the number of error reporting HW units
supported by CIC
Init and deinit of CIC unit:
1. CIC unit should be initialized earlyon during boot so that it
is available for any interrupt handling.
2. Initialize CIC just before the interrupts are enabled during
boot.
3. Similarly, CIC is disabled late during deinit cycle; right
after the interrupts are masked.
LUT:
1. LUT is currently used only for reporting error to safety
services in gv11b QNX safety build.
2. This error handling policy LUT currently has only two levels
of handing - correctable and quiecse.
3. Once, the error handling policy decision is moved from leaf
unit nodes to CIC, LUT will be updated to have additional levels
like fast recovery and full recovery.
4. Also, then a separate LUT will be added for each platform/build.
5. In current framework, the LUT is set to NULL for all
configurations except gv11b.
report_err() ops is added to report error to safety services.
This ops is only effective for gv11b qnx build; and set to NULL for
other configurations.
NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754
Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added init function for common.ptimer unit and called
this init function during nvgpu early init.
int nvgpu_ptimer_init(struct gk20a *g);
Added following helper function for programming
prod values for slcg timer unit:
void nvgpu_cg_slcg_timer_load_enable(struct gk20a *g);
Invoked prod programming for slcg timer unit from
nvgpu_ptimer_init.
Jira NVGPU-6026
Change-Id: I29e32380a4d05ec8276d7ebe59bc2733917f8184
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524037
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Update SSMD array size to hold all supported super-surface
members
-Handle the error and report if invalid SSMD ID is found.
issue: At present SSMD array size set to 32 but overall
33 super-surface members are supported, when 33rd member
accessed system crash happened due to overflow access,
so fixing it by setting the SSMD array size to actual
number of super-surface members supported
Bug 200721968
Bug 200721966
Change-Id: I5ba1084a661d7497056f13a053d2fc79d50f595c
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2528569
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some of the RTV circular buffer programming is under GRAPHICS config and
some is under DGPU config. For nvgpu-next, RTV circular buffer is
required even for iGPU so keeping the code under DGPU config does not
make sense.
Move all the code from DGPU config to GRAPHICS config.
Bug 3159973
Change-Id: I8438cc0e25354d27701df2fe44762306a731d8cd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2524897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
This is adding code to select MIG mode and boot
the GPU with selected mig config.
For testing MIG, after system boots
1. write mig_mode_config by
echo x > /sys/devices/gpu.0/mig_mode_config for igpu
echo x > /sys/devices/./platform/14100000.pcie/pci0001:00/0001:00:00.0/0001:01:00.0/ for dgpu
2. Then run any nvgpu* tests or nvrm_gpu_info.
If the mig_mode need to be changed , note down the supported
configs by "cat mig_mode_config_list" and reboot the system
3. Follow steps 1 and 2.
example output:
"cat mig_mode_config" 2
"cat mig_mode_config_list"
+++++++++ Config list Start ++++++++++
CONFIG_ID : 0 for CONFIG NAME : 2 GPU instances each with 4 GPCs
CONFIG_ID : 1 for CONFIG NAME : 4 GPU instances each with 2 GPCs
CONFIG_ID : 2 for CONFIG NAME : 7 GPU instances - 1 GPU instance with 2
GPCs + 6 GPU instances each with 1 GPC
CONFIG_ID : 3 for CONFIG NAME : 5 GPU instances - 1 GPU instance with 4
GPCs + 4 GPU instances each with 1 GPC
CONFIG_ID : 4 for CONFIG NAME : 4 GPU instances - 1 GPU instance with 2
GPCs + 2 GPU instances each with 1 GPC + 1 GPU instance with 4 GPCs
CONFIG_ID : 5 for CONFIG NAME : 6 GPU instances - 2 GPU instances each
with 2 GPCs + 4 GPU instances each with 1 GPC
CONFIG_ID : 6 for CONFIG NAME : 5 GPU instances - 1 GPU instance with
2 GPCs + 2 GPU instances each with 1 GPC + 2 GPU instances with 2 GPCs
CONFIG_ID : 7 for CONFIG NAME : 5 GPU instances - 2 GPU instances each
with 2 GPCs + 1 GPC instance with 2 GPCs + 2 GPU instances with 1 GPC
CONFIG_ID : 8 for CONFIG NAME : 5 GPU instances - 1 GPC instance with 2
GPCs + 2 GPU instances each with 1 GPC + 2 GPU instances each with 2
GPCs
CONFIG_ID : 9 for CONFIG NAME : 1 GPU instance with 8 GPCs
++++++++++ Config list End +++++++++++
JIRA NVGPU-6633
Change-Id: I3e56f8c836e1ced8753a60f328da63916faa7696
Signed-off-by: dt <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2522821
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Lazy bootstrap is a secure iGPU feature where LS falcons(FECS and
GPCCS) are bootstrapped by LSPMU in both cold boot and recovery boot.
As there is no ACR running after boot, we need LSPMU to bootstrap LS
falcons to support recovery.
In absence of LSPMU, ACR will bootstrap LS falcons but recovery is
not supported.
This CL will enable Low secure falcon manager(lsfm) to support
Lazy bootstrap feature. This will allow nvgpu to send cmds
to lspmu to bootstrap LS falcons.
Bug 200709761
Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I65d17cf5e07a45c040a9bb75f75cf18eb509cd4f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2506162
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
1) nvgpu poweron sequence split into two stages:
- nvgpu_early_init() - Initializes the sub units
which are required to be initialized before the grgmr init.
For creating dev node, grmgr init and its dependency unit
needs to move to early stage of GPU power on.
After successful nvgpu_early_init() sequence,
NvGpu can indetify the number of MIG instance required
for each physical GPU.
- nvgpu_finalize_poweron() - Initializes the sub units which
can be initialized at the later stage of GPU power on sequence.
- grmgr init depends on the following HAL sub units,
* device - To get the device caps.
* priv_ring - To get the gpc count and other
MIG config programming.
* fb - MIG config programming.
* ltc - MIG config programming.
* bios, bus, ecc and clk - dependent module of
priv_ring/fb/ltc.
2) g->ops.xve.reset_gpu() should be called before GPU sub unit
initialization. Hence, added g->ops.xve.reset_gpu() HAL in the
early stage of dGPU power on sequence.
3) Increased xve_reset timeout from 100ms to 200ms.
4) Added nvgpu_assert() for gpc_count, gpc_mask and
max_veid_count_per_tsg for identify the GPU boot
device probe failure during nvgpu_init_gr_manager().
JIRA NVGPU-6633
Change-Id: I5d43bf711198e6b3f8eebcec3027ba17c15fc692
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521894
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit