To achieve permanent fault coverage, the CTAs launched by
each kernel in the mission and redundant contexts must execute on
different hardware resources. This feature proposes modifications
in the software to modify the virtual SM id to TPC mapping across
the mission and redundant contexts. The virtual SM identifier to TPC
mapping is done by nvgpu when setting up the patch context.
The recommendation for the redundant setting is to offset the
assignment by one TPC, and not by one GPC. This will ensure that both
GPC and TPC diversity. The SM and Quadrant diversity will happen
naturally. For kernels with few CTAs, the diversity is guaranteed
to be 100%. In case of completely random CTA allocation,
e.g. large number of CTAs in the waiting queue, the diversity is
1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104.
Added NvGpu CFLAGS to enable/disable the SM diversity support
"CONFIG_NVGPU_SM_DIVERSITY".
This support is only enabled on gv11b and tu104 QNX non safety build.
JIRA NVGPU-4685
Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2268026
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA rule 11.2 doesn't allow conversions of a pointer from or to an
incomplete type. These type of conversions may result in a pointer
aligned incorrectly and may further result in undefined behavior.
This patch addresses rule 11.2 violations related to pointers to and
from struct nvgpu_sgl. This patch replaces struct nvgpu_sgl pointers by
void pointers.
Jira NVGPU-3736
Change-Id: I8fd5766eacace596f2761b308bce79f22f2cb207
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2267876
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following DEVCTLs not needed in safety build
Compiled out below DEVCTLs for safety build
* NVGPU_GPU_DEVCTL_SET_THERM_ALERT_LIMIT
* NVGPU_GPU_DEVCTL_GET_TPC_EXCEPTION_EN_STATUS
* NVGPU_GPU_DEVCTL_GET_CPU_TIME_CORRELATION_INFO
Also added config flag CONFIG_NVGPU_IOCTL_NON_FUSA
JIRA NVGPU-3768
Change-Id: Ia233d0aac8201268524581f588d97390a913ab9c
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2159398
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
There was a name clash between the nvgpu_set_error_notifier*() APIs and
the SET_ERROR_NOTIFIER IOCTL. Therefore, the APIs were renamed from
nvgpu_set_error_notifier*() to nvgpu_set_err_notifier*(). This rename
was done to fix MISRA 5.x errors.
JIRA NVGPU-1633
Change-Id: I06af551a664b0706f106e853f1ea8733894f11bd
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2159813
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Following are removed for safety build by adding
CONFIG_NVGPU_KERNEL_MODE_SUBMIT flag.
1) HAL ops in g->ops.sync.syncpt
add_wait_cmd
get_wait_cmd_size
add_incr_cmd
get_incr_cmd_size
get_incr_per_release
2) g->ops.sync.sema is removed in its entirety and contains the
following ops.
3) The following files are compiled out using the above flag.
hal/sync/sema_cmdbuf_gk20a.c
hal/sync/sema_cmdbuf_gv11b.c
Jira NVGPU-3479
Change-Id: I99ae6913e5fe5707ff9a3e2cf06cee8710def7cc
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2130352
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following functions belong to the path of kernel_mode submit and
the flag CONFIG_NVGPU_KERNEL_MODE_SUBMIT is used to compile these out
of safety builds.
channel_gk20a_alloc_priv_cmdbuf
channel_gk20a_free_prealloc_resources
channel_gk20a_joblist_add
channel_gk20a_joblist_delete
channel_gk20a_joblist_peek
channel_gk20a_prealloc_resources
nvgpu_channel
nvgpu_channel_add_job
nvgpu_channel_alloc_job
nvgpu_channel_alloc_priv_cmdbuf
nvgpu_channel_clean_up_jobs
nvgpu_channel_free_job
nvgpu_channel_free_priv_cmd_entry
nvgpu_channel_free_priv_cmd_q
nvgpu_channel_from_worker_item
nvgpu_channel_get_gpfifo_free_count
nvgpu_channel_is_prealloc_enabled
nvgpu_channel_joblist_is_empty
nvgpu_channel_joblist_lock
nvgpu_channel_joblist_unlock
nvgpu_channel_kernelmode_deinit
nvgpu_channel_poll_wdt
nvgpu_channel_set_syncpt
nvgpu_channel_setup_kernelmode
nvgpu_channel_sync_get_ref
nvgpu_channel_sync_incr
nvgpu_channel_sync_incr_user
nvgpu_channel_sync_put_ref_and_check
nvgpu_channel_sync_wait_fence_fd
nvgpu_channel_update
nvgpu_channel_update_gpfifo_get_and_get_free_count
nvgpu_channel_update_priv_cmd_q_and_free_entry
nvgpu_channel_wdt_continue
nvgpu_channel_wdt_handler
nvgpu_channel_wdt_init
nvgpu_channel_wdt_restart_all_channels
nvgpu_channel_wdt_restart_all_channels
nvgpu_channel_wdt_rewind
nvgpu_channel_wdt_start
nvgpu_channel_wdt_stop
nvgpu_channel_worker_deinit
nvgpu_channel_worker_from_worker
nvgpu_channel_worker_init
nvgpu_channel_worker_poll_init
nvgpu_channel_worker_poll_wakeup_post_process_item
nvgpu_channel_worker_poll_wakeup_process_item
nvgpu_submit_channel_gpfifo_kernel
nvgpu_submit_channel_gpfifo_user
gk20a_userd_gp_get
gk20a_userd_pb_get
gk20a_userd_gp_put
nvgpu_fence_alloc
The following members of struct nvgpu_channel are compiled out of
safety build.
struct gpfifo_desc gpfifo;
struct priv_cmd_queue priv_cmd_q;
struct nvgpu_channel_sync *sync;
struct nvgpu_list_node worker_item;
struct nvgpu_channel_wdt wdt;
The following files are compiled out of safety build.
common/fifo/submit.c
common/sync/channe1_sync_semaphore.c
hal/fifo/userd_gv11b.c
Jira NVGPU-3479
Change-Id: If46c936477c6698f4bec3cab93906aaacb0ceabf
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127212
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Safety build does not support vidmem. This patch compiles out vidmem
related changes - vidmem, dma alloc, cbc/acr/pmu alloc based on
vidmem and corresponding tests like pramin, page allocator &
gmmu_map_unmap_vidmem..
As vidmem is applicable only in case of DGPUs the code is compiled
out using CONFIG_NVGPU_DGPU.
JIRA NVGPU-3524
Change-Id: Ic623801112484ffc071195e828ab9f290f945d4d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2132773
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
File vgpu_fifo_gv11b.c contained syncpoint related implementation
specific to gv11b. Move the implementations to a new file in
hal directory for vgpu hal/vgpu/sync/syncpt_cmdbuf_gv11b_vgpu.c.
Also move function vgpu_gv11b_init_fifo_setup_hw() to a new
file in hal directory for vgpu hal/vgpu/fifo/fifo_gv11b_vgpu.c.
Add a new yaml file nvgpu-hal-vgpu.yaml that contains vgpu
specific hal files. Update arch yaml to reflect the above changes.
Jira GVSCI-994
Change-Id: Ie33614473d5fd3fcd624c70709b109c4e45725ef
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2138390
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
vgpu_gv11b_tsg_bind_channel() was specific to gv11b. Modify
function vgpu_tsg_bind_channel() to handle gv11b specific
case by checking if subctx is supported.
Delete gv11b specific file common/vgpu/gv11b/vgpu_tsg_gv11b.c
and update arch yaml file accordingly.
Jira GVSCI-994
Change-Id: I36c1f7392087573afa06cd3652a145aa92055f1c
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2138389
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove below hals since the corresponding functions are same on all
platforms and they are h/w independent
g->ops.gr.enable_ctxsw()
g->ops.gr.disable_ctxsw()
g->ops.gr.reset()
Call the functions directly at all places
Remove CONFIG_NVGPU_DEBUGGER from places where these functions are
called since they are not debugger dependent
This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery
sequence intact
Jira NVGPU-3506
Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2137255
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix the following MISRA rule violations in bitops unit,
MISRA Rule 10.1
MISRA Rule 10.3
MISRA Rule 10.4
MISRA Rule 11.8
MISRA Rule 21.2
Introduce nvgpu specific functions for bitops and bitmap operations
with unsigned integer as parameter for offset. OS specific type
conversions and handling of these inerfaces are taken care in the
respective OS files.
Jira NVGPU-3545
Change-Id: Ib1ef76563db6ba1d879a0b4d365b2958ea03f85c
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2129513
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Hal API g->ops.gr.halt_pipe() is defined in unsafe unit hal.gr.gr
It is called from safe unit, and it calls into API
g->ops.gr.falcon.ctrl_ctxsw() which is also safe
Hence get rid of unsafe API g->ops.gr.halt_pipe().
Caller now directly calls hal.gr.falcon API to halt pipe
Jira NVGPU-3506
Change-Id: I5439cb79431795fc7c22384832cf632d6db03316
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127755
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move some interrupt handling hals from hal.gr.gr unit to hal.gr.intr
unit as below
g->ops.gr.intr.set_hww_esr_report_mask()
g->ops.gr.intr.handle_tpc_sm_ecc_exception()
g->ops.gr.intr.get_esr_sm_sel()
g->ops.gr.intr.clear_sm_hww()
g->ops.gr.intr.handle_ssync_hww()
g->ops.gr.intr.log_mme_exception()
g->ops.gr.intr.record_sm_error_state()
g->ops.gr.intr.get_sm_hww_global_esr()
g->ops.gr.intr.get_sm_hww_warp_esr()
g->ops.gr.intr.get_sm_no_lock_down_hww_global_esr_mask()
g->ops.gr.intr.get_sm_hww_warp_esr_pc()
g->ops.gr.intr.tpc_enabled_exceptions()
g->ops.gr.intr.get_ctxsw_checksum_mismatch_mailbox_val()
Rename gv11b_gr_sm_offset() to nvgpu_gr_sm_offset() and move to
common.gr.gr unit
All of above functions and hals will be needed in safety build
Jira NVGPU-3506
Change-Id: I278d528e4b6176b62ff44eb39ef18ef28d37c401
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127753
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
renamed NVGPU_LS_PMU to NVGPU_FEATURE_LS_PMU to follow
nvgpu naming standard
Compile out LS PMU files when PMU RTOS support is
disabled for safety build by setting NVGPU_LS_PMU
build flag to 0
JIRA NVGPU-3418
Change-Id: Ib09924ac25657e932723c10be573f2f701cb7bea
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127794
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Compile out PMU RTOS specific PMU HAL code when
PMU RTOS support is disabled for safety build by setting
NVGPU_LS_PMU build flag to 0
Added new functions for gv11b PMU HAL to easy compile out
other PMU HAL files.
Replaced all gk20a_writel/readl calls with nvgpu_writel/readl
calls in hal/pmu/pmu_gv11b.c files
JIRA NVGPU-3418
Change-Id: I7c315349aa95721990dc7b1570383669bcb6221f
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2117691
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
pmu_gk20a.c includes hw_mc_gk20a.h other than hw_pwr_gk20a.h
to access & configure pmu interrupt, this breaks single hw header
for HAL file.
Moved PMU interrupt enable to MC unit by creating/modifying current
mc ops intr_unit_config to intr_pmu_unit_config to configure PMU
interrupt specifically as this ops is only used by PMU unit
JIRA NVGPU-3239
Change-Id: I2514f17197708047b46ea712cf4569a5b3bfab2a
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2126420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Renamed gk20a_channel_* APIs to nvgpu_channel_* APIs.
Removed unused channel API int gk20a_wait_channel_idle
Renamed nvgpu_channel_free_usermode_buffers in os/linux-channel.c to
nvgpu_os_channel_free_usermode_buffers to avoid conflicts with the API
with the same name in channel unit.
Jira NVGPU-3248
Change-Id: I21379bd79e64da7e987ddaf5d19ff3804348acca
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2121902
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>