As part of supporting GPU minimal boot, some of the HALs are assigned
with NULL. Before dereferencing such HALs, check for NULL.
Further, this patch reorganizes needs_init API to first check whether a
function pointer is NULL, to enable quick turnaround to the caller of this API.
JIRA NVGPU-9283
Change-Id: I768d1cddef8766948c99892c59f6ebe6f31be9d8
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2814225
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add API to query all memhandles used for pde and pte.
- Some direct pde/pte allocation should also add entry to the pd-cache
full list.
- Add OS API for querying MemServ handle from nvgpu_mem.
- Traverse through all pd-cache partial and full lists to get memhandles
for all pde/pte buffers.
Jira NVGPU-8284
Change-Id: I8e7adf1be1409264d24e17501eb7c32a81950728
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2735657
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Now that we are planning to enable CTRL_FIFO support with NVS,
there is no need for a separate enabled flag for the same.
CTRL_FIFO support is instead determined by the presence of
NVGPU_SUPPORT_NVS enable flag alone.
For non-auto platforms, Control-Fifo can be disabled by restricting
access to /dev/nvsched_ctrl_fifo.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I9dbec60e5668f38e1460c43800584e88b16a2550
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2814435
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Use CONFIG_KMD_SCHEDULING_WORKER_THREAD instead of
CONFIG_NVS_KMD_BACKEND to remove confusion about the CPU based
KMD scheduling worker thread.
The KMD based scheduling worker thread caters to both Manual Mode
CPU based scheduler as well as Automatic Round Robin CPU based
scheduler.
For the traditional submit path, add correct handling of the
CONFIG_NVS_PRESENT. CPU based worker thread should be part of
CONFIG_NVS_PRESENT. Eventually, when DCONFIG_KMD_SCHEDULING_WORKER_THREAD
is removed, the application must switch to GSP.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0886ef3b2e0124b6fe22c2bf0bf7d1fa98039d00
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2810217
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Mon subelement needs to access GPC TPC configuration registers
to figure out GPC/TPC count post floor sweeping. These registers
are needed as part of intr handling.
Move priv ring init ahead in RM boot to make sure priv ring
enumeration is done before any unit enables it's interrupts.
This ensures that priv ring is enumerated before any interrupt
handler is run to access GPC/TPC count.
JIRA NVGPU-6528
Change-Id: I8aa6ac182e6dd60a79fa76af6813ea70102316f4
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2809442
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
- The pmu_pg_task thread remains alive in the background
during railgate and rail-ungate.
- During rail-ungate, the PG task thread starts again and
executes PG-related tasks.
- It comes in pmu_pg_init_powergating() and waits for GR
initialization. Here it waits for gr to be initialized.
- In parallel, the main GPU thread works on rmmod (from
gpu_module_reload test).
- By this time, the main gpu thread has started rmmod and
gr->initialized can be set to false, thus causing an uninterruptible
wait for pmu_pg_task thread.
- To solve this, wake gr wait wq in rmmod path when
NVGPU_DRIVER_IS_DYING and NVGPU_KERNEL_IS_DYING flgas are set.
Bug 3806514
Change-Id: Id78d92f30b75aba1aee22398cc86a3acebd50ef6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2798003
(cherry picked from commit d9345065bcb6d9ff497c127fa4cd52077f4ecfa4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2807245
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Any corrected or uncorrected error reported by gpu hw
will be seen by nvgpu-mon. nvgpu-mon will raise a devctl call
to notify nvgpu-rm if its a fatal error interrupt.
nvgpu_cic_mon_handle_fatal_intr is the corresponding handler which will
walk through the entire tree structure of interrupts for all the subunits
and enter quiesce state.
Change-Id: I3c00c61a7f2c52ae1920f84ee7dfb65cba6b683d
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2801693
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
- When we enable AELPG, we send maximum idle threshold value
as one of the parameters.
- This was set to 70 msec whereas the ELPG IDLE threshold
was set to 15 msec.
- Thus, when AELPG is enabled, the threshold is getting modified
to a much higher value. So sometimes ELPG is not getting engaged
and this is leading to timeout in ELPG MODS 101 test on GVS.
Bug 3737783
Change-Id: Iac94f053e19cea5898b962c3d02369d556b8518f
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2786749
(cherry picked from commit 5010eaffda2ecaf6c9cc5f0fed68498cb2947071)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2809154
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Get the device id from runlist pri base instead of reading from
runlist structure which could be failing if the device node not
present inside the runlist struct.
This Change will call the get device id hal to get it from rlend_id
and runlist pri base.
NVGPU-8531
Change-Id: Ia81189a6c2281ed09ee52eb461f0cd87164c5fc4
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2791605
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In order to share the TSG across different devices securely, device
instance IDs are to be exchanged for endpoint identification. Add
device instance ID field to gk20a_ctrl_priv which is generated
from gk20a level device instance id value.
Share this ID to userspace via gpu characteristics.
Bug 3677982
JIRA NVGPU-8681
Change-Id: I79d92a81c02272c52e24f5b12c452c8993137037
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2792079
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Currently, the registeration with error injection utility is done
only for GA10b using HAL. But HALs are not initialized during the
probe stage when we try to register the error injection utility.
So, the callback registration does not happen HAL is set to NULL.
Move the callback registration from probe to poweron stage when HAL
is initialized.
Update the nvgpu_cic_mon_init_lut() API name as it is no longer
doing only LUT initialization.
Bug 3828050
Change-Id: Ide718029e9317124749b4a51c423ae70dc8227c8
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2790269
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Implement the ioctls NVGPU_TSG_IOCTL_CREATE_SUBCONTEXT and
NVGPU_TSG_IOCTL_DELETE_SUBCONTEXT. These will allocate and
free the VEID numbers.
Address space association with the VEIDs is verified to
ensure that channels association with VEIDs and address
space remains consistent.
Bug 3677982
JIRA NVGPU-8681
Change-Id: I2d913baf61a6bdeec412c58270c0024b80ca15c6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766765
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
To avoid duplication of same code across multiple chips, export the
following functions through the corresponding headers for the
consumption of other GPU enabling functions:
- ga10b_gr_intr_report_tpc_sm_rams_ecc_err
- gv11b_gr_intr_report_l1_tag_uncorrected_err
- gv11b_gr_intr_report_l1_tag_corrected_err
- gv11b_gr_intr_report_icache_uncorrected_err
JIRA NVGPU-9075
Change-Id: I927285b6e638479ac52cd5d214711e490e5f151e
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2798371
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Due to changes in the host1x driver, dma_fence callbacks will be
executed in interrupt context instead of workqueue context as
previously. To allow for that, this patch effectively moves the
workqueue step into nvgpu so that the in-nvgpu fence callback gets
executed in workqueue context.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I7bfa294aa3b4bea9888921b79175a8fc218d8e3f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785968
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Added infrastructure for enabling parsing Control-Fifo's
ring buffers(i.e. send/receive).
Initialization of these buffers are handled as part of
nvgpu_nvs_buffer_alloc() call itself.
A follow-up change shall implement the methods defined here
as part of the existing NVS worker thread.
The changes adhered to the design laid out in the header
nvsched/include/nvs/nvs-control-interface.h.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I2050e6fb681eba80e01cf547ada37a955e58315a
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2764518
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Use CONFIG_NVS_KMD_BACKEND to enclose all NVS KMD based scheduling
code.
Current configuration contains all the scheduling code managed within
CONFIG_NVS_PRESENT. Eventually, scheduling code shall only use GSP.
Hence, isolate KMD based scheduling code to a config
CONFIG_NVS_KMD_BACKEND. This shall make it easier to remove this code
later.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I9dc668e0fa3e7706c111fda7a5e2415e1fc0dd03
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2769465
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
If NEXT bit remains set for a channel being unbound, it can lead to
MMU fault of type unbound inst block. When userspace is closing the
channel and NEXT bit is set, userspace retries.
When force killing the channel, nvgpu can retry few iterations to
ensure the channel is truly idle and unbound. If the channel is
really stuck then unbind will fail and TSG will be aborted.
Bug 3800844
Change-Id: I8fb024630ff2dd272245ae27116f3db6d6e0f788
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2787533
(cherry picked from commit 99e39f4b387743a93b05ba4b097c33b23fbbcf68)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2786479
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
When building NVGPU as an OOT module for upstream Linux kernels, the
NVGPU driver source is now copied into a common location with all the
other OOT modules. Therefore, we can now use the 'srctree.nvidia' path
for finding the necessary header files for Host1x and NVMAP. Update the
include search paths to use 'srctree.nvidia' when building NVGPU as an
OOT module.
Bug 3817518
Change-Id: I63066e4331c66a0f47ada83fde3e63402faaf38a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785910
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The error injection code was enabled only when CONFIG_NVGPU_DGPU = n
so that the dGPUs do not attempt any error injection callback
function registration. But, this introduced dependency on DGPU
config when needs to be explicitly set to n for error injection to
be enabled.
Remove the dependency by moving the error injection callback
registration and deregistration to a HAL which is enabled only
on GA10b.
Bug 3819160
Change-Id: I4f4eb99189b1af3502d719536a91cc5e5d866bce
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2787202
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changes
- Initialize virtual memory for gsp. This space is used for creating
queues for ctrl fifo. Also can be used to ro map sync-pt to this
instance where gsp firmware can poll the sync-pt with sync-pt id.
- Enabled gsp context interface and written the instance block pointer
to nxtctx register for the gsp firmware to access created virtual memory.
- Added required gsp registers for this feature.
NVGPU-8730
Bug 3770916
Change-Id: If538f615eca3f9b7840ffe2787826528b4808886
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2764649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch fixes nvgpu_css_allocate_perfmon_ids which
leads to a buffer overflow if the allocation of perfmon
ids does not succeed.
If the allocation of perfmon ids cannot be satisfied,
bitmap_find... would return CSS_MAX_PERFMON_IDS and
nvgpu_bitmap_set would still be called with start after
the bitmap array. This results into a buffer overflow.
Bug 3814963
Change-Id: I4caff36cf0c920b4445e1841d16ba2b4c3d19aaa
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2786747
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Prateek Sethi <prsethi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
As part of the negative test case we replace the ACR binaries with
corrupted one(by editing the binary in hex editor). The expectaion
is that the process should log the error and exit properly but instead
the process crashed.
The root cause was because NVGPU driver was trying to pause the thread
using nvgpu_nvs_worker_pause but the but NVS isn't initialized at that
point. NVS is initialized after acr init.
Mitigated this failure by adding a checking condition in
nvgpu_nvs_worker_pause.
Bug 3670576
Change-Id: Ibfe66b253be034e7ca2c3ed298dc28d27e1d6de9
Signed-off-by: ht <ht@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2782937
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Prateek Sethi <prsethi@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>