There are three issues with shadow domain submission:
1. The runlist mem is not swapped with mem_hw for the shadow domain when
a non-shadow domain is bound to the TSG, which prevents the runlist from
containing all the TSGs. To fix this, nvgpu_runlist_swap_mem() is now
called for the shadow domain as well.
2. The TSG num_active_channels is set as part of the non-shadow domain,
which is processed after the shadow domain. As a result, the runlist TSG
length is not set during runlist reconstruction, leaving the TSG length
0 for the last TSG. To fix this, num_active_channels is now always set
for the shadow domain, since it is configured first (see the sketch
below).
3. NV_BUILD_CONFIGURATION_VARIANT_IS_EMBEDDED does not serve to
differentiate L4T and embedded_linux builds, so NV_BUILD_SYSTEM_TYPE is
used in its place to determine the build type.
L4T uses round-robin scheduling and this issue occurs with manual mode
scheduling, so the fix is added only for manual mode scheduling support.
Bug 3884011
Change-Id: Ic55da8f75294eb32c8df6e35fb1fa47df78db8f8
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2833880
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The memory leak was due to nvgpu_profiler_unbind_context() calling
nvgpu_profiler_pm_resource_release() for all resources, which clears the
flag required by nvgpu_profiler_free_pma_stream() to release the memory
for the perf_buf instance block.
Fix this by splitting nvgpu_profiler_unbind_context() so that all the
PM resources are released separately at a later time.
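A minimal sketch of the split; the name of the new release helper is
illustrative, only the functions above come from this change:

    /* Unbind no longer drops PM resources, so the flag consulted by
     * nvgpu_profiler_free_pma_stream() survives until the perf_buf
     * instance block is freed. */
    void nvgpu_profiler_unbind_context(struct nvgpu_profiler_object *prof)
    {
        /* detach from context only; PM resources stay held */
    }

    void nvgpu_profiler_release_pm_resources(struct nvgpu_profiler_object *prof)
    {
        /* called later in the teardown path, after
         * nvgpu_profiler_free_pma_stream() has run */
    }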
Bug 3510455
Change-Id: Ibab8d071693e600c46f7e7f16575e36e6f62af3c
Signed-off-by: atanand <atanand@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2825013
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Use CONFIG_KMD_SCHEDULING_WORKER_THREAD instead of
CONFIG_NVS_KMD_BACKEND to remove confusion about the CPU based
KMD scheduling worker thread.
The KMD based scheduling worker thread caters to both the Manual Mode
CPU based scheduler and the Automatic Round Robin CPU based
scheduler.
For the traditional submit path, add correct handling of
CONFIG_NVS_PRESENT. The CPU based worker thread should be part of
CONFIG_NVS_PRESENT. Eventually, when CONFIG_KMD_SCHEDULING_WORKER_THREAD
is removed, the application must switch to GSP.
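The resulting guard nesting, sketched under the assumption that the
worker thread config is a sub-option of CONFIG_NVS_PRESENT:

    #ifdef CONFIG_NVS_PRESENT
    #ifdef CONFIG_KMD_SCHEDULING_WORKER_THREAD
        /* CPU based KMD scheduling worker thread: serves both the
         * manual mode and the automatic round-robin CPU schedulers */
    #endif
    #endif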
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0886ef3b2e0124b6fe22c2bf0bf7d1fa98039d00
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2810217
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Get the device id from the runlist PRI base instead of reading it from
the runlist structure, which can fail if the device node is not present
inside the runlist struct.
This change calls the get-device-id HAL to derive it from the rlend_id
and the runlist PRI base.
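The call shape, sketched; the HAL's location in gops and its signature
are assumptions, while rlend_id and the PRI base come from this change:

    /* Before: dev_id was read from the runlist struct and was invalid
     * whenever the device node was absent there. After: */
    u32 dev_id = g->ops.runlist.get_device_id(g, rlend_id,
                                              runlist_pri_base);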
NVGPU-8531
Change-Id: Ia81189a6c2281ed09ee52eb461f0cd87164c5fc4
Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2791605
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement the ioctls NVGPU_TSG_IOCTL_CREATE_SUBCONTEXT and
NVGPU_TSG_IOCTL_DELETE_SUBCONTEXT. These will allocate and
free the VEID numbers.
The address space association with the VEIDs is verified to
ensure that the channels' association with VEIDs and address
spaces remains consistent.
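Expected userspace usage, sketched with an assumed argument struct; only
the two ioctl names come from this change:

    struct nvgpu_tsg_create_subcontext_args args = { 0 };
    /* allocates a VEID; the number is returned through the args */
    ioctl(tsg_fd, NVGPU_TSG_IOCTL_CREATE_SUBCONTEXT, &args);
    /* ...bind channels that share this VEID and address space... */
    ioctl(tsg_fd, NVGPU_TSG_IOCTL_DELETE_SUBCONTEXT, &args); /* frees it */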
Bug 3677982
JIRA NVGPU-8681
Change-Id: I2d913baf61a6bdeec412c58270c0024b80ca15c6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766765
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Use CONFIG_NVS_KMD_BACKEND to enclose all NVS KMD based scheduling
code.
The current configuration keeps all the scheduling code within
CONFIG_NVS_PRESENT. Eventually, scheduling code shall use only GSP.
Hence, isolate the KMD based scheduling code under a config,
CONFIG_NVS_KMD_BACKEND. This shall make it easier to remove this code
later.
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I9dc668e0fa3e7706c111fda7a5e2415e1fc0dd03
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2769465
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
If the NEXT bit remains set for a channel being unbound, it can lead to
an MMU fault of type "unbound inst block". When userspace is closing the
channel and the NEXT bit is set, userspace retries.
When force-killing the channel, nvgpu can retry for a few iterations to
ensure the channel is truly idle and unbound. If the channel is
really stuck, the unbind will fail and the TSG will be aborted.
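A sketch of the kernel-side retry, with assumed helper names and retry
budget:

    /* Poll the NEXT bit a bounded number of times before giving up. */
    for (i = 0; i < MAX_UNBIND_RETRIES; i++) {
        if (!channel_next_set(g, ch))
            break;            /* idle: safe to unbind */
        nvgpu_msleep(10);     /* let the engine switch the channel out */
    }
    if (channel_next_set(g, ch)) {
        err = -EAGAIN;        /* really stuck: fail the unbind... */
        nvgpu_tsg_abort(g, tsg, true);  /* ...and abort the TSG */
    }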
Bug 3800844
Change-Id: I8fb024630ff2dd272245ae27116f3db6d6e0f788
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2787533
(cherry picked from commit 99e39f4b387743a93b05ba4b097c33b23fbbcf68)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2786479
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The GR context associated with a channel is updated in various driver
paths. The sequence for doing so is: disable the TSG, preempt the TSG,
update the GR context or instance block, and then enable the TSG.
These operations and the channel's runlist updates have to be done under
the TSG-specific ctx_init_lock to avoid races.
suspend_contexts and resume_contexts need special handling, which is
not covered in this patch.
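The serialized sequence, sketched with helper names that approximate
the driver's; only ctx_init_lock is named by this change:

    nvgpu_mutex_acquire(&tsg->ctx_init_lock);
    g->ops.tsg.disable(tsg);        /* stop new work on the TSG  */
    nvgpu_preempt_tsg(g, tsg);      /* drain it off the engine   */
    /* ...update GR context and/or instance block here... */
    g->ops.tsg.enable(tsg);         /* let the TSG run again     */
    nvgpu_mutex_release(&tsg->ctx_init_lock);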
Bug 3677982
Change-Id: I837257fe9d9ef3eb6f69f5d7e0707e0bb6d4ea72
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2720222
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Only certain combinations of channels of GFX/Compute object classes can
be assigned to a particular PBDMA and/or VEID. CILP can be enabled only
in certain configs. Implement checks for the configurations, verified
during alloc_obj_ctx and/or when setting the preemption mode.
Bug 3677982
Change-Id: Ie7026cbb240819c1727b3736ed34044d7138d3cd
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2719995
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The subcontext PDBs and valid mask in the instance blocks of the
channels in the various subcontexts have to be updated when a new
subcontext is created or an existing one is removed.
The replayable fault state is cached in the channel structure. The
replayable fault state for a subcontext is set based on the first
channel's bind parameter; it was earlier programmed in
channel_setup_ramfc.
init_inst_block_core is updated to set up the TSG-level PDB map and
mask.
Added a new HAL, gv11b_channel_bind, to enable the subcontext on
channel bind.
Bug 3677982
Change-Id: I58156c5b3ab6309b6a4b8e72b0e798d6a39c1bee
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2719994
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
This patch introduces the following relationships among various nvgpu
objects to support multiple address spaces with subcontexts.
The IOCTLs establishing each relationship are shown in parentheses.
nvgpu_tsg 1<---->n nvgpu_tsg_subctx (TSG_BIND_CHANNEL_EX)
nvgpu_tsg 1<---->n nvgpu_gr_ctx_mappings (ALLOC_OBJ_CTX)
nvgpu_tsg_subctx 1<---->1 nvgpu_gr_subctx (ALLOC_OBJ_CTX)
nvgpu_tsg_subctx 1<---->n nvgpu_channel (TSG_BIND_CHANNEL_EX)
nvgpu_gr_ctx_mappings 1<---->n nvgpu_gr_subctx (ALLOC_OBJ_CTX)
nvgpu_gr_ctx_mappings 1<---->1 vm_gk20a (ALLOC_OBJ_CTX)
On unbinding the channel, objects are deleted according
to dependencies.
Without subcontexts, gr_ctx buffer mappings are maintained in the
struct nvgpu_gr_ctx. For subcontexts, they are maintained in the
struct nvgpu_gr_subctx.
The preemption buffer (index NVGPU_GR_CTX_PREEMPT_CTXSW) and the PM
buffer (index NVGPU_GR_CTX_PM_CTX) are to be mapped in all
subcontexts when they are programmed from the respective ioctls.
Global GR context buffers are to be programmed only for VEID0.
Based on the channel object class, the state is patched into the
patch buffer on every ALLOC_OBJ_CTX call, unlike before where it
was set only for the first channel.
PM and preemption buffer programming is protected under the TSG
ctx_init_lock.
tsg->vm is now removed. The VM reference for gr_ctx buffer mappings
is managed through the gr_ctx or gr_subctx mappings object.
For vgpu, gr_subctx and mappings objects are created to reference
the VMs for the gr_ctx lifetime.
The functions nvgpu_tsg_subctx_alloc_gr_subctx and
nvgpu_tsg_subctx_setup_subctx_header set up the subcontext header
for the native driver.
The function nvgpu_tsg_subctx_alloc_gr_subctx is called from
vgpu to manage the gr ctx mapping references.
free_subctx is now done when unbinding a channel, taking into account
references to the subcontext held by other channels. It will unmap
the buffers in the native driver case, and just release the
VM reference in the vgpu case.
Note that the TEGRA_VGPU_CMD_FREE_CTX_HEADER ioctl is no longer called
by vgpu, as this is taken care of by the native driver.
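The association, sketched as minimal structs; the member names are
assumptions, the relationships are those listed above:

    struct nvgpu_tsg_subctx {
        u32 veid;
        struct nvgpu_gr_subctx *gr_subctx;  /* 1:1 per subcontext */
        struct nvgpu_list_node channels;    /* 1:n bound channels */
    };

    struct nvgpu_gr_ctx_mappings {
        struct vm_gk20a *vm;                /* 1:1 address space   */
        struct nvgpu_list_node subctxs;     /* 1:n gr_subctx objs  */
    };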
Bug 3677982
Change-Id: Ia439b251ff452a49f8514498832e24d04db86d2f
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2718760
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
With subcontext support added, nvgpu has to allocate the VEID0 channel
itself to initialize the golden context image. Allocate the channel
and initialize the golden context image at the beginning of the
alloc_obj_ctx call for the first user channel.
It can't be initialized at the end of probe, as the TPC PG settings
need to be updated before the golden context image is initialized.
Bug 3677982
Change-Id: Ia82f6ad6e088c2bc1578a6bd32b7c7a707a17224
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2756289
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The wait_pending HAL is now modified to simply
check the pending status of a given runlist;
the while loop is removed from this HAL.
A new function, nvgpu_runlist_wait_pending_legacy(), is
added that emulates the older wait_pending() HAL.
nvgpu_runlist_tick() is modified to accept a 64-bit
"preempt_grace_ns" value.
These changes prepare for the upcoming control-fifo parser
changes.
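A sketch of the split, assuming nvgpu's timeout helpers and a stand-in
poll budget; the legacy wrapper keeps the old looping behavior:

    int nvgpu_runlist_wait_pending_legacy(struct gk20a *g,
                                          struct nvgpu_runlist *rl)
    {
        struct nvgpu_timeout to;

        nvgpu_timeout_init_cpu_timer(g, &to, POLL_TIMEOUT_MS);
        do {
            /* the HAL now returns the instantaneous pending status */
            if (g->ops.runlist.wait_pending(g, rl) == 0)
                return 0;
            nvgpu_usleep_range(10, 20);
        } while (nvgpu_timeout_expired(&to) == 0);
        return -ETIMEDOUT;
    }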
Jira NVGPU-8619
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: If3f288eb6f2181743c53b657219b3b30d56d26bc
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2766100
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The following changes are added here to simplify the overall
sequence.
1) Remove the deferred update for runlists. The NVS worker thread
shall submit the updated runlist.
2) Move the runlist mem swap inside the update itself. Protect
the swap() and hw_submit() path with a spinlock. This
is temporary until GSP.
3) Enable control-fifo mode from the nvgpu driver.
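The locking around the critical pair, sketched with an assumed lock
name:

    /* Temporary until GSP owns submission: keep the mem swap and the
     * hardware submit atomic with respect to each other. */
    nvgpu_spinlock_acquire(&rl->submit_lock);
    nvgpu_runlist_swap_mem(g, domain);  /* publish the rebuilt list */
    g->ops.runlist.hw_submit(g, rl);    /* kick it to hardware      */
    nvgpu_spinlock_release(&rl->submit_lock);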
Jira NVGPU-8609
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Icc52e5d8ccec9d3653c9bc1cf40400fc01a08fde
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2757406
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
A previous commit (44b6bfbc1) added a hack to prevent
the worker thread from calling nvgpu_runlist_tick()
in post_process if the next domain matches the previous one.
This could still face issues with multiple domains
in the future.
A better way is to synchronize the thread's suspend/resume
along with the device's common OS-agnostic suspend/resume
operations. This emulates the GSP as well.
It also takes care of the power constraints,
i.e. the worker thread can be expected to always run
with power enabled, so we can get rid of the complex
gk20a_busy() lock here for good.
Implemented a state-machine based approach for suspending/
resuming the NVS worker thread from the existing callbacks.
Remove support for NVS worker thread creation for VGPU.
The hw_submit method is currently set to NULL for VGPU; VGPU
instead submits its updates via the runlist.reload() method.
Jira NVGPU-8609
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I51a20669e02bf6328dfe5baa122d5bfb75862ea2
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750403
Reviewed-by: Prateek Sethi <prsethi@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
On tsg.unbind_channel HAL failure, channel teardown was being done
again even though it had already been done as part of
nvgpu_tsg_unbind_channel_common.
Just abort the TSG and return the error in that case. Also, decrement
the TSG ch_count in the failure path.
Bug 3677982
Change-Id: I466f2b2c693d43ed64dc531b08bf740bf00f28a6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2749970
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
The KMD needs to send the domain id and the GPU_VA corresponding
to the struct runlist_domains to GSP. In the current
implementation, struct nvgpu_runlist_domain contains
the domain name instead of the domain id. This requires
an additional search by name every time an update
needs to be submitted to the GSP.
Modify struct nvgpu_runlist_domain to store the domain id
instead of the domain name. This simplifies the flow and avoids
the unnecessary search.
Removed the conditional check for the existence of the shadow domain,
as it is dead code: the shadow domain is not searchable in the list
of domains inside struct nvgpu_runlist.
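The struct change, sketched with assumed member names:

    struct nvgpu_runlist_domain {
        u64 domain_id;  /* was a name string; the id plus the domain's
                         * GPU_VA can now go to GSP without a lookup */
        /* ... runlist mem, mem_hw, active channel bookkeeping ... */
    };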
Jira NVGPU-8610
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I0d67cfa93d89186240290e933aa750702b14f4f0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2744890
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Some of the functions with no traceability to unit tests are already
covered by callee API functions. Skip these functions in SWVR by
skipping doxygen for them.
Some of the functions are non-FuSa, like those in profile.h and
bsearch.h. Those were included because the headers were included in the
Doxygen sources. Mark them non-safe.
Some of the nvgpu functions were not added to the Targets entries for
their respective tests. Fix those.
JIRA NVGPU-7211
Change-Id: Iacf22dccdd9340100cf93814566d3979734c455d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2612982
(cherry picked from commit a40f62654747102cc8ef53ddbd9f953c21c2b745)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2737672
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In order to maintain separate mappings of GR TSG and global context
buffers for different subcontexts, we need to separate the memory
struct and the mapping struct for the buffers. This patch moves
the mappings of all GR ctx buffers to a new structure,
nvgpu_gr_ctx_mappings.
This will be instantiated per subcontext in upcoming patches.
Summary of changes:
1. Various context buffers were allocated and mapped separately.
All TSG context buffers are now stored in the gr_ctx->mem[] array,
since allocation and mapping are unified for them.
2. Mapping/unmapping and querying the GPU VA of the context
buffers is now handled in the ctx_mappings unit. The structure
nvgpu_gr_ctx_mappings in nvgpu_gr_ctx holds the maps.
This struct is instantiated on ALLOC_OBJ_CTX and deleted
on free_gr_ctx.
3. Introduce mapping flags for TSG and global context buffers,
to map different buffers with different caching
attributes. Map all buffers as cacheable except the
PRIV_ACCESS_MAP, RTV_CIRCULAR_BUFFER, FECS_TRACE, GR CTX
and PATCH ctx buffers. Map all buffers as privileged.
4. Wherever a VM or GPU VA is passed into the obj_ctx allocation
functions, it is now replaced by nvgpu_gr_ctx_mappings.
5. The free_gr_ctx API need not accept the VM, as the mappings
struct holds the VM. The mappings struct is kept in gr_ctx.
6. Move the preemption buffer allocation logic out of
nvgpu_gr_obj_ctx_set_graphics_preemption_mode.
7. The set_preemption_mode and gr_gk20a_update_hwpm_ctxsw_mode
functions need updates to ensure buffers are allocated
and mapped.
8. Keep the unit tests and documentation updated.
With these changes there is a clear segregation of allocation and
mapping of GR context buffers. This will simplify the further change
to add multiple address space support. With multiple address
spaces in a TSG, subcontexts created after the first one
just need to map the buffers.
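The resulting split, sketched with assumed member names:

    /* memory lives with the TSG's GR context... */
    struct nvgpu_gr_ctx {
        struct nvgpu_mem mem[NVGPU_GR_CTX_COUNT];  /* allocation only */
    };

    /* ...while per-address-space GPU VAs live in the mappings object */
    struct nvgpu_gr_ctx_mappings {
        struct vm_gk20a *vm;
        u64 gpu_va[NVGPU_GR_CTX_COUNT];
    };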
Bug 3677982
Change-Id: I3cd5f1311dd85aad1cf547da8fa45293fb7a7cb3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2712222
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch primarily separates runlist modification from
runlist submission.
Instead of submitting the runlist (domain) immediately after
modification, a worker thread interface is now used to
synchronously schedule runlist submits. If the runlist being
scheduled is currently active, the submit happens instantly;
otherwise, it happens in the next iteration when the NVS
thread schedules the domain. This external interface uses
a condition variable to wait for the completion of the
synchronous submits.
A pending_update variable is used to synchronize domain memory
swaps just before they are submitted.
To facilitate faster scheduling via the NVS thread, nvgpu_dom
itself contains an array of rl_domain pointers. This can then
be used to select the appropriate rl_domain directly for scheduling,
as opposed to the earlier approach of keeping nvs domains and rl
domains in sync every time.
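The external interface's wait, sketched around nvgpu's condition
variable helpers; the field and cond names are assumptions, while
pending_update comes from this change:

    /* Caller: flag the domain for submission, wake the NVS thread,
     * then wait until the thread has swapped and submitted it. */
    domain->pending_update = true;
    nvgpu_cond_signal(&sched->worker_cond);
    NVGPU_COND_WAIT(&domain->submit_cond,
                    !domain->pending_update, 0U /* no timeout */);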
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I1725c7cf56407cca2e3d2589833d1c0b66a7ad7b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2739795
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
While creating a new channel, the ioctls are called in the below
sequence:
1. GPU_IOCTL_OPEN_CHANNEL
2. AS_IOCTL_BIND_CHANNEL
3. TSG_IOCTL_BIND_CHANNEL_EX
4. CHANNEL_ALLOC_GPFIFO_EX
5. CHANNEL_ALLOC_OBJ_CTX
The subctx PDBs and valid mask are programmed into the channel instance
block in the channel ioctls AS_IOCTL_BIND_CHANNEL and
CHANNEL_ALLOC_GPFIFO_EX.
Programming them in the ioctl AS_IOCTL_BIND_CHANNEL is redundant.
Remove the related HAL g->ops.mm.init_inst_block_for_subctxs.
The HAL init_inst_block programs the context PDB and big page size.
The HAL init_inst_block_core programs the context PDB, big page size
and the subctx 0 PDB. It is used by h/w units (fecs, pmu, hwpm, bar1,
bar2, sec2, gsp, perfbuf etc.).
For user channels, subctx PDBs are programmed as part of RAMFC setup.
Bug 3677982
Change-Id: I6656b002d513404c1fd7c3d349933e80cca7e604
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2680907
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Background: In the case of a deferred suspend implemented by gk20a_idle,
the device waits for a delay before suspending and invoking the
power gating callbacks. This helps minimize resume latency for any
resume calls (gk20a_busy) that occur before the delay expires.
Now, some APIs spread across the driver require that if the device
is powered on, they can proceed with register writes, but if it is
powered off, they must return. Examples of such APIs include
l2_flush, fb_flush and even the nvs_thread. We have relied on
some hacks to ensure the device is kept powered on to prevent any such
delayed suspension from proceeding. However, this still raced for some
calls like the l2_flush ioctl, so gk20a_busy() was added (refer to
commit dd341e7ecbaf65843cb8059f9d57a8be58952f63).
The upstream Linux kernel has introduced the API
pm_runtime_get_if_active specifically to handle the corner case of
locking the state during the event of a deferred suspend.
According to the Linux kernel docs, invoking the API with the
ign_usage_count parameter set to true prevents an incoming suspend
if the device has not already suspended.
With this, there is no longer a need to check
nvgpu_is_powered_off(). Changed the behavior of gk20a_busy_noresume()
to return bool: it returns true iff it managed to prevent
an imminent suspend, and false otherwise. For cases where
PM runtime is disabled, the code follows the existing implementation.
Added the missing gk20a_busy_noresume() calls to tlb_invalidate.
Also, moved gk20a_pm_deinit to after nvgpu_quiesce() in
the module removal path. This is done to prevent register access
after the registers are locked out at the end of nvgpu_quiesce. This
can happen as some free function calls post-quiesce might still
have l2_flush or fb_flush deep inside their stack; hence invoke
gk20a_pm_deinit to disable pm_runtime immediately after quiesce.
Kept the legacy implementation the same for VGPU and
older kernels.
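The new gk20a_busy_noresume() behavior, sketched around the upstream
API named above; the runtime-PM-disabled branch is simplified here:

    #include <linux/pm_runtime.h>

    bool gk20a_busy_noresume(struct device *dev)
    {
        if (!pm_runtime_enabled(dev))
            return true;  /* legacy path: no deferred suspend to race */
        /*
         * Takes a usage-count reference only if the device is still
         * active, which blocks an imminent deferred suspend without
         * ever resuming the device.
         */
        return pm_runtime_get_if_active(dev, true) > 0;
    }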
Jira NVGPU-8487
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I972f9afe577b670c44fc09e3177a5ce8a44ca338
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2715654
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
In its current form, the default domain acts like any schedulable
domain: TSGs are bound to it and it can be enumerated via the
public interfaces.
The new expectation is for the default domain to change
from the current form to a pseudo domain that cannot act like
an ordinary domain in other ways, i.e. it must not be reachable
by the domain management API in particular, it can't be removed,
it does not show up in lists, and TSGs cannot be explicitly bound to
it. It won't participate in round-robin domain scheduling.
It is not really a domain, and acts like one only when activated in
manual mode.
The following changes are made overall to support the above change in
definition.
1) Domain creation and attaching the domain to the scheduler are now
split into two separate functions. The new default domain (having ID
= UINT64_MAX) is created separately from a static function without
linking it with the other domains in the scheduler.
2) struct nvgpu_nvs_scheduler explicitly stores the default domain
to support direct lookups.
3) TSGs are initially not bound to the default domain/rl_domain.
Jira NVGPU-8165
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I916d11f4eea5124d8d64176dc77f3806c6139695
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697477
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In order to support the concept of the default domain, a new
rl domain is created that shadows all the other domains, i.e.
all channels of all TSGs are replicated there. This is scheduled
by default during GPU boot.
1) The shadow rl_domain is constructed during the poweron sequence via
nvgpu_runlist_alloc_shadow_rl_domain(). struct nvgpu_runlist
is extended to store this separately as 'shadow_rl_domain'.
This is scheduled in the background as long as no user-created
rl domains exist.
2) 'shadow_rl_domain' is scheduled out once user-created rl domains
exist. At this point, any updates in the user-created rl domains
are synchronized with the 'shadow_rl_domain', i.e. 'shadow_rl_domain'
is also reconstructed to contain the active channels and TSGs from the
rl domain.
3) 'shadow_rl_domain' is scheduled back in when the last user-created
rl domain is removed.
4) In the future, for manual mode, the driver shall support explicitly
switching to 'shadow_rl_domain'. Also, we will move to an
implementation where 'shadow_rl_domain' is switched out only when
other domains are actively scheduled. These changes will be
implemented later.
Jira NVGPU-8165
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Ia6a07d6bfe90e7f6c9e04a867f58c01b9243c3b0
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704702
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
BVEC changes for nvgpu_rc_pbdma_fault and nvgpu_rc_mmu_fault
started reporting the below MISRA issues.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:321:
1. misra_c_2012_rule_5_1_violation: Declaration with identifier
"nvgpu_tsg_unbind_channel_check_hw_state", which is ambiguous.
kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:349:
2. other_declaration: The first 31 characters of identifiers
"nvgpu_tsg_unbind_channel_check_ctx_reload" and
"nvgpu_tsg_unbind_channel_check_hw_state" are identical.
Do the below renames to fix the issue; both are done for consistency.
s/nvgpu_tsg_unbind_channel_check_hw_state/nvgpu_tsg_unbind_channel_hw_state_check
s/nvgpu_tsg_unbind_channel_check_ctx_reload/nvgpu_tsg_unbind_channel_ctx_reload_check
JIRA NVGPU-6772
Change-Id: Ib92cabe11c486621351bf15ddb86e20d16d514c4
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584152
(cherry picked from commit a619f259c6a4ffccb05550767212989af60c2a90)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2706551
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
- MISRA Directive 4.7
Calling function "nvgpu_tsg_unbind_channel(tsg, ch, true)" which returns
error information without testing the error information.
- MISRA Rule 10.3
Implicit conversion from essential type "unsigned 64-bit int" to different
or narrower essential type "unsigned 32-bit int"
- MISRA Rule 5.7
A tag name shall be a unique identifier
JIRA NVGPU-5955
Change-Id: I109e0c01848c76a0947848e91cc6bb17d4cf7d24
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572776
(cherry picked from commit 073daafe8a11e86806be966711271be51d99c18e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678681
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Include the scheduling domain name in the channel debug dump. The domain
name of a channel is the domain name of its parent TSG, if any. Copy
just the name into the dump info to avoid refcounting concerns.
While at it, reword the deterministic flag for less ambiguity.
Jira NVGPU-6791
Change-Id: I06041277f938e20f23de9aa419cfffbaa028035e
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2673101
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replace id-based lookup with fd-based lookup when binding a TSG to a
domain. The device node based domain interface naturally provides access
control; this way userspace tools can limit which uid/gid can access
each domain.
Also, explicitly disallow binding channels to a TSG that has no runlist
domain yet. Normally a TSG is in the default domain if nothing else has
been specified, but the default domain can be deleted.
Jira NVGPU-6788
Change-Id: I2af96dfc002367d894eaf0c175006332f790c27f
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651165
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
A reference to the default scheduling domain is taken when a TSG is
opened. Although the explicit bind is designed to support only one bind,
the TSG is bound to the default one implicitly at that point. Release
the reference to avoid leaking it.
The domain might be null at that point if the default domain has been
removed; in that case there's just no domain to put back.
Change-Id: I7db43f7bbb2a8c86c391280eb7fa32431c8982da
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2663420
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
The nvgpu timeout API has an internal override for presilicon mode by
default: in presi simulation environments the timeouts never trigger.
This behaviour is intended for the original usecase of the timer unit,
hardware polling loops. In pure software logic, though, the timer
must trigger after the specified timeout even in presi mode, so add a
new init function to produce a timer for software logic. Use this new
kind of timer in the channel and scheduling worker threads.
The channel worker currently times out just for the purpose of the
channel watchdog timer, which has its own internal timer. Although
that's just software, the general expectation is that the watchdog does
not trigger in presilicon tests that run slower than usual. The internal
watchdog timer thus keeps the non-sw mode.
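The intended distinction, sketched; the new init function's name is an
assumption:

    struct nvgpu_timeout to;

    /* HW-poll timer: per the default policy, never expires on presi */
    nvgpu_timeout_init(g, &to, timeout_ms, NVGPU_TIMER_CPU_TIMER);

    /* SW-logic timer: expires after timeout_ms even on presi */
    nvgpu_timeout_init_cpu_timer_sw(g, &to, timeout_ms);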
Bug 3521828
Change-Id: I48ae8522c7ce2346a930e766528d8b64195f81d8
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662541
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
- Make the domain scheduler timeslice type nanoseconds to future-proof
the interface
- Return -ENOSYS from ioctls if the nvs code is not initialized
- Return the number of domains also when user supplied array is present
- Use domain id instead of name for TSG binding
- Improve documentation in the uapi headers
- Verify that reserved fields are zeroed
- Extend some internal logging
- Release the sched mutex on alloc error
- Add file mode checks in the nvs ioctls. The create and remove ioctls
require writable file permissions, while the query does not; this
allows filesystem based access control on domain management on the
single dev node (see the sketch after this list).
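The mode check, sketched inside the ioctl dispatcher; the ioctl names
are assumptions:

    switch (cmd) {
    case NVGPU_NVS_IOCTL_CREATE_DOMAIN:
    case NVGPU_NVS_IOCTL_REMOVE_DOMAIN:
        if ((filp->f_mode & FMODE_WRITE) == 0U)
            return -EPERM;   /* mutating ops need a writable fd */
        break;
    default:
        break;               /* queries are allowed read-only   */
    }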
Jira NVGPU-6788
Change-Id: I668eb5972a0ed1073e84a4ae30e3069bf0b59e16
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2639017
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit