GR context associated with channel is updated in various driver paths.
Sequence to do the same is disable the TSG, preempt the TSG, update
the GR context or instance block and then enable the TSG.
These operations and runlist updates for channel have to be done under
TSG specific ctx_init_lock to avoid the race.
suspend_contexts and resume_contexts needs special handling which is
not covered in this patch.
Bug 3677982
Change-Id: I837257fe9d9ef3eb6f69f5d7e0707e0bb6d4ea72
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2720222
Reviewed-by: Scott Long <scottl@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
In order to maintain separate mappings of GR TSG and global context
buffers for different subcontexts, we need to separate the memory
struct and the mapping struct for the buffers. This patch moves
the mappings of all GR ctx buffers to new structure
nvgpu_gr_ctx_mappings.
This will be instantiated per subcontext in the upcoming patches.
Summary of changes:
1. Various context buffers were allocated and mapped separately.
All TSG context buffers are now stored in gr_ctx->mem[] array
since allocation and mapping is unified for them.
2. Mapping/unmapping and querying the GPU VA of the context
buffers is now handled in ctx_mappings unit. Structure
nvgpu_gr_ctx_mappings in nvgpu_gr_ctx holds the maps.
On ALLOC_OBJ_CTX this struct is instantiated and deleted
on free_gr_ctx.
3. Introduce mapping flags for TSG and global context buffers.
This is to map different buffers with different caching
attribute. Map all buffers as cacheable except
PRIV_ACCESS_MAP, RTV_CIRCULAR_BUFFER, FECS_TRACE, GR CTX
and PATCH ctx buffers. Map all buffers as privileged.
4. Wherever VM or GPU VA is passed in the obj_ctx allocation
functions, they are now replaced by nvgpu_gr_ctx_mappings.
5. free_gr_ctx API need not accept the VM as mappings struct
will hold the VM. mappings struct will be kept in gr_ctx.
6. Move preemption buffers allocation logic out of
nvgpu_gr_obj_ctx_set_graphics_preemption_mode.
7. set_preemption_mode and gr_gk20a_update_hwpm_ctxsw_mode
functions need update to ensure buffers are allocated
and mapped.
8. Keep the unit tests and documentation updated.
With these changes there is clear seggregation of allocation and
mapping of GR context buffers. This will simplify further change
to add multiple address spaces support. With multiple address
spaces in a TSG, subcontexts created after first subcontext
just need to map the buffers.
Bug 3677982
Change-Id: I3cd5f1311dd85aad1cf547da8fa45293fb7a7cb3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2712222
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
gr ctx buffer in non-cacheable hence there is no need to do L2 cache
flush when updating the buffer. Remove the flushes.
pm ctx buffer is cacheable hence add l2 flush in the function
nvgpu_profiler_quiesce_hwpm_streamout_non_resident since it
updates the buffer.
Bug 3677982
Change-Id: I0c15ec7a7f8fa250af1d25891122acc24443a872
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2713916
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Upstream commit 7938f4218168 ("dma-buf-map: Rename to iosys-map")
renames 'struct dma_buf_map' to 'struct iosys_map' and breaks building
the NVGPU driver with Linux v5.18-rc1. In the NVGPU driver there are
many places where 'dma_buf_map' is used and so to clean-up the code and
minimise the impact of this change, add a gk20a_dmabuf_vmap() and a
gk20a_dmabuf_vunmap() helper function. These new functions support all
kernel versions and eliminate a lot the KERNEL_VERSION ifdefs.
Bug 3598986
Change-Id: Id0f904ec0662f20f3d699b74efd9542d12344228
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693970
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Patch udpates/fixes following issues.
- Updates nvgpu_dbg_gpu_get_mappings_entry.size to u64 to address >4G
limitations.
- Removes offset from original cpuva and unmaps only original mapped
address.
- Call nvgpu_vm_find_mapped_buf_range() in place of
nvgpu_vm_find_mapped_buf() to find the addresses which are not
page aligned.
- Update logic to parse the gpuva while trying to find gpu mappings so
that gpuva which are more than the mapped buffer base address can also
be considered.
Bug 200722275
Change-Id: If33d85db37a9f03a662984c212544a8b2ade471c
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2612129
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
The PMA unit can only access GPU VAs within a 4GB window, hence both
the user allocated PMA buffer and the kernel allocated bytes available
buffer should lie in the same 4GB window. This is accomplished by
carving out and reserving a 4GB VA space in perbuf.vm and using fixed
GPU VAs to ensure that both buffers are bound within the same 4GB window.
In addition, update ALLOC_PMA_STREAM to use pma_buffer_offset,
pma_buffer_map_size fields correctly.
Bug 3503708
Change-Id: Ic5297a22c2db42b18ff5e676d565d3be3c1cd780
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671637
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
Finding gpu va mapping inside a given range is a two step process where
in first step number of mapping are queried and at second step it
queries for all the continues mapping range for that given gpu va
range. Mapping interface should count and return number of mappings if
input count is 0 in place of failing it.
Patch make the change for this two step process and only returns count
at first step and in second step returns the continues memory ranges.
Patch also replaces nvgpu_zalloc with nvgpu_big_zalloc to handle bigger
size allocation.
Bug 200722275
Change-Id: I56428deafa560ac8471c78f102bb1f9dbe20cabc
Signed-off-by: prsethi <prsethi@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2591043
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add Setter and Getter methods for accessing tsg->sm_error_states.
Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state.
This renders it unnecessary to add BVEC for above fields for the struct
in multiple locations. The current design ensures that only a constant
pointer is obtained from the owner unit i.e. FIFO.
The following new methods are added. Both unit tests and BVEC tests
are added for them as well.
nvgpu_tsg_store_sm_error_state
nvgpu_tsg_get_sm_error_state
Jira NVGPU-6947
Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
HAL .exec_regops used to first validate regops then execute it, now
moving it to only execute the regops.
- It helps B0CC on HV. On server side it does not track profiler object,
but regops validation uses the profiler, so moving validation to client
side.
- The change also remove ctx_buffer_offset checking in
validate_reg_op_offset. The offset already checked again whitelists
which have be verified when update whitelist. Also vgpu does not have
information of ctx and golden image.
- Added function nvgpu_regops_exec to cover both regops validation and
execution.
Jira GVSCI-10351
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I434e027290e263a8a64a25a55500f7294038c9c4
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2534252
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add gv11b and tu104 HALs to get allowed HWPM resource register ranges,
offsets, and stride meta data.
Add new enum nvgpu_pm_resource_hwpm_register_type for HWPM register
type. Add new struct nvgpu_pm_resource_register_range_map to store all
the register ranges for HWPM resources. Add pointer of map in struct
nvgpu_profiler_object along with map entry count.
Add new API nvgpu_profiler_build_regops_allowlist() to build the regops
allowlist dynamically while binding the resources. Map entry count is
received with get_pm_resource_register_range_map_entry_count() and only
those resource ranges are added for which resource is reserved by
profiler object.
Add nvgpu_profiler_destroy_regops_allowlist() to destroy the allowlist
while unbinding the resources.
Add static functions allowlist_range_search() to search a register
offset in HWPM resource ranges. Add another static function
allowlist_offset_search() to search the offset in per-resource offset
list.
Add nvgpu_profiler_validate_regops_allowlist() that accepts an offset
value, checks if it is in allowed ranges using allowlist_range_search()
and then checks if offset is in allowlist using allowlist_offset_search().
Update gops.regops.exec_regops() to receive profiler object pointer as
a parameter.
Invoke nvgpu_profiler_validate_regops_allowlist() from
validate_reg_ops() if prof pointer is not-null. This will be true only
for new profiler stack and not legacy profilers.
In gr_exec_ctx_ops(), skip regops execution if offset is invalid.
Bug 2510974
Jira NVGPU-5360
Change-Id: I40acb91cc37508629c83106ea15b062250bba473
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460001
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Add new API nvgpu_get_gk20a_from_cdev() that extracts gk20a pointer
from cdev pointer. This helps in keeping cdev related implementation
details in ioctl.c and away from other device ioctl files.
Also move struct nvgpu_cdev, nvgpu_class, and nvgpu_cdev_class_priv_data
from os_linux.h to ioctl.h since all of these structures are more IOCTL
related and better to keep them in ioctl specific header.
Jira NVGPU-5648
Change-Id: Ifad8454fd727ae2389ccf3d1ba492551ef1613ac
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2435466
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Remove static dev node meta data from struct nvgpu_os_linux and replace
it by a dynamic list. Struct nvgpu_os_linux will only keep track of list
head and number of entries.
Add new structure nvgpu_cdev to store meta data of each dev node and
create/setup it dynamically in gk20a_user_init(). Once done, add the new
node under list head maintained in nvgpu_os_linux.
Add a static list dev_node_list[] that contains list of dev node names
and file operations. This static list is used to create nvgpu_cdev data
structures and to register new device nodes.
Update all dev node open file operations (e.g. gk20a_as_dev_open()) to
extract struct gk20a pointer from device pointer of dev node.
gk20a device is the parent of dev node device.
Jira NVGPU-5648
Change-Id: If070c3428afd6215e45b4919335d9f43e04c36f9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2428500
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Legacy profiler does not reserve PMA stream resource with PM
reservation system. Also, HWPM system reset is separately implemented
in membuf disable path. And it does not even restore perf unit SLCG
prod values.
Allcoate a dummy profiler object for debug session in perfbuf map
path. Free it in perfbuf unmap path.
This has advantage of synchronizing PMA stream reservation with new
profiler stack. And this also leverages HWPM system reset and SLCG
handling code during resource reservation.
Remove explicit HWPM reset from gops.perf.membuf_reset_streaming()
HALs
Bug 2510974
Jira NVGPU-5360
Change-Id: I54c5202b6251dea3d80a4dfc011e8a296339e07f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399595
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement new API nvgpu_prof_ioctl_exec_reg_ops() to support regops on
new profiler objects.
Add two new staging buffers to hold regops copied from userspace, and
to convert and execute regops in common code.
Buffers are allocated and released along with the profiler object.
New API will implements this :
- copy regops data in chunks of 4K from userspace
- store them in staging buffer
- convert the new regop struct into common regop struct and also
copy the content into second staging buffer
- trigger gops.regops.exec_regops() with second staging buffer as
operation pointer
- convert common regop struct back into new regop struct and copy
back to userspace
Export bunch of helper functions from ioctl_dbg.h. e.g.
nvgpu_get_regops_op_values_common()
Update regop execution code to skip regop execution if regop status
is not valid. This is only possible when userspace requests for
CONTINUE_ON_ERROR mode.
Add more documentation to some of the fields in UAPI header.
Note that maximum atomic operations reported by new API are same
as legacy API and are incorrect. This will be fixed up in upcoming
patches.
Bug 2510974
Jira NVGPU-5360
Change-Id: I9f82052b22143aec33f6e778c0784386744b699e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394208
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rework regops execution API to accomodate below updates for new
profiler design
- gops.regops.exec_regops() should accept TSG pointer instead of
channel pointer.
- Remove individual boolean parameters and add one flag field.
Below new flags are added to this API :
NVGPU_REG_OP_FLAG_MODE_ALL_OR_NONE
NVGPU_REG_OP_FLAG_MODE_CONTINUE_ON_ERROR
NVGPU_REG_OP_FLAG_ALL_PASSED
NVGPU_REG_OP_FLAG_DIRECT_OPS
Update other APIs, e.g. gr_gk20a_exec_ctx_ops() and validate_reg_ops()
as per new API changes.
Add new API gk20a_is_tsg_ctx_resident() to check context residency
from TSG pointer.
Convert gr_gk20a_ctx_patch_smpc() to a HAL gops.gr.ctx_patch_smpc().
Set this HAL only for gm20b since it is not required for later chips.
Also, remove subcontext code from this function since gm20b does not
support subcontext.
Remove stale comment about missing vGPU support in exec_regops_gk20a()
Bug 2510974
Jira NVGPU-5360
Change-Id: I3c25c34277b5ca88484da1e20d459118f15da102
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389733
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In nvgpu_dbg_gpu_ioctl_smpc_ctxsw_mode(), check if SMPC global mode
capability is supported instead of checking for the function pointer.
Enable the capability only for Turing since pre-Turing GPUs don't
support it.
Bug 2510974
Jira NVGPU-5360
Change-Id: I352fb2a91b836cd8ef727966a53a28255d8ea834
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389653
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a_perfbuf_map() allocates perfbuf VM, maps the user buffer into new
VM, and then triggers gops.perfbuf.perfbuf_enable(). This HAL then does
following :
- Allocate perfbuf instance block
- Initialize perfbuf instance block
- Reset stream buffer
- Program instance block address in PMA registers
- Program user buffer address into PMA registers
New profiler interface will have it's own API to setup PMA strem, and
it requires above setup to be done in two phases of perfbuf
initialization and then user buffer setup.
Split above functionalities into below functions
- nvgpu_perfbuf_init_vm()
- Allocate perfbuf VM
- Call gops.perfbuf.init_inst_block() to initialize perfbuf instance
block
- gops.perfbuf.init_inst_block()
- Allocate perfbuf instance block
- Initialize perfbuf instance block
- Program instance block address in PMA registers using
gops.perf.init_inst_block()
- In case of vGPU, trigger TEGRA_VGPU_CMD_PERFBUF_INST_BLOCK_MGT
command to gpu server
- gops.perf.init_inst_block()
- Reset stream buffer
- Program user buffer address into PMA registers
Also add corresponding cleanup functions as below :
gops.perf.deinit_inst_block()
gops.perfbuf.deinit_inst_block()
nvgpu_perfbuf_deinit_vm()
Bug 2510974
Jira NVGPU-5360
Change-Id: I486370f21012cbb7fea84fe46fb16db95bc16790
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372984
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add tu104 specific HAL tu104_gr_falcon_ctrl_ctxsw() that processes below
CTXSW methods to start/stop SMPC global mode :
NVGPU_GR_FALCON_METHOD_START_SMPC_GLOBAL_MODE
NVGPU_GR_FALCON_METHOD_STOP_SMPC_GLOBAL_MODE
Add new tu104 specific HAL tu104_gr_update_smpc_global_mode() to trigger
SMPC global mode start/stop using gops.gr.falcon.ctrl_ctxsw().
Update nvgpu_dbg_gpu_ioctl_smpc_ctxsw_mode() to enable/disable SMPC
global mode if channel is not bound to debug session.
Bug 2510974
Bug 2257799
Jira NVGPU-5360
Change-Id: I1f9d8f2a2d30a4738f291db3fc72c400d24f4048
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368696
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Current PM resource reservation system is limited to HWPM resources
only. And reservation tracking is done using boolean variables.
New upcoming profiler support requires reservation for all the PM
resources like SMPC and PMA stream. Using boolean variables is
not scalable and confusing. Plus the variables have to be replicated
on gpu server in case of virtualization.
Remove flag tracking mechanism and use list based approach to track
all PM reservations. Also, current HALs are defined on debugger object.
Implement new HALs in new pm_reservation object since it is really an
independent functionality.
Add new source file common/profiler/pm_reservation.c which implements
functions to reserve/release resources and to check if any resource
is reserved or not.
Add common/vgpu/pm_reservation_vgpu.c for vGPU which simply forwards
the request to gpu server.
Define new HAL object gops.pm_reservation and assign above functions
to below respective HALs :
g->ops.pm_reservation.acquire()
g->ops.pm_reservation.release()
g->ops.pm_reservation.release_all_per_vmid()
Last HAL above is only used for gpu server cleanup of guest OS.
Add below new common profiler functions that act as APIs to reserve/
release resources for rest of the units in nvgpu.
nvgpu_profiler_pm_resource_reserve()
nvgpu_profiler_pm_resource_release()
Initialize the meta data required for reservtion system in
nvgpu_pm_reservation_init() and call it during nvgpu_finalize_poweron.
Clean up the meta data before releasing struct gk20a.
Delete below HALs :
g->ops.debugger.check_and_set_global_reservation()
g->ops.debugger.check_and_set_context_reservation()
g->ops.debugger.release_profiler_reservation()
Bug 2510974
Jira NVGPU-5360
Change-Id: I4d9f89c58c791b3b2e63099a8a603462e5319222
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2367224
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below APIs to update hwpm/smpc ctxsw mode take a channel pointer as a
parameter. APIs then extract corresponding TSG from channel and perform
various operations on context stored in TSG.
g->ops.gr.update_smpc_ctxsw_mode()
g->ops.gr.update_hwpm_ctxsw_mode()
Update both above APIs to accept TSG pointer instead of a channel.
This is a refactor work to support new profiler design where a profiler
object is bound to TSG and keeps track of TSG only.
Bug 2510974
Jira NVGPU-5360
Change-Id: Ia4cefda503d8420f2bd32d07c57534924f0f557a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366122
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move profiler object allocation/free APIs to separate profiler
specific file common/profiler.c.
Store struct gk20a pointer in struct dbg_profiler_object_data for
convenience of accessing global struct pointer.
Update profiler object to store TSG pointer instead of channel
pointer. Since expectations is to have one profiler object
per context/TSG.
nvgpu_profiler_reserve_acquire() has a case to check if resource
reservation is acquired by some other channel in TSG.
But now since we keep track of TSG itself, this case becomes
redundant and can be removed.
All the support is compiled out of safety build with compile
flag CONFIG_NVGPU_PROFILER.
Linux will always compile the support.
Bug 2510974
Change-Id: I197bbd67a9cdd1fbea42f1effd1b74b15a6068e5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2365674
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added dependency between the Kconfig options as follows where
'->' indicates 'depends on' relation:
SUPPORT_CDE -> COMPRESSION -> DMABUF_HAS_DRVDATA
DGPU -> GK20A_PCI
Defined Kconfig option for VPR and for DGPU that is dependent GK20A_PCI
as well. DGPU related sources are now compiled under config flag DGPU.
Also update conditional compilation of the driver paths w.r.t DGPU,
VPR and COMPRESSION flags.
Bug 2834141
Change-Id: Ia0a39d6d4cf8b36e7f955b7355a5ab41783f821c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2299627
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt.
If channel is enabled multiple times by userspace, then ref count is
updated accordingly. There is an expectation that enable/disable
calls are balanced for setting channel's mmu debug mode.
When unbinding the channel, decrease refcnt for the channel until it
reaches 0.
Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it
can be retrieved from ch.
Bug 2515097
Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2184702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove below hals since the corresponding functions are same on all
platforms and they are h/w independent
g->ops.gr.enable_ctxsw()
g->ops.gr.disable_ctxsw()
g->ops.gr.reset()
Call the functions directly at all places
Remove CONFIG_NVGPU_DEBUGGER from places where these functions are
called since they are not debugger dependent
This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery
sequence intact
Jira NVGPU-3506
Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2137255
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make the nvgpu_init_mutex function return void.
In linux case, this doesn't affect anything since mutex_init
returns void.
For posix, we assert() and die if pthread_mutex_init fails.
This alleviates the need to error inject for _every_
nvgpu_mutex_init function in the driver.
Jira NVGPU-3476
Change-Id: Ibc801116dc82cdfcedcba2c352785f2640b7d54f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2130538
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GPC MMU debug mode should be set if at least one channel
in the TSG has requested it. Add refcounting for MMU debug
mode, to make sure debug mode is disabled only when no
channel in the TSG is using it.
Bug 2515097
Change-Id: Ic5530f93523a9ec2cd3bfebc97adf7b7000531e0
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2123017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Renamed gk20a_channel_* APIs to nvgpu_channel_* APIs.
Removed unused channel API int gk20a_wait_channel_idle
Renamed nvgpu_channel_free_usermode_buffers in os/linux-channel.c to
nvgpu_os_channel_free_usermode_buffers to avoid conflicts with the API
with the same name in channel unit.
Jira NVGPU-3248
Change-Id: I21379bd79e64da7e987ddaf5d19ff3804348acca
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2121902
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE ioctl to set MMU
debug mode for a given context.
Added gr.set_mmu_debug_mode HAL to change NV_PGPC_PRI_MMU_DEBUG_CTRL
for a given channel. HAL implementation for native case is
gm20b_gr_set_mmu_debug_mode. It internally uses regops, which directly
writes to the register if the context is resident, or writes to
gr context otherwise.
Added NVGPU_SUPPORT_SET_CTX_MMU_DEBUG_MODE to enable the feature.
NV_PGPC_PRI_MMU_DEBUG_CTRL has to be context switched in FECS ucode,
so the feature is only enabled on TU104 for now.
Bug 2515097
Change-Id: Ib4efaf06fc47a8539b4474f94c68c20ce225263f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2110720
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>