linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-24 10:34:43 +03:00

Author	SHA1	Message	Date
Sagar Kamble	d75473a115	gpu: nvgpu: fix unit test traceability issues Some of the functions with no traceability to unit tests are already covered by callee API functions. Skip these functions in SWVR by skipping doxygen for them. Some of the functions are non-fusa like those in profile.h and bsearch.h. Those were included as the header was included in Doxygen sources. Mark them non-safe. Some of the nvgpu functions were not added to Targets entries for respective tests. Fix those. JIRA NVGPU-7211 Change-Id: Iacf22dccdd9340100cf93814566d3979734c455d Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2612982 (cherry picked from commit a40f62654747102cc8ef53ddbd9f953c21c2b745) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2737672 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-07-15 07:15:34 -07:00
Sagar Kamble	f95cb5f4f8	gpu: nvgpu: maintain ctx buffers mappings separately from ctx mems In order to maintain separate mappings of GR TSG and global context buffers for different subcontexts, we need to separate the memory struct and the mapping struct for the buffers. This patch moves the mappings of all GR ctx buffers to new structure nvgpu_gr_ctx_mappings. This will be instantiated per subcontext in the upcoming patches. Summary of changes: 1. Various context buffers were allocated and mapped separately. All TSG context buffers are now stored in gr_ctx->mem[] array since allocation and mapping is unified for them. 2. Mapping/unmapping and querying the GPU VA of the context buffers is now handled in ctx_mappings unit. Structure nvgpu_gr_ctx_mappings in nvgpu_gr_ctx holds the maps. On ALLOC_OBJ_CTX this struct is instantiated and deleted on free_gr_ctx. 3. Introduce mapping flags for TSG and global context buffers. This is to map different buffers with different caching attribute. Map all buffers as cacheable except PRIV_ACCESS_MAP, RTV_CIRCULAR_BUFFER, FECS_TRACE, GR CTX and PATCH ctx buffers. Map all buffers as privileged. 4. Wherever VM or GPU VA is passed in the obj_ctx allocation functions, they are now replaced by nvgpu_gr_ctx_mappings. 5. free_gr_ctx API need not accept the VM as mappings struct will hold the VM. mappings struct will be kept in gr_ctx. 6. Move preemption buffers allocation logic out of nvgpu_gr_obj_ctx_set_graphics_preemption_mode. 7. set_preemption_mode and gr_gk20a_update_hwpm_ctxsw_mode functions need update to ensure buffers are allocated and mapped. 8. Keep the unit tests and documentation updated. With these changes there is clear segregation of allocation and mapping of GR context buffers. This will simplify further change to add multiple address spaces support. With multiple address spaces in a TSG, subcontexts created after first subcontext just need to map the buffers. 
Bug 3677982 Change-Id: I3cd5f1311dd85aad1cf547da8fa45293fb7a7cb3 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2712222 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-07-15 07:10:11 -07:00
Sagar Kamble	80efe558b1	gpu: nvgpu: add BVEC test for nvgpu_rc_pbdma_fault Update nvgpu_rc_pbdma_fault with invalid checks and add BVEC test for it. Make ga10b_fifo_pbdma_isr static. NVGPU-6772 Change-Id: I5485760c53e1fff1278557a5b25659a1fc0e4eaf Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551617 (cherry picked from commit e917042d395d07cb902580bad3d5a7d0096cc303) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623625 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-07-14 08:58:31 -07:00
Debarshi Dutta	d8e8eb65d3	nvgpu: gpu: separate runlist submit from construction This patch primarily separates runlist modification from runlist submits. Instead of submitting the runlist(domain) immediately after modification, a worker thread interface is now being used to synchronously schedule runlist submits. If the runlist being scheduled is currently active, the submit happens instantly, otherwise, it will happen in the next iteration when the nvs thread will schedule the domain. This external interface uses a condition variable to wait for the completion of the synchronous submits. A pending_update variable is used to synchronize domain memory swaps just before being submitted. To facilitate faster scheduling via the NVS thread, nvgpu_dom itself contains an array of rl_domain pointers. This can then be used to select the appropriate rl_domain directly for scheduling as against the earlier approach of maintaining nvs domains and rl domains in sync every time. Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Change-Id: I1725c7cf56407cca2e3d2589833d1c0b66a7ad7b Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2739795 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-07-13 16:36:19 -07:00
Sagar Kamble	d82400d2b8	gpu: nvgpu: fix MISRA Rule 5.1 violation BVEC changes for nvgpu_rc_pbdma_fault and nvgpu_rc_mmu_fault started reporting below MISRA issue. kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:321: 1. misra_c_2012_rule_5_1_violation: Declaration with identifier "nvgpu_tsg_unbind_channel_check_hw_state", which is ambiguous. kernel/nvgpu/drivers/gpu/nvgpu/common/fifo/tsg.c:349: 2. other_declaration: The first 31 characters of identifiers "nvgpu_tsg_unbind_channel_check_ctx_reload" and "nvgpu_tsg_unbind_channel_check_hw_state" are identical. Do below renames to fix the issue. Doing both for consistency. s/nvgpu_tsg_unbind_channel_check_hw_state/nvgpu_tsg_unbind_channel_hw_state_check s/nvgpu_tsg_unbind_channel_check_ctx_reload/nvgpu_tsg_unbind_channel_ctx_reload_check JIRA NVGPU-6772 Change-Id: Ib92cabe11c486621351bf15ddb86e20d16d514c4 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2584152 (cherry picked from commit a619f259c6a4ffccb05550767212989af60c2a90) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2706551 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-05-11 04:18:12 -07:00
srajum	8381647662	gpu: nvgpu: fixing MISRA violations - MISRA Directive 4.7 Calling function "nvgpu_tsg_unbind_channel(tsg, ch, true)" which returns error information without testing the error information. - MISRA Rule 10.3 Implicit conversion from essential type "unsigned 64-bit int" to different or narrower essential type "unsigned 32-bit int" - MISRA Rule 5.7 A tag name shall be a unique identifier JIRA NVGPU-5955 Change-Id: I109e0c01848c76a0947848e91cc6bb17d4cf7d24 Signed-off-by: srajum <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572776 (cherry picked from commit 073daafe8a11e86806be966711271be51d99c18e) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678681 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-03-10 16:01:18 -08:00
Shashank Singh	19a3b86f06	gpu: nvgpu: remove unused code from common.nvgpu on safety build - remove unused code from common.nvgpu unit on safety build. Also, remove the code which uses them in other places. - document use of compiler intrinsics as mandated in code inspection checklist. Jira NVGPU-6876 Change-Id: Ifd16dd197d297f56a517ca155da4ed145015204c Signed-off-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561584 (cherry picked from commit 900391071e9a7d0448cbc1bb6ed57677459712a4) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561583 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-02-17 04:58:32 -08:00
Konsta Hölttä	632644b44a	gpu: nvgpu: couple runlist domains and nvs Now that the main nvsched code exists in the nvgpu build, make it control the runlist domains. As a new nvs domain is created, create the relevant runlist data too. To support the default domain, create a default nvs domain at boot. The scheduling domain code owns the responsibility of domain lifetime, and runlist domains exist to serve that logic although the RL domains are directly used by channel and TSG logic. Add refcounting to the scheduler uapi level to make sure that busy domains (that still have TSG participants) do not get removed too early. Adjust error injection sensitive unit tests to match the updated logic. Jira NVGPU-6425 Jira NVGPU-6427 Change-Id: I1beec97c54c60ad334165b1c0acb5e827c24f2ac Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632287 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-12-07 07:07:12 -08:00
Sagar Kamble	48d17e9c53	gpu: nvgpu: fix the unit test traceability gk20a_tsg_unbind_channel_check_hw_next was not added to Targets in unit test specification. Add it. __attribute__ in debug.h is captured by Doxygen as function with no tests. However it is not really a function and applies to non-fusa function so skip it in Doxygen. JIRA NVGPU-7211 Change-Id: I2adadaebbf4e43768eb408dd10aaa20b1e13eccc Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2615256 (cherry picked from commit e829afb55a17dc0dacf17c71633f5689324171d7) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623629 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-14 04:24:17 -08:00
Sagar Kamble	f64b5e20b0	gpu: nvgpu: add unit test for gk20a_tsg_unbind_channel_check_hw_next false branch when NEXT bit is not set is not covered. Add unit test for same. JIRA NVGPU-7211 Change-Id: I57725e35971605bf8144e7eaac618f44a38e5b31 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614209 (cherry picked from commit 2064209f92700dc859d7398e061b3d7dc2725521) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623628 Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-14 04:24:11 -08:00
Konsta Hölttä	6cff904dc3	gpu: nvgpu: use runlist obj for wait_pending Change the gops_runlist::wait_pending API to take a runlist pointer instead of a runlist ID to better match with the rest of that interface. Jira NVGPU-6425 Change-Id: I96c4f49df8e2613498e0a09cc75a950824828bed Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621214 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-11 20:39:47 -08:00
Konsta Hölttä	3cf796b787	gpu: nvgpu: move active bitmaps to domain Move the active_channels and active_tsgs bitmaps from struct nvgpu_runlist to struct nvgpu_runlist_domain. A TSG and its channels are currently active as part of a runlist; in the future, a runlist may be switched from multiple domains that each are a collection of TSGs. The changes are still internal to the runlist code. Users of runlists need no modifications. Jira NVGPU-6425 Change-Id: I2d0e98e97f04b9716bc3f4890cf881735d0ab664 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2618387 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-11-03 20:55:08 -07:00
Konsta Hölttä	1d23b8f13a	gpu: nvgpu: introduce internal runlist domain The current runlist code assumes a single runlist buffer to hold all TSG and channel entries. Create separate RL domain and domain memory types to hold data that is related to only a scheduling domain and not directly to the runlist hardware; in the future, more than one domain may exist and one of them is enabled at a time. The domain is used only internally by the runlist code at this point and is functionally equivalent to the current runlist memory that houses the round robin entries. The double buffering is still kept, although more domains might benefit from some cleverness. Although any number of created domains may be edited in runtime, only one runlist memory is accessed by the hardware at a time. To spare some contiguous memory, this should be considered an opportunity for optimization in the future. Jira NVGPU-6425 Change-Id: Id99c55f058ad56daa48b732240f05b3195debfb1 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2618386 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-11-03 20:54:48 -07:00
Konsta Hölttä	9b3f3ea4be	gpu: nvgpu: remove timeout fault injection tests The timeout init API is changing to return void in most cases. Adapt the unit tests to the reduced branching. Change-Id: I4d05484529fe4ef46b518f41d10b71a4a9f9c6fb Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614286 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-10-26 13:47:20 -07:00
Debarshi Dutta	791dc18666	gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields Add Setter and Getter methods for accessing tsg->sm_error_states. Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state. This renders it unnecessary to add BVEC for above fields for the struct in multiple locations. The current design ensures that only a constant pointer is obtained from the owner unit i.e. FIFO. The following new methods are added. Both unit tests and BVEC tests are added for them as well. nvgpu_tsg_store_sm_error_state nvgpu_tsg_get_sm_error_state Jira NVGPU-6947 Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-13 20:57:09 -07:00
Tejal Kudav	b33079d47e	gpu: nvgpu: Move intr data members from MC to CIC Move interrupt specific data-members from common.mc to common.cic Some of these data members like sw_irq_stall_last_handled_cond need to be initialized much earlier during the OS specific init/probe stage. Also, some more members from struct nvgpu_interrupts(like stall_size, stall_lines[]), which will soon be moved to CIC will also need to be initialized early during the OS specific probe stage. However, the chip specific LUT can only be initialized after the hal_init stage where the HALs are all initialized. Split the CIC init to accommodate the above initialization requirements. JIRA NVGPU-6899 Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-19 18:06:28 -07:00
Debarshi Dutta	6d917822c8	gpu: nvgpu: bvec for ramin unit. 1) added a BVEC test for g->ops.ramin.set_big_page_size 2) Currently, runlist unit tests are not enabled in Dev-Main. Left it as it is. Jira NVGPU-6905 Change-Id: I7aefce472743653624cc5a22d978632f77b5f404 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548305 (cherry picked from commit c84b6c890a1711cd7c15ec974ea59041a0ace6d5) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554022 Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-07 12:25:57 -07:00
Debarshi Dutta	200777b854	gpu: nvgpu: bvec for channel and tsg Below changes are added. 1) Added checks in nvgpu_channel_from_id__func, nvgpu_tsg_check_and_get_from_id 2) Added BVEC tests for nvgpu_channel_open_new, nvgpu_channel_from_id, nvgpu_tsg_check_and_get_from_id, nvgpu_tsg_set_error_notifier 3) Added common function get_random_u32. Jira NVGPU-6905 Change-Id: I374d6f5503dc05e3224213d772a1752d82cbdc91 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548304 (cherry picked from commit 39b2529b3e96cfd3cbd3bb020f32ee2cca0ea363) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554021 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-07-07 12:25:50 -07:00
tkudav	0526e7eaa9	gpu: nvgpu: Create CIC-mon and CIC-rm subunits common.cic unit is divided into common.cic.mon and common.cic.rm based on rm and mon process split. CIC-mon subunit includes the code which is utilized in critical interrupt handling path like initialization, error detection and error reporting path. CIC-rm subunit includes the code corresponding to rest of interrupt handling(like collecting error debug data from registers) and ISR status management (status of deferred interrupts). Split the CIC APIs and data-members into above two subunits. JIRA NVGPU-6899 Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-02 09:57:56 -07:00
Tejal Kudav	e0a1fcf5f5	gpu: nvgpu: Add Central Intr Controller unit Add a new Central Interrupt Controller(CIC) unit in common code. The interrupt handling is done in a distributed manner currently. The error handling policy for different errors resides in each unit's ISR code. The goal is to converge this data under one central place - the CIC unit. This patch creates framework for CIC unit and moves the gv11b QNX safety LUT to CIC unit. All the error reporting APIs from different units are also moved to CIC. New APIs are exposed by CIC unit to access its internal data like: 1. Struct err_desc - the static err handling /injection data per error id 2. Num_hw_modules - the number of error reporting HW units supported by CIC Init and deinit of CIC unit: 1. CIC unit should be initialized early on during boot so that it is available for any interrupt handling. 2. Initialize CIC just before the interrupts are enabled during boot. 3. Similarly, CIC is disabled late during deinit cycle; right after the interrupts are masked. LUT: 1. LUT is currently used only for reporting error to safety services in gv11b QNX safety build. 2. This error handling policy LUT currently has only two levels of handling - correctable and quiesce. 3. Once the error handling policy decision is moved from leaf unit nodes to CIC, LUT will be updated to have additional levels like fast recovery and full recovery. 4. Also, then a separate LUT will be added for each platform/build. 5. In current framework, the LUT is set to NULL for all configurations except gv11b. report_err() ops is added to report error to safety services. This ops is only effective for gv11b qnx build; and set to NULL for other configurations. 
NVGPU-6521 NVGPU-6523 NVGPU-6750 NVGPU-6758 NVGPU-6760 NVGPU-6754 Change-Id: I24be7836a96d787741e37b732e19863ed8014635 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683 Reviewed-by: Ajesh K V <akv@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-05-25 14:28:04 -07:00
Sagar Kamble	07d8a39647	gpu: nvgpu: wait for stalling interrupts to complete during TSG unbind preempt Some of the engine stalling interrupts can block the context save off the engine if not handled during fifo.preempt_tsg. They need to be handled while polling for engine ctxsw status. Bug 200711183 Change-Id: I7418a9e0354013b81fbefd8c0cab5068404fc44e Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521971 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-05-03 20:40:05 -07:00
Vedashree Vidwans	625d942c52	gpu: nvgpu: update gops_fifo.intr_1_isr logic Update gops_fifo.intr_1_isr to clear interrupt and return NVGPU_NONSTALL_OPS_WAKEUP_SEMAPHORE only if channel interrupt is pending Jira NVGPU-6222 Change-Id: I976f8bcf53c7735b154f40bb70b5f401020c8dd4 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2479250 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-03-05 01:27:39 -08:00
Alex Waterman	06f554318a	userspace: Update unit tests to use new .id and .runlist fields Update the unit test code to reflect the following changes: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470305 https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470306 These adjust field names/types in struct nvgpu_runlist and struct nvgpu_tsg; no functional changes are made but these do affect the unit tests that rely on setting these private fields. JIRA NVGPU-6425 Change-Id: I1fc9d56e363d326f1e6901aaec710903894ac61d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2477585 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-02-19 15:16:51 -08:00
Alex Waterman	d925e33e8b	userspace: Prune unit tests for new runlist code Remove and prune the now broken tests related to the runlist updates. JIRA NVGPU-6425 Change-Id: I76e03c943ceae261e35958aa64717b5590a19c0e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2474334 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-01-29 09:51:50 -08:00
Alex Waterman	11d3785faf	gpu: nvgpu: Rename struct nvgpu_runlist_info, fields in fifo Rename struct nvgpu_runlist_info to struct nvgpu_runlist; the info is not necessary. struct nvgpu_runlist is soon to be a first class object among the nvgpu object model. Also rename the fields runlist_info and active_runlist_info to simply runlists and active_runlists respectively. Again the info text is just not necessary and somewhat misleading. These structs _are_ the runlist representations in SW; they are not merely informational. Also add an rl_dbg() macro to print debug info specific to runlist management and some debug prints specifying the runlist topology for the running chip. Change-Id: Id9fcbdd1a7227cb5f8c75cca4abbff94fe048e49 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470303 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-01-20 21:56:33 -08:00
Sagar Kamble	cf287a4ef5	gpu: nvgpu: retry tsg unbind if NEXT is set The NEXT bit can remain set for the channel if timeslice expires before scheduler clears it. Due to this nvgpu fails TSG unbind and in turn nvrm_gpu fails channel close. In this case, checking the channel hw state after some time can help see NEXT bit cleared by scheduler. Reenable the tsg and return -EAGAIN to nvrm_gpu for it to retry again. Bug 3144960 Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391 Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-01-18 23:11:57 -08:00
Vedashree Vidwans	2386ddd038	gpu: nvgpu: modify pbdma.get_fc_target Modify pbdma.get_fc_target() to accept nvgpu_device pointer. This is required for nvgpu-next. JIRA NVGPU-6135 Change-Id: I8baa58c704ee32ee68e87915029ac2be2132d4a4 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2440180 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:48 -06:00
tkudav	2ca4f145e4	gpu: nvgpu: Fix HAL checker pointed mismatches Add new HALs for register field definition/value changes in GV11B as compared to Pascal. Update the HALs for recent chips too if applicable. Bug 200604892 Change-Id: I14ee9440859007e86a1ffa937df399a31e2628bd Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2437564 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	78fb67bb0b	gpu: nvgpu: move fuse definitions to fuse.h Move common fuse definition macros to fuse.h. This will allow all chip specific fuse files to use the common macros. Jira NVGPU-6081 Change-Id: I85b5250809eef26a40f5b4b9bf6908dfa0d2be1f Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2422892 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Shashank Singh	3aec79d242	gpu: nvgpu: add check for valid engine id -Check validity of engine-id when iterating through all engines and passing the engine-id as an argument to other function(s). -Skip test test_gv100_dump_engine_status which fails due to this change. Bug 200660469 Change-Id: I64ebb1a0297f605dd3cba7ef73954ff5594828bc Signed-off-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2424655 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Antony Clince Alex	09857ecd91	userspace: units: replace PAGE_SIZE with NVGPU_CPU_PAGE_SIZE Replace PAGE_SIZE with NVGPU_CPU_PAGE_SIZE, which is a nvgpu defined wrapper over OS native page size. Bug 200658101 Jira NVGPU-6018 Change-Id: If35e23d5df38a6b52b586911d1055e0b00b12ebe Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2424792 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Peter Daifuku	a331fd4b3a	gpu: nvgpu: pd_cache enablement for >4k allocations in qnx Mapping of large buffers to GMMU end up needing many pages for the PTE tables. Allocating these one by one can end up being a performance bottleneck, particularly in the virtualized case. This is adding the following changes: - As the TLB invalidation doesn't have access to mem_off, allow top-level allocation by alloc_cache_direct(). - Define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab for the PD cache, effectively set to 64K bytes - Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE When freeing up cached entries, avoid prefetch errors by invalidating the entry (memset to 0). - Try to fall back to direct allocation of smaller chunk for contiguous allocation failures. - Unit test changes. Bug 200649243 Change-Id: I0a667af0ba01d9147c703e64fc970880e52a8fbc Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404371 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Lakshmanan M	c0e2dc5b74	gpu: nvgpu: Add subctx programming for MIG This CL covers the following code changes, 1) Added api to init inst_block for more than one subctxs. 2) Added logic to limit the subctx bind based on max. VEID count allocated to a gr instance. 3) Renamed nvgpu_grmgr_get_gr_runlist_id. JIRA NVGPU-5647 Change-Id: Ifec8164a9e5f46fbd0538c3dd50e19ee63667a54 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2418463 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
srajum	6aec282dc1	Revert "gpu: nvgpu: Fix for unit test failures" This reverts commit 0e353e0022da6064a2c0f71ed43a2a76ceec1a97. - A unit test change was created to exercise change "23293fef", but there were issues with that change; the corresponding driver change has now been made, so this unit test change is no longer required. JIRA NVGPU-6051 Change-Id: Id8131ad027069062435947d79d627b23470a7199 Signed-off-by: srajum <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2415023 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Seeta Rama Raju	1bd0261cbe	gpu: nvgpu: Fix for unit test failures JIRA NVGPU-6051 Change-Id: Ic061594096ef49f7984cde4405f4934ded220e91 Signed-off-by: Seeta Rama Raju <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2411562 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	dfd9feace6	gpu: nvgpu: recover pbdma errors before ack When a pbdma fault needs a channel teardown, do the recovery/teardown process before acking the pbdma interrupt status back. Acking it causes the hardware to proceed which could release fences too early before the involved channel(s) have been found to be broken. With these host copyengine interrupts, the teardown sequence is light and proceeds even with the pbdma intr flag still set; there are no engines to reset when these pbdma launch check interrupts happen. The bad tsg is just disabled and the channels in it aborted. A few unit tests are so heavily affected by this refactor that they would need to be rewritten. They're not strictly needed at the moment, so do only half of the rewrite: just delete them. Bug 200611198 Change-Id: Id126fb158b6d05e46ba124cd426389046eedc053 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392669 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	0f501d806f	gpu: nvgpu: Unit test fixes and staging for device rework Fix up what unit tests can be easily fixed up. Stage everything else. In short the unit test code is _incredibly_ fragile since it's designed to hit every branch, positive and negative, in the code. However, the result of that is unit tests that are painful to modify. A lot of unit tests are also extremely opaque and rely on internal nvgpu behavior. This patch will be updated with fixes as I make them. Or, alternatively, it may be worth just temporarily disabling unit tests on dev-main. We'll have a _lot_ of work for Orin that will essentially gut the gr, host, and interrupt code. If we retain the unit test code for this, it may end up being backbreaking. JIRA NVGPU-5421 Change-Id: I8055fc72521f6a3a8a0d8f07fbe50c649a675016 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347274 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	a04525ece8	gpu: nvgpu: require deterministic for usermode Deterministic mode has always been a requirement for usermode submit; enforce it in the setup_bind path. Adjust tests to use the flag. QNX uses NVGPU_SETUP_BIND_FLAGS_SUPPORT_DETERMINISTIC only if CONFIG_NVGPU_IOCTL_NON_FUSA is set, so guard the check with that for now. Jira NVGPU-5582 Change-Id: Idedd01a3a24420b45195a472e8ca5c9f32f4ef46 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369818 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Dinesh	d0087f3ad8	gpu: nvgpu: Support for runlist_max_supported nvgpu_next needs support for max_runlist_supported by litter value. So the function is changed to support. JIRA NVGPU-5534 Change-Id: I097f6343295049532c46904316314dc82092a46b Signed-off-by: Dinesh <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382882 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	881a6f35be	gpu: nvgpu: Trigger quiesce on PBDMA preempt fail During recovery, we preempt the faulty TSG from PBDMA and engines. If the TSG preempt on PBDMA times out (timeout = 100ms), the PBDMA might be in a hung state. We do not reset the HOST during recovery, so stuck PBDMAs are unrecoverable. Abort the recovery and trigger GPU to quiesce as there is no way back. Triggering Quiesce from recovery sequence should be fine as the only redundant operation will be write to FIFO_RUNLIST_PREEMPT register. The error notifiers will eventually be set by Quiesce thread. Bug 2768005 JIRA NVGPU-4631 Change-Id: I914b9379aa8e48014e6ddace9abe47180a072863 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368187 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	359fc24aaf	gpu: nvgpu: Rework engine management to work with vGPU Currently the vGPU engine management rewrites a lot of the common device agnostic engine management code. With the new top HAL parsing one device at a time, it is now more easily possible to tie the vGPU into the new common device framework by implementing the top HAL but with the vGPU engine list backend. This lets the vGPU inherit all the common engine and device management code. By doing so the vGPU HAL need only implement a trivial and simple HAL. This also gets us a step closer to merging all of the CE init code: logically it just iterates through all CE engines whatever they may be. The only reason this differs between chips is because of the swap from CE0-2 to LCEs in the Pascal generation. This could be abstracted by the unit code easily enough. Also, the pbdma_id for each engine has to be added to the device struct. Eventually this was going to happen anyway, since the device struct will soon replace the nvgpu_engine_info struct. It's a little bit of an abuse but might be worth it long term. If not, it should not be difficult to replace uses of dev->pbdma_id with a proper lookup of PBDMA ID based on the device info. JIRA NVGPU-5421 Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	fbb6a5bc1c	gpu: nvgpu: Remove fifo->pbdma_map The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists. This array allows other software to query what PBDMA(s) serves a given runlist. The PBDMA map is read verbatim from an array of host registers. These registers are stored in a kmalloc()'ed array. This causes a problem for the device management code. The device management initialization executes well before the rest of the FIFO PBDMA initialization occurs. Thus, if the device management code queries the PBDMA mapping for a given device/runlist, the mapping has yet to be populated. In the next patches in this series the engine management code is subsumed into the device management code. In other words the device struct is reused by the engine management and all host SW does is pull pointers to the host managed devices from the device manager. This means that all engine initialization that used to be done on top of the device management needs to move to the device code. So, long story short, the PBDMA map needs to be read from the registers directly, instead of an array that gets allocated long after the device code has run. This patch removes the pbdma map array, deletes two HALs that managed that, and instead provides a new HAL to query this map directly from the registers so that the device code can use it. 
JIRA NVGPU-5421 Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	fb1433811c	gpu: nvgpu: modify gr.falcon.dump_stats - Add gm20b_gr_falcon_gpccs_dump_stats() to print gpccs context switch mailbox register values for all gpcs. - Make gm20b_gr_falcon_fecs_dump_stats() a static function - Add gm20b_gr_falcon_dump_stats() to trigger gm20b_gr_falcon_fecs_dump_stats() and gm20b_gr_falcon_gpccs_dump_stats() - Update legacy chips gr.falcon.dump_stats() to gm20b_gr_falcon_dump_stats(). JIRA NVGPU-5597 Change-Id: I992c6432f3c2e3049bacc953f9b53ff6c4aa2f36 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357470 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Seema Khowala <seemaj@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	59eb714c48	unit: Disable some unit tests for device work Fix what unit tests can be easily fixed, but disable some others. It's not clear why the MM related tests started failing - there's really zero reason for this. The list of disable tests are primarily engine related but there are some others that get inflenced by the device and engine structure. test_poweroff.init_poweroff=2 test_is_stall_and_eng_intr_pending.intr_is_stall_and_eng_intr_pending=2 test_isr_nonstall.isr_nonstall=2 test_isr_stall.isr_stall=2 test_engine_enum_from_type.enum_from_type=2 test_engine_find_busy_doing_ctxsw.find_busy_doing_ctxsw=2 test_engine_get_active_eng_info.get_active_eng_info=2 test_engine_get_fast_ce_runlist_id.get_fast_ce_runlist_id=2 test_engine_get_gr_runlist_id.get_gr_runlist_id=2 test_engine_get_mask_on_id.get_mask_on_id=2 test_engine_get_runlist_busy_engines.get_runlist_busy_engines=2 test_engine_ids.ids=2 test_engine_init_info.init_info=2 test_engine_interrupt_mask.interrupt_mask=2 test_engine_is_valid_runlist_id.is_valid_runlist_id=2 test_engine_mmu_fault_id.mmu_fault_id=2 test_engine_mmu_fault_id_veid.mmu_fault_id_veid=2 test_engine_setup_sw.setup_sw=2 test_engine_status.status=2 test_fifo_init_support.init_support=2 test_fifo_remove_support.remove_support=2 test_gp10b_engine_init_ce_info.engine_init_ce_info=2 test_nvgpu_mem_iommu_translate.mem_iommu_translate=2 test_nvgpu_mem_phys_ops.nvgpu_mem_phys_ops=2 And delete unit tests for functions that no longer exist: test_device_info_parse_enum.top_device_info_parse_enum test_get_device_info.top_get_device_info test_get_num_engine_type_entries.top_get_num_engine_type_entries test_is_engine_ce.top_is_engine_ce test_is_engine_gr.top_is_engine_gr JIRA NVGPU-5421 Change-Id: I343c0b1ea44c472b22356c896672153fc889ffc0 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355300 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: 
svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	319520ff57	gpu: nvgpu: Add a new device manager unit This adds a new device management unit in the common code responsible for facilitating the parsing of the GPU top device list and providing that info to other units in nvgpu. The basic idea is to read this list once from HW and store it in a set of lists corresponding to each device type (graphics, LCE, etc). Many of the HALs in top can be deleted and instead implemented using common code parsing the SW representation. Every time the driver queries the device list it does so using a device type and instance ID. This is common code. The HAL is responsible for populating the device list in such a way that the driver can query it in a chip agnostic manner. Also delete some of the unit tests for functions that no longer exist. This code will require new unit tests in time; those should be quite simple to write once unit testing is needed. JIRA NVGPU-5421 Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	2a3bb9107f	gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h> top.h is a description of "devices" available on the GPU. As such rename this header to device.h. device.h will ultimately be a unit of actual C code that will rely on the top HAL to fill a device list. JIRA NVGPU-5421 Change-Id: If6e4a537d2209e429a678761a34713723da7a00a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
tkudav	957b19092f	gpu: nvgpu: Enable Quiesce on all builds Make Recovery and quiesce co-exist to support quiesce state on unrecoverable errors. Currently, the quiesce code is wrapped under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from recovery config, thereby enabling it on all builds. On Linux, the hung_task checker (check_hung_uninterruptible_tasks() in kernel/hung_task.c) complains that quiesce thread is stuck for more than 120 seconds. INFO: task sw-quiesce:1068 blocked for more than 120 seconds. The wait time of more than 120 seconds is expected as quiesce thread will wait until quiesce call is triggered on fatal unrecoverable errors. However, the INFO print upsets the kernel_warning_test (KWT) on Linux builds. To fix the failing KWT, change the quiesce task to interruptible instead of uninterruptible as checker only looks at uninterruptible tasks. Bug 2919899 JIRA NVGPU-5479 Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030 Signed-off-by: tkudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	5f0fdf085c	nvgpu: unit: Add new mock register framework Many tests used various incarnations of the mock register framework. This was based on a dump of gv11b registers. Tests that greatly benefitted from having generally sane register values all rely heavily on this framework. However, every test essentially did its own thing. This was not efficient and has caused some issues in cleaning up the device and host code. Therefore introduce a much leaner and simplified register framework. All unit tests now automatically get a good subset of the gv11b registers auto-populated. As part of this also populate the HAL with a nvgpu_detect_chip() call. Many tests can now _probably_ have all their HAL init (except dummy HAL stuff) deleted. But this does require a few fixups here and there to set HALs to NULL where tests expect HALs to be NULL by default. Where necessary HALs are cleared with a memset to prevent unwanted code from executing. Overall, this imposes a far smaller burden on tests to initialize their environments. Something to consider for the future, though, is how to handle supporting multiple chips in the unit test world. JIRA NVGPU-5422 Change-Id: Icf1a63f728e9c5671ee0fdb726c235ffbd2843e2 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335334 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	068e00749b	gpu: nvgpu: update config_userd_writeback_enable Field value of pbdma_config_userd_writeback_enable is changing from 0x1 to 0x0 for nvgpu-next. So, - Update config_userd_writeback_enable() hal to accept u32 value. - Update config_userd_writeback_enable() hal to return modified value after setting pbdma_config_userd_writeback_enable field. Jira NVGPU-5162 Change-Id: I94efa20c34bb867f185778c973bd52b86902b32c Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2330160 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Prateek sethi	470fe3a6d4	gpu: nvgpu: unit: update cg unit test CG unit tests check for invalid registers access during configuration of various CG modes for various units that involve multiple registers accesses. Since ECC detect is now being done in hal init now, corresponding registers need to be added to io space. Bug 2919887 Change-Id: I8ded6a95952d810d9c8627a71752266e493e2c47 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332262 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00

202 Commits