linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Tejal Kudav	b33079d47e	gpu: nvgpu: Move intr data members from MC to CIC Move interrupt specific data-members from common.mc to common.cic. Some of these data members like sw_irq_stall_last_handled_cond need to be initialized much earlier during the OS specific init/probe stage. Also, some more members from struct nvgpu_interrupts (like stall_size, stall_lines[]), which will soon be moved to CIC, will also need to be initialized early during the OS specific probe stage. However, the chip specific LUT can only be initialized after the hal_init stage where the HALs are all initialized. Split the CIC init to accommodate the above initialization requirements. JIRA NVGPU-6899 Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-19 18:06:28 -07:00
tkudav	0526e7eaa9	gpu: nvgpu: Create CIC-mon and CIC-rm subunits The common.cic unit is divided into common.cic.mon and common.cic.rm based on the rm and mon process split. The CIC-mon subunit includes the code which is utilized in the critical interrupt handling path, like the initialization, error detection and error reporting paths. The CIC-rm subunit includes the code corresponding to the rest of interrupt handling (like collecting error debug data from registers) and ISR status management (status of deferred interrupts). Split the CIC APIs and data-members into the above two subunits. JIRA NVGPU-6899 Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-07-02 09:57:56 -07:00
Tejal Kudav	e0a1fcf5f5	gpu: nvgpu: Add Central Intr Controller unit Add a new Central Interrupt Controller (CIC) unit in common code. The interrupt handling is done in a distributed manner currently. The error handling policy for different errors resides in each unit's ISR code. The goal is to converge this data under one central place - the CIC unit. This patch creates the framework for the CIC unit and moves the gv11b QNX safety LUT to the CIC unit. All the error reporting APIs from different units are also moved to CIC. New APIs are exposed by the CIC unit to access its internal data like: 1. Struct err_desc - the static err handling/injection data per error id 2. Num_hw_modules - the number of error reporting HW units supported by CIC Init and deinit of CIC unit: 1. CIC unit should be initialized early on during boot so that it is available for any interrupt handling. 2. Initialize CIC just before the interrupts are enabled during boot. 3. Similarly, CIC is disabled late during the deinit cycle, right after the interrupts are masked. LUT: 1. LUT is currently used only for reporting errors to safety services in the gv11b QNX safety build. 2. This error handling policy LUT currently has only two levels of handling - correctable and quiesce. 3. Once the error handling policy decision is moved from leaf unit nodes to CIC, LUT will be updated to have additional levels like fast recovery and full recovery. 4. Also, a separate LUT will then be added for each platform/build. 5. In the current framework, the LUT is set to NULL for all configurations except gv11b. report_err() ops is added to report errors to safety services. This ops is only effective for the gv11b qnx build and is set to NULL for other configurations. NVGPU-6521 NVGPU-6523 NVGPU-6750 NVGPU-6758 NVGPU-6760 NVGPU-6754 Change-Id: I24be7836a96d787741e37b732e19863ed8014635 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683 Reviewed-by: Ajesh K V <akv@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-05-25 14:28:04 -07:00
Peter Daifuku	a331fd4b3a	gpu: nvgpu: pd_cache enablement for >4k allocations in qnx Mapping of large buffers to GMMU ends up needing many pages for the PTE tables. Allocating these one by one can end up being a performance bottleneck, particularly in the virtualized case. This adds the following changes: - As the TLB invalidation doesn't have access to mem_off, allow top-level allocation by alloc_cache_direct(). - Define NVGPU_PD_CACHE_SIZE, the allocation size for a new slab for the PD cache, effectively set to 64K bytes. - Use the PD cache for any allocation < NVGPU_PD_CACHE_SIZE. - When freeing up cached entries, avoid prefetch errors by invalidating the entry (memset to 0). - Try to fall back to direct allocation of smaller chunks for contiguous allocation failures. - Unit test changes. Bug 200649243 Change-Id: I0a667af0ba01d9147c703e64fc970880e52a8fbc Signed-off-by: dt <dt@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404371 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	fb1433811c	gpu: nvgpu: modify gr.falcon.dump_stats - Add gm20b_gr_falcon_gpccs_dump_stats() to print gpccs context switch mailbox register values for all gpcs. - Make gm20b_gr_falcon_fecs_dump_stats() a static function - Add gm20b_gr_falcon_dump_stats() to trigger gm20b_gr_falcon_fecs_dump_stats() and gm20b_gr_falcon_gpccs_dump_stats() - Update legacy chips gr.falcon.dump_stats() to gm20b_gr_falcon_dump_stats(). JIRA NVGPU-5597 Change-Id: I992c6432f3c2e3049bacc953f9b53ff6c4aa2f36 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357470 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: Seema Khowala <seemaj@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	59eb714c48	unit: Disable some unit tests for device work Fix what unit tests can be easily fixed, but disable some others. It's not clear why the MM related tests started failing - there's really zero reason for this. The disabled tests are primarily engine related, but there are some others that get influenced by the device and engine structure. test_poweroff.init_poweroff=2 test_is_stall_and_eng_intr_pending.intr_is_stall_and_eng_intr_pending=2 test_isr_nonstall.isr_nonstall=2 test_isr_stall.isr_stall=2 test_engine_enum_from_type.enum_from_type=2 test_engine_find_busy_doing_ctxsw.find_busy_doing_ctxsw=2 test_engine_get_active_eng_info.get_active_eng_info=2 test_engine_get_fast_ce_runlist_id.get_fast_ce_runlist_id=2 test_engine_get_gr_runlist_id.get_gr_runlist_id=2 test_engine_get_mask_on_id.get_mask_on_id=2 test_engine_get_runlist_busy_engines.get_runlist_busy_engines=2 test_engine_ids.ids=2 test_engine_init_info.init_info=2 test_engine_interrupt_mask.interrupt_mask=2 test_engine_is_valid_runlist_id.is_valid_runlist_id=2 test_engine_mmu_fault_id.mmu_fault_id=2 test_engine_mmu_fault_id_veid.mmu_fault_id_veid=2 test_engine_setup_sw.setup_sw=2 test_engine_status.status=2 test_fifo_init_support.init_support=2 test_fifo_remove_support.remove_support=2 test_gp10b_engine_init_ce_info.engine_init_ce_info=2 test_nvgpu_mem_iommu_translate.mem_iommu_translate=2 test_nvgpu_mem_phys_ops.nvgpu_mem_phys_ops=2 And delete unit tests for functions that no longer exist: test_device_info_parse_enum.top_device_info_parse_enum test_get_device_info.top_get_device_info test_get_num_engine_type_entries.top_get_num_engine_type_entries test_is_engine_ce.top_is_engine_ce test_is_engine_gr.top_is_engine_gr JIRA NVGPU-5421 Change-Id: I343c0b1ea44c472b22356c896672153fc889ffc0 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355300 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	5f0fdf085c	nvgpu: unit: Add new mock register framework Many tests used various incarnations of the mock register framework. This was based on a dump of gv11b registers. Tests that greatly benefitted from having generally sane register values all rely heavily on this framework. However, every test essentially did its own thing. This was not efficient and has caused some issues in cleaning up the device and host code. Therefore introduce a much leaner and simplified register framework. All unit tests now automatically get a good subset of the gv11b registers auto-populated. As part of this also populate the HAL with a nvgpu_detect_chip() call. Many tests can now _probably_ have all their HAL init (except dummy HAL stuff) deleted. But this does require a few fixups here and there to set HALs to NULL where tests expect HALs to be NULL by default. Where necessary HALs are cleared with a memset to prevent unwanted code from executing. Overall, this imposes a far smaller burden on tests to initialize their environments. Something to consider for the future, though, is how to handle supporting multiple chips in the unit test world. JIRA NVGPU-5422 Change-Id: Icf1a63f728e9c5671ee0fdb726c235ffbd2843e2 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2335334 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Prateek sethi	470fe3a6d4	gpu: nvgpu: unit: update cg unit test CG unit tests check for invalid register accesses during configuration of various CG modes for various units that involve multiple register accesses. Since ECC detect is now being done in hal init, the corresponding registers need to be added to io space. Bug 2919887 Change-Id: I8ded6a95952d810d9c8627a71752266e493e2c47 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332262 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Thomas Fleury	55510f266d	gpu: nvgpu: unit: improve coverage for engines Improve branch coverage for the following functions: - nvgpu_engine_get_active_eng_info - nvgpu_engine_get_ids - nvgpu_ce_engine_interrupt_mask - nvgpu_engine_get_gr_runlist_id Add unit tests for the following functions: - nvgpu_engine_get_fast_ce_runlist_id - nvgpu_engine_is_valid_runlist_id - nvgpu_engine_id_to_mmu_fault_id - nvgpu_engine_mmu_fault_id_to_engine_id - nvgpu_engine_get_mask_on_id - nvgpu_engine_get_id_and_type - nvgpu_engine_find_busy_doing_ctxsw - nvgpu_engine_get_runlist_busy_engines - nvgpu_engine_mmu_fault_id_to_veid - nvgpu_engine_mmu_fault_id_to_eng_id_and_veid - nvgpu_engine_mmu_fault_id_to_eng_ve_pbdma_id Jira NVGPU-4511 Change-Id: Ib340df17468ff3447e271a86af9a47a067f6ad11 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2262222 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Vedashree Vidwans	0ca906a6ad	gpu: nvgpu: unit: fifo: fifo unit test This unit test covers most of the nvgpu.common.fifo.fifo module lines and almost all branches. Jira NVGPU-3697 Change-Id: I5722277a3e1630a902f63b707eb3de1c4e1876b0 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2237796 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
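
Note on commit a331fd4b3a (pd_cache enablement): the size rule in that message (use the PD cache for any allocation below NVGPU_PD_CACHE_SIZE, and zero a cached entry when it is freed) can be illustrated with a minimal sketch. This is not the driver's code. The `sketch_` names are hypothetical, and the 64K value is taken from the message's "effectively set to 64K bytes".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed slab size, mirroring the "64K bytes" stated in the commit message. */
#define SKETCH_PD_CACHE_SIZE (64U * 1024U)

/*
 * Hypothetical helper: true when a page-directory allocation of the given
 * size would be served from the cache, false when it would be allocated
 * directly. The strict "<" follows the wording of the commit message.
 */
static bool sketch_pd_uses_cache(size_t bytes)
{
	return bytes < SKETCH_PD_CACHE_SIZE;
}

/*
 * Hypothetical helper: invalidate a cached entry on free by zeroing it,
 * which the commit message describes as the way to avoid prefetch errors.
 */
static void sketch_pd_invalidate(void *entry, size_t bytes)
{
	memset(entry, 0, bytes);
}
```

Under these assumptions a 4K request takes the cached path while a request of exactly 64K is allocated directly. The fallback to smaller direct allocations on contiguous allocation failure is not modelled here.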

10 Commits
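
Note on commit e0a1fcf5f5 (Central Intr Controller unit): the message describes a per-error-id policy LUT with two levels (correctable and quiesce), a LUT pointer that is NULL for every configuration except gv11b, and a report_err() op that is NULL where reporting is not supported. A minimal sketch of that shape follows. Every type and function name here is hypothetical and carries a `sketch_` prefix; none of it is taken from the nvgpu sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy levels; the commit message names only these two. */
enum sketch_err_policy {
	SKETCH_ERR_CORRECTABLE = 0,
	SKETCH_ERR_QUIESCE = 1
};

/* Hypothetical stand-in for the static per-error-id data. */
struct sketch_err_desc {
	uint32_t err_id;
	enum sketch_err_policy policy;
};

/* Hypothetical CIC state: lut and report_err may both be NULL. */
struct sketch_cic {
	const struct sketch_err_desc *lut;
	size_t num_entries;
	int (*report_err)(uint32_t err_id);
};

/* Return the LUT entry for err_id, or NULL if there is no LUT or no match. */
static const struct sketch_err_desc *
sketch_cic_lookup(const struct sketch_cic *cic, uint32_t err_id)
{
	size_t i;

	if (cic == NULL || cic->lut == NULL) {
		return NULL;
	}
	for (i = 0; i < cic->num_entries; i++) {
		if (cic->lut[i].err_id == err_id) {
			return &cic->lut[i];
		}
	}
	return NULL;
}

/*
 * Report an error through the optional op. Returns 0 when the op is NULL
 * (nothing to do), -1 when the id is not in the LUT, otherwise the op's result.
 */
static int sketch_cic_report(const struct sketch_cic *cic, uint32_t err_id)
{
	if (cic == NULL || cic->report_err == NULL) {
		return 0;
	}
	if (sketch_cic_lookup(cic, err_id) == NULL) {
		return -1;
	}
	return cic->report_err(err_id);
}

/* Sample reporter for experimentation: counts calls and always succeeds. */
static unsigned int sketch_report_count;

static int sketch_count_report(uint32_t err_id)
{
	(void)err_id;
	sketch_report_count++;
	return 0;
}
```

Treating a NULL op as a silent no-op is one design choice among several; a real driver might instead log or return an error code in that case.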