linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Alex Waterman	fba96fdc09	gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device Delete the struct nvgpu_engine_info as it's essentially identical to struct nvgpu_device. Duplicating data structures is not ideal as it's terribly confusing what does what. Update all uses of nvgpu_engine_info to use struct nvgpu_device. This is often a fairly straightforward replacement. Couple of places though where things got interesting: - The enum_type that engine_info uses is defined in engines.h and has a bit of SW abstraction - in particular the GRCE type. The only place this seemed to be actually relevant (the IOCTL providing device info to userspace) the GRCE engines can be worked out by comparing runlist ID. - Addition of masks based on intr_id and reset_id; those can be computed easily enough using BIT32() but this is an area that could be improved on. This reaches into a lot of extraneous code that traverses the fifo active engines list and dramatically simplifies this. Now, instead of having to go through a table of engine IDs that point to the list of all host engines, the active engine list is just a list of pointers to valid engines. It's now trivial to do a for-all-active-engines type loop. This could even be turned into a generic macro or otherwise abstracted in the future. JIRA NVGPU-5421 Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	ab2b0b5949	gpu: nvgpu: Set unserviceable flag early during RC During recovery, we set ch->unserviceable at the end after we preempt the TSG and reset the engines. It might be too late and user-space might submit more work to the broken channel which is not desirable. Move setting this unserviceable flag right at the start of recovery sequence. Another thread doing a submit can still read the unserviceable flag just before it is set here, leaving that submit stuck if recovery completes before the submit thread advances enough to set up a post fence visible for other threads. This could be fixed with a big lock or with a double check at the end of the submit code after the job data has been made visible. We still release the fences, semaphore and error notifier wait queues at the end; so user-space would not trigger channel unbind while channel is being recovered. Also, change the handle_mmu_fault APIs to return void as the debug_dump return value is not used in any of the caller APIs. JIRA NVGPU-5843 Change-Id: Ib42c2816dd1dca542e4f630805411cab75fad90e Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385256 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	359fc24aaf	gpu: nvgpu: Rework engine management to work with vGPU Currently the vGPU engine management rewrites a lot of the common device agnostic engine management code. With the new top HAL parsing one device at a time, it is now more easily possible to tie the vGPU into the new common device framework by implementing the top HAL but with the vGPU engine list backend. This lets the vGPU inherit all the common engine and device management code. By doing so the vGPU HAL need only implement a trivial and simple HAL. This also gets us a step closer to merging all of the CE init code: logically it just iterates through all CE engines whatever they may be. The only reason this differs between chips is because of the swap from CE0-2 to LCEs in the Pascal generation. This could be abstracted by the unit code easily enough. Also, the pbdma_id for each engine has to be added to the device struct. Eventually this was going to happen anyway, since the device struct will soon replace the nvgpu_engine_info struct. It's a little bit of an abuse but might be worth it long term. If not, it should not be difficult to replace uses of dev->pbdma_id with a proper lookup of PBDMA ID based on the device info. JIRA NVGPU-5421 Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	194fac7f3c	gpu: nvgpu: Remove clutter in engine code Remove the get_mask_on_id() HAL and replace its usage with the global nvgpu_engine_get_mask_on_id() function. There's no need to have this function as a HAL. JIRA NVGPU-5420 Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Philip Elcan	06fd513e1e	gpu: nvgpu: move common.unit into common.mc nvgpu.common.unit was just an enum used for passing to nvgpu.common.mc APIs. So, move the enum into mc.h, and replace the include of unit.h with mc.h where appropriate. And update the yaml arch. JIRA NVGPU-4144 Change-Id: I210ea4d3b49cd494e43add1b52f3fbcdb020a1e3 Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2216106 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Seema Khowala	39070c653f	gpu: nvgpu: move FIFO_INVAL_* out of fifo_gk20a.h Move and rename FIFO_INVAL_ENGINE_ID -> NVGPU_INVALID_ENG_ID FIFO_INVAL_TSG_ID -> NVGPU_INVALID_TSG_ID FIFO_INVAL_RUNLIST_ID -> NVGPU_INVALID_RUNLIST_ID FIFO_INVAL_SYNCPT_ID -> NVGPU_INVALID_SYNCPT_ID FIFO_INVAL_CHANNEL_ID -> NVGPU_INVALID_CHANNEL_ID JIRA NVGPU-2012 Change-Id: Ic4cc16ece64d85e22f16e4d28dcfd0c187bb65f3 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2109011 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-02 23:40:26 -07:00
Thomas Fleury	258a6141fd	gpu: nvgpu: rename runlist functions Renamed: - gk20a_runlist_reload -> nvgpu_runlist_reload - gk20a_fifo_interleave_level_name -> nvgpu_runlist_interleave_level_name - gk20a_runlist_update_for_channel -> nvgpu_runlist_update_for_channel - nvgpu_fifo_lock_active_runlists -> nvgpu_runlist_lock_active_runlists - nvgpu_fifo_unlock_active_runlists -> nvgpu_runlist_unlock_active_runlists - nvgpu_fifo_get_runlists_mask -> nvgpu_runlist_get_runlists_mask - nvgpu_fifo_unlock_runlists -> nvgpu_runlist_unlock_runlists - gk20a_runlist_update -> nvgpu_runlist_update Jira NVGPU-3198 Change-Id: Ifc5ad2aae546614667c174643ee07283d2716adc Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2108029 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-30 12:46:02 -07:00
Seema Khowala	60633ca551	gpu: nvgpu: move gv11b rc code to rc_gv11b.c Move chip specific recovery code for volta onwards architecture to hal/rc/rc_gv11b.c Rename fifo.teardown_ch_tsg -> fifo.recover gk20a_runlist_update_locked -> nvgpu_runlist_update_locked Remove Unused h/w headers from fifo_gv11b.c Use local variable f instead of g->fifo JIRA NVGPU-1314 Change-Id: Ia535bbe4780e7241fdd911a8f577c6b98cf0fe53 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2102897 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-24 20:23:06 -07:00
Nicolas Benech	0435ca4eb3	gpu: nvgpu: fix MISRA 17.7 in nvgpu.common.hal.fifo.* MISRA Rule-17.7 requires the return value of all functions to be used. Fix is either to use the return value or change the function to return void. This patch contains fixes for all 17.7 violations in the following units: - nvgpu.common.hal.fifo.runlist - nvgpu.common.hal.fifo.fifo JIRA NVGPU-3039 Change-Id: I9483f5cb623cfe36d6b26e41c33f124c24710c08 Signed-off-by: Nicolas Benech <nbenech@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2098765 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-19 19:04:05 -07:00
Seema Khowala	da9dee85e2	gpu: nvgpu: move mmu fault handling to hal/fifo Move chip specific mmu fault handling from fifo_gk20a.c to hal/fifo/mmu_fault_gk20a.c Move gk20a_teardown_ch_tsg to hal/rc/rc_gk20a.c JIRA NVGPU-1314 Change-Id: Idf88b1c312bc9f46c2508f2c63e948d71d622297 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2094051 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-18 15:56:08 -07:00
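
Commit fba96fdc09 describes the active engine list becoming a plain list of pointers to valid engines, with masks derived from intr_id and reset_id via BIT32() and GRCE engines identified by comparing runlist IDs. A minimal sketch of that idea follows. Every type, field layout, and function name here (device_sketch, fifo_sketch, active_intr_mask, and so on) is a hypothetical stand-in for illustration and is not taken from the nvgpu sources; only the field names intr_id, reset_id, and runlist_id and the BIT32() macro name come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real nvgpu definitions. */
#define BIT32(i)    ((uint32_t)1U << (i))
#define MAX_ENGINES 8U

enum dev_type_sketch { DEV_GRAPHICS, DEV_COPY };

struct device_sketch {
	enum dev_type_sketch type;
	uint32_t runlist_id;
	uint32_t intr_id;   /* assumed to be below 32 */
	uint32_t reset_id;  /* assumed to be below 32 */
};

struct fifo_sketch {
	/* Pointers straight to valid engines: no engine-ID lookup table. */
	const struct device_sketch *active_engines[MAX_ENGINES];
	uint32_t num_engines;
};

/* OR together the interrupt bit of every active engine. */
static uint32_t active_intr_mask(const struct fifo_sketch *f)
{
	uint32_t mask = 0U;

	assert(f->num_engines <= MAX_ENGINES);
	for (uint32_t i = 0U; i < f->num_engines; i++) {
		assert(f->active_engines[i]->intr_id < 32U);
		mask |= BIT32(f->active_engines[i]->intr_id);
	}
	return mask;
}

/* OR together the reset bit of every active engine. */
static uint32_t active_reset_mask(const struct fifo_sketch *f)
{
	uint32_t mask = 0U;

	assert(f->num_engines <= MAX_ENGINES);
	for (uint32_t i = 0U; i < f->num_engines; i++) {
		assert(f->active_engines[i]->reset_id < 32U);
		mask |= BIT32(f->active_engines[i]->reset_id);
	}
	return mask;
}

/* A copy engine counts as GRCE here if it shares a runlist with graphics. */
static bool is_grce(const struct fifo_sketch *f,
		    const struct device_sketch *dev)
{
	if (dev->type != DEV_COPY) {
		return false;
	}
	for (uint32_t i = 0U; i < f->num_engines; i++) {
		const struct device_sketch *other = f->active_engines[i];

		if (other->type == DEV_GRAPHICS &&
		    other->runlist_id == dev->runlist_id) {
			return true;
		}
	}
	return false;
}
```

Because each loop walks the pointer list directly, a "for all active engines" traversal needs no intermediate engine-ID table, which is the simplification the commit message points to.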

10 Commits
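
Commit ab2b0b5949 moves the setting of the unserviceable flag to the start of recovery and notes that a remaining race with a concurrent submit could be closed with a big lock or with a double check at the end of the submit path. The sketch below illustrates the double-check variant only, using C11 atomics. It is an assumption-laden illustration, not the driver's code: channel_sketch, job_visible, waiters_released, and all function names are hypothetical, and the commit message does not say that the double check was actually implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical channel state; the real nvgpu channel struct differs. */
struct channel_sketch {
	atomic_bool unserviceable;
	atomic_bool job_visible;      /* stands in for "post fence set up" */
	atomic_bool waiters_released; /* stands in for releasing wait queues */
};

static void channel_init_sketch(struct channel_sketch *ch)
{
	atomic_init(&ch->unserviceable, false);
	atomic_init(&ch->job_visible, false);
	atomic_init(&ch->waiters_released, false);
}

/* Recovery: mark the channel broken first, release waiters last. */
static void recover_sketch(struct channel_sketch *ch)
{
	atomic_store(&ch->unserviceable, true);

	/* ... preempt the TSG and reset engines (elided) ... */

	atomic_store(&ch->waiters_released, true);
}

/* Submit: returns 0 on success, -1 if the channel is unserviceable. */
static int submit_sketch(struct channel_sketch *ch)
{
	/* Early check: reject work on a channel already marked broken. */
	if (atomic_load(&ch->unserviceable)) {
		return -1;
	}

	/* Publish the job so that recovery can see it. */
	atomic_store(&ch->job_visible, true);

	/*
	 * Double check: if recovery set the flag between the early check
	 * and the publish, back out instead of leaving the submit stuck.
	 */
	if (atomic_load(&ch->unserviceable)) {
		atomic_store(&ch->job_visible, false);
		return -1;
	}
	return 0;
}
```

The second load is what catches the interleaving the commit message describes, where the submit thread reads the flag just before recovery sets it.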