linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Lakshmanan M	883c12529a	gpu: nvgpu: Add multi GR reset support for MIG * Added multi GR reset/recovery support for MIG. * Added an API to get the gr engine id using gr instance id. JIRA NVGPU-5650 JIRA NVGPU-5653 Change-Id: I12ece75a4c33f0944f404121b54879e814dda6df Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2443644 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:48 -06:00
Konsta Hölttä	e8201d6ce3	gpu: nvgpu: decouple channel watchdog dependencies The channel code needs the watchdog code and vice versa. Cut this circular dependency with a few simplifications so that the watchdog wouldn't depend on so much. When calling watchdog APIs that cause stores or comparisons of channel progress, provide a snapshot of the current progress instead of a whole channel pointer. struct nvgpu_channel_wdt_state is added as an interface for this to track gp_get and pb_get. When periodically checking the watchdog state, make the channel code ask whether a hang has been detected and abort the channel from within channel code instead of asking the watchdog to abort the channel. The debug dump verbosity flag is also moved back to the channel data. Move the functionality to restart all channels' watchdogs to channel code from watchdog code. Looping over active channels is not a good feature for the watchdog; it's better for the channel handling to just use the watchdog as a tracking tool. Move a few unserviceable checks up in the stack to the callers of the wdt code. They're a kludge but this will do for now and demonstrates what needs to be eventually fixed. This does not leave much code in the watchdog unit. Now the purpose of the watchdog is to only isolate the logic to couple a timer and progress snapshots with careful locking to start and stop the tracking. Jira NVGPU-5582 Change-Id: I7c728542ff30d88b1414500210be3fbaf61e6e8a Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369820 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	359fc24aaf	gpu: nvgpu: Rework engine management to work with vGPU Currently the vGPU engine management rewrites a lot of the common device agnostic engine management code. With the new top HAL parsing one device at a time, it is now more easily possible to tie the vGPU into the new common device framework by implementing the top HAL but with the vGPU engine list backend. This lets the vGPU inherit all the common engine and device management code. By doing so the vGPU HAL need only implement a trivial and simple HAL. This also gets us a step closer to merging all of the CE init code: logically it just iterates through all CE engines whatever they may be. The only reason this differs between chips is because of the swap from CE0-2 to LCEs in the Pascal generation. This could be abstracted by the unit code easily enough. Also, the pbdma_id for each engine has to be added to the device struct. Eventually this was going to happen anyway, since the device struct will soon replace the nvgpu_engine_info struct. It's a little bit of an abuse but might be worth it long term. If not, it should not be difficult to replace uses of dev->pbdma_id with a proper lookup of PBDMA ID based on the device info. JIRA NVGPU-5421 Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	194fac7f3c	gpu: nvgpu: Remove clutter in engine code Remove the get_mask_on_id() HAL and replace its usage with the global nvgpu_engine_get_mask_on_id() function. There's no need to have this function as a HAL. JIRA NVGPU-5420 Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6cbc174fc2	gpu: nvgpu: avoid channel wdt ifdefs Implement empty stubs of the channel watchdog functions for when watchdog is disabled from build. Add some forward declarations that were missing. Now most call sites don't need #ifdefs for the build flag. Add error checks for the wdt alloc failure. Jira NVGPU-5494 Jira NVGPU-5493 Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	21e02878f4	gpu: nvgpu: move wdt code out of channel.c Cut and paste the existing channel watchdog functions to another file for better isolation of units. Jira NVGPU-5494 Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Scott Long	5ee9a446b5	gpu: nvgpu: misra 12.1 fixes MISRA Advisory Rule states that the precedence of operators within expressions should be made explicit. This change removes the Advisory Rule 12.1 violations from various common units. Jira NVGPU-3178 Change-Id: I4b77238afdb929c81320efa93ac105f9e69af9cd Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2277480 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	6b62e0f79a	gpu: nvgpu: engine preempt timeout in safety Preempt TSG occurs in non-mission mode, when unbinding channel from TSG, or aborting TSG. Should a preempt not complete on engine, we expect other HW safety mechanisms such as FECS watchdog to detect issues that prevented saving current context. Add BUG_ON when attempting to recover from preempt timeout, to make sure we got such error, and sw_quiesce has been requested. Jira NVGPU-4230 Change-Id: Ia26a61e703f74eb28d29e72e75664ca4ec97a586 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2265082 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	5e9bdbc80d	gpu: nvgpu: runlist update timeout in safety Runlist update occurs in non-mission mode, when adding/removing channel/TSGs. The pending bit is a debug only feature. As a result logging a warning is sufficient. We expect other HW safety mechanisms such as PBDMA timeout to detect issues that caused pending to not clear. It's possible bad base address could cause some MMU faults too. Worst case we rely on the application level task monitor to detect the GPU tasks are not completing on time. Jira NVGPU-4322 Change-Id: I7233770349db5dfad6904170a1e9a2d5eada70b2 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2265094 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	5e688c35f8	gpu: nvgpu: set error notifier in SW quiesce For MMU and PBDMA faults, error notifier needs to be set before entering SW quiesce. Otherwise it ends up with default NVGPU_ERR_NOTIFIER_FIFO_ERROR_IDLE_TIMEOUT. Added nvgpu_rc_mmu_fault to: - call g->ops.fifo.recover when recovery is enabled - set MMU error when recovery is disabled Updated nvgpu_rc_pbdma_fault to set PBDMA error when recovery is disabled as well. Wait for deferred interrupts to complete before actually entering SW quiesce state, to make sure error notifier has been set. Jira NVGPU-4127 Change-Id: Ia84c723e021e397391c6c609d4bb96c06afdcc47 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2210909 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	b8465d479d	gpu: nvgpu: sw quiesce when recovery is disabled When CONFIG_NVGPU_RECOVERY is disabled, warn if recovery function is entered with sw_quiesce_pending false. Jira NVGPU-3871 Change-Id: Ic8e878ff6637c07f80b1a3542355ec51f729fe12 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2175446 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:01:38 -06:00
Scott Long	4277f65834	gpu: nvgpu: fix misra 2.7 violations Advisory Rule 2.7 states that there should be no unused parameters in functions. This patch removes unused function parameters from the following: * nvgpu_channel_ctxsw_timeout_debug_dump_state() * nvgpu_channel_destroy() * nvgpu_tsg_destroy() * nvgpu_rc_pbdma_fault() Jira NVGPU-3178 Change-Id: I12ad0d287fd7980533663a9776428ef5d4fd1fb9 Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2176066 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-16 16:06:04 -07:00
Thomas Fleury	4899875f00	gpu: nvgpu: fix missing exports when disabling recovery Some recovery functions are currently exported in libnvgpu_safe.export. Once CONFIG_NVGPU_RECOVERY is permanently disabled for safety build, we can remove those functions from the export file. Until we can disable it, make sure that related functions do exist, even when CONFIG_NVGPU_RECOVERY is disabled, by using #ifdefs inside the functions, instead of redeclaring functions as static inline. Jira NVGPU-3871 Change-Id: Ib682ae81268b35cd1050a55cc73653fb6637b87c Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2170433 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-08 16:06:30 -07:00
Debarshi Dutta	0ef96e4b1a	gpu: nvgpu: correct handling of pbdma rc nvgpu_rc_pbdma_fault just checks for the id and id_type from struct nvgpu_pbdma_status_info. These contain invalid values during chsw_load and chsw_switch. This patch corrects the above bug by checking for the chsw status and then loading the values for id and type. The current code reads the pbdma_status info after clearing the interrupt. Other interrupts can cause enough delay between clearing the interrupt and pbdma switching the channel leading to invalid channel/tsg ID. Correct that by reading the pbdma_status info register before clearing of the pbdma interrupt to correctly read the context information before the pbdma can switch out the context. Bug 2648298 Change-Id: Ic2f0682526e00d14ad58f0411472f34388183f2b Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2165047 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-05 02:56:16 -07:00
Deepak Nibade	0755b25231	gpu: nvgpu: remove reset and enable/disable ctxsw hals Remove below hals since the corresponding functions are same on all platforms and they are h/w independent g->ops.gr.enable_ctxsw() g->ops.gr.disable_ctxsw() g->ops.gr.reset() Call the functions directly at all places Remove CONFIG_NVGPU_DEBUGGER from places where these functions are called since they are not debugger dependent This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery sequence intact Jira NVGPU-3506 Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2137255 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-06-18 01:39:20 -07:00
Deepak Nibade	10fae67c21	gpu: nvgpu: add flag for debugger fields in struct gk20a Add CONFIG_NVGPU_DEBUGGER flag for debugger specific fields in struct gk20a Jira NVGPU-3506 Change-Id: Icfae87e16e0079a2c5f16714b8a8ced7c6572cd4 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2137254 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-06-18 01:39:10 -07:00
Sagar Kamble	3f08cf8a48	gpu: nvgpu: rename feature Make and C flags Name the Make and C flag variables consistently with syntax: CONFIG_NVGPU_<feature name> s/NVGPU_DEBUGGER/CONFIG_NVGPU_DEBUGGER s/NVGPU_CYCLESTATS/CONFIG_NVGPU_CYCLESTATS s/NVGPU_USERD/CONFIG_NVGPU_USERD s/NVGPU_CHANNEL_WDT/CONFIG_NVGPU_CHANNEL_WDT s/NVGPU_FEATURE_CE/CONFIG_NVGPU_CE s/NVGPU_GRAPHICS/CONFIG_NVGPU_GRAPHICS s/NVGPU_ENGINE/CONFIG_NVGPU_FIFO_ENGINE_ACTIVITY s/NVGPU_FEATURE_CHANNEL_TSG_SCHED/CONFIG_NVGPU_CHANNEL_TSG_SCHED s/NVGPU_FEATURE_CHANNEL_TSG_CONTROL/CONFIG_NVGPU_CHANNEL_TSG_CONTROL s/NVGPU_FEATURE_ENGINE_QUEUE/CONFIG_NVGPU_ENGINE_QUEUE s/GK20A_CTXSW_TRACE/CONFIG_NVGPU_FECS_TRACE s/IGPU_VIRT_SUPPORT/CONFIG_NVGPU_IGPU_VIRT s/CONFIG_TEGRA_NVLINK/CONFIG_NVGPU_NVLINK s/NVGPU_DGPU_SUPPORT/CONFIG_NVGPU_DGPU s/NVGPU_VPR/CONFIG_NVGPU_VPR s/NVGPU_REPLAYABLE_FAULT/CONFIG_NVGPU_REPLAYABLE_FAULT s/NVGPU_FEATURE_LS_PMU/CONFIG_NVGPU_LS_PMU s/NVGPU_FEATURE_POWER_PG/CONFIG_NVGPU_POWER_PG JIRA NVGPU-3624 Change-Id: I8b2492b085095fc6ee95926d8f8c3929702a1773 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2130290 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-06-11 09:46:24 -07:00
Deepak Nibade	649a2b57a8	gpu: nvgpu: add debugger flag for hal.gr.gr unit Add NVGPU_DEBUGGER flag for common.hal.gr.gr unit and corresponding hals. Also add this flag for deferred reset functionality Jira NVGPU-3506 Change-Id: Iee4fbc1305346bb4d779cd69e8fd5539cb07206b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2130149 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-06-06 16:28:44 -07:00
Debarshi Dutta	4c30bd599f	gpu: nvgpu: rename tsg_gk20a/gk20a_tsg functions. rename the functions with the prefixes tsg_gk20a/gk20a_tsg to nvgpu_tsg_* Jira NVGPU-3248 Change-Id: I9f5f601040d994cd7798fe76813cc86c8df126dc Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2120165 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-17 01:49:27 -07:00
Debarshi Dutta	1dea88c6c7	gpu: nvgpu: Add NVGPU_CHANNEL_WDT flag NVGPU_CHANNEL_WDT feature is embedded within the NVGPU_CHANNEL_WDT flag to allow it to be compiled out for safety builds. Jira NVGPU-3012 Change-Id: I0ca54af9d7b1b8e01f4090442341eaaadca8e339 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2114480 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-16 23:28:13 -07:00
Seema Khowala	671f1c8a36	gpu: nvgpu: channel MISRA fix for Rule 21.2 Rename _gk20a_channel_get -> nvgpu_channel_get__func gk20a_channel_get -> nvgpu_channel_get _gk20a_channel_put -> nvgpu_channel_put__func gk20a_channel_put -> nvgpu_channel_put trace_gk20a_channel_get -> trace_nvgpu_channel_get trace_gk20a_channel_put -> trace_nvgpu_channel_put JIRA NVGPU-3388 Change-Id: I4e37adddbb5ce14aa18132722719ca2f73f1ba52 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2114118 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-09 04:39:34 -07:00
Seema Khowala	26d13b3b6b	gpu: nvgpu: channel MISRA fix for Rule 21.2 Rename functions starting with '_' and '__'. __gk20a_channel_kill -> nvgpu_channel_kill _gk20a_channel_from_id -> nvgpu_channel_from_id__func gk20a_channel_from_id -> nvgpu_channel_from_id JIRA NVGPU-3388 Change-Id: I3b5f63bf214c5c5e49bc84ba8ef79bd49831c56e Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2114037 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-09 04:39:08 -07:00
Debarshi Dutta	17486ec1f6	gpu: nvgpu: rename tsg_gk20a and channel_gk20a structs rename struct tsg_gk20a to struct nvgpu_tsg and rename struct channel_gk20a to struct nvgpu_channel Jira NVGPU-3248 Change-Id: I2a227347d249f9eea59223d82f09eae23dfc1306 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2112424 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-06 02:56:53 -07:00
Seema Khowala	cfb4ff0bfb	gpu: nvgpu: rename struct fifo_gk20a Rename struct fifo_gk20a -> nvgpu_fifo JIRA NVGPU-2012 Change-Id: Ifb5854592c88894ecd830da092ada27c7f05380d Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2109625 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Adeel Raza <araza@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-03 16:25:43 -07:00
Seema Khowala	39070c653f	gpu: nvgpu: move FIFO_INVAL_* out of fifo_gk20a.h Move and rename FIFO_INVAL_ENGINE_ID -> NVGPU_INVALID_ENG_ID FIFO_INVAL_TSG_ID -> NVGPU_INVALID_TSG_ID FIFO_INVAL_RUNLIST_ID -> NVGPU_INVALID_RUNLIST_ID FIFO_INVAL_SYNCPT_ID -> NVGPU_INVALID_SYNCPT_ID FIFO_INVAL_CHANNEL_ID -> NVGPU_INVALID_CHANNEL_ID JIRA NVGPU-2012 Change-Id: Ic4cc16ece64d85e22f16e4d28dcfd0c187bb65f3 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2109011 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-05-02 23:40:26 -07:00
Debarshi Dutta	965062c2bc	gpu: nvgpu: remove direct tsg retrieval from fifo Added - nvgpu_tsg_check_and_get_from_id - nvgpu_tsg_get_from_id And removed direct accesses to f->tsg array. Jira NVGPU-3156 Change-Id: I8610e19c1a6e06521c16a1ec0c3a7a011978d0b7 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2101251 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-26 14:16:47 -07:00
Vinod G	344b164eea	gpu: nvgpu: remove gr_gk20a.h from gk20a.h Remove gr_gk20a.h from gk20a.h Add gr_gk20a.h in all gr hal files Removed unused gr_priv.h from two files Jira NVGPU-3217 Jira NVGPU-3218 Change-Id: Ic74c068782432e99ddba168f65a5cf42e1405305 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2104569 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-25 16:27:11 -07:00
Seema Khowala	60633ca551	gpu: nvgpu: move gv11b rc code to rc_gv11b.c Move chip specific recovery code for volta onwards architecture to hal/rc/rc_gv11b.c Rename fifo.teardown_ch_tsg -> fifo.recover gk20a_runlist_update_locked -> nvgpu_runlist_update_locked Remove Unused h/w headers from fifo_gv11b.c Use local variable f instead of g->fifo JIRA NVGPU-1314 Change-Id: Ia535bbe4780e7241fdd911a8f577c6b98cf0fe53 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2102897 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-24 20:23:06 -07:00
Seshendra Gadagottu	a91535e3a3	gpu: nvgpu: avoid gr_falcon dependency outside gr Basic units like fifo, rc are having dependency on gr_falcon. Avoided outside gr units dependency on gr_falcon by moving following functions to gr: int nvgpu_gr_falcon_disable_ctxsw(struct gk20a *g, struct nvgpu_gr_falcon *falcon); -> int nvgpu_gr_disable_ctxsw(struct gk20a *g); int nvgpu_gr_falcon_enable_ctxsw(struct gk20a *g, struct nvgpu_gr_falcon *falcon); -> int nvgpu_gr_enable_ctxsw(struct gk20a *g); int nvgpu_gr_falcon_halt_pipe(struct gk20a *g); -> int nvgpu_gr_halt_pipe(struct gk20a *g); HALs also moved accordingly and updated code to reflect this. Also moved following data back to gr from gr_falcon: struct nvgpu_mutex ctxsw_disable_mutex; int ctxsw_disable_count; JIRA NVGPU-3168 Change-Id: I2bdd4a646b6f87df4c835638fc83c061acf4051e Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2100009 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-23 05:04:44 -07:00
Vinod G	dc82262b99	gpu: nvgpu: Add gr_priv header file Move nvgpu_gr structure to private file gr_priv.h Include the private file where gr variables are used. JIRA NVGPU-3132 JIRA NVGPU-3079 Change-Id: Ib26ca5c5cb25fd8dd013a7c643278efc34aa55d4 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2098021 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-22 03:15:09 -07:00
Vinod G	556e139077	gpu: nvgpu: Cleanup for gr_gk20a header Removed unused struct from gr_gk20a.h Change static allocation for struct gr_gk20a to dynamic type. Change all the files that being affected by that change. Call gr allocation from corresponding init_support functions, which are part of the probe functions. nvgpu_pci_init_support in pci.c vgpu_init_support in vgpu_linux.c gk20a_init_support in module.c Call gr free before the gk20a free call in nvgpu_free_gk20a. Rename struct gr_gk20a to struct nvgpu_gr JIRA NVGPU-3132 Change-Id: Ief5e664521f141c7378c4044ed0df5f03ba06fca Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2095798 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-19 00:04:00 -07:00
Seema Khowala	ca628dfd6e	gpu: nvgpu: move engine functions to engines.c Removed fifo.runlist_busy_engines ops Moved to engines.c and renamed gk20a_fifo_get_failing_engine_data -> nvgpu_engine_find_busy_doing_ctxsw gk20a_fifo_get_faulty_id_type -> nvgpu_engine_get_id_and_type gk20a_fifo_runlist_busy_engines -> nvgpu_engine_get_runlist_busy_engines JIRA NVGPU-1314 Change-Id: I89c81f331321d47a616a785082d66f9b4a51ff71 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2093788 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-18 15:55:24 -07:00
Seema Khowala	92e26141d5	gpu: nvgpu: move gk20a_fifo_recover to common/rc Move gk20a_fifo_recover from gk20a/fifo_gk20a.c to common/rc/rc.c Rename gk20a_fifo_recover -> nvgpu_rc_fifo_recover JIRA NVGPU-1314 Change-Id: I5155a73cda9a60275dacd2568423386cd0f808ee Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2093719 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:06:08 -07:00
Seema Khowala	03b521d9d7	gpu: nvgpu: move nvgpu_tsg_recover to common/rc Moved from common/tsg to common/rc and renamed nvgpu_tsg_recover -> nvgpu_rc_tsg_and_related_engines JIRA NVGPU-1314 Change-Id: I887d5fcdb15def13cc74e2993312b3b36119c97c Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2095622 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:05:59 -07:00
Seema Khowala	c570ba99ed	gpu: nvgpu: move sched error bad tsg recovery Move sched error bad tsg recovery from fifo_intr_gv11b.c to common/rc/rc.c JIRA NVGPU-1314 Change-Id: Ic731a3162cad2fe184d764f0b3ad98acc1f382cb Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2095621 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:05:49 -07:00
Seema Khowala	8c5c9de72a	gpu: nvgpu: move gr recovery from gr_gk20a.c to common/rc Move gr fault recovery from gr_gk20a.c to common/rc/rc.c JIRA NVGPU-1314 Change-Id: I0d924975f0397ae2417e5a43b2d048f3ae9c4f79 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2093706 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:05:40 -07:00
Seema Khowala	2f00275584	gpu: nvgpu: move preempt timeout rc from fifo to rc Move preempt timeout recovery related function to common/rc. Remove nvgpu_channel_recover as bare channels are not recovered. Recover channels bound to tsg. JIRA NVGPU-1314 Change-Id: Ic1f94b321d0404eea86dd6d6d990529b2f3a8d57 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2093682 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:05:25 -07:00
Seema Khowala	1882a7413d	gpu: nvgpu: move runlist update timeout rc to common/rc Move runlist update timeout recovery from runlist.c to rc.c Move RC_TYPE defines from fifo.h to rc.h JIRA NVGPU-1314 Change-Id: I66925ca9fba904c523be69ad99808e3de33a7d46 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2093666 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-04-16 17:05:10 -07:00
Debarshi Dutta	c48bfdd0d6	gpu: nvgpu: move gk20a_fifo_pbdma_fault_rc to common.rc unit gk20a_fifo_pbdma_fault_rc is moved to common.rc unit and renamed to nvgpu_rc_pbdma_fault. The function is modified such that when the pbdma id is a channel, recovery is issued only when the channel is part of a valid tsg. Jira NVGPU-2950 Change-Id: I5e975cf79810479f83ffd50581c214a64d1619a6 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2083749 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-03-29 04:47:28 -07:00
Seema Khowala	dfafddcc21	gpu: nvgpu: move common and chip specific ctxsw timeout Delete apply_ctxsw_timeout_intr ops and add ctxsw_timeout_enable ops Move chip specific sched_error and ctxsw_timeout functions to hal/fifo/fifo_intr_* and hal/fifo/ctxsw_timeout_* Add nvgpu_rc_ctxsw_timeout function under common/rc/rc.c Do not check ctxsw timeout for channels that are no more bound to tsg. JIRA NVGPU-1312 Change-Id: Ide977fb60b3b72a27d9f22873f7a416c3bd1181d Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2075734 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-03-25 22:47:45 -07:00

40 Commits