linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 18:16:01 +03:00

Author	SHA1	Message	Date
Seema Khowala	c9d4df288d	gpu: nvgpu: remove code for ch not bound to tsg - Remove handling for channels that are no more bound to tsg as channel could be referenceable but no more part of a tsg - Use tsg_gk20a_from_ch to get pointer to tsg for a given channel - Clear unhandled gr interrupts Bug 2429295 JIRA NVGPU-1580 Change-Id: I9da43a2bc9a0282c793b9f301eaf8e8604f91d70 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1972492 (cherry picked from commit `013ca60edd` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2018262 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Tested-by: Debarshi Dutta <ddutta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-22 18:59:18 -08:00
Seema Khowala	220860d043	gpu: nvgpu: rename has_timedout and make it thread safe Currently has_timedout variable is protected by wmb at places where it is being set and there is no correspoding rmb whenever has_timedout variable is read. This is prone to errors for concurrent execution. This change is supposed to fix this issue. Rename has_timedout variable of channel struct to ch_timedout. Also to avoid rmb every time ch_timedout is read, ch_timedout_spinlock is added to protect ch_timedout variable for taking care of concurrent execution. Bug 2404865 Bug 2092051 Change-Id: I0bee9f50af0a48720aa8b54cbc3af97ef9f6df00 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1930935 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit `1f54ea09e3` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2016975 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-13 13:19:37 -08:00
Debarshi Dutta	18643ac135	gpu: nvgpu: replace input param chid with pointer to channel preempt_channel needs to use the channel to pass it to other public functions, get access to a tsg etc. This qualifies it to take a pointer to a channel as an input parameter instead of a chid. Increment the channel ref counter using the function gk20a_channel_from_id in functions where we get the chid from the h/w registers directly. Once the prempt_channel function call is done, use a gk20a_channel_put on the referenced channel. Jira NVGPU-1461 Change-Id: I6c87c8104cfcb418d468c8c590087fd4aeabf4bd Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1963200 (cherry picked from commit `9abe9fe062` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013728 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-11 08:18:47 -08:00
Debarshi Dutta	a8f0cb89f4	gpu: nvgpu: replace input param chid with pointer to channel gk20a_fifo_recover_channel takes a reference to the channel via its chid before passing the channel pointer to other public functions such as gk20a_channel_abort and gk20a_fifo_error_ch. This qualifies the gk20a_fifo_recover_channel to take a pointer to a channel instead of only chid. Jira NVGPU-1461 Change-Id: I338a12a05e5ccee785a202fea7848db5201a3a39 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1963199 (cherry picked from commit `99acb8011a` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013727 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-11 08:18:44 -08:00
Debarshi Dutta	d9efcd5871	gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a The function gk20a_fifo_recover_tsg has to pass a valid struct tsg to other functions from within. This qualifies it to have a pointer to struct tsg_gk20a as an input parameter. Tsg specific parts of the gk20a_fifo_preempt_timeout_rc are now moved into another function gk20a_fifo_preempt_timeout_rc_tsg that takes a tsg as an input and passes it to gk20a_fifo_recover_tsg. The pointer to a tsg is also used to enumerate channels from within. The function gk20a_fifo_preempt_timeout_rc now contains only channel specific code. Jira NVGPU-1461 Change-Id: Ice0a9921567841fb5586a7e4e010c442ca6cf172 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1961675 (cherry picked from commit `e19cea7ab3` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013726 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-11 08:18:40 -08:00
Debarshi Dutta	ef9de9e992	gpu: nvgpu: replace input parameter tsgid with pointer to struct tsg_gk20a gv11b_fifo_preempt_tsg needs to access the runlist_id of the tsg as well as pass the tsg pointer to other public functions such as gk20a_fifo_disable_tsg_sched. This qualifies the preempt_tsg to use a pointer to a struct tsg_gk20a instead of just using the tsgid. Jira NVGPU-1461 Change-Id: I01fbd2370b5746c2a597a0351e0301b0f7d25175 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1959068 (cherry picked from commit `1e78d47f15` in rel-32) Reviewed-on: https://git-master.nvidia.com/r/2013725 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-11 08:18:36 -08:00
Debarshi Dutta	5b8ecbc51f	gpu: nvgpu: replace tsgid input variable with pointer to a struct tsg_gk20a replace tsgid with a pointer to a struct tsg_gk20a in the function gk20a_fifo_tsg_abort(). gk20a_fifo_tsg_abort needs to enumerate through all the channels within the tsg as well as pass the tsg pointer to other functions, qualifying the need to use a pointer instead as an input parameter. Jira NVGPU-1461 Change-Id: I59cec05d5d778f733d0c3e9ffadf46e74e249080 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1956567 (cherry picked from commit `e5bebd880f` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2013724 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-11 08:18:33 -08:00
Konsta Holtta	7e8ba851a8	gpu: nvgpu: delete raw chid lookup This (dangerous) array lookup with no channel references is now unused. Jira NVGPU-1460 Change-Id: Ic6bdbcf19fc8996bc6ff02a40afe3224bdd5bc27 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1955402 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit `4a53854a92` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008517 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-05 09:04:33 -08:00
Konsta Holtta	3794afbeb1	gpu: nvgpu: add safe channel id lookup Add gk20a_channel_from_id() to retrieve a channel, given a raw channel ID, with a reference taken (or NULL if the channel was dead). This makes it harder to mistakenly use a channel that's dead and thus uncovers bugs sooner. Convert code to use the new lookup when applicable; work remains to convert complex uses where a ref should have been taken but hasn't. The channel ID is also validated against FIFO_INVAL_CHANNEL_ID; NULL is returned for such IDs. This is often useful and does not hurt when unnecessary. However, this does not prevent the case where a channel would be closed and reopened again when someone would hold a stale channel number. In all such conditions the caller should hold a reference already. The only conditions where a channel can be safely looked up by an id and used without taking a ref are when initializing or deinitializing the list of channels. Jira NVGPU-1460 Change-Id: I0a30968d17c1e0784d315a676bbe69c03a73481c Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1955400 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit `7df3d58750` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008515 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-05 09:04:20 -08:00
Philip Elcan	ed6e396090	gpu: nvgpu: channel: make chid u32 The chid member of the channel_gk20a struct was being used as a unsigned value. By being declared as an int, it was causing MISRA 10.3 violations for implicit assignment of different types. JIRA NVGPU-647 Change-Id: I7477fad6f0c837cf7ede1dba803158b1dda717af Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1918470 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit `1c7bb9b538` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008514 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-05 09:04:11 -08:00
Philip Elcan	bace52ac7a	gpu: nvgpu: make tsgid a consistent type Different units were declaring tsgid as int or u32. This makes everyone use u32. This change resolves MISRA 10.3 violations for implicit assingment to different types. JIRA NVGPU-647 Change-Id: I78660e737acb0dad76dd538e5dd37f4527cf5acd Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1918469 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit `f5cac144a0` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/2008513 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-05 09:04:02 -08:00
Konsta Holtta	aa84e8a986	gpu: nvgpu: fix double handling in timeout The context switch timeout works by triggering a hardware timeout at 10 Hz. When handling these, we check whether a channel has actually timed out. Currently the timeout limit can be shorter than the 10 Hz interval which always causes us to recover a channel but would also cause detection of progress if there was any in the interval. Handling both situations at the same time would reuse the channel pointer local to the function after a loop has finished and would cause memory corruption. Fix this by making the two branches mutually exclusive, and move the recover case to happen first because that's how our tests assume things to work. Jira NVGPU-967 Bug 2502074 Change-Id: I26aa0fa7fd80ab42a9a1a93a6cca2cd29c9d3f3f Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1932449 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 8ac9a53d816a3d012a6948a9a96ac6db699c662di in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/1997597 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Bibek Basu <bbasu@nvidia.com> Tested-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-02-05 01:53:04 -08:00
Seema Khowala	0d110b7522	gpu: nvgpu: clear all handled fifo interrupts Issue is that local variable clear_intr is reset if fifo intr handler happens to handle interrupts handled by fifo_error_isr. This fix is to take care of clearing all handled fifo interrupts. Bug 2361571 Change-Id: Ic8fe2294cfb25c58925942750a81c104ec9747de Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1960330 (cherry picked from commit `1195239d1c` in dev-kernel) Reviewed-on: https://git-master.nvidia.com/r/1979744 GVS: Gerrit_Virtual_Submit Reviewed-by: Bitan Biswas <bbiswas@nvidia.com> Tested-by: Bitan Biswas <bbiswas@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-12-28 23:51:32 -08:00
Terje Bergstrom	e3ae03e17a	gpu: nvgpu: Add MC APIs for reset masks Add API for querying reset mask corresponding to a unit. The reset masks need to be read from MC HW header, and we do not want all units to access Mc HW headers themselves. JIRA NVGPU-954 Change-Id: I49ebbd891569de634bfc71afcecc8cd2358805c0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1823384 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-09-27 15:05:25 -07:00
Konsta Holtta	34d552957d	gpu: nvgpu: move channel header to common channel_gk20a is clear from chip specifics and from most dependencies, so move it under the common directory. Jira NVGPU-967 Change-Id: I41f2160b96d4ec84064288ecc22bb360e82352df Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1810578 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-09-05 20:40:32 -07:00
Nicolas Benech	2eface802a	gpu: nvgpu: Fix mutex MISRA 17.7 violations MISRA Rule-17.7 requires the return value of all functions to be used. Fix is either to use the return value or change the function to return void. This patch contains fix for calls to nvgpu_mutex_init and improves related error handling. JIRA NVGPU-677 Change-Id: I609fa138520cc7ccfdd5aa0e7fd28c8ca0b3a21c Signed-off-by: Nicolas Benech <nbenech@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1805598 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-09-05 20:39:08 -07:00
Srirangan	43851d41b1	gpu: nvgpu: gk20a: Fix MISRA 15.6 violations MISRA Rule-15.6 requires that all if-else blocks be enclosed in braces, including single statement blocks. Fix errors due to single statement if blocks without braces by introducing the braces. JIRA NVGPU-671 Change-Id: Iedac7d50aa2ebd409434eea5fda902b16d9c6fea Signed-off-by: Srirangan <smadhavan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1797695 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-09-05 04:51:32 -07:00
Scott Long	a18f364fd2	gpu: nvgpu: fix various MISRA 10.1 bool violations This patch corrects a handful of MISRA 10.1 violations involving illegal arithmetic operations (e.g. bitwise OR) on boolean values: * fix to status handling in regops validation code * fix to debugger event handling in gr code * fix to entries_left tracking in runlist construct code * fix to verbose channel dumping and reset tracking in fifo code JIRA NVGPU-650 Change-Id: I3c3d9123b5a0e08fc936d0e63d51de99fc310ade Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1810957 Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-09-04 10:54:24 -07:00
Scott Long	07f6739285	gpu: nvgpu: switch gk20a nonstall ops to #defines Fix MISRA rule 10.1 violations involving gk20a_nonstall_ops enums by replacing them with with corresponding #defines. Because these values can be used in expressions that require unsigned values (e.g. bitwise OR) we cannot use enums. The g->ce2.isr_nonstall() function was previously returning an int that was a combination of gk20a_nonstall_ops enum bits which led to the violations. JIRA NVGPU-650 Change-Id: I6210aacec8829b3c8d339c5fe3db2f3069c67406 Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1796242 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-08-22 17:31:42 -07:00
Amulya	da43fc5560	gpu: nvgpu: MISRA 10.3-Conversions to/from an enum Fix violations where the conversion is from a non-enum type to enum type or vice-versa. JIRA NVGPU-659 Change-Id: I45f43c907b810cc86b2a4480809d0c6757ed3486 Signed-off-by: Amulya <Amurthyreddy@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1802322 GVS: Gerrit_Virtual_Submit Tested-by: Amulya Murthyreddy <amurthyreddy@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Adeel Raza <araza@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-08-21 14:54:51 -07:00
Vinod G	c9f8f1ea05	gpu: nvgpu: remove utils.h from gk20a.h Removed the utils.h include from gk20a.h utils.h is included in those files which make use of the macros in utils.h JIRA NVGPU-1005 Change-Id: Ifb41da58db6ff8682fa6b5dfdd8eda11a751fcac Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1785952 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-08-10 18:11:26 -07:00
Srirangan	17aeea4a2f	gpu: nvgpu: gk20a: Fix MISRA 15.6 violations This fixes errors due to single statement loop bodies without braces, which is part of Rule 15.6 of MISRA. This patch covers in gpu/nvgpu/gk20a/ JIRA NVGPU-989 Change-Id: I2f422e9bc2b03229f4d2c3198613169ce5e7f3ee Signed-off-by: Srirangan <smadhavan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1791019 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-08-06 17:36:39 -07:00
Debarshi Dutta	82a90170d3	gk20a: nvgpu: Remove io.h dependency from gk20a.h In the current code, gk20a.h includes io.h which gets directly included in a lot of other files. io.h contains methods which uses a struct gk20a as a parameter leading to a circular dependency between io.h and gk20a.h. This can be mitigated by removing io.h from gk20a.h as part of larger effort to moving gk20a.h to nvgpu/gk20a.h JIRA NVGPU-597 Change-Id: I93e504fa9371b88152737b342a75580c65e8f712 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1787316 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-30 11:24:06 -07:00
Seema Khowala	4cbec6b2c7	gpu: nvgpu: set preempt timeout -For Si platforms, gk20a_get_gr_idle_timeout returns 3000 ms i.e. 3 sec. Currently this time is used for preempt polling and this conflicts with channel timeout if polling times out. Use fifo_eng_timeout_us converted to ms for preempt polling. -In case of preempt timeout, do not issue recovery for si platform. ctxsw timeout will trigger recovery if needed. For non si platforms, issue preempt timeout rc if preempt times out. Bug 2113657 Bug 2064553 Bug 2038366 Bug 2028993 Bug 200426402 Change-Id: I8d9f58be9ac634e94defa92a20fb737bf256d841 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1762076 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-30 00:21:04 -07:00
Seema Khowala	5d2058791f	gpu: nvgpu: acquire/release runlist_lock during teardown/mmu_fault -Recovery can be called for various types of faults. Acquire runlist_lock for all runlists so that current teardown is done before proceeding to next one. -For legacy chips teardown is done by triggering mmu fault so make sure runlist_locks are acquired during teardown and also during handling mmu fault. -gk20a_fifo_handle_mmu_fault is renamed as gk20a_fifo_handle_mmu_fault_locked -gk20a_fifo_handle_mmu_fault called from gk20a_fifo_teardown_ch_tsg is replaced with gk20a_fifo_handle_mmu_fault_locked -gk20a_fifo_handle_mmu_fault acquires/release runlist_lock for all runlists and calls gk20a_fifo_handle_mmu_fault_locked Bug 2113657 Bug 2064553 Bug 2038366 Bug 2028993 Change-Id: I973d7ddb6924b50bae2d095152867e99c87e780a Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1761197 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-30 00:21:00 -07:00
Vinod G	509139b8a0	gpu: nvgpu: Rearrange the static inline code In order to avoid the circular dependencies, rearrange the static inline functions from gk20a.h file. Moved gk20a_gr_flush_channel_tlb function to gr_gk20a.c and removed the #include gr_gk20a.h from gk20a.h Added a helper function utils.h to move all generic static inline functions which have no reference to gpu related structures. ptimer related functions are moved to ptimer.h Implementations for as and pmu are moved to corresponding files. JIRA NVGPU-624 Change-Id: I4e956326e773ba037bf3a1696cc4c462085dbbe5 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1781941 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-24 16:11:07 -07:00
Seema Khowala	b1d0d8ece8	Revert "Revert: GV11B runlist preemption patches" This reverts commit `0b02c8589d`. Originally change was reverted as it was making ap_compute test on embedded-qnx-hv e3550-t194 fail. With fixes related to replacing tsg preempt with runlist preempt during teardown, preempt timeout set to 100 ms (earlier this was set to 1000ms for t194 and 3000ms for legacy chips) and not issuing preempt timeout recovery if preempt fails, helped resolve the issue. Bug 200426402 Change-Id: If9a68d028a155075444cc1bdf411057e3388d48e Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1762563 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-19 13:54:26 -07:00
Nitin Kumbhar	7c494c83cc	gpu: nvgpu: add error check for init_runlist Allocations in init_runlist can fail. Check for such a failure during fifo setup is being done. Bug 1987855 Change-Id: I1771a15ebeac81ab2e3ebc9a75363445a0b6f20d Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1770801 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-07-06 07:35:47 -07:00
Alex Waterman	0b02c8589d	Revert: GV11B runlist preemption patches This reverts commit `2d397e34a5`. This reverts commit `cd6e821cf6`. This reverts commit `5cf1eb145f`. This reverts commit `a8d6f31bde`. This reverts commit `067ddbc4e4`. This reverts commit `3eede64de0`. This reverts commit `1407133b7e`. This reverts commit `797dde3e32`. Looks like this makes the ap_compute test on embedded-qnx-hv e3550-t194 quite bad. Might also affect ap_resmgr. Signed-off-by: Alex Waterman <alexw@nvidia.com> Change-Id: Ib9f06514d554d1a67993f0f2bd3d180147385e0a Reviewed-on: https://git-master.nvidia.com/r/1761864 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit	2018-06-26 14:43:08 -07:00
Seema Khowala	cd6e821cf6	gpu: nvgpu: gv11b: add runlist abort & remove bare channel -Add support for aborting runlist/s. Aborting runlist/s, will abort all active tsgs and associated active channels within these active tsgs -Bare channels are no longer supported. Remove recovery support for bare channels. In case there are bare channels, recovery will trigger runlist abort Bug 2125776 Bug 2108544 Bug 2105322 Bug 2092051 Bug 2048824 Bug 2043838 Bug 2039587 Bug 2028993 Bug 2029245 Bug 2065990 Bug 1945121 Bug 200401707 Bug 200393631 Bug 200327596 Change-Id: I6bec8a0004508cf65ea128bf641a26bf4c2f236d Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1640567 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-06-24 09:53:44 -07:00
Seema Khowala	067ddbc4e4	gpu: nvgpu: remove timeout_rc_type i/p param -is_preempt_pending hal does not need timeout_rc_type input param as for volta, reset_eng_bitmask is saved if preempt times out. For legacy chips, recovery triggers mmu fault and mmu fault handler takes care of resetting engines. -For volta, no special input param needed to differentiate between preempt polling during normal scenario and preempt polling during recovery. Recovery path uses preempt_ch_tsg hal to issue preempt. This hal does not issue recovery if preempt times out. Bug 2125776 Bug 2108544 Bug 2105322 Bug 2092051 Bug 2048824 Bug 2043838 Bug 2039587 Bug 2028993 Bug 2029245 Bug 2065990 Bug 1945121 Bug 200401707 Bug 200393631 Bug 200327596 Change-Id: Ie76a18ae0be880cfbeee615859a08179fb974fa8 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1709799 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-06-24 09:53:33 -07:00
Vinod G	0aa8d6e273	gpu: nvgpu: Mask an unused HCE_ILLEGAL_OP Interrupt HCE interrupt is not being used in nvgpu platform now, masking the bit from the interrupt register. bug 2082123 Change-Id: I1d53584afebe57b9621c8f4ec395cd1dcd6c7611 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1746850 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-06-14 06:44:08 -07:00
Thomas Fleury	943e3158bc	gpu: nvgpu: add g->fifo_eng_timeout_us Add g->fifo_eng_timeout_us to define engine timeout in microseconds. It is initialized with GRFIFO_TIMEOUT_CHECK_PERIOD_US. In RM server case, it can be overriden with value defined in device tree. Jira EVLR-2674 Change-Id: I69ac2ce779fe575566c8ba48e8cd2d0e6b2d93cf Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1728391 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-06-14 06:44:06 -07:00
Seema Khowala	25e727d997	gpu: nvgpu: release runlist_lock before issuing recovery Release runlist_lock before issuing runlist update timeout recovery. Bug 2115080 Change-Id: I22cd0dd8ab6828412fcc98f587e4a5cdce907651 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1722308 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-18 19:55:10 -07:00
Seema Khowala	982fcfa737	gpu: nvgpu: Add timeouts_disabled_refcount for enabling timeout -timeouts will be enabled only when timeouts_disabled_refcount will reach 0 -timeouts_enabled debugfs will change from u32 type to file type to avoid race enabling/disabling timeout from debugfs and ioctl -unify setting timeouts_enabled from debugfs and ioctl Bug 1982434 Change-Id: I54bab778f1ae533872146dfb8d80deafd2a685c7 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1588690 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-18 19:54:33 -07:00
Vinod G	ac687c95d3	gpu: nvgpu: Code updates for MISRA violations Code related to MC module is updated for handling MISRA violations Rule 10.1: Operands shalln't be an inappropriate essential type. Rule 10.3: Value of expression shalln't be assigned to an object with a narrow essential type. Rule 10.4: Both operands in an operator shall have the same essential type. Rule 14.4: Controlling if statement shall have essentially Boolean type. Rule 15.6: Enclose if() sequences with braces. JIRA NVGPU-646 JIRA NVGPU-659 JIRA NVGPU-671 Change-Id: Ia7ada40068eab5c164b8bad99bf8103b37a2fbc9 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1720926 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-18 14:53:58 -07:00
David Li	a807cf2041	gpu: nvgpu: add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST Add NVGPU_IOCTL_CHANNEL_RESCHEDULE_RUNLIST ioctl to reschedule runlist, and optionally check host and FECS status to preempt pending load of context not belonging to the calling channel on GR engine during context switch. This should be called immediately after a submit to decrease worst case submit to start latency for high interleave channel. There is less than 0.002% chance that the ioctl blocks up to couple miliseconds due to race condition of FECS status changing while being read. For GV11B it will always preempt pending load of unwanted context since there is no chance that ioctl blocks due to race condition. Also fix bug with host reschedule for multiple runlists which needs to write both runlist registers. Bug 1987640 Bug 1924808 Change-Id: I0b7e2f91bd18b0b20928e5a3311b9426b1bf1848 Signed-off-by: David Li <davli@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1549050 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-17 23:34:20 -07:00
Seema Khowala	4654d9abd1	gpu: nvgpu: runlist_lock released before preempt timeout recovery Release runlist_lock and then initiate recovery if preempt timed out. Also do not issue preempt if ch, tsg or runlist id is invalid. tsgid could be invalid for below call trace gk20a_prepare_poweroff->gk20a_channel_suspend-> _fifo_preempt_channel->_fifo_preempt_tsg Bug 2065990 Bug 2043838 Change-Id: Ia1e3c134f06743e1258254a4a6f7256831706185 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1662656 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-17 09:33:04 -07:00
Deepak Nibade	0301cc01f6	gpu: nvgpu: add HAL to insert semaphore commands Add below new HALs gops.fifo.add_sema_cmd() to insert HOST semaphore acquire/release methods gops.fifo.get_sema_wait_cmd_size() to get size of acquire command buffer gops.fifo.get_sema_incr_cmd_size() to get size of release command buffer Separate out new API gk20a_fifo_add_sema_cmd() to implement semaphore acquire/ release sequence and set it to gops.fifo.add_sema_cmd() Add gk20a_fifo_get_sema_wait_cmd_size() and gk20a_fifo_get_sema_incr_cmd_size() to return respective command buffer sizes Jira NVGPUT-16 Change-Id: Ia81a50921a6a56ebc237f2f90b137268aaa2d749 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1704490 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-16 03:10:37 -07:00
Deepak Nibade	c5db005b73	gpu: nvgpu: remove access to mc_enable_pb_r() We don't need to configure mc_enable_pb_r() register in any of the supported chips, so remove access to this register Jira NVGPUT-52 Change-Id: I8a7a524367ce7953f926143242c6d63bc8fd5ed1 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1711245 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-10 10:04:44 -07:00
Terje Bergstrom	dd739fcb03	gpu: nvgpu: Remove gk20a_dbg* functions Switch all logging to nvgpu_log(). gk20a_dbg macros are intentionally left there because of use from other repositories. Because the new functions do not work without a pointer to struct gk20a, and piping it just for logging is excessive, some log messages are deleted. Change-Id: I00e22e75fe4596a330bb0282ab4774b3639ee31e Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1704148 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-09 18:26:04 -07:00
Vinod G	010439ba08	gpu: nvgpu: add HALs to mmu fault descriptors. mmu fault information for client and gpc differ on various chip. Add separate table for each chip based on that change and add hal functions to access those descriptors. bug 2050564 Change-Id: If15a4757762569d60d4ce1a6a47b8c9a93c11cb0 Signed-off-by: Vinod G <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1704105 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-03 23:57:12 -07:00
Seema Khowala	c9463fdbb3	gpu: nvgpu: add rc_type i/p param to gk20a_fifo_recover Add below rc_types to be passed to gk20a_fifo_recover MMU_FAULT PBDMA_FAULT GR_FAULT PREEMPT_TIMEOUT CTXSW_TIMEOUT RUNLIST_UPDATE_TIMEOUT FORCE_RESET SCHED_ERR This is nice to have to know what triggered recovery. Bug 2065990 Change-Id: I202268c5f237be2180b438e8ba027fce684967b6 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1662619 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-03 21:43:06 -07:00
Seema Khowala	bf03799977	gpu: nvgpu: rename mutex to runlist_lock Rename mutex to runlist_lock in fifo_runlist_info_gk20a struct. This is good to have for code readability. Bug 2065990 Bug 2043838 Change-Id: I716685e3fad538458181d2a9fe592410401862b9 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1662587 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-05-03 21:42:57 -07:00
Deepak Nibade	fc1ebe57f5	gpu: nvgpu: add HALs to submit and wait for runlist Add below two new HALs gops.fifo.runlist_hw_submit() to submit a new runlist to hardware gops.fifo.runlist_wait_pending() to wait until runlist write is successful Set existing API gk20a_fifo_runlist_wait_pending() to gops.fifo.runlist_wait_pending HAL Add new API gk20a_fifo_runlist_hw_submit() which submits the runlist to h/w and set it to gops.fifo.runlist_hw_submit HAL Jira NVGPUT-20 Change-Id: Ic23f7d947e30883aca0b536de818e79e14733195 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1700548 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-04-24 11:10:48 -07:00
Sourab Gupta	abd5f68eef	gpu: nvgpu: add usermode submission interface HAL The patch adds the HAL interfaces for handling the usermode submission, particularly allocating channel specific usermode userd. These interfaces are currently implemented only on QNX, and are created accordingly. As and when linux adds the usermode submission support, we can revisit them if any further changes are needed. Change-Id: I790e0ebdfaedcdc5f6bb624652b1af4549b7b062 Signed-off-by: Sourab Gupta <sourabg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1683392 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-04-05 05:22:58 -07:00
Sourab Gupta	0b2ea2924b	gpu: nvgpu: add gops.fifo.setup_sw bar1/userd setup is different for RM server. created common function gk20a_init_fifo_setup_sw_common. Jira VQRM-3058 Change-Id: I655b54e21ed5f15dcb8e7b01bd9cd129b35ae7a3 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1665691 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-03-29 18:54:38 -07:00
Richard Zhao	8d8ff9d34e	gpu: nvgpu: add gops.fifo.set_error_notifier RM Server overrides it for handling stall interrupts. Jira VQRM-3058 Change-Id: I8b14f073e952d19c808cb693958626b8d8aee8ca Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1679709 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-03-29 18:54:29 -07:00
Richard Zhao	bcab5c1486	gpu: nvgpu: add gops.fifo.check_tsg_ctxsw_timeout/check_ch_ctxsw_timeout RM Server acts differently for ctxsw timeout check. It won't check GP_GET or accumulated timeouts, but notify guest and go to recovery. Jira VQRM-3058 Change-Id: I428aea34dc517311eb7e73feb556145e916309fb Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1679706 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-03-29 18:54:11 -07:00
Richard Zhao	c5f03db98a	gpu: nvgpu: add gops.fifo.ch_abort_clean_up Channel abort clean up is only needed by native and vgpu driver but not RM server. RM server expects guest will clean up itself. RM server should not set the callback. Jira VQRM-3058 Change-Id: I11b49b6f2d51c871e31de16955d487dca82609cb Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/1679705 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2018-03-29 18:54:02 -07:00

1 2 3 4 5 ...

317 Commits