linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 09:57:08 +03:00

Author	SHA1	Message	Date
Konsta Holtta	f6c921ec97	gpu: nvgpu: bring back tegra idle registration To make do_idle work when nvgpu is built as a module, reverse the order of call dependencies for do_idle. Don't provide visible gk20a_do_{idle,unidle}() functions for the kernel but instead call the kernel for registering and unregistering pointers to them when the driver loads and unloads. Refactor the internal __gk20a_do_{idle,unidle} functions to take a struct gk20a * instead of struct device *, and use the callback api for providing that g instead of retrieving the plat device from device tree. Bug 200290850 Change-Id: Ibef8b069302e547b298069cbb97734f461a10cc3 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1493774 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-06-15 05:23:19 -07:00
Deepak Nibade	0ad7f1d9aa	gpu: nvgpu: use nvgpu specific nvhost APIs Remove use of linux specifix header files <linux/nvhost.h> and <linux/nvhost_ioctl.h> and use nvgpu specific header file <nvgpu/nvhost.h> instead This is needed to remove all Linux dependencies from nvgpu driver Replace all nvhost_() calls by nvgpu_nvhost_() calls from new nvgpu library Remove platform device pointer host1x_dev from struct gk20a and add struct nvgpu_nvhost_dev instead Jira NVGPU-29 Change-Id: Ia7af70602cfc16f9ccc380752538c05a9cbb8a67 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1489726 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>	2017-06-08 06:37:15 -07:00
seshendra Gadagottu	c0822cb22e	gpu: nvgpu: add chip specific sync point support Added support for chip specific sync point implementation. Relevant fifo hal functions are added and updated for legacy chips. JIRA GPUT19X-2 Change-Id: I9a9c36d71e15c384b5e5af460cd52012f94e0b04 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1258232 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-05-26 14:07:12 -07:00
Stephen Warren	2e338c77ea	gpu: nvgpu: remove duplicate \n from log messages nvgpu_log/info/warn/err() internally add a \n to the end of the message. Hence, callers should not include a \n at the end of the message. Doing so results in duplicate \n being printed, which ends up creating empty log messages. Remove the duplicate \n from all err/warn messages. Bug 1928311 Change-Id: I99362c5327f36146f28ba63d4e68181589735c39 Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-on: http://git-master/r/1487232 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-05-26 03:34:30 -07:00
Konsta Holtta	ee9733e587	gpu: nvgpu: expose deterministic submit support Add these bits in the gpu characteristics flags: NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_NO_JOBTRACKING - fast submits with no in-kernel job tracking are supported. NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_FULL - deterministic submits also with job tracking and num_inflight_jobs set are supported. Either of these may get disabled if the particular channel or submit still requires features that block these. Make gk20a_channel_sync_needs_sync_framework() take a gk20a pointer instead of a channel pointer so that it can be called without a channel. It does not need any per-channel data. Bug 200291300 Change-Id: I5f82510b6d39b53bcf6f1006dd83bdd9053963a0 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1456845 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-05-05 07:54:18 -07:00
Terje Bergstrom	71af78d2c2	gpu: nvgpu: Move has_syncpts to gk20a Copy has_syncpts to struct gk20a at probe time, and access it from gk20a instead of platform_gk20a. JIRA NVGPU-16 Change-Id: I50329e3a5141a62e6e9828e97ea0747abc1ce1ee Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1463545 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-04-26 09:14:26 -07:00
Alex Waterman	84dadb1a9a	gpu: nvgpu: Move semaphore impl to nvgpu_mem Use struct nvgpu_mem for DMA allocations (and the corresponding nvgpu_dma_alloc_sys()) instead of custom rolled code. This migrates away from using linux scatter gather tables directly. Instead this is hidden in the nvgpu_mem struct. With this change the semaphore.c code no longer has any direct Linux dependencies. JIRA NVGPU-12 JIRA NVGPU-30 Change-Id: I92167c98aac9b413ae87496744dcee051cd60207 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1464081 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>	2017-04-25 14:26:00 -07:00
Deepak Nibade	8929ab7533	gpu: nvgpu: clean up linux list includes Remove linux list includes <linux/list.h> and include <nvgpu/list.h> since we now use nvgpu list APIs instead of linux APIs Jira NVGPU-13 Change-Id: I59bd433a9bc5c15d4c40e6fe4b18cf44246ba3b2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1462080 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2017-04-19 12:15:57 -07:00
Terje Bergstrom	a0fa2b0258	gpu: nvgpu: Add wrapper nvgpu/bug.h Add wrapper header file nvgpu/bug.h. It #includes <linux/bug.h> in Linux. JIRA NVGPU-13 Change-Id: I7bf02ba554333f7cbd79d72bd1cb423c81ebcb49 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1461545 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-04-13 08:56:06 -07:00
Terje Bergstrom	44d5fb76aa	gpu: nvgpu: Add wrapper nvgpu/atomic.h Add wrapper header file nvgpu/atomic.h. It #includes <linux/atomic.h> on Linux. JIRA NVGPU-13 Change-Id: I6f2b3a04c964e7664b1f61b6073b643629bd99c5 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1460792 Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com>	2017-04-12 07:01:12 -07:00
Konsta Holtta	1a4647272f	gpu: nvgpu: remove fence dependency tracking In preparation for better abstraction in job synchronization, drop support for the dependency fences tracked via submit pre-fences in semaphore-based syncs. This has only worked for semaphores, not nvhost syncpoints, and hasn't really been used. The dependency was printed in the sync framework's sync pt value string. Remove also the userspace-visible gk20a_sync_pt_info which is not used and depends on this feature (providing a duration since the dependency fence's timestamp). Jira NVGPU-43 Change-Id: Ia2b26502a9dc8f5bef5470f94b1475001f621da1 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1456880 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-04-11 09:57:21 -07:00
Terje Bergstrom	3ba374a5d9	gpu: nvgpu: gk20a: Use new error macro gk20a_err() and gk20a_warn() require a struct device pointer, which is not portable across operating systems. The new nvgpu_err() and nvgpu_warn() macros take struct gk20a pointer. Convert code to use the more portable macros. JIRA NVGPU-16 Change-Id: Ia51f36d94c5ce57a5a0ab83b3c83a6bce09e2d5c Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1331694 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit	2017-04-10 19:04:19 -07:00
Deepak Nibade	92ab6c3bbc	gpu: nvgpu: use nvgpu list for pending semaphore waits Use nvgpu list APIs instead of linux list APIs to store pending semaphore waits Jira NVGPU-13 Change-Id: I42fc6c6233e39f475a939ddd6a81c0cda851b6bf Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1454693 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>	2017-04-09 23:54:32 -07:00
Alex Waterman	b69020bff5	gpu: nvgpu: Rename gk20a_mem_* functions Rename the functions used for mem_desc access to nvgpu_mem_*. JIRA NVGPU-12 Change-Id: Ibfdc1112d43f0a125e4487c250e3f977ffd2cd75 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1323325 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-04-06 18:14:48 -07:00
Alex Waterman	24e1c7e0a7	gpu: nvgpu: Use new kmem API functions (misc) Use the new kmem API functions in misc gk20a code. Some additional modifications were also made: o Add a struct gk20a pointer to gk20a_fence to enable proper kmem free usage. o Add gk20a pointer to alloc_session() in dbg_gpu_gk20a.c to use kmem API for allocating a session. o Plumb a gk20a pointer through the fence creation and deletion. o Use statically allocated buffers for names in file creation. Bug 1799159 Bug 1823380 Change-Id: I3678080e3ffa1f9bcf6934e3f4819a1bc531689b Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1318323 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-03-30 12:36:09 -07:00
Terje Bergstrom	3e39798997	gpu: nvgpu: Remove unnecessary use of dev_name() Move the name field from struct gpu_ops up to struct gk20a. The field is not a function op, so it doesn't belong in gpu_ops. Replace all uses of dev_name() with use of g->name when possible. JIRA NVGPU-16 Change-Id: Ic6e99e39258cbf3bb7c806962cbbd7de5126688f Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1328534 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-03-28 12:05:03 -07:00
Konsta Holtta	33f637585e	gpu: nvgpu: split nvhost dependency on plat interface Add CONFIG_TEGRA_GK20A_NVHOST and remove the TEGRA_GRHOST \|\| TEGRA_HOST1X dependency in CONFIG_TEGRA_GK20A to allow using the iGPU without the nvhost driver. Use the new config to guard syncpt-related code. Also make TEGRA_ACR depend on GK20A too so that it aligns properly under gk20a in menuconfig. Bug 1853519 Change-Id: I9e9b0a7915d000aae7930821627b7a01d08d3f5c Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1321303 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-03-23 08:04:23 -07:00
Alex Waterman	4a94c135f0	gpu: nvgpu: Use new kmem API functions (channel) Use the new kmem API functions in the channel and channel related code. Also delete the usage of kasprintf() since that must be paired with a kfree(). Since the kasprintf() doesn't use the nvgpu kmem machinery (and is Linux specific) instead use a small buffer statically allocated on the stack. Bug 1799159 Bug 1823380 Change-Id: Ied0183f57372632264e55608f56539861cc0f24f Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1318312 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-03-22 18:37:10 -07:00
Konsta Holtta	dd0f3a061b	gpu: nvgpu: avoid double-free of incr cmd The call site (gk20a_submit_prepare_syncs) owns the incr_cmd buffer passed to __gk20a_channel_semaphore_incr. Delete the free in the error path of the latter case to avoid freeing the same buffer twice. Bug 1853519 Change-Id: I9b90ce7ebb17ac63992938c7f9fe90bbd139f85f Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1321117 Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2017-03-16 09:17:22 -07:00
Konsta Holtta	f1072a28be	gpu: nvgpu: add worker for watchdog and job cleanup Implement a worker thread to replace the delayed works in channel watchdog and job cleanups. Watchdog runs by polling the channel states periodically, and job cleanup is performed on channels that are appended on a work queue consumed by the worker thread. Handling both of these two in the same thread makes it impossible for them to cause a deadlock, as has previously happened. The watchdog takes references to channels during checking and possibly recovering channels. Jobs in the cleanup queue have an additional reference taken which is released after the channel is processed. The worker is woken up from periodic sleep when channels are added to the queue. Currently, the queue is only used for job cleanups, but it is extendable for other per-channel works too. The worker can also process other periodic actions dependent on channels. Neither the semantics of timeout handling or of job cleanups are yet significantly changed - this patch only serializes them into one background thread. Each job that needs cleanup is tracked and holds a reference to its channel and a power reference, and timeouts can only be processed on channels that are tracked, so the thread will always be idle if the system is going to be suspended, so there is currently no need to explicitly suspend or stop it. Bug 1848834 Bug 1851689 Bug 1814773 Bug 200270332 Jira NVGPU-21 Change-Id: I355101802f50841ea9bd8042a017f91c931d2dc7 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1297183 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-03-02 17:51:03 -08:00
Deepak Nibade	8ee3aa4b31	gpu: nvgpu: use common nvgpu mutex/spinlock APIs Instead of using Linux APIs for mutex and spinlocks directly, use new APIs defined in <nvgpu/lock.h> Replace Linux specific mutex/spinlock declaration, init, lock, unlock APIs with new APIs e.g struct mutex is replaced by struct nvgpu_mutex and mutex_lock() is replaced by nvgpu_mutex_acquire() And also include <nvgpu/lock.h> instead of including <linux/mutex.h> and <linux/spinlock.h> Add explicit nvgpu/lock.h includes to below files to fix complilation failures. gk20a/platform_gk20a.h include/nvgpu/allocator.h Jira NVGPU-13 Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1293187 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2017-02-22 04:15:02 -08:00
Alex Waterman	e7a0c0ae8b	gpu: nvgpu: Move from gk20a_ to nvgpu_ in semaphore code Change the prefix in the semaphore code to 'nvgpu_' since this code is global to all chips. Bug 1799159 Change-Id: Ic1f3e13428882019e5d1f547acfe95271cc10da5 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284628 Reviewed-by: Varun Colbert <vcolbert@nvidia.com> Tested-by: Varun Colbert <vcolbert@nvidia.com>	2017-02-13 18:15:03 -08:00
Alex Waterman	aa36d3786a	gpu: nvgpu: Organize semaphore_gk20a.[ch] Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore code is common to all chips. Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu and rename it to semaphore.h. Also update all places where the header is inluced to use the new path. This revealed an odd location for the enum gk20a_mem_rw_flag. This should be in the mm headers. As a result many places that did not need anything semaphore related had to include the semaphore header file. Fixing this oddity allowed the semaphore include to be removed from many C files that did not need it. Bug 1799159 Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1284627 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2017-02-13 18:14:45 -08:00
Alex Waterman	9da40c79fc	gpu: nvgpu: Store pending sema waits Store pending sema waits so that they can be explicitly handled when the driver dies. If the sema_wait is freed before the pending wait is either handled or canceled problems occur. Internally the sync_fence_wait_async() function uses the kernel timers. That uses a linked list of possible events. That means every so often the kernel iterates through this list. If the list node that is in the sync_fence_waiter struct is freed before it can be removed from the pending timers list then the kernel timers list can be corrupted. When the kernel then iterates through this list crashes and other related problems can happen. Bug 1816516 Bug 1807277 Change-Id: Iddc4be64583c19bfdd2d88b9098aafc6ae5c6475 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1250025 (cherry picked from commit 01889e21bd31dbd7ee85313e98079138ed1d63be) Reviewed-on: http://git-master/r/1261920 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2016-12-19 15:40:19 -08:00
Terje Bergstrom	d29afd2c9e	gpu: nvgpu: Fix signed comparison bugs Fix small problems related to signed versus unsigned comparisons throughout the driver. Bump up the warning level to prevent such problems from occuring in future. Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1252068 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2016-11-16 21:35:36 -08:00
Sachit Kadle	ab593b9ccd	gpu: nvgpu: make deferred clean-up conditional This change makes the invocation of the deferred job clean-up mechanism conditional. For submissions that require job tracking, deferred clean-up is only required if any of the following conditions are met: 1) Channel's deterministic flag is not set 2) Rail-gating is enabled 3) Channel WDT is enabled 4) Buffer refcounting is enabled 5) Dependency on Sync Framework In case deferred clean-up is not needed, we clean-up a single job tracking resource in the submit path. For deterministic channels, we do not allow deferred clean-up to occur and fail any submits that require it. Bug 1795076 Change-Id: I4021dffe8a71aa58f12db6b58518d3f4021f3313 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1220920 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> (cherry picked from commit b09f7589d5ad3c496e7350f1ed583a4fe2db574a) Reviewed-on: http://git-master/r/1223941 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-10-21 11:23:53 -07:00
Sachit Kadle	63e8592e06	gpu: nvgpu: use inplace allocation in sync framework This change is the first of a series of changes to support the usage of pre-allocated job tracking resources in the submit path. With this change, we still maintain a dynamically-allocated joblist, but make the necessary changes in the channel_sync & fence framework to use in-place allocations. Specifically, we: 1) Update channel sync framework routines to take in pre-allocated priv_cmd_entry(s) & gk20a_fence(s) rather than dynamically allocating themselves 2) Move allocation of priv_cmd_entry(s) & gk20a_fence(s) to gk20a_submit_prepare_syncs 3) Modify fence framework to have seperate allocation and init APIs. We expose allocation as a seperate API, so the client can allocate the object before passing it into the channel sync framework. 4) Fix clean_up logic in channel sync framework Bug 1795076 Change-Id: I96db457683cd207fd029c31c45f548f98055e844 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1206725 (cherry picked from commit 9d196fd10db6c2f934c2a53b1fc0500eb4626624) Reviewed-on: http://git-master/r/1223933 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2016-10-20 08:14:04 -07:00
Alex Waterman	cab7514f4b	gpu: nvgpu: Fix invalid test for signaled sync_fences Fix a check that was backwards for signaled sync_fences. This would cause the code to not wait on some sync_fences that had not already signaled and wait on other fences that had signaled. Bug 1787348 Reviewed-on: http://git-master/r/1204710 (cherry picked from commit 75b94bb30f79c3a7a9992773dc8a93b507121006) Change-Id: I00b0f8a373a9954a5ad9ab31aff6423e91574153 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221044 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-09-15 21:58:44 -07:00
Alex Waterman	7b8cbd2be3	gpu: nvgpu: Greatly simplify the semaphore detection Greatly simplify and make more robust the gpu semaphore detection in sync_fences. Instead of using a magic number use the parent timeline of sync_pts. This will also work with multi-GPU setups using nvgpu since the timeline ops pointer will be the same across all instances of nvgpu. Bug 1732449 Reviewed-on: http://git-master/r/1203834 (cherry picked from commit 66eeb577eae5d10741fd15f3659e843c70792cd6) Change-Id: I4c6619d70b5531e2676e18d1330724e8f8b9bcb3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221042 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-09-15 21:58:37 -07:00
Alex Waterman	9bd76b7fa0	gpu: nvgpu: Optimize sync fence creation Only create sync-fences in the semaphore synchronization path when they are actually needed (i.e requested by userspace). Bug 1795076 Reviewed-on: http://git-master/r/1201564 (cherry picked from commit dc52d424a839e6c064c02b7f02905dd6a59a50af) Change-Id: Ieac6aef415678d4ea982683a955897c64959436e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1221041 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-09-15 21:58:36 -07:00
Alex Waterman	9eac0fd849	gpu: nvgpu: Add debugging to the semaphore code Add GPU debugging to the semaphore code. Bug 1732449 JIRA DNVGPU-12 Change-Id: I98466570cf8d234b49a7f85d88c834648ddaaaee Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1198594 (cherry picked from commit 420809cc31fcdddde32b8e59721676c67b45f592) Reviewed-on: http://git-master/r/1153671 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2016-08-30 10:04:30 -07:00
Alex Waterman	dfd5ec53fc	gpu: nvgpu: Revamp semaphore support Revamp the support the nvgpu driver has for semaphores. The original problem with nvgpu's semaphore support is that it required a SW based wait for every semaphore release. This was because for every fence that gk20a_channel_semaphore_wait_fd() waited on a new semaphore was created. This semaphore would then get released by SW when the fence signaled. This meant that for every release there was necessarily a sync_fence_wait_async() call which could block. The latency of this SW wait was enough to cause massive degredation in performance. To fix this a fast path was implemented. When a fence is passed to gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore a semaphore acquire is directly used to block the GPU. No longer is a sync_fence_wait_async() performed nor is there an extra semaphore created. To implement this fast path the semaphore memory had to be shared between channels. Previously since a new semaphore was created every time through gk20a_channel_semaphore_wait_fd() what address space a semaphore was mapped into was irrelevant. However, when using the fast path a sempahore may be released on one address space but acquired in another. Sharing the semaphore memory was done by making a fixed GPU mapping in all channels. This mapping points to the semaphore memory (the so called semaphore sea). This global fixed mapping is read-only to make sure no semaphores can be incremented (i.e released) by a malicious channel. Each channel then gets a RW mapping of it's own semaphore. This way a channel may only acquire other channel's semaphores but may both acquire and release its own semaphore. The gk20a fence code was updated to allow introspection of the GPU backed fences. This allows detection of when the fast path can be taken. If the fast path cannot be used (for example when a fence is sync-pt backed) the original slow path is still present. This gets used when the GPU needs to wait on an event from something which only understands how to use sync-pts. Bug 1732449 JIRA DNVGPU-12 Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1133792 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-06-28 15:49:11 -07:00
Alex Waterman	5cf1fddbab	gpu: nvgpu: Balance curly braces In some of the conditionally compiled code in the nvgpu driver there are places where the code looks like: #ifdef LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0) some-loop { #else a-diff-loop { #endif /* Some code... */ } This leaves unbalanced curley braces: two open braces for one close brace. This messes up some editors syntax highlighting and auto- indentation features. This patch puts in the extra brace. It's not necessary for compiling code but it makes some editors much happier. Change-Id: Ida28bc001cc840fe52a43982db934d49c07cc7d3 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1153668 Reviewed-by: Konsta Holtta <kholtta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-06-13 10:27:31 -07:00
Deepak Nibade	dc7af18bf8	gpu: nvgpu: suppress prints in submit path When we run out of gpfifo space or private command buffer space, we have error spew like below : __gk20a_channel_syncpt_incr: not enough priv cmd buffer space gk20a_submit_channel_gpfifo: fail Dumping these prints to UART cause increase in submit latencies But on these failures, we return -ENOSPC to UMD and then UMD retries the submit, hence it might be unnecessary to dump these prints Hence, remove the error prints of insufficient space and use gk20a_dbg_fn() instead of gk20a_err() to print failure in gk20a_submit_channel_gpfifo() Bug 200202653 Change-Id: I49efd7c6c554dd4fbfa4e66d196eb352e69f92c6 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1152378 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-05-24 12:35:34 -07:00
Konsta Holtta	abec0ddc19	gpu: nvgpu: use mem_desc for semaphores Replace manual buffer allocation and cpu_va pointer accesses with gk20a_gmmu_{alloc,free}() and gk20a_mem_{rd,wr}() using a struct mem_desc in gk20a_semaphore_pool, for buffer aperture flexibility. JIRA DNVGPU-23 Change-Id: I394c38f407a9da02480bfd35062a892eec242ea3 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1146684 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-05-18 11:55:11 -07:00
Konsta Holtta	dc45473eeb	gpu: nvgpu: use mem_desc in priv_cmd_entry Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a reference to the underlying mem_desc (within priv_cmd_queue) paired with an offset, for buffer aperture flexibility. JIRA DNVGPU-21 JIRA DNVGPU-23 Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1145922 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-05-18 11:54:34 -07:00
Deepak Nibade	e0c9da1fe9	gpu: nvgpu: implement sync refcounting We currently free sync when we find job list empty If aggressive_sync is set to true, we try to free sync during channel unbind() call But we rarely free sync from channel_unbind() call since freeing it when job list is empty is aggressive enough Hence remove sync free code from channel_unbind() Implement refcounting for sync: - get a refcount while submitting a job (and allocate sync if it is not allocated already) - put a refcount while freeing the job - if refcount==0 and if aggressive_sync_destroy is set, free the sync - if aggressive_sync_destroy is not set, we will free the sync during channel close time Bug 200187553 Change-Id: I74e24adb15dc26a375ebca1fdd017b3ad6d57b61 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1120410 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-04-19 08:16:13 -07:00
Deepak Nibade	b6dc4315a4	gpu: nvgpu: support kernel-3.10 version Make necessary changes to support nvgpu on kernel-3.10 This includes below changes - PROBE_PREFER_ASYNCHRONOUS is defined only for K3.10 - Fence handling and struct sync_fence is different between K3.10 and K3.18 - variable status in struct sync_fence is atomic on K3.18 whereas it is int on K3.10 - if SOC == T132, set soc_name = "tegra13x" - ioremap_cache() is not defined on K3.10 ARM versions, hence use ioremap_cached() Bug 200188753 Change-Id: I18d77eb1404e15054e8510d67c9a61c0f1883e2b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1121092 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-04-15 08:11:14 -07:00
Terje Bergstrom	e8bac374c0	gpu: nvgpu: Use device instead of platform_device Use struct device instead of struct platform_device wherever possible. This allows adding other bus types later. Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1120466	2016-04-08 09:42:41 -07:00
Alex Van Brunt	76ab6c1e5b	host: move nvhost to its own git repo Move nvhost out of the common Linux repo and into its own git repo. By doing this, the same nvhost driver can work on different Linux kernel versions. Previously android/sync.h was referenced relative to kernel/drivers/video/tegra/host. However, host moved to a completely different part of the tree. Instead, reference it relative to kernel/include. bug 1749413 Change-Id: Ic7f94093c712e5b64c9b3b660d6fce5d18e59bc0 Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com> Reviewed-on: http://git-master/r/1120544 Reviewed-by: Arto Merilainen <amerilainen@nvidia.com> Tested-by: Arto Merilainen <amerilainen@nvidia.com>	2016-04-07 08:02:11 -07:00
Yunbo Wang	e689d62d88	gpu: nvgpu: fix a sync_fence leak Fixes a bug where reference to sync_fence is not closed before return. Bug 200171146 Change-Id: If174eb124bd69692bab4cc8629a103517d7cfef1 Signed-off-by: Yunbo Wang <yunbow@nvidia.com> Reviewed-on: http://git-master/r/1029844 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Eric Miao <emiao@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>	2016-03-14 19:44:20 -07:00
Alex Waterman	766506d6e0	gpu: nvgpu: Increase semaphore count Increase the semaphore count per channel. Some channels were running out of semaphores. The original limit was 255 (256 fits in 1 page, but the 0th semaphore is used to return error codes from the allocator). Easy fix was to simply increase the number of semaphores each channel is allocated to 1024. Bug 1604892 Change-Id: I163e24b8d42a3dc1bb9b418dadc0c8532aff9adb Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/935911 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit	2016-01-27 10:59:08 -08:00
Alex Waterman	f7d219dd1c	gpu: nvgpu: Fix semaphore race condition A race condition existed in gk20a_channel_semaphore_wait_fd(). In some instances the semaphore underlying the sync_fence being waited on would have already signaled. This would cause the subsequent sync_fence_wait_async() call to return 1 and do nothing. Normally, the sync_fence_wait_async() call would release the newly created semaphore but in the above case that would not happen and hang any channel waiting on that semaphore. To fix this problem if sync_fence_wait_async() returns 1 immediately release the newly created semaphore. Bug 1604892 Change-Id: I1f5e811695bb099f71b7762835aba4a7e27362ec Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/935910 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit	2016-01-27 10:59:00 -08:00
Deepak Nibade	e153458aad	gpu: nvgpu: return ENOSPC if no private command buffer space If we run out of gpfifo space or private command buffer space, we currently return EAGAIN as error code Instead of EAGAIN, return ENOSPC as error code so that caller (user space) can read the error code and do some re-trials As the jobs are processed, it is possible to free up some space. And hence such re-trials could succeed Bug 1715291 Change-Id: I9a2ed7134d2496b383899b3c02c0e70452b26115 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/929402 Reviewed-by: Sachin Nikam <snikam@nvidia.com> Tested-by: Sachin Nikam <snikam@nvidia.com>	2016-01-12 23:23:43 -08:00
Deepak Nibade	52753b51f1	gpu: nvgpu: create sync_fence only if needed Currently, we create sync_fence (from nvhost_sync_create_fence()) for every submit But not all submits request for a sync_fence. Also, nvhost_sync_create_fence() API takes about 1/3rd of the total submit path. Hence to optimize, we can allocate sync_fence only when user explicitly asks for it using (NVGPU_SUBMIT_GPFIFO_FLAGS_FENCE_GET && NVGPU_SUBMIT_GPFIFO_FLAGS_SYNC_FENCE) Also, in CDE path from gk20a_prepare_compressible_read(), we reuse existing fence stored in "state" and that can result into not returning sync_fence_fd when user asked for it Hence, force allocation of sync_fence when job submission comes from CDE path Bug 200141116 Change-Id: Ia921701bf0e2432d6b8a5e8b7d91160e7f52db1e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/812845 (cherry picked from commit 5fd47015eeed00352cc8473eff969a66c94fee98) Reviewed-on: http://git-master/r/837662 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>	2015-12-08 01:18:04 -08:00
Deepak Nibade	7f79d647d6	gpu: nvgpu: set aggressive_sync_destroy at runtime We currently set "aggressive_destroy" flag to destroy sync object statically and for each sync object Move this flag to per-platform structure so that it can be set per-platform for all the sync objects Also, set the default value of this flag as "false" and set it to "true" once we have more than 64 channels in use Bug 200141116 Change-Id: I1bc271df4f468a4087a06a27c7289ee0ec3ef29c Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/822041 (cherry picked from commit 98741e7e88066648f4f14490c76b61dbff745103) Reviewed-on: http://git-master/r/835800 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>	2015-11-23 08:43:49 -08:00
Richard Zhao	9cf28bc529	gpu: nvgpu: fix memory corrupt replace sprinf with snprintf in func gk20a_channel_syncpt_create. sync point name can be long. Bug 1638853 Change-Id: Ie305d04edfbb299c8b1241eca52101439bb4a6c6 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/769113 Reviewed-on: http://git-master/r/776424 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>	2015-08-11 20:39:50 -07:00
Deepak Nibade	c27e094002	gpu: nvgpu: remove gk20a_busy() from channel_syncpt_incr() gk20a_busy() is already called on all the paths to __gk20a_channel_syncpt_incr() i.e. in gk20a_submit_channel_gpfifo() hence remove the redundant gk20a_busy() call since it causes deadlock scenario with VPR resize use case Bug 200128257 Bug 1645760 Bug 200114947 Bug 200124519 Change-Id: I4cd47b7e7cdc92aaeda17256a99f2ba93833a3b3 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/778341 (cherry picked from commit 5a5dc5b5a9d38a5e8d5c1ca29dc6de425c00b605) Reviewed-on: http://git-master/r/779070 Reviewed-by: Sachin Nikam <snikam@nvidia.com>	2015-08-06 22:20:22 -07:00
Deepak Nibade	a969bc98ae	gpu: nvgpu: remove gk20a_busy() from channel_syncpt_update() gk20a_busy() was added to gk20a_channel_syncpt_update() for possible case of channel deletion But API to delete a channel (i.e. gk20a_free_channel()) is already called in paths which ensure gk20a_busy() is called before deleting the channel Hence, remove redundant gk20a_busy()/idle() calls This also fixes a deadlock scenario with VPR resize use case Bug 200128257 Bug 1645760 Bug 200114947 Bug 200124519 Change-Id: I05dc739b3be88af2ba22b0a667e5004d8100bf6f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/778340 (cherry picked from commit 306282aa950201cf1ae91a5cc48d75719b179d19) Reviewed-on: http://git-master/r/779069 Reviewed-by: Sachin Nikam <snikam@nvidia.com>	2015-08-06 22:20:08 -07:00
Konsta Holtta	6085c90f49	gpu: nvgpu: add per-channel refcounting Add reference counting for channels, and wait for reference count to get to 0 in gk20a_channel_free() before actually freeing the channel. Also, change free channel tracking a bit by employing a list of free channels, which simplifies the procedure of finding available channels with reference counting. Each use of a channel must have a reference taken before use or held by the caller. Taking a reference of a wild channel pointer may fail, if the channel is either not opened or in a process of being closed. Also, add safeguards for protecting accidental use of closed channels, specifically, by setting ch->g = NULL in channel free. This will make it obvious if freed channel is attempted to be used. The last user of a channel might be the deferred interrupt handler, so wait for deferred interrupts to be processed twice in the channel free procedure: once for providing last notifications to the channel and once to make sure there are no stale pointers left after referencing to the channel has been denied. Finally, fix some races in channel and TSG force reset IOCTL path, by pausing the channel scheduler in gk20a_fifo_recover_ch() and gk20a_fifo_recover_tsg(), while the affected engines have been identified, the appropriate MMU faults triggered, and the MMU faults handled. In this case, make sure that the MMU fault does not attempt to query the hardware about the failing channel or TSG ids. This should make channel recovery more safe also in the regular (i.e., not in the interrupt handler) context. Bug 1530226 Bug 1597493 Bug 1625901 Bug 200076344 Bug 200071810 Change-Id: Ib274876908e18219c64ea41e50ca443df81d957b Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Signed-off-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-on: http://git-master/r/448463 (cherry picked from commit 3f03aeae64ef2af4829e06f5f63062e8ebd21353) Reviewed-on: http://git-master/r/755147 Reviewed-by: Automatic_Commit_Validation_User	2015-06-09 11:13:43 -07:00

1 2

80 Commits