linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-23 18:16:01 +03:00

Author	SHA1	Message	Date
Konsta Hölttä	ff41d97ab5	gpu: nvgpu: always prealloc jobs and fences Unify the job metadata handling by deleting the parts that have handled dynamically allocated job structs and fences. Now a channel can be in one less mode than before which reduces branching in tricky places and makes the submit/cleanup sequence easier to understand. While preallocating all the resources upfront may increase average memory consumption by some kilobytes, users of channels have to supply the worst case numbers anyway and this preallocation has been already done on deterministic channels. Flip the channel_joblist_delete() call in nvgpu_channel_clean_up_jobs() to be done after nvgpu_channel_free_job(). Deleting from the list (which is a ringbuffer) makes it possible to reuse the job again, so the job must be freed before that. The comment about using post_fence is no longer valid; nvgpu_channel_abort() does not use fences. This inverse order has not posed problems before because it's been buggy only for deterministic channels, and such channels do not do the cleanup asynchronously so no races are possible. With preallocated job list for all channels, this would have become a problem. Jira NVGPU-5492 Change-Id: I085066b0c9c2475e38be885a275d7be629725d64 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346064 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	7b8a08af7a	gpu: nvgpu: check ch->wdt on wdt restart all channels ch->wdt is not always initialized. For example it's not initialized on gpu server, since the channel wdt is managed on client side. Bug 2833924 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Idb06f7de6a15e093bbb08be16454777b9d7582b9 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361978 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Richard Zhao	98264f7505	gpu: nvgpu: call gops.tsg.unbind_channel on fail path When the current context is busy, nvgpu_tsg_unbind_channel_common may fail because preemption failed. In such a case, the .unbind_channel hal still needs to be called to notify vserver that the channel will be removed from tsg in teardown path. Bug 2833924 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I9996202485429b4d9cba0c2f985f8e55fcdd3f29 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361977 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	fbb6a5bc1c	gpu: nvgpu: Remove fifo->pbdma_map The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists. This array allows other software to query what PBDMA(s) serves a given runlist. The PBDMA map is read verbatim from an array of host registers. These registers are stored in a kmalloc()'ed array. This causes a problem for the device management code. The device management initialization executes well before the rest of the FIFO PBDMA initialization occurs. Thus, if the device management code queries the PBDMA mapping for a given device/runlist, the mapping has yet to be populated. In the next patches in this series the engine management code is subsumed into the device management code. In other words the device struct is reused by the engine management and all host SW does is pull pointers to the host managed devices from the device manager. This means that all engine initialization that used to be done on top of the device management needs to move to the device code. So, long story short, the PBDMA map needs to be read from the registers directly, instead of an array that gets allocated long after the device code has run. This patch removes the pbdma map array, deletes two HALs that managed that, and instead provides a new HAL to query this map directly from the registers so that the device code can use it. 
JIRA NVGPU-5421 Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	194fac7f3c	gpu: nvgpu: Remove clutter in engine code Remove the get_mask_on_id() HAL and replace its usage with the global nvgpu_engine_get_mask_on_id() function. There's no need to have this function as a HAL. JIRA NVGPU-5420 Change-Id: I4fc843beff8e65806da26a0addc83fa218d390ac Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361315 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	ca1f93bdd7	gpu: nvgpu: add user fence type Decouple the fence information needed for providing submit postfences to userspace by adding a separate type for that and using it to pass fence data to ioctls. The data in struct nvgpu_fence_type is used in various places: - job tracking needs to know when a post fence is expired - job submitters within the driver (vidmem clears) need to be able to wait for these fences - userspace needs the fence as an id, value pair or as a file descriptor created from an os fence To keep object lifetimes strict, start decoupling the os fence data out of struct nvgpu_fence_type: delete nvgpu_fence_install_fd() and add nvgpu_fence_extract_user() to return a struct nvgpu_user_fence that contains only the necessary information. Storing the os fence in job tracking metadata is legacy code and not useful. Passing the os fence from where it's created through the whole submit path inside this combined fence type has been convenient, though. The internally stored cde job fence in dmabuf compression metadata is still nvgpu_fence_type to keep this patch simple. Jira NVGPU-5248 Change-Id: I75b7da676fb6aa083828f888c55571bbf7645ef3 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2359064 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	160669a7bb	gpu: nvgpu: return device from nvgpu_device_get() Instead of copying the device contents into the passed pointer have nvgpu_device_get() return a device pointer. This will let the engines.c code move towards using the nvgpu_device type directly, instead of maintaining its own version of an essentially identical struct. JIRA NVGPU-5421 Change-Id: I6ed2ab75187a207c8962d4c0acd4003d1c20dea4 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319758 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	70ce67df2d	gpu: nvgpu: Add a generic profiler Add a generic profiler based on the channel kickoff profiler. This aims to provide a mechanism to allow engineers to (more) easily profile arbitrary software paths within nvgpu. Usage of this profiler is still primarily through debugfs. Next up is a generic debugfs interface for this profiler in the Linux code. The end goal for this is to profile the recovery code and generate interesting statistics. JIRA NVGPU-5606 Signed-off-by: Alex Waterman <alexw@nvidia.com> Change-Id: I99783ec7e5143855845bde4e98760ff43350456d Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2355319 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	319520ff57	gpu: nvgpu: Add a new device manager unit This adds a new device management unit in the common code responsible for facilitating the parsing of the GPU top device list and providing that info to other units in nvgpu. The basic idea is to read this list once from HW and store it in a set of lists corresponding to each device type (graphics, LCE, etc). Many of the HALs in top can be deleted and instead implemented using common code parsing the SW representation. Every time the driver queries the device list it does so using a device type and instance ID. This is common code. The HAL is responsible for populating the device list in such a way that the driver can query it in a chip agnostic manner. Also delete some of the unit tests for functions that no longer exist. This code will require new unit tests in time; those should be quite simple to write once unit testing is needed. JIRA NVGPU-5421 Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6cbc174fc2	gpu: nvgpu: avoid channel wdt ifdefs Implement empty stubs of the channel watchdog functions for when watchdog is disabled from build. Add some forward declarations that were missing. Now most call sites don't need #ifdefs for the build flag. Add error checks for the wdt alloc failure. Jira NVGPU-5494 Jira NVGPU-5493 Change-Id: I2d42e8ab4c5e045cd280b2e1f254396127bd154b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2352050 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	2a3bb9107f	gpu: nvgpu: rename <nvgpu/top.h> to <nvgpu/device.h> top.h is a description of "devices" available on the GPU. As such rename this header to device.h. device.h will ultimately be a unit of actual C code that will rely on the top HAL to fill a device list. JIRA NVGPU-5421 Change-Id: If6e4a537d2209e429a678761a34713723da7a00a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319648 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
tkudav	957b19092f	gpu: nvgpu: Enable Quiesce on all builds Make Recovery and quiesce co-exist to support quiesce state on unrecoverable errors. Currently, the quiesce code is wrapped under ifndef CONFIG_NVGPU_RECOVERY. Isolate the quiesce code from recovery config, thereby enabling it on all builds. On Linux, the hung_task checker (check_hung_uninterruptible_tasks() in kernel/hung_task.c) complains that quiesce thread is stuck for more than 120 seconds. INFO: task sw-quiesce:1068 blocked for more than 120 seconds. The wait time of more than 120 seconds is expected as quiesce thread will wait until quiesce call is triggered on fatal unrecoverable errors. However, the INFO print upsets the kernel_warning_test (KWT) on Linux builds. To fix the failing KWT, change the quiesce task to interruptible instead of uninterruptible as checker only looks at uninterruptible tasks. Bug 2919899 JIRA NVGPU-5479 Change-Id: Ibd1023506859d8371998b785e881ace52cb5f030 Signed-off-by: tkudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342774 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	16fb7654a5	gpu: nvgpu: isolate channel watchdog unit Move the definition of struct nvgpu_channel_wdt to watchdog.c. Adjust users of it to access it via an unified interface instead of poking directly at the channel internals. Jira NVGPU-5494 Change-Id: Ie11826e6732a8b98e72c4f81dd06bd7e49848121 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345935 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	21e02878f4	gpu: nvgpu: move wdt code out of channel.c Cut and paste the existing channel watchdog functions to another file for better isolation of units. Jira NVGPU-5494 Change-Id: Id437f0939e69a4a8b495eaee164c4d7a9f283fa9 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2345934 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	6b1302f23c	gpu: nvgpu: Reduce linux debug log spew Currently when nvgpu prints debug information for something like an MMU fault the result includes a lot of useless boilerplate logging spew. In some cases this can be helpful in identifying where the log message came from in the nvgpu code base. However, for debug spews from faults, the viewer of that info does not care which function printed the log (for example). Instead having a fast and readable debug dump is more valuable. So to that end, add a special debug dump printing function that does not use the normal log format. Instead, it prints only a brief prefix to use as a grep search query. The new print out is listed below. Since often the kernel logs are impressively long and obtuse, having a clear debug search string can be helpful. With this log format, one can simply do: $ grep __$CHIP__ kernel.log And find any debug logs for the desired chip. New log format - collected on a gv11b under L4T running `nvgpu_submit_mmu_fault':
[ 32.005793] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:311 [ERR] [MMU FAULT] mmu engine id: 32, ch id: 511, fault addr: 0x1000, fault addr aperture: 0, fault type: invalid pde, access type: virt read,
[ 32.006137] nvgpu: 17000000.gv11b gv11b_fb_mmu_fault_info_dump:320 [ERR] [MMU FAULT] protected mode: 0, client type: hub, client id: host, gpc id if client type is gpc: 0,
[ 32.006417] nvgpu: 17000000.gv11b nvgpu_rc_mmu_fault:296 [ERR] mmu fault id=0 id_type=1 act_eng_bitmask=00000000
[ 32.007125] __gv11b__ Channel Status - chip gv11b
[ 32.007128] __gv11b__ ---------------------------
[ 32.007241] __gv11b__ 511-gv11b, TSG: 0, pid 955, refs: 2, deterministic:
[ 32.007364] __gv11b__ channel status: in use pending busy
[ 32.007509] __gv11b__ RAMFC : TOP: 8000000000001000 PUT: 0000000000001030 GET: 0000000000001000 FETCH: 0000600000001000 HEADER: 60400000 COUNT: 00000000 SEMAPHORE: addr 0000000000000000 payload 0000000000000000 execute 00000000
[ 32.007601] __gv11b__
[ 32.008696] __gv11b__
[ 32.008700] __gv11b__ PBDMA Status - chip gv11b
[ 32.008894] __gv11b__ -------------------------
[ 32.013477] __gv11b__ pbdma 0:
[ 32.017840] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[ 32.020992] __gv11b__ PBDMA_PUT 0000000000001030 PBDMA_GET 0000000000001000
[ 32.029037] __gv11b__ GP_PUT 00000001 GP_GET 00000001 FETCH 00000001 HEADER 60400000
[ 32.036386] __gv11b__ HDR 00000000 SHADOW0 00001000 SHADOW1 80003000
[ 32.044787] __gv11b__ pbdma 1:
[ 32.051964] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[ 32.055099] __gv11b__ PBDMA_PUT 0000000042003200 PBDMA_GET 00000050728bc914
[ 32.062997] __gv11b__ GP_PUT 00000000 GP_GET 2080a000 FETCH 00000000 HEADER e1850010
[ 32.070424] __gv11b__ HDR 00110000 SHADOW0 02000000 SHADOW1 10000004
[ 32.078652] __gv11b__ pbdma 2:
[ 32.085913] __gv11b__ id: -1 - [channel] next_id: - -1 [channel] | status: invalid
[ 32.088973] __gv11b__ PBDMA_PUT 00000021040c0004 PBDMA_GET 0000000140020000
[ 32.096502] __gv11b__ GP_PUT 00000000 GP_GET 8080a440 FETCH 00000000 HEADER 61400040
[ 32.103679] __gv11b__ HDR 14000010 SHADOW0 00000000 SHADOW1 00000400
[ 32.112336] __gv11b__
[ 32.119860] __gv11b__ gv11b eng 0:
[ 32.122119] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[ 32.125807] __gv11b__
[ 32.135954] __gv11b__ gv11b eng 1:
[ 32.135958] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[ 32.139457] __gv11b__
[ 32.149945] __gv11b__ gv11b eng 2:
[ 32.149950] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[ 32.153543] __gv11b__
[ 32.163598] __gv11b__ gv11b eng 3:
[ 32.163601] __gv11b__ id: -1 (channel), next_id: -1 (channel), ctx status: invalid
[ 32.167278] __gv11b__
[ 32.177076] __gv11b__
[ 32.186145] nvgpu: 17000000.gv11b nvgpu_tsg_set_ctx_mmu_error:492 [ERR] TSG 0 generated a mmu fault
[ 32.189443] nvgpu: 17000000.gv11b nvgpu_set_err_notifier_locked:140 [ERR] error notifier set to 31 for ch 511
JIRA NVGPU-5541 Change-Id: Iad60adfab5198ee11dd2ec595f2422ea541b7a2a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349166 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	5d06a59bc5	gpu: nvgpu: Cleanup uart and debugfs debug prints The gk20a_debug_dump() function implicitly adds a newline since it uses nvgpu_err() under the hood (for uart destined prints). For the seq_file destined writes it does not so there is an annoying inconsistency. Remove the newline that many of the gk20a_debug_dump() calls add and add the newline to the (now) seq_printf() call. This reduces the length of debug dump logs and speeds them up - UART is _very_ slow after all. Also cleanup some formatting issues in the various debug prints I happened to notice. JIRA NVGPU-5541 Change-Id: Iabf853d5c50214794fc4cbb602dfffabeb877132 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2347956 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	2077df9b1a	gpu: nvgpu: use set_syncpt only with nvhost nvgpu_channel_set_syncpt() is not useful if nvhost and thus syncpts are missing and semaphores are used for synchronization. Require CONFIG_TEGRA_GK20A_NVHOST to be set for the set_syncpt hal. Jira NVGPU-5496 Change-Id: Ief8b4a0fb29af631817aba55c04181b1a360ce56 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2344064 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	b6bf13290e	gpu: nvgpu: alloc correct prealloc buffer sizes The trivial ringbuffer implementation in channel job list and priv cmd buffers acts such that the buffer is full when the number of inserted entries in it is one less than allocation size, similarly to the hardware gpfifo. Take this into account when allocating the job tracking resources: previously the allocation has been off-by-one too small. Jira NVGPU-5492 Change-Id: If7bfd4919daa5b0328394ca289d5692c0d2b4f5f Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342129 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	d916e85171	gpu: nvgpu: incr sync once submit is ready to go Split out the max value increment and syncpt interrupt registration out of nvgpu_channel_sync_incr*(). This API is called in the submit path to prepare buffers and tracking resources, but later on in the submit path errors can still occur so that the increment wouldn't happen (unless artificially forced by sw). The increment and irq registration cannot easily be undone and it makes more sense to do these at the moment when the prepared job is finally ready, so add a new nvgpu_channel_sync_mark_progress() API to be called later in the submit path to signal that progress shall eventually happen on the sync. Without this, the max value would stay too large after an unsuccessful submit until the channel gets closed. The sync object (syncpt or semaphore) is always exclusively owned by the channel that allocated it, so nonatomically reading the max value first in sync_incr() and incrementing it later in mark_progress() is racefree; all submits per channel are serialized. Change the channel syncpoint to client managed from host managed so that nvhost-exported sync fences behave correctly with the temporary state where the fence threshold is over the max value. Ideally we'd always track nvgpu-owned syncpts' max values internally, but this is enough for now. Jira NVGPU-5491 Change-Id: Idf0bda7ac93d7f2f114cdeb497fe6b5369d21c95 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340465 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	a629b48013	gpu: nvgpu: split channel sema wakeup function Extract the functionality to post semaphore signals to one channel into a separate function for readability. Jira NVGPU-5491 Change-Id: Ib5e8d34f42a64c253b3b3b8cb9e2c5dd2656fd1f Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340466 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	23d6b36101	gpu: nvgpu: add dma_fence semaphore support Support exporting and importing semaphore-based synchronization with the stable dma-fence API. The "Android" sync fence API used until now is deprecated. The Android sync framework is still kept as the default. Jira NVGPU-5353 Change-Id: I9e57947adeb4d2ef5d59135ed7d008553c44f97c Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2336119 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	d44ed9d3a8	gpu: nvgpu: rollback gpfifo on error Submitting new work may fail in the middle of writing the gpfifo entries. Undo the increments on the gp_put shadow pointer in case of error to avoid submitting wrong data during the next submit. Jira NVGPU-5491 Change-Id: I064eaac8773b24da0a56db79ac6bfd07c008da03 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340464 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	f388b1f596	gpu: nvgpu: simplify cmdbuf construction in submit Split out the wait cmd and incr cmd setup work in submit path to separate functions to minimize cyclomatic complexity and to increase readability. Jira NVGPU-5491 Change-Id: I7dfabd2de287ae10aaae9fb8d4d85d752db8631c Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2340463 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	3875c0825f	gpu: nvgpu: avoid sema/channel dependencies Move the per-channel hw semaphore object to be owned by the channel sync (just like with syncpoints, too). Store just the channel ID in the hw sema for debug prints to get rid of sema->channel dependencies. Make nvgpu_semaphore_alloc() take a hw sema instead of a channel. Fix up some channel-related documentation that has been incorrect. Jira NVGPU-5353 Change-Id: I04d49da3aac50a4cea32e7393f48e6f85a80ca0d Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2339931 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	b8f398f6a7	gpu: nvgpu: clean up struct priv_cmd_entry The valid flag is no longer useful as the lifetime of priv cmd entries is clearer than before. Delete it. Delete also the stored gva that can be calculated from the nvgpu_mem plus offset. Jira NVGPU-4548 Change-Id: Ibf322acbb2ab1a454e9b644af24c02d291b75633 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> (cherry picked partially from commit b9f6512e803873aaa92218dcbc090ff31a4f9c50) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332509 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	05df07945a	gpu: nvgpu: avoid channel dependency in priv cmdbuf The priv cmdbuf queue needs only the vm_gk20a of the channel that owns it. Pass the vm to the queue constructor and have the channel code store the queue to itself instead of poking at the channel from the queue code. Adjust the cmdbuf queue api to take the queue, not the channel. Move the inflight job fallback calculation to the channel code. The size of the channel gpfifo isn't needed in the queue; just the job count is. [not part of the cherry-pick: a bunch of MISRA mitigations.] Jira NVGPU-4548 Change-Id: I4277dc67bb50380cb157f3aa3c5d57b162a8f0ba Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329659 (cherry picked from commit 83b2276f7bea563602eee20ce24b70ce70c8475a) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332508 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	991002c88b	gpu: nvgpu: hide struct priv_cmd_entry The type for entries allocated from the priv cmd queue is no longer necessary to be visible for its users other than as an opaque handle, except for a few minor debug prints. Make those prints output the entry pointer value instead and move the struct definition to priv_cmdbuf.c. Jira NVGPU-4548 Change-Id: Ia75ff41d840ac928561525a46d5973640e4b5f7e Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329658 (cherry picked from commit 3292cdadbc78ca129d1e0878c3947b0839487fc2) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332507 Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	9bee2fe660	gpu: nvgpu: prealloc priv cmdbuf metadata Move preallocation of priv cmdbuf metadata structs to the priv cmdbuf level and do it always, not only on deterministic channels. This makes job tracking simpler and loosens dependencies from jobs to cmdbuf internals. The underlying dma memory for the cmdbuf data has always been preallocated. Rename the priv cmdbuf functions to have a consistent prefix. Refactor the channel sync wait and incr ops to free any priv cmdbufs they allocate. They have been depending on the caller to free their resources even on error conditions, requiring the caller to know how they work. The error paths that could occur after a priv cmdbuf has been allocated have likely been wrong for a long time. Usually the cmdbuf queue allows allocating only from one end and freeing from only the other end, as that's natural with the hardware job queue. However, in error conditions the just recently allocated entries need to be put back. Improve the interface for this. [not part of the cherry-pick:] Delete the error prints about not enough priv cmd buffer space. That is not an error. When obeying the user-provided job sizes more strictly, momentarily running out of job tracking resources is possible when the job cleanup thread does not catch up quickly enough. In such a case the number of inflight jobs on the hardware could be less than the maximum, but the inflight job count that nvgpu sees via the consumed resources could reach the maximum. Also remove the wrong translation to -EINVAL from err from one call to nvgpu_priv_cmdbuf_alloc() - the -EAGAIN from the failed allocation is important. [not part of the cherry-pick: a bunch of MISRA mitigations.] Jira NVGPU-4548 Change-Id: I09d02bd44d50a5451500d09605f906d74009a8a4 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2329657 (cherry picked from commit 25412412f31436688c6b45684886f7552075da83) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2332506 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	6fc1e41150	gpu: nvgpu: split submit on deterministic Avoid repetitive branching on the c->deterministic flag and on build time flags by breaking the submit function on the runtime flag into two functions of which one gets called. In deterministic mode the job tracking conditions are simpler, there are a few extra prechecks to guarantee deterministic latency and the railgate corner case, and deferred cleanup is never done. In nondeterministic mode job tracking has more conditions, a power reference is taken for the job lifetime, and deferred cleanup is assumed. These two paths still share some common code. Split it to two more functions to act as easy building blocks so that the main logic is apparent. Jira NVGPU-4548 Change-Id: I64f91dcf09acb16f409dc04a12ad1e144d0cce56 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333728 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	b077c6787d	gpu: nvgpu: split sync and gpfifo work in submit Make the big submit function somewhat shorter by splitting out the work to do job allocation, sync command buffer creation and gpfifo writing out to another function. To emphasize the difference between tracked and fast submits, add two separate functions for those two cases. Jira NVGPU-4548 Change-Id: I97432a3d70dd408dc5d7c520f2eb5aa9c76d5e41 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2333727 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	dd2fb50a1a	gpu: nvgpu: require deferred cleanup for aggressive sync destroy Aggressive sync destroy is used on some platforms where the amount of syncpoints is limited. It can cause sync objects to get allocated and freed in the submit path and when jobs are cleaned up, so require deferred cleanup. Allocations do not belong to job tracking in a deterministic submit path. Although this has been technically allowed before, deterministic channels have likely not been a priority on those old platforms with aggressive sync destroy set. Update virtualized gp10b platform data to match on a gp10b-vgpu compat string instead of gk20a-vgpu. gk20a (Tegra T124) hasn't been supported for a long time. Delete the aggressive sync destroy field from this platform. It's got enough syncpoints to not dynamically allocate them; having this property set for gp10b-vgpu has likely been a mistake. This is not a completely pure cherry-pick: also extend the gpu characteristics to not advertise full deterministic submit support when aggressive sync destroy is off. This platform flag cannot be adjusted by the user unlike many other flags. Jira NVGPU-4548 Change-Id: I283f546d48b79ac94b943d88e5dce55710858330 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322042 (cherry picked from commit b1ba2b997b2174e365bcb0782ef3e67260ff9e57) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328411 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	0b70fff5db	gpu: nvgpu: fix job count calculation for non-pow2 The CIRC_SPACE and CIRC_CNT macros work as expected when the buffer size is a power of two. The userspace-supplied number of inflight jobs is not necessarily so. Compare the get and put pointers manually. Jira NVGPU-4548 Change-Id: Ifa7bd6d78f82ec8efcac21fcca391053a2f6f311 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328572 (cherry picked from commit 33dffa1cfb142eea0f28474566c31b632eee04f5) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331340 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	47c3d4582c	gpu: nvgpu: hide priv cmdbuf gva and size Add an accessor function in the priv cmdbuf object for gva and size to be written in a gpfifo entry once the cmdbuf build is finished. This helps in eventually hiding the struct priv_cmd_entry as an implementation detail. Add a sanity check to verify that the buffer has been filled exactly to the requested size. The cmdbufs are used to hold wait and increment commands for syncpoints or gpu semaphores. A prefence buffer can hold a number of wait commands of equal size, and the postfence buffer holds exactly one increment. Jira NVGPU-4548 Change-Id: I83132bf6de52794ecc419e033e9f4599e488fd68 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325102 (cherry picked from commit d1831463a487666017c4c80fab0292a0b85c7d83) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331339 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Dinesh	1c1da3d6b4	gpu: nvgpu: Syncpoint invalid value to ~0. As qnx syncpoint's invalid value is ~0, change the code to handle this. Bug 200603716 Change-Id: I5ec79688cd9e60066725781f1effe57692ec0c27 Signed-off-by: Dinesh <dt@nvidia.com> (cherry picked from commit 705260565a75bc90683841c4c08e4c857bda39f0) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331208 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	e9747d5477	gpu: nvgpu: remove wait_fence_fd from incr_user The wait_fence_fd parameter in nvgpu_channel_sync_incr_user() has not been used since commit `1a4647272f` ("gpu: nvgpu: remove fence dependency tracking") where it was used to save a dependency fd to sema-based post fences. The commit probably should have removed this param; it has no purpose in the current design. Jira NVGPU-4548 Change-Id: Id7e68b24f8e9ba0e43ff01b7af946434580b166e Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2326604 (cherry picked from commit f8031142270fb87ac41597ae70a80505078ae6d5) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328423 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	39844fb27c	gpu: nvgpu: hide priv cmdbuf mem writes Add an API to append data to a priv cmdbuf entry. Hold the write pointer offset internally in the entry instead of having the user keep track of where those words are written to. This helps in eventually hiding struct priv_cmd_entry from users and provides a more consistent interface in general. The wait and incr commands are now slightly easier to read as well when they're just arrays of data. A syncfd-backed prefence may be composed of several individual fences. Some of those (or even a fence backed by just one) may be already expired, and currently the syncfd export design releases and nulls semaphores when expired (see gk20a_sync_pt_has_signaled()) so for those the wait cmdbuf is appended with zeros; the specific function is for this purpose. Jira NVGPU-4548 Change-Id: I1057f98c1b5b407460aa6e1dcba917da9c9aa9c9 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325099 (cherry picked from commit 6a00a65a86d8249cfeb06a05682abb4771949f19) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2331336 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	d58d6ff321	gpu: nvgpu: use job count for priv cmdbuf size Reduce the priv cmdbuf allocation size to match the actual space needed in the worst case when num_in_flight is not specified. Although synchronization may indeed take up to 2/3 of the gpfifo entries, the number of jobs is what matters and it will be the remaining 1/3. Each job uses up at most one wait and incr command from the pre and post fences, so half of the 2/3 will be only wait commands and the other half will be only incr commands. Jira NVGPU-4548 Change-Id: Ib3566a76b97d8f65538d961efb97408ef23ec281 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325233 (cherry picked from commit 515deae4f58fedc7d004988f0f85470a7a894ddf) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328413 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	116c385089	gpu: nvgpu: alloc priv cmdbuf based on chip The semaphore wait and incr sizes are not 8 and 10 for gv11b onwards. Use the specific HAL API to retrieve their sizes and compute the priv cmdbuf queue based on them instead of the up-to-gp10b values. We haven't run out of space likely for several reasons: 1) userspace may not need both pre and post fences for absolutely each submitted job 2) submitted jobs may consist of more than one gpfifo entry, reducing the relative required sync capacity 3) the queue size is rounded up to the next power of two which leaves some margin for error in this calculation 4) the gpfifo size based num-in-flight guess has been twice as big as it needs to be (fixed in a next patch) Jira NVGPU-4548 Change-Id: I172b5c0d8bb7d2231cc45cbed5e1e8b60ce7c707 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323148 (cherry picked from commit 03fb194d105242c3eb20a9857a22743f5f64b9b9) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328412 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	1dcd4957f0	gpu: nvgpu: extract job from channel.c Start moving job and job list related functionality out of the big channel.c file. The lowest level job list stuff is moved, as is resource preallocation which is tied to the job list. Adding and cleaning jobs still stays in channel.c for now. The joblist is still owned by the channel as a direct struct field. Jira NVGPU-4548 Change-Id: I2733484d8ce6bd7b1fe0c32a867139c682616dfd Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323149 (cherry picked from commit cbd20803ee10058da9d258e9e8cb91b34d2278d5) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328408 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	72151c579f	gpu: nvgpu: hide priv cmd queue type Move struct priv_cmd_queue to priv_cmdbuf.c so that its definition does not need to be visible to all users of channel.h. This also forces it to be separately allocated (during channel init time). While at it, rename the functions to allocate and free priv cmdbuf queues now that they're not in channel.c anymore. A private command buffer queue is a piece of dma memory from which entries for incr and wait command lists are suballocated. As the name implies, it's a queue; allocations and frees of the bufs must happen in certain order. Jira NVGPU-4548 Change-Id: I1b47029f3a478e1942f24292918b7b59a5d91528 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323147 (cherry picked from commit 1fcf9b04275f44638059c0147dc16c1dc6956510) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328407 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	b3d16b23d5	gpu: nvgpu: extract priv cmdbuf from channel.c Move private command buffer related functionality to priv_cmdbuf.c. This is used only for kernel mode submits, so it makes sense to group it out, and the priv cmdbuf stuff is used also by things that don't care about channels. Jira NVGPU-4548 Change-Id: Idbb42e3ed3984e16c654bb9aa2b7564b780048a4 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323146 (cherry picked from commit bb67bfc7ab8e87236f31bc4f6c80dab042609f21) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2328406 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	52835c39ae	gpu: nvgpu: do not skip completed syncpt prefences A corner case has existed since ancient times for syncpoint-backed prefences to not cause a gpu wait if the fence is found to be completed in the submit path. This adds some unnecessary complexity, so don't check for completion in software. Let the gpu "wait" for these known-to-be-trivial waits too. Necessary priv cmdbuf space has been allocated anyway. Originally nvhost had 16-bit fences which would wrap around relatively quickly, so waiting for an old fence could have looked like waiting for a fence that will expire long in the future. With 32-bit thresholds, this hasn't been the case for several Tegra generations anymore, and nvhost doesn't ignore waits like this either. The wait priv cmdbuf in submit path can still be missing even with a prefence supplied because the Android sync framework supports sync fds that contain zero fences inside; this can happen at least when merging fences that have all been expired. In such conditions the wait cmdbuf wouldn't even get allocated. [this is squashed with commit 8b3b0cb12d118 (gpu: nvgpu: allow no wait cmd with valid input fence) from https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325677] Jira NVGPU-4548 Change-Id: Ie81fd8735c2614d0fedb7242dc9869d0961610eb Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321762 (cherry picked from commit 8f3dac44934eb727b1bf4fb853f019cf4c15a5cd) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324254 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	c6908922e5	gpu: nvgpu: move generic preempt hals to common - Move fifo.preempt_runlists_for_rc and fifo.preempt_tsg hals to common source file as nvgpu_fifo_preempt_runlists_for_rc and nvgpu_fifo_preempt_tsg. Jira NVGPU-4881 Change-Id: I31f7973276c075130d8a0ac684c6c99e35be6017 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2323866 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	2d9b839f21	gpu: nvgpu: remove user sync related apis Set safe state and get syncpt address in the kernel submission tracking syncs were implemented for userspace syncs. Now that it's clear that the user sync object provides them, there are no users left for these APIs. Remove them. Jira NVGPU-4548 Change-Id: I58e04162dee55bb8d8547c9252033f40ed908144 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321950 (cherry picked from commit a95c8f7ace562a11ca235d71496d3a7ce150bc7d) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324251 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	4f80c6b8a9	gpu: nvgpu: add channel_user_syncpt Refactor user managed syncpoints out of the channel sync infrastructure that deals with jobs submitted via the kernel api. The user syncpt only needs to expose the id and gpu address of the reserved syncpoint. None of the rest (fences, priv cmdbufs) is needed for that, so it hasn't been ideal to couple with the user-allocated syncpts. With user syncpts now provided by channel_user_syncpt, remove the user_managed flag from the kernel sync api. This allows moving all the kernel submit sync code to be conditionally compiled in only when needed, and separates the user sync functionality in a more clear way from the rest with a minimal API. [this is squashed with commit 5111caea601a (gpu: nvgpu: guard user syncpt with nvhost config) from https://git-master.nvidia.com/r/c/linux-nvgpu/+/2325009] Jira NVGPU-4548 Change-Id: I99259fc9cbd30bbd478ed86acffcce12768502d3 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321768 (cherry picked from commit 1095ad353f5f1cf7ca180d0701bc02a607404f5e) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319629 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	b813adbf49	gpu: nvgpu: require os fence when only supported If the os fence is the only kind that's supported, fail a submit if the user wants fences but doesn't explicitly request sync fences, expecting syncpoints. Syncpoint support is advertised to userspace in the gpu characteristics, so userspace already has the knowledge to request the correct sync type. Do this check at the ioctl level. The in-kernel stuff that needs submits (cde, copyengine) can work without syncpoints and sync fences are used only in userspace. Fail a submit also if CONFIG_SYNC is not set and sync fences are requested. Lack of kernel support doesn't guarantee that userspace would still wrongly want that. Clarify the deferred cleanup requirements. The sync framework is needed only for post sync fences, but deferred cleanup is still always needed with semaphores because the internal tracking is done with dynamically allocated (although small) objects. Jira NVGPU-4548 Change-Id: I2e5a6554930cb413b2bb46ddfe388e41390bc7e4 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321715 (cherry picked from commit d870956170906eae1088846ec05266c859669771) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2318157 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	62955ec7f1	gpu: nvgpu: reorganize gpfifo writes in submit Reduce the number of branches and make the code flow more straightforward by having two complete paths for the gpfifo entry writes: one when job tracking is done and another when not. Although this adds some very minor duplication (of the user gpfifo append call), this way it's easier to read what happens to the job metadata, and when do we even have one. Jira NVGPU-4548 Change-Id: I6be8bc5afaf139e7c49d5e44837e04f642dd5721 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321761 (cherry picked from commit 9a3d3c8d556d563b9d67b370636791d6a1dd57ee) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324253 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	550d45430f	gpu: nvgpu: extract submit prechecks to own function Reduce complexity of the big gpfifo submit function by adding another function to perform channel-global and driver-global sanity checks that don't depend on submit parameters. The nvgpu_channel_check_unserviceable() check was in the middle of the submit function because there used to be a blocking wait just before it when the hw gpfifo would be full. The blocking wait could exit with the channel recovered from a timeout. Now it's ok to check this only once in the beginning because the submit is non-blocking. Jira NVGPU-4548 Change-Id: Idf19a560ca58a4f7da776c420dc9c6299cd7f7e7 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321760 (cherry picked from commit 5359a2180f13505f57c62b9f639344913716370a) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324252 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	8b96f27c45	gpu: nvgpu: delete channel refs in job tracking Each submitted job has held a reference to the channel where the job runs. This is not necessary: all that the refs do is prevent the channel from getting freed before the jobs are done in case the channel file is closed early. However, that is already taken care of, so remove the per-job get/put pair. The channel closure path needs to unbind the channel from its tsg if that hasn't been done by the channel's user. Unbind gets the channel off the runlist and forces all fences to expire, then enqueues the channel for final job cleanup. No jobs can outlive this. Delete also the extra get/put pair in job cleanup. The caller (either the channel worker thread or the submit path in case of deterministic channels) will always hold a reference. Jira NVGPU-4548 Change-Id: I3a01759e1b2caf66c46cff19f6557645489ca8f4 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2322541 (cherry picked from commit 8af6260b8fcfd7bf393f50addb681b5353cbae38) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2324255 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vinod G	340ea241cb	gpu: nvgpu: remove channel debug_dump hal Channel debug_dump hal function does not involve any register related code. Move gv11b_channel_debug_dump hal function to common code nvgpu_channel_info_debug_dump function. Check gpu hw version to limit instance variables dump that differs between socs. Add new hal pointer syncpt_debug_dump for pbdma. Jira NVGPU-5109 Signed-off-by: Vinod G <vinodg@nvidia.com> Change-Id: Icfca837ce8e4117387cffa6fadf6c094c7da5946 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2321016 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
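
Commit 0b70fff5db in the list above replaces the CIRC_SPACE and CIRC_CNT macros with a manual comparison of the get and put pointers. The following is a minimal sketch of why that matters, not the actual nvgpu code. The two macro definitions match the Linux kernel's circ_buf.h; ring_count() and ring_space() are hypothetical helper names, and keeping one slot unused to tell a full ring from an empty one is an assumption made here for illustration.

```c
#include <assert.h>

/* Same definitions as in the Linux kernel's <linux/circ_buf.h>.
 * The "& (size - 1)" mask is a valid modulo only for power-of-two sizes. */
#define CIRC_CNT(head, tail, size) (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

/* Hypothetical helper: number of used entries in a ring of any length.
 * put is the producer index, get is the consumer index, both below len. */
static unsigned int ring_count(unsigned int put, unsigned int get,
                               unsigned int len)
{
    assert(len > 0U && put < len && get < len);
    /* If put has wrapped around past the end, add the ring length back. */
    return (put >= get) ? (put - get) : (len - get + put);
}

/* Hypothetical helper: number of free entries. One slot is assumed to stay
 * unused so that put == get always means "empty". */
static unsigned int ring_space(unsigned int put, unsigned int get,
                               unsigned int len)
{
    return len - 1U - ring_count(put, get, len);
}
```

With a ring of 6 entries, put = 1 and get = 4, ring_count() reports 3 entries in flight, whereas CIRC_CNT() yields 5 because masking with 6 - 1 does not compute a remainder. For a ring of 8 entries the helpers and the macros agree.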


561 Commits