Commit Graph

388 Commits

Author SHA1 Message Date
Shardar Shariff Md
a470647ad7 gpu: nvgpu: use soc/tegra/chip-id.h for soc header
The soc tegra headers are unified: all content of linux/tegra-soc.h
is moved to soc/tegra/chip-id.h so that there is a single soc header
for Tegra.

Change-Id: I281e19dd3eb1538b8dfbea4eb0779fb64d1fcffa
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1288365
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-20 08:24:01 -08:00
Alex Waterman
6e2237ef62 gpu: nvgpu: Use timer API in gk20a code
Use the timers API in the gk20a code instead of Linux-specific
API calls.

This also changes the behavior of several functions to wait for
the full timeout for each operation that can time out. Previously
the timeout was shared across each operation.
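
As an illustration, the general shape of a poll loop on top of the
nvgpu timeout API (a sketch; the status_reg, done_bit, and 2000 ms
budget are made up for the example):

  struct nvgpu_timeout timeout;

  nvgpu_timeout_init(g, &timeout, 2000, NVGPU_TIMER_CPU_TIMER);
  do {
          if (gk20a_readl(g, status_reg) & done_bit)
                  return 0;               /* finished in time */
          usleep_range(20, 40);           /* brief backoff between polls */
  } while (!nvgpu_timeout_expired(&timeout));
  return -ETIMEDOUT;                      /* full budget consumed */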

Bug 1799159

Change-Id: I2bbed54630667b2b879b56a63a853266afc1e5d8
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1273826
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-18 16:46:33 -08:00
Alex Waterman
b928f10d37 gpu: nvgpu: Start re-organizing the HW headers
Reorganize the HW headers of gk20a. The headers are moved to a
new directory:

  include/nvgpu/hw/gk20a

And from the code are included like so:

  #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h>

This is the first step in reorganizing all of the HW headers for
gm20b, gm206, etc. This is part of a larger effort to re-structure
and make the driver more readable and scalable.

Bug 1799159

Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-11 12:44:14 -08:00
Alex Waterman
6df3992b60 gpu: nvgpu: Move allocators to common/mm/
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.

This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().
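
A minimal sketch of the size-based split such helpers typically make
(illustrative only; the PAGE_SIZE threshold is an assumption, not the
verbatim driver code):

  static inline void *nvgpu_kalloc(size_t size)
  {
          /* Large buffers take the vmalloc() path, small ones kmalloc(). */
          if (size > PAGE_SIZE)
                  return vmalloc(size);
          return kmalloc(size, GFP_KERNEL);
  }

  static inline void nvgpu_kfree(void *p)
  {
          kvfree(p);      /* picks the right release path for either case */
  }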

Bug 1799159

Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-09 12:33:16 -08:00
Alex Waterman
91d977ced4 gpu: nvgpu: Misc fixes for crashes on shutdown
Fix miscellaneous issues seen during driver shutdown.

 o  Make sure pointers are valid before accessing them.
 o  Busy the GPU during channel timeout.
 o  Cancel delayed work on channels.
 o  Avoid access to channels that may have been freed.

Bug 1816516
Bug 1807277

Change-Id: I62df40373fdfb1c4a011364e8c435176a08a7a96
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1250026
(cherry picked from commit 64a95fc96c8ef7c5af9c53c4bb3402626e0d2f60)
Reviewed-on: http://git-master/r/1274474
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-04 15:53:55 -08:00
Konsta Holtta
92fe000749 gpu: nvgpu: rename timeout_check to timeout_expired
Change "check" to "expired" in nvgpu_timeout_check* and append _expired
to nvgpu_timeout_peek to clarify what the boolean-like return value
means and thus avoid bugs.
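
Illustratively, the rename turns an ambiguous call site into a
self-describing one:

  /* Before: does a non-zero return mean "ok" or "timed out"? */
  if (nvgpu_timeout_check(&timeout))
          return -ETIMEDOUT;

  /* After: the predicate reads as exactly what it returns. */
  if (nvgpu_timeout_expired(&timeout))
          return -ETIMEDOUT;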

Bug 200260715

Change-Id: I47e097ee922e856005a79fa9e27eddb1c8d77f8b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1269366
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-19 15:40:36 -08:00
Terje Bergstrom
a4731b3282 gpu: nvgpu: Use end of vidmem as bootstrap region
Instead of hard-coding the bootstrap region, it should always be set
to the last 256MB of vidmem.
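
Conceptually (a sketch; the variable names are illustrative):

  /* Derive the bootstrap region from the probed vidmem size instead
   * of hard-coding it. */
  u64 bootstrap_size = SZ_256M;
  u64 bootstrap_base = vidmem_size - bootstrap_size;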

Bug 200244445

Change-Id: I91779d1bf861f4f23a0b646f70b1febbbc4581b5
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1242409
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-on: http://git-master/r/1267124
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-09 15:05:12 -08:00
Konsta Holtta
54810444d2 gpu: nvgpu: fix timeout retry usage in mm_gk20a.c
Loop conditions of timeout checking introduced in commit
2109478311 ("gpu: nvgpu: Use timeout retry
API in mm_gk20a.c") were flipped by accident, so each usage in a loop
actually did not wait enough but ran only one iteration. Fix the
conditions to loop as long as the timeout is NOT expired.
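
The inversion and its fix, in schematic form (poll_hw() is a
hypothetical stand-in for the body of each loop):

  /* Broken: loops only while already expired, i.e. runs once. */
  do { poll_hw(); } while (nvgpu_timeout_expired(&timeout));

  /* Fixed: keeps polling as long as the timeout has NOT expired. */
  do { poll_hw(); } while (!nvgpu_timeout_expired(&timeout));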

Also restore l2 flush timeout to 10 ms from 1, which was done in commit
030ef82bdd ("gpu: nvgpu: increase l2 flush
timeout") but overwritten by the above "use timeout" commit.

Bug 200260715

Change-Id: I0db16be79a1a27caa3d97fac9d4361582cc232e8
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1268482
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-09 13:44:53 -08:00
Alex Waterman
2109478311 gpu: nvgpu: Use timeout retry API in mm_gk20a.c
Use the retry API that is part of the nvgpu timeout API to keep
track of retry attempts in the various flushing and invalidating
operations in the MM code.
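
A sketch of the retry flavor of the timeout API (the retry cap and
the flush_done() condition are made up for the example):

  struct nvgpu_timeout timeout;

  /* Counts attempts instead of wall-clock time. */
  nvgpu_timeout_init(g, &timeout, 1000, NVGPU_TIMER_RETRY_TIMER);
  do {
          if (flush_done(g))
                  return 0;
          udelay(5);
  } while (!nvgpu_timeout_expired(&timeout));
  return -ETIMEDOUT;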

Bug 1799159

Change-Id: I36e98d37183a13d7c3183262629f8569f64fe4d7
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1255866
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-05 16:16:24 -08:00
Krishna Reddy
89e707d2b4 gpu: nvgpu: fix hardcoded page size reference
Bug 1843356
Bug 1769772

Change-Id: I6c2a3a72f7082074bbf1165a74d5070195e1e653
Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
Reviewed-on: http://git-master/r/1258352
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-05 13:49:17 -08:00
seshendra Gadagottu
030ef82bdd gpu: nvgpu: increase l2 flush timeout
Under heavy throttling the gpu runs 8 times slower. This can make
the l2 flush time out, so increase the timeout from 1 msec to
10 msec.

Bug 1787261

Change-Id: Ia112ce968c136135ccb9df4a7364073103684403
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1216559
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-02 05:37:03 -08:00
Deepak Nibade
af5d2d208a gpu: nvgpu: API to access fb memory
Add the IOCTL NVGPU_DBG_GPU_IOCTL_ACCESS_FB_MEMORY
to read/write fb/vidmem memory.

The interface accepts the dmabuf_fd of the buffer in vidmem, an
offset into the buffer to access, a temporary buffer to copy data
across the API, the size of the read/write, and a command indicating
either a read or a write operation.

The API first parses all the inputs, and then calls
gk20a_vidbuf_access_memory() to complete the fb access.

gk20a_vidbuf_access_memory() then simply uses gk20a_mem_rd_n() or
gk20a_mem_wr_n() depending on the command issued.
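
A hypothetical sketch of the argument layout such an interface
implies (field names and ordering are assumptions, not the actual
uapi):

  struct nvgpu_dbg_gpu_access_fb_memory_args {
          __u32 cmd;        /* read or write */
          __u32 dmabuf_fd;  /* fd of the buffer in vidmem */
          __u64 offset;     /* offset into the buffer */
          __u64 buffer;     /* user pointer used as the copy buffer */
          __u64 size;       /* bytes to read or write */
  };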

Bug 1804714
Jira DNVGPU-192

Change-Id: Iba3c42410abe12c2884d3b603fa33d27782e4c56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1255556
(cherry picked from commit 2c49a8a79d93fc526adbf6f808484fa9a3fa2498)
Reviewed-on: http://git-master/r/1260471
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-11-30 02:18:17 -08:00
seshendra Gadagottu
ef95e43d97 gpu: nvgpu: chip specific init_inst_block
Add a function pointer for chip-specific init_inst_block.
Update this function pointer for gk20a and gm20b.

JIRA GV11B-21

Change-Id: I74ca6a8b4d5d1ed36f7b25b7f62361c2789b9540
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1254875
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-21 08:50:45 -08:00
Terje Bergstrom
d29afd2c9e gpu: nvgpu: Fix signed comparison bugs
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occurring in the future.
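
A classic instance of the bug class being fixed (clear_entry() is a
hypothetical stand-in):

  u32 i;

  /* "i >= 0" is always true for an unsigned type, so instead of
   * stopping at zero the loop wraps around and never terminates. */
  for (i = n - 1; i >= 0; i--)
          clear_entry(i);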

Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-16 21:35:36 -08:00
Alex Waterman
2fa54c94a6 gpu: nvgpu: Remove global debugfs variable
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.

Bug 1799159

Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-26 11:10:01 -07:00
Alex Waterman
93eea1d729 gpu: nvgpu: Move CE cleanup
Move the CE cleanup to before the FIFO cleanup. Since the CE closes
a channel during its cleanup, the FIFO must still be initialized at
that point, as the FIFO code maintains the vmalloc()'ed channels.

Bug 1816516

Change-Id: Ia7a97059a12a0c2b52368ffe411e597f803e8e6e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225613
(cherry picked from commit 707bd2a6d4672c6a7b7a8b2e581ea3a606ed971d)
Reviewed-on: http://git-master/r/1240106
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-26 11:09:57 -07:00
Alex Waterman
f0fe1c2f02 gpu: nvgpu: Only cleanup existing semaphore pools
Not all VMs have semaphore pools made for them even when semaphores
are going to be used. Thus only VMs with existing semaphore pools
should have their pools cleaned up.
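
The guard, schematically (names approximate):

  /* Only VMs that actually created a pool have one to release. */
  if (vm->sema_pool)
          gk20a_semaphore_pool_put(vm->sema_pool);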

Bug 1816516

Change-Id: I07828708faef451f1711f58c0d5b3f8e4d296dd0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225612
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 6cdb7b6650765465dca68dc3c23b3d795ccdafb5)
Reviewed-on: http://git-master/r/1240105
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-26 11:09:56 -07:00
Alex Waterman
fc4f0ddddb gpu: nvgpu: SLAB allocation for page allocator
Add the ability to do "SLAB" allocation in the page allocator. This
is generally useful since the allocator manages 64K pages but often
we only need 4k chunks (for example when allocating memory for page
table entries).
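
Schematically, the idea is to split one 64K page into fixed 4K slots
and hand those out (a sketch; the real bookkeeping is more involved):

  struct page_alloc_slab_page {
          u64           page_addr;   /* one 64K page from the allocator */
          unsigned long bitmap;      /* 16 x 4K slots within that page */
  };

  static u64 slab_alloc_4k(struct page_alloc_slab_page *slab)
  {
          int slot = find_first_zero_bit(&slab->bitmap, 16);

          if (slot >= 16)
                  return 0;          /* slab full, caller grabs a new page */
          set_bit(slot, &slab->bitmap);
          return slab->page_addr + slot * SZ_4K;
  }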

Bug 1799159
JIRA DNVGPU-100

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225322
(cherry picked from commit 299a5639243e44be504391d9155b4ae17d914aa2)
Change-Id: Ib3a8558d40ba16bd3a413f4fd38b146beaa3c66b
Reviewed-on: http://git-master/r/1227924
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-18 12:24:33 -07:00
Konsta Holtta
70e0462861 gpu: nvgpu: skip pramin barriers for page tables
Page table updates have an explicit write barrier at the end of a pte
update operation in update_gmmu_ptes_locked(), so the per-wr32 wmb()s
are not necessary.

Jira DNVGPU-23

Change-Id: I2e2596f0900d840fadb369ee1261c5e2305f2070
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1225150
(cherry picked from commit 6664a667ea326e9663a6b502765f858d8669f4d9)
Reviewed-on: http://git-master/r/1227475
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-17 14:39:10 -07:00
Konsta Holtta
a8e260bc8d gpu: nvgpu: allow skipping pramin barriers
A wmb() next to each gk20a_mem_wr32() via PRAMIN may be overly
careful, so support not inserting these barriers, for performance,
in cases where they are not necessary, e.g. where the caller does an
explicit barrier after a batch of writes.

Also, move those optional wmb()s to the end of the whole internally
batched write for gk20a_mem_{wr_n,memset}, out of the per-batch
subloops that may run multiple times.
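
The shape of the change, schematically (pramin_write_relaxed() is a
hypothetical stand-in for the barrier-free write):

  /* Before: one wmb() per word written.
   * After: relaxed writes in the loop, at most one barrier per batch. */
  for (i = 0; i < n; i++)
          pramin_write_relaxed(g, offset + i * 4, data[i]);
  if (!skip_wmb)
          wmb();  /* optional, skipped when the caller barriers itself */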

Jira DNVGPU-23

Change-Id: I61ee65418335863110bca6f036b2e883b048c5c2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1225149
(cherry picked from commit d2c40327d1995f76e8ab9cb4cd8c76407dabc6de)
Reviewed-on: http://git-master/r/1227474
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-17 14:39:00 -07:00
Alex Waterman
718af968f0 gpu: nvgpu: Fix wmb() order in pramin access
wmb() should come after the writes to ensure that the writes have
completed before progressing.

Bug 1811382

Change-Id: I98fba317b1760240c0b5de531accf398fe69c9b3
Signed-off-by: Alex Waterman <alexw@nvidia.com>
(cherry picked from commit 1b1201b9c109061590e6e25260d7230ae2c89888)
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1225251
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-17 14:38:50 -07:00
Konsta Holtta
fa6ab1943e gpu: nvgpu: add ioctl for querying memory state
Add NVGPU_GPU_IOCTL_GET_MEMORY_STATE to read the amount of free
device-local video memory, if applicable.

Some reserved fields are added to support different types of queries in
the future (e.g. context-local free amount).
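
A sketch of what such a query struct plausibly looks like (the real
uapi layout may differ; reserved fields per the note above):

  struct nvgpu_gpu_get_memory_state_args {
          __u64 total_free_bytes;    /* free device-local vidmem */
          __u64 reserved[4];         /* room for future query types */
  };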

Bug 1787771
Bug 200233138

Change-Id: Id5ffd02ad4d6ed3a6dc196541938573c27b340ac
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1223762
(cherry picked from commit 96221d96c7972c6387944603e974f7639d6dbe70)
Reviewed-on: http://git-master/r/1235980
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-14 08:12:34 -07:00
Konsta Holtta
a4e4466a0c gpu: nvgpu: track pending bytes for vidmem clears
Change clears_pending to bytes_pending and track the number of bytes
to be freed instead of the number of buffers. This, atomically
combined with the amount of free space in the allocator, gives the
total amount of free memory available.

Bug 200233138

Change-Id: Ibbb4e80a32728781ba19a74307d8a8ac1a4d7431
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1231422
(cherry picked from commit 025e765f312c253b201ecf2dbbe0f4972fe1d4bc)
Reviewed-on: http://git-master/r/1235957
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-14 08:12:17 -07:00
Konsta Holtta
8728da1c6e gpu: nvgpu: compact pte buffers
The lowest page table level may hold very few entries for mappings of
large pages, but a new page is allocated for each list of entries at the
lowest level, wasting memory and performance. Compact these so that the
new "allocation" of ptes is appended at the end of the previous
allocation, if there is space.
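
The compaction idea, schematically (names and bookkeeping are
illustrative):

  /* Carve the new pte list out of the tail of the previously
   * allocated 4K page when it fits, instead of taking a fresh page. */
  if (prev && prev->used + bytes <= SZ_4K) {
          entry->mem    = prev->mem;      /* share the backing page */
          entry->offset = prev->used;     /* append at the end */
          prev->used   += bytes;
  } else {
          err = alloc_pte_page(vm, entry, bytes); /* hypothetical helper */
  }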

A 4 KB page is still the smallest size requested from the allocator;
any possible overhead in the allocator (e.g., internally allocating
big pages only) is not taken into account.

Bug 1736604

Change-Id: I03fb795cbc06c869fcf5f1b92def89a04583ee83
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1221841
(cherry picked from commit fa92017ed48e1d5f48c1a12c512641c6ce9924af)
Reviewed-on: http://git-master/r/1234996
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-13 08:09:16 -07:00
Konsta Holtta
f5069622bb gpu: nvgpu: make sure vidmem is cleared only once
Protect the initial vidmem zeroing performed during the first userspace
alloc with a mutex, so that it blocks next concurrent users and is run
only once. Otherwise, multiple clears could end up running in parallel,
so that the next ones corrupt memory allocated by the thread that has
finished earlier and advanced to allocate and use memory.
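
The protection pattern, sketched (field and function names are
illustrative):

  mutex_lock(&mm->vidmem.first_clear_mutex);
  if (!mm->vidmem.cleared) {
          /* Runs once; concurrent first allocators block here until
           * the whole clear has finished. */
          err = gk20a_vidmem_clear_all(g);
          if (!err)
                  mm->vidmem.cleared = true;
  }
  mutex_unlock(&mm->vidmem.first_clear_mutex);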

Jira DNVGPU-84

Change-Id: If497749abf481b230835250191d011c4a9d1483b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1232461
(cherry picked from commit 79435a68e6d2713b78acdb0ec6f77cfd78651d7f)
Reviewed-on: http://git-master/r/1234990
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-12 13:02:13 -07:00
seshendra Gadagottu
fda4ddfa79 gpu: nvgpu: userd allocation from sysmem
When bar1 memory is not supported, userd will be allocated from
sysmem.

The functions gp_get and gp_put are updated accordingly.

JIRA GV11B-1

Change-Id: Ia895712a110f6cca26474228141488f5f8ace756
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1225384
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-11 09:16:03 -07:00
Konsta Holtta
0bb1f45958 gpu: nvgpu: optimize barrier in batch pramin writes
Move wmb() before the loop in pramin-accessed batch writes and use
writel_relaxed() directly, instead of calling gk20a_writel() that would
do wmb() on each iteration separately.

Jira DNVGPU-24

Change-Id: I4c1375a819266727f97e2f109d3132b5b0974ac6
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1213600
(cherry picked from commit 79e3e38e0c5384ababfd55b8e6cd9723eb8f7b66)
Reviewed-on: http://git-master/r/1184343
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-06 23:34:57 -07:00
Deepak Nibade
5c049b5c79 gpu: nvgpu: fix allocation and map size mismatch while mapping
It is possible to allocate a larger size than the user requested,
e.g. if we allocate at 64k granularity and the user asks for a 32k
buffer, we end up allocating a 64k chunk.

The user still asks to map the buffer with size 32k, and hence we
reserve mapping addresses only for 32k.

But due to a bug in update_gmmu_ptes_locked(), we end up creating
mappings considering the size of 64k and corrupt some mappings.

Fix this by considering min(chunk->length, map_size) while mapping
the address range for a chunk.

Also, map_size will be zero once we have mapped all of the requested
address range, so bail out of the loop once map_size is zero.
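
The fixed per-chunk loop, schematically (map_gmmu_range() and some
field names are illustrative):

  list_for_each_entry(chunk, &alloc->alloc_chunks, list_entry) {
          u64 len = min(chunk->length, map_size);

          map_gmmu_range(vm, gpu_va, chunk->base, len);
          gpu_va   += len;
          map_size -= len;
          if (map_size == 0)      /* requested range fully mapped */
                  break;
  }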

Bug 1805064

Change-Id: I125d3ce261684dce7e679f9cb39198664f8937c4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1217755
(cherry picked from commit 3ee1c6bc0718fb8dd9a28a37eff43a2872bdd5c0)
Reviewed-on: http://git-master/r/1221775
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-09-26 03:33:49 -07:00
Alex Waterman
f361f89b12 gpu: nvgpu: Use actual carveouts for WPR region
Use a carveout for the WPR region in the VIDMEM.

Jira DNVGPU-84

Change-Id: I191ecc3bb317ae3af6b56f5970194e646c513964
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1208527
(cherry picked from commit 7edf74d7468dcff1f01cbd901d83aa0e32602f0e)
Reviewed-on: http://git-master/r/1223455
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-20 14:56:43 -07:00
Konsta Holtta
919b642214 gpu: nvgpu: log page table addr only for sysmem
Don't attempt to use get_iova_addr() on vidmem, where it does not
make sense.

Jira DNVGPU-20

Change-Id: Ibfe1516b88ed8b60b8134c330e6b0569d52cbb5b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1217077
(cherry picked from commit c912f0349d24fde033dbcd9874948ff14ad89a43)
Reviewed-on: http://git-master/r/1221264
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 21:58:53 -07:00
Deepak Nibade
f919aab509 gpu: nvgpu: add safety for vidmem addresses
Add a new API set_vidmem_page_alloc() which sets BIT(0) in
sg_dma_address() only for vidmem allocations.

Add and use a new API get_vidmem_page_alloc() which receives a
scatterlist and returns a pointer to the vidmem allocation, i.e.
struct gk20a_page_alloc *alloc. In this API, check whether BIT(0)
is set in sg_dma_address() before converting it to the allocation
address.

In gk20a_mm_smmu_vaddr_translate(), ensure that the address is a
pure IOVA address by verifying that BIT(0) is not set in that
address.
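
The tagging scheme, sketched (illustrative; it relies on the
allocation pointer being at least 2-byte aligned so that BIT(0) is
free to use as a marker):

  static void set_vidmem_page_alloc(struct scatterlist *sgl, u64 addr)
  {
          /* BIT(0) marks "this is a vidmem gk20a_page_alloc pointer". */
          sg_dma_address(sgl) = addr | 1ULL;
  }

  static struct gk20a_page_alloc *get_vidmem_page_alloc(struct scatterlist *sgl)
  {
          u64 addr = sg_dma_address(sgl);

          WARN_ON(!(addr & 1ULL));        /* must be a tagged vidmem address */
          return (struct gk20a_page_alloc *)(uintptr_t)(addr & ~1ULL);
  }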

Jira DNVGPU-22

Change-Id: Ib53ff4b63ac59a8d870bc01d0af59839c6143334
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1216142
(cherry picked from commit 03c9fbdaa40746dc43335cd8fbe9f97ef2ef50c9)
Reviewed-on: http://git-master/r/1219705
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:24:45 -07:00
Deepak Nibade
f036594190 gpu: nvgpu: move all instance blocks to vidmem
Use gk20a_gmmu_alloc() in gk20a_alloc_inst_block() so that we always
try to allocate all inst blocks in vidmem first.

Also use the common API gk20a_alloc_inst_block() in
channel_gk20a_alloc_inst().

Jira DNVGPU-22

Change-Id: I6c47c19aae1189d7e57f47a51d21a32e2df53c1f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1216140
(cherry picked from commit 6c84961a50eb8a8b080b2db08f87e58143f5a6e8)
Reviewed-on: http://git-master/r/1219704
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:24:38 -07:00
Konsta Holtta
97512aecb6 gpu: nvgpu: fix inst block leak for vidmem
Test for size, not cpu_va, to check for buffer validity before
attempting to free.

Jira DNVGPU-22

Change-Id: I416c0963bf4e1819aa2f8d200c69a2d989524f83
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1215575
(cherry picked from commit ce0077feca55bfb5665c82972598a075abd8f2a0)
Reviewed-on: http://git-master/r/1219702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:24:24 -07:00
Deepak Nibade
130dd3dd3e gpu: nvgpu: use get_base_addr() to get inst_block address
Since the inst_block could reside either in sysmem or vidmem, use
gk20a_mem_get_base_addr() to get its base address.

Jira DNVGPU-22

Change-Id: Ic9b4370e0a88b585483e78ea81df0ec6ff799487
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1212702
(cherry picked from commit ecdffa7664f48dba0bcbd15b1340af5bf3b45802)
Reviewed-on: http://git-master/r/1219700
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:24:10 -07:00
Konsta Holtta
51a7cdce32 gpu: nvgpu: fall back to sysmem for generic alloc-maps
In gk20a_gmmu_alloc_map_attr(), which is used for in-kernel allocations
combined with immediate gmmu map, fall back to attempting to allocate
sysmem when vidmem allocation fails.

Bug 1809939

Change-Id: I4ec4fbf93d41fd9681166b47b3ecad24b51ea274
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1216814
(cherry picked from commit a9929682f1f356f7e8a652a2cec8ed73cc492448)
Reviewed-on: http://git-master/r/1217688
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:23:53 -07:00
Konsta Holtta
1a63ca3a65 gpu: nvgpu: fall back to sysmem for generic allocs
In gk20a_gmmu_alloc_attr(), which is used for in-kernel allocations,
fall back to attempting to allocate sysmem when vidmem allocation fails.
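
The fallback shape shared by this and the previous change,
schematically (function names approximate the vidmem/sysmem variants):

  err = gk20a_gmmu_alloc_attr_vid(g, attr, size, mem);
  if (err)
          /* vidmem full or absent: retry the same request in sysmem */
          err = gk20a_gmmu_alloc_attr_sys(g, attr, size, mem);
  return err;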

Bug 1809939

Change-Id: I0397026fd1b3bc803f6d8bb7409e05ab31ec961d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1215447
(cherry picked from commit 3ec37992b830cee917e8ad35ede50e048907014a)
Reviewed-on: http://git-master/r/1217687
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-15 12:23:46 -07:00
Konsta Holtta
b700d3a040 gpu: nvgpu: fix null access in page table allocation
Check entry->mem.sgt for validity before attempting to dereference it in
a debug print.

Bug 1809939

Change-Id: If7aa7444c162a076d8f23a88dfd2e3e0a9c33813
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1215522
(cherry picked from commit 48c25cd4f1db9d5bb07847af4de29d8f369b52e3)
Reviewed-on: http://git-master/r/1220547
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-14 14:13:55 -07:00
Konsta Holtta
93d3199019 gpu: nvgpu: test free user vidmem atomically
An empty list of soon-to-be-freed userspace vidmem buffers is not enough
to safely assume that an allocation may succeed or not if tried again,
because removal from the list and actually marking the memory freed is
not atomic. Fix this by using an atomic counter for the number of
pending frees (so that it's still safe to first remove from the job list
and then perform the free), and making allocation attempts combined with
a test of pending frees atomic.

This still does not guarantee that there is memory available (as the
actual amount of pending memory in bytes plus the current free amount
isn't computed), but removes the race that produces false negatives in
case a single program expects repeated frees and allocs to succeed.
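
A sketch of the counter-based test (names illustrative; the real
change pairs the failed allocation and the counter read under the
same lock so the combination is atomic):

  atomic64_t pending_frees;  /* bumped when a buffer is queued for
                              * free, dropped once its memory is back
                              * in the allocator */

  addr = gk20a_alloc(&g->mm.vidmem.allocator, size);
  if (!addr)
          return atomic64_read(&pending_frees) ? -EAGAIN : -ENOMEM;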

Bug 1809939

Change-Id: I6a92da2e21cbf3f886b727000c924d56f35ce55b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1217078
(cherry picked from commit 83c1f1e70dccd92fdd4481132cf5b6717760d432)
Reviewed-on: http://git-master/r/1220545
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-14 13:03:44 -07:00
Terje Bergstrom
a0dd3ee5be gpu: nvgpu: Allocate vidmem fds from 1024
Allocate vidmem fds from 1024 onwards. This prevents us from
using up the 0-1023 range which is tracked per process, and
fits within FD_SETSIZE.
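
One way to implement this, sketched (assuming the in-kernel
__alloc_fd() helper; error handling elided):

  /* Ask for a descriptor no lower than 1024 so the select()-era
   * 0-1023 range of the process is left untouched. */
  fd = __alloc_fd(current->files, 1024, sysctl_nr_open, O_RDWR);
  if (fd < 0)
          return fd;
  fd_install(fd, file);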

Bug 200222681

Change-Id: I104b81f2831f1816ff66fc245fa63013d78001ec
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1199269
(cherry picked from commit 5d5cbaf6a63dd31538fa35081b70e103d8a658f4)
Reviewed-on: http://git-master/r/1217294
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2016-09-08 20:05:41 -07:00
Deepak Nibade
f31e575ed6 gpu: nvgpu: remove blocking wait for vidmem allocation
We have a blocking 1 sec wait for vidmem allocation. Remove this
blocking wait and just return a proper error code to the caller.

In case we have some buffers to be cleaned up in the list
(clear_list_head), return EAGAIN so that the caller can retry;
otherwise return ENOMEM, indicating that no memory is available
right now.
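
The error-path shape after the change, schematically (names
illustrative):

  addr = gk20a_alloc(&g->mm.vidmem.allocator, size);
  if (!addr) {
          /* Buffers queued for clearing may free memory soon. */
          if (!list_empty(&g->mm.vidmem.clear_list_head))
                  return -EAGAIN;         /* caller may retry */
          return -ENOMEM;                 /* genuinely out of vidmem */
  }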

Jira DNVGPU-84

Change-Id: Ife2b17c989fc80e568f03bb18ad75b93a25be962
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204969
(cherry picked from commit 2bacdf0bc6d5b1cdcb8be37e574ca5f4f0663cae)
Reviewed-on: http://git-master/r/1213451
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:49 -07:00
Deepak Nibade
0ca01a3355 gpu: nvgpu: fix non-contiguous pramin access
In pramin_access_batched(), in each iteration of the loop we first
decide the size of the data that we should write in that iteration.
In case this size is equal to the length of the chunk, we need to
move on to the next chunk for the subsequent iteration.

But since we change the offset variable before the check above, we
end up using the same chunk in the next iteration.

Fix this by correcting the sequence: first check whether we should
move to the next chunk, and only then adjust the offset variable.
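
After the fix, the bookkeeping order is, schematically:

  /* First decide whether this write exhausts the current chunk... */
  if (offset_in_chunk + size == chunk->length) {
          chunk = list_next_entry(chunk, list_entry);
          offset_in_chunk = 0;
  } else {
          /* ...and only then advance the offset within the chunk. */
          offset_in_chunk += size;
  }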

Jira DNVGPU-24

Change-Id: I58c2e24678f4c6dfbe33bf111edd06788629eca8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1210892
(cherry picked from commit 83cc179199692d28a93b3b884c9bc094ff513298)
Reviewed-on: http://git-master/r/1213450
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-09-08 09:43:46 -07:00
Deepak Nibade
f5895f44ea gpu: nvgpu: fix compilation errors for 32 bit arch
Converting the return value of sg_dma_address() (which is u64) into
a pointer results in a compilation failure on 32 bit machines.

Hence convert the address first into uintptr_t and then into a
pointer.
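
The portable cast chain, for reference:

  /* sg_dma_address() is 64-bit wide; casting it straight to a pointer
   * breaks 32-bit builds, so narrow through uintptr_t first. */
  struct gk20a_page_alloc *alloc =
          (struct gk20a_page_alloc *)(uintptr_t)sg_dma_address(sgl);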

Change-Id: I8e036af8f4c936b88883cf8af1491f03025ed356
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1211243
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:24 -07:00
Deepak Nibade
f43231f7a5 gpu: nvgpu: enable big page support for pci
While mapping the buffer, first check if the buffer is in vidmem,
and if so convert the allocation into a base address; then walk
through each chunk to decide the alignment.

Add a new API gk20a_mm_get_align() which returns the alignment based
on the scatterlist and aperture, and use this API to get the
alignment during mapping.

Enable big page support for pci by unsetting disable_bigpage.

Jira DNVGPU-97

Change-Id: I358dc98fac8103fdf9d2bde758e61b363fea9ae9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1207673
(cherry picked from commit d14d42290eed4aa7a2dd2be25e8e996917a58e82)
Reviewed-on: http://git-master/r/1210959
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:15 -07:00
Deepak Nibade
737d634630 gpu: nvgpu: make default vidmem page size of 64k
Allocate 64k pages for vidmem by default. Also make sure that the
base address of vidmem is aligned to the page size.

Jira DNVGPU-20

Change-Id: Ie2e5111f942467754db5b45f1518d72c925d3d19
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206405
(cherry picked from commit 542ebf7f571ba6dc631466e562f7d8e05df4a9a6)
Reviewed-on: http://git-master/r/1210958
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:08 -07:00
Konsta Holtta
4f6c989898 gpu: nvgpu: use vidmem for page tables if available
Use the common gk20a_gmmu_alloc() that tries vidmem too.

Jira DNVGPU-20

Change-Id: I4ea02bc4962d299c6f71444048d4a2a22bd80f55
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206404
(cherry picked from commit 7297727cce8c5c7b26f82afe98cc5428135b4777)
Reviewed-on: http://git-master/r/1178831
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:12:01 -07:00
Deepak Nibade
44c5b5877b gpu: nvgpu: add new API to get base address for sysmem/vidmem buffers
Add a new API gk20a_mem_get_base_addr() which will return the vidmem
base address in case of vidmem and the IOVA address in case of
sysmem.

Even though vidmem allocations are non-contiguous, this API is useful
(and should only be used) for allocations with one chunk (e.g. page
tables).

Also, since page tables could reside either in sysmem or vidmem, use
this API to get the address of page tables.

Jira DNVGPU-20

Change-Id: Ie04af9ca7bfccfec1a8a8e4be2c507cef5cef8e1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206403
(cherry picked from commit a8c74dc188878f2948fa1e0e47bf1837fba6c5e0)
Reviewed-on: http://git-master/r/1210957
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:53 -07:00
Deepak Nibade
93a436f581 gpu: nvgpu: allocate blob space early
Allocating blob space for pmu might need a fixed-address allocation
in vidmem during boot up.

But if some page tables are allocated before the blob space, the
blob space allocation could fail.

Fix this by allocating blob space early during boot up.

Jira DNVGPU-20

Change-Id: I30eca1023c8f8f8be101bb7e160ba57a7040911a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206402
(cherry picked from commit fad4309ce345ed3879f497bda27f2eceb1084dbb)
Reviewed-on: http://git-master/r/1210956
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:46 -07:00
Lakshmanan M
8de995d4af gpu: nvgpu: Add proper timeout handling for vidmem clear operations
The gk20a_fence_wait() API may be interrupted by a signal before its
timeout has actually elapsed. This CL adds a retry mechanism (on
-ERESTARTSYS) for the case where gk20a_fence_wait() returns before
its timeout has elapsed.
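
The retry shape, sketched (argument list illustrative; the real
change presumably also accounts for the time already waited):

  do {
          err = gk20a_fence_wait(fence, timeout_ms);
          /* A signal can abort the wait with -ERESTARTSYS even though
           * the fence may still signal within the timeout; retry. */
  } while (err == -ERESTARTSYS);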

Bug 200230544

Change-Id: I347ed2004935a8b9413f95dcb6fca2b74bf49f2a
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1206265
(cherry picked from commit d3ef533942487785d84d109f985ae648eb3c2434)
Reviewed-on: http://git-master/r/1210955
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:35 -07:00
Deepak Nibade
713f1ddcdf gpu: nvgpu: support pramin access for non-contiguous vidmem
The pramin_access_batched() API currently only supports contiguous
allocations. Modify this API to also support non-contiguous
allocations from the page allocator.

Update gk20a_mem_wr32() and gk20a_mem_rd32() to reuse
pramin_access_batched().

Use gk20a_memset() in gk20a_gmmu_free_attr_vid() to clear vidmem
pages for kernel buffers.

Jira DNVGPU-30

Change-Id: I43630912f4837d8ebc6b9c58f4f427218ef9725b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1204303
(cherry picked from commit 2f84f141d02fd2f641cb18a48896fb3ae5f7e51f)
Reviewed-on: http://git-master/r/1210954
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:11:07 -07:00
Deepak Nibade
9ebd051779 gpu: nvgpu: track allocator and user for each mem
Store an allocator pointer in each mem_desc. This pointer should be
used when freeing the mem, instead of assuming a common allocator.

Add a user_mem flag to mem_desc which will be set only in the case
of user vidmem allocations.

We will delay the free of a mem in the worker only if this flag is
set on the mem; otherwise, we will free it immediately. This is
needed so that all kernel allocations can work with both sysmem and
vidmem.
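
Sketch of the added bookkeeping (a trimmed, illustrative mem_desc):

  struct mem_desc {
          /* ...existing fields... */
          struct gk20a_allocator *allocator; /* where to return the mem */
          bool user_mem;                     /* set only for user vidmem;
                                              * such mems are freed later
                                              * from the clearing worker */
  };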

Jira DNVGPU-84

Change-Id: Ib9a9209b164bc56b7880448f86bd6d42b324cc86
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1203099
(cherry picked from commit 8f0b0122f36a0b6f1932fa9a98d7eb03b1f623d1)
Reviewed-on: http://git-master/r/1210953
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-09-01 09:10:54 -07:00