Commit Graph

207 Commits

Author SHA1 Message Date
Konsta Holtta
d33fb5a964 gpu: nvgpu: use vidmem by default in gmmu_alloc variants
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,_attr,_map,_map_attr}. For others, use sysmem.

Because all of the buffers haven't been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,_attr,_map,_map_attr} to have _sys at
the end to declare explicitly that sysmem is used. Enabling vidmem for
each now is a matter of removing "_sys" from the function call.
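
A rough sketch of the resulting split (helper and field names here are
hypothetical, not the actual nvgpu signatures):

  /* Prefer vidmem when the device has it; fall back to sysmem. */
  static int gmmu_alloc(struct gk20a *g, size_t size, struct mem_desc *mem)
  {
          if (g->has_vidmem)                   /* hypothetical flag */
                  return vidmem_alloc(g, size, mem);
          return sysmem_alloc(g, size, mem);
  }

  /* The renamed _sys variant pins sysmem explicitly, until the buffer
   * has been verified to work from vidmem. */
  static int gmmu_alloc_sys(struct gk20a *g, size_t size, struct mem_desc *mem)
  {
          return sysmem_alloc(g, size, mem);
  }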

Jira DNVGPU-18

Change-Id: Ibe42f67eff2c2b68c36582e978ace419dc815dc5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176805
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-08 04:19:04 -07:00
Konsta Holtta
b8915ab5aa gpu: nvgpu: support in-kernel vidmem mappings
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped to
gpu.

JIRA DNVGPU-18
JIRA DNVGPU-76

Change-Id: I46cf87e27229123016727339b9349d5e2c835b3e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169308
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-06 03:34:23 -07:00
Konsta Holtta
71478a031c gpu: nvgpu: add support for vidmem in page tables
Modify page table updates to take an aperture flag (up until
gk20a_locked_gmmu_map()), don't hard-assume sysmem and propagate it to
hardware.

Jira DNVGPU-76

Change-Id: Ifcb22900c96db993068edd110e09368f72b06f69
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169307
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-04 23:12:29 -07:00
Konsta Holtta
e12c5c8594 gpu: nvgpu: initial support for vidmem apertures
add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that have
a mem_desc available.
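
As a sketch, the helper just picks between two caller-provided field
values (the mask values below are placeholders, not real HW encodings):

  enum aperture { APERTURE_SYSMEM, APERTURE_VIDMEM };

  /* Select the HW target field based on where the buffer lives. */
  static u32 aperture_mask(struct mem_desc *mem, u32 sysmem_mask,
                           u32 vidmem_mask)
  {
          return mem->aperture == APERTURE_VIDMEM ?
                  vidmem_mask : sysmem_mask;
  }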

Jira DNVGPU-76

Change-Id: I4353cdc6e1e79488f0875581cfaf2a5cfb8c976a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169306
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-07-04 23:10:59 -07:00
Alex Waterman
dfd5ec53fc gpu: nvgpu: Revamp semaphore support
Revamp the support the nvgpu driver has for semaphores.

The original problem with nvgpu's semaphore support is that it
required a SW based wait for every semaphore release. This was
because for every fence that gk20a_channel_semaphore_wait_fd()
waited on, a new semaphore was created. This semaphore would then
get released by SW when the fence signaled. This meant that for
every release there was necessarily a sync_fence_wait_async() call
which could block. The latency of this SW wait was enough to cause
massive degradation in performance.

To fix this a fast path was implemented. When a fence is passed to
gk20a_channel_semaphore_wait_fd() that is backed by a GPU semaphore
a semaphore acquire is directly used to block the GPU. No longer is
a sync_fence_wait_async() performed nor is there an extra semaphore
created.

To implement this fast path the semaphore memory had to be shared
between channels. Previously, since a new semaphore was created
every time through gk20a_channel_semaphore_wait_fd(), what address
space a semaphore was mapped into was irrelevant. However, when
using the fast path a semaphore may be released in one address
space but acquired in another.

Sharing the semaphore memory was done by making a fixed GPU mapping
in all channels. This mapping points to the semaphore memory (the
so called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e. released) by a
malicious channel. Each channel then gets a RW mapping of its own
semaphore. This way a channel may only acquire other channels'
semaphores but may both acquire and release its own semaphore.

The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets
used when the GPU needs to wait on an event from something which
only understands how to use sync-pts.
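
The decision described above reduces to something like this sketch
(function names are illustrative, not the actual driver API):

  /* Wait on a fence: fast path for GPU-semaphore-backed fences,
   * SW wait for everything else (e.g. sync-pt backed fences). */
  static int semaphore_wait_fd(struct channel *c, struct sync_fence *f)
  {
          struct semaphore *s = fence_to_gpu_semaphore(f); /* introspection */

          if (s)  /* fast path: the GPU itself blocks on the acquire */
                  return push_semaphore_acquire(c, s);

          /* slow path: SW wait, then release a freshly made semaphore */
          return wait_async_then_release(c, f);
  }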

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-28 15:49:11 -07:00
Konsta Holtta
27baafaad1 gpu: nvgpu: use gpfifo_mem via gk20a_mem_{rd,wr}
Use gk20a_mem_*() accessors for gpfifo memory in work submission instead
of direct cpu accesses in order to support other apertures than sysmem.

The gpfifo memory is still allocated from sysmem for dgpus too.

Split the copying of priv_cmds and the main gpfifo to be submitted in
gk20a_submit_channel_gpfifo() into separate functions.
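
Schematically, the submit path then funnels all gpfifo writes through
an aperture-aware accessor (a sketch with hypothetical helpers):

  /* Copy gpfifo entries via the accessor rather than a raw CPU
   * pointer, so a vidmem-backed gpfifo would work the same way. */
  static void copy_gpfifo(struct gk20a *g, struct mem_desc *gpfifo_mem,
                          u32 start, struct gpfifo *src, u32 num)
  {
          mem_wr_n(g, gpfifo_mem, start * sizeof(*src),
                   src, num * sizeof(*src));
  }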

JIRA DNVGPU-21

Change-Id: If271ca8e7e34235f00d31855dbccf77c0008e10b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145923
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-20 07:45:33 -07:00
Konsta Holtta
75f6a1dff4 gpu: nvgpu: add vidmem allocation API
Add in-nvgpu APIs for allocating and freeing mem_descs in video memory.
Changes for gmmu tables etc. will be added in upcoming changes.

Video memory is allocated via nvmap by initially registering the
aperture size to it and binding it to a struct device, and then going
via the usual dma alloc. This API also allows fixed-address allocations,
meant for reserving special memory areas at boot.

The aperture registration is skipped completely if vidmem isn't found
for the particular device.

gk20a_gmmu_alloc_attr() still uses sysmem, and the unmap/free paths
select internally the correct path by the mem_desc's aperture.

Video memory allocation is off by default, and can be turned on with
CONFIG_GK20A_VIDMEM.
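
In outline, the allocation entry point might look like this sketch
(names hypothetical; the real path goes through nvmap and the DMA API):

  #ifdef CONFIG_GK20A_VIDMEM
  static int gmmu_alloc_vid(struct gk20a *g, size_t size,
                            struct mem_desc *mem)
  {
          if (!g->vidmem_size)
                  return -ENOSYS;  /* no vidmem found on this device */
          mem->aperture = APERTURE_VIDMEM;
          return vidmem_dma_alloc(g, size, mem);
  }
  #else
  static int gmmu_alloc_vid(struct gk20a *g, size_t size,
                            struct mem_desc *mem)
  {
          return -ENOSYS;          /* vidmem support compiled out */
  }
  #endif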

JIRA DNVGPU-16

Change-Id: I77eae5ea90cbed6f4b5db0da86c5f70ddf2a34f9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-15 09:34:07 -07:00
Konsta Holtta
987de66583 gpu: nvgpu: optimize mem_desc accessor loops
Instead of going via gk20a_mem_{wr,rd}32() on each iteration, do direct
memcpy/memset with sysmem, and minimize the enter/exit overhead with
vidmem.
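
Conceptually the optimization looks like this sketch (helper names
hypothetical): sysmem gets a plain memcpy, and vidmem opens its access
window once per call instead of once per word:

  static void mem_wr_n(struct gk20a *g, struct mem_desc *mem,
                       u32 offset, void *src, u32 size)
  {
          if (mem->aperture == APERTURE_SYSMEM) {
                  memcpy((u8 *)mem->cpu_va + offset, src, size);
          } else {
                  window_enter(g, mem);   /* pay the setup cost once */
                  window_copy_in(g, mem, offset, src, size);
                  window_exit(g, mem);
          }
  }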

JIRA DNVGPU-23

Change-Id: I5437e35f8393a746777a40636c1e9b5d93ced1f6
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1159524
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-13 07:42:26 -07:00
Alex Waterman
5f36481371 gpu: nvgpu: Remove dead priv_cmdbuf code
Remove the gp_get and gp_put pointers from the priv_cmdbuf code. These
pointers appear to track the position of the priv_cmdbuf in the gp_fifo.
However, these pointers are not used for anything nor are they needed for
anything in the future. This code appears to be a relic left over from the
past.

Change-Id: Ibed1a6d51fa0cac12c5e0429760e8e2f611fc899
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1161859
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-13 07:38:14 -07:00
Konsta Holtta
d215bc1107 gpu: nvgpu: detect vidmem configuration from HW
Read video memory size from hardware during initialization for devices
that support it.

JIRA DNVGPU-14

Change-Id: If190f2d89f7148520ee274ca674f972987c8056d
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157215
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-08 12:05:05 -07:00
Konsta Holtta
8432f6d80a gpu: nvgpu: cache whole bar0_window for mem accesses
Save the whole bar0 window register value, which also encodes the
target aperture (vid/sys mem), instead of only the base address,
which could overlap between the two.

JIRA DNVGPU-23

Change-Id: I2ccbea0e1f7c7310c1ca6b158afafe8fd974a615
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1159523
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-07 09:24:14 -07:00
Konsta Holtta
3e431e26c5 gpu: nvgpu: add PRAMIN support for mem accessors
To support vidmem, implement a way to access buffers via the PRAMIN
window, instead of only the kernel-mapped sysmem buffers used for iGPU
until now. Depending on the buffer aperture, choose between the two
access types in the buffer memory accessor functions.

vmap()/vunmap() pairs are no-ops for buffers that can't be cpu-mapped.

Two uses of DMA_ATTR_READ_ONLY are removed in the ucode loading path
so that those buffers can also be written via the PRAMIN indirection,
in addition to the CPU.
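
The per-word selection might read like this sketch (pramin_rd32() is a
hypothetical stand-in for the BAR0 PRAMIN access path):

  static u32 mem_rd32(struct gk20a *g, struct mem_desc *mem, u32 w)
  {
          if (mem->aperture == APERTURE_SYSMEM)
                  return ((u32 *)mem->cpu_va)[w]; /* kernel mapping */
          return pramin_rd32(g, mem, w);          /* vidmem via PRAMIN */
  }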

JIRA DNVGPU-23

Change-Id: I282dba6741c6b8224bc12e69c1fb3936bde7e6ed
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1141314
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-24 12:39:06 -07:00
Konsta Holtta
dc45473eeb gpu: nvgpu: use mem_desc in priv_cmd_entry
Replace the plain cpu pointer accesses with gk20a_mem_wr32(), and use a
reference to the underlying mem_desc (within priv_cmd_queue) paired with
an offset, for buffer aperture flexibility.

JIRA DNVGPU-21
JIRA DNVGPU-23

Change-Id: I317672c94bb682bb895f9ed3e8116729c8bb7f4b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1145922
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-18 11:54:34 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(as until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some
relevant call sites are changed to use the accessor functions instead
of raw cpu pointers (e.g., when memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.
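
As a signature-level sketch of the resulting accessor family (exact
parameter types may differ from the real headers):

  u32  gk20a_mem_rd32(struct gk20a *g, struct mem_desc *mem, u32 w);
  void gk20a_mem_wr32(struct gk20a *g, struct mem_desc *mem, u32 w, u32 v);
  u32  gk20a_mem_rd(struct gk20a *g, struct mem_desc *mem, u32 offset);
  void gk20a_mem_wr(struct gk20a *g, struct mem_desc *mem, u32 offset, u32 v);
  void gk20a_mem_rd_n(struct gk20a *g, struct mem_desc *mem, u32 offset,
                      void *dst, u32 size);
  void gk20a_mem_wr_n(struct gk20a *g, struct mem_desc *mem, u32 offset,
                      void *src, u32 size);
  void gk20a_memset(struct gk20a *g, struct mem_desc *mem, u32 offset,
                    u32 value, u32 size);

  /* abstract the vmap()/vunmap() pairs */
  int  gk20a_mem_begin(struct gk20a *g, struct mem_desc *mem);
  void gk20a_mem_end(struct gk20a *g, struct mem_desc *mem);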

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add a separate IOCTL, NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE,
to allow setting preemption modes from the UMD.

Define the preemption modes in nvgpu.h and use them everywhere;
remove the mode definitions from mm_gk20a.h.

Also, we currently support setting only one preemption mode in a
channel, but it is possible to have multiple preemption modes (one for
graphics and one for compute) set simultaneously.

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode).

The NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE API also supports
setting two separate preemption modes, i.e. one for graphics
and one for compute.

Make the necessary changes in code to support two preemption
modes.
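
The UAPI arguments are then shaped roughly like this (field names
illustrative; the real definitions live in nvgpu.h):

  /* Two independent preemption modes, set in a single IOCTL call;
   * a zero value could mean "leave this mode unchanged". */
  struct nvgpu_preemption_mode_args {
          u32 graphics_preempt_mode;  /* e.g. WFI or GFXP */
          u32 compute_preempt_mode;   /* e.g. WFI, CTA or CILP */
  };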

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Thomas Fleury
93678f571c gpu: nvgpu: Add trace and debugfs for sched params
JIRA EVLR-244
JIRA EVLR-318

Change-Id: Ie95f42212dadcf2d0c1737eeb28812afb03b712f
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120603
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-05 09:25:02 -07:00
Konsta Holtta
98679b0ba4 gpu: nvgpu: adapt gk20a_mm_entry for mem_desc
For the upcoming vidmem refactor, replace struct gk20a_mm_entry's
contents, which were identical to struct mem_desc, with a struct
mem_desc member. This makes it possible to use the page table buffers
like the other buffers too.

JIRA DNVGPU-23
JIRA DNVGPU-20

Change-Id: I714ee5dcb33c27aaee932e8e3ac367e84610b102
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1139694
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-03 13:27:22 -07:00
Alex Waterman
8891fd8267 gpu: nvgpu: Add gk20a_gmmu_fixed_map() function
Add a function to allow the kernel to do fixed mappings. Necessary
for the semaphore functionality since there needs to be a common
address in each VM for the semaphores.

Bug 1732449
JIRA DNVGPU-12

Change-Id: I2b451db2d3cb3c003d951f7b0ffc87f6c91db7dc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133789
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-29 14:43:48 -07:00
Terje Bergstrom
b10e02f537 gpu: nvgpu: Program NISO sysmem flush addr
Program the sysmem flush address to prevent random accesses to
address 0.

Change-Id: I886170395f036805f02e0bce7ecd3c8c46b921df
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129216
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-04-25 08:16:10 -07:00
Terje Bergstrom
7d8e219389 gpu: nvgpu: Use sysmem aperture for SoC memory
In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it
has to be accessed as sysmem.

Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
2016-04-15 08:50:34 -07:00
Alex Van Brunt
a596f0bd6d debugfs: Pass bool pointer to debugfs_create_bool()
Port the change 621a5f7ad9cd1ce7933f1d302067cbd58354173c from kernel.org
to the nvgpu driver.
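
After that kernel.org change, debugfs_create_bool() takes a bool
pointer instead of a u32 pointer, so call sites change along these
lines (node name and parent are illustrative):

  #include <linux/debugfs.h>

  static bool timeouts_enabled = true;  /* was: static u32 ... */

  debugfs_create_bool("timeouts_enabled", S_IRUGO | S_IWUSR,
                      parent, &timeouts_enabled);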

bug 200187033

Change-Id: I7d742f614161d9d4ed59c4216d7c730d57ef4116
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1118397
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-14 06:51:52 -07:00
Terje Bergstrom
9b5427da37 gpu: nvgpu: Support GPUs with no physical mode
Support GPUs which cannot choose between SMMU and physical
addressing.

Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-04-13 13:12:41 -07:00
Peter Daifuku
6eeabfbdd0 gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch
Add support for SMPC and HWPM context switching when virtualized

Bug 1648200
JIRASW EVLR-219
JIRASW EVLR-253

Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1122034
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-08 12:34:50 -07:00
Terje Bergstrom
e8bac374c0 gpu: nvgpu: Use device instead of platform_device
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.

Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
2016-04-08 09:42:41 -07:00
Peter Daifuku
37155b65f1 gpu: nvgpu: support for hwpm context switching
Add support for hwpm context switching

Bug 1648200

Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1120450
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 11:05:49 -07:00
Vishal Annapurve
cd29c45e67 gpu: nvgpu: Fix compilation with CONFIG_DEBUG_FS disabled
This change fixes issues with kernel compilation when
CONFIG_DEBUG_FS is disabled.
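
The usual shape of such a fix is to compile debugfs-only code
conditionally and stub it out otherwise (a generic sketch, not the
exact hunks of this change):

  #ifdef CONFIG_DEBUG_FS
  void gk20a_debug_init(struct device *dev);  /* hypothetical name */
  #else
  static inline void gk20a_debug_init(struct device *dev) { }
  #endif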

Bug 1737085

Change-Id: I74719674d07ae071e3df99b0dda249b54173f40b
Signed-off-by: Vishal Annapurve <vannapurve@nvidia.com>
Reviewed-on: http://git-master/r/1024167
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sandeep Trasi <strasi@nvidia.com>
2016-03-29 02:00:58 -07:00
Alex Waterman
fbc21ed2ee gpu: nvgpu: split address space for fixed allocs
Allow a special address space node to be split out from the
user address space for fixed allocations. A debugfs node,

  /d/<gpu>/separate_fixed_allocs

controls this feature. To enable it:

  # echo <SPLIT_ADDR> > /d/<gpu>/separate_fixed_allocs

Where <SPLIT_ADDR> is the address to do the split on in the
GVA address range. This will cause the split to be made in
all subsequent address space ranges that get created until it
is turned off. To turn this off just echo 0x0 into the same
debugfs node.

Change-Id: I21a3f051c635a90a6bfa8deae53a54db400876f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1030303
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-25 13:19:17 -07:00
Konsta Holtta
db7095ce51 gpu: nvgpu: bitmap allocator for comptags
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.

The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.

This commit partially reverts the combination of the above commit and
one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, which fixed
a bug which is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is possible,
so that is also restored.

The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.
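
A comptag-only bitmap allocator reduces to find-and-set over a fixed
bitmap, roughly as below (using the kernel's bitmap helpers; the
surrounding locking is omitted):

  #include <linux/bitmap.h>

  /* Allocate 'len' contiguous comptag lines; returns 0 and the start
   * position in *offset on success, -ENOMEM when full. */
  static int comptag_alloc(unsigned long *bitmap, unsigned long size,
                           unsigned long len, unsigned long *offset)
  {
          unsigned long pos = bitmap_find_next_zero_area(bitmap, size,
                                                         0, len, 0);
          if (pos >= size)
                  return -ENOMEM;
          bitmap_set(bitmap, pos, len);
          *offset = pos;
          return 0;
  }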

Bug 200145635

Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 17:44:27 -08:00
Terje Bergstrom
9812bd5eea gpu: nvgpu: Control comptagline assignment from kernel
On Maxwell comptaglines are assigned per 128k, but preferred big page
size for graphics is 64k. Bit 16 of GPU VA is used for determining
which half of comptagline is used.

This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VA. In one mapping the page might be mapped to
lower half of 128k comptagline, and in another mapping the page might
be mapped to upper half.

Turn on the mode where the MSB of the comptagline in the PTE is used
instead of bit 16 for determining the comptagline lower/upper half
selection.
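
To see why bit 16 is a problem, consider the old selection as a sketch:

  /* Old scheme: GPU VA bit 16 picks the 64k half of a 128k
   * comptagline; bit set means the upper half. */
  static bool upper_half(u64 gpu_va)
  {
          return (gpu_va >> 16) & 1;
  }

The same physical page mapped at VA 0x10000 selects the upper half,
but mapped at VA 0x20000 selects the lower half, so arbitrary fixed-VA
mappings of one page cannot agree on a comptagline half.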

Bug 1704834

Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-01-05 07:50:02 -08:00
Deepak Nibade
2d40ebb1ca gpu: nvgpu: rework private command buffer free path
We currently allocate private command buffers (wait_cmd
and incr_cmd) before submitting the job but we never
free them explicitly.
When private command queue of the channel is full, we
then try to recycle/remove free command buffers.
But this recycling happens during submit path, and
hence that particular submit path takes much longer

Rework this as below :
- add reference of command buffers to job structure
- when job completes, free the command buffers
  explicitly
- remove the code to recycle buffers since it should
  not be needed now

Note that command buffers need to be freed in the order of
their allocation. Ensure this with an error print before
freeing the command buffer entry.
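
In outline, the reworked ownership looks like this sketch (structure
and helper names illustrative):

  struct job {
          struct priv_cmd_entry *wait_cmd;  /* referenced by the job */
          struct priv_cmd_entry *incr_cmd;
  };

  static void job_finish(struct channel *c, struct job *job)
  {
          /* Free explicitly at completion, in allocation order. */
          free_priv_cmd_entry(c, job->wait_cmd);
          free_priv_cmd_entry(c, job->incr_cmd);
  }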

Bug 200141116
Bug 1698667

Change-Id: Id4b69429d7ad966307e0d122a71ad55076684307
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/827638
(cherry picked from commit c6cefd69b71c9b70d6df5343b13dfcfb3fa99598)
Reviewed-on: http://git-master/r/835802
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-23 08:33:01 -08:00
Sami Kiminki
9d2c9072c8 gpu: nvgpu: User-space managed address space support
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.

When an address space is marked as userspace-managed, the following
changes are in effect:

- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
  except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
  ref increments at kickoffs and decrements at job completion are
  skipped.
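
At map time, the first rule above might be enforced along these lines
(flag and helper names hypothetical):

  static int vm_map(struct vm *vm, struct map_args *args)
  {
          /* Userspace-managed address spaces accept only
           * fixed-address mappings. */
          if (vm->userspace_managed &&
              !(args->flags & MAP_FLAG_FIXED_OFFSET))
                  return -EINVAL;

          return do_map(vm, args);
  }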

Bug 1614735
Bug 1623949
Bug 1660392

Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-18 09:45:07 -08:00
Sami Kiminki
30632cec54 gpu: nvgpu: Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows the
userspace to be agnostic to the dmabuf implementation, as the generic
dmabuf fd interface does not have a reliable way for buffer
identification.

Bug 1614735
Bug 1623949
Bug 1660392

Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-11-17 12:38:44 -08:00
Konsta Holtta
411c3a9a4f gpu: nvgpu: use a separate big vm for cde
Allocate a separate VM for CDE channels instead of using the system
(PMU) vm, and make it much bigger than the PMU's to fit the maximum
number of CDE channels there.

Bug 1566740

Change-Id: I4f487c40c9ec79cc9ffb880b0ecd3f47eb450336
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/815149
Reviewed-by: Automatic_Commit_Validation_User
2015-11-10 23:19:45 -08:00
Terje Bergstrom
8452348539 gpu: nvgpu: Do not use G_ELPG_FLUSH
G_ELPG_FLUSH is protected in some chips. Use L2 flush operations
instead.

Bug 1698618

Change-Id: I984a8ace8bcd0ad2d4a4e2d63af75a342bdeb75a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/828656
(cherry picked from commit ba9075fa43975112a221d37d246f0b8f5af40fab)
Reviewed-on: http://git-master/r/829415
2015-11-10 10:33:39 -08:00
Terje Bergstrom
cccd038f8d Revert "gpu: nvgpu: Implement sparse PDEs"
This reverts commit c44947b1314bb2afa1f116b4928f4e8a4c34d7b1. It
introduces a regression in T124.

Bug 1702063

Change-Id: I64e333f66d98bd4dbcfe40a60f1aa825d90376a5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/830786
GVS: Gerrit_Virtual_Submit
2015-11-09 10:11:23 -08:00
Terje Bergstrom
4b5c08f4c0 gpu: nvgpu: Implement sparse PDEs
Change-Id: Idfeb3bf95751902d52a895d77045a529f69abc0b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/758651
GVS: Gerrit_Virtual_Submit
2015-10-30 16:36:06 -07:00
Aingara Paramakuru
ee18a3ae26 gpu: nvgpu: vgpu: re-factor gr ctx management
Move the gr ctx management to the GPU HAL. Also,
add support for a new interface to allocate gr ctxsw
buffers.

Bug 1677153

Change-Id: I5a7980acf4de0de7dbd94b7dd20f91a6196dc989
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/806961
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/817009
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-22 07:39:56 -07:00
Seshendra Gadagottu
5b7b59714a gpu: nvgpu: add support to remove bar2 mm
Add support for removing the bar2 mm on gpu module
removal.

Change-Id: Id5f680b1abf7056da9871d5460d9fbc40422673e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/814571
(cherry picked from commit e7c6c87dd6b0893d26a9a3b4568121a691e1eb3c)
Reviewed-on: http://git-master/r/815429
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-12 18:22:12 -07:00
Jussi Rasanen
f4b6d4d176 gpu: nvgpu: fix ctag computation overflow with 8GB
Bug 1689976

Change-Id: I97ad14c9698030b630d3396199a2a5296c661392
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/806590
(cherry picked from commit c90cd5ee674d6357db3be2243950ff0d81ef15ef)
Reviewed-on: http://git-master/r/808249
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-10-06 13:30:23 -07:00
Sami Kiminki
eade809c26 gpu: nvgpu: Separate kernel and user GPU VA regions
Separate the kernel and userspace regions in the GPU virtual address
space. Do this by reserving the last part of the GPU VA aperture for
the kernel, and extend the GPU VA aperture accordingly for regular
address spaces. This prevents the kernel from polluting the
userspace-visible GPU VA regions, and thus makes the success of
fixed-address mapping more predictable.
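
Schematically the split is simple arithmetic over the aperture
(numbers illustrative):

  /* Carve the kernel region out of the top of the GPU VA aperture. */
  const u64 va_limit        = 1ULL << 40;  /* total GPU VA */
  const u64 kernel_reserved = 1ULL << 32;  /* illustrative size */
  const u64 user_va_end     = va_limit - kernel_reserved;
  /* user mappings: [0, user_va_end); kernel: [user_va_end, va_limit) */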

Bug 200077571

Change-Id: I63f0e73d4c815a4a9fa4a9ce568709974690ef0f
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/747191
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-09-07 12:37:15 -07:00
Alex Waterman
ca26ce6f8c gpu: nvgpu: Increase VA space to 40 bits
Now that the buddy allocator is merged we can increase the VA space
without dramatically increasing memory usage by the allocator.

40 bits is the max VA space available on gk20a and gm20b.

Change-Id: I7bc8d86e35b28f041e9a435f2571c8288970c8ee
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/745076
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/771152
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
2015-07-20 11:33:07 -07:00
Terje Bergstrom
63714e7cc1 gpu: nvgpu: Implement priv pages
Implement support for privileged pages. Use them for kernel-allocated buffers.

Change-Id: I720fc441008077b8e2ed218a7a685b8aab2258f0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/761919
2015-07-03 17:59:12 -07:00
Sami Kiminki
e7ba93fefb gpu: nvgpu: Initial MAP_BUFFER_BATCH implementation
Add batch support for mapping and unmapping. Batching essentially
helps transform some per-map/unmap overhead to per-batch overhead,
namely gk20a_busy()/gk20a_idle() calls, GPU L2 flushes, and GPU TLB
invalidates. Batching with size 64 has been measured to yield >20x
speed-up in low-level fixed-address mapping microbenchmarks.
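
The usage pattern is batch-begin, many maps, batch-end, so the
per-batch costs are paid once (a sketch; the real nvgpu entry points
and types may differ):

  struct mapping_batch batch;
  int i;

  batch_start(&batch);                 /* one gk20a_busy(), etc. */
  for (i = 0; i < 64; i++)
          map_buffer_fixed(vm, &bufs[i], &batch);  /* no per-map flush */
  batch_finish(vm, &batch);            /* one L2 flush + one TLB
                                        * invalidate for all 64 maps */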

Bug 1614735
Bug 1623949

Change-Id: Ie22b9caea5a7c3fc68a968d1b7f8488dfce72085
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/733231
(cherry picked from commit de4a7cfb93e8228a4a0c6a2815755a8df4531c91)
Reviewed-on: http://git-master/r/763812
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-30 08:35:23 -07:00
Alex Waterman
099af76674 gpu: nvgpu: Remove simulation WAR
The WAR put into simulation to avoid a simulator crash can now be
removed (c85be1a0968de813fe9b99ebd5c261dcb0ca8875). The first issue
with the failing test was found to be GPFIFO entries that were not
invalid.

Other issues are still present with the test and are fixed in a
later commit.

Change-Id: I7d3def2e384eede82cfc82b961f09ca23b239d30
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/753378
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/755815
Reviewed-by: Automatic_Commit_Validation_User
2015-06-11 10:16:47 -07:00
Terje Bergstrom
c5367d7350 gpu: nvgpu: Do not pre-allocate cmdbuf entries
Do not preallocate cmdbuf tracking entries. Allocate them only when
needed.

Bug 200104160

Change-Id: I12f8392723c301a368af1e280893ff993480477f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743953
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/755148
2015-06-09 11:14:34 -07:00
Terje Bergstrom
de48cc83b4 gpu: nvgpu: Fix build problem by including slab.h
Change-Id: I2346ed48bf82c032194264bab999097bf86404e2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/745885
(cherry picked from commit aba5f1ce08c2813c8bd570f36a5d15aa2a1aeaf8)
Reviewed-on: http://git-master/r/753286
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:26:32 -07:00
Terje Bergstrom
0dc66952e4 gpu: nvgpu: Use vmalloc only when size >4K
When the allocation size is 4k or below, we should use kmalloc; vmalloc
should be used only for larger allocations.

Introduce nvgpu_alloc, which checks the size and decides which API
to use.
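
A minimal sketch of that check (the real nvgpu_alloc may differ in
signature and in how it requests zeroed memory):

  #include <linux/slab.h>
  #include <linux/vmalloc.h>

  static void *nvgpu_alloc(size_t size, bool clear)
  {
          if (size <= PAGE_SIZE)  /* small: physically contiguous kmalloc */
                  return clear ? kzalloc(size, GFP_KERNEL)
                               : kmalloc(size, GFP_KERNEL);
          /* large: virtually contiguous vmalloc */
          return clear ? vzalloc(size) : vmalloc(size);
  }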

Change-Id: I593110467cd319851b27e57d1bfe8d228d3f2909
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/743974
(cherry picked from commit 7f56aa1f0ecafbfde7286353b60e25e494674d26)
Reviewed-on: http://git-master/r/753276
Reviewed-by: Automatic_Commit_Validation_User
2015-06-06 07:24:11 -07:00
Bharat Nihalani
b8aa486109 Revert "Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space""""
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.

Bug 200112195

Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-06-04 10:41:00 -07:00
Bharat Nihalani
1d8fdf5695 Revert "Revert "Revert "gpu: nvgpu: New allocator for VA space"""
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.

Bug 200106514

Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
2015-06-02 20:18:55 -07:00
Alex Waterman
01f359f3f1 Revert "Revert "gpu: nvgpu: New allocator for VA space""
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.

The original commit was actually fine.

Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-05-19 13:09:00 -07:00