Commit Graph

294 Commits

Author SHA1 Message Date
Thomas Fleury
2caea7576a gpu: nvgpu: vgpu: add clear single SM error state
Add support for clearing single SM error state for CUDA debugger.
In addition to clearing local copy of SM error state,
vgpu_gr_clear_sm_error_state now sends a command to RM server
(TEGRA_VGPU_CMD_CLEAR_SM_ERROR_STATE), to clear global ESR and
warp ESR.

Bug 1791111

Change-Id: I3a1f0644787fd900ec59a0e7974037d46a603487
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1296311
(cherry picked from commit fd07e03c3d086f396e4d65575c576a4dd68c920a)
Reviewed-on: http://git-master/r/1299060
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Cory Perry <cperry@nvidia.com>
Tested-by: Cory Perry <cperry@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-09 10:44:56 -08:00
Thomas Fleury
6c35cebdcb gpu: nvgpu: vgpu: suspend/resume contexts
Add ability to suspend/resume contexts for a debug session
(NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS), in virtualized
case:
- added hal function to resume contexts.
- added vgpu support for suspend contexts, i.e. build a list
of channel ids, and send TEGRA_VGPU_CMD_SUSPEND_CONTEXTS
- added vgpu support for resume contexts, i.e. build a list
of channel ids, and send TEGRA_VGPU_CMD_RESUME_CONTEXTS

Bug 1791111

Change-Id: Icc1c00d94a94dab6384ac263fb811c00fa4b07bf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1294761
(cherry picked from commit d17a38eda312ffa92ce92e5bafc30727a8b76c4e)
Reviewed-on: http://git-master/r/1299059
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Cory Perry <cperry@nvidia.com>
Tested-by: Cory Perry <cperry@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-09 10:44:55 -08:00
Richard Zhao
bc47d82229 gpu: nvgpu: add NVGPU_GPU_FLAGS_SUPPORT_MAP_COMPBITS
native gpu driver supports map compbits but vgpu does not.

Bug 1778448
Bug 200275051
JIRA VFND-3513

Change-Id: I433a6f8631b495875ba899af9609203ab36187ef
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1314065
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-08 11:35:24 -08:00
Richard Zhao
baf464779e gpu: nvgpu: vgpu: remove cycle stats from characteristics
vgpu does not support cycle stats.

Bug 1781434
Bug 200275051
JIRA VFND-3513

Change-Id: Id1d566027913632fc8a4f0285609be5f56b26288
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1314064
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-08 09:55:43 -08:00
Alex Waterman
707ea45e0f gpu: nvgpu: kmem abstraction and tracking
Implement kmem abstraction and tracking in nvgpu. The abstraction
helps move nvgpu's core code away from being Linux dependent and
allows kmem allocation tracking to be done for Linux and any other
OS supported by nvgpu.

Bug 1799159
Bug 1823380

Change-Id: Ieaae4ca1bbd1d4db4a1546616ab8b9fc53a4079d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1283828
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-03 10:34:48 -08:00
Alex Waterman
76b78b6fdc gpu: nvgpu: Remove nvgpu_gpuid_t18x.h
Remove nvgpu_gpuid_t18x.h since this file is now visible. Migrate
the relevant definitions and defines into their expected places and
make the code use the real defines. No longer is hiding t18x specific
stuff necessary.

Bug 1799159

Change-Id: I47fa2392e46fdb7aacc70aeb0cc8c3f5ca0dc22f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1300976
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-02 18:48:41 -08:00
vishnu Reddy
f9b2c30b36 gpu: nvgpu: Add and move FECS trace flag
Added fecs trace characteristic flag to vgpu to determine for
platform support

Also moved fecs flag from ctxsw init to fecs init for native linux.
This makes flag init uniform across both platforms.

JIRA EVLR-993

Change-Id: Id62f31954cb5d7028ef01a199548f5f908b51eb0
Signed-off-by: Vishnu Reddy Mandalapu <vmandalapu@nvidia.com>
Reviewed-on: http://git-master/r/1305045
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-02 18:15:47 -08:00
Konsta Holtta
f1072a28be gpu: nvgpu: add worker for watchdog and job cleanup
Implement a worker thread to replace the delayed works in channel
watchdog and job cleanups. Watchdog runs by polling the channel states
periodically, and job cleanup is performed on channels that are appended
on a work queue consumed by the worker thread. Handling both of these
two in the same thread makes it impossible for them to cause a deadlock,
as has previously happened.

The watchdog takes references to channels during checking and possibly
recovering channels. Jobs in the cleanup queue have an additional
reference taken which is released after the channel is processed. The
worker is woken up from periodic sleep when channels are added to the
queue.

Currently, the queue is only used for job cleanups, but it is extendable
for other per-channel works too. The worker can also process other
periodic actions dependent on channels.

Neither the semantics of timeout handling or of job cleanups are yet
significantly changed - this patch only serializes them into one
background thread.

Each job that needs cleanup is tracked and holds a reference to its
channel and a power reference, and timeouts can only be processed on
channels that are tracked, so the thread will always be idle if the
system is going to be suspended, so there is currently no need to
explicitly suspend or stop it.

Bug 1848834
Bug 1851689
Bug 1814773
Bug 200270332
Jira NVGPU-21

Change-Id: I355101802f50841ea9bd8042a017f91c931d2dc7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1297183
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-03-02 17:51:03 -08:00
Thomas Fleury
ac8cea9351 gpu: nvgpu: ignore set preempt mode to default
In native case, attempting to set graphics/compute preempt mode
to default is ignored if preemption mode has already been set
for the context. In virtualized case an error is currently
returned.
Align behaviour for native and virtualized case, by ignoring
such request.

Bug 200186530

Change-Id: Ieb3a37107bdbd3284804ee9fd392786409224082
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1306506
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-24 12:14:17 -08:00
Deepak Nibade
8cdb91c527 gpu: nvgpu: remove use of DEFINE_MUTEX()
API DEFINE_MUTEX() is defined in Linux and might
not be available in other OSs.
Hence remove its usage from nvgpu

Declare and explicitly initialize below mutexes
for both nvgpu and vgpu
g->mm.priv_lock
g->mm.tlb_lock

Jira NVGPU-13

Change-Id: If72885a6da0227a1552303206172f1f2b751471d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1298042
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-22 04:15:08 -08:00
Deepak Nibade
8ee3aa4b31 gpu: nvgpu: use common nvgpu mutex/spinlock APIs
Instead of using Linux APIs for mutex and spinlocks
directly, use new APIs defined in <nvgpu/lock.h>

Replace Linux specific mutex/spinlock declaration,
init, lock, unlock APIs with new APIs
e.g
struct mutex is replaced by struct nvgpu_mutex and
mutex_lock() is replaced by nvgpu_mutex_acquire()

And also include <nvgpu/lock.h> instead of including
<linux/mutex.h> and <linux/spinlock.h>

Add explicit nvgpu/lock.h includes to below
files to fix complilation failures.
gk20a/platform_gk20a.h
include/nvgpu/allocator.h

Jira NVGPU-13

Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1293187
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-22 04:15:02 -08:00
Peter Boonstoppel
907adfd785 gpu: nvgpu: Add NVGPU_IOCTL_CHANNEL_SET_BOOSTED_CTX
This ioctl can be used on gp10b to set a flag in the context header
indicating this context should be run at elevated clock
frequency. FECS ctxsw ucode will read this flag as part of the context
switch and will request higher GPU clock frequencies from BPMP for the
duration of the context execution.

Bug 1819874

Change-Id: I84bf580923d95585095716d49cea24e58c9440ed
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1292746
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-14 14:54:46 -08:00
Aparna Das
28b0d6cfa8 gpu: nvgpu: remove call to invalidate tlb
Guest doesn't explicitly send command to the RM server
to invalidate tlb which is done implicitly when mapping
or unmapping buffer. Remove support for this call.

Bug 1665111

Change-Id: Icf2edae7feffa35b1dbf87c227b3e98b506e6519
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: http://git-master/r/1287728
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-14 11:15:27 -08:00
Alex Waterman
aa36d3786a gpu: nvgpu: Organize semaphore_gk20a.[ch]
Move semaphore_gk20a.c drivers/gpu/nvgpu/common/ since the semaphore
code is common to all chips.

Move the semaphore_gk20a.h header file to drivers/gpu/nvgpu/include/nvgpu
and rename it to semaphore.h. Also update all places where the header
is inluced to use the new path.

This revealed an odd location for the enum gk20a_mem_rw_flag. This should
be in the mm headers. As a result many places that did not need anything
semaphore related had to include the semaphore header file. Fixing this
oddity allowed the semaphore include to be removed from many C files that
did not need it.

Bug 1799159

Change-Id: Ie017219acf34c4c481747323b9f3ac33e76e064c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1284627
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-13 18:14:45 -08:00
Alex Waterman
7e403974d3 gpu: nvgpu: Simplify ref-counting on VMs
Simplify ref-counting on VMs: take a ref when a VM is bound to a
channel and drop a ref when a channel is freed.

Previously ref-counts were scattered over the driver. Also the CE
and CDE code would bind channels with custom rolled code. This was
because the gk20a_vm_bind_channel() function took an as_share as
the VM argument (the VM was then inferred from that as_share).
However, it is trivial to abtract that bit out and allow a central
bind channel function that just takes a VM and a channel.

Bug 1846718

Change-Id: I156aab259f6c7a2fa338408c6c4a3a464cd44a0c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1261886
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-02-07 14:54:02 -08:00
Alex Waterman
b9b94c073c gpu: nvgpu: Remove separate fixed address VMA
Remove the special VMA that could be used for allocating fixed
addresses. This feature was never used and is not worth maintaining.

Bug 1396644
Bug 1729947

Change-Id: I06f92caa01623535516935acc03ce38dbdb0e318
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265302
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-31 16:23:07 -08:00
Alex Waterman
d630f1d99f gpu: nvgpu: Unify the small and large page address spaces
The basic structure of this patch is to make the small page allocator
and the large page allocator into pointers (where they used to be just
structs). Then assign each of those pointers to the same actual
allocator since the buddy allocator has supported mixed page sizes
since its inception.

For the rest of the driver some changes had to be made in order to
actually support mixed pages in a single address space.

1. Unifying the allocation page size determination

   Since the allocation and map operations happen at distinct
   times both mapping and allocation of GVA space must agree
   on page size. This is because the allocation has to separate
   allocations into separate PDEs to avoid the necessity of
   supporting mixed PDEs.

   To this end a function __get_pte_size() was introduced which
   is used both by the balloc code and the core GPU MM code. It
   determines page size based only on the length of the mapping/
   allocation.

2. Fixed address allocation + page size

   Similar to regular mappings/GVA allocations fixed address
   mapping page size determination had to be modified. In the
   past the address of the mapping determined page size since
   the address space split was by address (low addresses were
   small pages, high addresses large pages). Since that is no
   longer the case the page size field in the reserve memory
   ioctl is now honored by the mapping code. When, for instance,
   CUDA makes a memory reservation it specifies small or large
   pages. When CUDA requests mappings to be made within that
   address range the page size is then looked up in the reserved
   memory struct.

   Fixed address reservations were also modified to now always
   allocate at a PDE granularity (64M or 128M depending on
   large page size. This prevents non-fixed allocations from
   ending up in the same PDE and causing kernel panics or GMMU
   faults.

3. The rest...

   The rest of the changes are just by products of the above.
   Lots of places required minor updates to use a pointer to
   the GVA allocator struct instead of the struct itself.

Lastly, this change is not truly complete. More work remains to be
done in order to fully remove the notion that there was such a thing
as separate address spaces for different page sizes. Basically after
this patch what remains is cleanup and proper documentation.

Bug 1396644
Bug 1729947

Change-Id: If51ab396a37ba16c69e434adb47edeef083dce57
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265300
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-31 16:23:07 -08:00
Aparna Das
bad0572cb1 gpu: nvgpu: vgpu: retrieve gpu load
Add support to send command to RM server to retrieve
GPU load.

Bug 200261903

Change-Id: Ie3d0ba7ec91317e9a2911f71613ad78d20f9c1fb
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: http://git-master/r/1275045
(cherry picked from commit 5a6c1de1e6997bfd803b4b95b3e44e282ba32f67)
Reviewed-on: http://git-master/r/1283279
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-24 16:24:12 -08:00
Thomas Fleury
f6a634ff24 gpu: nvgpu: use HAL to set TSG timeslice
Setting timeslice for virtualized case was not effective,
because both ioctls NVGPU_TSG_IOCTL_SET_TIMESLICE and
NVGPU_SCHED_IOCTL_TSG_SET_TIMESLICE were calling the
native function to set TSG timeslice.
- Fixed wrapper function to call HAL
- Defined HAL function for "native" set TSG timeslice
- Also, properly update timeout_us in TSG context, in
  virtualized case.

This change also moves the min/max bounds checking for
tsg timeslice into the native function implementation.
There is no sysfs node for these parameters for vgpu,
as RM server is ultimately responsible for this check.

Bug 200263575

Change-Id: Ibceab9427561ad58ec28abfff0c96ca8f592bdb9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1283180
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-16 12:15:23 -08:00
Alex Waterman
865514be2d gpu: nvgpu: Move gp10b HW headers
Move the gp10b HW headers to a new directory specially for them:

  include/nvgpu/hw/gp10b

And change the code to include like so:

  #include <nvgpu/hw/gp10b/hw_fb_gp10b.h>

This is part of the process to restructure the nvgpu driver.

Bug 1799159

Change-Id: Ic80ea5b7f5c280839e502e2178a345181f7a7ef9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1280326
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-11 12:44:14 -08:00
Alex Waterman
b928f10d37 gpu: nvgpu: Start re-organizing the HW headers
Reorganize the HW headers of gk20a. The headers are moved to a
new directory:

  include/nvgpu/hw/gk20a

And from the code are included like so:

  #include <nvgpu/hw/gk20a/hw_pwr_gk20a.h>

This is the first step in reorganizing all of the HW headers for
gm20b, gm206, etc. This is part of a larger effort to re-structure
and make the driver more readable and scalable.

Bug 1799159

Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-11 12:44:14 -08:00
Alex Waterman
6df3992b60 gpu: nvgpu: Move allocators to common/mm/
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.

This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().

Bug 1799159

Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-09 12:33:16 -08:00
Richard Zhao
e229514bec gpu: nvgpu: vgpu: receive event TEGRA_VGPU_EVENT_SM_ESR
- allocate gr.sm_error_state
- handle event of sm error state
- add callback of clear sm error state

JIRA VFND-3291
Bug 200257899

Change-Id: I49b9437013e8c65290750b7fe21fc6819ea93b1c
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1278397
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-06 15:02:24 -08:00
Richard Zhao
ecc3722aa1 gpu: nvgpu: vgpu: restructure event handling
Take interrupts as one kind of event message, and make it
easier to add new kind of events.

JIRA VFND-3291
Bug 200257899

Change-Id: I83482293230c0aa10b05caf61e249a042bf6653c
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1278396
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-06 15:02:24 -08:00
Aparna Das
f41740bf08 gpu: nvgpu: vgpu: no support for sparse mapping
Currently sparse mapping is not supported for gp10b
in virtualized environment. Modify gpu characteristics
to reflect non-implementation of this functionality.

Also fix return value in vgpu_gp10b_locked_gmmu_map()
on error condition.

Bug 200243373

Change-Id: Ia367b923b87738a5cad0617cdb074f5a24fb1c81
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: http://git-master/r/1269710
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-27 15:26:53 +05:30
Richard Zhao
d7f15a2b50 gpu: nvgpu: vgpu: fix va leak when call gk20a_vm_free_va
page size index needs to be set explicitly when call gk20a_vm_free_va.

Bug 200255799
JIRA VFND-3033

Change-Id: Ic23ea68905ea423173d1859fd100e7b2c82a1bcc
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1262590
(cherry picked from commit 918aea147b395f7337db348d2616fb4b195dc53a)
Reviewed-on: http://git-master/r/1263400
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-27 15:26:51 +05:30
Richard Zhao
cb78f5aa74 gpu: nvgpu: vgpu: add set_preemption_mode
Implement HAL callback set_preemption_mode

Bug 200238497
JIRA VFND-2683

Change-Id: I8fca8e1ba112d8782ce18f0899eca38a1d12b512
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1236976
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:26:50 +05:30
Richard Zhao
a862dd6122 gpu: nvgpu: vgpu: move to use vgpu_get_handle helper function
JIRA VFND-2103

Change-Id: Ic11cff40e64849cb6abb193bec54d03857433416
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1185205
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-27 15:26:19 +05:30
Konsta Holtta
bc45c8ef2b gpu: nvgpu: gp10x: support in-kernel vidmem mappings
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped to
gpu.

JIRA DNVGPU-18
JIRA DNVGPU-76

Change-Id: Icd675e83e3c28836f0ed8880425748697713bb0a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:26:18 +05:30
Terje Bergstrom
e175580d52 gpu: nvgpu: vgpu: Add CE engine to engine list
Initialize CE engine also for gp10b.

Change-Id: Ibce2f80b523a09fb1345995c03c5430f3b20844f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1170453
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Tested-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
2016-12-27 15:26:17 +05:30
Richard Zhao
2029134634 gpu: nvgpu: vgpu: manage gr_ctx as independent resource
gr_ctx will managed as independent resource in RM server
and vgpu can get a gr_ctx handle.

Bug 1702773

Change-Id: Ifceb44b7d9a1ba03fc2a4df847f4a78ac4c4a0d4
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1144934
(cherry picked from commit 0da3101d9b59fe1f9a47ce7b70b30cb8919f35ac)
Reviewed-on: http://git-master/r/1150707
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:26:16 +05:30
Deepak Nibade
6113c679a9 gpu: nvgpu: API to set preemption mode
Separate out new API gr_gp10b_set_ctxsw_preemption_mode()
which will check requested preemption modes and take appropriate
action for each preemption mode
This API will also do some sanity checking for valid
preemption modes and combinations

Define API set_preemption_mode() for gp10b which will set the
preemption modes passed as argument and then use
gr_gp10b_set_ctxsw_preemption_mode() and
update_ctxsw_preemption_mode() to update preemption mode

Legacy path from gr_gp10b_alloc_gr_ctx() will convert
flags NVGPU_ALLOC_OBJ_FLAGS_* into appropriate preemption modes
and then call gr_gp10b_set_ctxsw_preemption_mode()

New API set_preemption_mode() will use new flags
NVGPU_GRAPHICS/COMPUTE_PREEMPTION_MODE_* and set and update
ctxsw preemption mode

In gr_gp10b_update_ctxsw_preemption_mode(), update graphics
context to set CTA premption mode if mode
NVGPU_COMPUTE_PREEMPTION_MODE_CTA is set

Also, define preemption modes in nvgpu-t18x.h
and use them everywhere
Remove old definitions of modes from gr_gp10b.h

Bug 1646259

Change-Id: Ib4dc1fb9933b15d32f0122a9e52665b69402df18
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131806
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:24:54 +05:30
Terje Bergstrom
eada66b2a9 gpu: nvgpu: gp10b: Allow importing makefile via include
Refactor makefiles so that there is one makefile, and that file
can be included in the main nvgpu build.

Bug 1476801

Change-Id: I23ac451d695fc64064de2300e83b9d9487c52743
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1028353
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-12-27 15:22:11 +05:30
Richard Zhao
83e9c506ee gpu: nvgpu: vgpu: fix sparse warnings
Bug 200088648

Change-Id: I61be7b4787e9bc9bac310a8739977f43c38a67ee
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1000174
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:22:10 +05:30
Aingara Paramakuru
36fa64cab4 gpu: nvgpu: vgpu: update interface to free GR ctx
The server only releases ownership of the ctxsw buffer mappings
after the GR ctx has been released. Update the sequence to
account for this.

JIRA VFND-1117
Bug 1708163

Change-Id: I3aed015805b4ca51433e7d37ad32de2f8353999f
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/922817
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-27 15:22:10 +05:30
Aingara Paramakuru
9ab9436268 gpu: nvgpu: gp10b: map GfxP buffers as GPU cacheable
Some of the allocated buffers are used during normal graphics
processing. Mark them as GPU cacheable to improve performance.

Bug 1695718

Change-Id: I71d5d1538516e966526abe5e38a557776321597f
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/827087
(cherry picked from commit 60b40ac144c94e24a2c449c8be937edf8865e1ed)
Reviewed-on: http://git-master/r/828493
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:22:09 +05:30
Richard Zhao
b7de6b004b gpu: nvgpu: vgpu: set correct page size index for gp10b
VM server only know big page and small page, so convert
gmmu_page_size_kernel to according page size index.

JIRA VFND-890

Change-Id: Id1f932752b8ca33d14635ac9d71019364aa89dc4
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/816359
(cherry picked from commit 5bfc4a2a55889f5457bd34aa06861c042ee67421)
Reviewed-on: http://git-master/r/827131
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-27 15:22:09 +05:30
Aingara Paramakuru
9320d4711f gpu: nvgpu: vgpu: add interface to alloc ctxsw buffers
gp10b introduces support for preemption (GfxP and CILP).
Add a new interface to allow allocating buffers needed
to support this functionality.

Bug 1677153

Change-Id: I8578a7b0a4327f3496d852eeb8be5fc778e2c225
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/806963
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/817039
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:22:09 +05:30
Aingara Paramakuru
01ba044bdb gpu: nvgpu: vgpu: add gp10b support
Add support for gp10b in a virtualized environment.

Bug 1677153
VFND-693

Change-Id: I919ffa44c6773940a7a3411ee8bbc403a992b7cb
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/792556
Reviewed-on: http://git-master/r/806193
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-27 15:22:07 +05:30
Deepak Nibade
505b442551 gpu: nvgpu: acquire mutex for notifier read
We use &ch->error_notifier_mutex to protect
writes and free of error notifier
But we currently do not protect reading of
notifier in gk20a_fifo_set_ctx_mmu_error()
and vgpu_fifo_set_ctx_mmu_error()

Add new API gk20a_set_error_notifier_locked()
which is same as gk20a_set_error_notifier()
but without the locks.

In *_fifo_set_ctx_mmu_error() APIs, acquire
the mutex explicitly, and then use this new
API

gk20a_set_error_notifier() will now just call
gk20a_set_error_notifier_locked() within
a mutex

Bug 1824788
Bug 1844312

Change-Id: I1f3831dc63fe1daa761b2e17e4de3c155f505d6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1273471
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-12-27 01:24:35 -08:00
Sami Kiminki
425f99335b gpu: nvgpu: gk20a: Allow regops lists longer than 128
Process long regops lists in 4-kB fragments, overcoming the overly
low limit of 128 reg ops per IOCTL call. Bump the list limit to 1024
and report the limit in GPU characteristics.

Bug 200248726

Change-Id: I3ad49139409f32aea8b1226d6562e88edccc8053
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/1253716
(cherry picked from commit 22314619b28f52610cb8769cd4c3f9eb01904eab)
Reviewed-on: http://git-master/r/1266652
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-26 00:03:59 -08:00
Konsta Holtta
339a67b2e8 gpu: nvgpu: replace tsg list mutex with rwsem
Lock only for modifications to the tsg channel list, and allow multiple
concurrent readers.

Bug 1848834
Bug 1814773

Change-Id: Ie3938d4239cfe36a14211f4649ce72b7fc3e2fa4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1269579
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-20 15:15:51 -08:00
Sachit Kadle
c1750f45f5 gpu: nvgpu: add tsg_open HAL interface
Add HAL interface for TSG open, which is intended to
be called from the exisiting gk20a_tsg_open function.

The tsg_open entryoint is only implemented for vgpu,
as the server needs to clear metadata when a tsg is opened.

Bug 200215060

Change-Id: Icc8fd602f31e52d9fa9b2e7786b665b9e7b9294e
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1249218
(cherry picked from commit 35c86f7c796c6574d3dc336e20012ea5c16d7cb4)
Reviewed-on: http://git-master/r/1256468
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-12-19 15:39:47 -08:00
Richard Zhao
9bc735ac6a gpu: nvgpu: vgpu: fix va leak when call gk20a_vm_free_va
page size index needs to be set explicitly when call gk20a_vm_free_va.

Bug 200255799
JIRA VFND-3033

Change-Id: I376c63e724b8f59aee389c54ca1589683536f043
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1262586
(cherry picked from commit 82c05633f17fa094d8e08c8a0fa4bad2d3275268)
Reviewed-on: http://git-master/r/1263403
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-12-08 01:40:12 -08:00
Terje Bergstrom
d29afd2c9e gpu: nvgpu: Fix signed comparison bugs
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.

Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-16 21:35:36 -08:00
Peter Daifuku
1ba169d7db Revert "Revert "gpu: nvgpu: vgpu: alloc hwpm ctxt buf on client""
This reverts commit 5f1c2bc27f.

Added back now that matching RM server has been updated:

In hypervisor mode, all GPU VA allocations must be done by client;
fix this for the allocation of the hwpm ctxt buffer

Bug 200231611

Change-Id: Ie5ce2c2562401b1f00821231d37608e3fc30d4a4
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1252138
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-11-14 15:04:22 -08:00
Terje Bergstrom
8fa5e7c58a gpu: nvgpu: Remove IOCTL FREE_OBJ_CTX
We have never used the IOCTL FREE_OBJ_CTX. Using it leads to context
being only partially available, and can lead to use-after-free.

Bug 1834225

Change-Id: I9d2b632ab79760f8186d02e0f35861b3a6aae649
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1250004
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-11 11:47:42 -08:00
Terje Bergstrom
d09d259d74 gpu: nvgpu: vgpu: Do not overwrite err code on fail
vgpu_vm_alloc_share() wants to return -EINVAL if VMA areas requested
do not fulfill the criteria. The error code gets overwritten by a
call to vgpu_comm_sendrecv(), which makes vgpu_vm_alloc_share() always
return 0.

Change-Id: I93f56025f963d1d4ad2f9b06139fce742d3be41b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249961
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
2016-11-11 08:20:31 -08:00
Terje Bergstrom
5855fe26cb gpu: nvgpu: Do not post events to unbound channels
Change-Id: Ia1157198aad248e12e94823eb9f273497c724b2c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1248366
Tested-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-11-07 15:47:49 -08:00
Sivaram Nair
5f1c2bc27f Revert "gpu: nvgpu: vgpu: alloc hwpm ctxt buf on client"
This reverts commit 57821e2157.

Change-Id: Ic4801115064ccbcd1435298a61871921d056b8ea
Signed-off-by: Sivaram Nair <sivaramn@nvidia.com>
Reviewed-on: http://git-master/r/1247825
Reviewed-by: Rakesh Babu Bodla <rbodla@nvidia.com>
Tested-by: Rakesh Babu Bodla <rbodla@nvidia.com>
2016-11-04 08:34:18 -07:00