When using semaphore-based channel synchronization, a semaphore release
may mean that a job has completed. Call gk20a_channel_update from
gk20a_channel_semaphore_wakeup to check if there are memory refs to
release or sync timelines to signal.
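Conceptually (a minimal userspace model, not the driver code; all names
here are illustrative), the change wires the semaphore wakeup path into
the same per-channel job-reaping routine:

  /*
   * Minimal model: a semaphore release interrupt now triggers the same
   * per-channel job cleanup that a syncpoint increment does.
   */
  #include <stdbool.h>
  #include <stdio.h>

  struct job { bool done; };
  struct fake_channel { struct job jobs[4]; int njobs; };

  /* Stand-in for gk20a_channel_update(): reap completed jobs. */
  static void channel_update(struct fake_channel *ch)
  {
      for (int i = 0; i < ch->njobs; i++)
          if (ch->jobs[i].done)
              printf("job %d complete: drop mem refs, signal timeline\n", i);
  }

  /* Stand-in for gk20a_channel_semaphore_wakeup(): runs on sem release. */
  static void semaphore_wakeup(struct fake_channel *ch)
  {
      channel_update(ch);   /* the change: check for completed work here too */
  }

  int main(void)
  {
      struct fake_channel ch = { .jobs = { {true}, {false} }, .njobs = 2 };
      semaphore_wakeup(&ch);
      return 0;
  }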
Bug 1450122
Change-Id: Ib829c895dab05676c35f974d3f1c3d88c047c9b9
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/394576
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add semaphore implementation of the gk20a_channel_sync interface.
Each channel has one semaphore pool, which is mapped as read-write to
the channel vm. We allocate one or two semaphores from the pool for each
submit.
The first semaphore is only needed if we need to wait for an opaque
sync fd. In that case, we allocate the semaphore and ask the GPU to wait
for its value to become 1 (semaphore acquire method). We also queue
kernel work that waits on the fence fd and subsequently releases the
semaphore (sets its value to 1) so that the command buffer can proceed.
The second semaphore is used on every submit and tracks work
completion. The GPU sets its value to 1 when the command buffer has been
processed.
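A simplified, self-contained model of the two-semaphore flow described
above (values follow the 0 = acquired / 1 = released convention; the
types and helpers are illustrative, not the driver's symbols):

  #include <stdint.h>
  #include <stdio.h>

  struct sema { uint32_t value; };          /* one pool slot, modeled */

  static void cpu_release(struct sema *s)   /* kernel work after fence fd fires */
  {
      s->value = 1;                         /* GPU acquire method can now pass */
  }

  static void gpu_complete(struct sema *s)  /* GPU releases work-tracking sem */
  {
      s->value = 1;
  }

  int main(void)
  {
      struct sema wait_sem = { 0 };         /* only needed for opaque fence fds */
      struct sema done_sem = { 0 };         /* allocated on every submit */

      /* pushbuffer: ACQUIRE(wait_sem == 1) ... commands ... RELEASE(done_sem) */
      cpu_release(&wait_sem);               /* fence fd signalled on the CPU side */
      gpu_complete(&done_sem);              /* command buffer finished on the GPU */

      printf("wait=%u done=%u\n", wait_sem.value, done_sem.value);
      return 0;
  }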
The channel jobs need to hold references to both semaphores so that
their backing semaphore pool slots are not reused while the job is in
flight. Therefore gk20a_channel_fence will keep a reference to the
semaphore that it represents (channel fences are stored in the job
structure). This means that we must diligently close and dup the
gk20a_channel_fence objects to avoid leaking semaphores.
Bug 1450122
Bug 1445450
Change-Id: Ib61091a1b7632fa36efe0289011040ef7c4ae8f8
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/374844
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This patch reorders scaling resume so that it always happens when
we power on the GPU, balancing the scaling suspend done when we
power off the GPU.
bug 200010911
Change-Id: I9fde817fbf9fed7d90c48ea06050db4b82e670a8
Signed-off-by: Allen Yu <alleny@nvidia.com>
Reviewed-on: http://git-master/r/421541
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Do not warn about unknown regions in ctxsw firmware blob.
Bug 1435870
Change-Id: I343d85a09a3cd1d7c1c881836af6868296409f07
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/420670
In gk20a_dbg_gpu_dev_release() (when nvhost-dbg-gpu is closed), we
return from the function without freeing the dbg_session memory if
there is no channel bound to the dbg_session.
Fix this by skipping dbg_unbind_channel_gk20a() when no channel is
bound, and always freeing the dbg_session memory.
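A sketch of the intended control flow, with invented stub types and
helpers standing in for the real ones; the point is that only the unbind
is conditional while the free always runs:

  struct dbg_session { int has_channel; };

  static void dbg_unbind_channel(struct dbg_session *s) { (void)s; }
  static void free_session(struct dbg_session *s)       { (void)s; /* kfree */ }

  static int dbg_gpu_dev_release(struct dbg_session *s)
  {
      if (s->has_channel)
          dbg_unbind_channel(s);   /* skip only the unbind when nothing bound */

      free_session(s);             /* previously leaked on the early-return path */
      return 0;
  }

  int main(void)
  {
      struct dbg_session s = { .has_channel = 0 };
      return dbg_gpu_dev_release(&s);   /* must not leak even with no channel */
  }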
Bug 200010382
Change-Id: I90dd2ed3cd72fbc5d429799660daf2a09b974fda
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/419306
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Rewrite the PMU boot sequence as a state machine. At PMU power-up, send
the initial messages and reset the state machine. At each reply from the
PMU, perform the next stage of PMU boot and set the state.
As PMU and FECS boot are now independent, we need to ensure the engine
is idle before saving ZBC.
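A toy model of the boot flow as a state machine, assuming an
illustrative set of states (the real driver's states and handlers are
not reproduced here); each PMU reply advances the machine one stage:

  #include <stdio.h>

  enum pmu_boot_state {
      PMU_STATE_OFF,
      PMU_STATE_STARTING,       /* init messages sent at power-up */
      PMU_STATE_INIT_RECEIVED,  /* PMU answered the init message */
      PMU_STATE_READY,
  };

  static enum pmu_boot_state pmu_reply(enum pmu_boot_state s)
  {
      switch (s) {
      case PMU_STATE_OFF:           return PMU_STATE_STARTING;
      case PMU_STATE_STARTING:      return PMU_STATE_INIT_RECEIVED;
      case PMU_STATE_INIT_RECEIVED: return PMU_STATE_READY;
      default:                      return s;
      }
  }

  int main(void)
  {
      enum pmu_boot_state s = PMU_STATE_OFF;
      while (s != PMU_STATE_READY) {
          s = pmu_reply(s);             /* one boot stage per PMU reply */
          printf("state -> %d\n", s);
      }
      return 0;
  }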
Change-Id: I1ea747ab794ef08f1784eeabfdae7655d585ff21
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/410205
In the error case we first disabled the channel and reset the sync
point to max, and only after that set the channel error state. This
causes a race if the channel is closed between resetting the sync point
and setting the channel state.
Rearrange the code so that the error state is set first, and only then
is the channel disabled.
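A minimal sketch of the new ordering, using stub helpers with
illustrative names; publishing the error state first closes the window
in which a racing close sees a healthy channel:

  #include <stdio.h>

  struct ch { int in_error; int enabled; int syncpt; };

  static void set_error_state(struct ch *c)   { c->in_error = 1; }
  static void disable_channel(struct ch *c)   { c->enabled = 0; }
  static void set_syncpt_to_max(struct ch *c) { c->syncpt = 0x7fffffff; }

  static void recover_channel(struct ch *c)
  {
      set_error_state(c);      /* first: a concurrent close now sees the error */
      disable_channel(c);      /* then tear the channel down */
      set_syncpt_to_max(c);
  }

  int main(void)
  {
      struct ch c = { 0, 1, 0 };
      recover_channel(&c);
      printf("error=%d enabled=%d\n", c.in_error, c.enabled);
      return 0;
  }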
Bug 1519646
Change-Id: I20550f6a2708f892b6ba4ee714e90bdecdd128ad
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/418948
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
When exiting rail gate, we reloaded the default ZBC values. The correct
behavior is to reload the values that were in use before rail gating.
Bug 1447255
Change-Id: I7aad3586dda91a91a3629062a27001af281b955e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/418346
For GM20B alone, the LTC count is already accounted for in the HW logic
that calculates the CBC base from the postDivide address. So SW doesn't
have to explicitly divide by the LTC count in the postDivide address
calculation.
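An arithmetic sketch only, with made-up numbers and a simplified
formula; the point is that the SW division by the LTC count is skipped
on GM20B because the HW already folds it in:

  #include <stdbool.h>
  #include <stdio.h>

  static unsigned long cbc_base(unsigned long postdivide_addr,
                                unsigned int ltc_count, bool hw_divides)
  {
      /* on GM20B the HW logic already accounts for the LTC count */
      return hw_divides ? postdivide_addr : postdivide_addr / ltc_count;
  }

  int main(void)
  {
      printf("gk20a-style: %lu\n", cbc_base(0x2000, 2, false));
      printf("gm20b:       %lu\n", cbc_base(0x2000, 2, true));
      return 0;
  }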
Bug 1477079
Change-Id: I558bbe66bbcfb7edfa21210d0dc22c6170149260
Signed-off-by: Kevin Huang <kevinh@nvidia.com>
Reviewed-on: http://git-master/r/414264
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
On a PBDMA error, even though the engine might not be wedged, we need
to kick the channel out of the engine. Add that logic. Also, when the
channel is not in the engine, we need to remove it from the runlist.
Bug 1498688
Change-Id: I5939feb41d0a90635ba313b265c7e3b5d3f48622
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/417682
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Kevin Huang (Eng-SW) <kevinh@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Otherwise, trace events from other included trace event headers may
also be created, which leads to duplicate definition issues in the 3.14
kernel.
Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
- Add a sysfs node "force_idle" to forcibly idle the GPU
- A read on this node returns the current status:
  0 : not in forced idle (running)
  1 : in forced idle state
"echo 1 > force_idle" will force the GPU into idle, and
"echo 0 > force_idle" will let the GPU resume (see the sketch below).
Bug 1376916
Bug 1487804
Change-Id: I48dfd52e0d14561220bc4baea0776d1bdfaa7ea5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
ELPG flush is initiated through a common broadcast register, but must
be waited on via per-L2 registers. Split the flush into gk20a and gm20b
versions.
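A shape-only sketch with made-up register accessors and an example LTC
count (the real offsets and fields are not shown): one broadcast write
starts the flush, then each L2 is polled individually:

  #include <stdbool.h>
  #include <stdio.h>

  #define NUM_LTC 2                                   /* example value */

  static void write_broadcast_flush(void)            { /* one broadcast write */ }
  static bool ltc_flush_pending(unsigned int ltc)    { (void)ltc; return false; }

  static void elpg_flush(void)
  {
      write_broadcast_flush();                        /* initiate on all L2s */
      for (unsigned int i = 0; i < NUM_LTC; i++)      /* wait per-L2 */
          while (ltc_flush_pending(i))
              ;
      printf("flush done on %d LTCs\n", NUM_LTC);
  }

  int main(void) { elpg_flush(); return 0; }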
Change-Id: I75c2d65e8da311b50d35bee70308b60464ec2d4d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/401545
Reviewed-by: Automatic_Commit_Validation_User
Add support for booting FECS and GPCCS via a faster bootloader method.
We leave this disabled until the bootloader binaries are checked in.
Change-Id: I39df5d116f7a33486407518c743638b01923970d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/413005
Add the two new APIs below for gk20a:
1) gk20a_do_idle()
   forces the GPU to idle and railgate
2) gk20a_do_unidle()
   unblocks all the tasks blocked by do_idle() (usage sketch below)
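The intended usage pattern, modeled with local stubs so the example
stands alone; in the driver the calls would be gk20a_do_idle() and
gk20a_do_unidle():

  #include <stdio.h>

  static int do_idle(void)    { printf("GPU idled and railgated\n"); return 0; }
  static void do_unidle(void) { printf("blocked submitters released\n"); }

  int main(void)
  {
      if (do_idle())            /* force idle + railgate; new work blocks */
          return 1;
      /* ... perform the operation that needs the GPU quiescent ... */
      do_unidle();              /* unblock everything held off by do_idle() */
      return 0;
  }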
Bug 1487804
Change-Id: Ic5e7f2d19fb8d35f43666d0e309dde3022349d92
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412061
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
- add an rw_semaphore "busy_lock" for the GPU busy() path
- take a read lock on busy_lock inside gk20a_busy()
  so that all usual requests can execute simultaneously
- the write lock can be taken when we need to block all
  of the gk20a_busy() calls (modeled in the sketch below)
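A userspace model of the locking scheme using pthread_rwlock in place of
the kernel rw_semaphore (function names are illustrative): busy()
callers take the lock shared, the forced-idle path takes it exclusive:

  #include <pthread.h>
  #include <stdio.h>

  static pthread_rwlock_t busy_lock = PTHREAD_RWLOCK_INITIALIZER;

  static void gpu_busy(void)              /* models gk20a_busy() */
  {
      pthread_rwlock_rdlock(&busy_lock);  /* many users may run concurrently */
      printf("doing GPU work\n");
      pthread_rwlock_unlock(&busy_lock);
  }

  static void block_all_busy(void)        /* models the forced-idle path */
  {
      pthread_rwlock_wrlock(&busy_lock);  /* excludes every gpu_busy() caller */
      printf("all busy() calls blocked\n");
      pthread_rwlock_unlock(&busy_lock);
  }

  int main(void)
  {
      gpu_busy();
      block_all_busy();
      return 0;
  }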
Bug 1487804
Change-Id: I1b162b38bce9621723d3e45280c6076816cf771a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412060
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
- Export the gk20a_wait_channel_idle() function from channel_gk20a.h
- Also, return -EBUSY from this function when the channel is
  found to be not idle
Bug 1487804
Change-Id: Ia7425e9b1332260ee9a53dca55ab07541f2755a9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412059
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add semaphore_gk20a.c/h that implement a new semaphore management API
for the gk20a driver. The API introduces two entities, 'semaphore pools'
and 'semaphores'.
Semaphore pools are memory areas dedicated for hosting one or more
semaphores. Typically, one pool equals one 4K page. A semaphore pool
is always mapped into kernel memory, and it can be mapped into and
unmapped from GPU address spaces using gk20a_semaphore_pool_map/unmap.
Semaphores are backed by 16 bytes of memory allocated from a semaphore
pool. The value of a semaphore can be 0=acquired or 1=released. When
allocated, semaphores are initialized to the acquired state. They can be
released, or their release can be waited for, by the CPU or GPU.
Semaphores are intended to be used only once; after they are released
they should be freed so that the slot within the semaphore pool can be
reused. However, GPU jobs must take references to the semaphores that
they use (just as they take references on the memory buffers that they
use) so that the semaphore backing memory is not reused too soon.
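A simplified model of the allocation and value semantics described
above; the sizes and the 0 = acquired / 1 = released convention come
from the text, while all type and function names are stand-ins:

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  #define SLOT_SIZE   16                    /* each semaphore is 16 bytes */
  #define POOL_SIZE   4096                  /* typically one 4K page */

  struct sem_pool { uint8_t mem[POOL_SIZE]; uint32_t next; };
  struct gpu_sem  { uint8_t *slot; int refs; };

  static int sem_alloc(struct sem_pool *p, struct gpu_sem *s)
  {
      if (p->next + SLOT_SIZE > POOL_SIZE)
          return -1;
      s->slot = &p->mem[p->next];
      p->next += SLOT_SIZE;
      memset(s->slot, 0, SLOT_SIZE);        /* value 0 = acquired at alloc */
      s->refs = 1;                          /* jobs take extra refs in flight */
      return 0;
  }

  static void sem_release(struct gpu_sem *s) { s->slot[0] = 1; /* released */ }

  int main(void)
  {
      struct sem_pool pool = { .next = 0 };
      struct gpu_sem s;
      if (sem_alloc(&pool, &s) == 0) {
          sem_release(&s);                  /* CPU or GPU flips it to released */
          printf("value=%d refs=%d\n", s.slot[0], s.refs);
      }
      return 0;
  }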
Bug 1450122
Bug 1445450
Change-Id: I3fd35f34ca55035decc3e06a9c0ede20c1d48db9
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/374842
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
nvhost_get_syncpt_host_managed() creates the syncpt name based on the
platform_device pointer passed to it.
Passing host1x's pointer to this API results in gk20a syncpt names such
as "host1x_0", which conflict.
Hence, to restore the old naming, pass gk20a's device pointer instead,
which gives syncpt names such as "gk20a.0_0".
Also, add a validity check on the syncpt received.
Bug 1305024
Change-Id: I4ff96c7c9ebff2dca385c5787a85b4a9451b9514
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/410121
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove redundant cache maintenance operations. Instance blocks and
graphics context buffers are uncached, so they do not need any cache
maintenance.
Bug 1421824
Change-Id: Ie0be67bf0be493d9ec9e6f8226f2f9359cba9f54
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406948
PMU, FECS and GPCCS use the same address space. We used to initialize
the address space only if PMU is enabled. Create the system address
space always.
FECS and GPCCS used to have a slower bit-bang method and a faster DMA
method for loading ucode. The slower method is needed only when FECS and
GPCCS do not have an address space. Remove the slower method as it is
no longer needed.
Change-Id: I155619741ecc36aa6bf13a9c1ccb03c7c1330f0a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406771
In channel_update(), we detect whether the channel is idle and, if it
is, we free the syncpt. We do not free the syncpt if a WFI is scheduled
on some other path.
Instead of checking for WFI, we can check whether the last submit (which
may itself be a WFI) has completed, and free the syncpt once it has.
The locking mechanism using the submit lock ensures that the syncpt is
kept alive until the last submit or WFI completes.
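A logic sketch with illustrative names: under the submit lock, the
syncpt is freed only once the newest submitted work (which may itself be
a WFI) has completed:

  #include <pthread.h>
  #include <stdbool.h>
  #include <stdio.h>

  struct chan {
      pthread_mutex_t submit_lock;
      unsigned int last_submit_fence;     /* threshold of the newest submit */
      unsigned int syncpt_value;          /* current value of the syncpoint */
      bool has_syncpt;
  };

  static void channel_update(struct chan *c)
  {
      pthread_mutex_lock(&c->submit_lock);
      /* last submit (regular job or WFI) has completed? */
      if (c->has_syncpt && c->syncpt_value >= c->last_submit_fence) {
          c->has_syncpt = false;          /* safe to free the syncpoint now */
          printf("syncpt freed\n");
      }
      pthread_mutex_unlock(&c->submit_lock);
  }

  int main(void)
  {
      struct chan c = { .last_submit_fence = 5, .syncpt_value = 5,
                        .has_syncpt = true };
      pthread_mutex_init(&c.submit_lock, NULL);
      channel_update(&c);
      return 0;
  }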
Bug 1305024
Change-Id: Ieafb82e1f924a01236ca73ed151eb03e88729835
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/405201
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add gk20a as a sub power domain of host1x. This enforces keeping
host1x on when using gk20a.
Bug 200003112
Change-Id: I08db595bc7b819d86d33fb98af0d8fb4de369463
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/407006
(cherry picked from commit 009812b3e510518740e9c7e89b8b8b80439fe26a)
Reviewed-on: http://git-master/r/408013
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
When rail gating, we cleared all PMU status. Clear only the relevant
fields.
Change-Id: I5b4e8d74339aae6f1c6b945f45b8378bb563e8be
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406843
Remove the path for turning on only gk20a. When turning on hardware,
always turn on both host1x and the GPU.
Change-Id: I5f972a487d3348bf2254bdb0fadb42ca600a559e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406405
This patch addresses two issues in fixed offset mappings:
- VA unmapping did not use lists safely. This caused an application hang
  if the application did not free all (fixed offset) buffers before
  quitting.
- The GPU was not powered when closing the AS node. If the address space
  had areas that were not freed, the driver tried to access hw without
  powering it up first.
Change-Id: Ida526d222ea4e03b8d765eca16574ddc1823e60d
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/405872
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>