Otherwise other trace event headers included may also be created, which
leads to duplicate definition issues in the 3.14 kernel.
Signed-off-by: Dan Willemsen <dwillemsen@nvidia.com>
- Add a sysfs "force_idle" to forcibly idle the GPU
- read on this sysfs will return the current status
0 : not in idle (running)
1 : in forced idle state
"echo 1 > force_idle" will force the gpu into idle
"echo 0 > force_idle" will cause the gpu to resume
Bug 1376916
Bug 1487804
Change-Id: I48dfd52e0d14561220bc4baea0776d1bdfaa7ea5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
ELPG flush is initiated from a common broadcast register, but must be
waited on via per-L2 registers. Split gk20a and gm20b versions of
the flush.
Change-Id: I75c2d65e8da311b50d35bee70308b60464ec2d4d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/401545
Reviewed-by: Automatic_Commit_Validation_User
Add support for booting FECS and GPCCS via faster bootloader method.
We leave this disabled until the bootloader binaries are checked in.
Change-Id: I39df5d116f7a33486407518c743638b01923970d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/413005
Add below two new APIs for gk20a :
1) gk20a_do_idle()
this API will force GPU to idle and railgate
2) gk20a_do_unidle()
this API will unblock all the tasks blocked by do_idle()
Bug 1487804
Change-Id: Ic5e7f2d19fb8d35f43666d0e309dde3022349d92
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412061
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
- add rw_semaphore busy_lock for gpu busy() path
- take read lock on busy_lock inside gk20a_busy()
so that all usual requests can execute simultaneously
- write lock can be taken when we need to block all
of the gk20a_busy() calls
Bug 1487804
Change-Id: I1b162b38bce9621723d3e45280c6076816cf771a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412060
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
- Export gk20a_wait_channel_idle() function from channel_gk20a.h
- also, return error -EBUSY from this function when channel is
found to be not idle
Bug 1487804
Change-Id: Ia7425e9b1332260ee9a53dca55ab07541f2755a9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/412059
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add semaphore_gk20a.c/h that implement a new semaphore management API
for the gk20a driver. The API introduces two entities, 'semaphore pools'
and 'semaphores'.
Semaphore pools are memory areas dedicated for hosting one or more
semaphores. Typically, one pool equals one 4K page. A semaphore pool
is always mapped to the kernel memory, and it can be mapped and
unmapped to gpu address spaces using gk20a_semaphore_pool_map/unmap.
Semaphores are backed by 16 bytes of memory allocated from a semaphore
pool. The value of a semaphore can be 0=acuired or 1=released. When
allocated, the semaphores are initialized to the acquired state. They
can be released, or their releasing can be waited for by the CPU or GPU.
Semaphores are intended to be used only once, and after they are
released they should be freed so that the slot within the semaphore
pool can be reused. However GPU jobs must take references to the
semaphores that they use (similarly as they take references on memory
buffers that they use) so that the semaphore backing memory is not
reused too soon.
Bug 1450122
Bug 1445450
Change-Id: I3fd35f34ca55035decc3e06a9c0ede20c1d48db9
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/374842
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
nvhost_get_syncpt_host_managed() creates syncpt name based on
platform_device pointer passed to it
Passing host1x's pointer to this API results in setting gk20a
syncpt names as "host1x_0" which is conflicting
Hence to restore this pass gk20a's device pointer
which gives syncpt names as "gk20a.0_0"
Also, add a syncpt check for sycnpt received.
Bug 1305024
Change-Id: I4ff96c7c9ebff2dca385c5787a85b4a9451b9514
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/410121
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove redundant cache maintenance operations. Instance blocks and
graphics context buffers are uncached, so they do not need any cache
maintenance.
Bug 1421824
Change-Id: Ie0be67bf0be493d9ec9e6f8226f2f9359cba9f54
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406948
PMU, FECS and GPCCS use the same address space. We used to initialize
the address space only if PMU is enabled. Create the system address
space always.
FECS and GPCCS used to have slower bit bang and faster DMA method
for loading ucode. Slower method is needed when FECS and GPCCS do not
have an address space. Remove the slower method as not anymore
needed.
Change-Id: I155619741ecc36aa6bf13a9c1ccb03c7c1330f0a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406771
In channel_update(), we detect if channel is idle and if it is
idle then we free the syncpt. We do not free the syncpt if WFI is
scheduled on some other path.
Instead of checking for WFI, we can check if last submit is complete
or not (it can be WFI as well) and if last submit is complete then
we can free the syncpt.
Locking mechanism using submit lock will take care that syncpt is
kept alive until last submit or WFI completes
Bug 1305024
Change-Id: Ieafb82e1f924a01236ca73ed151eb03e88729835
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/405201
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add gk20a as a sub power domain of host1x. This enforces keeping
host1x on when using gk20a.
Bug 200003112
Change-Id: I08db595bc7b819d86d33fb98af0d8fb4de369463
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/407006
(cherry picked from commit 009812b3e510518740e9c7e89b8b8b80439fe26a)
Reviewed-on: http://git-master/r/408013
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
When rail gating, we cleared all PMU status. Clear only the relevant
fields.
Change-Id: I5b4e8d74339aae6f1c6b945f45b8378bb563e8be
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406843
Remove the path for turning on only gk20a. Always when turning on
hardware, turn both host1x and GPU on.
Change-Id: I5f972a487d3348bf2254bdb0fadb42ca600a559e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/406405
This patch addresses two issues in fixes offset mappings:
- VA unmapping did not use lists safely. This caused an application hang
if the application did not free all (fixed offset) buffers before quiting.
- GPU was not powered closing AS node. If the address space had areas that
were not freed, the driver tried to access hw without powering it up first.
Change-Id: Ida526d222ea4e03b8d765eca16574ddc1823e60d
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/405872
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Currently the generic platform is used only if the device tree
defines that we have a generic platform available, however, the
generic platform is fully compatible with the gk20a we have in tegra.
This patch modifies the definitions so that we use generic platform
also for tegra - even if if tegra configuration option is not enabled.
Bug 1434573
Change-Id: Ib35ce0ab935d27764e960bf4d74a5016ae047a1f
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/396867
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Enable/disable powergating around regops so that the user
need not call the powergating IOCTLs with the regops IOCTL.
If the user does call the powergating IOCTL then the ref-counting
will ensure the correct behavior.
Bug 1451949
Change-Id: I1746f7d7cd1d2c0c497c213939df44a59d5d2834
Signed-off-by: Sandarbh Jain <sanjain@nvidia.com>
Reviewed-on: http://git-master/r/395131
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
gk20a is going to be moved under platform bus, however, the sysfs
interface should remain stable over the transition period. This
patch adds a symlink to keep current interfaces stable.
Bug 1311528
Bug 1434573
Change-Id: I951000f4b25285ff96e93eb726342d5b76cc84f1
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/396926
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Currently the gpu driver assumes that the GPU is a child of host1x.
This is an invalid assumption and therefore we need to get the host1x
device from device tree based on nvidia,host1x property.
Bug 1311528
Bug 1434573
Change-Id: I097e39369aaa15ab6652cd23f353f88f7c2b9c48
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/395664
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Make the perfmon sampling configurable, by adding an 'enabled' flag.
This is set according to the CONFIG initially. Modify the perfmon event
handler to not touch clock rates. Add a counter to count the number of
perfmon events.
Also add debugfs entries for the above.
Bug 1410515
Change-Id: Ic8197eef0e46e35af1179a5b06140393541cfd43
Signed-off-by: Prashant Malani <pmalani@nvidia.com>
Reviewed-on: http://git-master/r/351564
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Currently creation of the load sysfs node is bound to devfreq
profile initialisation, however, this information is useful even
if the scaling is not enabled. This patch modifies the code to create
the sysfs node always.
Bug 1485489
Change-Id: Id20433344aa81108f89a36cd56c9a73dd9d2e1c8
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/399474
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
On channel_finish() path, we first check if last submit was
WFI and in that case we do not submit new WFI but just wait
on old syncpt fence.
But it is possible that sync resource is already freed from
another path (channel_suspend())
Hence add a NULL check there to prevent Null pointer
exception.
Also, in channel_free() path, move syncpt free API after
channel_unbind() since we logically free the syncpt after
unbinding the channel.
Bug 1305024
Change-Id: Icc2fc83f004310560fc459527e1d37730428ec2d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/400233
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
All of the channel's submit jobs are added to the list channel->jobs
In channel_update(), we iterate over this list and check if any job
has completed. If any job is complete then we remove it from the list.
If this list is empty then it means channel is idle and we can free
its syncpt.
Hence after iterating this list, check if it is empty or not.
If it is empty AND if we are aggressive to free the syncpt
(syncpt_aggressive_destroy flag is set) then free the syncpt
at this point.
Keep the syncpt free code inside submit_lock to avoid race conditions.
Also, do not free the syncpt if we have already scheduled WFI on some
other path. In that case, syncpt is still needed to check for channel
idle. Once WFI completes, we free the syncpt anyway.
Bug 1305024
Change-Id: I1654e1db3b76b7ad14644dbb900b03f195ca3b2c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/398617
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add submit mutex lock to avoid race conditions between submitting
a job, removing a job and submitting WFI
With this lock make below operations atomic :
during submit_gpfifo() -
1. getting new syncpt
2. inserting syncpt increment
3. submitting gpfifo
4. setting job completion interrupt
during submit_wfi() -
1. getting new syncpt
2. inserting syncpt increment when idle
during channel_update() -
1. checking the submit job completion
2. freeing the job if it is completed
Bug 1305024
Change-Id: I0e3c0b8906d83fd59642344626ffdf24fad2aaab
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/397670
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
CBC frontdoor access works incorrectly in the simulator if CBC
is allocated from IOVA. This patch makes CBC allocation to happen
from physical memory if are running in simulator.
Bug 1409151
Change-Id: Ide08f4eab6911adc5737001c6d751ee227fec8f9
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/401544
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>