Currently the generic platform is used only if the device tree
declares that a generic platform is available. However, the generic
platform is fully compatible with the gk20a we have in Tegra.
This patch modifies the definitions so that we use the generic platform
also for Tegra, even if the Tegra configuration option is not enabled.
Bug 1434573
Change-Id: Ib35ce0ab935d27764e960bf4d74a5016ae047a1f
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/396867
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Enable/disable powergating around regops so that the user
need not call the powergating IOCTLs with the regops IOCTL.
If the user does call the powergating IOCTL then the ref-counting
will ensure the correct behavior.
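The ref-counting behaviour described above can be sketched roughly as
follows. This is a small user-space model, not the driver code: struct
pg_state, pg_disable(), pg_enable() and exec_regops() are hypothetical
names, and a plain counter stands in for the real powergating control.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: powergating is on only while nobody holds a
 * "disable" reference. */
struct pg_state {
	int disable_refs;   /* outstanding disable requests */
	bool gating_on;     /* true while powergating is enabled */
};

/* Take a reference; the first one turns powergating off. */
static void pg_disable(struct pg_state *pg)
{
	if (pg->disable_refs++ == 0)
		pg->gating_on = false;
}

/* Drop a reference; the last one turns powergating back on. */
static void pg_enable(struct pg_state *pg)
{
	assert(pg->disable_refs > 0);
	if (--pg->disable_refs == 0)
		pg->gating_on = true;
}

/* Regops bracket themselves with disable/enable. Returns true if
 * powergating was off while the register accesses would have run. */
static bool exec_regops(struct pg_state *pg)
{
	bool was_off;

	pg_disable(pg);
	was_off = !pg->gating_on;   /* register accesses would happen here */
	pg_enable(pg);
	return was_off;
}
```

If the user has already taken a reference through the powergating
IOCTL, the regops call in this model only bumps the counter and leaves
the state as the user set it.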
Bug 1451949
Change-Id: I1746f7d7cd1d2c0c497c213939df44a59d5d2834
Signed-off-by: Sandarbh Jain <sanjain@nvidia.com>
Reviewed-on: http://git-master/r/395131
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
gk20a is going to be moved under the platform bus. However, the sysfs
interface should remain stable over the transition period. This
patch adds a symlink to keep the current interfaces stable.
Bug 1311528
Bug 1434573
Change-Id: I951000f4b25285ff96e93eb726342d5b76cc84f1
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/396926
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Currently the GPU driver assumes that the GPU is a child of host1x.
This is an invalid assumption, so we need to get the host1x
device from the device tree based on the nvidia,host1x property.
Bug 1311528
Bug 1434573
Change-Id: I097e39369aaa15ab6652cd23f353f88f7c2b9c48
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/395664
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Make the perfmon sampling configurable, by adding an 'enabled' flag.
This is set according to the CONFIG initially. Modify the perfmon event
handler to not touch clock rates. Add a counter to count the number of
perfmon events.
Also add debugfs entries for the above.
Bug 1410515
Change-Id: Ic8197eef0e46e35af1179a5b06140393541cfd43
Signed-off-by: Prashant Malani <pmalani@nvidia.com>
Reviewed-on: http://git-master/r/351564
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Currently creation of the load sysfs node is bound to devfreq
profile initialisation. However, this information is useful even
if scaling is not enabled. This patch modifies the code to always
create the sysfs node.
Bug 1485489
Change-Id: Id20433344aa81108f89a36cd56c9a73dd9d2e1c8
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/399474
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
On the channel_finish() path, we first check whether the last submit
was a WFI; in that case we do not submit a new WFI but just wait
on the old syncpt fence.
However, the sync resource may already have been freed from
another path (channel_suspend()).
Hence add a NULL check there to prevent a NULL pointer
dereference.
Also, in the channel_free() path, move the syncpt free call after
channel_unbind(), since we logically free the syncpt after
unbinding the channel.
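A minimal sketch of the NULL check, as a user-space model only. The
struct layouts and the choice to return success when there is nothing
left to wait on are assumptions made for illustration; they are not
taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the channel and its sync resource. */
struct sync_res {
	int waits;                /* how many times the fence was waited on */
};

struct channel {
	struct sync_res *sync;    /* NULL once freed, e.g. by the suspend path */
	bool last_submit_was_wfi;
};

/* Returns 0 on success. If the last submit was a WFI we only wait on
 * the old fence, and only if the sync resource still exists. */
static int channel_finish(struct channel *ch)
{
	if (ch->last_submit_was_wfi) {
		if (ch->sync == NULL)
			return 0;     /* already freed: nothing to wait on */
		ch->sync->waits++;    /* models waiting on the old syncpt fence */
		return 0;
	}

	/* The real driver would submit a new WFI here. */
	ch->last_submit_was_wfi = true;
	return 0;
}
```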
Bug 1305024
Change-Id: Icc2fc83f004310560fc459527e1d37730428ec2d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/400233
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
All of the channel's submitted jobs are added to the list channel->jobs.
In channel_update(), we iterate over this list and check whether any job
has completed. Any completed job is removed from the list.
If the list is empty, the channel is idle and we can free
its syncpt.
Hence, after iterating over the list, check whether it is empty.
If it is empty AND aggressive syncpt freeing is requested
(the syncpt_aggressive_destroy flag is set), free the syncpt
at this point.
Keep the syncpt free code inside submit_lock to avoid race conditions.
Also, do not free the syncpt if we have already scheduled a WFI on some
other path. In that case, the syncpt is still needed to check for channel
idle. Once the WFI completes, we free the syncpt anyway.
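The decision made at the end of channel_update() can be modelled as
below. This is a sketch under simplifying assumptions: the job list is
reduced to a counter, locking is omitted, and every field name except
syncpt_aggressive_destroy is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct channel {
	int pending_jobs;                 /* models the length of channel->jobs */
	bool syncpt_allocated;
	bool syncpt_aggressive_destroy;
	bool wfi_scheduled;               /* a WFI was submitted on another path */
};

/* Retire `completed` jobs, then free the syncpt if the channel became
 * idle. In the driver this would all run with submit_lock held. */
static void channel_update(struct channel *ch, int completed)
{
	if (completed > ch->pending_jobs)
		completed = ch->pending_jobs;
	ch->pending_jobs -= completed;

	if (ch->pending_jobs == 0 &&
	    ch->syncpt_aggressive_destroy &&
	    !ch->wfi_scheduled)
		ch->syncpt_allocated = false;   /* models freeing the syncpt */
}
```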
Bug 1305024
Change-Id: I1654e1db3b76b7ad14644dbb900b03f195ca3b2c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/398617
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a submit mutex lock to avoid race conditions between submitting
a job, removing a job and submitting a WFI.
With this lock, make the operations below atomic:
during submit_gpfifo() -
1. getting new syncpt
2. inserting syncpt increment
3. submitting gpfifo
4. setting job completion interrupt
during submit_wfi() -
1. getting new syncpt
2. inserting syncpt increment when idle
during channel_update() -
1. checking the submit job completion
2. freeing the job if it is completed
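The grouping above can be illustrated with a user-space model that uses
a POSIX mutex in place of the kernel mutex. The field names and the
sequential fence numbering are assumptions for the sketch; only the
idea of one lock covering each group of steps comes from this change.

```c
#include <assert.h>
#include <pthread.h>

struct channel {
	pthread_mutex_t submit_lock;
	int next_syncpt_value;   /* last syncpt threshold handed out */
	int jobs_in_flight;      /* submitted but not yet freed */
};

static void channel_init(struct channel *ch)
{
	pthread_mutex_init(&ch->submit_lock, NULL);
	ch->next_syncpt_value = 0;
	ch->jobs_in_flight = 0;
}

/* Models submit_gpfifo(): all submit steps form one critical section. */
static int submit_gpfifo(struct channel *ch)
{
	int fence;

	pthread_mutex_lock(&ch->submit_lock);
	fence = ++ch->next_syncpt_value;   /* new syncpt value + increment */
	ch->jobs_in_flight++;              /* submit gpfifo, arm completion irq */
	pthread_mutex_unlock(&ch->submit_lock);
	return fence;
}

/* Models channel_update(): completion check and free happen under the
 * same lock. Returns the number of jobs freed. */
static int channel_update(struct channel *ch, int completed_value)
{
	int freed = 0;

	pthread_mutex_lock(&ch->submit_lock);
	/* Oldest in-flight fence is next_syncpt_value - jobs_in_flight + 1. */
	while (ch->jobs_in_flight > 0 &&
	       ch->next_syncpt_value - ch->jobs_in_flight < completed_value) {
		ch->jobs_in_flight--;
		freed++;
	}
	pthread_mutex_unlock(&ch->submit_lock);
	return freed;
}
```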
Bug 1305024
Change-Id: I0e3c0b8906d83fd59642344626ffdf24fad2aaab
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/397670
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
CBC frontdoor access works incorrectly in the simulator if the CBC
is allocated from IOVA. This patch makes the CBC allocation happen
from physical memory if we are running in the simulator.
Bug 1409151
Change-Id: Ide08f4eab6911adc5737001c6d751ee227fec8f9
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/401544
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
update headers from latest gen_register/ip_check info
Change-Id: Iae892ab7138e7bba4abc821b9d7893e768647daa
Signed-off-by: Ken Adams <kadams@nvidia.com>
Reviewed-on: http://git-master/r/399382
fixes one use of uninitialized var
renames a register to make it match dev_* file.
Change-Id: Iafba659bbf2df509e0b494b2c5dab3819bf650ef
Signed-off-by: Ken Adams <kadams@nvidia.com>
Reviewed-on: http://git-master/r/394792
the CBC clean and invalidate is done for gk20a for bug 1409151, now
it's time to do the same for gm20b. the text of this change is
strictly copied from gk20a, simply to make build pass.
Change-Id: Id717cb1e2ca0fa3f8483c3fd40d7629a9cc85ec9
Signed-off-by: Bo Yan <byan@nvidia.com>
Call railgate and unrailgate ops only if they are defined.
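A minimal sketch of the guarded calls, assuming a platform descriptor
whose callbacks may be left NULL. struct platform, do_railgate() and
the demo_* callbacks are hypothetical names, and treating a missing op
as success is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical platform descriptor; either callback may be NULL. */
struct platform {
	int (*railgate)(struct platform *p);
	int (*unrailgate)(struct platform *p);
	int rail_on;
};

/* Call the op only if the platform provides it. */
static int do_railgate(struct platform *p)
{
	return p->railgate ? p->railgate(p) : 0;
}

static int do_unrailgate(struct platform *p)
{
	return p->unrailgate ? p->unrailgate(p) : 0;
}

/* Example callbacks for a platform that does implement rail control. */
static int demo_railgate(struct platform *p)
{
	p->rail_on = 0;
	return 0;
}

static int demo_unrailgate(struct platform *p)
{
	p->rail_on = 1;
	return 0;
}
```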
Change-Id: I0a87ac0259af3719098d4372be7e25f0a54416fc
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/396375
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Bo Yan <byan@nvidia.com>
gk20a_ltc_init_comptags and gk20a_ltc_clear_comptags are defined
in ltc_gk20a.c, gm20b has its own init/clear functions, so remove
these two from ltc_common.c
change nvhost_allocator_init to gk20a_allocator_init; this is a
left-over after rebase, just like the above 2 function definitions,
so fix it.
Change-Id: I829639dd7fee9110dd65d5df7d7f0f8fe5fca6c1
Signed-off-by: Bo Yan <byan@nvidia.com>
Calling gk20a_init_gpu_characteristics() twice is not needed.
GPU sim aperture was defined twice.
Change-Id: Iaf78611717c55b1cae456358fcae2641ad552d9f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/383855
Reviewed-by: Automatic_Commit_Validation_User
Move the set_zbc_color_entry() operation to the LTC common code
as this is part of the LTC.
Change-Id: Iba41e32e273d86fcf76094440c2313a75a928326
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/366174
(cherry picked from commit 569ce1f3370532f12face62664a07d2d17a96bef)
Reviewed-on: http://git-master/r/376505
Reviewed-by: Automatic_Commit_Validation_User
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move the comptags cache init and clear operations to the LTC
from the gr code as this is part of the LTC.
Change-Id: I2163a09bcfe68a8833d5135bfa4035f37c7157ab
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/366173
(cherry picked from commit f56d4723f996f0dd2fcf0ae4279dbc4b6483b405)
Reviewed-on: http://git-master/r/376504
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Kevin Huang (Eng-SW) <kevinh@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This patch adds an interface for the gk20a driver to have
generic ops which are implemented by a chip specific HAL
layer. The HAL layer is provided by the gpu_ops struct which
defines function pointers for chip specific operations. This
is necessary for supporting multiple chips with the same
code base and minimal per chip hacking.
Also, since much code is common except in the HW headers
that are needed, the LTC common code is compiled by first
including the necessary chip specific header(s) and then
including the ltc common code file.
This allows for easy updating of functions that are only
different between chips as a result of register offset and
field changes whereas the HAL provides the mechanism for
functions that have actual semantic changes.
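The function-pointer side of this design can be sketched as below. Only
the idea of a gpu_ops struct comes from this change; the member
comptag_lines, the chip names and the returned values are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct gpu;

/* Per-chip operations table; members here are illustrative only. */
struct gpu_ops {
	struct {
		unsigned int (*comptag_lines)(const struct gpu *g);
	} ltc;
};

struct gpu {
	const char *name;
	struct gpu_ops ops;
};

/* Chip-specific implementations (values are invented). */
static unsigned int chip_a_comptag_lines(const struct gpu *g)
{
	(void)g;
	return 128;
}

static unsigned int chip_b_comptag_lines(const struct gpu *g)
{
	(void)g;
	return 256;
}

/* Each chip's HAL init fills in the function pointers. */
static void chip_a_init_hal(struct gpu *g)
{
	g->name = "chip_a";
	g->ops.ltc.comptag_lines = chip_a_comptag_lines;
}

static void chip_b_init_hal(struct gpu *g)
{
	g->name = "chip_b";
	g->ops.ltc.comptag_lines = chip_b_comptag_lines;
}

/* Common code calls through the ops table and never names a chip. */
static unsigned int common_query_comptag_lines(const struct gpu *g)
{
	return g->ops.ltc.comptag_lines(g);
}
```

The header-inclusion approach for the LTC common code is a build-time
mechanism and is not shown in this single-file sketch.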
Change-Id: I96f9a8350d34e7e101beb141d4521fab69dcfbae
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/360627
(cherry picked from commit fe90cad939cf979fc2516a96e5911bd8ab6fc457)
Reviewed-on: http://git-master/r/362228
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
This adds a new IOCTL that provides userspace with information for
GPU characterization. Specifically, the following items are provided:
GPU arch/impl/rev, number of GPCs, L2 cache size, on-board video
memory size, number of TPCs per GPC, and bus type. The primary user of
the new IOCTL will be rmapi_tegra.
Bug 1392902
Change-Id: Ia7c25c83c8a07821ec60be3edd018c6e0894df0f
Reviewed-on: http://git-master/r/346379
(cherry picked from commit 0b9ceca5a06d07cc8d281a92b76ebef8d4da0c92)
Reviewed-on: http://git-master/r/350658
Reviewed-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
If DMA address is not defined, use the physical address.
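As a rough illustration of the fallback, assuming that an unset DMA
address is represented by 0 (the commit does not say how "not defined"
is encoded) and using hypothetical names throughout:

```c
#include <assert.h>
#include <stdint.h>

struct mem_chunk {
	uint64_t dma_addr;    /* 0 is assumed to mean "no DMA address" */
	uint64_t phys_addr;
};

/* Prefer the DMA address; fall back to the physical address. */
static uint64_t chunk_gpu_addr(const struct mem_chunk *c)
{
	return c->dma_addr != 0 ? c->dma_addr : c->phys_addr;
}
```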
Bug 1500983
Change-Id: Ic33b21f74c8c2760e43146b87eec7ea467fc87be
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
(cherry picked from commit 8ae9a6567349241ce1cfff383526b0d9d39c28a1)
Reviewed-on: http://git-master/r/415238
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
So far scaling has been disabled only when we suspend the
system, and therefore we unnecessarily keep GPU workers running even
when the GPU itself is railgated. This is not proper behaviour
and it causes a race in the suspend sequence.
This patch reorders the scaling disable so that it always happens when
we turn off the GPU.
Bug 200004860
Change-Id: Ief0bfd89378d5a7ced26c3ef29094dd5c378b01a
Signed-off-by: Santosh Katvate <skatvate@nvidia.com>
Reviewed-on: http://git-master/r/410443
(cherry picked from commit bcae65bea24be2a1e0abe42522d99ba70c94cbe2)
Reviewed-on: http://git-master/r/413249
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In some cases the GPU still has work pending while the device is
being suspended. This patch forces pm runtime to be disabled for
the device to avoid powering up the GPU unnecessarily.
Bug 1515437
Change-Id: I4b57d72eb34e794f0457d7a074d26c9d096a13b3
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/411968
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>
Add a place to edit context-switched perf settings based upon
class. Disable tex-lock as the first of such for compute.
Bug 1409041
Change-Id: I5317a2a2e5f855661a1400b42f69211d16ae0c1d
Signed-off-by: Randy Spurlock <rspurlock@nvidia.com>
Reviewed-on: http://git-master/r/405908
(cherry picked from commit 250e149be35ecb8893dcef053ec44ffea86c302a)
Reviewed-on: http://git-master/r/407094
(cherry picked from commit 54337c08cbf6c2c6b5c929c1be24e87165d9d946)
Reviewed-on: http://git-master/r/408837
Reviewed-by: Mandar Padmawar <mpadmawar@nvidia.com>
Tested-by: Mandar Padmawar <mpadmawar@nvidia.com>
Add handler gk20a_gr_handle_fecs_error() for the case where a
FECS error interrupt is pending,
and clear this interrupt after handling.
For now, gk20a_gr_handle_fecs_error() just prints
the contents of NV_PGRAPH_FECS_INTR and clears it.
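The print-and-clear behaviour can be modelled in user space as below. A
plain variable stands in for the register, and both fake_fecs_intr and
handle_fecs_error() are hypothetical names; how the real register is
cleared is not described here, so the sketch simply drops the bits it
reported.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simulated interrupt status register (stand-in for the hardware). */
static uint32_t fake_fecs_intr;

/* Report the pending bits, clear them, and return what was seen. */
static uint32_t handle_fecs_error(void)
{
	uint32_t intr = fake_fecs_intr;

	fprintf(stderr, "fecs error intr: 0x%08x\n", (unsigned int)intr);
	fake_fecs_intr &= ~intr;   /* clear exactly the bits we reported */
	return intr;
}
```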
Bug 1495957
Change-Id: Ie7f70c84ec76ab698141646cd683584c4501e3e0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/402874
(cherry picked from commit a29f219c57d65a06f6dae8086f19fa1af94d95bd)
Reviewed-on: http://git-master/r/403587
(cherry picked from commit e65ebebd0d4d5c3dbb6fa454dd51c383ea13d715)
Reviewed-on: http://git-master/r/411160
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Even though we mask the LBREQ interrupt, hardware will still indicate it
in the PBDMA interrupt register. Stop treating LBREQ as fatal.
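One way to picture the change is a fatal-interrupt mask that leaves the
LBREQ bit out. The bit positions and names below are invented for the
sketch; the real ones come from the hardware headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Made-up bit positions, for illustration only. */
#define INTR_LBREQ    (1u << 0)
#define INTR_FATAL_A  (1u << 1)
#define INTR_FATAL_B  (1u << 2)

/* LBREQ is deliberately not part of the fatal mask. */
#define INTR_FATAL_MASK (INTR_FATAL_A | INTR_FATAL_B)

/* True only if a bit other than LBREQ marks the status as fatal. */
static bool pbdma_intr_is_fatal(uint32_t intr_status)
{
	return (intr_status & INTR_FATAL_MASK) != 0;
}
```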
Bug 1498688
Change-Id: Iec4c199437c50951ed9289cb85faf0008646d5c0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/408763
Reviewed-by: Riham Haidar <rhaidar@nvidia.com>
Tested-by: Riham Haidar <rhaidar@nvidia.com>