CILP can be triggered only when SM debug mode is enabled. However, by
the time the FECS interrupt handler runs, the current graphics context
may have SM debug mode disabled, and in that case we skip posting the
CILP completion events to UMD.
Since the CILP event was triggered anyway, we need to post the events
to UMD irrespective of whether SM debug mode is still enabled at that
point.
Hence remove the gk20a_gr_sm_debugger_attached() check before posting
events to UMD.
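A minimal sketch of the change in the FECS handler path; the function
and post_cilp_completion_event() are hypothetical stand-ins for the
actual event-posting code:

    static int handle_cilp_preempt_done(struct gk20a *g,
                                        struct channel_gk20a *ch)
    {
            /*
             * Removed: the event used to be dropped when the SM debugger
             * had detached by the time the FECS interrupt arrived:
             *
             *     if (!gk20a_gr_sm_debugger_attached(g))
             *             return 0;
             */

            /* Always post the completion event: CILP can only have been
             * triggered while SM debug mode was enabled. */
            post_cilp_completion_event(ch);
            return 0;
    }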
Bug 200243092
Change-Id: I54ad205be11ec6d5034d524bfbb28f8a1fa72993
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1263591
(cherry picked from commit e6259e2d0d5a4bb5929e70e03e154f8b82ae3600)
Reviewed-on: http://git-master/r/1264780
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
gp10b does not have an fbpa unit, although the
hw header files claim it does. Hardcode all fbpa
values to 0.
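An illustrative HAL-style hook; the real hook and its name differ, the
point is simply to override what the generated hw headers report:

    static u32 gp10b_get_num_fbpas(struct gk20a *g)
    {
            /* gp10b has no FBPA unit, so report zero unconditionally. */
            return 0;
    }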
Bug 200249125
Change-Id: I6ed63b3231d7af8e31ccf5047d56bdb85f05a9d9
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1256422
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gp10b_ce_nonstall_isr(), we trigger a semaphore wakeup.
Currently, we clear the interrupt status register after the
wakeup is complete. There is potential for an interrupt to
come in while the wake-up operation is in progress, and it
is possible that:
1) We miss processing the interrupt in that ISR iteration AND
2) We clear the interrupt status register anyways
This change clears the status register before triggering wakeup,
so the interrupt will properly re-fire.
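A sketch of the reordering; the accessor and wakeup helpers below are
placeholders, not the exact nvgpu symbols:

    static void gp10b_ce_nonstall_isr_sketch(struct gk20a *g)
    {
            u32 ce_intr = read_ce_intr_status(g);       /* hypothetical */

            /* Clear the pending bits first so that an interrupt arriving
             * while the wakeup runs re-asserts status and re-fires. */
            clear_ce_intr_status(g, ce_intr);           /* hypothetical */

            /* Then trigger the semaphore wakeup. */
            semaphore_wakeup(g);                        /* hypothetical */
    }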
Bug 200244458
Change-Id: Ia3338252eeea4eb60d11c0e241279989a46dac04
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1253107
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The get_litter_value API is updated to use int instead of an enum
type.
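Illustrative shape of the change to the gpu_ops hook; the enum name in
the "before" comment and the exact return type are assumptions:

    /* Before:
     *     u32 (*get_litter_value)(struct gk20a *g, enum gk20a_litter value);
     * After, the parameter is a plain int matching the GPU_LIT_* defines: */
    u32 (*get_litter_value)(struct gk20a *g, int value);

    /* Call sites keep the same form, e.g.: */
    num_gpcs = g->ops.get_litter_value(g, GPU_LIT_NUM_GPCS);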
JIRA GV11B-21
Change-Id: I982fdfe372f4be38aa4ed026a23e936d73190e79
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1252212
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occurring in the future.
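A typical instance of the class of bug being fixed (illustrative C,
not a specific site in the driver; do_work() is a hypothetical helper,
and the commit does not name the exact warning flag):

    static void example(void)
    {
            u32 count = 8;
            int i = -1;

            /* i is converted to unsigned, so -1 becomes a huge value and
             * the mathematically true "-1 < 8" evaluates to false.
             * -Wsign-compare, enabled by a higher warning level, flags it. */
            if (i < count)
                    do_work(i);     /* never runs */

            /* Fixed by keeping both operands the same signedness: */
            if (i >= 0 && (u32)i < count)
                    do_work(i);
    }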
Change-Id: Ib7026728ef0e8c3c9e68956fc9794ec3a786a8a2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252069
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GPU frequencies can be set by powerhal while the GPU is railgated, and
before this change that would cause EMC floors to remain set until the
GPU is unrailgated.
After this change, EMC floors are not requested by the GPU client when
the GPU is railgated. It is ok to ignore the requests, as the GPU
client maxes the floor when powering up.
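A minimal sketch of the guard, assuming a scaling callback and a power
check along these lines (names are illustrative, not the exact nvgpu
symbols):

    static void scale_request_emc_floor(struct gk20a *g,
                                        unsigned long gpu_freq)
    {
            /* Skip the EMC floor request while railgated; the floor is
             * maxed again when the GPU is powered back up. */
            if (!g->power_on)
                    return;

            request_emc_floor(g, gpu_freq);     /* hypothetical helper */
    }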
Bug 1807560
Change-Id: I9a0d58b0288edbd03b2edf09580ecabd9b74f0c2
Signed-off-by: Juha Lainema <jlainema@nvidia.com>
Reviewed-on: http://git-master/r/1216233
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Ilan Aelion <iaelion@nvidia.com>
Reviewed-by: Cyril Raju <craju@nvidia.com>
Tested-by: Cyril Raju <craju@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
- Define fuse macros depending on the kernel version, as the fuse
  offsets changed in K4.4 and the K4.4 fuse defines come from a common
  header file (tegra-fuse.h)
- Use the fuse control read/write APIs when reading control registers
  on K4.4
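A sketch of the version-dependent selection; only the #if structure
reflects the described change, the macro names and offsets on both
sides are placeholders:

    #include <linux/version.h>

    #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 4, 0)
    /* K4.4: fuse defines come from the common tegra-fuse.h header and
     * control registers are read through the fuse control APIs. */
    #include <linux/tegra-fuse.h>
    #define GK20A_FUSE_OPT_EXAMPLE  TEGRA_FUSE_OPT_EXAMPLE  /* placeholder */
    #else
    /* Older kernels keep the local offsets. */
    #define GK20A_FUSE_OPT_EXAMPLE  0x1f0                   /* placeholder */
    #endif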
Bug 200243956
Change-Id: I34dabd1a307d10010cb89ac6a5f1e3f5b177c0fc
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1245825
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
The lowest page table level may hold very few entries for mappings of
large pages, but a new page is allocated for each list of entries at
the lowest level, wasting memory and hurting performance. Compact these
so that the new "allocation" of PTEs is appended at the end of the
previous allocation, if there is space.
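A rough sketch of the compaction idea; the field names and helpers are
illustrative, not the real gmmu structures:

    /* Append the new PTE list to the tail of the previously allocated
     * page when it still has room, instead of allocating a fresh page. */
    if (prev && prev->used_bytes + size <= prev->alloc_bytes) {
            entry->mem    = prev->mem;          /* share the backing page */
            entry->offset = prev->used_bytes;   /* start at the tail */
            prev->used_bytes += size;
    } else {
            err = alloc_new_pte_page(vm, entry, size);   /* hypothetical */
    }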
Bug 1736604
Change-Id: I4c7c4cad9019de202325750aee6034076e7e61c2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1222810
(cherry picked from commit 97303ecc946c17150496486a2f52bd481311dbf7)
Reviewed-on: http://git-master/r/1234995
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In update_gmmu_pde0_locked(), use entry->mem to determine the target
aperture bits of the memory block that entry->mem represents, instead
of pte->mem, which holds the parent memory the bits are written into.
Previously this worked because all page tables were in the same
aperture, but really large userspace allocations may suddenly push a
part of them to sysmem.
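A sketch of the corrected selection, using the aperture-mask helper
described elsewhere in this series; the PDE field accessors are named
in the gp10b generated-header style but should be treated as
illustrative:

    /* The aperture bits must describe the child table (entry->mem),
     * not the parent table (pte->mem) the PDE is written into. */
    pde_v |= gk20a_aperture_mask(g, &entry->mem,
                    gmmu_new_pde_aperture_sys_mem_ncoh_f(),
                    gmmu_new_pde_aperture_video_memory_f());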
Bug 1809939
Change-Id: I3372487c6ae9793018ce44552ded3fb1ba4d145a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1218636
(cherry picked from commit a92596f6e8e621e51b6afae9ab7e62044d6311eb)
Reviewed-on: http://git-master/r/1220525
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For the bar2 and pmu instance blocks, use gk20a_aperture_mask() to
select the target (i.e. whether the address is in sysmem or vidmem)
based on the aperture.
Also add target accessors for gr_fecs_new_ctx and gr_fecs_arb_ctx_ptr.
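A sketch for the instance-block page directory target; the ram_in_*
accessor names follow the generated-header style but are illustrative:

    gk20a_mem_wr32(g, inst_block, ram_in_page_dir_base_target_w(),
            gk20a_aperture_mask(g, &vm->pdb.mem,
                    ram_in_page_dir_base_target_sys_mem_ncoh_f(),
                    ram_in_page_dir_base_target_vid_mem_f()));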
Jira DNVGPU-22
Change-Id: Ieaa80bd83a4191fe57b7fba6e0f9cdaeb195a077
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1216138
(cherry picked from commit 7a9f4175abc5dddf0879215de4637b7b6eb0ab7b)
Reviewed-on: http://git-master/r/1219712
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add support for cyclestats snapshots in the virtual case
Bug 1700143
JIRA EVLR-278
Change-Id: I353efac6a17704c815a99745ac04d2c3d831351b
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1216644
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Since page tables can reside either in sysmem or in vidmem, use
gk20a_mem_get_base_addr() to get the base address of the buffer.
This API takes care of returning the proper base address.
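A minimal usage sketch; the flags argument and the surrounding context
are assumptions based on the helper named in this message:

    /* Resolve the base address of the PDB regardless of whether it was
     * allocated from sysmem or vidmem. */
    u64 pdb_addr = gk20a_mem_get_base_addr(g, &vm->pdb.mem, 0);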
Jira DNVGPU-20
Change-Id: I3422b51c3ffb8fb86f1dc5095263fc8f19dae44d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1206407
(cherry picked from commit 3c4b22c35b2c4eec33234c2f8dccd9de9422d093)
Reviewed-on: http://git-master/r/1210962
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Event notifications on TSGs should only be sent to the channel that
caused the event to happen in the first place, not every channel in the
TSG. Otherwise the debugger will not be able to tell which channel
actually got the event.
Worse yet, if all the channels in a TSG are bound to the same debug
session (as is the case with cuda-gdb), then multiple nvgpu events for
the same GPU event will be triggered, causing events to be buffered and
the client to get out of sync.
One GPU exception, one nvgpu event per TSG.
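A sketch of the intent in the exception-handling path; the loop and
helper below are illustrative, not the exact nvgpu code:

    /* Before: every channel in the TSG was notified, so a debug session
     * bound to all of them saw duplicate events for one GPU exception.
     *
     *     list_for_each_entry(ch, &tsg->ch_list, ch_entry)
     *             post_debug_event(ch);         (hypothetical helper)
     *
     * After: only the faulting channel is notified. */
    post_debug_event(faulted_ch);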
Bug 1793988
Change-Id: Iee36c774f193554ffb9ab7c1650ee0610e476a99
Signed-off-by: Cory Perry <cperry@nvidia.com>
Reviewed-on: http://git-master/r/1194206
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Unknown engine is expected, as we do not support all dGPU engines.
Remove the error spew.
JIRA DNVGPU-26
Change-Id: I3d43253b8cab4e51b426536e4899a62156d0da16
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1206465
(cherry picked from commit a3fa13f6be4ff60e90558326474af3d1b315aa43)
Reviewed-on: http://git-master/r/1208408
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Added preemption mode (WFI, GFXP, CTA and CILP) support for gp10x
family gr class (PASCAL_B and PASCAL_COMPUTE_B).
Bug 200221149
Change-Id: Ia8b781c5baedba660db5997f190a0b363286ed7f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1193209
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Added a control flag in the nvgpu infrastructure to allow the kernel
to create privileged CE channels for page migration and clearing
support between sysmem and vidmem.
JIRA DNVGPU-53
Change-Id: I1fc35eea60af3d1ea9a0b5582011f20d58958ccb
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1173091
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.
Because all of the buffers haven't been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
the end to declare explicitly that sysmem is used. Enabling vidmem for
each is then just a matter of removing "_sys" from the function call.
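Illustrative before/after at one call site (the size and the mem_desc
are placeholders):

    /* before: left implicit which aperture backs the buffer
     *     err = gk20a_gmmu_alloc_map(vm, bufsize, &mem);
     *
     * after: explicitly sysmem until the buffer is verified in vidmem */
    err = gk20a_gmmu_alloc_map_sys(vm, bufsize, &mem);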
Jira DNVGPU-18
Change-Id: I4a67eae403f1d9d271118c35e3775b1129170676
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176806
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
NvGPU is moving to use runtime PM only for its power management.
Remove the pm_domain calls used to register with nvhost.
Jira DNVGPU-57
Change-Id: Idd01b680af0e8fd601801150fc663afa53b7ce6f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1163217
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped
to the GPU.
JIRA DNVGPU-18
JIRA DNVGPU-76
Change-Id: Icd675e83e3c28836f0ed8880425748697713bb0a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Modify the page table updates to take an aperture flag (up until
gk20a_locked_gmmu_map()), so that sysmem is no longer hard-assumed,
and propagate the flag to hardware.
Jira DNVGPU-76
Change-Id: I797fdaaf5f42a84fa0446577359147fb6908a720
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169295
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add gk20a_aperture_mask() for memory target selection now that buffers
can actually be allocated from vidmem, and use it in all cases that
have a mem_desc available.
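A sketch of the helper's shape; this mirrors how the commit message
describes it, but the exact implementation, aperture enum values and
fallbacks in the driver may differ:

    u32 gk20a_aperture_mask(struct gk20a *g, struct mem_desc *mem,
                            u32 sysmem_mask, u32 vidmem_mask)
    {
            /* Pick the register field value matching the aperture the
             * buffer was actually allocated from. */
            switch (mem->aperture) {
            case APERTURE_VIDMEM:
                    return vidmem_mask;
            case APERTURE_SYSMEM:
            default:
                    return sysmem_mask;
            }
    }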
Jira DNVGPU-76
Change-Id: Ifd1908808d928155a0cadeff8ca451a151bfc8c5
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169294
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Set NV_PGRAPH_PRI_FE_GFXP_WFI_TIMEOUT from the default of ~20us to
~100us. Also set NV_PGRAPH_DEBUG_2_GFXP_WFI_ALWAYS_INJECTS_WFI to
avoid going into GFXP all the time.
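A sketch of the register programming; the accessor names follow the
generated gr header style but are illustrative, and the timeout count
is a placeholder that would be derived from the GR clock to give ~100us:

    /* ~100us expressed in GR clock cycles (placeholder value). */
    gk20a_writel(g, gr_fe_gfxp_wfi_timeout_r(),
                 gr_fe_gfxp_wfi_timeout_count_f(gfxp_wfi_timeout_count));

    /* Enable ALWAYS_INJECTS_WFI so we do not enter GFXP every time. */
    val = gk20a_readl(g, gr_debug_2_r());
    val = set_field(val, gr_debug_2_gfxp_wfi_always_injects_wfi_m(),
                    gr_debug_2_gfxp_wfi_always_injects_wfi_enabled_f());
    gk20a_writel(g, gr_debug_2_r(), val);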
Bug 1593548
Jira VFND-1894
Change-Id: I6310c3605f7b83178c38de88788d87e36ee428b4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162629
(cherry picked from commit 873ddc7288063b1773d31a5bda30d980122d6645)
Reviewed-on: http://git-master/r/1166988
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add accessors for GFXP_WFI_ALWAYS_INJECTS_WFI, a field that controls
FE behaviour for GFXP.
Jira VFND-1900
Change-Id: Id531f795422393dc603859a0f3059d0681cf9464
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162628
(cherry picked from commit 4175a21dd2fcbf9c25623bf5d472a3bc30476faa)
Reviewed-on: http://git-master/r/1166989
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
From this patch onwards, runlist_id is a member of struct
channel_gk20a, so the hard-coded runlist_id mapping logic is removed.
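A one-line sketch of the new lookup; names follow the gk20a fifo code
but are best treated as illustrative here:

    /* The runlist comes from the channel itself rather than from a
     * hard-coded engine-to-runlist mapping. */
    gk20a_fifo_update_runlist(g, ch->runlist_id, ch->hw_chid, true, true);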
JIRA DNVGPU-25
Change-Id: Ib87d96a518a490d4167071708a76100a4d4c02dd
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161776
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This is done to mask a race seen where the power refcount is zero
during the ISR or bottom half.
Bug 200198908
Change-Id: I0a8ed774cd4fda9db65429b5aad03c5e001ff666
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/1162314
Reviewed-by: Juha Tukkinen <jtukkinen@nvidia.com>
This CL covers the following modifications:
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt support for the Pascal GPU series
5) Removed hard-coded engine_id logic and made it generic
6) Code cleanup for readability
JIRA DNVGPU-26
Change-Id: Ibf46a89a5308c82f01040ffa979c5014b3206f8e
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1156022
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>