Implement support for 64kB large page size. Add an API to create an
address space via IOCTL so that we can accept flags, and assign one
flag for enabling 64kB large page size.
Also adds APIs to set per-context large page size. This is possible
only on Maxwell, so return error if caller tries to set large page
size on Kepler.
Default large page size is still 128kB.
Change-Id: I20b51c8f6d4a984acae8411ace3de9000c78e82f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Merge initialization code from gk20a_init_system_vm(),
gk20a_init_bar1_vm() and gk20a_vm_alloc_share() into gk20a_init_vm().
Remove redundant page size data, and move the page size fields to be
VM specific.
Bug 1558739
Bug 1560370
Change-Id: I4557d9e04d65ccb48fe1f2b116dd1bfa74cae98e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
-Moved bind fecs from work queue to init_gr_support.
-It makes all CPU->FECS communication to happen before
booting PMU, and after we boot PMU, only PMU talks to
FECS. So it removes possibility to race between CPU
and PMU talking to FECS.
Bug 200032923
Change-Id: I01d6d7f61f5e3c0e788d9d77fcabe5a91fe86c84
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/559733
Fixed sparse non static symbol warning by making the following symbol 'static':
- to_gk20a_sync_pt
- to_gk20a_timeline
Bug 200032218
Change-Id: Ie0310116aa1500ae8e4838b8a9ad4943a61cfc24
Signed-off-by: Amit Sharma <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/552052
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
gk20a_vm_map_buffer() used to ignore silently map requests for
non-dmabuf fd:s. Fix this by returning the error code from
dma_buf_get().
Bug 1566862
Change-Id: If01b03f43b67b17d9fb997d914db871520f50c6e
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
gpu rail gating reference count is going wrong because
"can_railgate" is set to false during probe(). For rail-gating
to work no gpu re-work is needed and by default rail-gating
is enabled with INT_MAX delay.
Bug 200044987
Change-Id: I9367275cd18c34cb19a51193353585789ba44c03
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/556568
Reviewed-by: Mitch Luban <mluban@nvidia.com>
When validating buffers to be mapped, check that the buffer end does not
overflow over the virtual address node space.
Bug 1562361
Change-Id: I3c78ec7380584ae55f1e6bf576f524abee846ddd
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
-Change cde_buf to use writecombined cpu mapping.
-Since reading writecombined cpu data is still slow, avoid reads in
gk20a_replace_data by checking whether a patch overwrites a whole
word.
-Remove unused distinction between src and dst buffers in cde_convert.
-Remove cde debug dump code as it causes a perf hit.
Bug 1546619
Change-Id: Ibd45d9c3a3dd3936184c2a2a0ba29e919569b328
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/553233
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
Using KEPLER_C for doing sync point increment has side effects.
It adds a SetObject method, which changes channel state that not all
user space accounts for.
Bug 1462255
Bug 1497928
Bug 1559462
Change-Id: I5c422ad8ca94fba15cad9bd232f7a10d94aa0973
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/554478
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
GK20A PMU is not supported in GPU client for virtualization. However,
to make native case and virtualization case can share same defconfig and
kernel image, we need to enable CONFIG_GK20A_PMU and
CONFIG_GK20A_DEVFREQ in defconfig. This commit changes to detect if we
should disable GK20A PMU support in run time.
Bug 200041597
Change-Id: I292c647303ed57af6faa1c5671037ca27b48e31e
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/553653
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
When initializing fifo, we ignore several error conditions. Add
checks for them.
Change-Id: Id67f3ea51e3d4444b61a3be19553a5541b1d1e3a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/553269
If engine status is in context switch in the fifo mmu fault handler,
dump falcon stats and gr stats for each engine.
Bug 1544766
Change-Id: Idfa9772b7e67072941144ac3bdd73e791fdc2b23
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/553205
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Implement empty or -ENOSYS functions for vgpu if
CONFIG_TEGRA_GR_VIRTUALIZATION is not enabled, and remove ifdefs around
the calling code.
Change-Id: Idc75c9bc486d661786bc222bd9e0380aa7766e78
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/552898
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
At this stage, ctxsw is always in reset state, because we're powering GPU
up, or we have reset the whole GR partition. Remove the code to invoke a
second reset.
Fix waiting for FE idle. We should wait after each bundle, and break if any
iteration fails.
Change-Id: I0846f67c6d860a485dea62ff870deafe55a47365
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/552799
The return value of gk20a_busy must be checked since it may not succeed
in some cases. Add the __must_check attribute that generates a compiler
warning for code that does not read the return value and fix all uses of
the function to take error cases into account.
Bug 200040921
Change-Id: Ibc2b119985fa230324c88026fe94fc5f1894fe4f
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542552
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Expose a debugfs entry pmu_security. It allows checking if PMU was
booted in secure or non-secure mode.
Change-Id: Iea584b696440779bee0900edccabd4e5b2997805
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/552456
Move nvgpu ioctls from the many user space interface headers to a new
single nvgpu.h header under include/uapi. No new code or replaced names
are introduced; this change only moves the definitions and changes
include directives accordingly.
Bug 1434573
Change-Id: I4d02415148e437a4e3edad221e08785fac377e91
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542651
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
To help remove the nvhost dependency from nvgpu, rename ioctl defines
and structures used by nvgpu such that nvhost is replaced by nvgpu.
Duplicate some structures as needed.
Update header guards and such accordingly.
Change-Id: Ifc3a867713072bae70256502735583ab38381877
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542620
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Proper error number for invalid request number is EINVAL instead of
EFAULT, so change it in ioctl calls.
Change-Id: I8fddd34e012700550e9e30fe17ba7152b3a0417b
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/542563
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Change CDE swizzling shader kernel size to 8x8 to avoid waste with
relatively small surfaces.
Map compbit backing store and destination surface as cacheable.
Clean up kernel size calculation.
Bug 1546619
Change-Id: Ie97c019b4137d2f2230da6ba3034387b1ab1468a
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/501158
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Tested-by: Arto Merilainen <amerilainen@nvidia.com>
Added basic support for GM20b GPCPLL noise-aware(NA) mode. In this
mode PLL internal DVFS mechanism is engaged, and output frequency is
scaled with voltage automatically. The scaling coefficients in this
commit are preliminary, pending characterization.
If NA mode is enabled, any frequency change is done under PLL bypass,
with no dynamic ramp allowed.
This commit kept NA mode disabled.
Bug 1555318
Change-Id: I8d96a10006155635797331bae522fb048d3dc4a0
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/499488
GVS: Gerrit_Virtual_Submit
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
Channel gpfifo cannot be submitted if the channel has no vm, so add a
check for it and bail out if no as is bound. Clean up other similar
checks too.
Change-Id: Ibb0fe08e44e34bbaaa00ebd02dce6cc4d93ca5d9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/538887
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
nvhost_sync_create_fence returns ERR_PTRs instead of NULLs on error;
check for its errors with IS_ERR.
Change-Id: I9752e0d8fa703b2872918b23721ae973be58bf35
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/533794
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
Retrieve channel count from gm20b specific header instead of the
gk20a header. This increases channel count from 128 to 512.
Change-Id: I96d4887432852795f7f526e33f0d3d2458f3af0e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/500623
Boards require a rework to make railgating and DVFS work realiably.
The information whether the board has been reworked or not will be
available on DTS.
This patch adds a DTS check to the GPU driver initialisation. If the
rework information is not available (or the rework has been marked as
disabled), railgating and DVFS are disabled.
Bug 1555485
Change-Id: Ie86fe35fb94377403472faffcbcaec645b6e40d9
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/500218
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Invalid method needs to be cleared in gm20b to prevent getting same
interrupt again.
Change-Id: I4d83d1a27e5c711b5d82b95552be84d5f16a13e0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/500286
Restrict reading of FE object table to the number of entries
available.
Change-Id: I11275ecd14e53f0b763d00d65042adb4b1e8ae6f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/449306
Runlist event is not sent in gm20b for updated runlist. Polling is
the preferred way also for gk20a.
Bug 1555239
Change-Id: I60de084db69f848f63451f1f3078f183ca51ba50
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/500241
Add poll interface and control ioctls for waiting for GPU job completion
via semaphores.
Poll on a gk20a channel file waits for events from pending semaphore
interrupts (stalling) of that channel. New ioctls enable and disable the
events, and clear a single interrupt event so that next poll doesn't
wake up for it again.
Bug 1528781
Change-Id: I5c6238966b5d0900c8ab263c6a7f8f2611901f33
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/497750
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Expose supported nvgpu ioctls to userspace via bits in the flags field
of nvhost_gpu_characteristics; currently define two bits for special
memory allocation support.
Bug 1539747
Change-Id: I1bc9333b12825d07a00b7a4136ae9d35816a5855
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/495942
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
L2 bypass registers have moved in gm20b. Move the code to
ltc_common.c, which gets compiled once per chip version.
Change-Id: I0ab4dd03c78b8ad8abc7a7b18c094b6002827587
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/499220
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Use TSG specific API gk20a_fifo_recover_tsg() in following cases :
- IOCTL_CHANNEL_FORCE_RESET
to force reset a channel in TSG, reset all the channels
- handle pbdma intr
while resetting in case of pbdma intr, if channel is part of
TSG, recover entire TSG
- TSG preempt failure
when TSG preempt times out, use TSG recover API
Use preempt_tsg() API to preempt if channel is part of TSG
Add below two generic APIs which will take care of preempting/
recovering either of channel or TSG as required
gk20a_fifo_preempt()
gk20a_fifo_force_reset_ch()
Bug 1470692
Change-Id: I8d46e252af79136be85a9a2accf8b51bd924ca8c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/497875
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>