Commit Graph

2355 Commits

Author SHA1 Message Date
Konsta Holtta
d98232ab21 gpu: nvgpu: use nvgpu_info in refcount tracking
Use the nvgpu_info log facility instead of Linux-specific dev_info in
gk20a_channel_dump_ref_actions. Also fix format types.

dev_info isn't defined anymore in this file and our version is preferred
anyway. Refcount tracking isn't compiled in by default, so this had gone
unnoticed.
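
As a rough before/after sketch of the conversion (the surrounding variable
names are illustrative, not the exact diff):

    /* Before: Linux-specific logging with mismatched format types */
    dev_info(dev_from_gk20a(g), "channel %d: %d refs\n", ch->chid, refs);

    /* After: nvgpu's own log facility; %u matches the unsigned values */
    nvgpu_info(g, "channel %u: %u refs", ch->chid, refs);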

Jira NVGPU-22

Change-Id: If93b98c9a54d3b0deaf344a355594cb73712399c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1663032
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 05:19:10 -08:00
Deepak Nibade
0c46f8a5e1 gpu: nvgpu: support user fence updates
Add support for user fence updates, i.e. increments added by user space
directly in the pushbuffer.

Add a submit IOCTL flag NVGPU_SUBMIT_GPFIFO_FLAGS_USER_FENCE_UPDATE to indicate
that the user has added increments in the pushbuffer.
If so, the number of increments is received from the user in fence.value.

If the user is adding increments in the pushbuffer, then we don't need to do
any job tracking in the kernel.
So fail the submit if we evaluate need_job_tracking to true and
FLAGS_USER_FENCE_UPDATE is set.
The user is responsible for ensuring all pre-requisites for a fast submit and
for preventing kernel job tracking.

Since user space adds increments in the pushbuffer, just handle the threshold
bookkeeping in the kernel.
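
A minimal sketch of the submit-path check described above; apart from the
IOCTL flag, the variable names are illustrative:

    /* User-managed fence increments and kernel job tracking are mutually
     * exclusive: reject the submit instead of silently tracking the job. */
    if ((flags & NVGPU_SUBMIT_GPFIFO_FLAGS_USER_FENCE_UPDATE) &&
        need_job_tracking)
            return -EINVAL;

    /* With the flag set and no job tracking needed, fence.value carries the
     * number of increments the user placed in the pushbuffer, and only the
     * syncpoint threshold bookkeeping is updated in the kernel. */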

Bug 200326065
Jira NVGPU-179

Change-Id: Ic0f0b1aa69e3389a4c3305fb6a559c5113719e0f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1661854
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 03:48:14 -08:00
Deepak Nibade
8d5536271f gpu: nvgpu: add user API to get a syncpoint
Add a new user API NVGPU_IOCTL_CHANNEL_GET_USER_SYNCPOINT which exposes the
per-channel allocated syncpoint to user space.
The API also returns the current value of the syncpoint.
On supported platforms, this API additionally returns a RW semaphore address
(corresponding to the syncpoint shim) to user space.

Add a new characteristics flag NVGPU_GPU_FLAGS_SUPPORT_USER_SYNCPOINT to
indicate support for this new API.
Add a new flag NVGPU_SUPPORT_USER_SYNCPOINT for use by the core driver.

Set this flag for GV11B and GP10B for now.

Add a new API (*syncpt_address) in struct gk20a_channel_sync to get the GPU_VA
address of a syncpoint.

Add a new API nvgpu_nvhost_syncpt_read_maxval() which reads and returns the MAX
value of a syncpoint.
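
From user space the new call would be used roughly as below; the argument
struct layout shown here is a guess based on the description above, not the
actual UAPI definition:

    struct nvgpu_get_user_syncpoint_args args = {0}; /* hypothetical layout */

    if (ioctl(channel_fd, NVGPU_IOCTL_CHANNEL_GET_USER_SYNCPOINT, &args) == 0) {
            /* syncpoint id, its current value, and (on supported platforms)
             * the RW semaphore GPU address for the syncpoint shim */
            use_syncpoint(args.syncpoint_id, args.syncpoint_value, args.gpu_va);
    }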

Bug 200326065
Jira NVGPU-179

Change-Id: I9da6f17b85996f4fc6731c0bf94fca6f3181c3e0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1658009
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-26 03:48:11 -08:00
Thomas Fleury
180604fec0 gpu: nvgpu: gv100: fb hal to init and enable nvlink
Add the following hals:
(1) init_nvlink to configure nvlink(s) for sysmem in HSHUB
(2) enable_nvlink to switch from PCIe sysmem to nvlink sysmem,
    and setup atomics.

Change-Id: I73d2370aaf8e0530158a1091d9efef4a8cf2aac5
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1648044
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-25 21:48:28 -08:00
Thomas Fleury
0601fd25a5 gpu: nvgpu: gv100: nvlink endpoint driver
The following changes implement the initial (as per bringup) nvlink driver.

(1) SW initialization of nvlink core driver structures
(2) Nvlink interrupt handling
(3) Device initialization (IOCTRL, pll and clocks, device level intr)
(4) Falcon support for minion
(5) Minion load and bootstrapping
(6) Link initialization and DL PROD settings
(7) Device Interface init (and switching HSHUB to nvlink)
(8) HS set/get mode for both link and sublink
(9) Topology discovery and VBIOS settings.
(10) Ensures we get physically contiguous memory when Nvlink is enabled

This driver includes a hack for the current single dev/single link limitation.

JIRA: EVLR-2331
JIRA: EVLR-2330
JIRA: EVLR-2329
JIRA: EVLR-2328

Change-Id: Idca9a819179376cc655784482b24b575a52fa9e5
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1656790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-25 21:48:24 -08:00
Alex Waterman
b5d3cf444e gpu: nvgpu: Cleanup unused variables
There are numerous places where variables are assigned to but then
never used. This patch cleans up all these unused variables and
in some cases simplifies surrounding logic.

Also delete unused header includes and add necessary header includes.

JIRA NVGPU-525

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: Ice9ec2a0e97f262d0dcfebe22f83208dbea569d9
Reviewed-on: https://git-master.nvidia.com/r/1662548
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-23 21:53:38 -08:00
Alex Waterman
bd95c2ce3f gpu: nvgpu: Cleanup warnings and Linuxisms
Remove a variable that is assigned to but never used afterwards,
and convert a pr_warn() into an nvgpu_warn().

JIRA NVGPU-525

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: Ia3ab64063ec48d6cd8a4a13cff2ab9b9c459a462
Reviewed-on: https://git-master.nvidia.com/r/1662544
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-23 21:53:23 -08:00
Alex Waterman
c991410874 gpu: nvgpu: Abstract kernel_restart()
This function is used in gk20a.c to handle catastrophic error conditions
but is Linux specific. As such, implement an abstraction for this in
driver_common.c and expose the API in nvgpu_common.h.
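
A plausible shape for the abstraction (the nvgpu_kernel_restart name is an
assumption; kernel_restart() is the Linux API being wrapped):

    /* driver_common.c -- Linux implementation (sketch) */
    #include <linux/reboot.h>

    void nvgpu_kernel_restart(void *cmd)
    {
            kernel_restart(cmd);
    }

    /* nvgpu_common.h -- OS-agnostic declaration (sketch) */
    void nvgpu_kernel_restart(void *cmd);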

JIRA NVGPU-525

Signed-off-by: Alex Waterman <alexw@nvidia.com>
Change-Id: Ie2e417d30af5ff7db76f4d2d5b97ec96c386bd04
Reviewed-on: https://git-master.nvidia.com/r/1662543
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-23 21:53:19 -08:00
Alex Waterman
ea7eaed843 gpu: nvgpu: Don't alloc comptags on GV100
We don't have compression enabled on GV100 so it does not make sense
to allocate a huge CBC for this device.

Change-Id: I0ff908571f28c2ba6f439b0989398bf68dce16f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1655279
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-23 05:04:30 -08:00
Alex Waterman
eb219e9f3f gpu: nvgpu: Cleanup map attributes debugging
Make the map attributes printed by the map debug code more easily
readable and consistent.

Change-Id: I9737131a2ea44c6a080dff0095929760888b83ae
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1654518
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-22 08:09:06 -08:00
Terje Bergstrom
ec00a6c2db gpu: nvgpu: Use preallocated VPR buffer
To prevent deadlock while allocating VPR in nvgpu, allocate all the
needed VPR memory at probe time and use an internal allocator to
hand out space for VPR buffers.

Change-Id: I584b9a0f746d5d1dec021cdfbd6f26b4b92e4412
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1655324
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-14 21:43:43 -08:00
Deepak Nibade
f0cbe19b12 gpu: nvgpu: add user API to get read-only syncpoint address map
Add a user space API NVGPU_AS_IOCTL_GET_SYNC_RO_MAP to get the read-only
syncpoint address map in user space.

We already map the whole syncpoint shim to each address space, with the base
address being vm->syncpt_ro_map_gpu_va.

This new API exposes this base GPU_VA address of the syncpoint map, and the
unit size of each syncpoint, to user space.
User space can then calculate the address of each syncpoint as
syncpoint_address = base_gpu_va + (syncpoint_id * syncpoint_unit_size)

Note that this syncpoint address is read-only, and should only be used for
inserting semaphore acquires.
Adding a semaphore release with this address would result in an MMU_FAULT.

Define a new HAL g->ops.fifo.get_sync_ro_map and set it for all GPUs supported
on the Xavier SoC.
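
The per-syncpoint address computation from the formula above, as a small
helper sketch (names are illustrative):

    /* base_gpu_va and syncpoint_unit_size come from
     * NVGPU_AS_IOCTL_GET_SYNC_RO_MAP; the resulting address is read-only
     * and may only be used for semaphore acquires. */
    static inline u64 syncpt_ro_gpu_va(u64 base_gpu_va, u32 syncpoint_id,
                                       u32 syncpoint_unit_size)
    {
            return base_gpu_va + (u64)syncpoint_id * syncpoint_unit_size;
    }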

Bug 200327559

Change-Id: Ica0db48fc28fdd0ff2a5eb09574dac843dc5e4fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1649365
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-07 15:35:47 -08:00
Seema Khowala
791ce6bd54 gpu: nvgpu: gv11b: enable more gr exceptions
-pd, scc, ds, ssync, mme and sked exceptions are
 enabled. This will be useful for debugging.
-Handle enabled interrupts.
-Add gr ops to handle ssync hww. For legacy
 chips, the ssync hww_esr register is gpcs_ppcs_ssync_hww_esr.
 Since ssync hww is not enabled on legacy chips, ssync hww
 exception handling is added for Volta only.

Change-Id: I63ba2eb51fa82e74832df26ee4cf3546458e5669
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644751
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 13:23:30 -08:00
Seema Khowala
9beefc4551 gpu: nvgpu: add fecs_host_int_enable hal
This will be used to enable fecs interrupts per
chip.

Change-Id: Id99412ca1a9c4caad999c3458b0e9701515db4b9
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1642554
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 13:23:21 -08:00
Terje Bergstrom
cb9f8bae1a gpu: nvgpu: Unify querying stream id
The Stream ID for gp10b is retrieved directly from DT headers in common
code. Instead, introduce a variable to store the stream ID and move the
query to platform_gp10b_tegra.c.

JIRA NVGPU-4

Change-Id: I123024e13e470283bb691883f8f963eb72c997d8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1648013
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 12:37:28 -08:00
Richard Zhao
b386768d32 gpu: nvgpu: make .tsg_unbind_channel one layer lower
The message telling the RM server to unbind a channel has to be sent after
the client unbinds the channel and before the client calls tsg release. The
channel has to belong to a TSG on the RM server before the client submits a
runlist to remove the channel, or there's a bare channel problem.

By moving .tsg_unbind_channel one layer lower, gk20a_tsg_unbind_channel()
becomes a common function for all chips, and it calls tsg release after
calling .tsg_unbind_channel. So vgpu won't need to worry about the TSG being
released before the message is sent to the RM server.

Bug 200382695
Bug 200382785

Change-Id: I32acc122f3f9d5d0628049ccf673225f9e90c87a
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1645383
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-31 02:40:48 -08:00
Konsta Holtta
1a7484c901 gpu: nvgpu: ce: store fences in a separate array
Simplify the copyengine code massively by storing the job post fence
pointers in an array of fences instead of mixing them up in the command
buffer memory. The post fences are used when the ring buffer of a
context gets full and we need to wait for the oldest slot to free up.

NVGPU-43
NVGPU-52

Change-Id: I36969e19676bec0f38de9a6357767a8d5cbcd329
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1646037
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-26 10:50:37 -08:00
Konsta Holtta
91114cd6d4 gpu: nvgpu: ce: drop prefence support
Delete the gk20a_fence_in argument in gk20a_ce_execute_ops. It has never
been used and is in the way of some upcoming code cleanup.

NVGPU-43

Change-Id: Ie61e1a2f4945b1e34d64880044c265d26fa822d7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1646036
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-26 10:50:33 -08:00
David Nieto
fbdcc8a2d4 gpu: nvgpu: Initial Nvlink driver skeleton
Adds the skeleton and integration of the GV100 endpoint driver to NVGPU

(1) Adds an OS abstraction layer for the internal nvlink structure.
(2) Adds linux specific integration with Nvlink core driver.
(3) Adds function pointers for nvlink api, initialization and isr process.
(4) Adds initial support for minion.
(5) Adds new GPU enable properties to handle NVLINK presence
(6) Adds new GPU enable properties for SG_PHY bypass (required for NVLINK over
PCI)
(7) Adds parsing of nvlink vbios structures.
(8) Adds logging defines for NVGPU

JIRA: EVLR-2328

Change-Id: I0720a165a15c7187892c8c1a0662ec598354ac06
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644708
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-25 17:39:53 -08:00
Alex Waterman
4967570033 gpu: nvgpu: add speculative load barrier (ctrl IOCTLs)
Data can be speculatively loaded from memory and stay in the cache even
when a bounds check fails. This can lead to unintended information
disclosure via side-channel analysis.

To mitigate this problem, insert a speculation barrier.
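
One common way to realize such a barrier in kernel code is the
array_index_nospec() helper, which clamps an index under speculation; a
hedged sketch (the table and bound names are illustrative, and the actual
nvgpu patch may use a different barrier primitive):

    #include <linux/nospec.h>

    /* 'idx' comes from the IOCTL argument and is user controlled. */
    if (idx >= NUM_ENTRIES)
            return -EINVAL;
    idx = array_index_nospec(idx, NUM_ENTRIES); /* no speculative OOB index */
    val = table[idx];                           /* safe to load now */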

bug 2039126
CVE-2017-5753

Change-Id: Ib6c4b2f99b85af3119cce3882fe35ab47509c76f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640500
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-25 14:25:34 -08:00
Alex Waterman
25aba34bbd gpu: nvgpu: add speculative load barrier (channel IOCTLs)
Data can be speculatively loaded from memory and stay in the cache even
when a bounds check fails. This can lead to unintended information
disclosure via side-channel analysis.

To mitigate this problem, insert a speculation barrier.

bug 2039126
CVE-2017-5753

Change-Id: I6b8af794ea2156f0342ea6cc925051f49dbb1d6e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640498
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-25 14:25:21 -08:00
Terje Bergstrom
9cbb954266 gpu: nvgpu: Report mailbox id and value on ucode timeout
When we detect a timeout waiting for a ctxsw ucode method to complete,
we print an error. The error does not detail the event we are waiting for,
which makes debugging difficult. Add the missing mailbox id and value
to the error.

Bug 2049965

Change-Id: I45204a2d6f1f39919a0133b1e0867213e1a5b671
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1643709
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-23 11:19:10 -08:00
Terje Bergstrom
f3f14cdff5 gpu: nvgpu: Fold T19x code back to main code paths
Lots of code paths were split to T19x specific code paths and structs
due to split repository. Now that repositories are merged, fold all of
them back to main code paths and structs and remove the T19x specific
Kconfig flag.

Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 22:20:15 -08:00
seshendra Gadagottu
193a2ed38c gpu: nvgpu: add sw method for SET_BES_CROP_DEBUG4
Added sw method support for SET_BES_CROP_DEBUG4.
In this sw method:
CLAMP_FP_BLEND_TO_MAXVAL forces overflowed blend results to clamp to FP
maxval, and CLAMP_FP_BLEND_TO_INF allows them to clamp to INF.

Added support for this sw method in gp10b/gp106/gv11b
and gv100.

Bug 2046636

Change-Id: I3a9e97587aca76718f7f504ea3b853f87409092a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1641529
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 15:29:54 -08:00
Konsta Holtta
f6d898656a gpu: nvgpu: note railgate_allowed in do_idle
The idling and unidling of deterministic channels in the
do_idle/do_unidle path assume that each deterministic channel holds a
power reference. This is no longer the case if railgating has been
allowed for a channel via the deterministic options ioctl, which also
causes the channel to drop the power ref that it holds otherwise during
its lifetime.

All this is happening inside the deterministic_busy rwsem, which also
guards the ioctl changing those deterministic option states.

Bug 200327089

Change-Id: I9ce312bbaa459b3cf4a7541fa369186b78c3afdc
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1642310
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 08:29:05 -08:00
Konsta Holtta
3ccf5c85fb gpu: nvgpu: add g->sw_ready flag
Fix a race condition where we'd still be booting up the gpu and/or
initializing the driver but elsewhere assume that all is done already.

Some userspace APIs test g->gr.sw_ready to make sure that we're ready, but
this flag is set in the middle of bootup; there are other things after gr
initialization. Add a new flag that is enabled after bootup is fully
complete at the end of finalize_poweron, and change the checks in user API
paths to test the new flag only.

These checks are only in the ioctl paths for ctrl, dbg and tsg, and in
the ctrl device's opening path.

The gr.sw_ready flag is still left there to signify whether just gr has
had its bookkeeping initialized.
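
One plausible shape for the check in those ioctl paths; gk20a_busy() and
gk20a_idle() force a power-on cycle, which runs finalize_poweron and thereby
sets the flag (the exact error handling in the driver may differ):

    if (!g->sw_ready) {
            err = gk20a_busy(g);
            if (err)
                    return err;
            gk20a_idle(g);
    }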

Bug 200370011

Change-Id: I2995500e06de46430d9b835de1e9d60b3f01744e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640124
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-20 02:19:02 -08:00
Alex Waterman
137006fe78 gpu: nvgpu: Update gk20a pde bit coverage function
The mm_gk20a.c function that returns the number of bits that a PDE covers
is very useful for determining PDE size for all chips. Copy this into
the common VM code since it applies to all chips/platforms.

Bug 200105199

Change-Id: I437da4781be2fa7c540abe52b20f4c4321f6c649
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639730
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-19 17:29:09 -08:00
Konsta Holtta
f50c2af8a7 gpu: nvgpu: delete unused wfi in gk20a_fence
The boolean wfi field in struct gk20a_fence is not used for anything.
Delete it and a couple of function parameters that carried the flag.

Jira NVGPU-43

Change-Id: I399c8709102a3f944cab669ff806761aedaeb6d3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1636344
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-19 08:13:13 -08:00
Deepak Goyal
e0dbf3a784 gpu: nvgpu: gv11b: Enable perfmon.
t19x PMU ucode uses the RPC mechanism for
PERFMON commands.

- Declared "pmu_init_perfmon",
  "pmu_perfmon_start_sampling",
  "pmu_perfmon_stop_sampling" and
  "pmu_perfmon_get_samples" in pmu ops
  to differentiate between chips using the RPC and the legacy
  cmd/msg mechanism.
- Defined and used PERFMON RPC commands for t19x:
  - INIT
  - START
  - STOP
  - QUERY
- Added an RPC handler for PERFMON RPC commands.
- For querying GPU utilization/load, we need to send the PERFMON_QUERY
  RPC command for gv11b.
- Enabled perfmon for gv11b.

Bug 2039013

Change-Id: Ic32326f81d48f11bc772afb8fee2dee6e427a699
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1614114
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-18 23:40:02 -08:00
Richard Zhao
e53263ea86 gpu: nvgpu: unexport gk20a_ce_create/delete_context
No external references to them.

Jira VFND-4713

Change-Id: If053bbdbb37e9bd4789bfd7cccb1aef035fbf317
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639674
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-17 15:49:31 -08:00
Terje Bergstrom
2f6698b863 gpu: nvgpu: Make graphics context property of TSG
Move graphics context ownership to the TSG instead of the channel. Combine
channel_ctx_gk20a and gr_ctx_desc into one structure, because the split
between them was arbitrary. Move the context header to be a property of
the channel.

Bug 1842197

Change-Id: I410e3262f80b318d8528bcbec270b63a2d8d2ff9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639532
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-17 12:29:09 -08:00
Terje Bergstrom
ece3d958b3 gpu: nvgpu: Combine gk20a and gp10b free_gr_ctx
The gp10b version of free_gr_ctx was created to keep gp10b source code
changes out of the mainline. gp10b was merged back to mainline a
while ago, so this separation is no longer needed. Merge the two
variants.

Change-Id: I954b3b677e98e4248f95641ea22e0def4e583c66
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635127
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-12 12:42:57 -08:00
Terje Bergstrom
351f519c2e gpu: nvgpu: Add HAL for dumping ctxsw statistics
Add HAL for dumping ctxsw statistics. The statistics are dependent on
the architecture, and the function that calls this operation needs to
be moved to gk20a.

Bug 1842197

Change-Id: I285c74b8ddc8c7854c85b3fef4cbfc582098919e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1632681
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-12 12:42:31 -08:00
Seema Khowala
ffa5231d2c gpu: nvgpu: runlist info mutex not needed for runlist_state
The runlist_info mutex does not need to be acquired for the
runlist being enabled or disabled in fifo_sched_disable_r.

Bug 2043838

Change-Id: Ia9839ab7effbe7daf353c3a54f25a2b4914af5e8
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630345
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-11 17:37:55 -08:00
Thomas Fleury
6b90684cee gpu: nvgpu: vgpu: get virtual SMs mapping
On gv11b we can have multiple SMs per TPC. Add sm_per_tpc in
vgpu constants to properly dimension the virtual SM to TPC/GPC
mapping in the virtualization case.
Use TEGRA_VGPU_CMD_GET_SMS_MAPPING to query the current mapping.

Bug 2039676

Change-Id: I817be18f9a28cfb9bd8af207d7d6341a2ec3994b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631203
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 15:57:20 -08:00
Deepak Nibade
5fb7c7d8f9 gpu: nvgpu: protect linux include with config
In fence_gk20a.c, protect the <linux/file.h> and <linux/fs.h> includes with
CONFIG_SYNC since they are only needed with this config enabled.
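
The change amounts to wrapping the two includes, roughly:

    #ifdef CONFIG_SYNC
    #include <linux/file.h>
    #include <linux/fs.h>
    #endif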

Jira NVGPU-487

Change-Id: I6c26aa0fbb4ee284129109c625a0e324d5caf235
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635471
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 08:47:11 -08:00
seshendra Gadagottu
e9de95d7e0 gpu: nvgpu: use chip specific zbc_c/z format reg
Use the chip-specific gpcs_swdx_dss_zbc_c_format_reg
and gpcs_swdx_dss_zbc_z_format_reg registers. These
registers differ between gv11b/gv100 and gp10b/gp106.

Change-Id: I9e209c878a11edc986ba4304ff60fcccbb5087aa
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635091
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 08:47:07 -08:00
seshendra Gadagottu
0ac3ba2a99 gpu: nvgpu: gv11b: fix for gfx preemption
Used chip-specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes while committing the
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.
For gp10b, smaller buffer sizes than the values specified
in the hw manuals are used, as per sw requirements.

Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode

This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.

Another issue fixed: the gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits are
available now. This is done because of the legacy
implementation of the fecs ucode.

Bug 1976694

Change-Id: I2dc923340d34d0dc5fe45419200d0cf4f53cdb23
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635027
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 08:47:03 -08:00
Sourab Gupta
84d02ed7b1 gpu: nvgpu: correct function arguments to fix QNX compilation
The patch changes the function argument from 'int' to 'unsigned int'
to fix the QNX compilation failures.

Change-Id: Iaee7850d8310bea693996ac618b95252ca5d1b35
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1626397
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-09 16:52:14 -08:00
Alex Waterman
2ae16008cd Revert "gpu: nvgpu: gv11b: fix for gfx preemption"
This reverts commit caf168e33e.

This might be causing an intermittent failure in quill-c03 graphics submit.
Super weird, since the only change that seems like it could affect it is the
header file update, but that seems rather safe.

Bug 2044830

Change-Id: I14809d4945744193b9c2d7729ae8a516eb3e0b21
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1634349
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Timo Alho <talho@nvidia.com>
Tested-by: Timo Alho <talho@nvidia.com>
2018-01-09 06:32:30 -08:00
seshendra Gadagottu
caf168e33e gpu: nvgpu: gv11b: fix for gfx preemption
Used chip-specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes while committing the
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.

Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode

This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.

Another issue fixed: the gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits are
available now. This is done because of the legacy
implementation of the fecs ucode.

Bug 1976694

Change-Id: I284e29e0815d205c150998b07d0757b5089d3267
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630520
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Tested-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-08 12:16:49 -08:00
Sourab Gupta
667c9fd10b gpu: nvgpu: include nvgpu types.h explicitly in fence.h
QNX needs defines for the u32 data type, which is retrieved from
nvgpu/types.h. We need to explicitly include this for fence.h.
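
In practice this is a single extra include at the top of fence.h:

    #include <nvgpu/types.h> /* provides u32 and friends on QNX as well */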

Change-Id: I0768042b8b10db550a1e321a0c3c1d86d981f9b0
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1626401
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-08 08:45:48 -08:00
Sourab Gupta
3f4919ef32 gpu: nvgpu: replace pr_err with nvgpu_err
Replace the Linux-specific pr_err with the nvgpu_err function.

Change-Id: I856a3030c62009b078a8cdfc0050b541a66e6eaa
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1626400
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-08 08:45:40 -08:00
Sourab Gupta
cc94cb3b2b gpu: nvgpu: remove dma-buf.h include in channel
The patch removes the dma-buf.h include from channel_gk20a.c,
now that there are no references to dma_buf present here.

Change-Id: I079c3c3763e7ac4f91e43a4bc54a23ec8d5a23fa
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1626396
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-08 08:45:05 -08:00
David Nieto
1f71f475e2 DNI: gpu: nvgpu: Increase GV100 ctxsw timeouts
During bringup, and before nvlink is up, GV100 on the DDPX platform operates
with a very, very slow sysmem link. In order to get the sysmem test to pass
it is necessary to significantly increase most timeouts by an order of
magnitude.

Bug 2040544

Change-Id: I26858afde4ae80c70f86b47cfff674b6b00b5bf8
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627417
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-05 13:54:37 -08:00
Terje Bergstrom
dde1913f16 gpu: nvgpu: Do not disable ELPG when committing buffers
Committing buffer addresses only writes to the memory. There's no
need to disable ELPG for the duration, so drop the ELPG protection.

Change-Id: I8d8d08506387197e4737e0311df4a20085496056
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631149
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-04 11:04:43 -08:00
Terje Bergstrom
031eb0ec83 gpu: nvgpu: Remove gk20a specific optimization
Remove compute optimization specific to gk20a. We do not support
gk20a anymore.

Change-Id: Ibd548eee8d891a667f28a451d586fcfaac7f026a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631144
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-04 11:04:39 -08:00
Deepak Nibade
e0aca109b1 gpu: nvgpu: fix erroneous gk20a_put() call
With a recent rework we moved the gk20a_get() call to nvgpu_ioctl_tsg_open(),
but the corresponding gk20a_put() call remained in gk20a_tsg_release().

So if a TSG is opened and released from within the kernel with the APIs
gk20a_tsg_open()/gk20a_tsg_release(), we mistakenly drop an extra refcount
through gk20a_put().

Fix this by moving the gk20a_put() call to nvgpu_ioctl_tsg_release(), which
balances the gk20a_get() call in nvgpu_ioctl_tsg_open().
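
The balanced pairing after the fix, sketched with simplified signatures and
abbreviated bodies:

    int nvgpu_ioctl_tsg_open(struct gk20a *g, struct file *filp)
    {
            g = gk20a_get(g);       /* take a driver reference */
            if (!g)
                    return -ENODEV;
            /* ... open the TSG ... */
            return 0;
    }

    int nvgpu_ioctl_tsg_release(struct gk20a *g, struct file *filp)
    {
            /* ... release the TSG ... */
            gk20a_put(g);           /* drop the reference taken in open */
            return 0;
    }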

Bug 200374011

Change-Id: Id0cec0426e6231309dc530ab5c934dacaba9f8da
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630969
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-04 08:46:10 -08:00
Deepak Nibade
c7b7dbe39a gpu: nvgpu: return error code in failure cases
In gk20a_ce_create_context(), if gk20a_tsg_open() or gk20a_open_new_channel()
fails, we bail out from the function without setting the error code.
This could mislead the caller into reporting incorrect success.

Fix this by setting the error code explicitly in failure cases.
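
The shape of the fix, as a hedged sketch (argument lists simplified and the
exact error codes may differ in the real patch):

    tsg = gk20a_tsg_open(g);
    if (!tsg) {
            err = -ENOMEM;  /* previously err kept its initial value, which
                             * could be mistaken for success */
            goto end;
    }
    /* the gk20a_open_new_channel() failure path gets the same treatment */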

Bug 200374011

Change-Id: Idf6cba4a57740107bada698295745352f7b5d5ac
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631506
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-04 08:46:07 -08:00
Deepak Nibade
5ad1c28b5f gpu: nvgpu: fix TSG leak from CE code
In gk20a_ce_delete_gpu_context(), we unbind the channel from the TSG and
close the channel, but we do not drop the TSG refcount, leaking
the TSG reference.

Fix this by explicitly dropping the TSG refcount.

Also, do not explicitly unbind the channel from the TSG;
gk20a_channel_close() will internally unbind the channel from the TSG.

Bug 200374011

Change-Id: Ie4aa32f1d0bff4231f41aa2b33743cdc63e967c7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1629972
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-04 08:45:48 -08:00