After moving devfreq enable to end of finalize power on,
intermittent issues related to gpu booting with devfreq
enabled are fixed.
Enabled devfreq for gv11b by enabling ""nvhost_podgov"
governor in platform data.
Reused scaling functions from gp10b/gk20a.
Removed emc floor on railgate for power saving.
Added max emc frequency as floor in rail-ungate for
faster gpu boot.
Bug 2049965
Bug 2039013
Bug 200377508
Change-Id: Ia1dec278b663b9f7ed859dd953a60f3eae7ef9a0
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644702
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
VM and CDE code assumes that dma_buf_attachment is stored as a pointer
in the private dma_buf_drvdata, so it is not tracked. In Linux trees
without dma_buf_*_drvdata() support this is not true, so change the
code to explicitly track dma_buf_attachment.
JIRA NVGPU-4
Change-Id: I692f05a19a6469195d5444a7e5ff6e92f77ae272
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1648004
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Enabling gpu scaling driver after finalize poweron,
will make gpu booting happen at initially set
frequency(1GHz).
Also doing platform specific init scale after enabling
scaling driver.
Bug 2049965
Bug 2039013
Bug 200377508
Change-Id: I633f8f5a25d9de18cbb3a022913b8b725ccd87e5
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644703
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The message to tell RM server to unbind channel has to be sent after
client unbinds the channel and before client calls tsg release. The
channel has to belong to a tsg on RM server before client submit a
runlist to remove the channel. Or there's a bare channel problem.
By moving .tsg_unbind_channl one layer lower, gk20a_tsg_unbind_channel()
will be common functions for all chip, and it'll call tsg release after
call .tsg_unbind_channel. So vgpu won't need to worry about tsg was
released before sending msg to RM server.
Bug 200382695
Bug 200382785
Change-Id: I32acc122f3f9d5d0628049ccf673225f9e90c87a
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1645383
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Patch 7240b3c2 enabled secure allocation for gv11b
But since we allocate secure buffers in poweron path, and secure allocation
needs GPU to be in off state, this results in deadlock in poweron path
To solve this, we already cause early VPR resize for older chips by calling
gk20a_tegra_secure_page_alloc() from late_probe
Implement same for gv11b.
Add late_probe callback and add a call to gk20a_tegra_secure_page_alloc()
Bug 2038249
Change-Id: I8c17b069962b26edbd0639a7c0d6c2fdaa352935
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1648831
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Simplify the copyengine code massively by storing the job post fence
pointers in an array of fences instead of mixing them up in the command
buffer memory. The post fences are used when the ring buffer of a
context gets full and we need to wait for the oldest slot to free up.
NVGPU-43
NVGPU-52
Change-Id: I36969e19676bec0f38de9a6357767a8d5cbcd329
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1646037
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Adds the skeleton and integration of the GV100 endpoint driver to NVGPU
(1) Adds a OS abstraction layer for the internal nvlink structure.
(2) Adds linux specific integration with Nvlink core driver.
(3) Adds function pointers for nvlink api, initialization and isr process.
(4) Adds initial support for minion.
(5) Adds new GPU enable properties to handle NVLINK presence
(6) Adds new GPU enable properties for SG_PHY bypass (required for NVLINK over
PCI)
(7) Adds parsing of nvlink vbios structures.
(8) Adds logging defines for NVGPU
JIRA: EVLR-2328
Change-Id: I0720a165a15c7187892c8c1a0662ec598354ac06
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644708
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Data can be speculatively loaded from memory and stay in cache even
when bound check fails. This can lead to unintended information
disclosure via side-channel analysis.
To mitigate this problem insert a speculation barrier.
bug 2039126
CVE-2017-5753
Change-Id: I982225e754cc5d430c19f4cc542302e52243bd38
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640501
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In the nvgpu_big_free() function the passed in address is checked
to see what type of address it is: kmalloc or vmalloc. This change
uses the is_vmalloc_addr() instead since this is a much clearer and
easier way to determine if a virtual address should be vfree()ed.
Anything not a vmalloc address is then assumed to be a kmalloc()
address.
Bug 2049449
Change-Id: I2bd9441d3c5fc455f03ec2075d012c607280ad5f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1644802
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Arun Kannan <akannan@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Lots of code paths were split to T19x specific code paths and structs
due to split repository. Now that repositories are merged, fold all of
them back to main code paths and structs and remove the T19x specific
Kconfig flag.
Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added sw method support for SET_BES_CROP_DEBUG4.
In this sw method:
CLAMP_FP_BLEND_TO_MAXVAL forces overflow and
CLAMP_FP_BLEND_TO_INF blend results to clamp to FP maxval.
Added support for this sw method in gp10b/gp106/gv11b
and gv100.
Bug 2046636
Change-Id: I3a9e97587aca76718f7f504ea3b853f87409092a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1641529
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix a race condition where we'd still be booting up the gpu and/or
initializing the driver but elsewhere assume that all is done already.
Some userspace APIs to make sure that we're ready by testing
g->gr.sw_ready, but this flag is set in the middle of bootup; there are
other things after gr initialization. Add a new flag that is enabled
after bootup is fully complete at the end of finalize_poweron, and
change the checks in user API paths to test the new flag only.
These checks are only in the ioctl paths for ctrl, dbg and tsg, and in
the ctrl device's opening path.
The gr.sw_ready flag is still left there to signify whether just gr has
had its bookkeeping initialized.
Bug 200370011
Change-Id: I2995500e06de46430d9b835de1e9d60b3f01744e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640124
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Syncpoints can longer be disabled/enabled during run time as
NVGPU_HAS_SYNCPOINTS flag is set based on has_syncpoints
value in platform data during probe. Based on this, either
of syncpoint or semaphore pool is initialized.
Bug 2040115
Change-Id: Ib256e1a6ec8b1584799adb6f183fd567aebfaf13
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640380
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Enable devfreq for gv11b by enabling ""nvhost_podgov"
governor in platform data.
Reuse scaling functions from gp10b/gk20a.
Remove emc floor on railgate for power saving and make
max emc frequency as floor in rail-ungate for faster gpu boot.
Bug 2039013
Bug 200377508
Change-Id: I65ee7735202e3decbe3451157f7fc1f1f273c3ff
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1639752
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add cleanup to the gk20a_probe() function since it will often fail
due to probe deferal. These "failures" cause this function to be
called multiple times and potentially allocate many resources over
and over again, leaking the old allocations.
Bug 200369627
Change-Id: Ic0bba0ae6542485135d9cb7393086e4460cd271d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640628
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
On gv11b we can have multiple SMs per TPC. Add sm_per_tpc in
vgpu constants to properly dimension the virtual SM to TPC/GPC
mapping in virtualization case.
Use TEGRA_VGPU_CMD_GET_SMS_MAPPING to query current mapping.
Bug 2039676
Change-Id: I817be18f9a28cfb9bd8af207d7d6341a2ec3994b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631203
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
During bringup and before nvlink is up GV100 on the DDPX platform operates
with a very, very slow sysmem link. In order to get sysmem test to pass
it is neccesary to significantly increase most timeouts by an order the
magnitude.
Bug 2040544
Change-Id: I26858afde4ae80c70f86b47cfff674b6b00b5bf8
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627417
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
With a recent rework we moved gk20a_get() call to nvgpu_ioctl_tsg_open(),
but corresponding gk20a_put() call remained in gk20a_tsg_release()
So if a TSG is opened and released from within kernel with APIs
gk20a_tsg_open()/gk20a_tsg_release() we mistakenly drop extra refcount
through gk20a_put()
Fix this by moving gk20a_put() call to nvgpu_ioctl_tsg_release() which
balances gk20a_get() call in nvgpu_ioctl_tsg_open()
Bug 200374011
Change-Id: Id0cec0426e6231309dc530ab5c934dacaba9f8da
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630969
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_cde_load(), if TSG allocation fails we bail out the function without
setting the error code and caller of this functions assumes CDE load is
successful
Fix this by setting explicit error code if TSG allocation fails
Bug 200374011
Change-Id: I6e7bcb325fb0062605fa2f696da4abdeb34e241a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627117
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_cde_remove_ctx(), we unbind the channel from TSG and
close the channel. But we do not drop the TSG refcount leaking
the TSG reference
After allocating sufficient contexts, we see TSG creation fails as below
nvgpu: 17000000.gp10b: gk20a_cde_load:1286 [ERR] cde: could not create TSG
Fix this by explicitly dropping TSG refcount
Also, do not explicitly unbind the channel from TSG
gk20a_channel_close() will internally unbind the channel from TSG
Bug 200374011
Change-Id: If6d75b20d5e03d710c0597d7a320d1157206a2a5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627116
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>