Commit Graph

2259 Commits

Author SHA1 Message Date
Deepak Nibade
4d772455c5 gpu: nvgpu: skip channel abort for deferred reset
In case deferred_reset_pending is set in gk20a_fifo_handle_mmu_fault(), we skip
resetting the engines and skip setting the error notifier
Then we call gk20a_channel_abort()/gk20a_fifo_abort_tsg() which aborts the
channels, and resets the syncpoint values to release all the waiters

But since we don't set the error notifier, this could lead the user to assume a
successful submission without any error

To fix this disable channel/TSG in case deferred_reset_pending is set and skip
calls to gk20a_channel_abort()/gk20a_fifo_abort_tsg()

Note that we finally abort the channel when channel is being closed

Bug 200363077

Change-Id: Ia48ca369701c14d1913d8f7b66ed466b7b840224
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1665632
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
2018-02-28 09:45:27 -08:00
Bibek Basu
a8b5d8da59 gpu: nvgpu: check for emc_params
Check for emc_params before asking for clock

Bug 200388474

Change-Id: I74a67cadf554a2cc69a4cf3bce1cc26c7259a491
Signed-off-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1654974
GVS: Gerrit_Virtual_Submit
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Tested-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
2018-02-12 12:13:33 -08:00
James Huang
55f8ac57b9 gpu: nvgpu: add speculative load barrier (ctrl IOCTLs)
Data can be speculatively loaded from memory and stay in cache even
when bound check fails. This can lead to unintended information
disclosure via side-channel analysis.

To mitigate this problem insert a speculation barrier.
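
A minimal userspace sketch of the access pattern this mitigation targets, assuming a hypothetical table_read() helper. It shows a bounds check followed by branchless index clamping, modeled on the kernel's generic array_index_nospec() masking. The commit itself inserts a speculation barrier, which is a different mechanism with the same goal.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Returns ~0UL when index < size and 0 otherwise, without a conditional
 * branch. Assumes size <= LONG_MAX and an arithmetic right shift of
 * negative values (true for gcc and clang).
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Hypothetical helper: bounds-checked read with a clamped index. */
static int table_read(const int *table, unsigned long size,
		      unsigned long index, int *out)
{
	assert(table != NULL && out != NULL);
	if (index >= size)
		return -1;
	/* Even if the branch above is mispredicted, index is forced to 0. */
	index &= index_mask(index, size);
	*out = table[index];
	return 0;
}
```

Clamping keeps a speculatively executed load inside the table, whereas a barrier stops speculation past the check altogether.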

bug 2039126
CVE-2017-5753

Change-Id: Ib6c4b2f99b85af3119cce3882fe35ab47509c76f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Signed-off-by: James Huang <jamehuang@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1650050
Reviewed-by: Hayden Du <haydend@nvidia.com>
(cherry picked from commit f293fa670f)
Reviewed-on: https://git-master.nvidia.com/r/1650742
GVS: Gerrit_Virtual_Submit
Reviewed-by: Prabhu Kuttiyam <pkuttiyam@nvidia.com>
Tested-by: Prabhu Kuttiyam <pkuttiyam@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
2018-02-02 13:13:34 -08:00
Terje Bergstrom
7739ae9316 gpu: nvgpu: Allow defining min_freq and stepping
Allow defining min_freq and stepping to use for generating freq
table via Kconfig.
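
A minimal sketch of generating a frequency table from a minimum frequency and a step. The names build_freq_table() and MAX_FREQS are hypothetical, and the sketch assumes the table runs from min_freq up to a maximum in fixed steps; the actual Kconfig symbols and table layout are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREQS 32 /* hypothetical table capacity */

/*
 * Fills table with min_freq, min_freq + step, ... and always ends on
 * max_freq. Returns the number of entries written, or 0 on invalid
 * arguments. table must have room for MAX_FREQS entries.
 */
static size_t build_freq_table(unsigned long min_freq, unsigned long max_freq,
			       unsigned long step, unsigned long *table)
{
	size_t n = 0;
	unsigned long f;

	assert(table != NULL);
	if (step == 0 || min_freq > max_freq)
		return 0;

	for (f = min_freq; f < max_freq && n < MAX_FREQS - 1; f += step)
		table[n++] = f;
	table[n++] = max_freq;
	return n;
}
```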

Bug 1869602
Bug 200348636

Change-Id: I2e15366300a701e79c392beabdf2fb587e8e6891
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Rohit Vaswani <rvaswani@nvidia.com>
Reviewed-on: http://git-master/r/1297668
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit ef6f034d31d671424604c73dcee0f3a9a90746db in
rel-27)
Reviewed-on: https://git-master.nvidia.com/r/1645193
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
Tested-by: Winnie Hsu <whsu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2018-01-30 15:29:18 -08:00
Rohit Vaswani
d18c8b3dff gpu: nvgpu: Allow defining min_freq and stepping
Allow defining min_freq and stepping to use for generating freq
table via Kconfig.

Bug 1869602
Bug 200348636

Change-Id: Iaf0af19219a5ce48f424df336e5e5d27d0b7acb4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Rohit Vaswani <rvaswani@nvidia.com>
Reviewed-on: http://git-master/r/1297666
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 811880da40 in
rel-27)
Reviewed-on: https://git-master.nvidia.com/r/1645028
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
Tested-by: Winnie Hsu <whsu@nvidia.com>
2018-01-30 15:29:08 -08:00
Alex Waterman
85494f6428 gpu: nvgpu: Smarter way to check vmalloc address
In the nvgpu_big_free() function the passed-in address is checked
to see what type of address it is: kmalloc or vmalloc. This change
uses is_vmalloc_addr() instead, since this is a much clearer and
easier way to determine if a virtual address should be vfree()ed.
Anything not a vmalloc address is then assumed to be a kmalloc()
address.

rel-28: Note that this code is actually in <nvgpu/kmem.h> on rel-28
so this cherry-pick took that into account.

Bug 2049449

Change-Id: I2bd9441d3c5fc455f03ec2075d012c607280ad5f
Reviewed-on: https://git-master.nvidia.com/r/1644802
(cherry picked from commit a63e715117)
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1645512
Reviewed-by: Arun Kannan <akannan@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-25 19:22:03 -08:00
Konsta Holtta
c386221971 gpu: nvgpu: add g->sw_ready flag
Fix a race condition where we'd still be booting up the gpu and/or
initializing the driver but elsewhere assume that all is done already.

Some userspace APIs make sure that we're ready by testing
g->gr.sw_ready, but this flag is set in the middle of bootup; there are
other things after gr initialization. Add a new flag that is enabled
after bootup is fully complete at the end of finalize_poweron, and
change the checks in user API paths to test the new flag only.

These checks are only in the ioctl paths for ctrl, dbg and tsg, and in
the ctrl device's opening path.

The gr.sw_ready flag is still left there to signify whether just gr has
had its bookkeeping initialized.
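
A minimal single-threaded sketch of the two-flag arrangement described above. The struct layout, function names and the -ENODEV return code are assumptions for illustration, not the driver's actual definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct gpu {
	bool gr_sw_ready; /* gr bookkeeping initialized */
	bool sw_ready;    /* whole boot sequence finished */
};

/* Stand-in for the power-on sequence; later steps are elided. */
static void finalize_poweron(struct gpu *g)
{
	assert(g != NULL);
	g->gr_sw_ready = true; /* set midway through bootup */
	/* ... remaining initialization would run here ... */
	g->sw_ready = true;    /* set only at the very end */
}

/* Stand-in for a user API entry point such as a ctrl ioctl. */
static int user_api_entry(const struct gpu *g)
{
	assert(g != NULL);
	if (!g->sw_ready)
		return -ENODEV; /* error code is an assumption */
	return 0;
}
```

A real driver would also need suitable locking or memory ordering between the writer and the readers; the sketch only shows which flag each path tests.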

Bug 200370011

Change-Id: I2995500e06de46430d9b835de1e9d60b3f01744e
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640136
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-20 01:47:58 -08:00
Peter Boonstoppel
88bc143450 gpu: nvgpu: Remove gk20a_scale_notify_busy/idle() hooks
Remove dependency for nvgpu to invoke the devfreq governor on every
gk20a_busy/idle() call. This dependency was originally necessary to
track GPU load (busy vs idle) in software. However, since we currently
read the GPU load from HW/PMU there is no need to invoke the devfreq
governor in this path. Instead it can use timer-based polling.

Jira NVGPU-20

Change-Id: Id09f89a8a562ed49164a2e06dcbb901e4a46e7d5
Reviewed-on: https://git-master/r/1473140
(cherry picked from commit 635e9946b7)
Signed-off-by: Jon McCaffrey <jmccaffrey@nvidia.com>
Signed-off-by: Arun Kannan <akannan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1619234
GVS: Gerrit_Virtual_Submit
Reviewed-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-by: Hayden Du <haydend@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-17 17:55:29 -08:00
Peter Boonstoppel
92634f615f gpu: nvgpu: Set default devfreq polling rate
Sets default polling rate for GPU podgov governor to 25ms.

Jira NVGPU-20

Change-Id: I994f3aab772b41c238f6755e0bd22ed3d4b27cf4
Reviewed-on: https://git-master/r/1473141
(cherry picked from commit 5ad02b2d01)
Signed-off-by: Jon McCaffrey <jmccaffrey@nvidia.com>
Signed-off-by: Arun Kannan <akannan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1619233
GVS: Gerrit_Virtual_Submit
Reviewed-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-by: Hayden Du <haydend@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-17 17:55:26 -08:00
Ken Chang
793f7deb94 gpu: nvgpu: error handling for gk20a_fence alloc
In current design, the new increments are added to the threshold for
fence allocation. So if it fails to get gk20a_fence, decrease the new
increments before bailing out from __gk20a_channel_syncpt_incr()
and propagating the error code.
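
A minimal sketch of the rollback described above, assuming hypothetical types and a hypothetical fence_alloc() stand-in whose failure is simulated with a flag. It is not the body of __gk20a_channel_syncpt_incr(), only the reserve-then-undo pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct syncpt {
	unsigned int threshold; /* target value waiters compare against */
};

struct fence {
	unsigned int value;
};

/* Hypothetical allocator; fail simulates an out-of-memory condition. */
static struct fence *fence_alloc(struct fence *storage, bool fail)
{
	return fail ? NULL : storage;
}

static int syncpt_incr(struct syncpt *sp, unsigned int incrs,
		       struct fence *storage, bool fail_alloc,
		       struct fence **out)
{
	struct fence *f;

	assert(sp != NULL && out != NULL);
	sp->threshold += incrs; /* increments are reserved up front */

	f = fence_alloc(storage, fail_alloc);
	if (f == NULL) {
		sp->threshold -= incrs; /* undo the reservation */
		return -ENOMEM;
	}

	f->value = sp->threshold;
	*out = f;
	return 0;
}
```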

Bug 1867651

Change-Id: I8a21bf0afef1d9ebe660ebea59d877acad1b726a
Signed-off-by: Ken Chang <kenc@nvidia.com>
Reviewed-on: http://git-master/r/1300421
(cherry picked from commit a4a55f1851834042ab14e487f1ff0d497509ff24)
Reviewed-on: https://git-master.nvidia.com/r/1609824
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-12-11 05:31:32 -08:00
David Li
c6cd82008f gpu: nvgpu: add NVGPU_IOCTL_CHANNEL_PREEMPT_NEXT
Add NVGPU_IOCTL_CHANNEL_PREEMPT_NEXT ioctl to check host and FECS status
and preempt pending load of context not belonging to the calling channel on
GR engine during context switch. This should be called after a submit with
NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST to decrease worst case submit
to start latency for high interleave channel.
There is less than 0.002% chance that the ioctl blocks for up to a couple of
milliseconds due to a race condition of FECS status changing while being read.
Also fix bug with host reschedule for multiple runlists which needs to write
both runlist registers.

Bug 1987640
Bug 1924808
Change-Id: I0b7e2f91bd18b0b20928e5a3311b9426b1bf1848
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1549598
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-11-20 17:46:08 -08:00
Alex Waterman
f006e2daeb gpu: nvgpu: Validate buffer_offset argument
Validate the mapping_size argument in the VM mapping IOCTL before
attempting to use the argument for anything.
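
A minimal sketch of an overflow-safe range check for a mapping request. The commit title names buffer_offset while the body names mapping_size, so this hypothetical validate_mapping() checks both against the buffer size; the parameter names and the -EINVAL return code are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Returns 0 when [buffer_offset, buffer_offset + mapping_size) lies
 * inside the buffer. The subtraction form avoids overflowing
 * buffer_offset + mapping_size.
 */
static int validate_mapping(uint64_t buffer_size, uint64_t buffer_offset,
			    uint64_t mapping_size)
{
	if (buffer_offset > buffer_size)
		return -EINVAL;
	if (mapping_size > buffer_size - buffer_offset)
		return -EINVAL;
	return 0;
}
```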

Bug 1954931
Bug 1965443

Change-Id: I81b22dc566c6c6f89e5e62604ce996376b33a343
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1547046
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit e68391690c in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/1601466
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-11-20 09:01:16 -08:00
seshendra Gadagottu
6e70225c51 gpu: nvgpu: gp10b: change prod value for pg slcg
SW WAR to fix graphics slcg hang issue by updating prod
value for slcg gating register.

Bug 200349133

Reviewed-on: https://git-master.nvidia.com/r/1593076
(cherry picked from commit ac5d3fcf04)

Change-Id: Ia152cc08f715dc5e0b4d6e40501437d06621d068
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1599042
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-11-16 16:16:09 -08:00
Jonathan McCaffrey
0865b9de0c gpu: gp10b: add gfxp_wfi_timeout sysfs node
Add a sysfs node to allow root user to set PRI_FE_GFXP_WFI_TIMEOUT, for gp10b
only, in units of sysclk cycles. Store the set value in a variable, and write
the set value to register after GPU is un-railgated.

NV_PGRAPH_PRI_FE_GFXP_WFI_TIMEOUT is engine_reset after Bug 1623341.

Change default value to be specified in cycles, rather than time.  This value
is almost the current value in cycles calculated each boot.

Bug 1932782

Change-Id: I0a4207e637cd1413a1be95abe2bcce3adccf76fa
Signed-off-by: Jonathan McCaffrey <jmccaffrey@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1540939
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-09-24 21:57:35 -07:00
David Li
c37fdaaa02 gpu: nvgpu: check capability for reschedule runlist submit flag
NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST is only used by realtime
priority EGL context, which checks for CAP_SYS_NICE during context
creation in userspace, so it wasn't secure against unprivileged program
spoofing submit ioctl with this flag to stall GPU progress of others.
This flag does increase duration of submit by approx 16us,
mostly due to register accesses and PMU FIFO mutex.
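
A minimal sketch of gating a privileged submit flag on a capability. The bit value and the struct are hypothetical; in kernel code the boolean would typically come from capable(CAP_SYS_NICE), and the -EPERM return code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value for the reschedule-runlist submit flag. */
#define SUBMIT_FLAG_RESCHEDULE_RUNLIST (1u << 5)

struct caller {
	bool has_sys_nice; /* stand-in for capable(CAP_SYS_NICE) */
};

static int check_submit_flags(const struct caller *who, uint32_t flags)
{
	assert(who != NULL);
	if ((flags & SUBMIT_FLAG_RESCHEDULE_RUNLIST) && !who->has_sys_nice)
		return -EPERM; /* unprivileged callers may not reschedule */
	return 0;
}
```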

Bug 1989493
Bug 1854791
Bug 1968813

Change-Id: I086b1d14f286abf8bd2d2dfae5945974b7fe6d1f
Reviewed-on: https://git-master.nvidia.com/r/#/c/1558644
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1558683
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-09-19 02:21:40 -07:00
Konsta Holtta
42d90e17dd gpu: nvgpu: expose deterministic submit support
Add these bits in the gpu characteristics flags:

NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_NO_JOBTRACKING - fast
submits with no in-kernel job tracking are supported.

NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_FULL - deterministic
submits also with job tracking and num_inflight_jobs set are supported.

Either of these may get disabled if the particular channel or submit
still requires features that block these.

Make gk20a_channel_sync_needs_sync_framework() take a gk20a pointer
instead of a channel pointer so that it can be called without a channel.
It does not need any per-channel data.

Bug 20029130
Bug 200274674

Change-Id: I5f82510b6d39b53bcf6f1006dd83bdd9053963a0
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1456845
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit ee9733e587 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/1558993
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
2017-09-16 01:39:00 -07:00
Darren Sun
b79a75517a gpu: nvgpu: Disable rd_coalesce for all chips
Disable read coalescing for all chips.

Bug 200314091

Change-Id: Iaa3f58f94369ae1edae0620083eca4594be730fd
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1518308
Signed-off-by: Darren Sun <darrens@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1551739
GVS: Gerrit_Virtual_Submit
Reviewed-by: Hayden Du <haydend@nvidia.com>
2017-09-06 05:29:39 -07:00
Alex Waterman
5ecd220d32 gpu: nvgpu: Add su_rd_coalesce register field
Add the surface rd coalesce field in the register that
controls read coalescing.

Bug 200314091

Change-Id: I185ad7e6ef64ecae9369e26d22a7381611ddc693
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1518305
(cherry picked from commit 0dd02e634d)
Reviewed-on: https://git-master.nvidia.com/r/1551738
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Darren Sun <darrens@nvidia.com>
Tested-by: Darren Sun <darrens@nvidia.com>
Reviewed-by: Hayden Du <haydend@nvidia.com>
2017-09-06 05:29:39 -07:00
Debarshi Dutta
d73cc6808d gpu: nvgpu: changed log level for defer_probe
If platform probe fails as a result of DEFER_PROBE, it should be
reported as dev_info instead of dev_err.

Bug 1926777

Change-Id: Iba4392abdd6089da9678695b8ee7f2c92bea1505
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: http://git-master/r/1492711
(cherry picked from commit 896dc2b1b979107f968ba91582210216ab8800b7 in
dev-kernel)
Reviewed-on: https://git-master.nvidia.com/r/1550437
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
2017-09-04 02:00:18 -07:00
Sandeep Shinde
aa4daddda2 gpu: nvgpu: Add pd_max_batches sysfs node for gp10b
Add a new sysfs node pd_max_batches for setting max batches value in
NV_PGRAPH_PRI_PD_AB_DIST_CONFIG_1_MAX_BATCHES register which controls
max number of batches per alpha-beta transition stored in PD.

Bug 1927124

Change-Id: I2817f2d70dab348d8b0b8ba19bf1e9b9d23ca907
Signed-off-by: Sandeep Shinde <sashinde@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1544104
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
GVS: Gerrit_Virtual_Submit
2017-08-31 06:48:53 -07:00
David Li
813d081ba5 gpu: nvgpu: add NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST
NVGPU_SUBMIT_GPFIFO_FLAGS_RESCHEDULE_RUNLIST causes host to expire
current timeslice and reschedule from front of runlist.
This can be used with NVGPU_RUNLIST_INTERLEAVE_LEVEL_HIGH to make a
channel start sooner after submit rather than waiting for natural
timeslice expiration or block/finish of currently running channel.

Bug 1968813

Change-Id: I632e87c5f583a09ec8bf521dc73f595150abebb0
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1537218
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-08-28 14:31:53 -07:00
Stephen Warren
572c509152 nvgpu: Fix include paths for in-tree builds
When compiling the kernel with an O= option (which stores built files
outside the source tree) the kernel adds various extra source paths to the
system include path. However, this doesn't happen when building in-tree.
Adjust the main nvgpu Makefile to ensure all required paths are part of
the system include path so that all headers can be found.

Bug 1978395

Change-Id: I51ffc78b3863b89ebb5f051c963a8016258534a3
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1544320
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-08-24 18:45:43 -07:00
Alex Waterman
7ef4c5955c gpu: nvgpu: Increase small page aperture
Increase the small page aperture to 56GB to facilitate easier
fixed address mapping for userspace (primarily CUDA).

Bug 200320732

Change-Id: I1f0aaa4f28c8a294cc880b35f26942b562396b48
Signed-off-by: Sunny He <suhe@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1502432
(cherry picked from commit 2492142224)
Reviewed-on: https://git-master.nvidia.com/r/1535948
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-08-10 16:06:11 -07:00
Lauri Peltonen
9960fb7954 gpu: nvgu: Support SET_BES_CROP_DEBUG3 sw method
The new SET_BES_CROP_DEBUG3 sw method is used to flip two fields
in the NV_PGRAPH_PRI_BES_CROP_DEBUG3 register.  The sw method is
used by the user space driver to disable enough ROP optimizations
to maintain ZBC state of target tiles.

Bug 1942454

Change-Id: Id4e4d9d06c6c66080d06b6d4694546fe5cba8436
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1516202
(cherry picked from commit d3415f27c4)
Reviewed-on: https://git-master.nvidia.com/r/1520447
Reviewed-by: Donghan Ryu <dryu@nvidia.com>
Reviewed-by: Mathias Heyer <mheyer@nvidia.com>
Tested-by: Mathias Heyer <mheyer@nvidia.com>
2017-07-17 06:48:21 -07:00
Lauri Peltonen
116ae9cc11 gpu: nvgpu: Don't reject unusual ZBC colors
For some use cases, we need to program two ZBC slots with the
same DS color value but different FB color value.  Remove the
check that would reject such unorthodox ZBC entries.

Bug 1847208

Change-Id: Ibed2c8195516832789470f7f1a8c865568694c28
Signed-off-by: Lauri Peltonen <lpeltonen@nvidia.com>
Reviewed-on: http://git-master/r/1477611
(cherry picked from commit 7bc97ca7d5)
Reviewed-on: http://git-master/r/1483768
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-06-12 21:15:23 -07:00
Bharat Nihalani
96f221d6a0 nvgpu: gk20a: use usleep_range instead of msleep
msleep is not recommended for (1ms - 20ms). So use usleep_range
instead to have a more deterministic sleep time.

Also fix the print for target_ref_count that could either be 2 or
1 based on whether GPU rail-gating is enabled or not.

Bug 200294536

Change-Id: I26c9ed8a1badc84db5efa89347a227e6b46f603c
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/1500409
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-06-12 13:45:04 -07:00
Deepak Nibade
0057d09aa2 gpu: nvgpu: do not disable SM exceptions for BPT_INT
In gr_gk20a_handle_sm_exception(), we disable all SM exceptions
if SM debug mode is set and irrespective of exception type

But we should not disable SM exceptions if the only
exception is BPT_INT

Fix this by checking if only interrupt is BPT_INT and
do not disable SM exceptions in that case

Note that for rest of the exceptions we still need to
disable SM exceptions

Also, remove redundant checks of sm_debugger_attached since
we bail out early if this flag is not set anyways
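
A minimal sketch of the decision described above, assuming hypothetical bit assignments that are not taken from the hardware headers. It only models when SM exceptions get disabled, not the rest of gr_gk20a_handle_sm_exception().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for illustration only. */
#define SM_EXC_BPT_INT   (1u << 0)
#define SM_EXC_BPT_PAUSE (1u << 1)
#define SM_EXC_FAULT     (1u << 2)

struct sm_state {
	bool debugger_attached;
	uint32_t pending; /* bitmask of pending SM exceptions */
};

static bool should_disable_sm_exceptions(const struct sm_state *sm)
{
	assert(sm != NULL);
	if (!sm->debugger_attached)
		return false; /* bail out early, nothing to disable */
	/* Disable only when something other than BPT_INT is pending. */
	return (sm->pending & ~SM_EXC_BPT_INT) != 0;
}
```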

Bug 200264850

Change-Id: I7732567273fc88f6c98f25372fd8619d92339734
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1487040
(cherry picked from commit f76febb962)
Reviewed-on: http://git-master/r/1496601
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2017-06-06 16:48:37 -07:00
Thomas Fleury
56f56b5cd9 gpu: nvgpu: hal for timestamps correlation
In order to perform timestamp correlation for FECS
traces, we need to collect CPU / GPU timestamp
samples. In the virtualization case, it is possible for
a guest to get GPU timestamps by using read_ptimer.
However, if the CPU timestamp is read on the guest side,
and the GPU timestamp is read on the vm-server side,
then it introduces some latency that will create an
artificial offset for GPU timestamps (~2 us on
average). For better CPU / GPU timestamp correlation,
add a command to collect all timestamps on the vm-server
side.

Bug 1900475

Change-Id: Idfdc6ae4c16c501dc5e00053a5b75932c55148d6
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1472447
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-23 14:14:46 -07:00
Deepak Nibade
70f507eec7 gpu: nvgpu: call scale_notify_busy during resume
API gk20a_pm_resume() is called during system resume.
While unrailgating in this path, we request MAX EMC,
but then we never request a lower frequency after boot
completes.

To fix this, call gk20a_scale_notify_busy() after boot
completes in gk20a_pm_resume() so that we request
a lower EMC corresponding to current clock value

Also, add corresponding gk20a_scale_notify_idle()
call to gk20a_pm_suspend()

Bug 200308543

Change-Id: I3859222abdf67805673467f383981be519a1cece
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1484660
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-18 15:14:50 -07:00
Alex Frid
d7e3da9b8b gpu: nvgpu: tegra: Fix EMC frequency scaling
Before this commit, the call to EMC BWMGR in the postscale procedure was
skipped when the GPU rail is ON (= GPU is running). It should be the other
way around: the call should be skipped when the GPU is rail-gated.

Bug 200267304

Change-Id: Id4da84b3d0ed0606017cc53a58e2917d486fa13e
Signed-off-by: Alex Frid <afrid@nvidia.com>
Reviewed-on: http://git-master/r/1479769
(cherry picked from commit ab22d66386)
Reviewed-on: http://git-master/r/1482812
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Shreshtha Sahu <ssahu@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Tested-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
2017-05-17 21:44:59 -07:00
Deepak Nibade
47cd4860f0 gpu: nvgpu: skip taking g->busy_lock in gk20a_idle
We use g->busy_lock in gk20a_do_idle() to prevent submitting
more jobs to h/w and to wait for currently running jobs to
finish

But requesting this lock in gk20a_idle() prevents decrementing the
runtime counter, and hence gk20a_do_idle() can time out with
the prints below:
[  148.904739] gk20a 17000000.gp10b: Timeout detected @ gk20a_do_idle+0x30/0x38
[  148.912185] gk20a 17000000.gp10b: __gk20a_do_idle: failed to idle - refcount 4 != 1

Hence skip requesting this lock in gk20a_idle()

Bug 200294536

Change-Id: I060075fdee1b68e1b5fa11baa44a3f5ce4917d94
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1480777
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-16 06:05:58 -07:00
Peter Boonstoppel
3bc022cb38 gpu: nvgpu: Add czf_bypass sysfs node for gp10b
This change adds a new sysfs node to allow configuring CZF_BYPASS, to
enable platforms with low context-switching latency requirements.

/sys/devices/17000000.gp10b/czf_bypass

Values:
0 - always
1 - lateZ (default)
2 - single pass
3 - never

The specified value will apply only to newly allocated contexts.
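
A minimal userspace sketch of validating a value written to such a node, using the four values listed above. The enum names and parse_czf_bypass() are hypothetical; a kernel store handler would more likely use a helper such as kstrtouint().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

enum czf_bypass {
	CZF_BYPASS_ALWAYS = 0,
	CZF_BYPASS_LATEZ = 1, /* default */
	CZF_BYPASS_SINGLE_PASS = 2,
	CZF_BYPASS_NEVER = 3,
};

/* Parses a decimal value, optionally newline-terminated, in 0..3. */
static int parse_czf_bypass(const char *buf, unsigned int *out)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && out != NULL);
	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno != 0 || end == buf)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > CZF_BYPASS_NEVER)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}
```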

Bug 1914014

Change-Id: Ibb9a8e86089acaadaa7260b00eedec5c80762d6f
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1478567
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Donghan Ryu <dryu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-11 14:30:01 -07:00
Jake Park
3edd3cbde4 gpu: nvgpu: reduce too much debug logs
When CONFIG_DEBUG_FS is not set, lots of gpu_dbg_info spew.
To reduce the debug logs, set GK20A_DEFAULT_DBG_MASK to 0.

Bug 1885240

Change-Id: I3f60ce1b205b316641228a34fa791df7fab48c9e
Signed-off-by: Jake Park <jakep@nvidia.com>
Reviewed-on: http://git-master/r/1474997
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Kamal Balagopalan <kbalagopalan@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-05-08 10:59:44 -07:00
Debarshi Dutta
dd26a9bb65 gpu: nvgpu: changed log level for failure report
dev_info() is used in place of dev_err() when secure buffer allocation
is failing.

Bug 200302186

Change-Id: Ia43925985b0a0a79e03863b91e475068ee8449c9
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: http://git-master/r/1474476
Tested-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
2017-05-04 15:15:41 -07:00
Thomas Fleury
d664ff12af gpu: nvgpu: clip target frequencies in arbiter
It is currently possible to set GPCCLK lower than the
minimum allowed frequency.
Clip target GPCCLK/MCLK according to valid min/max range
in arbiter. We could do this before submitting request to
arbiter, but then we would lose information on the
requested target frequency. Instead, we cache the clock
range in arbiter context, and check target frequency when
running arbiter.
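
A minimal sketch of clipping at arbitration time while keeping the original request. The struct and function names are hypothetical, and frequencies are plain MHz integers for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct arb_clk {
	unsigned int min_mhz;       /* cached valid range, lower bound */
	unsigned int max_mhz;       /* cached valid range, upper bound */
	unsigned int requested_mhz; /* stored exactly as requested */
};

/* Computes the clipped target without modifying the stored request. */
static unsigned int arb_effective_target(const struct arb_clk *c)
{
	assert(c != NULL);
	if (c->requested_mhz < c->min_mhz)
		return c->min_mhz;
	if (c->requested_mhz > c->max_mhz)
		return c->max_mhz;
	return c->requested_mhz;
}
```

Because the request is left untouched, a later change to the valid range can be applied to the original target rather than to an already clipped value.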

Bug 200288036

Change-Id: I29f5176e6365a926d1041430c05a63f0c8447e2b
Reviewed-on: http://git-master/r/1460834
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
(cherry picked from commit eb626903e4fc046fe1f0eaee703c857e9a0f2b4d)
Reviewed-on: http://git-master/r/1461882
(cherry picked from commit 3615061045276c8913057e1c0e8cc443b70d2ad9)
Reviewed-on: http://git-master/r/1467627
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2017-04-24 16:29:18 -07:00
Terje Bergstrom
a81c400056 Revert "gp10b: add gr_pri_gpcs_prop_debug1 to access map"
This reverts commit 167d3524d1. The
patch was not reviewed.

Change-Id: I39024adee7e939645f1f47da46f419cc39ce48de
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1467631
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
2017-04-21 14:52:50 -07:00
Deepak Nibade
53120aa55e gpu: nvgpu: wait for engine idle in shutdown
In gk20a_pm_shutdown(), we do not check return value
of gk20a_pm_prepare_poweroff

In some cases it is possible that gk20a_pm_prepare_poweroff()
returns -EBUSY (this could happen if engines are busy)
so we don't clean up s/w state and directly
trigger GPU railgate

In case some interrupt is triggered simultaneously
we try to access a register while GPU is already railgated
This leads to a hard hang in nvgpu shutdown path

Make below changes in shutdown sequence to fix this:
- check return value of gk20a_wait_for_idle()
- disable activity on all engines with
  gk20a_fifo_disable_all_engine_activity()
- ensure engines are idle with gk20a_fifo_wait_engine_idle()
- check return value of gk20a_pm_prepare_poweroff()
- check return value of gk20a_pm_railgate()

Add a print when we bail out early in case
GPU is already railgated

Also, skip shutdown in case of VGPU since we don't need
to clean up virtual GPU, and RM server will take care of
cleaning h/w resources
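
A minimal sketch of a shutdown sequence that checks every return value. The step stubs and run_shutdown() are hypothetical, and stopping at the first failing step is an assumption about the intended behavior; the point shown is that a later step such as railgating never runs after an earlier one reports an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*step_fn)(void);

static int calls; /* counts executed steps, for illustration */

/* Stubs standing in for steps such as wait-for-idle or railgate. */
static int step_ok(void)   { calls++; return 0; }
static int step_busy(void) { calls++; return -EBUSY; }

static int run_shutdown(const step_fn *steps, size_t n,
			bool is_vgpu, bool already_railgated)
{
	size_t i;

	assert(steps != NULL);
	/* Nothing to do for a virtual GPU or an already railgated GPU. */
	if (is_vgpu || already_railgated)
		return 0;

	for (i = 0; i < n; i++) {
		int err = steps[i]();

		if (err != 0)
			return err; /* do not proceed to later steps */
	}
	return 0;
}
```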

Bug 200281010

Change-Id: I2856f9be6cd2de9b0d3ae12955cb1f0a2b6c29be
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1454658
(cherry picked from commit 6de456f840)
Reviewed-on: http://git-master/r/1461150
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-14 06:14:51 -07:00
Thomas Fleury
0a58c5ace0 gpu: nvgpu: fix pid mapping for dGPU FECS traces
For dGPU, instance block is in vidmem, and context_ptr
was not properly computed, leading to reporting pid=0 in
FECS traces.
Use gk20a_mm_inst_block_addr, which handles all cases
to determine instance block physical address.

Bug 1899195

Change-Id: If003d9f00aff66d808e66c06baf6ded38699981a
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1461646
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-13 02:34:15 -07:00
Jon McCaffrey
167d3524d1 gp10b: add gr_pri_gpcs_prop_debug1 to access map
This register is context_reset, and the value is context-switched with
GPU contexts

Bug 1855307
Bug 1871606

Change-Id: I4f871eaf94b4bb589ae179fc0ea972943acd0881
Signed-off-by: Jon McCaffrey <jmccaffrey@nvidia.com>
Reviewed-on: http://git-master/r/1452947
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Donghan Ryu <dryu@nvidia.com>
(cherry picked from commit f02591a23ab575d8592d3b2bebf94163c1323f17)
Reviewed-on: http://git-master/r/1459948
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
2017-04-11 16:04:58 -07:00
Jake Park
4d36b3dd41 gpu: nvgpu: fix build error when CONFIG_DEBUG_FS=n
replace dbg_info with gpu_dbg_info

Bug 1885240

Change-Id: I49a69e22ea5806afa56f3736f7acbe9550fcaa52
Signed-off-by: Jake Park <jakep@nvidia.com>
Reviewed-on: http://git-master/r/1455834
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-11 16:04:57 -07:00
Terje Bergstrom
98a7cc5831 gpu: nvgpu: Disable watchdog for in-kernel CE channels
Getting a timeout on kernel's own CE channels is unrecoverable.
Vidmem freeing also depends on CE to clear pages that have been
used so that they can be reused.

Disable watchdog on kernel's CE channels.

Bug 200287270

Change-Id: I87e0aa925d6d20485a5a19d2a6bfd050de34e968
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1454208
(cherry picked from commit 2ff3a9f374)
Reviewed-on: http://git-master/r/1457402
GVS: Gerrit_Virtual_Submit
2017-04-07 14:29:33 -07:00
Konsta Holtta
873e1b81bf gpu: nvgpu: avoid false job timeout message
The nvgpu timer API prints a message when the timer expires, but
expiry alone does not necessarily mean that the job has actually timed
out; that is tested by comparing gp_get. Change the expiration check to
just peek instead of using the default, which prints to the log on
expiration.
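
The distinction between a silent peek and a logging expiry check can be sketched roughly as follows. This is a minimal illustration, not the driver's code: struct timeout_sketch, timeout_peek_expired(), timeout_expired_msg() and job_timed_out() are hypothetical names, and the "gp_get unchanged means no progress" test is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a driver timer; not the nvgpu structure. */
struct timeout_sketch {
	unsigned long deadline; /* absolute expiry time, arbitrary units */
};

/* Silent check: reports expiry without logging anything. */
static bool timeout_peek_expired(const struct timeout_sketch *t,
				 unsigned long now)
{
	assert(t != NULL);
	return now >= t->deadline;
}

/* Default-style check: same test, but logs when the timer has expired. */
static bool timeout_expired_msg(const struct timeout_sketch *t,
				unsigned long now, const char *what)
{
	if (!timeout_peek_expired(t, now))
		return false;
	fprintf(stderr, "timeout: %s\n", what);
	return true;
}

/*
 * Assumed rule: a job counts as timed out only if the timer has expired
 * AND gp_get has not moved since it was last sampled (no progress).
 */
static bool job_timed_out(const struct timeout_sketch *t, unsigned long now,
			  unsigned int gp_get_saved, unsigned int gp_get_now)
{
	return timeout_peek_expired(t, now) && gp_get_now == gp_get_saved;
}
```

With the peek variant, an expired timer on a job that is still making progress produces no log output; the message is left to the caller once it has decided the job is really stuck.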

Bug 1887569
Bug 200291842
Jira NVGPU-21

Change-Id: Ifde34cff701eaed2f3ea727dba3ec8affeef26b9
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1329731
(cherry picked from commit 580c8112f0)
Reviewed-on: http://git-master/r/1452969
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-04-04 14:18:35 -07:00
David Nieto
9eef737acf gpu: nvgpu: fix vgpu shutdown code
On unbind we need to check that interrupts are complete before tearing
down the interrupt threads, but on vgpu those structures are not
initialized as they are managed by the server.

This change makes sure we do not try to free those resources on vgpu
shutdown

Bug 200293510
JIRA: EASS-1753

Change-Id: I77cb8594e1ad2c53f632e18b0dfc88f784a815e4
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
(cherry picked from commit 1a640fa6a3b41c3de7d63e14ee6770679e2c82af)
Reviewed-on: http://git-master/r/1330770
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Jinyoung Park <jinyoungp@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2017-04-04 06:03:48 -07:00
Richard Zhao
7809c16221 gpu: nvgpu: vgpu: init vars in gk20a vgpu missed
This is a quick fix. Eventually, the common probe code should be moved
into a common function shared between vgpu and native gpu.

Bug 200293437
Jira EVLR-1152

Change-Id: I55f0d179d7adba556e0cb404766e14405b3e27e5
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1330229
(cherry picked from commit 7691902fec8abdd621ee17561607efeef615499f)
Reviewed-on: http://git-master/r/1331606
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2017-04-03 23:01:05 -07:00
Cory Perry
070c7799d6 gpu: nvgpu: fix suspending all SMs
In gk20a_suspend_all_sms(), we currently loop over all GPCs and then
loop over all TPCs in the inner loop.
This is incorrect and leads to SMs with invalid GPC/TPC IDs.

Fix this by looping over the number of TPCs in each GPC in the inner
loop.

Also fix gk20a_gr_wait_for_sm_lock_down() as follows:
- we currently wait indefinitely for the SM to lock down
- restrict this wait with a timeout on silicon platforms
- return ETIMEDOUT instead of EAGAIN
- add more debug prints with additional data for SM lock down failures
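
The two fixes can be sketched as below. This is only an illustration under stated assumptions: struct gr_sketch, suspend_all_sms_sketch(), wait_for_lock_down() and countdown_done() are hypothetical names, and the bounded wait uses a simple poll count where the driver would use a real timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_GPCS 4

/* Hypothetical, simplified stand-in for the driver's GR state. */
struct gr_sketch {
	unsigned int gpc_count;
	unsigned int gpc_tpc_count[MAX_GPCS]; /* TPCs present in each GPC */
};

/*
 * Visit every valid (gpc, tpc) pair. The inner bound is the TPC count of
 * the current GPC, not a global TPC count, so no invalid pair is produced.
 * Returns the number of pairs visited.
 */
static unsigned int suspend_all_sms_sketch(const struct gr_sketch *gr,
		void (*suspend)(unsigned int gpc, unsigned int tpc))
{
	unsigned int gpc, tpc, visited = 0;

	assert(gr != NULL && gr->gpc_count <= MAX_GPCS);
	for (gpc = 0; gpc < gr->gpc_count; gpc++) {
		for (tpc = 0; tpc < gr->gpc_tpc_count[gpc]; tpc++) {
			if (suspend != NULL)
				suspend(gpc, tpc);
			visited++;
		}
	}
	return visited;
}

/* Bounded wait: poll at most max_polls times, then give up. */
static int wait_for_lock_down(int (*is_locked)(void *ctx), void *ctx,
			      unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (is_locked(ctx))
			return 0;
	}
	return -ETIMEDOUT; /* not -EAGAIN: the wait itself ran out */
}

/* Example predicate: reports "locked" once the counter reaches zero. */
static int countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return 1;
	(*remaining)--;
	return 0;
}
```

For a GPU with two GPCs holding two and one TPCs respectively, the loop visits three pairs, and none of them refers to a TPC that a GPC does not have.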

Bug 200258704

Change-Id: Id6fe32e579647fd8ac287a4b2ec80cbf98791e0d
Signed-off-by: Cory Perry <cperry@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1316471
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-on: http://git-master/r/1322373
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-03-29 19:34:36 -07:00
David Nieto
abee92ab92 gpu: nvgpu: refactor teardown to support unbind
This change refactors the teardown in remove to ensure that it is
possible to unload the driver while leaving fds open. This is achieved
by making sure that the SW state is kept alive till all fds are closed
and by checking that subsequent calls to ioctls after the teardown fail.

Normally, this would be achieved by calls into gk20a_busy(), but in
kickoff we don't call into that, to reduce latency, so we need to check
the driver status directly. We also check it in some of the other
functions, as we need to make sure the ioctl does not dereference the
device or platform struct.
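
A rough sketch of the idea, assuming a reference-counted software state with a "dying" flag. All names here (struct state_sketch, state_get(), state_put(), state_remove(), state_ioctl()) are hypothetical and do not reflect the driver's actual structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software state shared by the driver and open fds. */
struct state_sketch {
	int refcount; /* one reference for the driver, one per open fd */
	bool dying;   /* set once remove/unbind has run */
	bool freed;   /* stands in for actually releasing the memory */
};

/* Called when an fd is opened. */
static void state_get(struct state_sketch *s)
{
	assert(s != NULL);
	s->refcount++;
}

/* Called when an fd is closed; the last reference frees the state. */
static void state_put(struct state_sketch *s)
{
	assert(s != NULL && s->refcount > 0);
	if (--s->refcount == 0)
		s->freed = true;
}

/* Driver remove: mark the state dying and drop the driver's reference. */
static void state_remove(struct state_sketch *s)
{
	s->dying = true;
	state_put(s);
}

/* Every ioctl checks the flag before touching device or platform data. */
static int state_ioctl(const struct state_sketch *s)
{
	if (s->dying)
		return -ENODEV;
	return 0;
}
```

The state outlives the remove call as long as an fd is still open, and any ioctl issued in that window fails early instead of dereferencing structures that are already gone.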

bug 200277762
JIRA: EVLR-1023

Change-Id: I163e47a08c29d4d5b3ab79f0eb531ef234f40bde
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320219
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Shreshtha Sahu <ssahu@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
(cherry picked from commit e0f2afe5eb)
Reviewed-on: http://git-master/r/1327755
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
2017-03-27 12:06:06 -07:00
Terje Bergstrom
ccccde66e7 gpu: nvgpu: Enable CE always
All GPUs have a copy engine. So delete the flag has_ce, because
it's always true.

JIRA NVGPU-16

Change-Id: I89db74c7cf66b24db84301b79832862ef28100b9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1325355
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit 81660ab58c)
Reviewed-on: http://git-master/r/1328222
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
Tested-by: David Martinez Nieto <dmartineznie@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
2017-03-27 12:06:06 -07:00
David Nieto
83721dfab9 gpu: nvgpu: pass gk20a struct to gk20a_busy
After driver remove, the device structure passed in gk20a_busy can be
invalid. To solve this the prototype of the function is modified to pass
the gk20a struct instead of the device pointer.

bug 200277762
JIRA: EVLR-1023

Change-Id: I08eb74bd3578834d45115098ed9936ebbb436fdf
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320194
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 2a502bdd5f)
Reviewed-on: http://git-master/r/1327754
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
2017-03-27 12:06:06 -07:00
David Nieto
ecd71ed447 gpu: nvgpu: remove duplicated busy in debugger
In the past, we had separate calls for platform and channel busy, but
those got removed. The result is that in the debugger code we have
essentially a double busy call in the powergating enable/disable.

This change removes it.

bug 200277762
JIRA: EVLR-1023

Change-Id: Iba70b81700f27b847e1d0222fb69ed1a7a883342
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1323220
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-on: http://git-master/r/1327753
Reviewed-by: Sumeet Gupta <sumeetg@nvidia.com>
2017-03-26 07:19:23 -07:00
David Nieto
280a628a67 gpu: nvgpu: fix race condition on fifo isr
The fifo interrupt path was reading the PBDMA interrupt status
after clearing interrupts and this could lead to a situation in
which the host may have advanced to another channel, leading to
the recovery code resetting the wrong channel.
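
The ordering issue can be modelled with a small sketch. It assumes, for illustration only, that clearing the interrupt lets the host move on to the next channel; struct pbdma_sketch, clear_intr() and handle_pbdma_intr() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the PBDMA state seen by the interrupt handler. */
struct pbdma_sketch {
	uint32_t intr;      /* pending interrupt bits */
	uint32_t chid;      /* channel currently loaded */
	uint32_t next_chid; /* channel the host switches to once unblocked */
};

/* What the recovery code needs to know about the fault. */
struct fault_info {
	uint32_t intr;
	uint32_t chid;
};

/* Modelled side effect: after the clear, the host may advance. */
static void clear_intr(struct pbdma_sketch *p)
{
	p->intr = 0;
	p->chid = p->next_chid;
}

/* Snapshot the status first, and only then clear the interrupt. */
static struct fault_info handle_pbdma_intr(struct pbdma_sketch *p)
{
	struct fault_info info;

	assert(p != NULL);
	info.intr = p->intr; /* read while the faulting channel is loaded */
	info.chid = p->chid;
	clear_intr(p);       /* from here on, p->chid may differ */
	return info;
}
```

If the snapshot were taken after clear_intr(), the recorded channel would be next_chid, which is the wrong channel to reset.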

Bug 200278729
JIRA: EVLR-1036

Change-Id: I392423d1eaa8d23acf88454bf113c015e649e13d
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1326461
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
(cherry picked from commit ab401c7068)
Reviewed-on: http://git-master/r/1327292
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2017-03-24 19:36:09 -07:00