Instead of checking if a job is complete, only check that channel is
making progress by checking its gp_get is advancing.
This will make the watchdog conservative. Previously a whole job had
x seconds to complete. Now channel has x seconds to get host to
consume each push buffer segment.
Bug 1861838
Bug 200273419
Bug 200263100
Change-Id: I70adc1f50301bce8db7dac675771c251c0f11b70
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294850
Reviewed-by: Automatic_Commit_Validation_User
To test semaphore-related bugs with igpus, add a debugfs node called
"disable_syncpoints" to override the "has_syncpoints" platform flag.
This makes job synchronization use semaphores, for example.
NVGPU_GPU_FLAGS_HAS_SYNCPOINTS is still reported in gpu characteristics
if the platform supports that, because it is filled in during boot.
Jira NVGPU-18
Change-Id: I58c815f896a6054df472f571012c239f1478bf07
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1293972
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove including gk20a.h from pmu_gk20a.h. This causes a fallout
as some #includes were missing.
gr_gp10b.h uses mem_desc, but did not include mm_gk20a.h. Add the
include.
Including mm_gk20a.h in gr_gp10b.h causes recursive include, as
mm_gk20a.h has some gr defines. Move the defines to gr_gk20a.h to
remove the dependency.
gr_ctx_gk20a.h used struct gk20a pointers, but did not forward
declare it. Add a forward declaration.
gr_gk20a.h uses dbg_session_gk20a, but was missing forward
declaration.
gr_gk20a.h did not include nvgpu.h but it uses preemption types from
that header. Add include.
Change-Id: I2168e2303b55e0d187b816bcb26f37c8af1649ba
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1283717
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
API gk20a_wait_for_idle() right now always waits for
0 usage count
But in case railgating is disabled through sysfs,
usage count will never get to 0
Hence in this case we should wait for usage count
of 1
If platform->user_railgate_disabled is set,
keep target usage count of 1, otherwise keep
target usage count as 0
Bug 200260926
Change-Id: I1a80621ca61babbd6566989dc09a7b20670c649c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1291421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make the GPU bind and rebind operations work when the driver
is idle. This required two changes.
1. Reset the GPU before doing SW init for PCI GPUs. This clears
the SW state which may be stale in the case of a rebind attempt.
2. Cleanup the interrupt enable/disables. Firstly there was one
place where nvgpu would accidentally disable the stalling
interrupt twice when the stalling interrupt and non-stalling
interrupt are the same. Secondly make sure when exiting nvgpu
that the interrupt enable/disables are balanced. Leaving the
interrupt in the -1 disable state means that next time the
driver runs interrupts never quite get enabled.
Bug 1816516
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1287643
Reviewed-on: http://git-master/r/1287649
(cherry picked from commit aa15af0aae5d0a95a8e765469be4354ab7ddd9f8)
Change-Id: I945e21c1fbb3f096834acf850616b71b2aab9ee3
Reviewed-on: http://git-master/r/1292700
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a full GPU reset function to the XVE block. This allows
the driver to reset the GPU (except the XVE and XP interfaces)
to clear the GPU's state.
This is necessary for the GPU rebind to work. The state of the
GPU needs to be cleared before the new driver instance can work.
Bug 1816516
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1287642
Reviewed-on: http://git-master/r/1287648
(cherry picked from commit 7e751c0eb2c0f7d9d0b2020600c33fc8b4381878)
Change-Id: Ie2b721bf1b40acbab34de2436dea4e70d33b5611
Reviewed-on: http://git-master/r/1292698
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Prints out Timeslice value, Interleave level, Graphics preemption
mode and compute preempt mode along with chid, tsgid, pid.
Enable it with setting dbg_mask with 8192
Bug 1855710
Change-Id: I60efef9810587f8fedd4e2ba62ba67d06d84faea
Signed-off-by: Mihir Thakkar <mthakkar@nvidia.com>
Reviewed-on: http://git-master/r/1287141
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Copy and restore golden context correctly with
context header. Removed parallel fecs bind method,
which can cause issues for context in execution.
Also added function pointer to freeing context
header during channel context free.
Bug 1834201
Change-Id: I7962d68338d5144f624375ab81436e86cb31051e
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1275201
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We right now compare requested value to the last
freq value. Last freq value is always a rounded
value, whereas requested value need not be a
rounded value
Hence it is incorrect to compare requested value
to last freq value
Fix this by comparing rounded value to last_freq
Change-Id: I7c6ea7c4e57105598c9af75efe70016b7fa8038b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1287360
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The soc tegra headers are unified and moved all the content of
linux/tegra-soc.h to the soc/tegra/chip-id.h to have the
single soc header for Tegra.
Change-Id: I281e19dd3eb1538b8dfbea4eb0779fb64d1fcffa
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1288365
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
bug 200269171
Updating PMU firmware to fix voltage raise when switching mclk to 810mhz
with CLFC and MSCG enabled. The fix is to make sure that clock domain is
not evaluated in CLFC if MSCG has engaged anytime after the previous
evaluation
Change-Id: I2b6979ed3361f47273f2643c27c005deac49dc8b
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1286437
(cherry picked from commit dbfccb42614ec9361628b3c3427a65d3fe908597)
Reviewed-on: http://git-master/r/1287461
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The recently added refcount tracking support had a slight mishap in
refactoring some printks. Fix a typo to make the support compile again
when enabled.
Bug 1826754
Change-Id: Ifd76d644932fa219751db82a0beb3c8482ea68c3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1285922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Use the timers API in the gk20a code instead of Linux specific
API calls.
This also changes the behavior of several functions to wait for
the full timeout for each operation that can timeout. Previously
the timeout was shared across each operation.
Bug 1799159
Change-Id: I2bbed54630667b2b879b56a63a853266afc1e5d8
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1273826
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Hold debug_s->ioctl_lock for all debug session
IOCTLs to prevent multi-threaded user space
IOCTL calls
debug session IOCTL calls are not thread-safe
and hence this serialization is required
Bug 1832267
Change-Id: I847ac951601d4f0093546b592bdb8c8f00185317
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1286436
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gk20a_pm_shutdown(), we currently do not wait
for IOCTLs or threads in progress and directly
proceed with shutdown sequence
This can cause random hangs during system shutdown
Fix this by calling gk20a_wait_for_idle() after
we disable runtime PM in gk20a_pm_shutdown()
Bug 200260926
Change-Id: I0f06ba9232263fcb09c6e9d246be89deec053d44
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1286522
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Added HAL support to get current
pstate from clk_arb
Note - This function is inherently unsafe to call while
arbiter is running arbiter must be blocked
before calling this function
JIRA DNVGPU-165
Change-Id: I4e9f5eba7739280bddd9ee661fd314288c129516
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1286378
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
- with help of WRITE_ONCE() & ACCESS_ONCE()
make sure variable pmu->mscg_stat read/write goes through
without optimization
- Added WRITE_ONCE() define for kernel-3.18 version & below
to support backward compatibility
issue: inconsistencies on getting MSCG to trigger consistently in P5
due to a lack of memory barrier around and volatile accesses to the
variable pmu->mscg_stat
JIRA DNVGPU-71
Change-Id: I04d30493d42c52710304dbdfb9cb4a1e9a76f2c0
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1252524
(cherry picked from commit 8af7fc68e7ab06a856ba4ef4e44de7336682361b)
Reviewed-on: http://git-master/r/1271614
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Print process name if we fail submit due to gk20a_busy()
failure
This is helpful in debugging and to know the process name
submitting jobs to nvgpu after system shutdown was
already triggered
Bug 200262275
Change-Id: I34d8c07fc96fd5556afa982bfd56f7f3964449d0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1284113
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
When gpu host is executing a context, there should not be any calls
to fecs that can change the current context in execution. For some
reason legacy fmodels are calling fecs method to golden
context restore while loading golden context for new channel.
This call is not required and should not be called. Only first
time during golden context creation, fecs methods like bind can be
called and it is pretty safe to do.
Bug 1834201
Change-Id: Ia6178e875e3ac37fb1cf10e27976c26b9a02c56f
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1284512
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Setting timeslice for virtualized case was not effective,
because both ioctls NVGPU_TSG_IOCTL_SET_TIMESLICE and
NVGPU_SCHED_IOCTL_TSG_SET_TIMESLICE were calling the
native function to set TSG timeslice.
- Fixed wrapper function to call HAL
- Defined HAL function for "native" set TSG timeslice
- Also, properly update timeout_us in TSG context, in
virtualized case.
This change also moves the min/max bounds checking for
tsg timeslice into the native function implementation.
There is no sysfs node for these parameters for vgpu,
as RM server is ultimately responsible for this check.
Bug 200263575
Change-Id: Ibceab9427561ad58ec28abfff0c96ca8f592bdb9
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1283180
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The timeslice values that can be selected for a particular channel/tsg
are bounded by a static min/max. This change introduces two sysfs
nodes that allow these bounds to be configured from userspace.
min_timeslice_us
max_timeslice_us
Bug 200251974
Bug 1854791
Change-Id: I5d5a14225eee4090e418c7e43629324114f60768
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1280372
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When kernel adds patches to a context, kernel needs to update
the patch count in order for FECS to pick up the new patches.
Previously patching was done only at the context creation
time. Now patching is used also when changing preemption mode,
but the patches did not take effect due to not updating count.
Update patch count every time we end patching of a context.
Bug 1852094
Change-Id: Ic2150741609d1d1956769e439ce1c5f2edcacb84
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1280424
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reorganize the HW headers of gk20a. The headers are moved to a
new directory:
include/nvgpu/hw/gk20a
And from the code are included like so:
#include <nvgpu/hw/gk20a/hw_pwr_gk20a.h>
This is the first step in reorganizing all of the HW headers for
gm20b, gm206, etc. This is part of a larger effort to re-structure
and make the driver more readable and scalable.
Bug 1799159
Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
If enabled, track actions (gets and puts) on channel reference counters.
Dump the most recent actions to syslog when
gk20a_wait_until_counter_is_N gets stuck when closing a channel.
GK20A_CHANNEL_REFCOUNT_TRACKING specifies the size of the action
history. Default is to disable completely, as this has some runtime
overhead.
Bug 1826754
Change-Id: I880b0efe8881044d02ae224c243a51cb6c2db8c1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1262424
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Added struct pmu_pg_stats_data to extract
data from multiple version of pmu pg statistics
- Added pmu_pg_stats_v2 interface to fetch
PG statistics data from PMU
- Added MSCG debugfs node to read mscg
statistics from PMU.
- Added pmu_elpg_statistics HAL support for
gp106 PG statistics read.
- Made changes to gp104/gp106
pmu_elpg_statistics HAL to support
for struct pmu_pg_stats_data
JIRA DNVGPU-165
Change-Id: I2b9e89c0fae90deb45006c4478170b9a97b56603
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1252798
(cherry picked from commit 3c073b15fd991db8d65b3171b02c161294be40cd)
Reviewed-on: http://git-master/r/1271615
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Kickoff latencies when GPFIFO is in vidmem grow significantly as
function of number of GPFIFO entries. Move GPFIFO to sysmem to
improve kickoff latency.
Bug 1848369
Change-Id: Ie95d10df26b4f1370f7250a8fbf0f7ef0211df32
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry-picked from commit 897cb579a759bbe8455ce368413979e91eb0d475)
Reviewed-on: http://git-master/r/1281564
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move the GPU allocators to common/mm/ since the allocators are common
code across all GPUs. Also rename the allocator code to move away from
gk20a_ prefixed structs and functions.
This caused one issue with the nvgpu_alloc() and nvgpu_free() functions.
There was a function for allocating either with kmalloc() or vmalloc()
depending on the size of the allocation. Those have now been renamed to
nvgpu_kalloc() and nvgpu_kfree().
Bug 1799159
Change-Id: Iddda92c013612bcb209847084ec85b8953002fa5
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1274400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_busy() we check if dev is NULL and return -ENODEV if so. But
before that we've already dereferenced dev by passing it to
get_gk20a(). Defer call to get_gk20a() until after the NULL check.
Bug 200192125
Change-Id: I943a9e96d13ff8cb4333fe20a941c8e95d159a66
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1280349
GVS: Gerrit_Virtual_Submit