fifo_sched_disable_true_v() returns 1, and this
value is being right-shifted by runlist_id.
This works only if runlist_id is 0. For any runlist_id
other than 0, 1 right-shifted by runlist_id yields 0 and
the engine remains disabled. fifo_sched_disable_true_v()
should be left-shifted by runlist_id to fix the bug.
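A minimal sketch of the change (illustrative, not the exact driver code;
the surrounding register write is omitted):

    /* Before: only non-zero for runlist_id == 0 */
    u32 bad_mask  = fifo_sched_disable_true_v() >> runlist_id;

    /* After: the TRUE bit lands in this runlist's bit position */
    u32 good_mask = fifo_sched_disable_true_v() << runlist_id;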
Change-Id: If747035b9f6c80a21a67c63e27fb214223a55d4d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1257344
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In case one job completes just around the timeout boundary,
it is possible that we launch both the clean-up worker and
the timeout worker for the same job.
Then in the clean-up worker we try to cancel the timeout
worker, and in the timeout worker we try to wait for the clean-up
to finish, which leads to a deadlock with the stacks below:
stack 1:
[<ffffffc0000bb484>] cancel_delayed_work_sync+0x10/0x18
[<ffffffc0004f820c>] gk20a_channel_cancel_job_clean_up+0x20/0x44
[<ffffffc0004fc794>] gk20a_channel_abort_clean_up+0x34/0x31c
[<ffffffc0004fcb30>] gk20a_channel_abort+0xb4/0xc0
[<ffffffc0004f3d18>] gk20a_fifo_recover_ch+0x9c/0xec
[<ffffffc0004f3f04>] gk20a_fifo_force_reset_ch+0xdc/0xf8
[<ffffffc0004fa8c4>] gk20a_channel_timeout_handler+0xf8/0x128
stack 2:
[<ffffffc0000bb484>] cancel_delayed_work_sync+0x10/0x18
[<ffffffc0004f82c4>] gk20a_channel_timeout_stop+0x40/0x60
[<ffffffc0004fc488>] gk20a_channel_clean_up_jobs+0x7c/0x238
To fix this, cancel the timeout worker in
gk20a_channel_update() itself instead of cancelling it in
gk20a_channel_clean_up_jobs().
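A simplified sketch of the reordering (helper names other than those in
the stacks above are assumptions, not the exact driver code):

    void gk20a_channel_update(struct channel_gk20a *ch)
    {
            /* cancel the timeout worker here, before the clean-up
             * worker is ever scheduled for this job */
            gk20a_channel_timeout_stop(ch);
            gk20a_channel_schedule_job_clean_up(ch);  /* hypothetical name */
    }

    static void gk20a_channel_clean_up_jobs(struct work_struct *work)
    {
            /* no longer calls gk20a_channel_timeout_stop() here, so the
             * cancel_delayed_work_sync() <-> wait cycle is broken */
    }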
Bug 200246829
Change-Id: Idef9de3cae29668f4e25beb564422cf2e3736182
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1259963
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add IOCTL API NVGPU_DBG_GPU_IOCTL_ACCESS_FB_MEMORY
to read/write fb/vidmem memory.
The interface accepts the dmabuf_fd of the buffer in vidmem,
the offset into the buffer to access, a temporary buffer
to copy data across the API, the size of the read/write, and a
command indicating either a read or a write operation.
The API first parses all the inputs and then calls
gk20a_vidbuf_access_memory() to complete the fb access.
gk20a_vidbuf_access_memory() then simply uses
gk20a_mem_rd_n() or gk20a_mem_wr_n(), depending
on the command issued.
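A hypothetical userspace sketch; the actual argument struct and field
names live in the nvgpu uapi header and may differ:

    struct nvgpu_dbg_gpu_access_fb_memory_args args = {
            .cmd       = ACCESS_FB_MEMORY_CMD_READ, /* assumed name */
            .dmabuf_fd = vidmem_fd,                 /* buffer in vidmem */
            .offset    = 0,                         /* offset into it */
            .buffer    = (uintptr_t)staging,        /* temp user buffer */
            .size      = size,                      /* bytes to copy */
    };

    ioctl(dbg_fd, NVGPU_DBG_GPU_IOCTL_ACCESS_FB_MEMORY, &args);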
Bug 1804714
Jira DNVGPU-192
Change-Id: Iba3c42410abe12c2884d3b603fa33d27782e4c56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1255556
(cherry picked from commit 2c49a8a79d93fc526adbf6f808484fa9a3fa2498)
Reviewed-on: http://git-master/r/1260471
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
gm20b_init_gr does not inherit the ops set by gk20a_init_gr_ops, and the
gr.setup_rop_mapping HAL was not set there, so it was not set for chips
that inherit from gm20b_init_gr and do not override it explicitly.
Set the pointer in gm20b_init_gr, which other chips inherit, and delete
the surrounding if condition from the call site, so that future
users always call it; an implementation has existed since the
earliest supported chip.
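A sketch of the change (the implementation name is an assumption):

    void gm20b_init_gr(struct gpu_ops *gops)
    {
            /* ... */
            gops->gr.setup_rop_mapping = gr_gm20b_setup_rop_mapping; /* assumed */
    }

    /* call site: the NULL check around this call was removed, since every
     * supported chip now sets the HAL */
    g->ops.gr.setup_rop_mapping(g, gr);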
Bug 1833382
Change-Id: I7893c9aac7c5c49ce9a55031ea6baa9382a1b7ca
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1258960
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
gk20a/gm20b do not have an fbpa unit, although the
hw header files claim they do. Hardcode all fbpa
values to 0.
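A minimal sketch of the idea, assuming the FBPA count is exposed through
a per-chip getter (the actual mechanism in the driver may differ):

    /* gk20a/gm20b: report zero FBPAs so no FBPA register is ever
     * touched, regardless of what the generated hw headers define */
    static u32 gk20a_get_num_fbpas(struct gk20a *g)
    {
            return 0;
    }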
Bug 200249125
Change-Id: I4afb29795199552979247de7c76b6b55ea4f368f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1256420
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This change fixes the error handling logic in
gk20a_alloc_channel_gpfifo(). In cases where we don't
allocate a channel_sync at gpfifo allocation time,
we shouldn't attempt to destroy it while handling
an error.
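A simplified sketch of the guarded error path (not the exact driver code):

    clean_up_sync:
            if (c->sync) {
                    gk20a_channel_sync_destroy(c->sync);  /* only if created here */
                    c->sync = NULL;
            }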
Bug 200253447
Change-Id: I57a78c74bbce84fa17fb0360c59b8f413a9124a7
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1255858
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Instead of using an enum type for litter values, use
#define macros. This:
1. Resolves the ambiguity associated with the enum type's size.
2. Allows litter values to be extended easily in future chips.
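A sketch of the pattern (illustrative names; the real list is chip-specific):

    #define GPU_LIT_NUM_GPCS        0
    #define GPU_LIT_NUM_PES_PER_GPC 1

    u32 num_gpcs = g->ops.get_litter_value(g, GPU_LIT_NUM_GPCS);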
JIRA GV11B-21
Change-Id: Idca5144ea3754820c67831a716bb0aaf2e375eb2
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1254854
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occurring in the future.
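An illustrative example of the class of fix (names are not from the patch):

    u32 count = f->num_engines;   /* assumed field, for illustration */
    u32 i;                        /* was: int i, which warns under -Wsign-compare */

    for (i = 0; i < count; i++)
            handle_engine(i);     /* hypothetical helper */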
Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, in gk20a_scale_target, we set the clock frequency
even if it is equivalent to the rate previously requested by
the governor. This change adds a check to bypass the clock set
when new_frequency == prev_frequency.
These clocking operations result in multiple BPMP calls and add
significant overhead to submit time, so we avoid them
when possible.
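A simplified sketch of the bypass (the profile lookup and field name are
assumptions):

    static int gk20a_scale_target(struct device *dev, unsigned long *freq,
                                  u32 flags)
    {
            struct gk20a_scale_profile *profile = dev_to_profile(dev); /* assumed */

            if (*freq == profile->last_freq)
                    return 0;       /* same rate: skip the BPMP round trip */

            /* ... otherwise program the new rate and remember it ... */
            profile->last_freq = *freq;
            return 0;
    }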
Bug 1795076
Change-Id: I0f180564e54581f0f4add4626c647e0b9a1bbe43
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1247913
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Aaron Huang <aaronh@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
We currently obtain the pm_qos frequency requirements in the
qos notifier callback gk20a_scale_qos_notify().
But now we want to limit GPU frequencies based on the
frequency limits set through the devfreq nodes,
and the devfreq requirement should take precedence over the
qos requirements.
Hence, move all frequency estimation and clipping
to gk20a_scale_target(), which sets the
frequency at the end.
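A simplified sketch of the consolidation (the clipping helper is
hypothetical, not the driver's actual code):

    static int gk20a_scale_target(struct device *dev, unsigned long *freq,
                                  u32 flags)
    {
            unsigned long rate = *freq;   /* request from devfreq */

            /* estimate and clip the rate in one place: devfreq limits
             * take precedence, qos bounds are applied within them */
            rate = gk20a_scale_clip_rate(dev, rate);   /* hypothetical */

            *freq = rate;
            /* set the clock to the final, clipped rate */
            return 0;
    }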
Bug 200245796
Change-Id: I0572c676dce0acc0917924a11e4c0fb4a9db4e6e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1243427
(cherry picked from commit 81c757a3232463d126aecba64ca0c55d8e4423d2)
Reviewed-on: http://git-master/r/1239936
Reviewed-by: Aaron Huang <aaronh@nvidia.com>
Tested-by: Aaron Huang <aaronh@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This allows us to use these functions with both the Tegra and Common Clock
Frameworks.
Bug 200233943
Change-Id: I5a394d7bacfecabeabc64d32dab214d2e7cf89d7
Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com>
Reviewed-on: http://git-master/r/1242481
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This reverts commit 5f1c2bc27f.
Added back now that the matching RM server has been updated:
In hypervisor mode, all GPU VA allocations must be done by client;
fix this for the allocation of the hwpm ctxt buffer
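A hypothetical sketch of the idea; the flag and field names are
assumptions, not the driver's actual API:

    /* in virtualized (hypervisor) mode the kernel must not pick the GPU VA
     * for the hwpm context buffer; map it at the VA the client allocated */
    u64 gpu_va = g->is_virtual ? args->hwpm_ctxt_gpu_va : 0 /* kernel picks */;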
Bug 200231611
Change-Id: Ie5ce2c2562401b1f00821231d37608e3fc30d4a4
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1252138
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Issue:
warning: symbol 'pmu_allocation_get_fb_size_v3' was not declared.
Should it be static?
Fix:
Declare 'pmu_allocation_get_fb_size_v3' as static.
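A minimal sketch (the parameter list is illustrative):

    /* the helper is only used within this file, so make it static to
     * silence the sparse warning */
    static u32 pmu_allocation_get_fb_size_v3(struct pmu_gk20a *pmu,
                                             void *pmu_alloc_ptr)
    {
            /* ... */
    }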
Bug 200067946
Change-Id: If93e074ecc041e33f91cb46913f6632bf32f48f0
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1250905
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
An integration error resulted in kfree() being called twice for the
PM FBPA region of the ctxsw registers.
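A sketch of the class of fix (the field name is an assumption):

    kfree(map->pm_fbpa_regs);   /* free exactly once ... */
    map->pm_fbpa_regs = NULL;   /* ... and make a later free harmless */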
Change-Id: Ia959e024ba6f8d2c7fc43b0c7e082f34b50962a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249966
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
In calls to gk20a_fifo_recover() we pass a bitfield of engines to
recover. We generate the bitfield by acquiring the engine id from FIFO
and using BIT(). If the GR engine is not known, the resulting engine ID is
a u32 with all bits set, which cannot be passed to BIT().
gk20a_fifo_recover() can already deal with all bits set, so pass that
value verbatim instead.
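A simplified sketch (the invalid-id constant is an assumption):

    u32 engine_id = gk20a_fifo_get_gr_engine_id(g);
    u32 engines;

    if (engine_id != FIFO_INVAL_ENGINE_ID)   /* assumed constant */
            engines = BIT(engine_id);
    else
            engines = ~(u32)0;   /* unknown: recover all engines */

    /* then pass 'engines' to gk20a_fifo_recover() as before */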
Change-Id: Ib79d8e7e156deef0d483642cfb1ce7bf55f3c572
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249964
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When a buffer's IOVA is zero, treat that as an error condition instead of
ignoring it and continuing.
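A sketch of the check (the error path is illustrative):

    if (!sg_dma_address(sgt->sgl)) {
            /* zero IOVA: the mapping is unusable, bail out */
            return -ENOMEM;
    }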
Change-Id: I2ede9921945645f526b0600f61f7e5ed19af6d73
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249963
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
In the CDE GPU CONFIGURATION the result is computed using 32-bit
arithmetic and returned as a 64-bit unsigned integer. Cast an intermediate
result to u64 to prevent unintentional overflow.
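An illustrative example of the class of fix (operand names are not from
the patch):

    /* promote one operand so the whole expression is evaluated in 64 bits */
    u64 size = (u64)width * height * block_height;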
Change-Id: Iebe53e2b17c1aaa498245a52962c3dbad7ce893e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249962
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
railgate_enable_store() has two places where err is checked and
returned. Because we have only one place where err can be set,
the second check and return are superfluous.
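A simplified sketch (the parsing call is illustrative):

    err = kstrtoul(buf, 10, &railgate_enable);
    if (err)
            return err;

    /* ... the duplicate "if (err) return err;" further down was removed */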
Change-Id: Id45923fc829f061fee34fa1abca0359b443e6f0d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249960
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Move the debug write so that we access the length and base of the allocation
before the alloc structure gets freed.
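A sketch of the reordering (the logging macro and fields are illustrative):

    alloc_dbg(a, "Free 0x%llx + 0x%llx", alloc->base, alloc->length);
    kfree(alloc);   /* free only after the debug print has read it */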
Change-Id: I02e418f423beaa2b52a32d1abcff327b68dd5fa6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249959
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
We multiply the integer entry size by the number of runlist entries and
store the result in a u64. The result is used as a size of memory, so
it should be a size_t instead.
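An illustrative sketch of the change (the entry-size accessor is used
only as an example):

    size_t runlist_size = (size_t)ram_rl_entry_size_v() * num_runlist_entries;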
Change-Id: I0f5baa66ede259c9b42ede64c08f821c3e74a20b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1249957
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Fix the FBP and ROP_L2 enable masks for Maxwell+. Deprecate rop_l2_en_mask
in the GPU characteristics by adding a _DEPRECATED postfix; the array is
too small to hold the ROP_L2 enable masks for desktop GPUs.
Add NVGPU_GPU_IOCTL_GET_FBP_L2_MASKS to expose the ROP_L2 masks to
userspace.
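A hypothetical userspace sketch; the actual argument struct lives in the
nvgpu uapi header and its field names may differ:

    struct nvgpu_gpu_get_fbp_l2_masks_args args = {
            .mask_buf_size = num_fbps * sizeof(__u32),
            .mask_buf_addr = (uintptr_t)masks,   /* one ROP_L2 mask per FBP */
    };

    ioctl(ctrl_fd, NVGPU_GPU_IOCTL_GET_FBP_L2_MASKS, &args);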
Bug 200136909
Bug 200241845
Change-Id: I5ad5a5c09f3962ebb631b8d6e7a2f9df02f75ac7
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/1245294
(cherry picked from commit 0823b33e59defec341ea7919dae4e5f73a36d256)
Reviewed-on: http://git-master/r/1249883
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Define fuse macros depending on the kernel version, as the fuse
  offsets changed in K4.4 and for K4.4 the fuse defines come from the
  common header file (tegra-fuse.h).
- Use the fuse control read/write APIs when reading control
  registers on K4.4 (see the sketch below).
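A minimal sketch of the version gate; the macro and include path here are
assumptions, not taken from the patch:

    #include <linux/version.h>

    #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 4, 0)
    #include <linux/tegra-fuse.h>   /* assumed include path */
    /* use the common defines and the fuse control read/write APIs */
    #else
    /* keep the locally defined fuse offsets and raw register accessors */
    #endif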
Bug 200243956
Change-Id: I5a86ef58d9de17a273aea8d3ce8ad5772444dac2
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1245824
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
In hypervisor mode, all GPU VA allocations must be done by client;
fix this for the allocation of the hwpm ctxt buffer
Bug 200231611
Change-Id: I0270b1298308383a969a47d0a859ed53c20594ef
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1240913
(cherry picked from commit 49314d42b13e27dc2f8c1e569a8c3e750173148d)
Reviewed-on: http://git-master/r/1245867
(cherry picked from commit d0b10e84d90d0fd61eca8be0f9e879d9cec71d3e)
Reviewed-on: http://git-master/r/1246700
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following implementation:
1) Power Sensor Table parsing.
2) Power Topology Table parsing.
3) Add a debugfs interface to get the current power (mW), current (mA) and
voltage (uV) information from the PMU (see the sketch after this list).
4) Power Policy Table parsing.
5) Implement the PMU boardobj interface for the pmgr module.
6) Over-current protection.
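An illustrative sketch of the kind of debugfs read hook added for item 3
(the query helper name is an assumption):

    static int pmgr_power_show(struct seq_file *s, void *unused)
    {
            struct gk20a *g = s->private;
            u32 power_mw = 0;

            pmgr_pwr_devices_get_power(g, &power_mw);  /* assumed: ask the PMU */
            seq_printf(s, "%u mW\n", power_mw);
            return 0;
    }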
JIRA DNVGPU-47
Change-Id: I620f4470aa704f1cc920e03947831440fbb0eb05
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1217176
(cherry picked from commit ed56743c2ac8dc325c75f85a82271d2d5ed8d96a)
Reviewed-on: http://git-master/r/1241952
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, when we receive a semaphore wakeup interrupt,
we call the channel_update callback, which schedules
deferred job clean-up.
For deterministic channels, we don't allow semaphore-backed
syncs anyway. That means for these channels, if we get a
semaphore wakeup interrupt, it must be for a userspace-managed
semaphore. In this case, there is no need to call into the
channel_update callback, so for deterministic channels we
skip it.
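A simplified sketch of the skip (the flag name is an assumption):

    /* in the semaphore wakeup path, for each woken channel: */
    if (!c->deterministic)
            gk20a_channel_update(c);   /* schedule deferred job clean-up */
    /* deterministic channels: the wakeup is for a userspace-managed
     * semaphore, nothing to clean up */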
Bug 1795076
Change-Id: I4cdfecd53144078c5cd4be8a41c5c3b7d74c338e
Signed-off-by: Sachit Kadle <skadle@nvidia.com>
Reviewed-on: http://git-master/r/1225620
(cherry picked from commit 64a6db0080c3b198ddc2029544f52eb590dc08ff)
Reviewed-on: http://git-master/r/1225615
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove a global debugfs variable and instead save the allocator
debugfs root node in the gk20a struct.
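A sketch of the idea (the field names are assumptions):

    struct gk20a {
            /* ... */
            struct dentry *debugfs_allocators;  /* was a file-scope global */
    };

    a->debugfs_entry = debugfs_create_dir(a->name, g->debugfs_allocators);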
Bug 1799159
Change-Id: If4eed34fa24775e962001e34840b334658f2321c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225611
(cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a)
Reviewed-on: http://git-master/r/1242390
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move the CE cleanup to before the FIFO cleanup. Since the CE closes
a channel during its cleanup, the FIFO must still be initialized at that
point, because the FIFO code maintains the vmalloc()'ed channels.
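A simplified sketch of the reordered teardown (the cleanup entry points
are illustrative names, not the driver's exact functions):

    gk20a_ce_destroy(g);     /* moved up: closes the CE channel while the
                              * FIFO's channel bookkeeping is still alive */
    gk20a_fifo_cleanup(g);   /* hypothetical name for the FIFO teardown */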
Bug 1816516
Change-Id: Ia7a97059a12a0c2b52368ffe411e597f803e8e6e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1225613
(cherry picked from commit 707bd2a6d4672c6a7b7a8b2e581ea3a606ed971d)
Reviewed-on: http://git-master/r/1240106
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>