Use the new kmem API functions in the channel and channel
related code.
Also delete the usage of kasprintf() since that must be paired
with a kfree(). Since the kasprintf() doesn't use the nvgpu kmem
machinery (and is Linux specific) instead use a small buffer
statically allocated on the stack.
Bug 1799159
Bug 1823380
Change-Id: Ied0183f57372632264e55608f56539861cc0f24f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1318312
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Really just delete the usage of kasprintf() since that must be
paired with a kfree(). Since the kasprintf() doesn't use the
nvgpu kmem machinery (and is Linux specific) instead use a small
buffer statically allocated on the stack.
Bug 1799159
Bug 1823380
Change-Id: Ic26f39da401df926024c67e8ff5d979f683a6d3e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1318311
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Use the correct type for the size argument in __nvgpu_kmalloc() and
make sure the size_t type is available for usage.
Bug 1799159
Bug 1823380
Change-Id: I7d6aea964e065e576c8bc3383a9b2326639c018f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1318308
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
VBIOS might change size of the power policy table and add new fields.
So driver should validate a minimum size, not absolute size.
Also driver should use header size from data read from vbios
to parse the table further.
bug 200287822
Change-Id: Ica60de444c3f81cbf61af8684ce4e32ae288188d
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1322242
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
On kernel 4.9, the DMA API has changed, so use the NVIDIA compatibility
wrappers from dma-attrs.h to allow the code to build for both 4.4 and
4.9.
Bug 1853519
Change-Id: I0196936e81c7f72b41b38a67f42af0dc0b5518df
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1321102
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Don't use enum dma_attr in the gk20a_gmmu_alloc_attr* functions, but
define nvgpu-internal flags for no kernel mapping, force contiguous, and
read only modes. Store the flags in the allocated struct mem_desc and
only use gk20a_gmmu_free, remove gk20a_gmmu_free_attr. This helps in OS
abstraction. Rename the notion of attr to flags.
Add implicit NVGPU_DMA_NO_KERNEL_MAPPING to all vidmem buffers
allocated via gk20a_gmmu_alloc_vid for consistency.
Fix a bug in gk20a_gmmu_alloc_map_attr that dropped the attr
parameter accidentally.
Bug 1853519
Change-Id: I1ff67dff9fc425457ae445ce4976a780eb4dcc9f
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1321101
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
When we get a PBDMA MMU fault, we won't be able to map the MMU id
into an engine id for reset. We still pass FIFO_INVAL_ENGINE_ID to
gk20a_fifo_should_defer_engine_reset() which causes an unnecessary
debug spew.
Check for FIFO_INVAL_ENGINE before calling
gk20a_fifo_should_defer_engine_reset().
Change-Id: I6f4a49be194cbc6070c1a1c667059de2ea79790f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1321492
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Simulator support is intermixed with the rest of code in gk20a.c.
Move that code away from gk20a.c to an own file.
Change-Id: Idd3c8795cec5eadc6e49811b5b8ff0592c49a7d2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1323230
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
The main driver structure is not refcounted properly,
so when the driver unload, file desciptors associated to the
driver are kept open with dangling references to the main object.
This change adds referencing to the gk20a structure.
bug 200277762
JIRA: EVLR-1023
Change-Id: Id892e9e1677a344789e99bf649088c076f0bf8de
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1317420
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The driver is not properly tearing down the arbiter on the PCI driver
unload. This change makes sure that the workqueues are drained before
tearing down the driver
bug 200277762
JIRA: EVLR-1023
Change-Id: If98fd00e27949ba1569dd26e2af02b75897231a7
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1320147
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gk20a_channel_clean_up_jobs, move removal of job from channel's job list
to before fences are cleaned up; this will prevent gk20a_channel_abort from
asynchronously trying to dereference an already freed job.
Bug 1844305
JIRA EVLR-849
Change-Id: I1ba05237aa74be1350007630bfa5eba9988f859a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
(cherry picked from commit 2a9ce58b1b318b95ecfcdf78462f918d090eab99)
Reviewed-on: http://git-master/r/1319026
(cherry picked from commit 990f070b0a363159ce1b21f936b7512f469018ca)
Reviewed-on: http://git-master/r/1321624
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move all programming of FB to fb_*.c files, and remove the inclusion
of FB hardware headers from other files.
TLB invalidate function took previously a pointer to VM, but the new
API takes only a PDB mem_desc, because FB does not need to know about
higher level VM.
GPC MMU is programmed from the same function as FB MMU, so added
dependency to GR hardware header to FB.
GP106 ACR was also triggering a VPR fetch, but that's not applicable
to dGPU, so removed that call.
Change-Id: I4eb69377ac3745da205907626cf60948b7c5392a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1321516
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move clock APIs from gk20a_platform to gpu_ops. At the same time
allow use of internal get_rate/set_rate for querying both GPCCLK
and PWRCLK on iGPU.
At the same time we can replace calls to clk framework with the
new HAL and drop direct dependency to clk framework.
gp10b ops were replaced as a whole at HAL initialization. That
replaces anything set in platform probe stage, so reduce that to
touch only clock gating regs.
Change-Id: Iaf219b1f000d362dbf397d45832f52d25463b31c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1300113
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Platform files are used for adding code to probe for Tegra Linux
platform. Move the files to Tegra Linux directory to make this
clear.
Change-Id: Ida66af835688325f095260c618dad90395851267
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1300112
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
get_rate is already used for call-back that queries the last set
clock rate. This instance of get_rate actually measures the frequency
so renaming it to measure_freq.
At the same time modify to use hertz instead of megahertz. We use
fractional megahertz already in GPU.
Change-Id: I387473d6a6cbf3bb9b9e5a909677a1a725403c32
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1300111
Reviewed-by: Alex Waterman <alexw@nvidia.com>
In the timers code a macro was using __builtin_return_address(0)
when it should have been using _THIS_IP_.
__builtin_return_address(0) will cause the timers code to print
the return address of the function that calls the timers code. This
isn't actually useful, of course. A user actually cares about where
the timers code call comes from which is easily obtained with
_THIS_IP_.
Bug 1799159
Change-Id: Iac16bc79e89e4cd18133db3d20f5a50d4d5e8f31
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1320839
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The call site (gk20a_submit_prepare_syncs) owns the incr_cmd buffer
passed to __gk20a_channel_semaphore_incr. Delete the free in the error
path of the latter case to avoid freeing the same buffer twice.
Bug 1853519
Change-Id: I9b90ce7ebb17ac63992938c7f9fe90bbd139f85f
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1321117
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Read fuse values from GPU's own fuse registers instead of Tegra fuse
registers whenever possible. This reduces the number of dependencies
to Linux fuse code.
Some fuses do not have a corresponding register in GPU, so they're
left as is.
Change-Id: Id9f2f4da897f3e20b20c300a67f705e3fa5ba35a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1318278
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Query the RM server to retrieve gpu preemption policy
of guest based on pct configuration. If guest is not
allowed to request wfi preemption mode then set context
with either gfxp or cta preemption mode only.
Jira VFND-3079
Jira VFND-3081
Change-Id: I60cbf121d6f0e2373568cf40b3dfdb4df76fe02d
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: http://git-master/r/1280903
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Sachit Kadle <skadle@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
dGPU's SEC2 is passed frequency, which is queried with
clk_get_rate(). dGPU clocks are not represented in CCF, so the query
always returned an error. The value is ignored, so this went
unnoticed.
Replace the call to clk_get_rate() by just hard coding 0 as the clock
rate.
Change-Id: I86fec3726d2b6683cdadd86cab1672f3b199378f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1319068
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Navneet Kumar <navneetk@nvidia.com>
In gk20a_suspend_all_sms(), we currently loop
over all GPCs and then loop over all TPCs in inner
loop
But this is incorrect and leads to SM with
invalid GPC,TPC ids
Fix this by looping over number of TPCs in each
GPC in inner loop
Also, fix gk20a_gr_wait_for_sm_lock_down() as
per below
- we right now wait infinitely for SM to lock down
- restrict this wait with a timeout on silicon
platforms
- return ETIMEDOUT instead of EAGAIN
- add more debug prints with additional data
for SM lock down failures
Bug 200258704
Change-Id: Id6fe32e579647fd8ac287a4b2ec80cbf98791e0d
Signed-off-by: Cory Perry <cperry@nvidia.com>
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1316471
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
JIRA: EVLR-1004
(*) Refactor the non-stalling interrupt path to execute clear on the
top half, so on dGPU case processing of stalling interrupts does not
block non-stalling one.
(*) Use a worker thread to do semaphore wakeups and allow batching of
the non-stalling operations.
(*) Fix a bug where some gpus will not properly track the completion
of interrupts, preventing safe driver unloads
Change-Id: Icc90a3acba544c97ec6a9285ab235d337ab9eefa
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1312796
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Navneet Kumar <navneetk@nvidia.com>