Allow platforms to choose whether or not to have unified GPU
VA spaces. This is useful for the dGPU where having a unified
address space has no problems. On iGPUs testing issues is
getting in the way of enabling this feature.
Bug 1396644
Bug 1729947
Change-Id: I65985f1f9a818f4b06219715cc09619911e4824b
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265303
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove the special VMA that could be used for allocating fixed
addresses. This feature was never used and is not worth maintaining.
Bug 1396644
Bug 1729947
Change-Id: I06f92caa01623535516935acc03ce38dbdb0e318
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265302
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Cleanup and simplify the gk20a_init_vm() function to ease the
implementation of a platform dependent address space unification
decision.
Bug 1396644
Bug 1729947
Change-Id: Id8487d0e3d3c65e3357e3528063fb17c8a85f7da
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265301
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The basic structure of this patch is to make the small page allocator
and the large page allocator into pointers (where they used to be just
structs). Then assign each of those pointers to the same actual
allocator since the buddy allocator has supported mixed page sizes
since its inception.
For the rest of the driver some changes had to be made in order to
actually support mixed pages in a single address space.
1. Unifying the allocation page size determination
Since the allocation and map operations happen at distinct
times both mapping and allocation of GVA space must agree
on page size. This is because the allocation has to separate
allocations into separate PDEs to avoid the necessity of
supporting mixed PDEs.
To this end a function __get_pte_size() was introduced which
is used both by the balloc code and the core GPU MM code. It
determines page size based only on the length of the mapping/
allocation.
2. Fixed address allocation + page size
Similar to regular mappings/GVA allocations fixed address
mapping page size determination had to be modified. In the
past the address of the mapping determined page size since
the address space split was by address (low addresses were
small pages, high addresses large pages). Since that is no
longer the case the page size field in the reserve memory
ioctl is now honored by the mapping code. When, for instance,
CUDA makes a memory reservation it specifies small or large
pages. When CUDA requests mappings to be made within that
address range the page size is then looked up in the reserved
memory struct.
Fixed address reservations were also modified to now always
allocate at a PDE granularity (64M or 128M depending on
large page size. This prevents non-fixed allocations from
ending up in the same PDE and causing kernel panics or GMMU
faults.
3. The rest...
The rest of the changes are just by products of the above.
Lots of places required minor updates to use a pointer to
the GVA allocator struct instead of the struct itself.
Lastly, this change is not truly complete. More work remains to be
done in order to fully remove the notion that there was such a thing
as separate address spaces for different page sizes. Basically after
this patch what remains is cleanup and proper documentation.
Bug 1396644
Bug 1729947
Change-Id: If51ab396a37ba16c69e434adb47edeef083dce57
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1265300
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Make sure that map_offset is set to the fixed map address or 0)
before determining PTE size. Then use map_offset instead of
offset_align for computing the PTE size since offset_align
could be either an alignment ora fixed mapping offset.
Also is the minimum of the buffer size and the buffer alignment
for computing page size. This is necessary is the GMMU is doing
page gathering (i.e the buffer does not appear as a continguous
IOMMU range to the GPU). Is such cases a large page sized buffer
may be made up of a bunch of discontiguous 4k pages.
Bug 1396644
Bug 1729947
Change-Id: I6464ee6a4ccab2495ccb31cd1ddf1db467d2b215
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1271359
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Use hardware headers instead of hardcoded register numbers in priv
ring. This required updating the priv ring headers to add all the
registers and fields needed.
Incidentally this also gets rid of a lot of GPC priv ring registers
as they're not used in our code.
Also delete duplicate prints for the same information. We were
dumping GPC error also in gk20a_pbus_isr(), and we dumped master
information twice.
Dump status of each GPC separately instead of supporting only GPC0.
Change-Id: Ic50817ecc50892618fa27947fa83b05148b2cd6a
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1295481
GVS: Gerrit_Virtual_Submit
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Prune from clock gating list the entries that target units that
do not exist on gp106.
Change-Id: I192219a24d8e67de7c1fc25276dfcccbe041a05f
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294819
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
We did not follow the proper sequence to reset priv ring on error.
Instead we just re-enabled priv ring, which does not reset anything.
Rename the gk20a_reset_priv_ring() to gk20a_enable_priv_ring() to
indicate its proper use. Add another gk20a_reset_priv_ring() which
actually resets priv ring properly.
Change-Id: Ied74465b1215daa447a565b7e9cafef7fbe67d1b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294681
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
During testing it was detected that a failure in loading the firmware
for the driver would not propagate, allowing some function pointers to
be left unitialized. This would cause a kernel-crash later on.
Bug 1866370
Change-Id: I66056a1d99229d10635293d4c1685f596f197255
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1295376
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Devfreq and gpcclk require GPU v/f tables for registering correctly.
Fix this by deferring the nvgpu_probe if GPU-DVFS is not completely
initialized. Change applicable to kernels with Common Clock Framework
enabled.
Bug 200233943
Change-Id: I82dadc1b0970d47e839d6bec935330966402e93b
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: http://git-master/r/1280832
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Platform probe can return a EDEFER_PROBE, perform user init only if
platform probe is successful so that all the device objects are
created only once.
Bug 200233943
Change-Id: If6f41af13c29d070743896f26e6650228153027b
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: http://git-master/r/1280831
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
DVFS constraints for GPU are applied on gbus not on gpcclk. Make T210
K4.4 use gm20b.gbus to change the GPU clk rates and use its parent
clock gbus while querrying DVFS constraints for the GPU.
Bug 200233943
Change-Id: I2bad3266d6b8f8f3806a0d4249d9b40308c2ee6a
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: http://git-master/r/1275926
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move the sw initialization of gpcclk to probe time so that gpcclk is
ready to use before first rail ungate. Change is applicable only for
platforms with CCF enabled.
Bug 200233943
Change-Id: I7b322215041c0b88e9e2a37567af408fbbc31dc1
Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
Reviewed-on: http://git-master/r/1280830
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Instead of checking if a job is complete, only check that channel is
making progress by checking its gp_get is advancing.
This will make the watchdog conservative. Previously a whole job had
x seconds to complete. Now channel has x seconds to get host to
consume each push buffer segment.
Bug 1861838
Bug 200273419
Bug 200263100
Change-Id: I70adc1f50301bce8db7dac675771c251c0f11b70
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294850
Reviewed-by: Automatic_Commit_Validation_User
To test semaphore-related bugs with igpus, add a debugfs node called
"disable_syncpoints" to override the "has_syncpoints" platform flag.
This makes job synchronization use semaphores, for example.
NVGPU_GPU_FLAGS_HAS_SYNCPOINTS is still reported in gpu characteristics
if the platform supports that, because it is filled in during boot.
Jira NVGPU-18
Change-Id: I58c815f896a6054df472f571012c239f1478bf07
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1293972
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Remove including gk20a.h from pmu_gk20a.h. This causes a fallout
as some #includes were missing.
gr_gp10b.h uses mem_desc, but did not include mm_gk20a.h. Add the
include.
Including mm_gk20a.h in gr_gp10b.h causes recursive include, as
mm_gk20a.h has some gr defines. Move the defines to gr_gk20a.h to
remove the dependency.
gr_ctx_gk20a.h used struct gk20a pointers, but did not forward
declare it. Add a forward declaration.
gr_gk20a.h uses dbg_session_gk20a, but was missing forward
declaration.
gr_gk20a.h did not include nvgpu.h but it uses preemption types from
that header. Add include.
Change-Id: I2168e2303b55e0d187b816bcb26f37c8af1649ba
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1283717
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Hardware headers have been outdated. Regenerate with newest tool.
At the same time correct the incorrect usage of fuse fields.
JIRA DNVGPU-172
Change-Id: If190bf0cf2e41d525e6ea374a30efd1f63963e5e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1294267
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
vfe_var_construct_single_sensed_fuse() first constructs boardobj and
then does further validity checks. If the checks fail, it calls exit
label. The exit label checks if boardobj is NULL and calls destructor
if it is.
As there is no path to get to exit label with boardobj NULL, skip
the check.
Coverity ID 2011368
Change-Id: Ifea931113a7b862830b4b3f9852d9c16310a1549
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1291685
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
clk_prog_construct_1x_master_table() first constructs boardobj and
then allocates further structures. If the further allocation fails,
it calls exit label. The exit label checks if boardobj is NULL and
calls destructor if it is.
As there is no path to get to exit label with boardobj NULL, skip
the check.
Coverity ID 2011367
Change-Id: Ic157397ca42d26b7640f7b28f6a9fb929d517412
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1291684
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
If nvgpu_clk_arb_install_fd() gets an error from
nvgpu_clk_notification_queue_alloc(), it fails to free the
nvgpu_clk_dev that it allocated earlier.
Direct the error case to call an appropriate fail label.
Coverity ID 1862040
Change-Id: I1d804d4f5261ec64831938f997f9efc3f2700b60
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1291683
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
If construct_clk_prog() gets an error reported in status, it returns
NULL instead of the constructed board_obj_ptr.
Call a destructor to prevent leaking any possibly constructed
board_obj_ptr.
Coverity ID 490171
Change-Id: Icf359da6511b108a03dd86d4556c5cbb288e90de
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1291682
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
API gk20a_wait_for_idle() right now always waits for
0 usage count
But in case railgating is disabled through sysfs,
usage count will never get to 0
Hence in this case we should wait for usage count
of 1
If platform->user_railgate_disabled is set,
keep target usage count of 1, otherwise keep
target usage count as 0
Bug 200260926
Change-Id: I1a80621ca61babbd6566989dc09a7b20670c649c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1291421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make the GPU bind and rebind operations work when the driver
is idle. This required two changes.
1. Reset the GPU before doing SW init for PCI GPUs. This clears
the SW state which may be stale in the case of a rebind attempt.
2. Cleanup the interrupt enable/disables. Firstly there was one
place where nvgpu would accidentally disable the stalling
interrupt twice when the stalling interrupt and non-stalling
interrupt are the same. Secondly make sure when exiting nvgpu
that the interrupt enable/disables are balanced. Leaving the
interrupt in the -1 disable state means that next time the
driver runs interrupts never quite get enabled.
Bug 1816516
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1287643
Reviewed-on: http://git-master/r/1287649
(cherry picked from commit aa15af0aae5d0a95a8e765469be4354ab7ddd9f8)
Change-Id: I945e21c1fbb3f096834acf850616b71b2aab9ee3
Reviewed-on: http://git-master/r/1292700
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Resets the GPU without resetting the XVE/XP interfaces. This allows
the GPU to stay attached to the PCI bus but still resets all the rest
of the GPU's internal state.
Bug 1816516
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1287644
Reviewed-on: http://git-master/r/1287650
(cherry picked from commit c14efaee5d03a053d5bf229425a7594e1c6bfad0)
Change-Id: If7aba3cc8109e30bd6b6aa145836e812d50b35c5
Reviewed-on: http://git-master/r/1292699
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add a full GPU reset function to the XVE block. This allows
the driver to reset the GPU (except the XVE and XP interfaces)
to clear the GPU's state.
This is necessary for the GPU rebind to work. The state of the
GPU needs to be cleared before the new driver instance can work.
Bug 1816516
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1287642
Reviewed-on: http://git-master/r/1287648
(cherry picked from commit 7e751c0eb2c0f7d9d0b2020600c33fc8b4381878)
Change-Id: Ie2b721bf1b40acbab34de2436dea4e70d33b5611
Reviewed-on: http://git-master/r/1292698
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Prints out Timeslice value, Interleave level, Graphics preemption
mode and compute preempt mode along with chid, tsgid, pid.
Enable it with setting dbg_mask with 8192
Bug 1855710
Change-Id: I60efef9810587f8fedd4e2ba62ba67d06d84faea
Signed-off-by: Mihir Thakkar <mthakkar@nvidia.com>
Reviewed-on: http://git-master/r/1287141
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Since pwr_sensors, pwr_topology_ and pwr_policy_* tables in bios.h
are not defined as packed, nvgpu driver is not able to find hw
threshold pwr_policy table in VBIOS and ends up hard coding the HW
thershold policy.
Changed definitions to packed, and explicitly unpack structures
when parsing the power policy table. Removed the function that
did the hard coding.
Jira DNVGPU-206
Change-Id: Idc2b5b5c86ddfe735631190dda10218cc462be3b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1290303
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Copy and restore golden context correctly with
context header. Removed parallel fecs bind method,
which can cause issues for context in execution.
Also added function pointer to freeing context
header during channel context free.
Bug 1834201
Change-Id: I7962d68338d5144f624375ab81436e86cb31051e
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1275201
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We right now compare requested value to the last
freq value. Last freq value is always a rounded
value, whereas requested value need not be a
rounded value
Hence it is incorrect to compare requested value
to last freq value
Fix this by comparing rounded value to last_freq
Change-Id: I7c6ea7c4e57105598c9af75efe70016b7fa8038b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1287360
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The soc tegra headers are unified and moved all the content of
linux/tegra-soc.h to the soc/tegra/chip-id.h to have the
single soc header for Tegra.
Change-Id: I281e19dd3eb1538b8dfbea4eb0779fb64d1fcffa
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1288365
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
The fuse headers are unified and moved all the content of
linux/tegra-fuse.h to the soc/tegra/fuse.h to have the
single fuse header for Tegra.
Use unified fuse header soc/tegra/fuse.h.
bug 200260692
Change-Id: Icab3ba5c3dbcd3fa831455c2f336942d356ff5ac
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: http://git-master/r/1287498
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
bug 200269171
Updating PMU firmware to fix voltage raise when switching mclk to 810mhz
with CLFC and MSCG enabled. The fix is to make sure that clock domain is
not evaluated in CLFC if MSCG has engaged anytime after the previous
evaluation
Change-Id: I2b6979ed3361f47273f2643c27c005deac49dc8b
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/1286437
(cherry picked from commit dbfccb42614ec9361628b3c3427a65d3fe908597)
Reviewed-on: http://git-master/r/1287461
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The variable indicating the size of the buffer for GPC vf points
was not reset before the query, thus sporadic failures could
happen if the number of available VF points changed on an update
Maximum number of points increased to 256. This is the maximum
that can fit in the boardobj table
bug 200269804
Change-Id: Icb4ae386135a9bb40d4345eb73c5584fecd79147
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: http://git-master/r/1286028
Reviewed-on: http://git-master/r/1287589
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The recently added refcount tracking support had a slight mishap in
refactoring some printks. Fix a typo to make the support compile again
when enabled.
Bug 1826754
Change-Id: Ifd76d644932fa219751db82a0beb3c8482ea68c3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1285922
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>