8968 Commits

Author SHA1 Message Date
mpoojary
3de579438e gpu: nvgpu: Add lsf encrypt check for ACR blob prep
Add an lsf encrypt flag check for ACR blob preparation.
If any ucode image is encrypted, NVGPU will pass the same
previous blob during the recovery sequence instead of a
blob size of 0.

Bug 4617207
Bug 4786365

Change-Id: I7a76cc426dda90930e8b7eded9656af9f2eb952a
Signed-off-by: mpoojary <mpoojary@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3191453
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: buildbot_gerritrpt <buildbot_gerritrpt@nvidia.com>
2024-08-14 10:04:10 -07:00
Pruthav Sanwatsarkar
632d5fbc32 nvgpu: Remove newline from warning statement
The ap_devtools_judy kernel warning test saw
consistent failures because empty newline
warns were being reported as kernel failures. These
newline warnings were found to be reported
only from the changed kernel warning.

Since no other nvgpu warns have newline
endings, remove the newline ending from the
warn that causes the error in this case.

Change-Id: Iaaf415085708eb970ae74f01c18be989ca068776
Signed-off-by: Pruthav Sanwatsarkar <pruthavs@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3014285
Reviewed-by: Martin Radev <mradev@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3100656
2024-03-20 23:34:34 -07:00
Ramalingam C
4a95d2b7ea gpu: nvgpu: allow powergating during bind_context ioctl
Do not check whether powergating is disabled for the context bind and
unbind ioctls. Binding and unbinding a context only touches the sw state
machine, so these ops can proceed even if power gating is still on. There
is no need to print the error message.

Bug 4397568

Change-Id: Idc71e12a4188cbc1c94c38032a2f1ed435d278ce
Signed-off-by: Ramalingam C <ramalingamc@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3034385
(cherry picked from commit 5f1dfe20366637f59b04b8968a527dbc3fac0d9f)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/3035549
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Tested-by: Tushar Kashalikar <tkashalikar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-12-26 22:49:40 -08:00
shaochunk
c655a5e058 gpu: nvgpu: specify devfreq timer through dt
Originally, nvgpu used a deferrable timer for devfreq polling by default.
Because the polling interval is unstable, this leads to the issues below:
 - out of time frequency scaling
 - unstable GPU frequency scaling

This change lets users specify the devfreq timer through dt.
If the dt node 'devfreq-timer' is equal to 'delayed', then the gpu uses a
delayed timer for devfreq polling.
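
The selection described above can be sketched in plain C as follows. This is a minimal userspace model, not the driver code: the enum, the function name and the NULL convention for a missing property are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two devfreq timer kinds. */
enum timer_kind { TIMER_DEFERRABLE, TIMER_DELAYED };

/*
 * Pick the polling timer from the value of the 'devfreq-timer' property.
 * A NULL value models a device tree that does not set the property.
 */
static enum timer_kind pick_devfreq_timer(const char *dt_value)
{
    if (dt_value != NULL && strcmp(dt_value, "delayed") == 0)
        return TIMER_DELAYED;
    return TIMER_DEFERRABLE; /* the previous default stays in place */
}
```

Any value other than 'delayed' keeps the deferrable timer in this sketch, so existing device trees would behave as before.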

Bug 3823798

Change-Id: Idc0849b4a6b8af52fda8e88f5c831f183b7a27de
Signed-off-by: shaochunk <shaochunk@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2897026
Reviewed-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-05-19 03:36:48 -07:00
Sagar Kamble
c066401be7 gpu: nvgpu: disable nvgpu rpm if genpd support is not available
GPU is set to always ON state on safety L4T for SMCU to not fault.
However, nvgpu railgating was always enabled. This will lead to
improper GPU railgate/unrailgate sequence as bpmp will not
powergate/ungate the gpu on suspend and resume requests.

Keeping rpm enabled can lead to ACR failure on resume as it expects
the GPU to be reset on every resume.

Disable nvgpu runtime PM when the power domain node for the gpu is
not defined.

Bug 4111746

Change-Id: I9215ea87dbfbf53360003cac5f8a51d39982ace9
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2904335
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-05-16 02:06:47 -07:00
Divya
c62bdb94ba gpu: nvgpu: add sysfs node for golden img status
- Add a sysfs node "golden_img_status" to show
  if golden_image size and ptr are already initialized
  or not.
- This node helps to know golden image status before
  attempting to modify gpc/tpc/fbp masks.

Bug 3960290

Change-Id: I3c3de69b369bcaf2f0127e897d06e21cb8e2d68e
Signed-off-by: Divya <dsinghatwari@nvidia.com>
(cherry picked from commit c728f09c18)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2864095
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-05-05 06:06:47 -07:00
Jinesh Parakh
0ed7416297 gpu: nvgpu: Fix Explicit null dereference
Fix the following Coverity Defect:
pwrpolicy.c : Explicit null dereference

CID 10059138

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: Ie572e0608d0b07d5023e7cca878d16087cfc284f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2717978
(cherry picked from commit 658f83ca48)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2897902
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2023-05-04 01:51:46 -07:00
Sagar Kamble
cd7044b401 gpu: nvgpu: fix pmu_board_obj init in construct_pwr_policy
Fix below CERT violation:
In construct_pwr_policy: Do not dereference null pointers.

This was introduced in the below commit:

    commit 700bd83b41 ("gpu: nvgpu: Rename/clean boardobj unit")

CID 203372
Bug 3512546

Change-Id: I30a2ce13f9df343a1dc74fdd7427ccf65b228a3e
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2710234
(cherry picked from commit da884615d3)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2897901
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2023-05-04 01:51:41 -07:00
atanand
a1b0d921b4 gpu: nvgpu: Get GA10B EMC floorsweeping status
The memory bandwidth reported by the nvgpu driver is a result of the FBP
and EMC floorsweeping status. The FBP floorsweep status was already
reported in the GPU characteristics, so this change fetches and reports
the EMC status as well.

Jira NVGPU-9609
Bug 3661074

Change-Id: Ia2fe6cb029d086765da15d9e964ea77256e06604
Signed-off-by: atanand <atanand@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2859237
(cherry picked from commit 9dd2a8fc73)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2892943
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Kirill Artamonov <kartamonov@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Tested-by: Kirill Artamonov <kartamonov@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-27 11:36:51 -07:00
Jinesh Parakh
1cb90f30c9 gpu: nvgpu: Fixed out-of-bounds Coverity Defects
Fix following Coverity Defects:
clk_mon_tu104.c : Out-of-bounds read and Out-of-bounds access

CID 10061400
CID 10061401

Bug 3460991

Changed the datatype of domain_mask from u32 to unsigned long
to solve the out-of-bounds defect.
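
One plausible reason for this kind of defect, which the commit message does not spell out, is that word-based bit search helpers read whole unsigned long words, so a 32-bit mask passed by pointer can be over-read on a 64-bit build. The sketch below is a userspace illustration under that assumption; next_set_bit() and count_domains() are hypothetical names, not nvgpu or kernel functions.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified bit search: it indexes the mask in unsigned long words,
 * so the object it points at must be at least one such word wide.
 */
static size_t next_set_bit(const unsigned long *mask, size_t nbits, size_t start)
{
    for (size_t bit = start; bit < nbits; bit++) {
        if (mask[bit / BITS_PER_ULONG] & (1UL << (bit % BITS_PER_ULONG)))
            return bit;
    }
    return nbits; /* no further bit is set */
}

/* Count the clock domains selected in domain_mask (lower 32 bits). */
static unsigned int count_domains(unsigned long domain_mask)
{
    unsigned int count = 0;
    size_t bit = next_set_bit(&domain_mask, 32, 0);

    while (bit < 32) {
        count++;
        bit = next_set_bit(&domain_mask, 32, bit + 1);
    }
    return count;
}
```

Because domain_mask is an unsigned long here, the pointer handed to the search helper always refers to a full word.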

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I1c43bd90053264ee4104ca8c3a33d9ea07f04045
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2708765
(cherry picked from commit bb73cf9597)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2890021
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2023-04-19 04:07:30 -07:00
Sagar Kamble
76792585b5 gpu: nvgpu: add hal to get the bar2 vm size
On ga10b+ platforms, more VM space is needed to map various buffers
to bar2 vm. Engine method buffer is mapped for each pbdma and for
maximum supported TSGs this requires more than 32MB of space.
Also we need to consider fault buffer space and vab buffer
space requirement.

Bug 3958581

Change-Id: I9ee87119f762352ee12859b71c08a5f75b3554e0
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2872811
(cherry picked from commit 53dc53a8b4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2881179
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-04-03 07:51:45 -07:00
Sagar Kamble
dd8f7114ba gpu: nvgpu: update dmabuf locking
All drivers that use dma-bufs have been moved to the updated locking
specification, wherein the dma-buf reservation is to be locked while
accessing the dmabuf internal data. The old lock has been removed, so
from now on lock the resv object while updating the dmabuf private data
used for compression and buffer metadata.

With this, compression, which was earlier disabled for v6.2+ kernels,
can be enabled for all kernel versions.

Bug 3974855
Bug 3995618

Change-Id: Iece3ab57912d0420d4bc5c07d2c0d2e03ff19292
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2877633
(cherry picked from commit 410d3603ff)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2880975
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2023-04-03 01:06:38 -07:00
Kishan
ddbd2da4a9 gpu: nvgpu: Fix gv11b LUT for safe jetpack product
The current ga10b LUT used in gv11b is tailor-made for auto safety,
wherein non-ECC errors are treated as fatal and quiesce is
triggered accordingly. Recovery is also not supported.
Jetson industrial expects recovery in scenarios where it can
be supported.
Replaced the ga10b automotive safety based LUT with a gv11b
safe jetpack specific LUT. With this LUT, error criticality
is consistent across rel-32 and rel-35.
The supported behaviour is:
1. Corrected ECC error: we report it as a non-fatal
   error and only convey the error to L1SS.
2. Uncorrected ECC error: we report it as a fatal error
   and hence trigger quiesce.
3. Non-ECC error: we report it as non-fatal and let
   nvgpu perform recovery if it exists.
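
The three cases above can be written as a small lookup function. This is only a sketch of the described policy: the enum, the struct and the function name are hypothetical, and the real LUT is keyed by individual error ids rather than by three broad classes.

```c
#include <stdbool.h>

/* Hypothetical error classes matching the three cases in the list. */
enum err_class { ERR_ECC_CORRECTED, ERR_ECC_UNCORRECTED, ERR_NON_ECC };

struct err_policy {
    bool fatal;   /* reported as a fatal error */
    bool quiesce; /* quiesce is triggered */
    bool recover; /* nvgpu may attempt recovery where it exists */
};

static struct err_policy lut_policy(enum err_class cls)
{
    switch (cls) {
    case ERR_ECC_UNCORRECTED:
        return (struct err_policy){ .fatal = true, .quiesce = true, .recover = false };
    case ERR_NON_ECC:
        return (struct err_policy){ .fatal = false, .quiesce = false, .recover = true };
    case ERR_ECC_CORRECTED:
    default:
        /* Only conveyed to L1SS; no quiesce and no recovery. */
        return (struct err_policy){ .fatal = false, .quiesce = false, .recover = false };
    }
}
```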

Bug 3920935

Change-Id: Iaa64aa91d6dd84b21c4d0c4684ead498e398698a
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2866975
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-03-15 10:36:32 -07:00
Sagar Kamble
90e7747074 gpu: nvgpu: Enable compression for k6.1
dmabuf internals that nvgpu relies upon for storing meta-data for
compressible buffers changed in k6.2. Enable it for k6.1.

Bug 3844023

Change-Id: Ief661b3739e987dc8f2fe13bb0efb02fa78dbacd
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2861335
(cherry picked from commit a4eca46b4b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2866992
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-03-10 03:51:49 -08:00
Alex Waterman
ac6e0c3766 gpu: nvgpu: Disable compression for k6.1+
dmabuf internals that nvgpu relies upon for storing meta-data for
compressible buffers changed in k6.1. For now, disable compression
on all k6.1+ kernels.

Additionally, fix numerous compilation issues due to the bit rotted
compression config. All normal Tegra products support compression
and thus have this config enabled. Over the last several years
compression dependent code crept in that wasn't protected under the
compression config.

Bug 3844023

Change-Id: Ie5b9b5a2bcf1a763806c087af99203d62d0cb6e0
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2820846
(cherry picked from commit 03533066aa)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2860925
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-03-10 03:51:44 -08:00
Jon Hunter
e914561b6e gpu: nvgpu: Fix crash on reboot
A kernel panic has been observed sometimes on reboot. The crash occurs
in the nvgpu_kernel_shutdown_notification() function when calling
nvgpu_cond_signal(). Fix this by checking that the 'gr' pointer is valid
before calling nvgpu_cond_signal().
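
The shape of the fix can be sketched as below. The structs and the function name are hypothetical stand-ins, and the counter only models the effect of signalling the condition variable.

```c
#include <stdbool.h>
#include <stddef.h>

struct gr_state { int signals; };   /* hypothetical: counts wake-ups */
struct gpu { struct gr_state *gr; };

/* Returns true if waiters were signalled, false if 'gr' was not set up. */
static bool shutdown_notify(struct gpu *g)
{
    if (g->gr == NULL)
        return false; /* nothing to signal; this avoids the dereference */
    g->gr->signals++; /* models signalling the condition variable */
    return true;
}
```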

Bug 3943885

Change-Id: I81e5e1b1128f22832daf01b880fac2a5e38f2a7a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2846761
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-01-20 08:51:28 -08:00
Jon Hunter
82b95758d1 nvgpu: Fix devnode function pointers for Linux v6.2
Upstream Linux kernel commit ff62b8e6588f ("driver core: make struct
class.devnode() take a const *") updated the 'devnode' function pointer
under the class structure to take a const device struct. This breaks
building the NVGPU driver with Linux v6.2. Make the necessary changes to
the NVGPU driver to fix the build breakage.

Bug 3936429
Bug 3844023

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Change-Id: Ia39d7fded8df0e4eb30ebd58b2261e48e1963549
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2841032
(cherry picked from commit 315813beac)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2842397
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-01-15 11:21:33 -08:00
Sagar Kamble
eacaf8cec2 gpu: nvgpu: register pm_qos min & max frequency notifiers
nvpmodel updates the devfreq frequency limits as per power requirements
for the specific chip. The clock arbiter ignored these limits and set the
clock to the maximum supported frequency, which may lead to power leakage
and overheating.

Add support to get the devfreq limits by registering PM_QOS notifiers.
Note that with this patch we enable CONFIG_GK20A_PM_QOS when PM_DEVFREQ
is enabled. So it will be enabled for all supported kernels (4.9, 4.14
kernels continue to support this. For 5.10+ kernels notifiers added in
this patch will be used. Thermal framework related notifiers for kernels
after 4.14 will not be registered as those use downstream interfaces
that are not available.)

We maintain devfreq min/max limits in the scale profile and update those
in the notifier calls. We use these limits to clamp the frequency in the
clock arbiter.
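
A minimal sketch of that flow, assuming hypothetical names for the profile fields and callbacks: the notifiers only record the limits, and the arbiter clamps its target against them.

```c
/* Hypothetical subset of the scale profile holding the devfreq limits. */
struct scale_profile {
    unsigned long min_freq_hz;
    unsigned long max_freq_hz;
};

/* Notifier-style updates: record the new limit in the profile. */
static void on_min_freq_notify(struct scale_profile *p, unsigned long hz)
{
    p->min_freq_hz = hz;
}

static void on_max_freq_notify(struct scale_profile *p, unsigned long hz)
{
    p->max_freq_hz = hz;
}

/* Clamp the arbiter's target frequency to the recorded limits. */
static unsigned long arb_clamp_target(const struct scale_profile *p,
                                      unsigned long target_hz)
{
    if (target_hz < p->min_freq_hz)
        return p->min_freq_hz;
    if (target_hz > p->max_freq_hz)
        return p->max_freq_hz;
    return target_hz;
}
```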

Bug 3852824

Change-Id: I734a9fb080fee1a91e9b5da071b662dbd9a18682
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2822686
Tested-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-09 01:21:15 -08:00
Mikko Perttunen
a432d2adf3 gpu: nvgpu: linux/host1x: Execute fence callback in non-atomic context
Due to changes in the host1x driver, dma_fence callbacks will be
executed in interrupt context instead of workqueue context as
previously. To allow for that, this patch effectively moves the
workqueue step into nvgpu so that the in-nvgpu fence callback gets
executed in workqueue context.

Bug 3730564

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I7bfa294aa3b4bea9888921b79175a8fc218d8e3f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785968
(cherry picked from commit 5c8e511e48)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2823241
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-08 12:52:09 -08:00
Debarshi Dutta
6965343d13 gpu: nvgpu: don't skip setting same clk in arbiter
In the current setting, the clock arbiter skips setting
the clock if it is already set. The value
set by the arbiter is stored in
"struct nvgpu_clk_arb->actual" whenever the clock is
updated via the arbiter. However, DVFS might also
update the clock and the updates are not synchronized
with the arbiter. Hence, ensure that any clock
requests are always updated i.e. the requested rate is
set even if the previous rate remains the same.
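
The effect of dropping the shortcut can be modelled as below. The structs and functions are hypothetical; only the idea of a cached "actual" value that can go stale comes from the text above.

```c
/* Hypothetical arbiter state: the last rate the arbiter itself set. */
struct clk_arb { unsigned long actual_hz; };

/* Hypothetical clock: current rate and number of times it was programmed. */
struct clk_hw { unsigned long rate_hz; unsigned int writes; };

static void hw_set_rate(struct clk_hw *hw, unsigned long hz)
{
    hw->rate_hz = hz;
    hw->writes++;
}

/* After the change: always program the requested rate. */
static void arb_request(struct clk_arb *arb, struct clk_hw *hw, unsigned long hz)
{
    /*
     * There is no "hz == arb->actual_hz" early return here: another
     * agent may have changed the clock without updating actual_hz.
     */
    hw_set_rate(hw, hz);
    arb->actual_hz = hz;
}
```

With an early return in place, a second request for the same rate would leave the clock at whatever value the other agent programmed in between.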

In the devfreq scale() part, scale emc when clk_arb
is active and skip setting of clocks.

Note that this is cherry-pick from dev-main. Previously
merged cherry-pick is not complete.

Bug 3666615
Bug 3852824

Change-Id: I480a816434dcd59d18a287954a536fd7061c707c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2822685
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-07 03:37:18 -08:00
Divya
27f3fc61a3 gpu: nvgpu: wake up gr wait wq in rmmod path
- The pmu_pg_task thread remains alive in the background
  during railgate and rail-ungate.
- During rail-ungate, the PG task thread starts again and
  executes PG-related tasks.
- It enters pmu_pg_init_powergating() and waits there for GR
  initialization.
- In parallel, the main GPU thread works on rmmod (from
  gpu_module_reload test).
- By this time, the main gpu thread has started rmmod and
  gr->initialized can be set to false, thus causing an uninterruptible
  wait for pmu_pg_task thread.
- To solve this, wake gr wait wq in rmmod path when
  NVGPU_DRIVER_IS_DYING and NVGPU_KERNEL_IS_DYING flags are set.

Bug 3806514
Bug 3756912

Change-Id: Id78d92f30b75aba1aee22398cc86a3acebd50ef6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2798003
(cherry picked from commit d9345065bcb6d9ff497c127fa4cd52077f4ecfa4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2819084
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-11-30 21:07:01 -08:00
Divya
26423cad95 gpu: nvgpu: add nvgpu_start_gpu_idle in nvgpu_remove path
- With ELPG + RG enabled, gpu_module_reload test fails.
- This happens because the test tries to unload nvgpu.ko
  module and then reload it. This all happens with RG enabled.
- During rmmod of nvgpu.ko module the code path taken is:
  nvgpu_remove() ->  nvgpu_quiesce() -> gk20a_pm_prepare_poweroff
  -> nvgpu_prepare_poweroff -> pmu_destroy
- In this code path, NVGPU_DRIVER_IS_DYING flag is not set.
- Thus, in pmu_pg_task thread (which keeps on running in parallel),
  commands are sent to the PMU and the driver keeps waiting for the
  ACK in nvgpu_pmu_wait_fw_ack_status().
- Add nvgpu_start_gpu_idle() in nvgpu_remove() path, before calling
  nvgpu_quiesce().
- This will set NVGPU_DRIVER_IS_DYING flag to true.
- nvgpu_can_busy() will return 0 when the driver is shutting down or
  getting removed.
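
A small userspace model of the flag handling described in the list above. The struct and function names are hypothetical simplifications; the real flags live in the driver's gk20a state.

```c
#include <stdbool.h>

/* Hypothetical subset of driver state. */
struct gpu_state {
    bool driver_is_dying;
    bool kernel_is_dying;
};

/* Models the start-idle step: mark the driver as going away. */
static void start_gpu_idle(struct gpu_state *g)
{
    g->driver_is_dying = true;
}

/* Returns 0 once the driver is shutting down or being removed, else 1. */
static int can_busy(const struct gpu_state *g)
{
    return (g->driver_is_dying || g->kernel_is_dying) ? 0 : 1;
}
```

In this model, a background task that checks can_busy() before sending a command would stop issuing new work as soon as the remove path has called start_gpu_idle().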

Bug 3676200
Bug 3756912

Change-Id: Ic24f58c210e4b477e5d560b053b70c16308e16f1
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2762310
(cherry picked from commit 8f1792565e71b822a6e9cc50af4b43c1b48518e0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2819082
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-30 21:06:50 -08:00
Sagar Kamble
406e5392b7 gpu: nvgpu: set MIT license for nvsched sources
Change NV license for nvsched sources and Makefile.doxygen to MIT
license as those can be distributed with other linux sources but
they are also used in qnx.

Bug 3871403

Change-Id: Iefc957b4afdf4c3c2ff19df144caac9790490114
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2814847
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-24 02:22:46 -08:00
Jon Hunter
235ec32291 gpu: nvgpu: Update include paths for OOT module
When building NVGPU as an OOT module for upstream Linux kernels, the
NVGPU driver source is now copied into a common location with all the
other OOT modules. Therefore, we can now use the 'srctree.nvidia' path
for finding the necessary header files for Host1x and NVMAP. Update the
include search paths to use 'srctree.nvidia' when building NVGPU as an
OOT module.

Bug 3817518

Change-Id: I63066e4331c66a0f47ada83fde3e63402faaf38a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785910
(cherry picked from commit 6bef424e1e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2809488
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-15 11:58:15 -08:00
Debarshi Dutta
00f4dbf9aa gpu: nvgpu: add missing error reporting for GV11B Hals
Error reporting was removed from the following functions in GV11B:

1. gp10b_priv_ring_decode_error_code
2. gv11b_gr_intr_report_gpcmmu_ecc_err
3. gv11b_gr_intr_report_icache_uncorrected_err -> Duplicate
4. gv11b_gr_intr_report_l1_tag_corrected_err -> Duplicate
5. gv11b_gr_intr_report_l1_tag_uncorrected_err -> Duplicate
6. gv11b_ltc_intr_handle_dstg_ecc_interrupts
7. gv11b_ltc_intr_handle_ecc_sec_ded_interrupts
8. gv11b_ltc_intr_handle_tstg_ecc_interrupts
9. gv11b_pbdma_handle_intr_1

The ones marked "Duplicate" are the only ones which are used for both
gv11b and ga10b. Others are invoked only for gv11b and not ga10b.

a) For gv11b_gr_intr_report_l1_tag_corrected_err and
gv11b_gr_intr_report_l1_tag_uncorrected_err, the errors are handled by moving
them into gv11b_gr_intr_set_l1_tag_corrected_err and
gv11b_gr_intr_set_l1_tag_uncorrected_err functions respectively. These
functions are invoked only from GV11B.

b) For gv11b_gr_intr_report_icache_uncorrected_err, the errors are
handled by adding them in gv11b_set_icache_ecc_status_uncorrected_errors
which is specific to gv11b.

Bug 200588528

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I581bdfec8f996643d6af63b2b80a135e7d715b89
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2770836
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-10-12 02:06:25 -07:00
Divya
1274f25dda gpu: nvgpu: Update the error code for tpc_pg_mask
- nvpmodel service used to expect a return value of -ENODEV from the
  underlying tpc_pg_mask_store() when the golden image size was
  initialized.
- With the current implementation, the return value is -EINVAL due to
  which write for new tpc_pg_mask was not successful.
- Update the return value to -EBUSY for the case where golden image
  is already initialized.
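
A sketch of the updated return path, assuming a hypothetical state struct and function name. Only the -EBUSY result for an already initialized golden image is taken from the text above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of driver state. */
struct gpu {
    bool golden_image_initialized;
    unsigned long tpc_pg_mask;
};

/* Models the store path: reject updates once the golden image exists. */
static int tpc_pg_mask_store_sketch(struct gpu *g, unsigned long new_mask)
{
    if (g->golden_image_initialized)
        return -EBUSY; /* the mask can no longer be changed */
    g->tpc_pg_mask = new_mask;
    return 0;
}
```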

Bug 3765637

Change-Id: I5a1a38cce035ea245db5d72c9f5db210d3bb95f1
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2778855
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-09-21 09:21:56 -07:00
Divya
4ed38d5b2a gpu: nvgpu: update tpc-pg support
- Add tpc count variable in the platform struct
  to store the number of tpcs present in the chip.
  This count is needed before GPU boots to provide
  support for static TPC-PG feature.
- Remove valid_tpc_pg_mask and valid_gpc_fbp_pg_mask
  variable from gk20a struct as it is already taken care
  in platform struct.

Bug 3765637
JIRA NVGPU-8210

Change-Id: Ic04af4b7c24f5e790c52708c117e45a3bb0d1810
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725960
(cherry picked from commit 001e9a2695)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2775710
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-09-21 09:21:51 -07:00
Jon Hunter
cd85e527ec gpu: nvgpu: Add host1x support for Tegra234
Add support for the upstream host1x driver in NVGPU for Tegra234.

Bug 3724727
Bug 3752030

Change-Id: I529b731ea3feb3c8c435e7433772af82004ea208
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2759207
(cherry picked from commit 34f478fca6)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2768288
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-08-30 05:37:18 -07:00
Jon Hunter
7d9e5f9780 gpu: nvgpu: Fix crash if tegra_bpmp_get() fails
The function tegra_bpmp_get() returns an error pointer on failure and
so if the call to tegra_bpmp_get() fails, because the device-tree
property is missing, then this is not detected and leads to a crash when
trying to dereference the pointer to the bpmp handle. Fix this by
correctly checking the return value from tegra_bpmp_get().
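
The error-pointer convention behind this fix can be modelled in userspace as below. err_ptr(), is_err() and ptr_err() are simplified stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); the getter, the struct and the -ENODEV value are assumptions for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

struct bpmp { int id; };

/* Hypothetical getter: hands back the handle or an error pointer. */
static struct bpmp *bpmp_get_sketch(struct bpmp *available)
{
    return available ? available : err_ptr(-ENODEV);
}

static int init_bpmp(struct bpmp *available, struct bpmp **out)
{
    struct bpmp *b = bpmp_get_sketch(available);

    if (is_err(b)) /* a NULL check alone would not catch this */
        return (int)ptr_err(b);
    *out = b;
    return 0;
}
```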

Bug 3752030

Change-Id: I944063ab7e116fc81769c9dbbfefe6b6dc4bf0f4
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2759251
(cherry picked from commit 59f7a9e318)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2768287
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-08-30 05:37:13 -07:00
Kishan
cc1081e223 gpu: nvgpu: Fix leaf and top interrupt disabling logic
There are 2 issues here:
1. top_en register is being masked for each leaf level
interrupt disable operation. top_en bit should be disabled
as part of top level stall operation only.
2. The wrong mask is being calculated to disable the leaf_en bits
for a unit, which in turn affects the entire subtree.
Subtree_mask_restore for a subtree stores the last state
of interrupts that are enabled. As part of disable operation,
we only need to update subtree_mask_restore and not reupdate
subtree_mask for that subtree. Same logic applies to enable
operation.
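
The unit-level masking described above might look like the following sketch. The struct layout and function names are hypothetical; the point is that only the bits of the unit are touched and that the restore mask tracks the same change.

```c
#include <stdint.h>

/* Hypothetical model of one interrupt subtree. */
struct subtree {
    uint64_t leaf_en;              /* models the leaf enable bits */
    uint64_t subtree_mask_restore; /* last enabled state, used on restore */
};

/* Disable one unit: clear only that unit's bits. */
static void unit_intr_disable(struct subtree *st, uint64_t unit_mask)
{
    st->leaf_en &= ~unit_mask;
    st->subtree_mask_restore &= ~unit_mask;
}

/* Enable one unit: set only that unit's bits. */
static void unit_intr_enable(struct subtree *st, uint64_t unit_mask)
{
    st->leaf_en |= unit_mask;
    st->subtree_mask_restore |= unit_mask;
}
```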

Renamed the apis to better reflect their operation. The
interrupt disabling is done at unit level and not subtree level.

Bug 3712884

Change-Id: Id840c77f612021a303cfe0e8dca69386bc570273
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2752541
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2758138
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-08-10 01:07:15 -07:00
Martin Radev
eb5d644536 gpu: nvgpu: Consume L3 map flag even if L3 not supported
This patch fixes a bug where GPU mappings with the L3 hint
would fail. The failure happens because the L3 map flag does
not get consumed if L3 allocations are disabled. The fix
is to consume the flag.
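
A sketch of the flag handling, assuming that unconsumed flag bits are what made the mapping fail. The flag names, their values and the function are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only. */
#define MAP_FLAG_CACHEABLE (1u << 0)
#define MAP_FLAG_L3_ALLOC  (1u << 1)

/* Translate user map flags; any bit left over counts as unknown. */
static int translate_map_flags(uint32_t flags, bool l3_supported, bool *use_l3)
{
    *use_l3 = false;
    if (flags & MAP_FLAG_L3_ALLOC) {
        *use_l3 = l3_supported;      /* honour the hint only when possible */
        flags &= ~MAP_FLAG_L3_ALLOC; /* consume the flag either way */
    }
    flags &= ~MAP_FLAG_CACHEABLE;
    return flags ? -EINVAL : 0;
}
```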

Bug 3717951
Bug 3486025

Change-Id: Ib10ee58cc060318c810f86013de7311f73c25729
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750419
(cherry picked from commit cb768ff133)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750903
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-28 10:06:31 -07:00
Debarshi Dutta
e4b3499850 gpu: nvgpu: add a soft dependency on podgov module
The present implementation of podgov driver doesn't
export any symbols and as a result, the dependency
between NVGPU driver and podgov is not established
by depmod. Fix that by adding a soft dependency.

MODULE_SOFTDEP("pre: governor_pod_scaling");

This allows loading the podgov governor before
nvgpu driver.

Bug 3674235

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Id1959639399042f488cdaa30372feb65d8f21aaa
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2740446
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-14 07:37:53 -07:00
Laxman Dewangan
ab32853a65 nvgpu: Get rid of NV_BUILD_KERNEL_OPTIONS for identify stable kernel
Some configs are set only for the stable kernel, which is
identified from NV_BUILD_KERNEL_OPTIONS.

The stable kernel builds nvgpu as an out-of-tree module and
passes the environment config CONFIG_TEGRA_OOT_MODULE during
the build.

Hence, NV_BUILD_KERNEL_OPTIONS is not required to
identify the kstable build. Use CONFIG_TEGRA_OOT_MODULE for
setting the configs when building as a module.

Bug 3652905

Change-Id: I6570760e91ca98a4c83d7691fad517b2c772e629
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2720729
(cherry picked from commit 646a48ea5a)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2728409
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-14 07:37:42 -07:00
Sagar Kamble
f246facd01 gpu: nvgpu: fix syncpt increment logic in host1x syncpt_set_minval
On host copy engine PBDMA interrupt, channel is aborted as part of the
recovery and its syncpt value is set to the max threshold.

Syncpoint may then get incremented by PBDMA (incr cmd gets processed)
after this interrupt is handled leading to syncpoint value becoming
greater than the max threshold.

Again while unbinding the channel, syncpoint value is incremented until
it reaches max threshold. Since syncpoint value is already greater than
max threshold, host1x version of nvgpu_nvhost_syncpt_set_minval will
loop for entire u32 range until it reaches max threshold and this
will hang the channel unbind.

nvgpu_nvhost_syncpt_set_minval can ensure the syncpoint value is greater
than or equal to max threshold. Hence update the check for syncpoint
value from not equal to less than.
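
The changed loop condition can be illustrated with a simplified model. The struct and function are hypothetical, and the plain "<" comparison ignores counter wraparound, which a real implementation has to consider.

```c
#include <stdint.h>

/* Hypothetical syncpoint with a 32-bit counter. */
struct syncpt { uint32_t value; };

/* Raise the syncpoint to at least 'min'; returns the increments done. */
static uint32_t syncpt_set_minval_sketch(struct syncpt *sp, uint32_t min)
{
    uint32_t incs = 0;

    /* "<" instead of "!=": a value already past 'min' needs no work. */
    while (sp->value < min) {
        sp->value++;
        incs++;
    }
    return incs;
}
```

With a "!=" condition, a value that is already one above the threshold would have to wrap through almost the entire u32 range before the loop ends.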

Bug 3681100

Change-Id: I96e7a1f53d4037e9ed858a2e90dd5a8d17ed6bb0
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2742604
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-12 16:54:00 -07:00
mkumbar
da632aa173 gpu: nvgpu: ga10b: LSPMU interrupt update
Enable/disable the LSPMU interrupt in MC. The required LSPMU
interrupts are configured as part of LSPMU ucode init, so no
additional PMU IRQ register needs to be set or cleared as
part of the GPU power-on/off sequence.

Bug 3681561

Change-Id: Ifb47bc9cc83e16e46649b0eef5f257acb02f302c
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2740623
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-07 19:59:48 -07:00
Debarshi Dutta
e89553fe62 gpu: nvgpu: add error reporting support for L4T
Add error reporting support for T194's L1SS safety
services for linux.

Used GA10B's LUT for GV11B. The error ids for T194 are
different compared to GA10B. This is handled by creating
a separate table mapping existing error ids to match GV11B.

Ids that are not used by GV11B are set to U32_MAX to tell
the driver not to send them to the l1ss driver.
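
The mapping with a sentinel can be sketched as follows. The table contents and names are invented for illustration; only the use of the maximum u32 value as a "do not report" marker comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_ID_UNUSED UINT32_MAX /* plays the role of U32_MAX */

/* Hypothetical table: index is the existing id, value is the GV11B id. */
static const uint32_t gv11b_err_id_map[] = { 7u, ERR_ID_UNUSED, 3u };

/* Returns true and fills *gv11b_id if the error should be reported. */
static bool map_err_id(uint32_t ga10b_id, uint32_t *gv11b_id)
{
    size_t n = sizeof(gv11b_err_id_map) / sizeof(gv11b_err_id_map[0]);

    if (ga10b_id >= n || gv11b_err_id_map[ga10b_id] == ERR_ID_UNUSED)
        return false; /* unknown or unused id: do not report */
    *gv11b_id = gv11b_err_id_map[ga10b_id];
    return true;
}
```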

Bug 200588528

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I10a267942df77458c3deee0aad1179955490aa74
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2736772
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-06 04:37:50 -07:00
Sagar Kamble
28ddb0996f gpu: nvgpu: acquire platforms clocks on floorsweeping gpc
bpmp will floorsweep GPCs as per parameters to tpc_pg_mask sysfs.
While doing that corresponding GPC clocks are also disabled.
nvgpu should re-initialize the clocks every time the
GPC/TPC pg_masks are passed to bpmp mrq.

Also print error when clk_prepare_enable fails.

Introduce platform->clks_lock to protect access to platform->clks
and platform->num_clks done from unrailgate/railgate and bpmp
mrq set calls from sysfs.

Acquire static_pg_lock in railgate path to synchronize railgate
with sysfs.

Bug 3688506

Change-Id: I3203d78b87289e7a847d78b3117e2d3119be3425
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2738920
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-05 14:37:23 -07:00
Sagar Kamble
96292d688b gpu: nvgpu: update Makefile for NVMAP_NEXT and TEGRA_NVLINK configs
When building nvgpu as an external module, setting the config
CONFIG_NVGPU_NVMAP_NEXT depends on NV_BUILD_KERNEL_OPTIONS.

Remove that dependency as rel-35 does not support kernels prior
to v5.10.

Also, nvlink symbols are not found when building with public sources.
nvlink is not supported on rel-35, hence disable that config.

Bug 3700823
Bug 3684625

Change-Id: I8787ecb9746dd010a025e6d53679d2f23578ad56
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2738479
Tested-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-01 08:17:46 -07:00
Scott Long
868b723b16 gpu: nvgpu: fix remap page size flag handling
When destroying a virtual memory pool the associated page size must
be set in the nvgpu_vm_remap_op structure.

This patch adds a new nvgpu_vm_remap_page_size_flag() routine that
converts the page size derived from the vm/vm_area structs to the
corresponding NVGPU_VM_REMAP_OP_FLAGS_PAGESIZE bit.
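
A page-size-to-flag conversion of this kind could be sketched as follows. The
flag names, bit positions and the set of supported page sizes here are
placeholders chosen for illustration, not the real
NVGPU_VM_REMAP_OP_FLAGS_PAGESIZE definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real values may differ. */
#define EXAMPLE_REMAP_FLAG_PAGESIZE_4K		(1U << 0)
#define EXAMPLE_REMAP_FLAG_PAGESIZE_64K		(1U << 1)
#define EXAMPLE_REMAP_FLAG_PAGESIZE_128K	(1U << 2)

/*
 * Maps a page size in bytes to its flag bit.
 * Returns 0 for page sizes this sketch does not know about.
 */
static inline uint32_t example_remap_page_size_flag(uint64_t page_size)
{
	switch (page_size) {
	case 4096U:
		return EXAMPLE_REMAP_FLAG_PAGESIZE_4K;
	case 65536U:
		return EXAMPLE_REMAP_FLAG_PAGESIZE_64K;
	case 131072U:
		return EXAMPLE_REMAP_FLAG_PAGESIZE_128K;
	default:
		return 0U;
	}
}
```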

Bug 3669908

Change-Id: Idca77cc36d353777b399c872f68a1f5231ddb8dd
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2734822
Tested-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-30 12:13:06 -07:00
Debarshi Dutta
8cdf3a087f gpu: nvgpu: enable DEVFREQ for Sidecar
Enable DEVFREQ for OOT module unconditionally as the podgov governor
module.

linux/pm_qos is only used for downstream supported modifications
which is currently determined by CONFIG_GK20A_PM_QOS.

struct devfreq_dev_status doesn't have a 'busy' field in the upstream
driver, hence use it only when the downstream driver is in use, as
indicated by CONFIG_GK20A_PM_QOS.

governor.h is only needed for Android platforms, which depend on the 4.9
version of the kernel in downstream builds. Hence, add a compile-time
flag to remove it for kernel versions greater than 4.9.

Jira LS-418

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Id242bd28e66ed187208f0d7975ee0bc508730a88
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2705766
(cherry picked from commit e81d0e8ff8)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2734112
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-23 21:38:45 -07:00
Debarshi Dutta
0cecc5c5ab gpu: nvgpu: don't skip setting same clk in arbiter
In the current setting, the clock arbiter skips setting
the clock if it is already set. The value
set by the arbiter is stored in
"struct nvgpu_clk_arb->actual" whenever the clock is
updated via the arbiter. However, DVFS might also
update the clock and the updates are not synchronized
with the arbiter. Hence, ensure that any clock
requests are always updated i.e. the requested rate is
set even if the previous rate remains the same.

Bug 3666615

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I32bf4dbf81b19fdd6fa0bdec3a6c9a9312b78eca
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2732364
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-23 20:47:44 -07:00
Laxman Dewangan
2f3c1adad4 gpu: nvgpu: Add OOT kernel build support
Add OOT kernel support, same as kstable, for building nvgpu
as a module.

Bug 3642168

Change-Id: I7353275a6c5e487773b716e23610b22e2dc5780d
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2710918
(cherry picked from commit 152b4a0379)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725979
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2022-06-14 00:37:57 -07:00
Shashank Singh
09da6eb397 gpu: nvgpu: move gv11b code under config flag
Move gv11b specific code under CONFIG_NVGPU_GV11B_SUPPORT so that gv11b
support can be removed for qnx later as it is no longer POR for qnx on
dev-main.

Jira NVGPU-8189
Bug 3642168

Change-Id: Idc17cfa22199f2b69a1bab0849cd2bd2e0fb6288
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693828
(cherry picked from commit ba22f6263b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725975
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-14 00:37:52 -07:00
Tejal Kudav
b37181569b gpu: nvgpu: Make missing DT prop print conditional
The print below is misleading and looks like an error.
 [INFO]  Missing support-gpu-tools property, ret =-22

'support-gpu-tools' property was added to allow disabling debugger
features on prod boards. The debugger/profiler support will be
enabled by default, even if the property is missing.

Make the INFO print conditional, more informational and less
dramatic.

Bug 3539518

Change-Id: I5fc50df30be23e1fd1ecc06282a0d50f3ca7ac64
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668464
(cherry picked from commit 69bb38f606)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725971
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-14 00:37:47 -07:00
mkumbar
684bc1c8cb gpu: nvgpu: falcon debug unit update
- Don't print error if debug display buffer is empty.

Bug 3623500
Bug 3418561
Bug 3659996

Change-Id: I066999fb0f7d41d491c3b01df2b976fcfa833ebf
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704967
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 162d7ec32d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2722384
2022-06-13 11:53:24 -07:00
Debarshi Dutta
028b6dd811 gpu: nvgpu: update dma_mask based on H/W compatibility
To be able to access the full physical memory range, gpu's dma_mask
needs to be set to the max value of H/W compatible range.

For example, to support the range from 2 GB to 66 GB, GV11B's dma_mask
needs to be at least 37 bits. Set GV11B's dma_mask to 38 bits
and T23X's dma_mask to 39 bits. These values are supported by H/W.
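
The 37-bit figure can be checked with a small userspace sketch.
EXAMPLE_DMA_BIT_MASK mirrors the shape of the kernel's DMA_BIT_MASK macro but
is redefined here so the snippet builds outside the kernel, and
example_addr_bits is a hypothetical helper.

```c
#include <stdint.h>

/* Userspace copy of the usual "n low bits set" mask construction. */
#define EXAMPLE_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Smallest number of address bits that can represent 'addr'. */
static inline unsigned int example_addr_bits(uint64_t addr)
{
	unsigned int bits = 0U;

	while (addr != 0U) {
		bits++;
		addr >>= 1;
	}
	return bits;
}
```

The highest byte address below 66 GB is (66 << 30) - 1, which lies between
2^36 and 2^37, so a 36-bit mask cannot cover it while the 38-bit mask chosen
for GV11B can.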

Bug 3656729

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Icfff3c36a8c9cf074a254fa773c42e18020ae5de
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2723640
(cherry picked from commit 1bf9309f17)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2724566
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Brad Griffis <bgriffis@nvidia.com>
2022-06-07 23:23:04 -07:00
Sagar Kamble
45c6aed68d gpu: nvgpu: fix CERT violations in nvgpu_dbg_gpu_access_gpu_va
Update nvgpu_dbg_gpu_access_gpu_va to:
1. Ensure that integer conversions do not result in lost or
   misinterpreted data.
2. Do not dereference null pointers.
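
Both points can be sketched with a checked narrowing helper. This is a generic
illustration of the pattern, not the code changed in
nvgpu_dbg_gpu_access_gpu_va; example_u64_to_u32 is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Narrows a 64-bit value to 32 bits only when no data would be lost.
 * Returns false, leaving *out untouched, if the value does not fit or
 * if 'out' is NULL.
 */
static inline bool example_u64_to_u32(uint64_t in, uint32_t *out)
{
	if (out == NULL) {
		return false;	/* never dereference a null pointer */
	}
	if (in > UINT32_MAX) {
		return false;	/* conversion would truncate */
	}
	*out = (uint32_t)in;
	return true;
}
```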

CID 436748
CID 473585
CID 254272
CID 490303
Bug 3512546

Change-Id: I551484b671aa48175a8cea119885eac478c2731c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707019
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 23:24:44 -07:00
Sagar Kamble
9d6269ce7f gpu: nvgpu: assert gr dev is non-NULL
nvgpu_device_get can return NULL if supplied an invalid ID or instance
ID. We expect the GR device struct to be non-NULL there, hence just
assert that it is indeed non-NULL in gr_reset_engine and
ga10b_grmgr_init_gr_manager.

CID 224133
CID 250232
Bug 3512546

Change-Id: Id09a1c436a8e49b921111b940d3d013bd66bff7a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707018
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-05-07 23:24:39 -07:00
Sagar Kamble
c1202d7283 gpu: nvgpu: assert that priv is non-NULL in gk20a_alloc_comptags
priv data is available when gk20a_alloc_comptags is called, hence add
an assert for it.

CID 274852
Bug 3512546

Change-Id: I9d907153c359900071f0f89b84d2ee15141dd874
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707492
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:18:56 -07:00
Sagar Kamble
75c9a2eb94 gpu: nvgpu: fix nvgpu_dma_alloc_flags_sys cleanup
aligned_size was decremented from g->dma_memory_used in case
of failure post dma alloc. However, aligned_size is not
initialized at that point. Use size instead.

CID 446040
Bug 3512546

Change-Id: Id1e117703a3c24dcb9c0b6f3b808c7e30bf90f0b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707486
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:18:45 -07:00