Commit Graph

8953 Commits

Author SHA1 Message Date
Jon Hunter
e914561b6e gpu: nvgpu: Fix crash on reboot
A kernel panic has been observed sometimes on reboot. The crash occurs
in the nvgpu_kernel_shutdown_notification() function when calling
nvgpu_cond_signal(). Fix this by checking that the 'gr' pointer is valid
before calling nvgpu_cond_signal().

Bug 3943885

Change-Id: I81e5e1b1128f22832daf01b880fac2a5e38f2a7a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2846761
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-01-20 08:51:28 -08:00
Jon Hunter
82b95758d1 nvgpu: Fix devnode function pointers for Linux v6.2
Upstream Linux kernel commit ff62b8e6588f ("driver core: make struct
class.devnode() take a const *") updated the 'devnode' function pointer
under the class structure to take a const device struct. This breaks
building the NVGPU driver with Linux v6.2. Make the necessary changes to
the NVGPU driver to fix the build breakage.

Bug 3936429
Bug 3844023

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Change-Id: Ia39d7fded8df0e4eb30ebd58b2261e48e1963549
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2841032
(cherry picked from commit 315813beac)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2842397
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2023-01-15 11:21:33 -08:00
Sagar Kamble
eacaf8cec2 gpu: nvgpu: register pm_qos min & max frequency notifiers
nvpmodel updates the devfreq frequency limits as per power requirements
for specific chip. Clock arbiter ignored these limits and set clock
to maximum supported frequency which may lead to leaking power and
over heating.

Add support to get the devfreq limits by registering PM_QOS notifiers.
Note that with this patch we enable CONFIG_GK20A_PM_QOS when PM_DEVFREQ
is enabled. So it will be enabled for all supported kernels (4.9, 4.14
kernels continue to support this. For 5.10+ kernels notifiers added in
this patch will be used. Thermal framework related notifiers for kernels
after 4.14 will not be registered as those use downstream interfaces
that are not available.)

We maintain devfreq min/max limits in the scale profile and update those
in the notifier calls. We use these limits to clamp the frequency in the
clock arbiter.

Bug 3852824

Change-Id: I734a9fb080fee1a91e9b5da071b662dbd9a18682
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2822686
Tested-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-09 01:21:15 -08:00
Mikko Perttunen
a432d2adf3 gpu: nvgpu: linux/host1x: Execute fence callback in non-atomic context
Due to changes in the host1x driver, dma_fence callbacks will be
executed in interrupt context instead of workqueue context as
previously. To allow for that, this patch effectively moves the
workqueue step into nvgpu so that the in-nvgpu fence callback gets
executed in workqueue context.

Bug 3730564

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Change-Id: I7bfa294aa3b4bea9888921b79175a8fc218d8e3f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785968
(cherry picked from commit 5c8e511e48)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2823241
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-08 12:52:09 -08:00
Debarshi Dutta
6965343d13 gpu: nvgpu: don't skip setting same clk in arbiter
In the current setting, clock arbiter skips setting
the clock if its already set previously. The value
set by the arbiter is stored in
"struct nvgpu_clk_arb->actual" whenever the clock is
updated via the arbiter. However, DVFS might also
update the clock and the updates are not synchronized
with the arbiter. Hence, ensure that any clock
requests are always updated i.e. the requested rate is
set even if the previous rate remains the same.

In the devfreq scale() part, scale emc when clk_arb
is active and skip setting of clocks.

Note that this is cherry-pick from dev-main. Previously
merged cherry-pick is not complete.

Bug 3666615
Bug 3852824

Change-Id: I480a816434dcd59d18a287954a536fd7061c707c
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2822685
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-12-07 03:37:18 -08:00
Divya
27f3fc61a3 gpu: nvgpu: wake up gr wait wq in rmmod path
- The pmu_pg_task thread remains alive in the background
  during railgate and rail-ungate.
- During rail-ungate, the PG task thread starts again and
  executes PG-related tasks.
- It comes in pmu_pg_init_powergating() and waits for GR
  initialization. Here it waits for gr to be initialized.
- In parallel, the main GPU thread works on rmmod (from
  gpu_module_reload test).
- By this time, the main gpu thread has started rmmod and
  gr->initialized can be set to false, thus causing an uninterruptible
  wait for pmu_pg_task thread.
- To solve this, wake gr wait wq in rmmod path when
  NVGPU_DRIVER_IS_DYING and NVGPU_KERNEL_IS_DYING flgas are set.

Bug 3806514
Bug 3756912

Change-Id: Id78d92f30b75aba1aee22398cc86a3acebd50ef6
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2798003
(cherry picked from commit d9345065bcb6d9ff497c127fa4cd52077f4ecfa4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2819084
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-11-30 21:07:01 -08:00
Divya
26423cad95 gpu: nvgpu: add nvgpu_start_gpu_idle in nvgpu_remove path
- With ELPG + RG enabled, gpu_module_reload test fails.
- This happens because the test tries to unload nvgpu.ko
  module and then reload it. This all happens with RG enabled.
- During rmmod of nvgpu.ko module the code path taken is:
  nvgpu_remove() ->  nvgpu_quiesce() -> gk20a_pm_prepare_poweroff
  -> nvgpu_prepare_poweroff -> pmu_destroy
- In this code path, NVGPU_DRIVER_IS_DYING flag is not set.
- Thus, in pmu_pg_task thread (which keeps on running in parallel),
  commands are sent to the PMU and the driver keeps waiting for the
  ACK in nvgpu_pmu_wait_fw_ack_status().
- Add nvgpu_start_gpu_idle() in nvgpu_remove() path, before calling
  nvgpu_quiesce().
- This will set NVGPU_DRIVER_IS_DYING flag to true.
- nvgpu_can_busy() will return 0 when the driver is shutting down or
  getting removed.

Bug 3676200
Bug 3756912

Change-Id: Ic24f58c210e4b477e5d560b053b70c16308e16f1
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2762310
(cherry picked from commit 8f1792565e71b822a6e9cc50af4b43c1b48518e0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2819082
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-30 21:06:50 -08:00
Sagar Kamble
406e5392b7 gpu: nvgpu: set MIT license for nvsched sources
Change NV license for nvsched sources and Makefile.doxygen to MIT
license as those can be distributed with other linux sources but
they are also used in qnx.

Bug 3871403

Change-Id: Iefc957b4afdf4c3c2ff19df144caac9790490114
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2814847
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-24 02:22:46 -08:00
Jon Hunter
235ec32291 gpu: nvgpu: Update include paths for OOT module
When building NVGPU as an OOT module for upstream Linux kernels, the
NVGPU driver source is now copied into a common location with all the
other OOT modules. Therefore, we can now use the 'srctree.nvidia' path
for finding the necessary header files for Host1x and NVMAP. Update the
include search paths to use 'srctree.nvidia' when building NVGPU as an
OOT module.

Bug 3817518

Change-Id: I63066e4331c66a0f47ada83fde3e63402faaf38a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2785910
(cherry picked from commit 6bef424e1e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2809488
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-15 11:58:15 -08:00
Debarshi Dutta
00f4dbf9aa gpu: nvgpu: add missing error reporting for GV11B Hals
Error reportings were removed from the following functions in GV11B

1. gp10b_priv_ring_decode_error_code
2. gv11b_gr_intr_report_gpcmmu_ecc_err
3. gv11b_gr_intr_report_icache_uncorrected_err -> Duplicate
4. gv11b_gr_intr_report_l1_tag_corrected_err -> Duplicate
5. gv11b_gr_intr_report_l1_tag_uncorrected_err -> Duplicate
6. gv11b_ltc_intr_handle_dstg_ecc_interrupts
7. gv11b_ltc_intr_handle_ecc_sec_ded_interrupts
8. gv11b_ltc_intr_handle_tstg_ecc_interrupts
9. gv11b_pbdma_handle_intr_1

The ones marked "Duplicate" are the only ones which are used for both
gv11b and ga10b. Others are invoked only for gv11b and not ga10b.

a) For gv11b_gr_intr_report_l1_tag_corrected_err and
gv11b_gr_intr_report_l1_tag_corrected_err, the errors are handled by moving
them into gv11b_gr_intr_set_l1_tag_corrected_err and
gv11b_gr_intr_set_l1_tag_uncorrected_err functions respectively. These
functions are invoked only from GV11B.

b) For gv11b_gr_intr_report_icache_uncorrected_err, the errors are
handled by adding them in gv11b_set_icache_ecc_status_uncorrected_errors
which is specific to gv11b.

Bug 200588528

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I581bdfec8f996643d6af63b2b80a135e7d715b89
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2770836
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-10-12 02:06:25 -07:00
Divya
1274f25dda gpu: nvgpu: Update the error code for tpc_pg_mask
- nvpmodel service used to expect a return value of -ENODEV from the
  underlying tpc_pg_mask_store() when the golden image size was
  initialized.
- With the current implementation, the return value is -EINVAL due to
  which write for new tpc_pg_mask was not successful.
- Update the return value to -EBUSY for the case where golden image
  is already initialized.

Bug 3765637

Change-Id: I5a1a38cce035ea245db5d72c9f5db210d3bb95f1
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2778855
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-09-21 09:21:56 -07:00
Divya
4ed38d5b2a gpu: nvgpu: update tpc-pg support
- Add tpc count variable in the platform struct
  to store the number of tpcs present in the  chip.
  This count is needed before GPU boots to provide
  support for static TPC-PG feature.
- Remove valid_tpc_pg_mask and valid_gpc_fbp_pg_mask
  variable from gk20a struct as it is already taken care
  in platform struct.

Bug 3765637
JIRA NVGPU-8210

Change-Id: Ic04af4b7c24f5e790c52708c117e45a3bb0d1810
Signed-off-by: Divya <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725960
(cherry picked from commit 001e9a2695)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2775710
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-09-21 09:21:51 -07:00
Jon Hunter
cd85e527ec gpu: nvgpu: Add host1x support for Tegra234
Add support for the upstream host1x driver in NVGPU for Tegra234.

Bug 3724727
Bug 3752030

Change-Id: I529b731ea3feb3c8c435e7433772af82004ea208
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2759207
(cherry picked from commit 34f478fca6)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2768288
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-08-30 05:37:18 -07:00
Jon Hunter
7d9e5f9780 gpu: nvgpu: Fix crash if tegra_bpmp_get() fails
The function tegra_bpmp_get() returns an error pointer on failure and
so if the call to tegra_bpmp_get() fails, because the device-tree
property is missing, then this is not detected and leads to a crash when
trying to dereference the pointer to the bpmp handle. Fix this by
correctly checking the return value from tegra_bpmp_get().

Bug 3752030

Change-Id: I944063ab7e116fc81769c9dbbfefe6b6dc4bf0f4
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2759251
(cherry picked from commit 59f7a9e318)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2768287
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-08-30 05:37:13 -07:00
Kishan
cc1081e223 gpu: nvgpu: Fix leaf and top interrupt disabling logic
There are 2 issues here:
1. top_en register is being masked for each leaf level
interrupt disable operation. top_en bit should be disabled
as part of top level stall operation only.
2. Wrong mask is being calculated to disable the leaf_en bits
for a unit which inturn affects the entire subtree.
Subtree_mask_restore for a subtree stores the last state
of interrupts that are enabled. As part of disable operation,
we only need to update subtree_mask_restore and not reupdate
subtree_mask for that subtree. Same logic applies to enable
operation.

Renamed the apis to better reflect their operation. The
interrupt disabling is done at unit level and not subtree level.

Bug 3712884

Change-Id: Id840c77f612021a303cfe0e8dca69386bc570273
Signed-off-by: Kishan <kpalankar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2752541
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2758138
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-08-10 01:07:15 -07:00
Martin Radev
eb5d644536 gpu: nvgpu: Consume L3 map flag even if L3 not supported
This patch fixes a bug where GPU mappings with the L3 hint
would fail. The failure happens because the L3 map flag does
not get consumed if L3 allocations are disabled. The fix
is to consume the flag.

Bug 3717951
Bug 3486025

Change-Id: Ib10ee58cc060318c810f86013de7311f73c25729
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750419
(cherry picked from commit cb768ff133)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2750903
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Winnie Hsu <whsu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-28 10:06:31 -07:00
Debarshi Dutta
e4b3499850 gpu: nvgpu: add a soft dependency on podgov module
The present implementation of podgov driver doesn't
export any symbols and as a result, the dependency
between NVGPU driver and podgov is not established
by depmod. Fix that by adding a soft dependency.

MODULE_SOFTDEP("pre: governor_pod_scaling");

This allows loading the podgov governor before
nvgpu driver.

Bug 3674235

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Id1959639399042f488cdaa30372feb65d8f21aaa
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2740446
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-14 07:37:53 -07:00
Laxman Dewangan
ab32853a65 nvgpu: Get rid of NV_BUILD_KERNEL_OPTIONS for identify stable kernel
There are some configs which are set for the stable kernel and
it is identified from the NV_BUILD_KERNEL_OPTIONS.

The stable kernel build nvgpu as out-of-tree module and
pass the environment config CONFIG_TEGRA_OOT_MODULE during
build.

Hence, it is not required to use the NV_BUILD_KERNEL_OPTIONS to
identify the kstable build. It uses CONFIG_TEGRA_OOT_MODULE for
setting the configs for build as module.

Bug 3652905

Change-Id: I6570760e91ca98a4c83d7691fad517b2c772e629
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2720729
(cherry picked from commit 646a48ea5a)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2728409
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-14 07:37:42 -07:00
Sagar Kamble
f246facd01 gpu: nvgpu: fix syncpt increment logic in host1x syncpt_set_minval
On host copy engine PBDMA interrupt, channel is aborted as part of the
recovery and its syncpt value is set to the max threshold.

Syncpoint may then get incremented by PBDMA (incr cmd gets processed)
after this interrupt is handled leading to syncpoint value becoming
greater than the max threshold.

Again while unbinding the channel, syncpoint value is incremented until
it reaches max threshold. Since syncpoint value is already greater than
max threshold, host1x version of nvgpu_nvhost_syncpt_set_minval will
loop for entire u32 range until it reaches max threshold and this
will hang the channel unbind.

nvgpu_nvhost_syncpt_set_minval can ensure the syncpoint value is greater
than or equal to max threshold. Hence update the check for syncpoint
value from not equal to less than.

Bug 3681100

Change-Id: I96e7a1f53d4037e9ed858a2e90dd5a8d17ed6bb0
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2742604
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-12 16:54:00 -07:00
mkumbar
da632aa173 gpu: nvgpu: ga10b: LSPMU interrupt update
Enable/disable LSPMU interrupt in MC, as required LSPMU
interrupts are configured as part of LSPMU ucode init and
don't need any additional PMU IRQ register to set/clear as
part of GPU power-on/off sequence.

Bug 3681561

Change-Id: Ifb47bc9cc83e16e46649b0eef5f257acb02f302c
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2740623
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-07 19:59:48 -07:00
Debarshi Dutta
e89553fe62 gpu: nvgpu: add error reporting support for L4T
Add error reporting support for T194's L1SS safety
services for linux.

Used GA10B's LUT for GV11B. The error ids for T194 are
different compared to GA10B. This is handled by creating
a separate table mapping existing error ids to match GV11B.

Ids that are not used by GV11B are set to U32_MAX to indicate
the driver to not send them to the l1ss driver.

Bug 200588528

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I10a267942df77458c3deee0aad1179955490aa74
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2736772
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-07-06 04:37:50 -07:00
Sagar Kamble
28ddb0996f gpu: nvgpu: acquire platforms clocks on floorsweeping gpc
bpmp will floorsweep GPCs as per parameters to tpc_pg_mask sysfs.
While doing that corresponding GPC clocks are also disabled.
nvgpu should re-initialize the clocks every time the
GPC/TPC pg_masks are passed to bpmp mrq.

Also print error when clk_prepare_enable fails.

Introduce platform->clks_lock to protect access to platform->clks
and platform->num_clks done from unrailgate/railgate and bpmp
mrq set calls from sysfs.

Acquire static_pg_lock in railgate path to synchronize railgate
with sysfs.

Bug 3688506

Change-Id: I3203d78b87289e7a847d78b3117e2d3119be3425
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2738920
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-05 14:37:23 -07:00
Sagar Kamble
96292d688b gpu: nvgpu: update Makefile for NVMAP_NEXT and TEGRA_NVLINK configs
To build nvgpu as external module NV_BUILD_KERNEL_OPTIONS dependency is
present to set the config CONFIG_NVGPU_NVMAP_NEXT.

Remove that dependency as rel-35 does not support kernels prior
to v5.10.

And nvlink symbols are not found while building with public sources.
nvlink is not supported on rel-35, hence disable that config.

Bug 3700823
Bug 3684625

Change-Id: I8787ecb9746dd010a025e6d53679d2f23578ad56
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2738479
Tested-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-07-01 08:17:46 -07:00
Scott Long
868b723b16 gpu: nvgpu: fix remap page size flag handling
When destroying a virtual memory pool the associated page size must
be set in the nvgpu_vm_remap_op structure.

This patch adds a new nvgpu_vm_remap_page_size_flag() routine that
converts the page size derived from the vm/vm_area structs to the
corresponding NVGPU_VM_REMAP_OP_FLAGS_PAGESIZE bit.

Bug 3669908

Change-Id: Idca77cc36d353777b399c872f68a1f5231ddb8dd
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2734822
Tested-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-30 12:13:06 -07:00
Debarshi Dutta
8cdf3a087f gpu: nvgpu: enable DEVFREQ for Sidecar
Enable DEVFREQ for OOT module unconditionally as the podgov governor
module.

linux/pm_qos is only used for downstream supported modifications
which is currently determined by CONFIG_GK20A_PM_QOS.

struct devfreq_dev_status doesn't have any field 'busy' in the upstream
driver hence enable it only for when downstream driver is in use
activated by CONFIG_GK20A_PM_QOS.

governor.h is only needed for android platforms which depend on 4.9
version of the kernel in downstream builds. Hence, added an compile
time flag to remove it for kernels versions greater than 4.9.

Jira LS-418

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Id242bd28e66ed187208f0d7975ee0bc508730a88
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2705766
(cherry picked from commit e81d0e8ff8)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2734112
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-23 21:38:45 -07:00
Debarshi Dutta
0cecc5c5ab gpu: nvgpu: don't skip setting same clk in arbiter
In the current setting, clock arbiter skips setting
the clock if its already set previously. The value
set by the arbiter is stored in
"struct nvgpu_clk_arb->actual" whenever the clock is
updated via the arbiter. However, DVFS might also
update the clock and the updates are not synchronized
with the arbiter. Hence, ensure that any clock
requests are always updated i.e. the requested rate is
set even if the previous rate remains the same.

Bug 3666615

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: I32bf4dbf81b19fdd6fa0bdec3a6c9a9312b78eca
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2732364
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-23 20:47:44 -07:00
Laxman Dewangan
2f3c1adad4 gpu: nvgpu: Add OOT kernel build support
Add OOT kernel support same as kstable for building nvgpu
as module.

Bug 3642168

Change-Id: I7353275a6c5e487773b716e23610b22e2dc5780d
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2710918
(cherry picked from commit 152b4a0379)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725979
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
2022-06-14 00:37:57 -07:00
Shashank Singh
09da6eb397 gpu: nvgpu: move gv11b code under config flag
Move gv11b specific code under CONFIG_NVGPU_GV11B_SUPPORT so that gv11b
support can be removed for qnx later as it is no longer POR for qnx on
dev-main.

Jira NVGPU-8189
Bug 3642168

Change-Id: Idc17cfa22199f2b69a1bab0849cd2bd2e0fb6288
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693828
(cherry picked from commit ba22f6263b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725975
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-14 00:37:52 -07:00
Tejal Kudav
b37181569b gpu: nvgpu: Make missing DT prop print conditional
Below print is misleading and seems like an error.
 [INFO]  Missing support-gpu-tools property, ret =-22

'support-gpu-tools' property was added to allow disabling debugger
features on prod boards. The debugger/profiler support will be
enabled by default, even if the property is missing.

Make the INFO print conditional, more informational and less
dramatic.

Bug 3539518

Change-Id: I5fc50df30be23e1fd1ecc06282a0d50f3ca7ac64
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668464
(cherry picked from commit 69bb38f606)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2725971
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Jonathan Hunter <jonathanh@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Amulya Yarlagadda <ayarlagadda@nvidia.com>
Tested-by: Jonathan Hunter <jonathanh@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-06-14 00:37:47 -07:00
mkumbar
684bc1c8cb gpu: nvgpu: falcon debug unit update
- Don't print error if debug display buffer is empty.

Bug 3623500
Bug 3418561
Bug 3659996

Change-Id: I066999fb0f7d41d491c3b01df2b976fcfa833ebf
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704967
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
(cherry picked from commit 162d7ec32d)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2722384
2022-06-13 11:53:24 -07:00
Debarshi Dutta
028b6dd811 gpu: nvgpu: update dma_mask based on H/W compatibility
To be able to access the full physical memory range, gpu's dma_mask
needs to be set to the max value of H/W compatible range.

For example. In order to support from 2GB to 66 GB, GV11B's dma_mask
needs to be atleast 37 bits. Set GV11B's dma_mask to 38 bit
and T23X's dma_mask to 39 bit. These values are supported by H/W.

Bug 3656729

Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Change-Id: Icfff3c36a8c9cf074a254fa773c42e18020ae5de
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2723640
(cherry picked from commit 1bf9309f17)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2724566
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Brad Griffis <bgriffis@nvidia.com>
2022-06-07 23:23:04 -07:00
Sagar Kamble
45c6aed68d gpu: nvgpu: fix CERT violations in nvgpu_dbg_gpu_access_gpu_va
Update nvgpu_dbg_gpu_access_gpu_va to:
1. Ensure that integer conversions do not result in lost or
   misinterpreted data.
2. Do not dereference null pointers.

CID 436748
CID 473585
CID 254272
CID 490303
Bug 3512546

Change-Id: I551484b671aa48175a8cea119885eac478c2731c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707019
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 23:24:44 -07:00
Sagar Kamble
9d6269ce7f gpu: nvgpu: assert gr dev is non-NULL
nvgpu_device_get can return NULL if supplied invalid ID or instance
ID. We expect GR device struct to be non-NULL there hence just
assert that it is indeed non-NULL in gr_reset_engine and
ga10b_grmgr_init_gr_manager.

CID 224133
CID 250232
Bug 3512546

Change-Id: Id09a1c436a8e49b921111b940d3d013bd66bff7a
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707018
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-05-07 23:24:39 -07:00
Sagar Kamble
c1202d7283 gpu: nvgpu: assert that priv is non-NULL in gk20a_alloc_comptags
priv data is available when gk20a_alloc_comptags is called hence add
assert for it.

CID 274852
Bug 3512546

Change-Id: I9d907153c359900071f0f89b84d2ee15141dd874
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707492
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:18:56 -07:00
Sagar Kamble
75c9a2eb94 gpu: nvgpu: fix nvgpu_dma_alloc_flags_sys cleanup
aligned_size was decremented from g->dma_memory_used in case
of failure post dma alloc. However, aligned_size is not
initialized at that point. Use size instead.

CID 446040
Bug 3512546

Change-Id: Id1e117703a3c24dcb9c0b6f3b808c7e30bf90f0b
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707486
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:18:45 -07:00
Sagar Kamble
c32c4025a4 gpu: nvgpu: fix the ce app ctx cleanup
tsg and ch members in ce_ctx may remain uninitialized when the cleanup
function nvgpu_ce_delete_gpu_context_locked is called. Guard the
references to those.

CID 438091
Bug 3512546

Change-Id: I0ce96f9bad1e4f7fd331171b3f134c48c893839f
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707470
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:18:39 -07:00
mkumbar
2506dd2b86 gpu: nvgpu: set ACR FW load flag as per platform
-Add ACR FW load flag which will be set based on
 platform and load the requested FW accordingly.

Bug 3572869

Change-Id: I6643f183fed104fef059dd691036a2c509073a50
Signed-off-by: mkumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689022
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Andy Chiang <achiang@nvidia.com>
2022-05-07 15:13:03 -07:00
Richard Zhao
1ce899ce46 gpu: nvgpu: fix compile error of new compile flags
Preparing to push hvrtos gpu server changes which requires bellow CFLAGS:
        -Werror -Wall -Wextra \
        -Wmissing-braces -Wpointer-arith -Wundef \
        -Wconversion -Wsign-conversion \
        -Wformat-security \
        -Wmissing-declarations -Wredundant-decls -Wimplicit-fallthrough

Jira GVSCI-11640

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I25167f17f231ed741f19af87ca0aa72991563a0f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2653746
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-07 15:11:49 -07:00
Johnny Liu
69ec2dcff7 gpu: nvgpu: ga10b+: emc scaling using ICC
For EMC frequency scaling, prior to ga10b, nvgpu driver
was using BWMGR. On ga10b+, BWMGR support is deprecated
and moved to Linux ICC framework.

Jira NVGPU-7312
Bug 3514055
Bug 200766984

Change-Id: Ib1f87afe021414dfc563e007823f93098937fe59
Signed-off-by: Johnny Liu <johnliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2706374
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajkumar Kasirajan <rkasirajan@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-06 05:21:55 -07:00
Rajesh Devaraj
fac998940c gpu: nvgpu: enable polling support for error reporting in AV+L
As per Safety_Services, a client must perform polling to ensure that the
previously reported errors are cleared at FSI, in case of back-to-back
error reporting. However, to minimize the polling overhead, NvGPU driver
performs polling only when the error to be reported is corrected error
to ensure that it is not overwriting the previously reported
uncorrected/corrected error. In case of uncorrected errors, it will be
reported without doing polling. This situation leads to a failure in
error reporting, when uncorrected errors are reported back-to-back. This
is acceptable for safety builds where SW quiesce will be triggered
immediately after the reporting of first uncorrected error. In case of
other build configurations, MCU/SEH takes the decision on encountering
uncorrected errors. To handle such build configurations, polling is
enabled for all types of errors, in all build configurations.

This patch also removes an unused macro "ERR_TYPE_MASK".

Bug 3622420

Change-Id: I750b0406faec9b229d8d0c74e986807234362cb9
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2707105
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-06 05:21:43 -07:00
Martin Radev
657daaee9e gpu: nvgpu: Mark fds with O_CLOEXEC
There shouldn't be an usecase that an fd, installed by nvgpu,
must be shared on exec with the new process. This doesn't only
lead to excessive number of fds in the exec process, but also
can lead to potential security issues.

This patch marks the fds with O_CLOEXEC, so that they get
closed on exec.

Bug 3583628

Change-Id: I3499b1429ac512b2c172e9e628d0a7a1417d72e3
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2704350
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-03 20:33:58 -07:00
Richard Zhao
c30afdce02 gpu: nvgpu: add periodic timer API
move fecs_trace polling from kthread to timer API.

Jira GVSCI-10883

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I224754b7205f1d0eefdc19a73a98f42e4d3e9d0e
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2700601
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-05-02 23:16:44 -07:00
Antony Clince Alex
61ae0b7642 gpu: nvgpu: fix emulate mode enable
The emulate mode support is determined after chip detect and is flagged
by using NVGPU_SUPPORT_EMULATE_MODE flag. The present logic prevents
user from configuring the emulate mode sysfs knobs if this flag is not
set, however the emulate mode usecase requires the user to configure the
syfs knob prior to power-on, hence defer emulate mode check to a later
stage after chip detect.

Bug 3621460

Change-Id: If522527542fa8d7e95ccbcff43b74adbb9e976e6
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2703953
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mayur Poojary <mpoojary@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: David Li <davli@nvidia.com>
2022-04-29 06:17:59 -07:00
Jinesh Parakh
e3ed309d35 gpu: nvgpu: Fix Unintentional integer overflow bug
Fix following Coverity defect:
cde.c : Unintentional integer overflow

The Coverity issue suggests that (u64) (nvgpu_ltc_get_ltc_count(g) * nvgpu_ltc_get_slices_per_ltc(g) * nvgpu_ltc_get_cacheline_size(g)) can cause overflow because typecasting is done after multiplication.
This patch solves that issue by typecasting it before multiplication.

CID 10112360

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I314ca7a9adc95fcb09f15eb603b56ad03ce34b99
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697027
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-29 06:11:31 -07:00
Jinesh Parakh
131933d528 gpu: nvgpu: Fix Division by zero defect
Fix following Coverity Defect:
profile.c : Division or modulo by zero

CID 10061399

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I03979af4ab105f659cf0fe3eac8d21946dfca950
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2695362
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-29 06:10:48 -07:00
Jinesh Parakh
5b38f48227 gpu: nvgpu: Fix Bad bit shift Coverity issues
Fixed following Coverity Defects:
sec2_tu104.c : Bad bit shift operation
grmgr_ga10b.c : Bad bit shift operation
clk_gm20b.c : Bad bit shift operation
gsp_ga10b.c : Bad bit shift operation

CID 9869505
CID 9869506
CID 10062538
CID 10112176
CID 10127981

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I9cbb2403ccd41d6cbe99f73f27f915969094fa5b
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2689262
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-29 06:08:50 -07:00
Jinesh Parakh
622fe70dab gpu: nvgpu: Fix Bad bit shift Coverity issues
Fixed following Coverity Defects:
ioctl_as.c : Bad bit shift operation
mc_tu104.c : Bad bit shift operation
vm.c : Bad bit shift operation
vm_remap.c : Bad bit shift operation

A new linux header file for ilog2 is created.
The files which used the old ilog2 function
have been changed to use the new nvgpu_ilog2
function.

CID 9847922
CID 9869507
CID 9859508
CID 10112314
CID 10127813
CID 10127899
CID 10128004

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: Ia201eea7cc426c3d6581e1e5ae3b882dbab3b490
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2700994
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-28 04:08:45 -07:00
Jinesh Parakh
167e7b0256 gpu: nvgpu: Fix Unused value Defect
Fix following Coverity Defect:
nvgpu_init.c : Unused value

The ret variable was being reassigned the error code from
nvgpu_cic_mon_deinit(g) without taking into account the previous ret value.
We need to propagate whether there is an error
(the last known error is returned) or not using ret, the temp_ret
variable helps in verifying this.
Similar coding style followed in the entire function.

CID 10127863

Bug 3460991

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: I732ba5269ebbbe68f113e53229df40ae49ccc13c
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2697104
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-27 20:19:29 -07:00
Martin Radev
60481ea5e4 gpu: nvgpu: Free regops allowlist after failure-prone operations
The function nvgpu_profiler_unbind_pm_resources is responsible for
destroying the regops allowlist object, but unfortunately does it
prior to any of the failure-prone operations. Because this function
can be called multiple times, in rare cases it can happen that
object is deallocated twice.

This patch fixes the issue by moving the free operations after the
failure-prone operations.

Bug 3591603

Change-Id: I3415712da561ccf162c9fb7f3ebb942faa9d9420
Signed-off-by: Martin Radev <mradev@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693803
(cherry picked from commit I3415712da561ccf162c9fb7f3ebb942faa9d9420)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2693799
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-04-26 17:48:45 -07:00
mpoojary
769ec3f88b gpu: nvgpu: pmu: Add support to set nvgpu_next pmu init
Select nvgpu_next_pmu_init when config_next flag is set.
This will let pmu load nvgpu_next binaries.

Bug 3579665

Change-Id: Ifc15ba1ff5eacfba22de9676d5fe93beda608153
Signed-off-by: mpoojary <mpoojary@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2702292
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
2022-04-26 04:09:02 -07:00