Commit Graph

874 Commits

Author SHA1 Message Date
Sagar Kamble
577bcd8d9d gpu: nvgpu: set MIT license for unit test Makefiles
Change NV license for unit test Makefiles to MIT license as those
can be distributed like unit test sources.

Change-Id: I2a835ea39eb24a2e4fcb3aaff100690a54cbaf22
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2813958
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
2022-11-23 09:28:10 -08:00
srajum
e212271d56 nvgpu: disable failing unit tests
- These unit tests are failing and we are going to take care of
  unit tests as part of Safety work on dev-main.
  It should not be an issue for rel-35.

Bug 3681100

Change-Id: I6f0f4d1697151b189c4f26e5206d25537e65a7bd
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2735815
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
2022-06-30 00:27:17 -07:00
Jinesh Parakh
622fe70dab gpu: nvgpu: Fix Bad bit shift Coverity issues
Fixed following Coverity Defects:
ioctl_as.c : Bad bit shift operation
mc_tu104.c : Bad bit shift operation
vm.c : Bad bit shift operation
vm_remap.c : Bad bit shift operation

A new linux header file for ilog2 is created.
The files which used the old ilog2 function
have been changed to use the new nvgpu_ilog2
function.

CID 9847922
CID 9869507
CID 9859508
CID 10112314
CID 10127813
CID 10127899
CID 10128004

Signed-off-by: Jinesh Parakh <jparakh@nvidia.com>
Change-Id: Ia201eea7cc426c3d6581e1e5ae3b882dbab3b490
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2700994
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-04-28 04:08:45 -07:00
Tejal Kudav
b80b2bdab8 gpu: nvgpu: Add CE interrupt handling
a. LAUNCH_ERR
    - Userspace error.
    - Triggered due to faulty launch.
    - Handle using recovery to reset CE engine and teardown the
      faulty channel.

b. An INVALID_CONFIG -
    - Triggered when LCE is mapped to floorswept PCE.
    - On iGPU, we use the default PCE 2 LCE  HW mapping.
      The default mapping can be read from NV_CE_PCE2LCE_CONFIG
      INIT value in CE refmanual.
    - NvGPU driver configures the mapping on dGPUs (currently only on
      Turing).
    - So, this interrupt can only be triggered if there is
      kernel or HW error
    - Recovery ( which is killing the context + engine reset) will
      not help resolve this error.
    - Trigger Quiesce as part of handling.

c. A MTHD_BUFFER_FAULT -
    - NvGPU driver allocates fault buffers for all TSGs or contexts,
      maps them in BAR2 VA space and writes the VA into channel
      instance block.
    - Can be triggered only due to kernel bug
    - Recovery will not help, need quiesce

d. FBUF_CRC_FAIL
    - Triggered when the CRC entry read from the method fault buffer
      does not match the computed CRC from the methods contained in
      the buffer.
    - This indicates memory corruption and is a fatal interrupt which
      at least requires the LCE to be reset before operations can
      start again, if not the entire GPU.
    - Better to quiesce on memory corruption
      CE Engine reset (via recovery) will not help.

e. FBUF_MAGIC_CHK_FAIL
    - Triggered when the MAGIC_NUM entry read from the method fault
      buf does not match NV_CE_MTHD_BUFFER_GLOBAL_HDR_MAGIC_NUM_VAL
    - This indicates memory corruption and is a fatal interrupt
    - Better to quiesce on memory corruption

f. STALLING_DEBUG
    - Only triggered with SW write for debug purposes
    - Debug interrupt, currently ignored

Move launch error handling from GP10b to GV11b HAL as -
1. LAUNCHERR_REPORT errcode METHOD_BUFFER_ACCESS_FAULT is not
   defined on Pascal
2. We do not support GP10b on dev-main ToT

JIRA NVGPU-8102

Change-Id: Idc84119bc23b5e85f3479fe62cc8720e98b627a5
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678893
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-14 17:12:14 -07:00
Tejal Kudav
3bfab5df3f gpu: nvgpu: Disable fault mthd buf intrs on safety
Below CE interrupts are disabled on safety build as fault and
switch mechanism is not supported on safety:
NV_CE_LCE_INTR_STATUS_MTHD_BUFFER_FAULT
NV_CE_LCE_INTR_STATUS_FBUF_CRC_FAIL
NV_CE_LCE_INTR_STATUS_FBUF_MAGIC_CHK_FAIL

Bug 3548082

Change-Id: I400cd02a8c9888b7ef0d71bbc1f7d792b48e8227
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2679052
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-10 16:04:37 -08:00
srajum
2316f39f77 userspace: fixing warnings in NVGPU-RM SWVS
- Below are warnings encountered when we use same function names in
  multiple units

  doxygenfunction: Unable to resolve multiple matches for  function
  “test_setup_env” with arguments () in doxygen xml output.

  doxygenfunction: Unable to resolve multiple matches for  function
  “test_free_env” with arguments () in doxygen xml output.

- Fixing warnings by updating functions with unique names in multiple
  units

JIRA NVGPU-7115

Change-Id: Iaa861040208e101c114f5c556096deb09d08b7fe
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601798
(cherry picked from commit f57e408ba2fae4ff9b7c54a441e5cc3e75b0c87c)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678347
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-03-10 16:03:22 -08:00
srajum
8381647662 gpu: nvgpu: fixing MISRA violations
- MISRA Directive 4.7
  Calling function "nvgpu_tsg_unbind_channel(tsg, ch, true)" which returns
  error information without testing the error information.

- MISRA Rule 10.3
  Implicit conversion from essential type "unsigned 64-bit int" to different
  or narrower essential type "unsigned 32-bit int"

- MISRA Rule 5.7
  A tag name shall be a unique identifier

JIRA NVGPU-5955

Change-Id: I109e0c01848c76a0947848e91cc6bb17d4cf7d24
Signed-off-by: srajum <srajum@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2572776
(cherry picked from commit 073daafe8a11e86806be966711271be51d99c18e)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678681
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-10 16:01:18 -08:00
Shashank Singh
8169bc8c83 Revert "gpu: nvgpu: disable golden context image verification"
This reverts commit a372ec9a38.
Earlier golden context image verification was failing on orin safety due
to mismatch. But on tot there is no mismatch obeserved (possibly due to
update of NET image from A to D). So, now golden context image
verification can be re-enabled for orin safety.

Bug 3482988

Change-Id: I2bda9be921987e6b6a3b933b3ff45b26cf3025ca
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2678153
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-09 21:13:45 -08:00
Tejal Kudav
3fe70bf86e gpu: nvgpu: Update CE Intr code as per Orin HSIs
Below CE interrupts do not have any users(usecases) on safety build;
disable them only on safety build.
   1. BLOCKPIPE stall intr: Not used by GFX(VKSC) and CUDA on safety.
   2. NONBLOCK_PIPE nonstall intr: Non-stall intrs are not supported
          on safety build. Also, this one is not used by GFX(VKSC)
          and CUDA.
   3. STALLING_DEBUG intr: Added in Orin tree. It is only needed for
          debugging. Disable on safety build as there is no current
          usage in driver.
   4. POISON_ERROR intr: Poison is a fault containment and not
	  supported on GA10b.
   5. INVALID_CONFIG intr: Floor sweeping not supported on functional
          safety SKU.

Bug 3548082

Change-Id: I8d97ccb38f138b2c04a780e1c255a64d28723405
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671927
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-03-08 11:41:26 -08:00
shashank singh
29019dff6e gpu: nvgpu: remove round_up usage in safety build
- In function gv11b_tsg_init_eng_method_buffers() PAGE_ALIGN can be used
  instead of round_up macro.
- In function nvgpu_posix_find_next_bit() rounding up of start does not
  seem to serve any purpose.

JIRA NVGPU-7057

Change-Id: I4a3a21e95a0f3aa38f7007de1f6959f1d878e511
Signed-off-by: shashank singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614326
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2672107
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-23 11:08:31 -08:00
shashank singh
fb0ebef0a7 gpu: nvgpu: compile out ununsed code on safety build for common.nvgpu
Jira NVGPU-7052

Change-Id: Idab4f9d56c0748f54fd08fc5fd01d96a66f94700
Signed-off-by: shashank singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581247
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2670885
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-23 11:08:05 -08:00
Shashank Singh
5ec241a1d8 gpu: nvgpu: remove non stall intr from top handler for safety
On safety nonstall interrupt is not used and should be compiled out to
rule out any chance of interference with safety code. Remove top handler
support of nonstall interrupt for safety which is currently not
applicable to linux.

Jira NVGPU-7066
Jira NVGPU-4078

Change-Id: I278efc8da6ddd0f22c128af6630cfd1b20ba4784
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2589006
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2671586
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-21 02:31:38 -08:00
Shashank Singh
19a3b86f06 gpu: nvgpu: remove unused code from common.nvgpu on safety build
- remove unused code from common.nvgpu unit on safety build. Also,
remove the code which uses them in other places.
- document use of compiler intrinsics as mandated in code inspection
  checklist.

Jira NVGPU-6876

Change-Id: Ifd16dd197d297f56a517ca155da4ed145015204c
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561584
(cherry picked from commit 900391071e9a7d0448cbc1bb6ed57677459712a4)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2561583
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-02-17 04:58:32 -08:00
Rajesh Devaraj
0699220b85 gpu: nvgpu: compile-out unused apis from safety build
This patch does the following changes:
- Compiles-out unused error reporting APIs and the related
  data structures from safety build. For this purpose, it
  introduces the new flag: CONFIG_NVGPU_INTR_DEBUG
- Updates nvgpu_report_err_to_sdl() API with one more argument,
  hw_unit_id. This aids in finding whether an error to be reported
  is corrected or uncorrected from LUT.
- Triggers SW quiesce, if an uncorrected error is reported to
  Safety_Services, in safety build.
- Renames files in cic folder by replacing gv11b with ga10b,
  since error reporting for gv11b is not supported in dev-main.

JIRA NVGPU-8002

Change-Id: Ic01e73b0208252abba1f615a2c98d770cdf41ca4
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2668466
Reviewed-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-02-14 22:00:33 -08:00
Rajesh Devaraj
878235e914 gpu: nvgpu: remove report error callback
In DRIVE 6.0, NvGPU needs to support error reporting in QNX-Safety,
QNX-Standard, and Linux. To support error reporting in all these
platform variants, SDL unit will be moved from QNX to common code.
As part of this refactoring activity, this patch removes ops assignment
for report error. Also, it removes API calls that are used to take
time-stamp for stall interrupt thread. This time-stamp APIs will be
brought back later, if required to support periodic diagnostics.

JIRA NVGPU-7353

Change-Id: I38536019dc7165e6a97674863b37d009854af948
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2655958
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2022-01-24 02:06:24 -08:00
Seshendra Gadagottu
a7c1052024 gpu: nvgpu: program ltc cg prod values after acr boot
Separate nvgpu_cg_blcg/slcg_fb_ltc_load_enable function
into nvgpu_cg_blcg/slcg_fb_load_enable and
nvgpu_cg_blcg/slcg_ltc_load_enable.

Program fb slcg/blcg prod values during fb init and
program ltc slcg/blcg prod values after acr boot to
have correct privilege for ltc cg programming.

Update unit tests to have sperate blcg/slcg hal for
fb and ltc programming.

Bug 3423549

Change-Id: Icdb45528abe1a3ab68a47f689310dee9a4fe9366
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2646039
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-15 06:08:21 -08:00
Shashank Singh
a372ec9a38 gpu: nvgpu: disable golden context image verification
- Disable golden context image verification until ctxsw fw for orin
safety is ready for this feature.
- Make NULL check for hal set_default_compute_regs else it causes crash
for orin safety.

Bug 3456240

Change-Id: I1f6ca9d78f22cc6776bb0b3a9e05f22171095c7f
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2645666
(cherry picked from commit 3907d1b315e1247243632fefdcbce69d58090681)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2644533
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2022-01-06 11:40:46 -08:00
Sagar Kamble
c463810bcd gpu: nvgpu: fix ltc isr, unit tests
LTC isr doesn't handle ECC errors correctly. INTR3 reports only
parity ECC errors and INTR reports SEC/DED ECC errors. nvgpu
managed both these errors with same counters. Fix it as per
Volta ECC HW Functional Description.

JIRA NVGPU-6982

Change-Id: I6ddaab55f7e1354ad9b832672a9006b7e58df9f7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2605012
(cherry picked from commit 5f92651e921b17cb61bbbb8954128c787cd89238)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632548
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-17 14:36:45 -08:00
Sagar Kamble
449a4823d4 gpu: nvgpu: compile out non fusa LTC functionality
nvgpu_ltc_sync_enabled functionality is used only in the kernel mode
submit path and for debugging. en_illegal_compstat functionality is
used for debugging .

Compile them out under CONFIG_NVGPU_NON_FUSA.

JIRA NVGPU-6982

Change-Id: I404d4b74b2e60ba4c2173ba0bfb643b1ecb6ba7c
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2605011
(cherry picked from commit f4bcafe73c8f7184b5e125e3ff6e55ceccaf87eb)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632547
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-17 14:36:40 -08:00
Konsta Hölttä
ae166bba8a gpu: nvgpu: posix: print WARN*() location
WARN() and WARN_ON() are most useful when the log explains where they
happened. The posix implementation of these prints neither that nor the
warning message (if any). Extend the macros to include function name and
line number, and print those plus the format string.

Actually formatting the format string is problematic wrt. MISRA rules,
so the arguments are not formatted.

The implementation of BUG() already prints the function name and line
number.

Change-Id: Ie246a915f5e8420e1c606bb1555a7f9b498725fd
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2634105
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-12-11 14:05:16 -08:00
Konsta Hölttä
632644b44a gpu: nvgpu: couple runlist domains and nvs
Now that the main nvsched code exists in the nvgpu build, make it
control the runlist domains. As a new nvs domain is created, create the
relevant runlist data too. To support the default domain, create a
default nvs domain at boot.

The scheduling domain code owns the responsibility of domain lifetime,
and runlist domains exist to serve that logic although the RL domains
are directly used by channel and TSG logic. Add refcounting to the
scheduler uapi level to make sure that busy domains (that still have TSG
participants) do not get removed too early.

Adjust error injection sensitive unit tests to match the updated logic.

Jira NVGPU-6425
Jira NVGPU-6427

Change-Id: I1beec97c54c60ad334165b1c0acb5e827c24f2ac
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632287
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-07 07:07:12 -08:00
Deepak Nibade
9f55801a15 gpu: nvgpu: move local golden context memory allocation to poweorn
- Separate out local golden context memory allocation from
  nvgpu_gr_global_ctx_init_local_golden_image() into a new function
  nvgpu_gr_global_ctx_alloc_local_golden_image().
- Add a new member local_golden_image_copy to struct
  nvgpu_gr_obj_ctx_golden_image to store copy used for context
  verification.
- Allocate local golden context memory from nvgpu_gr_obj_ctx_init()
  which is called during poweron path.
- Remove memory allocation from nvgpu_gr_obj_ctx_save_golden_ctx().
- Disable test test_gr_obj_ctx_error_injection since it needs rework
  to accomodate the new changes.
- Fix below tests to allocate local golden context memory :
  test_gr_global_ctx_local_ctx_error_injection
  test_gr_setup_alloc_obj_ctx

Bug 3307637

Change-Id: I2f760d524881fd328346838ea9ce0234358f8e51
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2633713
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-01 08:44:30 -08:00
Tejal Kudav
6a1fd53b54 gpu: nvgpu: Mark read_ptimer() HAL as NON_FUSA
Remove read_ptimer() API from safety build as GPU_GET_TIME DEVCTL got
removed. This functionality is entirely implemented inside nvrm_gpu.
Remove related unit-tests.

JIRA NVGPU-4922

Change-Id: I3c1d2e16ddf170d4f08d6bf4826ee683ea0d9e19
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2608654
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-12-01 08:39:27 -08:00
Sagar Kamble
48d17e9c53 gpu: nvgpu: fix the unit test traceability
gk20a_tsg_unbind_channel_check_hw_next was not added to Targets
in unit test specification. Add it.  __attribute__ in debug.h
is captured by Doxygen as function with no tests. However it
is not really a function and applies to non-fusa function so
skip it in Doxygen.

JIRA NVGPU-7211

Change-Id: I2adadaebbf4e43768eb408dd10aaa20b1e13eccc
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2615256
(cherry picked from commit e829afb55a17dc0dacf17c71633f5689324171d7)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623629
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-14 04:24:17 -08:00
Sagar Kamble
f64b5e20b0 gpu: nvgpu: add unit test for gk20a_tsg_unbind_channel_check_hw_next
false branch when NEXT bit is not set is not covered. Add unit test
for same.

JIRA NVGPU-7211

Change-Id: I57725e35971605bf8144e7eaac618f44a38e5b31
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614209
(cherry picked from commit 2064209f92700dc859d7398e061b3d7dc2725521)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623628
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-14 04:24:11 -08:00
Sagar Kamble
cdfbd4313b gpu: nvgpu: add BVEC tests for common.mc unit
Add BVEC tests for following common.mc unit APIs:
1. nvgpu_mc_intr_stall_unit_config
2. nvgpu_mc_intr_nonstall_unit_config
3. mc.reset_mask

Changed the WARN to nvgpu_err in mc.reset_mask. Invalid inputs are
handled properly.

Updated the MC unit test logic w.r.t mc_intr_en_r, mc_intr_en_set_r
and mc_intr_en_clear_r semantics.

JIRA NVGPU-6399

Change-Id: I6a3ae42ac37cd6b586f6c71de338595e6cb04a37
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542591
(cherry picked from commit b9908c979e8964a216141cc6ed475c7de2f2cc0b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623631
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-14 01:30:25 -08:00
Sagar Kamble
0394aef90d gpu: nvgpu: cast bvec test
Add BVEC tests for following functions:
nvgpu_safe_cast_u64_to_u32, nvgpu_safe_cast_u64_to_u16,
nvgpu_safe_cast_u64_to_u8, nvgpu_safe_cast_u64_to_s64,
nvgpu_safe_cast_u64_to_s32, nvgpu_safe_cast_s64_to_u64,
nvgpu_safe_cast_s64_to_u32, nvgpu_safe_cast_s64_to_s32,
nvgpu_safe_cast_u32_to_u16, nvgpu_safe_cast_u32_to_u8,
nvgpu_safe_cast_u32_to_s32, nvgpu_safe_cast_u32_to_s8,
nvgpu_safe_cast_s32_to_u64, nvgpu_safe_cast_s32_to_u32,
nvgpu_safe_cast_s8_to_u8, nvgpu_safe_cast_bool_to_u32

JIRA NVGPU-6412

Change-Id: Ic97e45051570a7133045de6cb4345c5f935cf9f6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555469
(cherry picked from commit be2ba5f1a7ead4c062eab74c7587c32797a651df)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623637
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-12 21:37:01 -08:00
Sagar Kamble
a1e75fe9bc gpu: nvgpu: add unit test for nvgpu_wrapping_add_u32
Add BVEC unit test for the function nvgpu_wrapping_add_u32.

JIRA NVGPU-7211

Change-Id: I5c4c870c75b3e7643a771110b2c0d248c1f8cb56
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614166
(cherry picked from commit f6a2fae67c3dd0d3f11deba2cb943a8c6420fda5)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623633
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-12 21:36:55 -08:00
Sagar Kamble
d944313a54 gpu: nvgpu: arithmetic bvec tests
Add BVEC tests for following functions:
nvgpu_safe_sub_u8, nvgpu_safe_add_u32, nvgpu_safe_add_s32,
nvgpu_safe_sub_u32, nvgpu_safe_sub_s32, nvgpu_safe_mult_u32,
nvgpu_safe_add_u64, nvgpu_safe_add_s64, nvgpu_safe_sub_u64,
nvgpu_safe_sub_s64, nvgpu_safe_mult_u64, nvgpu_safe_mult_s64

JIRA NVGPU-6412

Change-Id: Ie4f1138318314c3f53b1f188e1ca45f681ca895e
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2553170
(cherry picked from commit 74c32f975c181107372957a28aad0cb5278f42b2)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2623630
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-12 21:36:49 -08:00
Konsta Hölttä
6cff904dc3 gpu: nvgpu: use runlist obj for wait_pending
Change the gops_runlist::wait_pending API to take a runlist pointer
instead of a runlist ID to better match with the rest of that interface.

Jira NVGPU-6425

Change-Id: I96c4f49df8e2613498e0a09cc75a950824828bed
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621214
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-11-11 20:39:47 -08:00
Konsta Hölttä
3cf796b787 gpu: nvgpu: move active bitmaps to domain
Move the active_channels and active_tsgs bitmaps from struct
nvgpu_runlist to struct nvgpu_runlist_domain. A TSG and its channels are
currently active as part of a runlist; in the future, a runlist may be
switched from multiple domains that each are a collection of TSGs.

The changes are still internal to the runlist code. Users of runlists
need no modifications.

Jira NVGPU-6425

Change-Id: I2d0e98e97f04b9716bc3f4890cf881735d0ab664
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2618387
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-11-03 20:55:08 -07:00
Konsta Hölttä
1d23b8f13a gpu: nvgpu: introduce internal runlist domain
The current runlist code assumes a single runlist buffer to hold all TSG
and channel entries. Create separate RL domain and domain memory types
to hold data that is related to only a scheduling domain and not
directly to the runlist hardware; in the future, more than one domains
may exist and one of them is enabled at a time.

The domain is used only internally by the runlist code at this point and
is functionally equivalent to the current runlist memory that houses the
round robin entries.

The double buffering is still kept, although more domains might benefit
from some cleverness. Although any number of created domains may be
edited in runtime, nly one runlist memory is accessed by the hardware at
a time. To spare some contiguous memory, this should be considered an
opportunity for optimization in the future.

Jira NVGPU-6425

Change-Id: Id99c55f058ad56daa48b732240f05b3195debfb1
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2618386
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-11-03 20:54:48 -07:00
Konsta Hölttä
93f7636268 gpu: nvgpu: unit test mapping cache maint errors
Target the recently extended error handling paths in gmmu mapping paths
in both passing and failing PTE entry update conditions. Verify the
number of calls to cache ops and that failed mappings leave the PTEs
cleared.

Bug 200778663

Change-Id: I1a69b514a6815e83fe0efaf1dcf1613d3fcb76aa
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2616042
(cherry picked from commit a132b0322b36b4014d90370ce0b415295f125faf)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2617911
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-28 21:06:02 -07:00
Konsta Hölttä
f4ec400d5f gpu: nvgpu: simplify nvgpu_timeout_init
nvgpu_timeout_init() returns an error code only when the flags parameter
is invalid. There are very few possible values for flags, so extract the
two most common cases - cpu clock based and a retry based timeout - to
functions that cannot fail and thus return nothing. Adjust all callers
to use those, simplfying error handling quite a bit.

Change-Id: I985fe7fa988ebbae25601d15cf57fd48eda0c677
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2613833
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-26 13:47:32 -07:00
Konsta Hölttä
9b3f3ea4be gpu: nvgpu: remove timeout fault injection tests
The timeout init API is changing to return void in most cases. Adapt the
unit tests to the reduced branching.

Change-Id: I4d05484529fe4ef46b518f41d10b71a4a9f9c6fb
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2614286
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-26 13:47:20 -07:00
Konsta Hölttä
32a148867f gpu: nvgpu: unit test leaky failed mappings
Ensure that when a mapping attempt fails in the middle of updating GMMU
PTEs, the PTEs are left unmapped. Add test_map_buffer_security() to the
VM tests to trigger a PD allocation failure and verify the first PTE.

Bug 200778663

Change-Id: I766c1a68b6f734a218c5c4a4f6a6655a7ad8ca27
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2599538
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-13 13:52:00 -07:00
Konsta Hölttä
4c62b1aad4 gpu: nvgpu: unit: avoid use-after-free in unmap test
The error path in map_buffer() attempts to unmap a buffer twice to check
that such action does not cause errors. The call site uses a field of a
freed structure in the second call; store that in a local variable to
avoid reading freed memory.

Change-Id: I20fe66cf255dce25b1c4012bda2a6f864daf419a
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2605495
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-09 15:06:09 -07:00
Deepak Nibade
d1f3f81553 gpu: nvgpu: remove SW methods from safety build
Improved SDL heartbeat mechanism detects the interrupts triggered by
SW method and treats them as errors. Hence remove the SW method support
completely from safety build. Registers set by SW methods are now set
by default for all the contexts.

Implement new HAL gops.gr.init.set_default_compute_regs() to set the
registers in patch context. Call this HAL while creating each context.

Update gv11b_gr_intr_handle_sw_method() to treat all compute SW methods
as invalid.

Update unit test test_gr_intr_sw_exceptions() so that it now expects
failure for any method/data.

Bug 200748548

Change-Id: I614f6411bbe7000c22f1891bbaf06982e8bd7f0b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2527249
(cherry picked from commit bb6e0f9aa1404f79bcfbdd308b8c174a4fc83250)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2602638
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-04 18:03:55 -07:00
Konsta Hölttä
1b1d183b9c gpu: nvgpu: simplify gmmu map calls
Introduce nvgpu_gmmu_map_partial() to map a specific size of a buffer
represented by nvgpu_mem, or what nvgpu_gmmu_map() used to do. Delete
the size parameter from nvgpu_gmmu_map() such that it now maps the
entire buffer. The separate size parameter is a historical artifact from
when nvgpu_mem did not exist yet; the typical use is to map the entire
buffer.

Mapping at a certain address with nvgpu_gmmu_map_fixed() still takes the
size parameter.

The returned address still has to be stored somewhere, typically to
mem.gpu_va by the caller so that the matching unmap variant finds the
right address.

Change-Id: I7d67a0b15d741c6bcee1aecff1678e3216cc28d2
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601788
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-10-01 21:38:43 -07:00
Konsta Hölttä
44422db851 gpu: nvgpu: simplify gmmu unmap calls
Introduce nvgpu_gmmu_unmap_addr() to unmap a nvgpu_mem that was mapped
at some other address than mem.gpu_va, which can be the case for buffers
that are shared across different address spaces. Delete the address
parameter from nvgpu_gmmu_unmap(), as the common case is to store the
address to mem.gpu_va when mapping the buffer.

Modify some instances of consecutive unmap + free calls to call just
nvgpu_dma_unmap_free().

Change-Id: Iecd7c9aa41d04e9f48e055f6bc0c9227cd759c69
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2601787
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-30 16:29:41 -07:00
Sagar Kamble
72c3bce602 gpu: nvgpu: compile out non-safe ctxsw_prog hals
Following two hals are non-safe. Compile them under
CONFIG_NVGPU_HAL_NON_FUSA:
1. init_ctxsw_hdr_data
2. disable_verif_features

JIRA NVGPU-5358

Change-Id: I751c4655dc628f7ab66ed3a779268a6a88f9a1e3
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581361
(cherry picked from commit abf16c6a01109d174879609c10354f06739fb6dc)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581842
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:12 -07:00
Sagar Kamble
62b04331de gpu: nvgpu: compile out priv_access_map config/addr hals
These hals are non-safe. Compile them out with
CONFIG_NVGPU_SET_FALCON_ACCESS_MAP.

JIRA NVGPU-5358

Change-Id: I75b46e201fa132e09fee15679a402d24bbf9b2ab
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581360
(cherry picked from commit d048333ef391019b2618abf7d09c8fe2042f8ee0)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2581841
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-21 03:17:00 -07:00
Tejal Kudav
5a94007725 gpu: nvgpu: Remove redundant HAL from common.fbp
common.fbp has two interfaces to initialize FBP:
1. Public API nvgpu_fbp_init_support
2. HAL fbp.fbp_init_support

nvgpu_fbp_init_support() is only used to initialize HAL
fbp.fbp_init_support. Remove the HAL and use the API directly.

JIRA NVGPU-6644

Change-Id: I2c455e09dbcf5e4fb1dc370b284e4f0d5c678b40
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2592047
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-16 05:59:00 -07:00
Debarshi Dutta
791dc18666 gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields
Add Setter and Getter methods for accessing tsg->sm_error_states.
Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state.
This renders it unnecessary to add BVEC for above fields for the struct
in multiple locations. The current design ensures that only a constant
pointer is obtained from the owner unit i.e. FIFO.

The following new methods are added. Both unit tests and BVEC tests
are added for them as well.

nvgpu_tsg_store_sm_error_state
nvgpu_tsg_get_sm_error_state

Jira NVGPU-6947

Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
(cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-13 20:57:09 -07:00
Sagar Kadamati
dd9b4364aa gpu: nvgpu: add nvgpu-next infrastructure
* As of now, working on multiple chip bringup in nvgpu-next repo has
   an issue because we end with losing control on source code (hard to
   find which part of the code belongs to which chip) and it's valuable
   history this affects chip migration on release.

 * To support multiple chip bringup simultaneously, we need new
   guidelines to avoid losing control on source code and make migration
   easier. This change adds links to nvgpu-next repo.

 * Updated return code to ENODEV for consistency
 * Updated ACR unittest to work with ENODEV return code

NOTE:
     These are the initial set of infrastructure changes, guidelines
     will evolve, and source code will get updated accordingly.

     Based on future chip features, Which part of the source code falls
     under nvgpu-next repo is decided.

JIRA NVGPU-6574

Change-Id: I81827e35d189c55554df00e255b527a4473e0338
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556793
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-09-08 06:50:38 -07:00
ajesh
7155ae865c gpu: nvgpu: update queue unit tests
Update queue unit tests for code coverage.

JIRA NVGPU-6904

Change-Id: I49ed6980f2d610cf8359c375a1236e8866ea6795
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555333
(cherry picked from commit f2311f2710cab83b82ed7f5d51c54fa897051686)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560216
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 05:57:54 -07:00
ajesh
3c70d56ddb gpu: nvgpu: update posix thread unit tests
Update the unit tests for posix thread unit to increase
coverage.

JIRA NVGPU-6904

Change-Id: Ib103de1ee37fb4986aa36900772b78b990ccb02a
Signed-off-by: ajesh <akv@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555772
(cherry picked from commit cd45d1cd2d095c77d738fdf7746fd258bc58353b)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2560213
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-09-01 05:57:49 -07:00
Sagar Kamble
40064ef1ec gpu: nvgpu: fix ecc counter free
ECC counter structures are freed without removing the node from the
stats_list. This can lead to invalid access due to dangling pointers.

Update the ecc counter free logic to set them to NULL upon free, to
remove them from stats_list and free them by validation.

Also updated some of the ecc init paths where error was not propa-
gated to callers and full ecc counters deallocation was not done.

Now, calling unit ecc_free from any context (with counters alloc-
ated or not) is harmless as requisite checks are in place.

bug 3326612
bug 3345977

Change-Id: I05eb6ed226cff9197ad37776912da9dcb7e0716d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2565264
Tested-by: Ashish Mhetre <amhetre@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2021-08-11 01:55:08 -07:00
Tejal Kudav
b33079d47e gpu: nvgpu: Move intr data members from MC to CIC
Move interrupt specific data-members from common.mc to common.cic
Some of these data members like sw_irq_stall_last_handled_cond need
To be initialized much earlier during the OS specific init/probe stage.
Also, some more members from struct nvgpu_interrupts(like stall_size,
stall_lines[]), which will soon be moved to CIC will also need to be
initialized early during the OS specific probe stage.
However, the chip specific LUT can only be initialized after the
hal_init stage where the HALs are all initialized.
Split the CIC init to accommodate the above initialization requirements.

JIRA NVGPU-6899

Change-Id: I9333db4cde59bb0aa8f6eb9f8472f00369817a5d
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2552535
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-19 18:06:28 -07:00
Divya Singhatwaria
9f30609550 gpu: nvgpu: Rename TPC powergating mutex
Rename tpc_pg_lock to static_pg_lock and
have_tpc_pg_lock to have_static_pg_lock as it
is used for tpc/gpc/fbp power gating.

JIRA NVGPU-6433

Change-Id: I4c56b9710e303ad9e872bad4b5ed9a167acb9dd6
Signed-off-by: Divya Singhatwaria <dsinghatwari@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2537489
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2021-07-18 02:46:25 -07:00