Rename __nvgpu_set_enabled() to nvgpu_set_enabled(). The original
double underscore was present to indicate that this function is a
function with potentially unintended side effects (enabling a feature
has wide ranging impact).
To not lose this documentation a comment was added to convey that this
function must be used with care.
JIRA NVGPU-1029
Change-Id: I8bfc6fa4c17743f9f8056cb6a7a0f66229ca2583
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1989434
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new unit common/gr/ctx.c to manage GR context
This unit provides interfaces to allocate/free/map/unmap GR context,
patch context, pm context, ctxsw {preempt/spill/betacb/pagepool/rtvcb}
buffers.
It also provides APIs to set size of above buffers
Add new header file include/nvgpu/gr/ctx.h to declare all the interfaces.
Move nvgpu_gr_ctx, patch_desc, pm_ctx_desc, zcull_ctx_desc structures
to this unit
Add new structure nvgpu_gr_ctx_desc to hold context description
parameters. For now we add sizes of all the buffers here.
Add this structure to gr_gk20a for global reference
Remove gr_gp10b_alloc_buffer() since it is no longer used
Rename g->ops.gr.alloc_gfxp_rtv_cb() to g->ops.gr.init_gfxp_rtv_cb()
since this HAL now only sets the size of rtvcb ctxsw buffer
Remove gr->ctx_vars.buffer_size and gr->ctx_vars.buffer_total_size
since they were redundant. We already have gr->ctx_vars.golden_image_size
to denote golden image size
Jira NVGPU-1527
Change-Id: I8847b347f80235209dd5e28d979e79984ab85408
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1987702
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove a BUG() call in nvgpu_vm_find_mapping(). This bug was
present to ensure that we evaluate whether this function is
necessary or useful before deleting it.
nvgpu_vm_find_mapping() facilitates map caching in the VM
layer. This feature is not necessary for unit testing at the
moment. As such delete the BUG() and allow this code to run.
JIRA NVGPU-1352
Change-Id: If7cdf3c0dfe0e94e32061d7775d2eef7318cf63f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1987807
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Local golden image is copy of global GR context buffer hence move its
ownership to global context unit
Add new structure nvgpu_gr_global_ctx_local_golden_image to hold all meta
data for local golden image and move it to struct gr_gk20a
Expose and use new APIs to initialize/deinitialize and load local golden image
Jira NVGPU-1625
Change-Id: Ieb68e52c205ca0ecd27f8bf4bb31922a01e7ae54
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1984952
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Turing board (PG189) variants (A00, A01, A02, A03) need
different dealys for valid power good (PG) signal. To support
all board variants change the delay to minimum required value
of 250ms.
Bug 200452556
JIRA NVGPU-1100
Change-Id: Iba2a6b17dec7552197cb0b7873132d83e9e09aea
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1987659
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add clk arbiter support for tu104
setup clk_arb for supporting functions in hal_tu04
TU104 supports GPCCLK and not GPC2CLK
Remove multiplication and division by 2 to convert gpcclk to gpc2clk
Provide support for following features
*Domains: Currently GPCCLK is supported
*clk Range: From P0 min to P0 max
*Freq Points: Gives the VF curve from PMU
*Default: Default value(P0 Max)
*Current Pstate: P0 is supported
All request for change is freq is validated against P0 value
Out of bound values are trimmed to match the Pstate limits
Multiple requests are supported and max of that will be set
Requests are sent to PMU via change sequencer
Bug 200454682
JIRA NVGPU-1653
Change-Id: I36735fa50c7963830ebc569a2ea2a2d7aafcf2ab
Signed-off-by: Abdul Salam <absalam@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1982078
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Extract non-chip-specific code that manages the runlists (init, update,
reschedule etc.) to a new file in the common directory. Move the
declarations to a new matching runlist.h header.
Jira NVGPU-1309
Change-Id: I3c7e0032899516487037f47ddc9a7e7aa4b0b33a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1978058
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add new unit common/gr/global_ctx.c to manage GR global context buffers
This unit provides interfaces to allocate/free/map/unmap all the global
context buffers. It also provides APIs to get/set size of the buffers,
and to get memory handle of the buffers
Use interfaces exposed by this unit instead of directly accessing global
context buffers in common code
Add new header file include/nvgpu/gr/global_ctx.h to declare all the
interfaces.
Rename "struct gr_ctx_buffer_desc" to "struct nvgpu_gr_global_ctx_buffer_desc"
which holds all data for each global context
Remove void *priv since it is no longer used
Add size to the desc structure to store the requested size
Remove global_ctx_buffer_size from struct nvgpu_gr_ctx since it is no longer
used for any real purpose
Jira NVGPU-1625
Change-Id: I3feaf47bc2fdf192f36b136f2ef80a49d1782c5d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1977884
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For K4.14, the pci driver is enabled with CONFIG_PCIE_TEGRA=y. The
check of dummy APIs doesn't capture this config. Fix this to use
tegra_pcie_detach/attach_controller() APIs from the pci driver.
Bug 200480179
JIRA NVGPU-1100
Change-Id: I3a2b4f243dce6ead1174b12bc8ce2ffb6700c86b
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1982549
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This reverts commit 98dca979d6.
The original commit was reverted because there was an issue
where commit "include: uapi: move nvhost user-interface headers" in
linux-nvidia repo caused DLA UMD driver to be exposed of including
kernel headers directly.
Since then, DLA UMD driver has been fixed to use headers from
user-space code. And hence restore this change and commit
"include: uapi: move nvhost user-interface headers" in linux-nvidia
repo.
Bug 200471393
Change-Id: Ic8627aca37422aad9b2549c9b7e6de1474d80af9
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1980596
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The GC-OFF feature shall be available only for selective
dGPUs like Volta, etc. To enable this, add a platform flag
to control GC-OFF feature for a given dGPU.
If GC-OFF is not enabled for a dGPU, EPERM error will be
returned by kernel interfaces.
JIRA NVGPU-1100
Change-Id: Ic9e4492b2bb8916d520e78ecb6a500ccd349b70c
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1923249
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This change adds RDMA supports for tegra iGPU.
1. Cuda Process allocates the memory and passes
the VA and size to the custom kernel driver.
2. The custom kernel driver maps the user allocated
buf and does the DMA to/from it.
3. Only supports iGPU + cudaHostAlloc sysmem
4. Works only for a given process.
5. Address should be sysmem page aligned and size should
be multiple of sysmem page size.
6. The custom kernel driver must register a free_callback when get_page()
function is called.
Bug 200438879
Signed-off-by: Preetham Chandru R <pchandru@nvidia.com>
Change-Id: I43ec45734eb46d30341d0701550206c16e051106
Reviewed-on: https://git-master.nvidia.com/r/1953780
(cherry picked from commit d6278955f6)
Reviewed-on: https://git-master.nvidia.com/r/1821407
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The PMU pstate deinit was invoked part of gpu power off. This frees and clears
the pmgr_pmu struct which causes the pmu remove support to crash when it
tries to access the pmgr_pmu object for freeing up the pmu board objects.
Deferred pstate deinit to nvgpu driver removal as there is no reason for it be
invoked part of prepare poweroff sequence.
JIRA NVGPU-1618
Change-Id: I2eb52000f0732d0abed54946e0843367b119d443
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1971225
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Commit ca611e4d0e (gpu: nvgpu: verify usermode mapping is at least
PAGE_SIZE) was not quite the right thing to do; do_mmap() rounds the
length up to a page boundary anyway, but the length must not be longer
than the size of the usermode region which is 64 KB to avoid leaking
access to other registers.
Bug 2441531
Change-Id: Ib1c88a6725db62c8276b6e8b880631227a4fc8cd
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1971339
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Allen Martin <amartin@nvidia.com>
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
PMU counters #0 and #4 are used to count total cycles and busy cycles.
These counts are used by podgov to estimate GPU load.
PMU idle intr status register is used to monitor overflow. Overflow
rarely occurs because frequency governor reads and resets the counters
at a high cadence. When overflow occurs, 100% work load is reported to
frequency governor.
Bug 1963732
Change-Id: I046480ebde162e6eda24577932b96cfd91b77c69
Signed-off-by: Peng Liu <pengliu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1939547
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
NVGPU_CTXSW_IOCTL_RING_SETUP can be used to setup a custom ring buffer
and it accepts size via arguments. nvgpu driver will return an error
if size requested is greater than 128 * 4096 but this value is hardcoded
and not exposed anywhere.
Add characteristics field in nvgpu.h to expose this size so that corresponding
nvrm_gpu API can use it.
Bug 2169674
Change-Id: Icf9465d4eec6ba3a307ea9490bd5da563944e4f6
Signed-off-by: Anup Mahindre <amahindre@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1967596
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We had to force allocation of physically contiguous memory for
USERD in nvlink case, as a channel's USERD address is computed as
an offset from fifo->userd address, and nvlink bypasses SMMU.
With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.
PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous, to setup the BAR1 inst block.
- Add slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to related nvgpu_mem descriptor
- ch->userd_offset is the offset from the beginning of the slab
- Pre-allocate GPU VAs for the whole BAR1
- Add g->ops.mm.bar1_map() method
- gk20a_mm_bar1_map() uses fixed mapping in BAR1 region
- vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1
- TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.
Bug 2422486
Bug 200474793
Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This adds APIs to the POSIX sgt module to allow unit tests to create
sgt's and sgl's more easily. The new APIs allow the caller to pass in a
table of sgl's and creates the new sgt/sgl as needed.
Since this there is a new API to create an sgl, this also includes a new
nvgpu_sgl_free() API.
JIRA NVGPU-1563
Change-Id: I27ab7654289cbd2212a75625b6dd24d2b742e3bf
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1966151
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add separate new unit gr/ctxsw_prog that provides interface to access
h/w header files hw_ctxsw_prog_*.h
Add below chip specific files that access above h/w unit and provide
interface through g->ops.gr.ctxsw_prog.*() HAL for rest of the units
common/gr/ctxsw_prog/ctxsw_prog_gm20b.c
common/gr/ctxsw_prog/ctxsw_prog_gp10b.c
common/gr/ctxsw_prog/ctxsw_prog_gv11b.c
Remove all the h/w header includes from rest of the units and code.
Remove direct calls to h/w headers ctxsw_prog_*() and use HALs
g->ops.gr.ctxsw_prog.*() instead
In gr_gk20a_find_priv_offset_in_ext_buffer(), h/w header
ctxsw_prog_extended_num_smpc_quadrants_v() is only defined on gk20a
And since we don't support gk20a remove corresponding code
Add missing h/w header ctxsw_prog_main_image_pm_mode_ctxsw_f() for
some chips
Add new h/w header ctxsw_prog_gpccs_header_stride_v()
Jira NVGPU-1526
Change-Id: I170f5c0da26ada833f94f5479ff299c0db56a732
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1966111
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This is part of a move to 64KiB for usermode mapping to fix failures
when the system page size is 64KiB. When remapping or zapping the
vma, use the existing size, not hardcoded size. Also change the
verification of the size when creating the mapping to verify it is at
least as big as PAGE_SIZE. This allows 4KiB mappings to continue to
work until nvrm_gpu is changed to use 64KiB mappings.
Bug 2441531
Change-Id: I447ef8e9f84e6d70bbe96b527e267ec41c5630b8
Signed-off-by: Allen Martin <amartin@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1964687
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>