Add gk20a_channel_from_id() to retrieve a channel, given a raw channel
ID, with a reference taken (or NULL if the channel was dead). This makes
it harder to mistakenly use a channel that's dead and thus uncovers bugs
sooner. Convert code to use the new lookup when applicable; work remains
to convert complex uses where a ref should have been taken but hasn't.
The channel ID is also validated against FIFO_INVAL_CHANNEL_ID; NULL is
returned for such IDs. This is often useful and does not hurt when
unnecessary.
However, this does not prevent the case where a channel would be closed
and reopened again when someone would hold a stale channel number. In
all such conditions the caller should hold a reference already.
The only conditions where a channel can be safely looked up by an id and
used without taking a ref are when initializing or deinitializing the
list of channels.
Jira NVGPU-1460
Change-Id: I0a30968d17c1e0784d315a676bbe69c03a73481c
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1955400
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The return type of the function pointer *calc_global_ctx_buffer_size()
is changed from int to u32 and all its implementations.
The arg type of size in *set_big_page_size() is changed from int to
u32 and all it implementations. These changes are necessary because
size should be an unsigned value.
JIRA NVGPU-992
Change-Id: I3e4cd1d83749777aa8588a44a48772e26f190c4d
Signed-off-by: Sai Nikhil <snikhil@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1950503
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
vgpu gv11b advertises IO coherence with NVGPU_SUPPORT_IO_COHERENCE. Turn
on NVGPU_USE_COHERENT_SYSMEM so that nvgpu internals choose the correct
flag as well; we already set both for native environments together.
Most likely the availability of IO coherence should be read somewhere
instead of hardcoding these flags though.
Bug 200145225
Bug 200467197
Change-Id: Ia1f7b75fdcc230b92aedd50ba1aa0416786a9ed3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1951462
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently has_timedout variable is protected by wmb at places
where it is being set and there is no correspoding rmb whenever
has_timedout variable is read. This is prone to errors for
concurrent execution. This change is supposed to fix this issue.
Rename has_timedout variable of channel struct to ch_timedout.
Also to avoid rmb every time ch_timedout is read,
ch_timedout_spinlock is added to protect ch_timedout
variable for taking care of concurrent execution.
Bug 2404865
Bug 2092051
Change-Id: I0bee9f50af0a48720aa8b54cbc3af97ef9f6df00
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1930935
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Rule 21.15 prohibits use of memcpy() with incompatible ptrs
to qualified/unqualified types.
To circumvent this issue we've introduced a new MISRA-compliant
nvgpu_memcpy() function.
This change switches non-offending memcpy usage in vgpu/* code
over to to use nvgpu_memcpy() with appropriate casts applied
to maintain consistency within nvgpu.
JIRA NVGPU-849
Change-Id: Ib62d77929d60bbda64c51d3d81adf66ed155a8a9
Signed-off-by: Scott Long <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1946262
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Rule-15.6 requires that all if-else blocks and loop blocks
be enclosed in braces, including single statement blocks. Fix errors
due to single statement if-else and loop blocks without braces
by introducing the braces.
JIRA NVGPU-775
Change-Id: Ib70621d39735abae3fd2eb7ccf77f36125e2d7b7
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1928745
GVS: Gerrit_Virtual_Submit
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
MISRA Rule-17.7 requires the return value of all functions to be used.
Fix is either to use the return value or change the function to return
void. This patch contains fix for all 17.7 violations instandard C functions
in GPU specific files.
JIRA NVGPU-1036
Change-Id: Iefadc38bdbea4f02de3c24b6ad1c71d6eb0af4bd
Signed-off-by: Nicolas Benech <nbenech@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1929903
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
pmuif structures refer to the ctrl structures, so that means that ctrl
structures are part of the pmuif. Move the headers to the right place
and update all include statements to include from the right place.
JIRA NVGPU-596
Change-Id: I7be1a727be654d58eccd0e12d599979687dd0733
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1934022
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
All the netlist parsing code is currently under GR unit, but netlist
ucode parsing does not really have any logical dependency to GR
Hence separate out a new unit common/netlist/ that parses the netlist
image and stores/exposes its content through netlist_vars structure
Structure nvgpu_netlist_vars is added to structure gk20a
Move netlist parsing code to common/netlist/netlist.c and chip
specific files to common/netlist/netlist_<chip>.c
Move simulation netlist parsing to common/netlist/netlist_sim.c
Rename g.ops.gr_ctx HAL to g.ops.netlist
Rename all the exported structures to be in the form of nvgpu_*
Rename all exported functions to be in the form of nvgpu_netlist_*()
Add netlist initialization to GPU boot path, and add deinitialization
to GPU remove path
Jira NVGPU-1317
Change-Id: I9af86e3b3230a89db5260cc8ed96ff5f72938c9a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1936454
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- In vGPU code path, function vgpu_channel_abort_cleanup() does not
obtain channel reference before using the channel structure
(channel_gk20a)
- vgpu_channel_abort_cleanup() is called by vgpu_intr_thread() which
runs commands obtained from interrupt queue from RM server.
- If there is a scenario where gk20a_channel_release() function runs
before guest receives notification from RM server to abort channel
cleanup, channel gets freed before vgpu_channel_abort_cleanup() runs.
- However, because vgpu_channel_abort_cleanup() does not take explicit
reference to the channel, it ends up accessing structures
(such as ch->g) which are set to NULL and thus we end up in a crash.
- This patch explicitly takes reference of channel before
vgpu_channel_abort_cleanup() is called.
- If gk20a_channel_release() runs before vgpu_channel_abort_cleanup()
and ends up freeing channel, we dont get reference to freed
channel in vgpu_channel_abort_cleanup() and thus we return from
function rather than continuing with freed channel as was the case
previously.
Bug 200453473
JIRA EVLR-3411
Change-Id: I311043b2231336616b28246531cf8a0dc151b0cd
Signed-off-by: Deepak Bhosale <dbhosale@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1932028
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reduce the size of memory allocations in the channel debug dump by
capturing only the necessary values from the instance block. This also
simplifies the allocation path slightly with the downside of having to
add a capture_channel_ram_dump HAL for reading the interesting parts
explicitly beforehand to the now smaller staging buffer.
Also rename struct ch_state to struct nvgpu_channel_dump_info.
Jira NVGPU-886
Change-Id: I5d7518d9d474b0b728b183383bc83d89ecf91b98
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1928207
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename gk20a/dbg_gpu_gk20a.c to common/debugger.c and make it a
separate common unit
Also rename gk20a/dbg_gpu_gk20a.h to include/nvgpu/debugger.h
We had two different HALs for debugger - gops.debugger and
gops.dbg_session_ops
Combine them into one single HAL gops.debugger and remove
gops.dbg_session_ops
Rename all exported APIs from debugger.h to be in the form of
nvgpu_*()
Jira NVGPU-1013
Change-Id: I136dc7786e3b2065921eb03b99f16049212f3cd2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1920075
Reviewed-by: Sachin Jadhav <sachinj@nvidia.com>
Tested-by: Sachin Jadhav <sachinj@nvidia.com>
Add new separate unit common/perf/cyclestats_snapshot.c and add
corresponding header file include/nvgpu/cyclestats_snapshot.h
This unit is h/w independent and simply calls gops.perf.* HALs
exposed by perf unit to do the h/w configurations
Also remove gv11b/css_gr_gv11b.* files as h/w specific sequence
implemented in them is already moved to perf unit
Rename all cyclestats_snapshot HALs in the form nvgpu_css_*()
Jira NVGPU-1103
Change-Id: I303f6becb313ac918e06c495a5fe299947a1f0b1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1916652
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add separate unit for perfbuf in common/perf/perfbuf.c which does not need to
include any h/w file. This unit will utilize HALs exported by
perf_*.c units for h/w accesses.
Add corresponding header file at include/nvgpu/perfbuf.h
Add new HAL gops.perfbuf with below operations :
gops.perfbuf.perfbuf_enable()
gops.perfbuf.perfbuf_disable()
Remove below debug session specific HALs
gops.dbg_session_ops.perfbuffer_enable()
gops.dbg_session_ops.perfbuffer_disable()
Delete file gv11b/dbg_gpu_gv11b.c since it is no longer needed now as it was
only including perfbuf sequence
Also remove perfbuf sequences from gk20a/dbg_gpu_gk20a.c
Jira NVGPU-1102
Change-Id: I57b87c9f0dcd85784f8002bc92728b6d78a68d98
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1819303
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gops.fb.dump_vpr_wpr_info() accesses both VPR and WPR registers.
Split this into two different HALs gops.fb.dump_vpr_info() and
gops.fb.dump_wpr_info()
Also unset HALs accessing VPR registers on dGPUs
We don't support VPR on dGPUs
Remove fb_mmu_vpr_info_r() register and all its accessors from
dGPU headers
Bug 2173122
Change-Id: I5b2712f8c5389e422a84c375a7e836add48bfd1c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1850947
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add support for the measure_freq clock op for igpu:
- add nvgpu_clk_measure_freq(), which in turn calls
the get_rate() clock op.
- Initialize the measure_freq clock op to nvgpu_clk_measure_freq()
for native linux and vgpu.
JIRA ESRM-398
Change-Id: I8a3b2ee79e29e3491a16f55281494f05cd841b07
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1850585
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
1. Implement the following vgpu functions to support clk-arb:
- vgpu_clk_get_range() to return min and max freqs from
supported frequencies
- implement vgpu_clk_get_round_rate() which sets rounded
rate to input rate. Rounding is handled in RM Server
- modify vgpu_clk_get_freqs() to retrieve freq table in IVM
memory instead of copying the value in array as part of cmd
message.
2. Add support for clk-arb related HALs for vgpu.
3. support_clk_freq_controller is assigned true for vgpu
provided guest VM has the privilege to set clock frequency.
Bug 200422845
Bug 2363882
Jira EVLR-3254
Change-Id: I91fc392db381c5db1d52b19d45ec0481fdc27554
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1812379
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- This patch enables HWPM Mode-E context switch for gv11b.
- Write new pm mode to context buffer header. Ucode use
this mode to enable mode-e context switch. This is Mode-B
context switch of PMs with Mode-E streamout on one context.
If this mode is set, Ucode makes sure that Mode-E pipe
(perfmons, routers, pma) is idle before it context switches PMs.
- This allows us to collect counters in a secure way
(i.e. on context basis) with stream out.
- For Mode-E ctxsw it is required that engine_sel
is set to 0xFFFFFFFF.
- Default 0 is a valid signal and causes problems.
Bug 2106999
Change-Id: Idc6380116a71ffd7ae348ceec68cb395b2eca5f6
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1818070
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Created struct nvgpu_acr to hold acr module related member
within single struct which are currently spread across multiple structs
like nvgpu_pmu, pmu_ops & gk20a.
-Created struct hs_flcn_bl struct to hold ACR HS bootloader specific members
-Created struct hs_acr to hold ACR ucode specific members like bootloader data
using struct hs_flcn_bl, acr type & falcon info on which ACR ucode need to run.
-Created acr ops under struct nvgpu_acr to perform ACR specific operation,
currently ACR ops were part PMU which caused to have always dependence
on PMU even though ACR was not executing on PMU.
-Added acr_remove_support ops which will be called as part of
gk20a_remove_support() method, earlier acr cleanup was part of
pmu remove_support method.
-Created define for ACR types,
-Ops acr_sw_init() function helps to set ACR properties
statically for chip currently in execution & assign ops to point to
needed functions as per chip.
-Ops acr_sw_init execute at early as nvgpu_init_mm_support calls acr
function to alloc blob space.
-Created ops to fill bootloader descriptor & to patch WPR info to ACR uocde
based on interfaces used to bootstrap ACR ucode.
-Created function gm20b_bootstrap_hs_acr() function which is now common
HAL for all chips to bootstrap ACR, earlier had 3 different function for
gm20b/gp10b, gv11b & for all dgpu based on interface needed.
-Removed duplicate code for falcon engine wherever common falcon code can be used.
-Removed ACR code dependent on PMU & made changes to use from nvgpu_acr.
JIRA NVGPU-1148
Change-Id: I39951d2fc9a0bb7ee6057e0fa06da78045d47590
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1813231
GVS: Gerrit_Virtual_Submit
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Removed ACR support code from PMU module
- Deleted ACR related ops from pmu ops
- Deleted assigning of ACR related ops
using pmu ops during HAL init
-Removed code related to ACR bootstrap &
dependent code for all chips.
JIRA NVGPU-1147
Change-Id: I47a851a6b67a9aacde863685537c34566f97dc8d
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1817990
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move OS agnostic parts of vgpu clk code out of os/linux specific
path. This includes implementation sending rpc commands to
RM Server. Move Linux specific vgpu clk code to platform vgpu files
keeping it consistent with native implementation.
Bug 2363882
Jira EVLR-3254
Change-Id: I0aae014ef16415bb356c81e9bfd76bc65206d9fd
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1820674
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Move implementation of MC HAL to common/mc. Also bump gk20a
implementation to gm20b.
gk20a_mc_boot_0 was used via a HAL, but we have only one possible
implementation. It also has to be anyway called directly to detect
which HALs to assign, so make it a true common function.
mc_gk20a_handle_intr_nonstall was also used only in os/linux/intr.c
so move it there.
JIRA NVGPU-954
Change-Id: I79aedc9158f90d578db0edc17b714617b52690ac
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1813519
Reviewed-by: svc-misra-checker <svc-misra-checker@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
NVGPU_DBG_GPU_IOCTL_REG_OPS currently doesn't return if the ctx was
resident in engine or not.
Regops are broken down into batches of 128 and each batch is executed
together. Since there only 32 bits were available in IOCTL args, returning
is ctx was resident isn't possible for all batches.
Hence return if the ctx was resident for the first batch.
Bug 200445575
Change-Id: Iff950be25893de0afadd523d4ea04842a8ddf2af
Signed-off-by: Anup Mahindre <amahindre@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1812975
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below regops HALs are not being called from anywhere, so remove them
gops.regops.get_runcontrol_whitelist_ranges()
gops.regops.get_runcontrol_whitelist_ranges_count()
gops.regops.get_qctl_whitelist_ranges()
gops.regops.get_qctl_whitelist_ranges_count()
HAL gops.regops.apply_smpc_war() is unimplemented for all the chips, and it
was originally only needed for gk20a which is not unsupported
So remove this HAL and its call too
Jira NVGPU-620
Change-Id: Ia2c74883cd647a2e94ee740ffd040a40c442b939
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1813106
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>