Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.
With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.
This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.
This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.
Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.
JIRA NVGPU-5421
Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists.
This array allows other software to query what PBDMA(s) serves a given
runlist. The PBDMA map is read verbatim from an array of host registers.
These registers are stored in a kmalloc()'ed array.
This causes a problem for the device management code. The device
management initialization executes well before the rest of the FIFO
PBDMA initialization occurs. Thus, if the device management code
queries the PBDMA mapping for a given device/runlist, the mapping has
yet to be populated.
In the next patches in this series the engine management code is subsumed
into the device management code. In other words the device struct is
reused by the engine management and all host SW does is pull pointers to
the host managed devices from the device manager. This means that all
engine initialization that used to be done on top of the device
management needs to move to the device code.
So, long story short, the PBDMA map needs to be read from the registers
directly, instead of an array that gets allocated long after the device
code has run.
This patch removes the pbdma map array, deletes two HALs that managed
that, and instead provides a new HAL to query this map directly from
the registers so that the device code can use it.
JIRA NVGPU-5421
Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
This adds a new device management unit in the common code responsible
for facilitating the parsing of the GPU top device list and providing
that info to other units in nvgpu.
The basic idea is to read this list once from HW and store it in a
set of lists corresponding to each device type (graphics, LCE, etc).
Many of the HALs in top can be deleted and instead implemented using
common code parsing the SW representation.
Every time the driver queries the device list it does so using a
device type and instance ID. This is common code. The HAL is responsible
for populating the device list in such a way that the driver can
query it in a chip agnostic manner.
Also delete some of the unit tests for functions that no longer
exist. This code will require new unit tests in time; those should be
quite simple to write once unit testing is needed.
JIRA NVGPU-5421
Change-Id: Ie41cd255404b90ae0376098a2d6e9f9abdd3f5ea
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319649
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_engine_is_valid_runlist_id already iterates the list of
active engines, therefore the engine_id is already known to
be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.
Also make sure to reset num_engines in nvgpu_cleanup_sw, to avoid
accessing uninitialized active_engines_list in unit test corner
cases (targetting init/remove support).
Jira NVGPU-4511
Change-Id: Ia6b904a7f3ca46e5097f06770b4caad317ec967b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2263618
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_engine_get_gr_runlist_id gets the first instance of
active GR engine using nvgpu_engine_get_ids. Therefore the
engine_id is already known to be valid.
Remove call to nvgpu_engine_get_active_eng_info (which iterates
all engines), and fetch f->engine_info[engine_id] instead.
Also remove non-NULL test for engine_info, which could not
be true.
Jira NVGPU-4511
Change-Id: Ifcc0851e3d14d862e2ed7b21ea57f17a66eca9dd
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2263617
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Fix bug where the CE mask includes other engine types besides just CEs
in nvgpu_ce_engine_interrupt_mask().
The intent of this API is to return mask of CE interrupts. However, the
if clause in the for loop is only excluding engine interrupts if the CE
stall or non-stall ISR is NULL. So, it does not distinquish between CE
or GR engine interrupts if the CE ISR is non-null.
Since the expectation is to not return CE interrupts if the ISRs are
NULL, just return a 0 mask if either ISR is NULL without having to
bother with the loop.
If the ISRs are set in the CE HAL, within the loop, only add interrupts
to the mask returned if the engine type is actually a CE engine (i.e. do
not include GR engine interrupts).
JIRA NVGPU-2224
Change-Id: Ic0048b00f16590fec50bb0858bd3f4498a00650d
Signed-off-by: Philip Elcan <pelcan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2256269
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Previously, unit interrupt enabling/disabling and corresponding MC level
interrupt enabling/disabling was not done at the same time.
With this change, stall and nonstall interrupt for units are programmed
at MC level along with individual unit interrupts. Kept access to MC
interrupt registers through mc.intr_lock spinlock.
For doing this separated CE and GR interrupt mask functions.
mc.intr_enable is only used when there is global interrupt
control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable
is now removed. Removed following functions - mc_gv100_intr_enable,
mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config
as we can use the generic unit interrupt control function.
JIRA NVGPU-4336
Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2196178
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a.h will include gops_mc.h to contain the mc ops definitions. Add
doxygen comments for the HAL functions that are called directly.
Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file.
JIRA NVGPU-2524
Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2226017
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add macros for whitelisting coverity violations. These macros use pragma
directives. The pragma directives and whitelisting macros are only
enabled when a coverity scan is being run.
The whitelisting macros have been added to a new header called
static_analysis.h. The contents of safe_ops.h (CERT C safe ops) have
been moved into static_analysis.h because this will be the new header
for static analysis related macros/defines/etc.
JIRA NVGPU-3820
Change-Id: I9c63f20f670880b420415535738034619314b7c3
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2180600
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Changes added to support "rmmod nvgpu" in dgpu simulation after gpu
poweron.
nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec
engines in simulation as simulation doesnt support timeout.
These engines are not valid engines in nvgpu engine list.
Add nvgpu_engine_check_valid_id before checking engine status.
Simulation crash on accessing 0xb81604 top interrupt register.
Add func_priv_cpu_intr_top__size_1_v() function to get the supported
size than using default MAX_INTR_TOP_REGS.
nvlink is not supprted in dgpu simulation. Avoid warning for
-ENODEV return.
Avoid register read following gpu power off completion.
Bug 2498574
Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711
Signed-off-by: vinodg <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2179930
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove below hals since the corresponding functions are same on all
platforms and they are h/w independent
g->ops.gr.enable_ctxsw()
g->ops.gr.disable_ctxsw()
g->ops.gr.reset()
Call the functions directly at all places
Remove CONFIG_NVGPU_DEBUGGER from places where these functions are
called since they are not debugger dependent
This also helps to disable CONFIG_NVGPU_DEBUGGER and to keep recovery
sequence intact
Jira NVGPU-3506
Change-Id: Id2b208ca23dc4667e78edcd8ad242a8558e0ff64
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2137255
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-compile out nvgpu_pmu members which are not required for
safety buid & modified source as required to support same.
-compile out PMU headers include which are not required for
safety code
-Removed unnecessary PMU header includes from some files
JIRA NVGPU-3418
Change-Id: I5364b1b16c46637d229e82745dd2846cb6335a72
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2128228
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Hal API g->ops.gr.halt_pipe() is defined in unsafe unit hal.gr.gr
It is called from safe unit, and it calls into API
g->ops.gr.falcon.ctrl_ctxsw() which is also safe
Hence get rid of unsafe API g->ops.gr.halt_pipe().
Caller now directly calls hal.gr.falcon API to halt pipe
Jira NVGPU-3506
Change-Id: I5439cb79431795fc7c22384832cf632d6db03316
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127755
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
renamed NVGPU_LS_PMU to NVGPU_FEATURE_LS_PMU to follow
nvgpu naming standard
Compile out LS PMU files when PMU RTOS support is
disabled for safety build by setting NVGPU_LS_PMU
build flag to 0
JIRA NVGPU-3418
Change-Id: Ib09924ac25657e932723c10be573f2f701cb7bea
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2127794
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Basic units like fifo, rc are having dependency on
gr_falcon. Avoided outside gr units dependency on gr_falcon
by moving following functions to gr:
int nvgpu_gr_falcon_disable_ctxsw(struct gk20a *g,
struct nvgpu_gr_falcon *falcon); ->
int nvgpu_gr_disable_ctxsw(struct gk20a *g);
int nvgpu_gr_falcon_enable_ctxsw(struct gk20a *g,
struct nvgpu_gr_falcon *falcon); ->
int nvgpu_gr_enable_ctxsw(struct gk20a *g);
int nvgpu_gr_falcon_halt_pipe(struct gk20a *g); ->
int nvgpu_gr_halt_pipe(struct gk20a *g);
HALs also moved accordingly and updated code to reflect this.
Also moved following data back to gr from gr_falcon:
struct nvgpu_mutex ctxsw_disable_mutex;
int ctxsw_disable_count;
JIRA NVGPU-3168
Change-Id: I2bdd4a646b6f87df4c835638fc83c061acf4051e
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/2100009
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>