Legacy profiler does not reserve PMA stream resource with PM
reservation system. Also, HWPM system reset is separately implemented
in membuf disable path. And it does not even restore perf unit SLCG
prod values.
Allcoate a dummy profiler object for debug session in perfbuf map
path. Free it in perfbuf unmap path.
This has advantage of synchronizing PMA stream reservation with new
profiler stack. And this also leverages HWPM system reset and SLCG
handling code during resource reservation.
Remove explicit HWPM reset from gops.perf.membuf_reset_streaming()
HALs
Bug 2510974
Jira NVGPU-5360
Change-Id: I54c5202b6251dea3d80a4dfc011e8a296339e07f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399595
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When a pbdma fault needs a channel teardown, do the recovery/teardown
process before acking the pbdma interrupt status back. Acking it causes
the hardware to proceed which could release fences too early before the
involved channel(s) have been found to be broken.
With these host copyengine interrupts, the teardown sequence is light
and proceeds even with the pbdma intr flag still set; there are no
engines to reset when these pbdma launch check interrupts happen. The
bad tsg is just disabled and the channels in it aborted.
A few unit tests are so heavily affected by this refactor that they
would need to be rewritten. They're not strictly needed at the moment,
so do only half of the rewrite: just delete them.
Bug 200611198
Change-Id: Id126fb158b6d05e46ba124cd426389046eedc053
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392669
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.
Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:
- The enum_type that engine_info uses is defined in engines.h and
has a bit of SW abstraction - in particular the GRCE type. The only
place this seemed to be actually relevant (the IOCTL providing device
info to userspace) the GRCE engines can be worked out by comparing
runlist ID.
- Addition of masks based on intr_id and reset_id; those can be
computed easily enough using BIT32() but this is an area that
could be improved on.
This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.
JIRA NVGPU-5421
Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Implement new API nvgpu_prof_ioctl_exec_reg_ops() to support regops on
new profiler objects.
Add two new staging buffers to hold regops copied from userspace, and
to convert and execute regops in common code.
Buffers are allocated and released along with the profiler object.
New API will implements this :
- copy regops data in chunks of 4K from userspace
- store them in staging buffer
- convert the new regop struct into common regop struct and also
copy the content into second staging buffer
- trigger gops.regops.exec_regops() with second staging buffer as
operation pointer
- convert common regop struct back into new regop struct and copy
back to userspace
Export bunch of helper functions from ioctl_dbg.h. e.g.
nvgpu_get_regops_op_values_common()
Update regop execution code to skip regop execution if regop status
is not valid. This is only possible when userspace requests for
CONTINUE_ON_ERROR mode.
Add more documentation to some of the fields in UAPI header.
Note that maximum atomic operations reported by new API are same
as legacy API and are incorrect. This will be fixed up in upcoming
patches.
Bug 2510974
Jira NVGPU-5360
Change-Id: I9f82052b22143aec33f6e778c0784386744b699e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394208
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rework regops execution API to accomodate below updates for new
profiler design
- gops.regops.exec_regops() should accept TSG pointer instead of
channel pointer.
- Remove individual boolean parameters and add one flag field.
Below new flags are added to this API :
NVGPU_REG_OP_FLAG_MODE_ALL_OR_NONE
NVGPU_REG_OP_FLAG_MODE_CONTINUE_ON_ERROR
NVGPU_REG_OP_FLAG_ALL_PASSED
NVGPU_REG_OP_FLAG_DIRECT_OPS
Update other APIs, e.g. gr_gk20a_exec_ctx_ops() and validate_reg_ops()
as per new API changes.
Add new API gk20a_is_tsg_ctx_resident() to check context residency
from TSG pointer.
Convert gr_gk20a_ctx_patch_smpc() to a HAL gops.gr.ctx_patch_smpc().
Set this HAL only for gm20b since it is not required for later chips.
Also, remove subcontext code from this function since gm20b does not
support subcontext.
Remove stale comment about missing vGPU support in exec_regops_gk20a()
Bug 2510974
Jira NVGPU-5360
Change-Id: I3c25c34277b5ca88484da1e20d459118f15da102
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389733
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In nvgpu_dbg_gpu_ioctl_smpc_ctxsw_mode(), check if SMPC global mode
capability is supported instead of checking for the function pointer.
Enable the capability only for Turing since pre-Turing GPUs don't
support it.
Bug 2510974
Jira NVGPU-5360
Change-Id: I352fb2a91b836cd8ef727966a53a28255d8ea834
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389653
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
- NvRmGpuDeviceSetSmDebugMode uses regops interface.
- NvRmGpuDeviceTriggerSuspend, NvRmGpuDeviceWaitForPause,
and NvRmGpuDeviceResumeFromPause should return error on Pascal+. Use
regops interface to suspend/resume.
- On non-cilp devices(Maxwell), NvRmGpuDeviceTriggerSuspend,
NvRmGpuDeviceWaitForPause, NvRmGpuDeviceResumeFromPause and
NvRmGpuDeviceSetSmDebugMode are used when debugger(including coredump,
memcheck) is attached or when CUDA application uses a syscall that
requires traphandler(assert, cnp).
Bug 2558022
Bug 2559631
Bug 2706068
JIRA NVGPU-5502
Change-Id: I9eb2ab0c8c75c50f53523d8bf39c75f98b34f3f0
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2376159
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Create new dev nodes for device and context profilers. Example of dev
nodes on iGPU
/dev/nvhost-prof-dev-gpu - device scope profiler
/dev/nvhost-prof-ctx-gpu - context scope profiler
Add below APIs to open/close above dev nodes :
nvgpu_prof_dev_fops_open()
nvgpu_prof_ctx_fops_open()
nvgpu_prof_fops_release()
Add common API nvgpu_prof_fops_ioctl() to handle IOCTL call on these
dev nodes. Add IOCTL NVGPU_PROFILER_IOCTL_BIND_CONTEXT to bind the TSG
to profiler objects.
Add nvgpu_tsg_get_from_file() to retrieve TSG struct pointer from
file descriptor. Also store profiler object pointer into TSG struct.
Enable NVGPU_SUPPORT_PROFILER_V2_DEVICE capability on gv11b and tu104.
Note that this is not yet enabled for vGPU.
Keep NVGPU_SUPPORT_PROFILER_V2_CONTEXT capabiity disabled since this
will take longer to support.
Add new IOCTL NVGPU_PROFILER_IOCTL_UNBIND_CONTEXT so that userspace can
explicitly unbind the context and release the resources before closing
the profiler descriptor.
Add context_init flag to profiler object for book keeping.
Bug 2510974
Jira NVGPU-5360
Change-Id: Ie07e0cfd5a9da9d80008f79c955c7ef93b4bc60f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2384354
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
If recovery sequence profiling is enabled skip the debug dump that
happens during an MMU fault. This prevents the debug dump from
dominating the time spent by the recovery sequence. The debug dump
is severly limited in speed by the (lack of) UART bandwidth.
JIRA NVGPU-5606
Change-Id: Ifc7c326d33d9115d58b13c0fa42ec4bb7acb3075
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382591
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
During recovery, we set ch->unserviceable at the end after we preempt
the TSG and reset the engines. It might be too late and user-space
might submit more work to the broken channel which is not desirable.
Move setting this unserviceable flag right at the start
of recovery sequence.
Another thread doing a submit can still read the unserviceable flag
just before it is set here, leaving that submit stuck if recovery
completes before the submit thread advances enough to set up a post
fence visible for other threads. This could be fixed with a big lock
or with a double check at the end of the submit code after the job
data has been made visible.
We still release the fences, semaphore and error notifier wait queues
at the end; so user-space would not trigger channel unbind while
channel is being recovered.
Also, change the handle_mmu_fault APIs to return void as the
debug_dump return value is not used in any of the caller APIs.
JIRA NVGPU-5843
Change-Id: Ib42c2816dd1dca542e4f630805411cab75fad90e
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385256
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
During recovery, we preempt the faulty TSG from PBDMA and engines.
If the TSG preempt on PBDMA times out(timeout = 100ms), the PBDMA
might be hung state. We do not reset the HOST during recovery, so
stuck PBDMAs are unrecoverable.
Abort the recovery and trigger GPU to quiesce as there is no way
back.
Triggering Quiesce from recovery sequence should be fine as the only
redundant operation will be write to FIFO_RUNLIST_PREEMPT register.
The error notifiers will eventually be set by Quiesce thread.
Bug 2768005
JIRA NVGPU-4631
Change-Id: I914b9379aa8e48014e6ddace9abe47180a072863
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368187
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, NVGPU_SUPPORT_FECS_CTXSW_TRACE enabled flag is set to true
when fecs_trace s/w setup is executed successfully. Sometimes,
fecs_trace is required to be disabled for debugging. This change will
help disable/enable fecs_trace feature by modifying one of the enabled
flags.
Enable NVGPU_SUPPORT_FECS_CTXSW_TRACE during chip specific hal init.
Control fec_trace init and ctxsw dev open depending on
NVGPU_SUPPORT_FECS_CTXSW_TRACE flag status.
JIRA NVGPU-5616
Change-Id: Id0754a5af7cd95a67a1f0ae5de36115d44e1111b
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2357501
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
gk20a_perfbuf_map() allocates perfbuf VM, maps the user buffer into new
VM, and then triggers gops.perfbuf.perfbuf_enable(). This HAL then does
following :
- Allocate perfbuf instance block
- Initialize perfbuf instance block
- Reset stream buffer
- Program instance block address in PMA registers
- Program user buffer address into PMA registers
New profiler interface will have it's own API to setup PMA strem, and
it requires above setup to be done in two phases of perfbuf
initialization and then user buffer setup.
Split above functionalities into below functions
- nvgpu_perfbuf_init_vm()
- Allocate perfbuf VM
- Call gops.perfbuf.init_inst_block() to initialize perfbuf instance
block
- gops.perfbuf.init_inst_block()
- Allocate perfbuf instance block
- Initialize perfbuf instance block
- Program instance block address in PMA registers using
gops.perf.init_inst_block()
- In case of vGPU, trigger TEGRA_VGPU_CMD_PERFBUF_INST_BLOCK_MGT
command to gpu server
- gops.perf.init_inst_block()
- Reset stream buffer
- Program user buffer address into PMA registers
Also add corresponding cleanup functions as below :
gops.perf.deinit_inst_block()
gops.perfbuf.deinit_inst_block()
nvgpu_perfbuf_deinit_vm()
Bug 2510974
Jira NVGPU-5360
Change-Id: I486370f21012cbb7fea84fe46fb16db95bc16790
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372984
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The driver configures the sm hww global, warp ESR report masks during poweron
as part of gops_gr.gr_init_support. However, during golden context init, these
are overwritten with default entries from sw_ctx_load list; this leaves the
report masks in a state inconsistent with the driver expectation.
The driver should configure the sm hww warp, global ESR report masks during
golden context init and not before it; Hence, move set_hww_esr_report_mask from
power-on path to golden context init.
In addition, update set_hww_esr_report_mask to do RMW, so as to retain the
values loaded from sw_ctx_load list.
Update global ESR report mask to enable all exceptions.
Bug 3029888
Bug 2997718
Change-Id: Id7ad4cff5409982143f49695c95c5e1d1c9fdec9
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2367466
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
While booting LS falcons, gr.falcon.bind_instblk gops is
used to bind WPR VA to gr falcon. Only FECS_METHOD must be
used to bind instblks. But at this point FECS falcon is not loaded
and running. Hence FECS_METHOD cannot be used to bind this instblk.
Besides that, this code is not required
for successful falcon boot and functioning of chips other
than gm20b.
JIRA NVGPU-5323
Change-Id: I148ccc77d65d5f01adbba6261369e7a292dccfc3
Signed-off-by: smadhavan <smadhavan@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369736
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently the vGPU engine management rewrites a lot of the common
device agnostic engine management code.
With the new top HAL parsing one device at a time, it is now more
easily possible to tie the vGPU into the new common device framework
by implementing the top HAL but with the vGPU engine list backend.
This lets the vGPU inherit all the common engine and device
management code. By doing so the vGPU HAL need only implement a
trivial and simple HAL.
This also gets us a step closer to merging all of the CE init
code: logically it just iterates through all CE engines whatever
they may be. The only reason this differs between chips is because
of the swap from CE0-2 to LCEs in the Pascal generation. This could
be abstracted by the unit code easily enough.
Also, the pbdma_id for each engine has to be added to the device
struct. Eventually this was going to happen anyway, since the
device struct will soon replace the nvgpu_engine_info struct.
It's a little bit of an abuse but might be worth it long term. If
not, it should not be difficult to replace uses of dev->pbdma_id
with a proper lookup of PBDMA ID based on the device info.
JIRA NVGPU-5421
Change-Id: Ie8dcd3b0150184d58ca0f78940c2e7ca72994e64
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2351877
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add tu104 specific HAL tu104_gr_falcon_ctrl_ctxsw() that processes below
CTXSW methods to start/stop SMPC global mode :
NVGPU_GR_FALCON_METHOD_START_SMPC_GLOBAL_MODE
NVGPU_GR_FALCON_METHOD_STOP_SMPC_GLOBAL_MODE
Add new tu104 specific HAL tu104_gr_update_smpc_global_mode() to trigger
SMPC global mode start/stop using gops.gr.falcon.ctrl_ctxsw().
Update nvgpu_dbg_gpu_ioctl_smpc_ctxsw_mode() to enable/disable SMPC
global mode if channel is not bound to debug session.
Bug 2510974
Bug 2257799
Jira NVGPU-5360
Change-Id: I1f9d8f2a2d30a4738f291db3fc72c400d24f4048
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2368696
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Current PM resource reservation system is limited to HWPM resources
only. And reservation tracking is done using boolean variables.
New upcoming profiler support requires reservation for all the PM
resources like SMPC and PMA stream. Using boolean variables is
not scalable and confusing. Plus the variables have to be replicated
on gpu server in case of virtualization.
Remove flag tracking mechanism and use list based approach to track
all PM reservations. Also, current HALs are defined on debugger object.
Implement new HALs in new pm_reservation object since it is really an
independent functionality.
Add new source file common/profiler/pm_reservation.c which implements
functions to reserve/release resources and to check if any resource
is reserved or not.
Add common/vgpu/pm_reservation_vgpu.c for vGPU which simply forwards
the request to gpu server.
Define new HAL object gops.pm_reservation and assign above functions
to below respective HALs :
g->ops.pm_reservation.acquire()
g->ops.pm_reservation.release()
g->ops.pm_reservation.release_all_per_vmid()
Last HAL above is only used for gpu server cleanup of guest OS.
Add below new common profiler functions that act as APIs to reserve/
release resources for rest of the units in nvgpu.
nvgpu_profiler_pm_resource_reserve()
nvgpu_profiler_pm_resource_release()
Initialize the meta data required for reservtion system in
nvgpu_pm_reservation_init() and call it during nvgpu_finalize_poweron.
Clean up the meta data before releasing struct gk20a.
Delete below HALs :
g->ops.debugger.check_and_set_global_reservation()
g->ops.debugger.check_and_set_context_reservation()
g->ops.debugger.release_profiler_reservation()
Bug 2510974
Jira NVGPU-5360
Change-Id: I4d9f89c58c791b3b2e63099a8a603462e5319222
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2367224
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove an ancient piece of code that morphed from a hard coded fault ID
lookup in the original gk20a driver. In the old days there was no top
parsing code, so converting between engine ID and fault ID was done
by a simle function.
This code derived from that function for some reason. However, given
that the HW top table is not actually broken, this code never executes.
The only way this code would execute is if the HW top table reported
that the fault ID for a GRCE engine is 0. But this never happens.
JIRA NVGPU-5420
Change-Id: If8483faa9878f752c29ef6eadc1a56ce1de81942
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2362865
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Do the internal job synchronization semaphore and syncpt shim
comparisons with ACQ_CIRC_GEQ instead of ACQ_STRICT_GEQ. The semas and
syncpts naturally wrap, and this matches the ACQ_GEQ of pre-gv11b chips
that the driver assumes.
With the strict comparison that does not use wraparound, waits for
semas/syncpts that have just wrapped back to zero would hang because a
small value is not strictly greater than equal to a value near U32_MAX.
Change-Id: I427bb205a960d5ba3426f228364d96c30278dcaf
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366823
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For gv11b, priv ring interrupts were not triggered for PLM L2
registers. So, read back was implemented to confirm data written
to L2 registers.
Verified that LTC registers are not PLM protected, and read back
implemented after write is not necessary.
Replace nvgpu_writel_check with nvgpu_writel.
Bug 200625897
Bug 2989973
Jira NVGPU-5490
Change-Id: Ie0815899589c506e13ab2f9342657cb4d68b8fc8
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2363384
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Below APIs to update hwpm/smpc ctxsw mode take a channel pointer as a
parameter. APIs then extract corresponding TSG from channel and perform
various operations on context stored in TSG.
g->ops.gr.update_smpc_ctxsw_mode()
g->ops.gr.update_hwpm_ctxsw_mode()
Update both above APIs to accept TSG pointer instead of a channel.
This is a refactor work to support new profiler design where a profiler
object is bound to TSG and keeps track of TSG only.
Bug 2510974
Jira NVGPU-5360
Change-Id: Ia4cefda503d8420f2bd32d07c57534924f0f557a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366122
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
quad type reg_ops were only needed on Kepler, and not for any other chip
beginning Maxweel.
HAL g->ops.gr.access_smpc_reg() was incorrectly set for Volta and Turing
whereas it was only applicable to Kepler. Delete it.
There is no register in the quad type whitelist since the type itself is
not supported anymore. Remove the empty whitelists for all chips and
also delete below HALs:
g->ops.regops.get_qctl_whitelist()
g->ops.regops.get_qctl_whitelist_count()
hal/regops/regops_gv100.* files are not used anymore. Delete the files
instead of just deleting quad HALs in these files.
Bug 200628391
Change-Id: I4dcc04bef5c24eb4d63d913f492a8c00543163a2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2366035
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The FIFO pbdma map is an array of bit maps that link PBDMAs to runlists.
This array allows other software to query what PBDMA(s) serves a given
runlist. The PBDMA map is read verbatim from an array of host registers.
These registers are stored in a kmalloc()'ed array.
This causes a problem for the device management code. The device
management initialization executes well before the rest of the FIFO
PBDMA initialization occurs. Thus, if the device management code
queries the PBDMA mapping for a given device/runlist, the mapping has
yet to be populated.
In the next patches in this series the engine management code is subsumed
into the device management code. In other words the device struct is
reused by the engine management and all host SW does is pull pointers to
the host managed devices from the device manager. This means that all
engine initialization that used to be done on top of the device
management needs to move to the device code.
So, long story short, the PBDMA map needs to be read from the registers
directly, instead of an array that gets allocated long after the device
code has run.
This patch removes the pbdma map array, deletes two HALs that managed
that, and instead provides a new HAL to query this map directly from
the registers so that the device code can use it.
JIRA NVGPU-5421
Change-Id: I5966d440903faee640e3b41494d2caf4cd177b6d
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361134
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
This patch removes the reporting of _ECC_CORRECTED errors which are
not applicable to GV11B. Specifically, this patch removes the code
related to the reporting of the following service IDs:
NVGUARD_SERVICE_IGPU_SM_SWERR_LRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_CBU_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_PMU_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GPCCS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_FECS_SWERR_FALCON_DMEM_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_GCC_SWERR_L15_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_FA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_MMU_SWERR_L1TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_L2TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_TLB_SA_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PTE_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_HUBMMU_SWERR_PDE0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L0_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_DATA_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_ICACHE_L1_PREDECODE_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_MISS_FIFO_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_SM_SWERR_L1_TAG_S2R_PIXPRF_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_TSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_RSTG_ECC_CORRECTED
NVGUARD_SERVICE_IGPU_LTC_SWERR_CACHE_DSTG_BE_ECC_CORRECTED
Bug 200616002
Change-Id: I199c396f9f6a6be007bd6d3c556199b5a73c3c91
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2349587
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>