Add ptimer register offsets to regops allowlist for testing. New
allowlist restricts regops only to reserved resources, this makes it
difficult to test the interface since only HWPM registers can be
accessed and that could have side effects on system.
Having ptimer registers as test offsets has advantage that the offsets
do not change across chips, registers are read-only, and values are
always incrementing so a test can verify read regops and test various
flags of interface.
Add gops.ptimer.get_timer_reg_offsets() HAL to return timer offsets.
Add static function add_test_range_to_map() that adds timer offsets to
allowlist always.
In nvgpu_profiler_validate_regops_allowlist() return success if timer
offsets are hit in range search.
Bug 2510974
Jira NVGPU-5360
Change-Id: I8b51bb92e43e8b1bbe903c874a429341659ef603
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460002
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Add gv11b and tu104 HALs to get allowed HWPM resource register ranges,
offsets, and stride meta data.
Add new enum nvgpu_pm_resource_hwpm_register_type for HWPM register
type. Add new struct nvgpu_pm_resource_register_range_map to store all
the register ranges for HWPM resources. Add pointer of map in struct
nvgpu_profiler_object along with map entry count.
Add new API nvgpu_profiler_build_regops_allowlist() to build the regops
allowlist dynamically while binding the resources. Map entry count is
received with get_pm_resource_register_range_map_entry_count() and only
those resource ranges are added for which resource is reserved by
profiler object.
Add nvgpu_profiler_destroy_regops_allowlist() to destroy the allowlist
while unbinding the resources.
Add static functions allowlist_range_search() to search a register
offset in HWPM resource ranges. Add another static function
allowlist_offset_search() to search the offset in per-resource offset
list.
Add nvgpu_profiler_validate_regops_allowlist() that accepts an offset
value, checks if it is in allowed ranges using allowlist_range_search()
and then checks if offset is in allowlist using allowlist_offset_search().
Update gops.regops.exec_regops() to receive profiler object pointer as
a parameter.
Invoke nvgpu_profiler_validate_regops_allowlist() from
validate_reg_ops() if prof pointer is not-null. This will be true only
for new profiler stack and not legacy profilers.
In gr_exec_ctx_ops(), skip regops execution if offset is invalid.
Bug 2510974
Jira NVGPU-5360
Change-Id: I40acb91cc37508629c83106ea15b062250bba473
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2460001
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Implement below functions:
- nvgpu_profiler_quiesce_hwpm_streamout_resident
Teardown sequence when context is resident or in case profiling
session is a device level session.
- nvgpu_profiler_quiesce_hwpm_streamout_non_resident
Teardown sequence when context is non resident
- nvgpu_profiler_quiesce_hwpm_streamout
Generic sequence to call either of above API based on whether
context is resident or not.
Trigger HWPM streamout teardown sequence while unbinding resources
in nvgpu_profiler_unbind_hwpm_streamout()
Add a new HAL gops.gr.is_tsg_ctx_resident to call
gk20a_is_tsg_ctx_resident() from common code.
Implement below supporting HALs for resident teardown sequence:
- gops.perf.pma_stream_enable()
- gops.perf.disable_all_perfmons()
- gops.perf.wait_for_idle_pmm_routers()
- gops.perf.wait_for_idle_pma()
- gops.gr.disable_cau()
- gops.gr.disable_smpc()
Jira NVGPU-5360
Change-Id: I304ea25d296fae0146937b15228ea21edc091e16
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2461333
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Update the below two regops ctxsw address types to fix misnomers:
- CTXSW_ADDR_TYPE_ROP:
This address type is used to access the PMM config registers and does not
belong to the ROP unit. Hence, rename it to CTXSW_ADDR_TYPE_PMM_FBPGS_ROP.
- CTXSW_ADDR_TYPE_BE:
This address type is used to access registers exclusively in ROP unit and not
the entire BE unit. Hence, its more appropriate to rename it to
CTXSW_ADDR_TYPE_ROP.
In addition, rename the following functions:
- pri_is_be_addr_shared => pri_is_rop_addr_shared
- pri_be_shared_addr => pri_rop_shared_addr
- pri_is_be_addr => pri_is_rop_addr
- pri_get_be_num => pri_get_rop_num
Bug 3146324
Change-Id: I8613f0972936699b2ef8f7dbe3de78582af2a35f
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2429885
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
With Generic Power Domains (genpd), bpmp driver will manage the GPU
powergating. With the nvgpu idle/unidle flows updated for VPR with
genpd/RPM, the usage of the below tegra bpmp calls can be removed
from nvgpu from railgate APIs for t186 and t194. Note that genpd
is available in k4.14 onwards, so this will work on current
downstream kernel.
tegra_bpmp_running
tegra_powergate_is_powered
tegra_powergate_partition
tegra_unpowergate_partition
Runtime suspended state indicates that the device is railgated.
Update the t186 and t194 is_railgated handlers with this. t210
railgate/unrailgate will be still managed by nvgpu as bpmp
support is not present.
Bug 200602747
JIRA NVGPU-5356
Change-Id: Iadfd794cb51bc41ca927b84fc212ac766d60094d
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2376642
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Extend the runtime suspend/resume based idle/unidle logic in the
probe case to handling done in gk20a_do_idle/unidle for nvgpu
after the probe completion.
If the railgating is disabled, setting autosuspend_delay to 0 will
enable the suspend. If railgating is enabled, autosuspend delay
will be > 0. Setting it to 0 will enable the immediate suspend.
With this approach based on RPM, forced_reset logic is removed.
force_reset_in_do_idle is also removed as railgating is
supported.
Bug 200602747
JIRA NVGPU-5356
Change-Id: Iaf6d5ab651b8200f0547b45d90f812110cf63c0e
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2375941
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
With genpd based runtime PM, the device railgating is managed by the PM
core and the nvgpu manages the clocks. To suspend/resume the device for
idling/unidling while initializing secure alloc, runtime PM is to be
enabled during probe.
nvgpu platform railgate handlers will be only managing the clocks.
During probe, the nvgpu driver poweroff/poweron are not to be
invoked as part of driver runtime suspend/resume hence probe
state is added.
After platform probe initializes the clock, explicit runtime resume of
the device is required to sanely suspend it during gk20a_do_idle.
Runtime PM configuration differs based on the NVGPU_CAN_RAILGATE
capability, hence the runtime PM is enabled ("truly") only for
the duration of nvgpu_probe and then the state is reverted at
the beginning of gk20a_pm_late_init.
Bug 200602747
JIRA NVGPU-5356
Change-Id: I1fbd03d3f49da07ccbee9714387e00ffc688864e
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2375939
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
With genpd based runtime PM, the device railgating is managed by the PM
core and the nvgpu manages the clocks. To suspend/resume the device for
idling/unidling while initializing secure alloc, runtime PM will be
enabled before init_secure_alloc.
nvgpu platform railgate handlers will be only managing the clocks. The
clocks and secure alloc initialization was done in platform probe
(applicable to tegra).
To suspend (railgate and clks disable) and resume cleanly during secure
alloc init, the platform probe should happen first that initializes the
clocks. Post that device runtime PM will handle the device idle/unidle
properly.
Hence, move gk20a_tegra_init_secure_alloc to platform late_probe.
Runtime PM changes are introduced in the later patches.
Bug 200602747
JIRA NVGPU-5356
Change-Id: I5130ff43f7b75ddc51cb7096ba6532b3f5397258
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2375938
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Starting with nvgpu-next, the ctxsw ucode computes the checksum for each
ctxsw'ed register list, this checksum is saved at the end of the same list;
This entry will be given a special placeholder address 0x00ffffff, which can
be used to distinguish it from other entries in the register list.
There is only one checksum per list, even if it has multiple subunits. Hence,
update "add_ctxsw_buffer_map_entries_subunits" to avoid adding checksum
entires for each subunit within a list.
Bug 2916121
Change-Id: Ia7abedc7467ae8158ce3e791a67765fb52889915
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2457579
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
The function "nvgpu_engine_mmu_fault_id_to_eng_id_and_veid" updates only
the veid field and leaves the engine_id as invalid. This can cause the
recovery to be skipped in certain instances of MMUFAULT; For example,
the MMUFAULT when a unbind is done on a channel which is currently active
on the engine. In this case, the ch_id associated with the fault is -1 and
the function "gv11b_mm_mmu_fault_handle_non_replayable" will not set the
rc_type correctly causing recovery to be skipped and leaving the engine in
a bad state.
Bug 3163660
Change-Id: Ic99c47771a4002c153ac77ab0473b11d01cfd54a
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2457259
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
gr_gv100_reset_hwpm_pmm_registers() writes a bunch of registers in
sys/gpc/fbp chiplets to reset perfmons. To ensure all the writes have
completed it is necessary to readback each chiplet's PRI fence register.
Add and use new HAL g->ops.priv_ring.read_pri_fence() to achieve this.
Implement the HAL for gv11b in new source code file
hal/priv_ring/priv_ring_gv11b.c.
Bug 2510974
Jira NVGPU-5360
Change-Id: If4dd61cb4265422e8c2d16884790eb0fe7f2c103
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2453631
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Update pmu ucode version for next pmu to 29323513.
This version is taken from P4 CL#29323216.
Changes:
- Enabled ACR task support
- Disabled few features/code for commands to work
- ELPG fifo preemption hals fixed
- Halt functions in ELPG save and restore functions
are commented as bloaded flag is not getting set. This
is not significant as this change will not have any impact
in elpg functionality.
P4 ToT CL on which above change was made: P4 CL#29322732
P4 CL link: https://p4sw-swarm.nvidia.com/changes/29323216
Bug 200666202
Signed-off-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com>
Change-Id: I34581cc15889463fa363cffb369485171c603247
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2447234
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>