Now that GR buffers always have a kernel mapping, remove the unnecessary
calls to nvgpu_mem_begin() and nvgpu_mem_end() on these buffers:
- global ctx buffer mem in gr
- gr ctx mem in a tsg
- patch ctx mem in a gr ctx
- pm ctx mem in a gr ctx
- ctx_header mem in a channel (subctx header)
Change-Id: Id2a8ad108aef8db8b16dce5bae8003bbcd3b23e4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1760599
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
To finish OS unification of the submit path, move the
gk20a_submit_channel_gpfifo* functions to a file that's accessible also
outside Linux code.
Also change the prefix of the submit functions from gk20a_ to nvgpu_.
Jira NVGPU-705
Change-Id: I8ca355d1eb69771fb016c7a21fc7f102ca7967d7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1760421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-Add support for aborting runlist/s. Aborting runlist/s,
will abort all active tsgs and associated active channels
within these active tsgs
-Bare channels are no longer supported. Remove recovery
support for bare channels. In case there are bare
channels, recovery will trigger runlist abort
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: I6bec8a0004508cf65ea128bf641a26bf4c2f236d
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640567
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-is_preempt_pending hal does not need timeout_rc_type input param as
for volta, reset_eng_bitmask is saved if preempt times out. For
legacy chips, recovery triggers mmu fault and mmu fault handler
takes care of resetting engines.
-For volta, no special input param needed to differentiate between
preempt polling during normal scenario and preempt polling during
recovery. Recovery path uses preempt_ch_tsg hal to issue preempt.
This hal does not issue recovery if preempt times out.
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: Ie76a18ae0be880cfbeee615859a08179fb974fa8
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1709799
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
-During polling eng preempt done, reset eng only if eng stall
intr is pending. Also stop polling for eng preempt done
if eng intr is pending.
-Add max retries for pre-si platforms for poll pbdma and eng
preempt done polling loops.
Bug 2125776
Bug 2108544
Bug 2105322
Bug 2092051
Bug 2048824
Bug 2043838
Bug 2039587
Bug 2028993
Bug 2029245
Bug 2065990
Bug 1945121
Bug 200401707
Bug 200393631
Bug 200327596
Change-Id: I66b07be9647f141bd03801f83e3cda797e88272f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694137
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The biggest remaining Linuxism in the submit path is the
copy_from_user() calls for reading the gpfifo entries to the HW-visible
buffer. Abstract away the copy of one such segment starting at some
offset and keep the wraparound logic and vidmem proxy in the core submit
path.
Jira NVGPU-705
Change-Id: I0c6438045c695e5e3f5da4fbc0c92d2c6e7f32cb
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730480
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Moved PG refcount checking to a wrapper function, this
function manages the refcount and decides whether to call
dbg_set_powergate function.
Instead of checking the dbg_s->is_pg_disabled variable,
code is checking g->dbg_powergating_disabled_refcount
variable to know if powergate is disabled or not.
Updating hwpm ctxsw mode without disabling powergate
will result in priv errors.
Bug 200410871
Bug 2109765
Change-Id: I33c9022cb04cd39249c78e72584dfe6afb7212d0
Signed-off-by: Vinod G <vinodg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1753550
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In case of mmu nack error interrupt is received twice through SM
reported mmu nack interrupt and mmu fault in undertermined order.
Recover on the first received interrupt to avoid semaphore release
and skip doing a second recovery.
Also fix NULL pointer dereference in function
gv11b_fifo_reset_pbdma_and_eng_faulted when channel reference is
invalid in teardown path.
Bug 200382235
Change-Id: I361a5725d7b6355ebf02b2870727f647fbd7a37e
Signed-off-by: Aparna Das <aparnad@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1739804
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Make ecc sysfs hash table per GPU by adding it as
part of nvgpu_os_linux. Using a single hash table
might give incorrect results as GPUs have same filenames
and a filename is used as a key for a lookup.
Add device_attribute as part of struct gk20a_ecc_stat. Using
a single array of pointers of device attribute for an
ecc_stat results in memory leak and incorrect stats if
multiple GPUs are present on the system. This array of pointers
will always hold info for GPU which created sysfs nodes last.
Fix this by making device attribute array per ecc stat per GPU.
Fix ecc stat removal to consider zero sub-units for a given
number of hwunits. The multiplication with zero results
in not removing any sysfs node at all.
Bug 1987855
Change-Id: Ifcacc5623cede8decfe228c02d72786337cd0876
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735989
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add remove_gr_sys() op to gpu_ops to reverse steps
done in create_gr_sysfs().
Make gv11b_tegra_remove() specific to gv11b instead
to properly remove sysfs nodes. This also helps in
having gv11b specific remove steps.
Also, update platform remove function of dGPU i.e.
nvgpu_pci_tegra_remove() to remove sysfs nodes. This
adds parity with iGPU platform remove.
Bug 1987855
Change-Id: Ibbaffac5c24346709347f86444a951461894354d
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735987
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- defined platform agnostic wrapper for mempool
mapping and unmapping.
- used platform agnositc wrapper for device
tree parsing.
- modified css_gr_gk20a to include special
handling incase of rm-server
JIRA: VQRM:3699
Change-Id: I08fd26052edfa1edf45a67be57f7d27c38ad106a
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1733576
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- removed inclusion of linux includes.
- replaced with nvgpu/*.h's
- reformated the function signature of
"css_hw_get_pending_snapshot" and
"css_hw_get_overflow_status" be global instead of
static.
- added get_pending_snapshot and get_overflow_status
to ops->css.
JIRA: VQRM-3699
Change-Id: I177904c263e143b414924c2c28ad6fd3cfd00132
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1732783
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below HALs to setup mmu_fault configuration registers and to read
information registers and set them on Volta
gops.fb.write_mmu_fault_buffer_lo_hi()
gops.fb.write_mmu_fault_buffer_get()
gops.fb.write_mmu_fault_buffer_size()
gops.fb.write_mmu_fault_status()
gops.fb.read_mmu_fault_buffer_get()
gops.fb.read_mmu_fault_buffer_put()
gops.fb.read_mmu_fault_buffer_size()
gops.fb.read_mmu_fault_addr_lo_hi()
gops.fb.read_mmu_fault_inst_lo_hi()
gops.fb.read_mmu_fault_info()
gops.fb.read_mmu_fault_status()
Jira NVGPUT-13
Change-Id: Ia99568ff905ada3c035efb4565613576012f5bef
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1744063
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- On t186, ucode expects physical address to be
programmed for FECS trace buffer.
- On t194, ucode expects GPU VA to be programmed
for FECS trace buffer. This patch adds extra
support to handle this change for linux native.
- Increase the size of FECS trace buffer (as few
entries were getting dropped due to overflow of
FECS trace buffer.)
- This moves FECS trace buffer handling in global
context buffer.
- This adds extra check for updation of mailbox1
register. (Bug 200417403)
EVLR-2077
Change-Id: I7c3324ce9341976a1375e0afe6c53c424a053723
Signed-off-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1536028
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Nirav Patel <nipatel@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Starting with Volta, one TPC could have more than 1 SMs. So
.record_sm_error_state needs to have sm number as parameter.
Logic tpc id should be read from gr_gpc0_gpm_pd_sm_id_r.
Let the function return logical sm_id. RM server will need it to nofify
client.
Jira EVLR-2643
Bug 200405202
Change-Id: Iffaff05b89b1c5058616b8a6bf50dd73bd4e52f6
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1742165
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add below new HALs to allocate/map/commit global context buffers
gops.gr.alloc_global_ctx_buffers()
gops.gr.map_global_ctx_buffers()
gops.gr.commit_global_ctx_buffers()
Set these HALs for all the supported GPUs
We right now re-use below APIs to set these HALs
gr_gk20a_alloc_global_ctx_buffers()
gr_gk20a_map_global_ctx_buffers()
gr_gk20a_commit_global_ctx_buffers()
Jira NVGPUT-27
Change-Id: I975a54e8d1716af057f982d543787748d35a256e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1743362
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
VBIOS link_disable_mask should be sufficient to find the connected
links. As VBIOS is not updated with correct mask, we parse the DT
node where we hardcode the link_id. DT method is not scalable as same
DT node is used for different dGPUs connected over PCIE. Remove the
DT parsing of link id and use HAL to get link_mask based on the GPU.
JIRA NVLINK-162
Change-Id: Idb7b639962928ce48711a0d7fc277c4c324bee91
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1738967
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The sequence of INIT* minion dlcmd varies between nvlink 2.0 and 2.2.
The order is strict for 2.2. Also there are new dlcmds added to the
nvlink bringup sequence. Add HAL to allow sequence update for nvlink 2.2.
Old sequence:
INITLANEENABLE-> INITDLPL
New Sequence:
INITDLPL->INITDLPL_TO_CHIPA->INITTL->INITLANEENABLE
JIRA NVLINK-176
Change-Id: I49e0a726f56e7d6122ac4cddf0f0e021d16f1926
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1738329
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Earlier implementation of railgate disable config is disabling
runtime pm during pm_init. This is causing multiple issues:
1. gpu rail will be on as soon as nvgpu driver probe is called.
Actual gpu hw init may happen at much later point of time.
2. This is breaking railgate_enable sysfs node functionality.
railgate_enable is not working if runtime pm is disabled.
To avoid all these issues for railgate disable, enable runtime pm
during pm_init and set auto-suspend delay to negative (-1), which
will disable runtime pm suspend calls.
Also fixed following issues along with this:
1. Updated railgate_enable debugfs implementation to use auto-suspend delay.
To disable railgating:
Set auto-suspend delay with negative value(-1) which will disable runtime
pm suspend.
To enable railgating:
Set auto-suspend delay with railgate_delay value.
Also removed redundant user_railgate_disabled gk20a device data and
replaced with can_railgate, where ever it is applicable.
2. Initialized default railgate_delay to 500msec to avoid railgate
on/off transitions with railigate enable from disabled state.
3. Created railgate_residency debug fs node irrespective of can_railgate
initial state. This is helping with the case, where initial state of
railgate state off and then railgate enable is done through sysfs node.
Bug 2073029
Change-Id: I531da6d93ba8907e806f65a1de2a447c1ec2665c
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694944
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Before nvlink 2.2, driver was responsible for setting the NVLink clocks
during NVLink initialization. For the purpose of security, NVLink PLL
handling is moved to Minion in nvlink 2.2 and driver should stop writing
to these registers.
JIRA NVLINK-167
Change-Id: I18392a29c322da55053037bfde62c8f74ee75288
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730597
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
RXDET is supported only on nvlink 2.2 devices and forward.
Add HAL to run RXDET selectively based on chip. RXDET needs to be
done after the links are out of reset but before any other link
level initialization.
minion_send_cmd is also made non-static to support RXDET
functionality.
JIRA NVLINK-160
Change-Id: Ic65b8dbc7281743f62072089ff3c805521ac9b38
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1729525
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We receive bundle with address and 64 bit values from ucode on some platforms
This patch adds the support to handle 64 bit values
Add struct av64_gk20a to store an address and corresponding 64 bit value
Add struct av64_list_gk20a to store count and list of av64_gk20a
Add API alloc_av64_list_gk20a() to allocate the list that supports 64bit
values
In gr_gk20a_init_ctx_vars_fw(), if we see NETLIST_REGIONID_SW_BUNDLE64_INIT,
load the bundle64 state into above local structures
Add new HAL gops.gr.init_sw_bundle64() and call it from gk20a_init_sw_bundle()
if defined
Also load the bundle for simulation cases in gr_gk20a_init_ctx_vars_sim()
Jira NVGPUT-96
Change-Id: I1ab7fb37ff91c5fbd968c93d714725b01fd4f59b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1736450
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_gr_isr(), we handle various errors including GPC/TPC errors.
And then if BPT errors are pending we call gk20a_gr_post_bpt_events() at the
end and pass channel pointer to it
gk20a_gr_post_bpt_events() extracts TSG pointer based on ch->tsgid
But in some race conditions it is possible that we clear the error and trigger
recovery and as a result channel is unbounded from TSG and closed by user space
before calling gk20a_gr_post_bpt_events()
And in that case the code above results in getting incorrect TSG pointer and
hence crashes as below
Unable to handle kernel paging request at virtual address ffffff8012000c08
...
[<ffffff8008081f84>] el1_da+0x24/0xb4
[<ffffff80086e72e0>] gk20a_tsg_get_event_data_from_id+0x30/0xb0
[<ffffff80086e7560>] gk20a_tsg_event_id_post_event+0x50/0xc8
[<ffffff800872922c>] gk20a_gr_isr+0x27c/0x12e0
To fix this extract the TSG pointer before handling all the errors and pass
this pointer to gk20a_gr_post_bpt_events() will post the events if they are
enabled and if TSG is still open
Bug 200404720
Change-Id: I4861c72e338a2cec96f31cb9488af665c5f2be39
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735415
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add g->fifo_eng_timeout_us to define engine timeout in microseconds.
It is initialized with GRFIFO_TIMEOUT_CHECK_PERIOD_US. In RM server
case, it can be overriden with value defined in device tree.
Jira EVLR-2674
Change-Id: I69ac2ce779fe575566c8ba48e8cd2d0e6b2d93cf
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1728391
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>