- F/W version update for gv11b PMU ucode of
CL https://git-master.nvidia.com/r/#/c/1628288/
Current CL has PMU F/W version for ucode bin of
P4 CL# 23378914
P4 CL# & its changes.
- 23378914
- Don't post "PMU_PG_EVENT_IDLE_SNAP" event in
method pgConvertPgInterrupts_GP10X()
- 23355380
- Remove debug code included by mistake in P4
change list #23354716
- 23354716
- Made change to point CONVERT_PG_INTERRUPTS of
gv11b to _GP10x - pgConvertPgInterrupts_GP10X()
- Removed PMU halt upon FIFO preempt timeout in
_fifoPreemptRunlist_GP10X()
Bug 2039371
Bug 200377983
Change-Id: I8ce7cb926203b329308944235a06933768ed2a5f
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1628380
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Used chip specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes during committing
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.
Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode
This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.
Another issue fixed is: gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits
available now. This done because of legacy implementation
of fecs ucode.
Bug 1976694
Change-Id: I284e29e0815d205c150998b07d0757b5089d3267
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630520
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Tested-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
During bringup and before nvlink is up GV100 on the DDPX platform operates
with a very, very slow sysmem link. In order to get sysmem test to pass
it is neccesary to significantly increase most timeouts by an order the
magnitude.
Bug 2040544
Change-Id: I26858afde4ae80c70f86b47cfff674b6b00b5bf8
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627417
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
With a recent rework we moved gk20a_get() call to nvgpu_ioctl_tsg_open(),
but corresponding gk20a_put() call remained in gk20a_tsg_release()
So if a TSG is opened and released from within kernel with APIs
gk20a_tsg_open()/gk20a_tsg_release() we mistakenly drop extra refcount
through gk20a_put()
Fix this by moving gk20a_put() call to nvgpu_ioctl_tsg_release() which
balances gk20a_get() call in nvgpu_ioctl_tsg_open()
Bug 200374011
Change-Id: Id0cec0426e6231309dc530ab5c934dacaba9f8da
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630969
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_ce_create_context(), if gk20a_tsg_open() or gk20a_open_new_channel()
fails, we bail out from the function without setting the error code
This could mislead the caller and report incorrect success
Fix this by setting error code explicitly in failure cases
Bug 200374011
Change-Id: Idf6cba4a57740107bada698295745352f7b5d5ac
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631506
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_ce_delete_gpu_context(), we unbind the channel from TSG and
close the channel. But we do not drop the TSG refcount leaking
the TSG reference
Fix this by explicitly dropping TSG refcount
Also, do not explicitly unbind the channel from TSG
gk20a_channel_close() will internally unbind the channel from TSG
Bug 200374011
Change-Id: Ie4aa32f1d0bff4231f41aa2b33743cdc63e967c7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1629972
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_cde_load(), if TSG allocation fails we bail out the function without
setting the error code and caller of this functions assumes CDE load is
successful
Fix this by setting explicit error code if TSG allocation fails
Bug 200374011
Change-Id: I6e7bcb325fb0062605fa2f696da4abdeb34e241a
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627117
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gk20a_cde_remove_ctx(), we unbind the channel from TSG and
close the channel. But we do not drop the TSG refcount leaking
the TSG reference
After allocating sufficient contexts, we see TSG creation fails as below
nvgpu: 17000000.gp10b: gk20a_cde_load:1286 [ERR] cde: could not create TSG
Fix this by explicitly dropping TSG refcount
Also, do not explicitly unbind the channel from TSG
gk20a_channel_close() will internally unbind the channel from TSG
Bug 200374011
Change-Id: If6d75b20d5e03d710c0597d7a320d1157206a2a5
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1627116
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a field in vm_gk20a to identify guest managed VM, with the
corresponding checks to ensure that there's no kernel section for
guest managed VMs.
Also make the __nvgpu_vm_init function available globally, so that
the vm can be allocated elsewhere, requisite fields set, and passed
to the function to initialize the vm.
Change-Id: Iad841d1b8ff9c894fe9d350dc43d74247e9c5512
Signed-off-by: Sourab Gupta <sourabg@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1617171
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove nvgpu internal flag indicating that TSGs are required. We now
require TSGs always. This also fixes a regression where CE channels
were back to using bare channels on gp106.
Bug 1842197
Change-Id: Id359e5a455fb324278636bb8994b583936490ffd
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1628481
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We send a broadcast request to invoke scrubbing on all TPCs, but we
check only TPC0 for scrubbing to finish. This likely produces correct
results, because each TPC should take exactly the same number of cycles
for scrubbing, but it's not certain.
Change the polling loop to check all TPCs to make sure there are no
timing glitches.
Change-Id: Id3add77069743890379099a44aec8994f59d9a5e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1625349
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This CL handles
- erroneous use of boot_0 function pointer
before being assigned in __nvgpu_check_gpu_state
- And proper handling of error returned from gk20a_readl
in gk20a_mc_boot_0
With these fixes crash is not seen in case mc_boot_0 read
returns 0 in gk20a_mc_boot_0
- And also this handles the recursion caused by mc.boot_0()
calling nvgpu_readl and nvgpu_readl in turn
calling mc.boot_0 in case of read failure
Bug 2010966
Change-Id: Ia087811c67d88948b7fc5fff35e0fabc6ea91989
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1616274
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
During finalize power on, resume channels only after
complete hw initialization is done. Otherwise it will
cause issues with unexpected usage of hw. During first
boot will not see these issues because there will no channels.
But after rail gate/ungate or suspend/resume these issues
can be seen if channels are present before rail-gate/suspend.
Bug 2039195
Change-Id: Ie96e2f2b91902ba18b37e9a167344eeae07ba8c2
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1625506
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gv11b_set_gpc_tpc_mask(), we calculate tpc_count_mask based on
number of TPCs
But since we could change number of TPCs runtime, we would end up
calulating incorrect tpc_count_mask
Hence instead of calculating tpc_count_mask, just hard code it to
width of fuse register i.e. hard code tpc_count_mask to 4-bit value
Bug 2031635
Change-Id: Ia6f74d39d066775a5d133897305554df1e54157e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1617917
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Pavan Kunapuli <pkunapuli@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In __nvgpu_gmmu_do_update_page_table(), and in case of non-IOMMU mappings,
we call nvgpu_sgt_get_phys() to get physical address
But this API ignores mapping attributes including l3_alloc attribute
specified by user space, and this breaks L3 cache allocations
Fix this by using g->ops.mm.gpu_phys_addr() which also considers the
mapping attributes and returns appropriate physical address
Jira GPUT19X-10
Bug 200279508
Change-Id: Ibc0d29f7cb576a9d6893a97b1912d9ff4bc78e02
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1621245
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Convert tpc number from pes-aware to non-pes-aware
number. tpc id is converted to one that is numbered
in order starting from the active tpcs within PES0
followed by the active tpcs in subsequent PESs.
Bug 1842197
Change-Id: I18d4b20ee4998e5a2ca5439793fe2479b4326c1a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1615419
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gr_gv11b_init_fs_state() calls gr_gm20b_init_fs_state() which
disables vdc_4to2. This should no longer be done on gv11b, so
instead of calling gr_gm20b_init_fs_state() copy the relevant
lines to gr_gv11b_init_fs_state() and drop vdc_4to2 disable.
gv11b_ltc_init_fs_state() also disables it to match the state.
Remove that disable, too.
Change-Id: I3a3fd87a3e8836e495cb818570c971b3d29a6dd1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1619966
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Wei Sun <wsun@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
For gv11b, update thermal settings as per hw POR:
1.Created gv11b specific HAL for init_therm_setup_hw
2.Update steps for gradual slowdown to 1x,1.5x,2x,4x,8x,16x,32x.
3.Modified gradual step duration cycles to 4.
Bug 200365110
Change-Id: I93c28a3394857aacdf3d304103c9e7c25d4ad344
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1616600
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
All channels should be wrapped in TSGs so that bare channel support
can be dropped. Bind all CE channels to TSGs.
Bug 1842197
Change-Id: Ia55748d5b53750d860f7764b532ef9eeb6f214b8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1616693
Check the availability of ecc units by checking
relevant ecc fuse and fuse overrides.
During gpu boot, initialize ecc units by scrubbing
individual ecc units available. ECC initialization
should be done before gr initialization.
Following ecc units are scrubbed:
SM LRF
SM L1 DATA
SM L1 TAG
SM CBU
SM ICACHE
Bug 200339497
Change-Id: I54bf8cc1fce639a9993bf80984dafc28dca0dba3
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1612734
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_get_nvhost_dev will not return error if host1x field within
gv11b device tree is not present. It will just set has_syncpoints
in gk20a struct to false. syncpt_unit_interface* should be called
only if g->has_syncpoints is set to true.
Change-Id: Id1eb94aba4cff1942ad519f528ebdb8291963971
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1615973
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>