The gv11b and gm20b gr status reg dumps can get printed so early that
this array is null, so don't access it in that case.
Commit 946f1e6359 ("gpu: nvgpu: don't read
missing gpc_tpc_count in dump") fixed this for gp10b only.
Bug 2049965
Change-Id: I9739fd63b5a153f43000d719a5c509e3be5135cf
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1643692
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Lots of code paths were split to T19x specific code paths and structs
due to split repository. Now that repositories are merged, fold all of
them back to main code paths and structs and remove the T19x specific
Kconfig flag.
Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Added sw method support for SET_BES_CROP_DEBUG4.
In this sw method:
CLAMP_FP_BLEND_TO_MAXVAL forces overflow and
CLAMP_FP_BLEND_TO_INF blend results to clamp to FP maxval.
Added support for this sw method in gp10b/gp106/gv11b
and gv100.
Bug 2046636
Change-Id: I3a9e97587aca76718f7f504ea3b853f87409092a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1641529
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Upon receiving MMU_FAULT error, MMU will forward MMU_NACK to SM
If MMU_NACK is masked out, SM will simply release the semaphores
And if semaphores are released before MMU fault is handled, user space
could see that operation as successful incorrectly
Fix this by handling SM reported MMU_NACK exception
Enable MMU_NACK reporting in gv11b_gr_set_hww_esr_report_mask
In MMU_NACK handling path, we just set the error notifier and clear
the interrupt so that the User Space sees the error as soon as
semaphores are released by SM
And MMU_FAULT handling path will take care of triggering RC recovery
anyways
Also add necessary h/w accessors for mmu_nack
Bug 2040594
Jira NVGPU-473
Change-Id: Ic925c2d3f3069016c57d177713066c29ab39dc3d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631708
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Used chip specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes during committing
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.
For gp10b used smaller buffer sizes than specified
value in hw manuals as per sw requirement.
Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode
This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.
Another issue fixed is: gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits
available now. This done because of legacy implementation
of fecs ucode.
Bug 1976694
Change-Id: I2dc923340d34d0dc5fe45419200d0cf4f53cdb23
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1635027
GVS: Gerrit_Virtual_Submit
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This reverts commit caf168e33e.
Might be causing an intermittency in quill-c03 graphics submit. Super
weird since the only change that seems like it could affect it is the
header file update but that seems rather safe.
Bug 2044830
Change-Id: I14809d4945744193b9c2d7729ae8a516eb3e0b21
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1634349
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Timo Alho <talho@nvidia.com>
Tested-by: Timo Alho <talho@nvidia.com>
Used chip specific attrib_cb_gfxp_default_size and
attrib_cb_gfxp_size buffer sizes during committing
global callback buffer when gfx preemption is requested.
These sizes are different for gv11b from gp10b.
Also used gv11b specific preemption related functions:
gr_gv11b_set_ctxsw_preemption_mode
gr_gv11b_update_ctxsw_preemption_mode
This is required because preemption related buffer
sizes are different for gv11b from gp10b. More optimization
will be done as part of NVGPU-484.
Another issue fixed is: gpu va for preemption buffers
still needs to be 8 bit aligned, even though 49 bits
available now. This done because of legacy implementation
of fecs ucode.
Bug 1976694
Change-Id: I284e29e0815d205c150998b07d0757b5089d3267
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1630520
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Tested-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
We send a broadcast request to invoke scrubbing on all TPCs, but we
check only TPC0 for scrubbing to finish. This likely produces correct
results, because each TPC should take exactly the same number of cycles
for scrubbing, but it's not certain.
Change the polling loop to check all TPCs to make sure there are no
timing glitches.
Change-Id: Id3add77069743890379099a44aec8994f59d9a5e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1625349
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gv11b_set_gpc_tpc_mask(), we calculate tpc_count_mask based on
number of TPCs
But since we could change number of TPCs runtime, we would end up
calulating incorrect tpc_count_mask
Hence instead of calculating tpc_count_mask, just hard code it to
width of fuse register i.e. hard code tpc_count_mask to 4-bit value
Bug 2031635
Change-Id: Ia6f74d39d066775a5d133897305554df1e54157e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1617917
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Pavan Kunapuli <pkunapuli@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Convert tpc number from pes-aware to non-pes-aware
number. tpc id is converted to one that is numbered
in order starting from the active tpcs within PES0
followed by the active tpcs in subsequent PESs.
Bug 1842197
Change-Id: I18d4b20ee4998e5a2ca5439793fe2479b4326c1a
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1615419
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gr_gv11b_init_fs_state() calls gr_gm20b_init_fs_state() which
disables vdc_4to2. This should no longer be done on gv11b, so
instead of calling gr_gm20b_init_fs_state() copy the relevant
lines to gr_gv11b_init_fs_state() and drop vdc_4to2 disable.
gv11b_ltc_init_fs_state() also disables it to match the state.
Remove that disable, too.
Change-Id: I3a3fd87a3e8836e495cb818570c971b3d29a6dd1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1619966
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: Wei Sun <wsun@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Check the availability of ecc units by checking
relevant ecc fuse and fuse overrides.
During gpu boot, initialize ecc units by scrubbing
individual ecc units available. ECC initialization
should be done before gr initialization.
Following ecc units are scrubbed:
SM LRF
SM L1 DATA
SM L1 TAG
SM CBU
SM ICACHE
Bug 200339497
Change-Id: I54bf8cc1fce639a9993bf80984dafc28dca0dba3
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1612734
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Pre-gv11b we only had 2 TPCs in a GPC. But on gv11b we have 4 TPCs in a GPC.
Hence update gr_gv11b_set_gpc_tpc_mask() as per new configuration and allow
setting bits based on number of TPCs
Bug 2031635
Change-Id: I44f5f6ce5f3e2501c229c9fcda36fb330ebf8bd0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1614044
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For gv11b, configured gfx preemption wfi timeout in usec.
Set timeout unit as usec in gr_gv11b_init_preemption_state.
Used default timeout as 1msec and this timeout value can
be modified through sysfs node:
/sys/devices/gpu.0/gfxp_wfi_timeout_count
For gp10b:
gfxp_wfi_timeout_count is in syclk cycles
For gv11b:
gfxp_wfi_timeout_count is in usec
Bug 2003668
Change-Id: I68d52ce996a83df90b8b3a8164debb07e5cb370f
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1599658
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove all linux and soc specific includes from common source file gr_gv11b.c
Use common nvgpu_usleep_range() instead of linux specific usleep_range()
Remove redundant kernel version checks pertaining to unsupported kernel versions
Use nvgpu_tegra_fuse_*() APIs instead of soc specific APIs
Jira NVGPU-405
Change-Id: I6f1602c6ab9f61046d68d3c465eb23873910960d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1606980
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
With recent rework in nvgpu most of the <uapi/linux/nvgpu.h> includes
are not needed so remove them
Remove use of NVGPU_DBG_GPU_REG_OP_* in gk20a/gr_gk20a.c and use common
definition instead
Remove use of NVGPU_ALLOC_GPFIFO_FLAGS_REPLAYABLE_FAULTS_ENABLE in
gp10b/fifo_gp10b.c by defining new common flag
NVGPU_GPFIFO_FLAGS_REPLAYABLE_FAULTS_ENABLE and then parsing it in API
nvgpu_gpfifo_user_flags_to_common_flags()
Jira NVGPU-363
Change-Id: I8e653275ea3f443f24be7284d54f2115636aba3f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1606108
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This patch removes the dependency on the header file "uapi/linux/nvgpu.h"
for regops_gk20a.c. The original structure and definitions in the
uapi/linux/nvgpu.h is maintained for userspace libnvrm_gpu.h. The
following changes are made in this patch.
1) Defined common versions of the NVGPU_DBG_GPU_REG_OP* definitions inside
regops_gk20a.h.
2) Defined common version of struct nvgpu_dbg_gpu_reg_op inside
regops_gk20a.h naming it struct nvgpu_dbg_reg_op.
3) Constructed APIs to convert the NVGPU_DBG_GPU_REG_OP* definitions from
linux versions to common and vice versa.
4) Constructed APIs to convert from struct nvgpu_dbg_gpu_reg_op to
struct nvgpu_dbg_reg_op and vice versa.
5) The ioctl handler nvgpu_ioctl_channel_reg_ops first copies from
userspace into a local storage based on struct nvgpu_dbg_gpu_reg_op which
is copied into the struct nvgpu_dbg_reg_op using the APIs above and
after executing the regops handler passes the data back into userspace
by copying back data from struct nvgpu_dbg_reg_op to struct
nvgpu_dbg_gpu_reg_opi.
JIRA NVGPU-417
Change-Id: I23bad48d2967a629a6308c7484f3741a89db6537
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1596972
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a translation layer to convert from the NVGPU_AS_* flags to
to new set of NVGPU_VM_MAP_* and NVGPU_VM_AREA_ALLOC_* flags.
This allows the common MM code to not depend on the UAPI header
defined for Linux.
In addition to this change a couple of other small changes were
made:
1. Deprecate, print a warning, and ignore usage of the
NVGPU_AS_MAP_BUFFER_FLAGS_MAPPABLE_COMPBITS flag.
2. Move the t19x IO coherence flag from the t19x UAPI header
to the regular UAPI header.
JIRA NVGPU-293
Change-Id: I146402b0e8617294374e63e78f8826c57cd3b291
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1599802
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- in the native case, replace calls for init_cyclestats with
the gm20b version, as each chip had identical versions of the code.
- in the virtual case, use the vgpu version of the function in order
to get the new max_css_buffer_size characteristic set to the mempool
size.
JIRA ESRM-54
Bug 200296210
Change-Id: I475876cb392978fb1350ede58e37d0962ae095c3
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1578934
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For SCG to work, smid numbering needs to be done
based on scg performance of tpcs. For gv11b and
gv11b vgpu, reuse gv100 function "gr_gv100_init_sm_id_table"
to do this.
Used local variable "index" to avoid multiple computations in
the function: gr_gv100_init_sm_id_table
index = sm_id + sm
Add deug info for printing initialized gpc/tpc/sm/global_tpc
indexs.
Bug 1842197
Change-Id: Ibf10f47f10a8ca58b86c307a22e159b2cc0d0f43
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1583916
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
For pre-silicon platforms, clock gating
should be skipped as it is not supported.
Added new flags "can_"x"lcg" to check platform
capability before programming SLCG,BLCG and ELCG.
Bug 200314250
Change-Id: Iec7564b00b988cdd50a02f3130662727839c5047
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1566251
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reorganize HAL initialization to remove inheritance and construct
the gpu_ops struct at compile time. This patch only covers the
gr sub-module of the gpu_ops struct.
Perform HAL function assignments in hal_gxxxx.c through the
population of a chip-specific copy of gpu_ops.
Jira NVGPU-74
Change-Id: I8feaa95a9830969221f7ac70a5ef61cdf25094c3
Signed-off-by: Sunny He <suhe@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1542988
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Number of sm is being reported incorrectly. This is because
we are not taking into account that each TPC have 2 sm.
Bug 1951026
Change-Id: I7c666aa2a0470a14aad29ab1a80ae9d23958a743
Signed-off-by: Sandarbh Jain <sanjain@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1527771
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alexander Lewkowicz <alewkowicz@nvidia.com>
Tested-by: Alexander Lewkowicz <alewkowicz@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Replace privsecurity boolean flag in gpu_ops with entry in
common flag system.
The new common flag is NVGPU_SEC_PRIVSECURITY
Jira NVGPU-74
Change-Id: I4c11e3a89a76abe137cf61b69ad0fbcd665554b7
Signed-off-by: Sunny He <suhe@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1525714
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
- Created HAL to configure gpc mmu unit for gv11b.
- Earlier chips needs writes to NV_PGRAPH_PRI_GPCS_MMU_NUM_ACTIVE_LTCS
register to know supported number of LTCS by reading NUM_ACTIVE_LTCS
but gv11b support auto update from fuse upon reset, so skipped
LTCS update for GPCS & skipping helps to fix compression failure
issue.
Bug 1950234
Change-Id: I628af7d1399e4fe3126895e3a703a19147f7a12f
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1517733
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Tested-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>