Commit Graph

10 Commits

Author SHA1 Message Date
Ashish Srivastava
10c3d4447d gpu: nvgpu: gv11b: enable RMW for gpu atomics
Separate HALs are added in gv11b and gv100 for the
init_gpc_mmu function.
In the gv11b HAL, RMW is enabled for GPU atomics
by default.
In the gv100 HAL, the GPC atomic capability mode is
set based on the FB MMU capability: if the GPU is
connected through NVLINK, the MMU is set to RMW
mode; otherwise it stays in L2 mode.

Bug 200390336

Change-Id: I224934f83d1762ec864ef8da7265dd01d86893a0
Signed-off-by: Ashish Srivastava <assrivastava@nvidia.com>
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1735137
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-06-26 11:17:17 -07:00
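
Below is a minimal standalone C sketch of the init_gpc_mmu split described in the commit above, assuming hypothetical helper names; the real HALs program GPC MMU registers rather than returning an enum.

    /*
     * Illustrative sketch only: models the init_gpc_mmu split described in the
     * commit above. The enum values and the NVLINK query are stand-ins, not the
     * actual nvgpu register accessors or struct gk20a fields.
     */
    #include <stdbool.h>
    #include <stdio.h>

    enum gpc_atomic_mode {
        GPC_ATOMIC_MODE_L2,   /* atomics resolved in L2 */
        GPC_ATOMIC_MODE_RMW,  /* read-modify-write atomics in the GPC MMU */
    };

    /* Hypothetical stand-in for the FB MMU capability check (NVLINK attached?). */
    static bool gpu_is_connected_via_nvlink(void)
    {
        return true; /* assume an NVLINK-attached GV100 for the example */
    }

    /* gv100: follow the FB MMU capability */
    static enum gpc_atomic_mode gv100_pick_gpc_atomic_mode(void)
    {
        return gpu_is_connected_via_nvlink() ? GPC_ATOMIC_MODE_RMW
                                             : GPC_ATOMIC_MODE_L2;
    }

    /* gv11b: RMW is the default, no capability check needed */
    static enum gpc_atomic_mode gv11b_pick_gpc_atomic_mode(void)
    {
        return GPC_ATOMIC_MODE_RMW;
    }

    int main(void)
    {
        printf("gv100 mode: %d, gv11b mode: %d\n",
               gv100_pick_gpc_atomic_mode(), gv11b_pick_gpc_atomic_mode());
        return 0;
    }
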
Deepak Nibade
a0dfb2b911 gpu: nvgpu: gv100: consider floorswept FBPA for getting unicast list
In gr_gv11b/gk20a_create_priv_addr_table() we do not consider floorswept
FBPAs and calculate the unicast list assuming all FBPAs are present.
This generates an incorrect list of unicast addresses.

Fix this by introducing a new HAL ops.gr.split_fbpa_broadcast_addr:
Set gr_gv100_get_active_fpba_mask() for GV100
Set gr_gk20a_split_fbpa_broadcast_addr() for the rest of the chips

gr_gv100_get_active_fpba_mask() first gets the active FBPA mask so that
the unicast list is generated only for active FBPAs

Bug 200398811
Jira NVGPU-556

Change-Id: Idd11d6e7ad7b6836525fe41509aeccf52038321f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1694444
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-04-15 22:53:29 -07:00
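
A minimal standalone sketch of the active-FBPA filtering described in the commit above; the base addresses, stride and mask values are hypothetical, and the real gr_gv100 code derives the mask from the floorsweeping state.

    /*
     * Illustrative sketch of splitting one FBPA broadcast address into unicast
     * addresses for active FBPAs only. The base offsets, stride and the active
     * mask are made-up values; the real code derives them from the hardware
     * headers and the floorsweeping state.
     */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define FBPA_BROADCAST_BASE 0x009a0000u /* hypothetical broadcast base */
    #define FBPA_UNICAST_BASE   0x00900000u /* hypothetical unicast base */
    #define FBPA_STRIDE         0x4000u     /* hypothetical per-FBPA stride */
    #define NUM_FBPAS           8u

    /* Build the unicast list, skipping floorswept (inactive) FBPAs. */
    static unsigned int split_fbpa_broadcast_addr(uint32_t broadcast_addr,
                                                  uint32_t active_fbpa_mask,
                                                  uint32_t *priv_addr_table)
    {
        uint32_t offset = broadcast_addr - FBPA_BROADCAST_BASE;
        unsigned int count = 0;

        for (unsigned int fbpa = 0; fbpa < NUM_FBPAS; fbpa++) {
            if (!(active_fbpa_mask & (1u << fbpa)))
                continue; /* floorswept: no unicast entry */
            priv_addr_table[count++] =
                FBPA_UNICAST_BASE + fbpa * FBPA_STRIDE + offset;
        }
        return count;
    }

    int main(void)
    {
        uint32_t table[NUM_FBPAS];
        /* pretend FBPA 3 is floorswept: active mask 0xf7 */
        unsigned int n = split_fbpa_broadcast_addr(FBPA_BROADCAST_BASE + 0x20u,
                                                   0xf7u, table);
        for (unsigned int i = 0; i < n; i++)
            printf("0x%08" PRIx32 "\n", table[i]);
        return 0;
    }
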
Deepak Nibade
77b806fe7e gpu: nvgpu: gv100: fix PMA list alignment in ctxsw buffer
The GV100 ucode has changed so that it expects the LIST_nv_perf_pma_ctx_reg
list in the ctxsw buffer to be 256-byte aligned, but the same change is not
applied to the other chips' ucodes.

Add a new HAL (*add_ctxsw_reg_perf_pma) to configure the PMA register list,
and define a common HAL gr_gk20a_add_ctxsw_reg_perf_pma() for all chips
except GV100.

Define a separate HAL for GV100, gr_gv100_add_ctxsw_reg_perf_pma(), and fix
the required alignment in this function.

Bug 1998067

Change-Id: Ie172fe90e2cdbac2509f2ece953cd8552e66fc56
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676655
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:38 -07:00
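
A minimal sketch of the GV100-specific alignment described in the commit above, using a simplified offset counter; the 256-byte constant comes from the commit, the rest is illustrative.

    /*
     * Illustrative sketch of the alignment difference described above: GV100
     * rounds the ctxsw-buffer offset up to a 256-byte boundary before adding
     * the LIST_nv_perf_pma_ctx_reg entries; other chips keep the old behavior.
     * The offset bookkeeping is simplified; the real HALs add entries to the
     * ctxsw buffer map rather than returning an offset.
     */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PMA_LIST_ALIGN 256u

    static uint32_t round_up_pow2(uint32_t v, uint32_t align)
    {
        return (v + align - 1u) & ~(align - 1u);
    }

    /* Common behavior (all chips except GV100): no extra alignment. */
    static uint32_t gk20a_add_ctxsw_reg_perf_pma(uint32_t offset, uint32_t bytes)
    {
        return offset + bytes;
    }

    /* GV100: the ucode expects the PMA list to start on a 256-byte boundary. */
    static uint32_t gv100_add_ctxsw_reg_perf_pma(uint32_t offset, uint32_t bytes)
    {
        return round_up_pow2(offset, PMA_LIST_ALIGN) + bytes;
    }

    int main(void)
    {
        printf("gk20a next offset: 0x%" PRIx32 "\n",
               gk20a_add_ctxsw_reg_perf_pma(0x1234u, 0x80u));
        printf("gv100 next offset: 0x%" PRIx32 "\n",
               gv100_add_ctxsw_reg_perf_pma(0x1234u, 0x80u));
        return 0;
    }
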
Deepak Nibade
66751bc05d gpu: nvgpu: gv100: fix num_fbpas while adding ctxsw buffer entries
For LIST_nv_pm_fbpa_ctx_regs, we currently call
add_ctxsw_buffer_map_entries_subunits() to add registers corresponding
to all the FBPAs.

But while configuring the total number of registers, we do not account
for floorswept FBPAs, which misaligns the subsequent lists on GV100.

Fix this by reading the disabled/floorswept FBPAs from fuse and
considering only those FBPAs which are active on GV100.

Add a new HAL (*add_ctxsw_reg_pm_fbpa) to support this setting and define
a common HAL gr_gk20a_add_ctxsw_reg_pm_fbpa() for all chips except GV100.

Define a GV100-specific gr_gv100_add_ctxsw_reg_pm_fbpa() with the above
implementation to account for floorsweeping.

Bug 1998067

Change-Id: Id560551bb0b8142791c117b6d27864566c90b489
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1676654
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-03-21 06:04:35 -07:00
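
A minimal sketch of counting only active FBPAs, as described in the commit above; the fuse helper and register count are made-up stand-ins.

    /*
     * Illustrative sketch of the FBPA-count fix described above: derive the
     * number of active FBPAs from a floorsweeping mask instead of assuming all
     * FBPAs are present, so the LIST_nv_pm_fbpa_ctx_regs entry count (and the
     * alignment of the lists that follow it) comes out right. The fuse read is
     * a stub; the real code reads the FBPA floorsweeping status from fuse.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_FBPAS 8u

    /* Hypothetical stand-in for the fuse read returning disabled FBPAs. */
    static uint32_t fuse_disabled_fbpa_mask(void)
    {
        return 0x5u; /* pretend FBPAs 0 and 2 are floorswept */
    }

    static unsigned int gv100_num_active_fbpas(void)
    {
        uint32_t disabled = fuse_disabled_fbpa_mask();
        unsigned int count = 0;

        for (unsigned int i = 0; i < MAX_FBPAS; i++)
            if (!(disabled & (1u << i)))
                count++;
        return count;
    }

    int main(void)
    {
        /* registers per FBPA times active FBPAs gives the subunit list size */
        unsigned int regs_per_fbpa = 16;
        printf("ctxsw entries: %u\n", regs_per_fbpa * gv100_num_active_fbpas());
        return 0;
    }
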
Richard Zhao
86cf2d0857 gpu: nvgpu: gv100: remove gr_gv100_load_tpc_mask()
No one uses it anymore.

Jira VQRM-2982

Change-Id: Iac885221b670507241663a851c4a597157233756
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1664653
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-02-27 14:31:21 -08:00
Terje Bergstrom
f3f14cdff5 gpu: nvgpu: Fold T19x code back to main code paths
Many code paths were split into T19x-specific code paths and structs
due to the split repository. Now that the repositories are merged, fold
all of them back into the main code paths and structs and remove the
T19x-specific Kconfig flag.

Change-Id: Id0d17a5f0610fc0b49f51ab6664e716dc8b222b6
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1640606
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-22 22:20:15 -08:00
Thomas Fleury
6b90684cee gpu: nvgpu: vgpu: get virtual SMs mapping
On gv11b we can have multiple SMs per TPC. Add sm_per_tpc to the
vgpu constants to properly dimension the virtual SM to TPC/GPC
mapping in the virtualization case.
Use TEGRA_VGPU_CMD_GET_SMS_MAPPING to query the current mapping.

Bug 2039676

Change-Id: I817be18f9a28cfb9bd8af207d7d6341a2ec3994b
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1631203
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2018-01-10 15:57:20 -08:00
David Nieto
2029426446 gpu: nvgpu: gv1xx: resize patch buffer
Follow the sizing consideration in bug 1753763 to support dynamic TPC modes
and subcontexts.

bug 200350539

Change-Id: Ibbdbf02f9c2ea3f082c1b2810ae7176b0775d461
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1584034
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-10-26 17:56:15 -07:00
seshendra Gadagottu
c6ccb5f2a1 gpu: nvgpu: gv11b: use scg perf for smid numbering
For SCG to work, SM ID numbering needs to be done
based on the SCG performance of TPCs. For gv11b and
gv11b vgpu, reuse the gv100 function gr_gv100_init_sm_id_table()
to do this.

Use a local variable "index" (index = sm_id + sm) to avoid
repeating the computation in gr_gv100_init_sm_id_table().

Add debug info to print the initialized gpc/tpc/sm/global_tpc
indices.

Bug 1842197

Change-Id: Ibf10f47f10a8ca58b86c307a22e159b2cc0d0f43
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1583916
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-10-25 11:23:24 -07:00
David Nieto
ed8ac6e005 gpu: nvgpu: fix smid generation of perf tables
SMID tables were generated according to the local TPC configuration, while
the pagepool and CB buffer sizes were taken from a different chip, and
performance was not taken into consideration, which made compute kernels
hang with CTAs in flight.

This change ensures we use the right sizes and adds proper enumeration
of SM IDs.

JIRA: NVGPUGV100-36
bug 2004378

Change-Id: Ic8f50c325d6d6720cca41d9740ae4f5f51e1100a
Signed-off-by: David Nieto <dmartineznie@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1581664
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-10-20 11:55:43 -07:00