Commit Graph

697 Commits

Author SHA1 Message Date
Deepak Nibade
771f742703 gpu: nvgpu: add supported preemptions to gpu characteristics
Add below flag fields to gpu characteristics to indicate
supported and default preemption modes on platform for
graphics and compute

__u32 graphics_preemption_mode_flags;
__u32 compute_preemption_mode_flags;
__u32 default_graphics_preempt_mode;
__u32 default_compute_preempt_mode;

Add struct nvgpu_preemption_modes_rec to struct gr_gk20a
to store these values locally

Use platform specific get_preemption_mode_flags() to
get the flags and define gk20a/gm20b specific
get_preemption_mode_flags() API

Bug 1646259

Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133595
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:53 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD

Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h

Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)

API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute

Make necessary changes in code to support two preemption
modes

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Haley Teng
4c4d0e6eb2 nvgpu: vgpu: create fifo.force_reset_ch in gpu_ops
gk20a_fifo_force_reset_ch() does not support vgpu now, so we need to
create a function pointer in gpu_ops and assign it differently for
vgpu and non-vgpu.

Bug 200184349

Change-Id: I5f8f4f731b4b970c4ff8de65531f25568e7691b6
Signed-off-by: Haley Teng <hteng@nvidia.com>
Reviewed-on: http://git-master/r/1130420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 09:52:04 -07:00
Terje Bergstrom
2db5e4794e gpu: nvgpu: Fix floorsweeping for multi-GPC GPU
There were multiple bugs in dealing with a GPU with more than one
GPC.

* Beta CB size was set to wrong PPC
* TPC mask did not shift fields correctly
* PD skip table used || instead of | operator

Change-Id: I849e2331a943586df16996fe573da2a0ac4cce19
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1132109
2016-04-27 08:12:23 -07:00
Terje Bergstrom
ec62c649b5 gpu: nvgpu: Idle GR before calling PMU ZBC save
On gk20a when PMU is updating ZBC colors it is reading them from L2.
But L2 has one port, and ZBC reads can race with other transactions.
Idle graphics before sending PMU the ZBC_UPDATE request.

Also makes pmu_save_zbc a HAL, because PMU ucode has changes to bypass
this problem on some chips.

Bug 1746047

Change-Id: Id8fcd6850af7ef1d8f0a6aafa0fe6b4f88b5f2d9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129017
2016-04-26 09:46:04 -07:00
Deepak Nibade
5765694f2a gpu: nvgpu: do not include hw_proj_*.h
hw_proj_gk20a.h and hw_proj_gm20b.h should not be
included, hence remove the includes and APIs used
from the header

Use nvgpu_get_litter_value() API to replace use
of header

Bug 200156699

Change-Id: I5e88f71657682dd94ac7f0a45f940b70cf8222e7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1129611
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-23 10:34:30 -07:00
Deepak Nibade
2b2f84219c gpu: nvgpu: add accessors for global_esr values and sm_dbgr_control
Add gk20a/gm20b accessors for various global_esr values
and for sm_dbgr_control modes

Bug 200156699

Change-Id: If7fd8cd7567f8bcd1f645facf9553bdc0a153526
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120333
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:08:07 -07:00
Deepak Nibade
b63c4bced5 gpu: nvgpu: IOCTL to suspend/resume context
Add below IOCTL to suspend/resume a context
NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS:

Suspend sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, suspend all SMs
- otherwise, disable channel/TSG
- enable ctxsw

Resume sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, resume all SMs
- otherwise, enable channel/TSG
- enable ctxsw

Bug 200156699

Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120332
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:45 -07:00
Deepak Nibade
c651adbeaa gpu; nvgpu: IOCTL to write/clear SM error states
Add below IOCTLs to write/clear SM error states

NVGPU_DBG_GPU_IOCTL_CLEAR_SINGLE_SM_ERROR_STATE
NVGPU_DBG_GPU_IOCTL_WRITE_SINGLE_SM_ERROR_STATE

Bug 200156699

Change-Id: I89e3ec51c33b8e131a67d28807d5acf57b3a48fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120330
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:22 -07:00
Deepak Nibade
04e45bc943 gpu: nvgpu: support storing/reading single SM error state
Add support to store error state of single SM before
preprocessing SM exception

Error state is stored as :
struct nvgpu_dbg_gpu_sm_error_state_record {
u32 hww_global_esr;
u32 hww_warp_esr;
u64 hww_warp_esr_pc;
u32 hww_global_esr_report_mask;
u32 hww_warp_esr_report_mask;
}

Note that we can safely append new fields to above
structure in the future if required

Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE
to support reading SM's error state by user space

Bug 200156699

Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:03 -07:00
Terje Bergstrom
0e423669a4 gpu: nvgpu: Wait for BAR1 bind
Wait for BAR1 bind to complete before continuing. The register to
wait exists Maxwell onwards.

Change-Id: Ie3736033fdb748c5da8d7a6085ad6d63acaf41f5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1123941
2016-04-15 12:38:33 -07:00
Terje Bergstrom
7d8e219389 gpu: nvgpu: Use sysmem aperture for SoC memory
In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it
has to be accessed as sysmem.

Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
2016-04-15 08:50:34 -07:00
Terje Bergstrom
6839341bf8 gpu: nvgpu: Add litter values HAL
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.

Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121383
2016-04-15 08:48:20 -07:00
Richard Zhao
e7664f9345 gpu: nvgpu: make tpc_fs_mask work on production board
On production fused boards, it uses gr_fe_tpc_fs_r() to mask TPCs,
rather than fues.

Bug 1734150

Change-Id: I7b4eb428f1ad0cf841a57214e0c8c1e8f17b2c5a
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1111630
(cherry picked from commit 869ea54967812e03d9f1e69775ca56fd6459216c)
Reviewed-on: http://git-master/r/1122121
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-14 10:10:24 -07:00
Terje Bergstrom
9b5427da37 gpu: nvgpu: Support GPUs with no physical mode
Support GPUs which cannot choose between SMMU and physical
addressing.

Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-04-13 13:12:41 -07:00
Peter Daifuku
6eeabfbdd0 gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch
Add support for SMPC and HWPM context switching when virtualized

Bug 1648200
JIRASW EVLR-219
JIRASW EVLR-253

Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1122034
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-08 12:34:50 -07:00
Terje Bergstrom
e8bac374c0 gpu: nvgpu: Use device instead of platform_device
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.

Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
2016-04-08 09:42:41 -07:00
Thomas Fleury
327644a1a0 gpu: nvgpu: hw definitions for FECS trace on gm20b
Bug 1648908

Change-Id: I3b9a1ebaf062f15db397acd81b8312dc8daa9193
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1117262
Reviewed-on: http://git-master/r/1121233
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 13:54:35 -07:00
Peter Daifuku
37155b65f1 gpu: nvgpu: support for hwpm context switching
Add support for hwpm context switching

Bug 1648200

Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1120450
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 11:05:49 -07:00
Terje Bergstrom
6675c03603 gpu: nvgpu: Sync with register generator
Use re-generated register definitions. This synchronizes
kernel with the register generator.

Change-Id: I85a00f8f5c7bdfbc56cf4df909e5ae892d86f062
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120812
2016-04-07 09:39:28 -07:00
Supriya
bf82cd220a gpu: nvgpu: Add Fuse prints on PMU Halt
-Print fuse values in case of PMU halt error
-and mailbox reads 0xDEADDEAD

Bug 1737044

Change-Id: I59f5fcf4a69bdd2a2eea81a69dd99bb9c4c21e1d
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/1113464
(cherry picked from commit d0320eed72c5070c4fcc7564c02fa38599984751)
Reviewed-on: http://git-master/r/1120429
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-06 19:38:19 -07:00
Sami Kiminki
135d6db448 gpu: nvgpu: Add HAL for GPU characteristics
Add function pointer for chip specific GPU characteristics init.

Bug 1637486

Change-Id: I6ce5eea124d8057393dec6e86e72412cc87e1cfa
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/780535
(cherry picked from commit f5c240d6ed19b5b9eedff05767c885ad5812c71e)
Reviewed-on: http://git-master/r/1120428
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-06 19:37:58 -07:00
Sumit Singh
466dc96afe Merge branch 'PM_RUNTIME-Removal' into 'dev-kernel-3.18'
This change performs merge of 'PM_RUNTIME_Removal' dev-branch with
'dev-kernel-3.18' branch. It replaces CONFIG_PM_RUNTIME with CONFIG_PM.

JIRA TPM-704

Change-Id: I306e254716f275c283f727fc232d7244939542b6
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2016-03-30 11:17:41 +05:30
Yogish Kulkarni
6e946ad3a3 gpu: nvgpu: use gk20a_free_sgtable to free sgtable
Use gk20a_free_sgtable to free sgtable

Bug 200130473

Change-Id: I6ddffb848a289ce81804502b7628feb5a4a8d000
Signed-off-by: Yogish Kulkarni <yogishk@nvidia.com>
Reviewed-on: http://git-master/r/785884
(cherry picked from commit a4f3b53f2ed3971d9b8945f5bc9c1b2822156a89)
Reviewed-on: http://git-master/r/833646
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-23 07:51:59 -07:00
Aingara Paramakuru
82da6ed595 gpu: nvgpu: add support to set channel timeslice
As part of improving GPU scheduling, userspace can now set a
channel's timeslice, within reasonable limits imposed by the
kernel driver.

JIRA VFND-1312
Bug 1729664

Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1020917
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-22 10:42:45 -07:00
Arul Sekar
032efd066e gpu: nvgpu: Provide cpu gpu time correlation via ioctl
bug 1648908

Provides pairs of CPU and GPU timestamps that
can be used for correlatiing the two timebases

- IOCTL made available /dev/nvhost-ctrl-gpu

Change-Id: I1458b9d33d794b1b02ec9fd29ed9426756b94bcd
Signed-off-by: Arul Sekar <aruls@nvidia.com>
Reviewed-on: http://git-master/r/1029732
Reviewed-by: Arun Gona <agona@nvidia.com>
Tested-by: Arun Gona <agona@nvidia.com>
Reviewed-on: http://git-master/r/1111715
GVS: Gerrit_Virtual_Submit
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-22 10:39:45 -07:00
Terje Bergstrom
a13a4124c7 gpu: nvgpu: Disable illegal comptag interrupt
Illegal comptag interrupt is triggered when a page is mapped with
two different kinds with incompatible compression status. This can
be intentional, so disable the interrupt.

Change-Id: I84a212beac147991d09d2d381a9e770b1364f4d8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1029663
(cherry picked from commit 819607a768f9fccdd0b233d58bcf88b9eee4ee19)
Reviewed-on: http://git-master/r/1031010
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2016-03-22 10:02:29 -07:00
Aingara Paramakuru
2a58d3c27b gpu: nvgpu: improve channel interleave support
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).

By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).

As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.

JIRA VFND-1302
Bug 1729664

Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-15 16:23:44 -07:00
Sumit Singh
383c769995 gpu: nvgpu: Replace CONFIG_PM_RUNTIME with CONFIG_PM
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME may now be changed to depend on
CONFIG_PM.

Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under
drivers/gpu/nvgpu/.

JIRA TPM-704

Change-Id: I23965838ff6ec77829076cd834e87641fb68e268
Signed-off-by: Sumit Singh <sumsingh@nvidia.com>
2016-03-15 16:17:42 +05:30
Seshendra Gadagottu
fab090f351 gpu: nvgpu: tegra: fix sparse errors
Fixed following sparse errors:
- therm_gm20b.c:68:6: warning: symbol 'gm20b_init_therm_ops'
  was not declared. Should it be static?
- platform_gk20a_tegra.c:825:5: warning: symbol 'gk20a_set_clk_rate'
   was not declared. Should it be static?

Bug 200067946

Change-Id: I485d5e76302fb294865854f314db2d27f71520f7
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1026685
GVS: Gerrit_Virtual_Submit
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2016-03-13 22:19:00 -07:00
Supriya
640cb6642f gpu: nvgpu: LRF, TEX, LTC, DRAM override
- Adding support for FECS mem overrides

Bug 1699676

Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/921253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-26 12:29:55 -08:00
mheyer
de66bf2869 gpu: nvgpu: enable use_full_comp_tag_line in gpc mmu
Also GPC MMU needs to have its PRI_MMU_CTRL_USE_FULL_COMP_TAG_LINE
control bit set.

Bug 1730611

Signed-off-by: Mathias Heyer <mheyer@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Change-Id: I01e11de066ea5487bf1d9c8c8eddbf159e4882da
Reviewed-on: http://git-master/r/1014881
(cherry picked from commit d1651bbebe1b3e46d2173dec1651b3d2f4307b40)
Reviewed-on: http://git-master/r/1017459
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-24 08:00:38 -08:00
Deepak Nibade
7b42acda56 gpu: nvgpu: enable ctxsw_intr1 interrupt
Enable NV_PGRAPH_PRI_FECS_HOST_INT_ENABLE_CTXSW_INTR1

Bug 200156699

Change-Id: I170dd6998381897a4b4ca832774eb0f11f92fd86
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935772
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-05 12:45:20 -08:00
Adeel Raza
f0a9ce0469 gpu: nvgpu: SM/TEX exception handling support
Add TEX exception handling support. Also make SM exception handler into
a function pointer, which should allow different chips to implement
their own SM exception handling routine.

Bug 1635727
Bug 1637486

Change-Id: I429905726c1840c11e83780843d82729495dc6a5
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/935329
2016-01-29 14:40:11 -08:00
Deepak Nibade
04092f74bb gpu: nvgpu: set set_sm_debug_mode() for gm20b
Set function pointer gops->gr.set_sm_debug_mode()
for gm20b

Bug 200168107

Change-Id: I40eebbc55b0f82f793fcea90245ae6dad0f5779c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935773
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-27 09:59:11 -08:00
Seshendra Gadagottu
c43053761b gpu: nvgpu: add support for therm gate ctrl
During gpu init, therm gate control is required to
add delay cycles before clock gating.

Bug 1717152

Change-Id: Ifabc428cf7b49e49964dc994eba2c38af4aa1a91
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/936443
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-27 09:53:31 -08:00
Richard Zhao
8fb33d92b0 gpu: nvgpu: vgpu: add channel_set_priority support
- add gops.fifo.channel_set_priority and move current code
  as native callback.
- implement the callback for vgpu

Bug 1701079

Change-Id: If1cd13ea4478d11d578da2f682598e0c4522bcaf
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/932829
Reviewed-by: Aingara Paramakuru <aparamakuru@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-25 15:22:22 -08:00
Seshendra Gadagottu
dad5a6a3c1 gpu: nvgpu: gm20b: correct thermal slowdown factor
With extended mode enable, correct thermal slowdown factors
to have divideby2, divideby4 and divideby8 slowdown.

Bug 1719974

Change-Id: I1e3a3f869657ce7c6409851df0ccd1523a06544b
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/933282
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 17:48:27 -08:00
Konsta Holtta
db7095ce51 gpu: nvgpu: bitmap allocator for comptags
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.

The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.

This commit reverts partially the combination of the above commit and
also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed
a bug which is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is possible,
so that is also restored.

The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.

Bug 200145635

Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-19 17:44:27 -08:00
Deepak Nibade
997f92566a gpu: nvgpu: support masking hww_warp_esr
Add below API pointer to support masking of hww_warp_esr
after hardware read of register and before using it
further
u32 (*mask_hww_warp_esr)(u32 hww_warp_esr)

If needed, this API will mask value of hww_warp_esr
appropriately and return it

Bug 200156699

Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927133
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-12 23:00:59 -08:00
Deepak Nibade
b51ffba58f gpu: nvgpu: API to extract context id
Add new API gr_gk20a_get_ctx_id() to get/extract
context id from GR context

Bug 200156699

Change-Id: If0e8887a9a6b139cd795bf03f5def64fd664d12b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927130
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-12 23:00:08 -08:00
Deepak Nibade
ca76b336b3 gpu: nvgpu: support preprocessing of SM exceptions
Support preprocessing of SM exceptions if API
pointer pre_process_sm_exception() is defined

Also, expose some common APIs

Bug 200156699

Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-12 22:58:32 -08:00
Peter Pipkorn
2b064ce65e gpu: nvgpu: add high priority channel interleave
Interleave all high priority channels between all other channels.
This reduces the latency for high priority work when there
are a lot of lower priority work present, imposing an upper
bound on the latency. Change the default high priority timeslice
from 5.2ms to 3.0 in the process, to prevent long running high priority
apps from hogging the GPU too much.

Introduce a new debugfs node to enable/disable high priority
channel interleaving. It is currently enabled by default.

Adds new runlist length max register, used for allocating
suitable sized runlist.

Limit the number of interleaved channels to 32.

This change reduces the maximum time a lower priority job
is running (one timeslice) before we check that high priority
jobs are running.

Tested with gles2_context_priority (still passes)
Basic sanity testing is done with graphics_submit
(one app is high priority)

Also more functional testing using lots of parallel runs with:
NVRM_GPU_CHANNEL_PRIORITY=3 ./gles2_expensive_draw
 –drawsperframe 20000 –triangles 50 –runtime 30 –finish
plus multiple:
NVRM_GPU_CHANNEL_PRIORITY=2 ./gles2_expensive_draw
–drawsperframe 20000 –triangles 50 –runtime 30 -finish

Previous to this change, the relative performance between
high priority work and normal priority work comes down
to timeslice value. This means that when there are many
low priority channels, the high priority work will still
drop quite a lot. But with this change, the high priority
work will roughly get about half the entire GPU time, meaning
that after the initial lower performance, it is less likely
to get lower in performance due to more apps running on the system.

This change makes a large step towards real priority levels.
It is not perfect and there are no guarantees on anything,
but it is a step forwards without any additional CPU overhead
or other complications. It will also serve as a baseline to
judge other algorithms against.

Support for priorities with TSG is future work.
Support for interleave mid + high priority channels,
instead of just high, is also future work.

Bug 1419900

Change-Id: I0f7d0ce83b6598fe86000577d72e14d312fdad98
Signed-off-by: Peter Pipkorn <ppipkorn@nvidia.com>
Reviewed-on: http://git-master/r/805961
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-11 09:04:01 -08:00
Richard Zhao
a9c6f59539 gpu: nvgpu: enable semaphore acquire timeout
It'll detect dead semaphore acquire. The worst case is when
ACQUIRE_SWITCH is disabled, semaphore acquire will poll and
consume full gpu timeslicees.

The timeout value is set to half of channel WDT.

Bug 1636800

Change-Id: Ida6ccc534006a191513edf47e7b82d4b5b758684
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/928827
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-01-10 20:07:53 -08:00
Terje Bergstrom
9812bd5eea gpu: nvgpu: Control comptagline assignment from kernel
On Maxwell comptaglines are assigned per 128k, but preferred big page
size for graphics is 64k. Bit 16 of GPU VA is used for determining
which half of comptagline is used.

This creates problems if user space wants to map a page multiple times
and to arbitrary GPU VA. In one mapping the page might be mapped to
lower half of 128k comptagline, and in another mapping the page might
be mapped to upper half.

Turn on mode where MSB of comptagline in PTE is used instead of bit 16
for determining the comptagline lower/upper half selection.

Bug 1704834

Change-Id: If87e8f6ac0fc9c5624e80fa1ba2ceeb02781355b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/924322
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-01-05 07:50:02 -08:00
Vijayakumar
851188a5ac gpu: nvgpu: gm20b: use jiffies for wait on PMU
bug 200157852

Change-Id: Ib5ab6ed5f3d8356efd527ce5ff6e4134ac60da7d
Signed-off-by: Vijayakumar <vsubbu@nvidia.com>
Reviewed-on: http://git-master/r/921711
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Shridhar Rasal <srasal@nvidia.com>
2015-12-30 02:44:36 -08:00
Ashutosh Jain
f6eb64fcb5 gpu: nvgpu: Add 3 functions to regops interface.
This change adds the following IOCTLS:
 - NVGPU_GPU_IOCTL_RESUME_FROM_PAUSE
 - NVGPU_GPU_IOCTL_TRIGGER_SUSPEND
 - NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS

Bug 1619430

Change-Id: Iac37d515a753d8b799e631224eae2fa168b43e2c
Signed-off-by: ashutosh jain <ashutoshj@nvidia.com>
Reviewed-on: http://git-master/r/921378
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2015-12-14 08:30:36 -08:00
Amit Sharma
dcf9f05ec6 gpu: nvgpu: make local function static
Fixed the following sparse warning by restricting the scope of
API's within file.
- gr_gk20a.c:7273: warning: symbol 'gr_gk20a_bpt_reg_info' was not declared.
                   Should it be static?
- gr_gm20b.c:1053: warning: symbol 'gr_gm20b_bpt_reg_info' was not declared.
                   Should it be static?

Bug 200088648

Change-Id: I63bba55b1432e4284c9074d2729a176f1767a83a
Signed-off-by: Amit Sharma <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/842260
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-12-14 02:19:44 -08:00
Terje Bergstrom
7afb57e608 gpu: nvgpu: Add grad slowdown HW register masks
Change-Id: I8d64c36d569e79cad3648bad248624290319ac2d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/839367
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
2015-12-08 09:41:10 -08:00
Terje Bergstrom
8cdd72996c gpu: nvgpu: gm20b: Add tile caching registers
Add tile caching related registers to access map.

Bug 1692373

Change-Id: I4516812dd571bed3be2dfa2b210abe3177e794fe
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812354
2015-12-07 09:52:47 -08:00