Commit Graph

200 Commits

Author SHA1 Message Date
Deepak Nibade
8ee3aa4b31 gpu: nvgpu: use common nvgpu mutex/spinlock APIs
Instead of using Linux APIs for mutex and spinlocks
directly, use new APIs defined in <nvgpu/lock.h>

Replace Linux specific mutex/spinlock declaration,
init, lock, unlock APIs with new APIs
e.g
struct mutex is replaced by struct nvgpu_mutex and
mutex_lock() is replaced by nvgpu_mutex_acquire()

And also include <nvgpu/lock.h> instead of including
<linux/mutex.h> and <linux/spinlock.h>

Add explicit nvgpu/lock.h includes to below
files to fix complilation failures.
gk20a/platform_gk20a.h
include/nvgpu/allocator.h

Jira NVGPU-13

Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1293187
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-22 04:15:02 -08:00
Terje Bergstrom
c218fefe84 gpu: nvgpu: Fix unicast register accesses for SM
In two places we used broadcast register as base, but added the
unicast offset to it. This causes the write to go well beyond
valid register range.

Change the broadcast base to use unicast base instead in sequence
to resume a single SM and to record error state of SM.

Bug 200256272

Change-Id: I4ca9af2bb5877dba20ab96575f5094d42949c9e2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry-picked from commit 04177b3414535ce5092c8baeae29883bada9d36c)
Reviewed-on: http://git-master/r/1306331
Reviewed-by: Automatic_Commit_Validation_User
2017-02-17 15:30:58 -08:00
seshendra Gadagottu
521253acb7 gpu: nvgpu: implement chip specific init_elcg_mode
Added function pointer to implement chip specific
init_elcg mode and updated this pointer for legacy chips.

JIRA GV11B-58

Change-Id: I3fff4f771eaa5dad98a3d8166c9127ecd6b745e4
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1300120
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-02-07 15:16:59 -08:00
seshendra Gadagottu
88ce7a98c8 gpu: nvgpu: update zcull and pm context pointers
Update zcull and perfmon buffer pointers in context
header through function pointers.

JIRA GV11B-48

Change-Id: Iaa6dd065128cb0c39e308cecf17b9d68a826d865
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1291850
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-27 12:23:01 -08:00
Laxman Dewangan
9e5208f634 drivers: gpu: nvgpu: Use soc/tegra/fuse.h for fuse header
The fuse headers are unified and moved all the content of
linux/tegra-fuse.h to the soc/tegra/fuse.h to have the
single fuse header for Tegra.

Use unified fuse header soc/tegra/fuse.h.

bug 200260692

Change-Id: Icab3ba5c3dbcd3fa831455c2f336942d356ff5ac
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-on: http://git-master/r/1287498
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2017-01-19 00:04:59 -08:00
Alex Waterman
7989012df2 gpu: nvgpu: Move gm20b HW headers
Move the gm20b HW headers to a new directory specially for them:

  include/nvgpu/hw/gm20b

And change the code to include like so:

  #include <nvgpu/hw/gm20b/hw_fb_gm20b.h>

This is part of the process to restructure the nvgpu driver.

Bug 1799159

Change-Id: I0765e2f6bcd5aa1e803efd250056de3cf9bfa7ed
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244791
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2017-01-11 12:44:14 -08:00
Terje Bergstrom
ea5a214722 gpu: nvgpu: Implement SET_RD_COALESCE
Implement SW method SET_RD_COALESCE to implement correct handling
of texture read coalescing.

Bug 200223870

Change-Id: Icd6f987b72d78e5add4076fc550e2070eba70628
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1271303
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2017-01-05 09:13:30 -08:00
Deepak Nibade
ec102080ab gpu: nvgpu: define common API to write fuses
We use tegra_fuse_control_write() on k4.4 and
tegra_fuse_writel() on previous versions

But gr_gm20b_set_gpc_tpc_mask() currently broken
since we use tegra_fuse_writel() always to
update fuses

Hence define tegra_fuse_control_write() on
previous kernel versions as well and use
it everywhere

Bug 200262155

Change-Id: I116ed77d24018dae21884344373c9eaa1750c2bd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1270168
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-21 08:45:59 -08:00
Deepak Nibade
145225b896 gpu: nvgpu: remove clk writel from TPC FS
To floorsweep any TPC on gm20b, we first have to
set BIT(28) in CLK_RST_CONTROLLER_MISC_CLK_ENB_0
from nvgpu driver

But now this bit is set by default from clock
driver, hence remove clk_writel() from nvgpu
driver

Bug 200262155

Change-Id: I65bc60cb017109bdb882d83637f2a06d27586f18
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1265752
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-12-21 08:45:59 -08:00
seshendra Gadagottu
4a8802eab4 gpu: nvgpu: chip specific channel commit_inst
Add function pointer to add chip specific commit_inst.
Update this function pointer for gk20a and gm20b.

JIRA GV11B-21

Change-Id: Iae7231fae70c7b4f56647fe242776670675de3fd
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1258275
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-30 09:19:23 -08:00
Konsta Holtta
5f6a2aa02b gpu: nvgpu: fix setup_rop_mapping for gm20b+
gm20b_init_gr does not inherit the ops set by gk20a_init_gr_ops, and the
gr.setup_rop_mapping HAL was not set there, so it was not set for chips
that inherit from gm20b_init_gr and do not override it explicitly.

Set the pointer in gm20b_init_gr, which other chips inherit, and delete
the surrounding if condition from the call, making sure that future
users always call it, because there is an implementation since the
earliest supported chip.

Bug 1833382

Change-Id: I7893c9aac7c5c49ce9a55031ea6baa9382a1b7ca
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1258960
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2016-11-29 09:50:21 -08:00
Terje Bergstrom
d29afd2c9e gpu: nvgpu: Fix signed comparison bugs
Fix small problems related to signed versus unsigned comparisons
throughout the driver. Bump up the warning level to prevent such
problems from occuring in future.

Change-Id: I8ff5efb419f664e8a2aedadd6515ae4d18502ae0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1252068
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-16 21:35:36 -08:00
Terje Bergstrom
8fa5e7c58a gpu: nvgpu: Remove IOCTL FREE_OBJ_CTX
We have never used the IOCTL FREE_OBJ_CTX. Using it leads to context
being only partially available, and can lead to use-after-free.

Bug 1834225

Change-Id: I9d2b632ab79760f8186d02e0f35861b3a6aae649
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1250004
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-11 11:47:42 -08:00
Sami Kiminki
f329e674f4 gpu: nvgpu: gk20a: Fix FBP/L2 masks, add GET_FBP_L2_MASKS
Fix FBP and ROP_L2 enable masks for Maxwell+. Deprecate rop_l2_en_mask
in GPU characteristics by adding _DEPRECATED postfix. The array is
too small to hold ROP_L2 enable masks for desktop GPUs.

Add NVGPU_GPU_IOCTL_GET_FBP_L2_MASKS to expose the ROP_L2 masks for
userspace.

Bug 200136909
Bug 200241845

Change-Id: I5ad5a5c09f3962ebb631b8d6e7a2f9df02f75ac7
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/1245294
(cherry picked from commit 0823b33e59defec341ea7919dae4e5f73a36d256)
Reviewed-on: http://git-master/r/1249883
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-11 02:21:04 -08:00
Shardar Shariff Md
cc4208a278 gpu: nvgpu: define fuse macro depend on kernel version
- Define fuse macros depending on kernel version as fuse
offset got changed in K4.4 and for K4.4 fuse defines are
defined in common header file (tegra-fuse.h)
- Use fuse control read/write APIs when reading control
registers for K4.4.

Bug 200243956

Change-Id: I5a86ef58d9de17a273aea8d3ce8ad5772444dac2
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1245824
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
2016-11-11 02:18:40 -08:00
seshendra Gadagottu
d37a573c45 gpu: nvgpu: smid programming
Populate chip specific sm id table.

JIRA GV11B-21

Change-Id: I58869b2c3e55449a7d999ddf73d6eb7b359b2a07
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1227095
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-11-03 09:14:56 -07:00
seshendra Gadagottu
fabe964c76 gpu: nvgpu: chip specific commit global timeslice
Implement chip specific commit_global_timeslice function.

JIRA GV11B-21

Change-Id: I937dda77870f164d034686d6d41482c875940320
Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1243944
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-11-01 11:37:34 -07:00
Seema Khowala
8b051f34fc gpu: nvgpu: add func ptr for gpc exceptions
Add function ptr for enabling gpc exceptions

JIRA GV11B-28
JIRA GV11B-27

Change-Id: I4c7e4300825bf096c22f229ae7196f324ce40037
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1236902
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2016-10-17 14:46:44 -07:00
Seema Khowala
94efd53ed1 gpu: nvgpu: fix zcull programming
There are eight tiles per map tile register and
depending on how many tpcs are present, there is
a chance that s/w will be accessing un-allocated
memory for reading tile values from temp buffers.

Bug 1735760

Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1221580
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-10-14 08:11:20 -07:00
Seema Khowala
d64e201514 gpu: nvgpu: add check for is_fmodel
is_fmodel flag will be set in gk20a_probe().
Updated code for is_fmodel check, instead of
check for supported simulated platforms.

Bug 1735760

Change-Id: I7cbac2196130fe5ce4c1a910504879e6948c13da
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1177869
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
2016-07-27 14:32:54 -07:00
Richard Zhao
4c79e2b236 gpu: nvgpu: make func gm20b_gr_tpc_disable_override global
Bug 200220632

Change-Id: I75f628c54d68bbd06d5d8aeb32b8ee145411b8da
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1185067
(cherry picked from commit 0523a71eaac9dc7751f6e0e7d280b01f3a9e4ea3)
Reviewed-on: http://git-master/r/1189784
Reviewed-by: Thomas Fleury <tfleury@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
2016-07-26 22:55:50 -07:00
Lakshmanan M
6299b00beb gpu: nvgpu: Add multiple engine and runlist support
This CL covers the following modification,
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
   for gm206 GPU family
5) Added generic mechanism to identify the
   CE engine pri_base address for gm206
   (CE0, CE1 and CE2)
6) Removed hard coded engine_id logic and
   made generic way
7) Code cleanup for readability

JIRA DNVGPU-26

Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-07 12:31:34 -07:00
Mahantesh Kumbar
e9d5e7dfca gpu: nvgpu: secure boot HAL update
Updated/added secure boot HAL with methods
required to support multiple GPU chips.

JIRA DNVGPU-10

Change-Id: I343b289f2236fd6a6b0ecf9115367ce19990e7d5
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: http://git-master/r/1151784
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-26 16:04:25 -07:00
Terje Bergstrom
9e908d2a8d gpu: nvgpu: gp10b: Fix SM number when more than 4 TPCs
Use multiplication instead of division to come up with an SM id.

Change-Id: I01b76bf1ba5f64e34b6a283306fcd7687c1302ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1150600
2016-05-20 13:59:34 -07:00
Peter Daifuku
ce0fe5082e gpu: nvgpu: hwpm broadcast register support
Add support for hwpm broadcast registers (ltc and lts)

In gr_gk20a_find_priv_offset_in_buffer, replace "Unknown address type" error
with informational message: gr_gk20a_exec_ctx_ops calls
gk20a_get_ctx_buffer_offsets and if that fails,
calls gr_gk20a_get_pm_ctx_buffer_offsets; HWPM registers will fail the first
call, so an error or warning is overkill.

Bug 1648200

Change-Id: I197b82579e9894652add4ff254418f818981415a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1131365
(cherry picked from commit 9f30a92c5d87f6dadd34cc37396a6b10e3a72751)
Reviewed-on: http://git-master/r/1133628
(cherry picked from commit 7eb7cfd998852ba7f7c4c40d3db286f66e83ab3a)
Reviewed-on: http://git-master/r/1127749
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-19 15:58:24 -07:00
Terje Bergstrom
211edaefb7 gpu: nvgpu: Fix CWD floorsweep programming
Program CWD TPC and SM registers correctly. The old code did not work
when there are more than 4 TPCs.

Refactor init_fs_mask to reduce code duplication.

Change-Id: Id93c1f8df24f1b7ee60314c3204e288b91951a88
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1143697
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-05-16 10:57:48 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Terje Bergstrom
fb9dc629af gpu: nvgpu: Use num LTCs instead of num FBs in GR
When programming GR ZROP and CROP settings for number of active LTCs
we use FB count, which is always 1. Use LTC count instead.

Change-Id: I4d6f9b2b59d44b82ba098b243d6390cf321dbc36
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144698
2016-05-11 08:44:37 -07:00
Richard Zhao
bc72480f8d gpu: nvgpu: add fuse overrides for tpc disabling
- add fuse_override in gops. Implement it starting from gm20b.
- set cwd fs register, so cuda won't use disabled TPCs

Bug 1757262
Bug 200169697

Change-Id: If7bac58bd3a6bcf2925197ea5b7c2d10a77e0933
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
(cherry picked from commit 66cde7724815e9e5e85ab9b07fc985a78530222f)
Reviewed-on: http://git-master/r/1132177
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Tested-by: Adeel Raza <araza@nvidia.com>
2016-05-10 12:29:49 -07:00
Deepak Nibade
771f742703 gpu: nvgpu: add supported preemptions to gpu characteristics
Add below flag fields to gpu characteristics to indicate
supported and default preemption modes on platform for
graphics and compute

__u32 graphics_preemption_mode_flags;
__u32 compute_preemption_mode_flags;
__u32 default_graphics_preempt_mode;
__u32 default_compute_preempt_mode;

Add struct nvgpu_preemption_modes_rec to struct gr_gk20a
to store these values locally

Use platform specific get_preemption_mode_flags() to
get the flags and define gk20a/gm20b specific
get_preemption_mode_flags() API

Bug 1646259

Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133595
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:53 -07:00
Deepak Nibade
d868b65441 gpu: nvgpu: separate IOCTL to set preemption mode
Add separate IOCTL NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE
to allow setting preemption modes from UMD

Define preemption modes in nvgpu.h and use them everywhere
Remove mode definitions from mm_gk20a.h

Also, we support setting only one preemption mode in a channel
But it is possible to have multiple preemption modes (one from
graphics and one from compute) set simultaneously

Hence, update struct gr_ctx_desc to include two separate
preemption modes (graphics_preempt_mode and compute_preempt_mode)

API NVGPU_IOCTL_CHANNEL_SET_PREEMPTION_MODE also supports
setting two separate preemption modes i.e. one for graphics
and one for compute

Make necessary changes in code to support two preemption
modes

Bug 1646259

Change-Id: Ia1dea19e609ba8cc0de2f39ab6c0c4cd6b0a752c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131805
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:29 -07:00
Terje Bergstrom
2db5e4794e gpu: nvgpu: Fix floorsweeping for multi-GPC GPU
There were multiple bugs in dealing with a GPU with more than one
GPC.

* Beta CB size was set to wrong PPC
* TPC mask did not shift fields correctly
* PD skip table used || instead of | operator

Change-Id: I849e2331a943586df16996fe573da2a0ac4cce19
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1132109
2016-04-27 08:12:23 -07:00
Terje Bergstrom
ec62c649b5 gpu: nvgpu: Idle GR before calling PMU ZBC save
On gk20a when PMU is updating ZBC colors it is reading them from L2.
But L2 has one port, and ZBC reads can race with other transactions.
Idle graphics before sending PMU the ZBC_UPDATE request.

Also makes pmu_save_zbc a HAL, because PMU ucode has changes to bypass
this problem on some chips.

Bug 1746047

Change-Id: Id8fcd6850af7ef1d8f0a6aafa0fe6b4f88b5f2d9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129017
2016-04-26 09:46:04 -07:00
Deepak Nibade
5765694f2a gpu: nvgpu: do not include hw_proj_*.h
hw_proj_gk20a.h and hw_proj_gm20b.h should not be
included, hence remove the includes and APIs used
from the header

Use nvgpu_get_litter_value() API to replace use
of header

Bug 200156699

Change-Id: I5e88f71657682dd94ac7f0a45f940b70cf8222e7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1129611
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-23 10:34:30 -07:00
Deepak Nibade
b63c4bced5 gpu: nvgpu: IOCTL to suspend/resume context
Add below IOCTL to suspend/resume a context
NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS:

Suspend sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, suspend all SMs
- otherwise, disable channel/TSG
- enable ctxsw

Resume sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, resume all SMs
- otherwise, enable channel/TSG
- enable ctxsw

Bug 200156699

Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120332
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:45 -07:00
Deepak Nibade
c651adbeaa gpu; nvgpu: IOCTL to write/clear SM error states
Add below IOCTLs to write/clear SM error states

NVGPU_DBG_GPU_IOCTL_CLEAR_SINGLE_SM_ERROR_STATE
NVGPU_DBG_GPU_IOCTL_WRITE_SINGLE_SM_ERROR_STATE

Bug 200156699

Change-Id: I89e3ec51c33b8e131a67d28807d5acf57b3a48fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120330
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:22 -07:00
Deepak Nibade
04e45bc943 gpu: nvgpu: support storing/reading single SM error state
Add support to store error state of single SM before
preprocessing SM exception

Error state is stored as :
struct nvgpu_dbg_gpu_sm_error_state_record {
u32 hww_global_esr;
u32 hww_warp_esr;
u64 hww_warp_esr_pc;
u32 hww_global_esr_report_mask;
u32 hww_warp_esr_report_mask;
}

Note that we can safely append new fields to above
structure in the future if required

Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE
to support reading SM's error state by user space

Bug 200156699

Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:03 -07:00
Terje Bergstrom
6839341bf8 gpu: nvgpu: Add litter values HAL
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.

Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121383
2016-04-15 08:48:20 -07:00
Richard Zhao
e7664f9345 gpu: nvgpu: make tpc_fs_mask work on production board
On production fused boards, it uses gr_fe_tpc_fs_r() to mask TPCs,
rather than fues.

Bug 1734150

Change-Id: I7b4eb428f1ad0cf841a57214e0c8c1e8f17b2c5a
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-on: http://git-master/r/1111630
(cherry picked from commit 869ea54967812e03d9f1e69775ca56fd6459216c)
Reviewed-on: http://git-master/r/1122121
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-14 10:10:24 -07:00
Peter Daifuku
6eeabfbdd0 gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch
Add support for SMPC and HWPM context switching when virtualized

Bug 1648200
JIRASW EVLR-219
JIRASW EVLR-253

Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1122034
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-08 12:34:50 -07:00
Supriya
640cb6642f gpu: nvgpu: LRF, TEX, LTC, DRAM override
- Adding support for FECS mem overrides

Bug 1699676

Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/921253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-26 12:29:55 -08:00
mheyer
de66bf2869 gpu: nvgpu: enable use_full_comp_tag_line in gpc mmu
Also GPC MMU needs to have its PRI_MMU_CTRL_USE_FULL_COMP_TAG_LINE
control bit set.

Bug 1730611

Signed-off-by: Mathias Heyer <mheyer@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Change-Id: I01e11de066ea5487bf1d9c8c8eddbf159e4882da
Reviewed-on: http://git-master/r/1014881
(cherry picked from commit d1651bbebe1b3e46d2173dec1651b3d2f4307b40)
Reviewed-on: http://git-master/r/1017459
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-24 08:00:38 -08:00
Adeel Raza
f0a9ce0469 gpu: nvgpu: SM/TEX exception handling support
Add TEX exception handling support. Also make SM exception handler into
a function pointer, which should allow different chips to implement
their own SM exception handling routine.

Bug 1635727
Bug 1637486

Change-Id: I429905726c1840c11e83780843d82729495dc6a5
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/935329
2016-01-29 14:40:11 -08:00
Deepak Nibade
04092f74bb gpu: nvgpu: set set_sm_debug_mode() for gm20b
Set function pointer gops->gr.set_sm_debug_mode()
for gm20b

Bug 200168107

Change-Id: I40eebbc55b0f82f793fcea90245ae6dad0f5779c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/935773
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-27 09:59:11 -08:00
Deepak Nibade
997f92566a gpu: nvgpu: support masking hww_warp_esr
Add below API pointer to support masking of hww_warp_esr
after hardware read of register and before using it
further
u32 (*mask_hww_warp_esr)(u32 hww_warp_esr)

If needed, this API will mask value of hww_warp_esr
appropriately and return it

Bug 200156699

Change-Id: I1afb1347e650fab607009c1ee55691484653a4c1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/927133
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-12 23:00:59 -08:00
Deepak Nibade
ca76b336b3 gpu: nvgpu: support preprocessing of SM exceptions
Support preprocessing of SM exceptions if API
pointer pre_process_sm_exception() is defined

Also, expose some common APIs

Bug 200156699

Change-Id: I1303642c1c4403c520b62efb6fd83e95eaeb519b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925883
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-01-12 22:58:32 -08:00
Amit Sharma
dcf9f05ec6 gpu: nvgpu: make local function static
Fixed the following sparse warning by restricting the scope of
API's within file.
- gr_gk20a.c:7273: warning: symbol 'gr_gk20a_bpt_reg_info' was not declared.
                   Should it be static?
- gr_gm20b.c:1053: warning: symbol 'gr_gm20b_bpt_reg_info' was not declared.
                   Should it be static?

Bug 200088648

Change-Id: I63bba55b1432e4284c9074d2729a176f1767a83a
Signed-off-by: Amit Sharma <amisharma@nvidia.com>
Reviewed-on: http://git-master/r/842260
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2015-12-14 02:19:44 -08:00
Terje Bergstrom
8cdd72996c gpu: nvgpu: gm20b: Add tile caching registers
Add tile caching related registers to access map.

Bug 1692373

Change-Id: I4516812dd571bed3be2dfa2b210abe3177e794fe
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812354
2015-12-07 09:52:47 -08:00
Terje Bergstrom
1ee25b11c5 gpu: nvgpu: Make access map chip specific
Bug 1692373

Change-Id: Ie3fc3e02fa7b0636da464d6ee1c28da7a4543ec2
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/812353
2015-12-07 09:52:22 -08:00
sujeet baranwal
397c6d44ed gpu: nvgpu: Wait for pause for SMs
SM locking & register reads Order has been changed.
Also,  functions have been implemented based on gk20a
and gm20b.

Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Signed-off-by: ashutosh jain <ashutoshj@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837236
2015-12-04 13:03:11 -08:00