Commit Graph

606 Commits

Author SHA1 Message Date
Thomas Fleury
d150daf75e gpu: nvgpu: add init_preemption_state gr method
This method is called when setting up gr
hardware. It is meant to adjust preemption
parameters.

Bug 1593548
Jira VFND-1894

Change-Id: I0f5aa3212bec3058a0493366bed6fe2a365c9542
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162625
(cherry picked from commit c2e6d12570af28b3aae087401d7f670df40d40bd)
Reviewed-on: http://git-master/r/1166987
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-28 10:00:07 -07:00
Konsta Holtta
f438c66598 gpu: nvgpu: force clean patch ctx begin/end
This patch_context map/unmap pair has become a mere wrapper for the more
general gk20a_mem_{begin,end}(). To be consistent about mappings,
require that each patch_write is surrounded by an explicit begin/end
pair, instead of relying on possible inefficient per-write map/unmap.

Remove also the cpu_va check from .._write_end() since the buffers may
be exist in vidmem without a cpu mapping.

JIRA DNVGPU-24

Change-Id: Ia05d52d3d712f2d63730eedc078845fde3e217c1
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157298
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-22 08:30:05 -07:00
Lakshmanan M
6299b00beb gpu: nvgpu: Add multiple engine and runlist support
This CL covers the following modification,
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt handling support
   for gm206 GPU family
5) Added generic mechanism to identify the
   CE engine pri_base address for gm206
   (CE0, CE1 and CE2)
6) Removed hard coded engine_id logic and
   made generic way
7) Code cleanup for readability

JIRA DNVGPU-26

Change-Id: I2c3846c40bcc8d10c2dfb225caa4105fc9123b65
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1155963
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-07 12:31:34 -07:00
Deepak Nibade
c16d985c8a gpu: nvgpu: return if no fecs intr
In gk20a_gr_handle_fecs_error(), if we do not see
any error interrupt from gr_fecs_host_int_status_r(),
just return immediately

Bug 1646259

Change-Id: Iea037e0dab57111d2a0fb41c5c19529b7d6c83c0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1158591
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-06 11:01:00 -07:00
Terje Bergstrom
1d2e66540a gpu: nvgpu: Fix calculation of timeout
Fix calculation of timeout in multiple places. The #defines
GR_IDLE_CHECK_DEFAULT and GR_IDLE_CHECK_MAX are meant to be used
only for defining the frequency of checking for timeout. Using them
for actual timeouts makes the timeout really short.

Change-Id: I3d0f8cbc91d619be8e5a9168ee1ab1d6298f129b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1158269
2016-06-05 20:44:19 -07:00
Terje Bergstrom
3b566957fe gpu: nvgpu: Add context reset at golden context init
Part of golden context initialization is in powerup sequence, and
part done as part of first channel creation. The sequence is
missing a context reset, which causes initialization of golden
context to fail on dGPU.

Just moving the code to golden context initialization does not work,
because iGPU can be rail gated, and part of the sequence is required
in GPU boot.

Thus a part of context initialization is replicated to golden context
init after a context reset.

Change-Id: Ife1b167447018317d3a692b706880e0eda073e43
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1130698
2016-06-04 15:37:42 -07:00
Deepak Nibade
31fa6773ef gpu: nvgpu: use correct APIs for disable and preempt
In gr_gk20a_ctx_zcull_setup(), gr_gk20a_update_smpc_ctxsw_mode(),
and in gk20a_channel_suspend(), we call channel specific APIs
to disable/preempt/enable channel
But we do not consider TSGs in this case

Hence use correct (below) APIs in above functions which
will handle channel or TSG internally :
gk20a_disable_channel_tsg()
gk20a_fifo_preempt()
gk20a_enable_channel_tsg()

Bug 200205041

Change-Id: Ieed378dac4ad2322b35f9102706176ec326d386c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157189
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-06-01 13:03:56 -07:00
Deepak Nibade
6ff0d4e6eb gpu: nvgpu: disable/preempt TSG for hwpm update
To update hwpm, we currently disable/preempt only one
channel without considering if channel could be part
of a TSG

Hence, use proper APIs to disable/preempt/enable which
will internally handle channel/TSG case

Bug 200203191

Change-Id: I329a3c02d635265775f2081abba8e047f491fe7d
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1155838
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-30 17:35:18 -07:00
Terje Bergstrom
20c65d8f4a gpu: nvgpu: Use correct IO register pointer
Use correct IO register pointer in cyclestats code. The code used
reg_mem which is not supposed to be used. It is defined only on iGPU.

Change-Id: I03cdaf5d2add2bf2c7cc6d7b3c41ac3be0f9a768
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1154708
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com>
Reviewed-by: Ken Adams <kadams@nvidia.com>
2016-05-27 11:35:51 -07:00
Terje Bergstrom
c8103a39ad gpu: nvgpu: Do not reallocate pes_tpc_mask array
We allocated a new pes_tpc_mask for each PES on each GPC. This
causes us to forget masks for all GPCs but the last one.

Change-Id: I825788ad75333d4aecd93c78d1b277c0d9d65f15
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1152703
2016-05-25 13:16:10 -07:00
Konsta Holtta
3e431e26c5 gpu: nvgpu: add PRAMIN support for mem accessors
To support vidmem, implement a way to access buffers via the PRAMIN
window instead of just kernel-mapped sysmem buffers for iGPU as of now.
Depending on the buffer aperture, choose between the two access types in
the buffer memory accessor functions.

vmap()/vunmap() pairs are no-ops for buffers that can't be cpu-mapped.

Two uses of DMA_ATTR_READ_ONLY are removed in the ucode loading path to
support writing to them too via the indirection in addition to cpu.

JIRA DNVGPU-23

Change-Id: I282dba6741c6b8224bc12e69c1fb3936bde7e6ed
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1141314
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-24 12:39:06 -07:00
Sri Krishna chowdary
e82c840119 gpu: nvgpu: fix compilation issues
compiling kernel with clang pointed out below issues in nvgpu.
Fixing them.

gr_gk20a.c:1185:12: error: stack frame size of 3152 bytes in
function 'gr_gk20a_setup_alpha_beta_tables'

cde_gk20a.c:1376:22: error: duplicate 'const' declaration
cde_gk20a.c:1377:22: error: duplicate 'const' declaration
cde_gk20a.c:1378:22: error: duplicate 'const' declaration

ctxsw_trace_gk20a.c:71:19: error: unused function 'ring_space'

platform_gk20a_tegra.c:55:19: error: unused function 'pmc_read'
platform_gk20a_tegra.c:60:20: error: unused function 'pmc_write'

bug 1745660

Change-Id: I8cd4383cb898307bbeb162ca00b3e20d04de2c90
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Reviewed-on: http://git-master/r/1150486
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-24 10:52:56 -07:00
Terje Bergstrom
a21dcf0bc6 gpu: nvgpu: Enable CE in GR reset
Enable GRCE when enabling GR. Also use the reset mask read from
device info instead of using the hard coded value.

Change-Id: I4812c32d09ea8b5e07abd1b2c6f1efdbe00cb36e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1149359
2016-05-20 14:21:32 -07:00
Peter Daifuku
ce0fe5082e gpu: nvgpu: hwpm broadcast register support
Add support for hwpm broadcast registers (ltc and lts)

In gr_gk20a_find_priv_offset_in_buffer, replace "Unknown address type" error
with informational message: gr_gk20a_exec_ctx_ops calls
gk20a_get_ctx_buffer_offsets and if that fails,
calls gr_gk20a_get_pm_ctx_buffer_offsets; HWPM registers will fail the first
call, so an error or warning is overkill.

Bug 1648200

Change-Id: I197b82579e9894652add4ff254418f818981415a
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1131365
(cherry picked from commit 9f30a92c5d87f6dadd34cc37396a6b10e3a72751)
Reviewed-on: http://git-master/r/1133628
(cherry picked from commit 7eb7cfd998852ba7f7c4c40d3db286f66e83ab3a)
Reviewed-on: http://git-master/r/1127749
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-19 15:58:24 -07:00
Terje Bergstrom
211edaefb7 gpu: nvgpu: Fix CWD floorsweep programming
Program CWD TPC and SM registers correctly. The old code did not work
when there are more than 4 TPCs.

Refactor init_fs_mask to reduce code duplication.

Change-Id: Id93c1f8df24f1b7ee60314c3204e288b91951a88
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1143697
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
2016-05-16 10:57:48 -07:00
Terje Bergstrom
773b3f2034 gpu: nvgpu: Do not program max ways evict
Setting max_ways_evict reserves some of L2 for CB. In gk20a CB is in
dedicated RAM, so we don't need to reserve space for it.

The code gets invoked only on gk20a.

Change-Id: Ib8efec8c5e90c135bd0c10bb1eaa3f797ec68698
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144993
2016-05-13 16:07:00 -07:00
Terje Bergstrom
67535de642 gpu: nvgpu: Do not enable L2 bit CYA15
Enabling L2 CYA15 is necessary only in another GPU for enabling
an HW fix. gk20a does not have this problem, so enabling CYA15
is not necessary.

Change-Id: I7318e8541ad392f9a34f3650beac05a39d7bba68
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1146086
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:21:22 -07:00
Konsta Holtta
6eebc87d99 gpu: nvgpu: refactor gk20a_mem_{wr,rd} for vidmem
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.

gk20a_mem_{rd,wr}32() work as previously; add also gk20a_mem_{rd,wr}()
for byte-indexed accesses, gk20a_mem_{rd,wr}_n() for memcpy()-like
functionality, and gk20a_memset() for filling buffers with a constant.
The 8 and 16 bit accessor functions are removed.

vmap()/vunmap() pairs are abstracted to gk20a_mem_{begin,end}() to
support other types of mappings or conditions where mapping the buffer
is unnecessary or different.

Several function arguments that would access these buffers are also
changed to take a mem_desc instead of a plain cpu pointer. Some relevant
occasions are changed to use the accessor functions instead of cpu
pointers without them (e.g., memcpying to and from), but the majority of
direct accesses will be adjusted later, when the buffers are moved to
support vidmem.

JIRA DNVGPU-23

Change-Id: I3dd22e14290c4ab742d42e2dd327ebeb5cd3f25a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121143
Reviewed-by: Ken Adams <kadams@nvidia.com>
Tested-by: Ken Adams <kadams@nvidia.com>
2016-05-13 07:11:33 -07:00
Peter Daifuku
911f14e1f4 gpu: nvgpu: use correct ltc cnt in hwpm offset map
Fix the offset calculation code to save/restore proper number of LTCs
to conform to FECS microcode

Bug 1736910

Change-Id: I46ae1e0b45ffe354693db6ceb0aeeeac3d344b34
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1133896
(cherry picked from commit 97751ee7a44537f5aa5198ac362d9ee416adc172)
Reviewed-on: http://git-master/r/1146244
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-12 08:41:54 -07:00
Richard Zhao
bc72480f8d gpu: nvgpu: add fuse overrides for tpc disabling
- add fuse_override in gops. Implement it starting from gm20b.
- set cwd fs register, so cuda won't use disabled TPCs

Bug 1757262
Bug 200169697

Change-Id: If7bac58bd3a6bcf2925197ea5b7c2d10a77e0933
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
(cherry picked from commit 66cde7724815e9e5e85ab9b07fc985a78530222f)
Reviewed-on: http://git-master/r/1132177
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Tested-by: Adeel Raza <araza@nvidia.com>
2016-05-10 12:29:49 -07:00
Deepak Nibade
771f742703 gpu: nvgpu: add supported preemptions to gpu characteristics
Add below flag fields to gpu characteristics to indicate
supported and default preemption modes on platform for
graphics and compute

__u32 graphics_preemption_mode_flags;
__u32 compute_preemption_mode_flags;
__u32 default_graphics_preempt_mode;
__u32 default_compute_preempt_mode;

Add struct nvgpu_preemption_modes_rec to struct gr_gk20a
to store these values locally

Use platform specific get_preemption_mode_flags() to
get the flags and define gk20a/gm20b specific
get_preemption_mode_flags() API

Bug 1646259

Change-Id: I80193c0d988dc93bd96585f9aa631fd817f4dfa3
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1133595
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-05-09 13:16:53 -07:00
Terje Bergstrom
f14152c081 gpu: nvgpu: Support 3 PEs per GPC
Old code maxed at 2 PEs per GPC. Support 3 PEs and add code to make
sure we will get a warning if hardware supports more than that.

JIRA DNVGPU-6

Change-Id: Id6061567bad20474f4b4a7a0959be3426e5e4828
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1142440
2016-05-09 11:57:47 -07:00
Terje Bergstrom
ec62c649b5 gpu: nvgpu: Idle GR before calling PMU ZBC save
On gk20a when PMU is updating ZBC colors it is reading them from L2.
But L2 has one port, and ZBC reads can race with other transactions.
Idle graphics before sending PMU the ZBC_UPDATE request.

Also makes pmu_save_zbc a HAL, because PMU ucode has changes to bypass
this problem on some chips.

Bug 1746047

Change-Id: Id8fcd6850af7ef1d8f0a6aafa0fe6b4f88b5f2d9
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1129017
2016-04-26 09:46:04 -07:00
Deepak Nibade
5765694f2a gpu: nvgpu: do not include hw_proj_*.h
hw_proj_gk20a.h and hw_proj_gm20b.h should not be
included, hence remove the includes and APIs used
from the header

Use nvgpu_get_litter_value() API to replace use
of header

Bug 200156699

Change-Id: I5e88f71657682dd94ac7f0a45f940b70cf8222e7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1129611
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-23 10:34:30 -07:00
Deepak Nibade
b63c4bced5 gpu: nvgpu: IOCTL to suspend/resume context
Add below IOCTL to suspend/resume a context
NVGPU_DBG_GPU_IOCTL_SUSPEND_RESUME_CONTEXTS:

Suspend sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, suspend all SMs
- otherwise, disable channel/TSG
- enable ctxsw

Resume sequence :
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, resume all SMs
- otherwise, enable channel/TSG
- enable ctxsw

Bug 200156699

Change-Id: Iacf1bf7877b67ddf87cc6891c37c758a4644b014
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120332
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:45 -07:00
Deepak Nibade
c651adbeaa gpu; nvgpu: IOCTL to write/clear SM error states
Add below IOCTLs to write/clear SM error states

NVGPU_DBG_GPU_IOCTL_CLEAR_SINGLE_SM_ERROR_STATE
NVGPU_DBG_GPU_IOCTL_WRITE_SINGLE_SM_ERROR_STATE

Bug 200156699

Change-Id: I89e3ec51c33b8e131a67d28807d5acf57b3a48fd
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120330
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:22 -07:00
Deepak Nibade
04e45bc943 gpu: nvgpu: support storing/reading single SM error state
Add support to store error state of single SM before
preprocessing SM exception

Error state is stored as :
struct nvgpu_dbg_gpu_sm_error_state_record {
u32 hww_global_esr;
u32 hww_warp_esr;
u64 hww_warp_esr_pc;
u32 hww_global_esr_report_mask;
u32 hww_warp_esr_report_mask;
}

Note that we can safely append new fields to above
structure in the future if required

Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE
to support reading SM's error state by user space

Bug 200156699

Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-19 08:07:03 -07:00
Terje Bergstrom
7d8e219389 gpu: nvgpu: Use sysmem aperture for SoC memory
In Tegra GPU, SoC memory has to be accessed as vidmem. In discrete GPU, it
has to be accessed as sysmem.

Change-Id: I4efe71b54a9a32f0bf1f02ec4016ed74405a14c5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120468
2016-04-15 08:50:34 -07:00
Terje Bergstrom
6839341bf8 gpu: nvgpu: Add litter values HAL
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.

Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121383
2016-04-15 08:48:20 -07:00
Konsta Holtta
6433963801 gpu: nvgpu: clean up ctx_vars properly
Set ctx_vars.valid to false when removing support.  Otherwise a
re-poweron sequence could crash when the flag wouldn't match the real
state of the driver.

Also free all allocated regs instead of leaking some of them.

Change-Id: I3fc4fa759d839bc435e53cbd942fa5d39efe7f57
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1126138
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-14 08:43:16 -07:00
Konsta Holtta
f22df728de gpu: nvgpu: free global ctx bufs only if allocated
Try to free only allocated buffers in
gr_gk20a_free_global_ctx_buffers(), otherwise the destroy function
pointer would be NULL and crash for nonallocated buffers. This can
happen when init fails for some of the buffers. Additionally, make the
pointer NULL when a buffer is destroyed, to signify this state.

Also refactor the function upwards and call it from
gr_gk20a_alloc_global_ctx_buffers() to reduce code duplication.

Change-Id: I6e74795014f5e315b5f8342f544ddfccc0d02b71
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1126026
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-14 08:42:35 -07:00
Konsta Holtta
289cb4d9bd gpu: nvgpu: move cs_lock init to happen always
The cs_lock cyclestats mutex is unconditionally taken when removing cs
support, but it wouldn't be initialized if some part of gr init would
fail before it. Move it up to happen first, before other inits.

Change-Id: Ia5d7a888c29dc99728630a07698b1ed25af960c2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1126004
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-14 08:42:05 -07:00
Terje Bergstrom
9b5427da37 gpu: nvgpu: Support GPUs with no physical mode
Support GPUs which cannot choose between SMMU and physical
addressing.

Change-Id: If3256fa1bc795a84d039ad3aa63ebdccf5cc0afb
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120469
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
2016-04-13 13:12:41 -07:00
Peter Daifuku
6eeabfbdd0 gpu: nvgpu: vgpu: virtualized SMPC/HWPM ctx switch
Add support for SMPC and HWPM context switching when virtualized

Bug 1648200
JIRASW EVLR-219
JIRASW EVLR-253

Change-Id: I80a1613eaad87d8510f00d9aef001400d642ecdf
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1122034
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-08 12:34:50 -07:00
Terje Bergstrom
e8bac374c0 gpu: nvgpu: Use device instead of platform_device
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.

Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
2016-04-08 09:42:41 -07:00
Thomas Fleury
11f11a6c0d gpu: nvgpu: fix zcb_count
gr_gk20a_init_gr_config may be invoked several times
if gr->sw_ready is reset. Need to clear zcb_count before
summing gpc_zcb_count[gpc] to avoid ever incrementing count

Change-Id: If3bfa47ed807a3e9a5dc31f7f7f96f0c6e1fed08
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1120772
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 13:17:30 -07:00
Peter Daifuku
37155b65f1 gpu: nvgpu: support for hwpm context switching
Add support for hwpm context switching

Bug 1648200

Change-Id: I482899bf165cd2ef24bb8617be16df01218e462f
Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com>
Reviewed-on: http://git-master/r/1120450
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 11:05:49 -07:00
Deepak Nibade
16658fd39d gpu: nvgpu: post BPT_INT/PAUSE and BLOCKING_SYNC events
Post EVENT_ID_BPT_INT when bpt.int is pending
Post EVENT_ID_BPT_PAUSE when bpt.pause is pending
Post EVENT_ID_BLOCKING_SYNC whenever there is
non-stalling semaphore interrupt indicating work
completion from GR/CE2 engine

Bug 200089620

Change-Id: I91b7bf48f8585f0d318298fc0c4a66d42055f0a7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1112274
(cherry picked from commit d2b744b1f9acac56435cd7e7ab9a7a845579ef24)
Reviewed-on: http://git-master/r/1120321
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-04-07 08:45:47 -07:00
Deepak Nibade
e87ba53235 gpu: nvgpu: add channel event id support
With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can
raise events to User space. But user space
cannot distinguish between various types of events.
To overcome this, we need finer-grained API to
deliver various events to user space.

Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL,
and all the support for this API (we can remove
this since User space has not started using this
API at all)

Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL
which will accept an event_id (like BPT.INT or
BPT.PAUSE), a command to enable the event,
and return a file descriptor on which
we can raise the event (if cmd=enable)
Event is disabled when file descriptor is closed

Add file operations "gk20a_event_id_ops"
to support polling on event fd

Also add API gk20a_channel_get_event_data_from_id()
to get event_data of event from its id

Bug 200089620

Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030775
(cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275)
Reviewed-on: http://git-master/r/1120318
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
2016-04-07 08:43:49 -07:00
Vishal Annapurve
cd29c45e67 gpu: nvgpu: Fix compilation with CONFIG_DEBUG_FS disabled
This change fixes issues with kernel compilation when
CONFIG_DEBUG_FS is disabled.

Bug 1737085

Change-Id: I74719674d07ae071e3df99b0dda249b54173f40b
Signed-off-by: Vishal Annapurve <vannapurve@nvidia.com>
Reviewed-on: http://git-master/r/1024167
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sandeep Trasi <strasi@nvidia.com>
2016-03-29 02:00:58 -07:00
Anton Vorontsov
1c40d09c4c gpu: nvgpu: Add support for FECS ctxsw tracing
bug 1648908

This commit adds support for FECS ctxsw tracing. Code is compiled
conditionnaly under CONFIG_GK20_CTXSW_TRACE.
This feature requires an updated FECS ucode that writes one record to a ring
buffer on each context switch. On RM/Kernel side, the GPU driver reads records
from the master ring buffer and generates trace entries into a user-facing
VM ring buffer. For each record in the master ring buffer, RM/Kernel has
to retrieve the vmid+pid of the user process that submitted related work.

Features currently implemented:
- master ring buffer allocation
- debugfs to dump master ring buffer
- FECS record per context switch (with both current and new contexts)
- dedicated device for ctxsw tracing (access to VM ring buffer)
- SOF generation (and access to PTIMER)
- VM ring buffer allocation, and reconfiguration
- enable/disable tracing at user level
- event-based trace filtering
- context_ptr to vmid+pid mapping
- read system call for ctxsw dev
- mmap system call for ctxsw dev (direct access to VM ring buffer)
- poll system call for ctxsw dev
- save/restore register on ELPG/CG6
- separate user ring from FECS ring handling

Features requiring ucode changes:
- enable/disable tracing at FECS level
- actual busy time on engine (bug 1642354)
- master ring buffer threshold interrupt (P1)
- API for GPU to CPU timestamp conversion (P1)
- vmid/pid/uid based filtering (P1)

Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1022737
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-23 07:48:47 -07:00
Seshendra Gadagottu
471c14f76e gpu: nvgpu: update slcg/blcg prod setiings
Add following missing prod settings:
  blcg bus
  blcg ce
  slcg ce2
  slcg chiplet
  slcg gr

Bug 1689806

Change-Id: Ic7c9afdb1fc47ad71ca326384f5d2a4528121abe
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/1030987
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-14 20:15:17 -07:00
Terje Bergstrom
d2e5eaf359 gpu: nvgpu: Reset channel on SM exception
If we receive an exception without debugger attached, trigger
a fault recovery.

Change-Id: I8c02e37eb7fb0cba2fcb7afed7beb26b86f38d9e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1026003
(cherry picked from commit 526eef512eaed1c6472677eddec051541a939d63)
Reviewed-on: http://git-master/r/1026002
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
2016-03-08 10:51:30 -08:00
Supriya
9670bf5948 gpu: nvgpu: gk20a: FECS BL checksum
Update FECS BL checksum

Bug 200149721

Change-Id: Icebcf9c0440e88f9018f514804b1e0eeaa7c89cb
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/826772
(cherry picked from commit 634363dc33bc23bf81cee319e68d6dbc8e29a53c)
Reviewed-on: http://git-master/r/1026001
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-08 10:51:08 -08:00
Thomas Fleury
4331166afd gpu: nvgpu: disable ELPG while accessing gr_gpcs_tpcs_sm_sch_macro_sched_r
bug 200139995

Any GR register access should disable ELPG and clock gating before
access and enable it back after it is done. Disable ELPG while tweaking
perf parameters in gk20a_alloc_obj_ctx.

Also output NV_PBUS_INTR_0 in case of interrupt (including fix to
display correct value on pbus isr).

Change-Id: I81d2eb4461e92fbb33db8554779f6566f6b002c1
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/835307
(cherry picked from commit 6acc35bd1bcc706fbde8d11521cf1d0f64a16fe4)
Reviewed-on: http://git-master/r/921299
(cherry picked from commit 73afd520445bb1f4757fd167b38289143fd46d80)
Reviewed-on: http://git-master/r/930040
(cherry picked from commit 7a784ebea0dd60a88469f51eaa61c33b356e499c)
Reviewed-on: http://git-master/r/1023529
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-03-03 14:22:34 -08:00
Supriya
640cb6642f gpu: nvgpu: LRF, TEX, LTC, DRAM override
- Adding support for FECS mem overrides

Bug 1699676

Change-Id: I6c9ddcd98d57b29059513ee508c6f92b194c4fc7
Signed-off-by: Supriya <ssharatkumar@nvidia.com>
Reviewed-on: http://git-master/r/921253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-26 12:29:55 -08:00
Ashutosh Jain
e55a459e2b gpu: nvgpu: post events on all channels of TSG.
Raise the SM exception event on dbg fds of all
channels as userspace might have registered on
only one of the channels.
WAR till we fix Bug 200089620

Bug 1724367

Change-Id: I69c20ee9837927c116f350f4bdc70af5e90cd0a8
Signed-off-by: Ashutosh Jain <ashutoshj@nvidia.com>
Reviewed-on: http://git-master/r/1012851
(cherry picked from commit 92f7086856bc9e23b39c5f3ceec3130b6407e0d1)
Reviewed-on: http://git-master/r/1013813
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Tested-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-23 08:02:46 -08:00
Adeel Raza
d3bd5adfca gpu: nvgpu: always handle gr exception
Always handle gr exception regardless of whether the SM debugger is
attached or not.

Bug 1699676

Change-Id: If98ab6948c42d3fb1e4f02d54db12745485b0607
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/1013164
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-19 12:51:30 -08:00
Adeel Raza
21605d09a5 gpu: nvgpu: add create_gr_sysfs() function pointer
Add create_gr_sysfs() function pointer for creating gr specific sysfs
nodes.

Bug 1699676

Change-Id: I0a14d3676ebfcd5adebce673e46bdaad8d6aecf7
Signed-off-by: Adeel Raza <araza@nvidia.com>
Reviewed-on: http://git-master/r/1008658
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-19 12:51:01 -08:00
Deepak Nibade
7d7033d831 gpu: nvgpu: return error for handled intr only
In gk20a_gr_handle_fecs_error(), we always return error
value and that triggers recovery in each case

Return error only if we need to trigger recovery
(depending on case)
Otherwise, clear the interrupt and return success

Bug 200156699

Change-Id: I117f3702b751e8bbc1cd3834b1b72b6533e246f9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1001694
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
2016-02-05 12:45:50 -08:00