- Define fuse macros depending on kernel version as fuse
offset got changed in K4.4 and for K4.4 fuse defines are
defined in common header file (tegra-fuse.h)
- Use fuse control read/write APIs when reading control
registers for K4.4
Bug 200243956
Change-Id: I34dabd1a307d10010cb89ac6a5f1e3f5b177c0fc
Signed-off-by: Shardar Shariff Md <smohammed@nvidia.com>
Reviewed-on: http://git-master/r/1245825
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
GVS: Gerrit_Virtual_Submit
Event notifications on TSGs should only be sent to the channel that caused the
event to happen in the first place, not evey channel in the tsg. Any more and
the debugger will not be able to tell what channel actually got the event.
Worse yet, if all the channels in a tsg are bound to the same debug session
(as is the case with cuda-gdb), then multiple nvgpu events for the same gpu
event will be triggered, causing events to be buffered and the client to get
out of sync.
One gpu exception, one nvgpu event per tsg.
Bug 1793988
Change-Id: Iee36c774f193554ffb9ab7c1650ee0610e476a99
Signed-off-by: Cory Perry <cperry@nvidia.com>
Reviewed-on: http://git-master/r/1194206
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Added preemption mode (WFI, GFXP, CTA and CILP) support for gp10x
family gr class (PASCAL_B and PASCAL_COMPUTE_B).
Bug 200221149
Change-Id: Ia8b781c5baedba660db5997f190a0b363286ed7f
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1193209
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
For devices that have vidmem available, use the vidmem allocator in
gk20a_gmmu_alloc{,attr,_map,_map_attr}. For others, use sysmem.
Because all of the buffers haven't been tested to work in vidmem yet,
rename calls to gk20a_gmmu_alloc{,attr,_map,_map_attr} to have _sys at
the end to declare explicitly that vidmem is used. Enabling vidmem for
each now is a matter of removing "_sys" from the function call.
Jira DNVGPU-18
Change-Id: I4a67eae403f1d9d271118c35e3775b1129170676
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1176806
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Propagate the buffer aperture flag in gk20a_locked_gmmu_map up so that
buffers represented as a mem_desc and present in vidmem can be mapped to
gpu.
JIRA DNVGPU-18
JIRA DNVGPU-76
Change-Id: Icd675e83e3c28836f0ed8880425748697713bb0a
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1169296
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Set NV_PGRAPH_PRI_FE_GFXP_WFI_TIMEOUT from the
default of ~20us to ~100us. Also set
NV_PGRAPH_DEBUG_2_GFXP_WFI_ALWAYS_INJECTS_WFI o
avoid going into GFXP all the time.
Bug 1593548
Jira VFND-1894
Change-Id: I6310c3605f7b83178c38de88788d87e36ee428b4
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1162629
(cherry picked from commit 873ddc7288063b1773d31a5bda30d980122d6645)
Reviewed-on: http://git-master/r/1166988
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
From this patch onwards, runlist_id is a member of
struct channel_gk20a. So removed hard coded
runlist_id mapping logic.
JIRA DNVGPU-25
Change-Id: Ib87d96a518a490d4167071708a76100a4d4c02dd
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1161776
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
This CL covers the following modification,
1) Added multiple engine_info support
2) Added multiple runlist_info support
3) Initial changes for ASYNC CE support
4) Added ASYNC CE interrupt support for
Pascal GPU series
5) Removed hard coded engine_id logic and
made generic way
6) Code cleanup for readability
JIRA DNVGPU-26
Change-Id: Ibf46a89a5308c82f01040ffa979c5014b3206f8e
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: http://git-master/r/1156022
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
The per-write map/unmap feature from gr_gk20a_ctx_patch_write_begin() is
dropped, so call begin/end explicitly from gr_gp10b_set_preemption_mode
for the commit_global_cb_manager call.
Change-Id: I7bf952fffb54d4f18706e77dea015ffe4b68bcfe
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1157835
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
In gr_gp10b_set_preemption_mode() and in gp10b_fifo_resetup_ramfc(),
we call channel specific APIs to disable/preempt/enable channel
But we do not consider TSGs in this case
Hence use correct (below) APIs in above function which
will handle channel or TSG internally :
gk20a_disable_channel_tsg()
gk20a_fifo_preempt()
gk20a_enable_channel_tsg()
Bug 200205041
Change-Id: I2369e79b2af3b8a91699044106293865d5f8f260
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1157192
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
* Moving jiffy counter after preemption work to more accurately and fairly give
time for preemption to complete.
* Add debug information to coordinate waiting.
* Check if cilp is still pending before returning the timedout error.
Bug 1700310
Change-Id: Ic16bb3b11f2cd5aea9a5a85b5e0d9927732a065c
Signed-off-by: Cory Perry <cperry@nvidia.com>
Reviewed-on: http://git-master/r/1151907
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Use multiplication instead of division to come up with an SM id.
Change-Id: Ib185970ee99cc8c010d02ba846229e0959a5fef3
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1150599
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Program CWD TPC and SM registers correctly. The old code did not work
when there are more than 4 TPCs.
Change-Id: I18a14a0f76d97b0962607ec0bbd71aafcd768bca
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1143075
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
To support vidmem, pass g and mem_desc to the buffer memory accessor
functions. This allows the functions to select the memory access method
based on the buffer aperture instead of using the cpu pointer directly
(like until now). The selection and aperture support will be in another
patch; this patch only refactors these accessors, but keeps the
underlying functionality as-is.
JIRA DNVGPU-23
Change-Id: I21d4a54827b0e2741012dfde7952c0555a583435
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1121914
GVS: Gerrit_Virtual_Submit
Reviewed-by: Ken Adams <kadams@nvidia.com>
Separate out new API gr_gp10b_set_ctxsw_preemption_mode()
which will check requested preemption modes and take appropriate
action for each preemption mode
This API will also do some sanity checking for valid
preemption modes and combinations
Define API set_preemption_mode() for gp10b which will set the
preemption modes passed as argument and then use
gr_gp10b_set_ctxsw_preemption_mode() and
update_ctxsw_preemption_mode() to update preemption mode
Legacy path from gr_gp10b_alloc_gr_ctx() will convert
flags NVGPU_ALLOC_OBJ_FLAGS_* into appropriate preemption modes
and then call gr_gp10b_set_ctxsw_preemption_mode()
New API set_preemption_mode() will use new flags
NVGPU_GRAPHICS/COMPUTE_PREEMPTION_MODE_* and set and update
ctxsw preemption mode
In gr_gp10b_update_ctxsw_preemption_mode(), update graphics
context to set CTA premption mode if mode
NVGPU_COMPUTE_PREEMPTION_MODE_CTA is set
Also, define preemption modes in nvgpu-t18x.h
and use them everywhere
Remove old definitions of modes from gr_gp10b.h
Bug 1646259
Change-Id: Ib4dc1fb9933b15d32f0122a9e52665b69402df18
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1131806
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add API gr_gp10b_suspend_contexts() to support context
suspend on gp10b
sequence to suspend:
- disable ctxsw
- loop through list of channels
- if channel is ctx resident, suspend all SMs
- if CILP channel, set CILP preempt pending = true
- resume all SMs
- otherwise, disable channel/TSG
- enable ctxsw
- if CILP preempt is pending, wait for it to complete
Bug 200156699
Change-Id: Id9609077c283f99f420ad21c636b29f74b8eff6b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120334
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.
Change-Id: I8bda9bf99b2cc6aba0fb88a69cc374e0a6abab6b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121384
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Remove posting of events using old channel event API i.e.
gk20a_channel_post_event()
Also, update gk20a_channel_semaphore_wakeup() to post
events when called from ce2_nonblockpipe_isr()
Bug 200089620
Change-Id: I677cdab11183a649663ff9272a527c63b9994430
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1112275
(cherry picked from commit 4840efda393cd5928f1a8463db8b52cc586860bc)
Reviewed-on: http://git-master/r/1120289
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
fix below sparse warning :
drivers/gpu/nvgpu/gp10b/gr_gp10b.c:1364:5: warning: symbol
'gr_gp10b_pre_process_sm_exception' was not declared. Should it be
static?
Bug 200088648
Change-Id: Ie55ffc12eb653b10358001e2aef8766562fd0df9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1009938
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
While posting CILP preemption complete event to
user space, raise the event to all channels of TSG
(if channel is part of TSG)
This is a WAR until we have proper sync mechanism
with user space to raise CILP events
Bug 200156699
Change-Id: Ieedc866498a8c5464cf65962257a803b37da6826
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1001696
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add CILP support for gp10b by defining below function
pointers (with detailed explanation)
pre_process_sm_exception()
- for CILP enabled channels, get the mask of errors
- if we need to broadcast the stop_trigger, suspend all SMs
- otherwise suspend only current SM
- clear hww_global_esr values in h/w
- gr_gp10b_set_cilp_preempt_pending()
- get ctx_id
- using sideband method, program FECS to generate
interrupt on next ctxsw
- disable and preempt the channel/TSG
- set cilp_preempt_pending = true
- clear single step mode
- resume current SM
handle_fecs_error()
- we get ctxsw_intr1 upon next ctxsw
- clear this interrupt
- get handle of channel on which we first
triggered SM exception
- gr_gp10b_clear_cilp_preempt_pending()
- set cilp_preempt_pending = false
- send events to channel and debug session fd
Bug 200156699
Change-Id: Ia765db47e68fb968fada6409609af505c079df53
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925897
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
gr_*__set_alpha_circular_buffer_size() left max_batches field of
gr_pd_ab_dist_cfg1_r as 0 which results in too many alpha beta
transitions and poor performance when tessellation or geometry
shaders are used
Change-Id: Ic3673f45b60674b3527641a6fdda0cedc6861db5
Signed-off-by: David Li <davli@nvidia.com>
Reviewed-on: http://git-master/r/840079
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Some of the allocated buffers are used during normal graphics
processing. Mark them as GPU cacheable to improve performance.
Bug 1695718
Change-Id: I71d5d1538516e966526abe5e38a557776321597f
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/827087
(cherry picked from commit 60b40ac144c94e24a2c449c8be937edf8865e1ed)
Reviewed-on: http://git-master/r/828493
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add gp10b PROD value for FE_GO_IDLE_TIMEOUT. Use the PROD value
written in gk20a_init_gr_setup_hw() instead of hard coding here.
Change-Id: If3bd981c1c0d9cc8ad19c21c220b7de81fdb529e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/813959
We used to allocate 1.5x buffer size. This leads to memory waste, as
we do not set the CB size via SW methods anymore.
Bug 1686189
Change-Id: I45cbdeadc154f59b65138f99f50a72d97511cb78
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/801865
(cherry picked from commit 791f2fe03d16521206649ab90498443e91e284e2)
Reviewed-on: http://git-master/r/815683
Spill buffer size is in chunks of 256B. Multiply the size by
granularity to get the size in bytes.
Bug 1686189
Change-Id: I0462293668322645bd1eab190c12faaeb6c316c1
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/801344
(cherry picked from commit 4bf6de7d9c9014a9eaeff56b19437d1841d7cfb0)
Reviewed-on: http://git-master/r/815680
If pagepool size equals max we should use zero. Add the comparison
to do that.
Bug 1686189
Change-Id: I15bd43663550b1089a726c0256b89f849c193e21
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/801526
(cherry picked from commit 9d89ea5ba345b19d2cff86130ba9d3c4c5f07e6e)
Reviewed-on: http://git-master/r/815681
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
We program the default steady state beta CB size. The default is
for deep binning, but we've disabled deep binning. As result steady
state CB size was left too high.
Bug 1683535
Change-Id: I17029078d9c83e55eec6faacfc83c6d812f8c3c0
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/795306
Reviewed-on: http://git-master/r/806189