We currently bind only one channel to a debug session
But some use cases might need multiple channels bound
to same debug session
Add this support by adding a list of channels to debug session.
List structure is implemented as struct dbg_session_channel_data
List node dbg_s_list_node is currently defined in struct
dbg_session_gk20a. But this is inefficient when we need to
add debug session to multiple channels
Hence add new reference structure dbg_session_data to
store dbg_session pointer and list entry
For each NVGPU_DBG_GPU_IOCTL_BIND_CHANNEL call, create
two reference structure dbg_session_channel_data for channel
and dbg_session_data for debug session and bind them together
Define API nvgpu_dbg_gpu_get_session_channel() which will
get first channel in the list of debug session
Use this API wherever we refer to channel bound to debug
session
Remove dbg_sessions define in struct gk20a since it is
not being used anywhere
Add new API NVGPU_DBG_GPU_IOCTL_UNBIND_CHANNEL to support
unbinding of channel from debug sesssion
Bug 200156699
Change-Id: I3bfa6f9cd5b90e7254a75c7e64ac893739776b7f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120331
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add support to store error state of single SM before
preprocessing SM exception
Error state is stored as :
struct nvgpu_dbg_gpu_sm_error_state_record {
u32 hww_global_esr;
u32 hww_warp_esr;
u64 hww_warp_esr_pc;
u32 hww_global_esr_report_mask;
u32 hww_warp_esr_report_mask;
}
Note that we can safely append new fields to above
structure in the future if required
Also, add IOCTL NVGPU_DBG_GPU_IOCTL_READ_SINGLE_SM_ERROR_STATE
to support reading SM's error state by user space
Bug 200156699
Change-Id: I9a62cb01e8a35c720b52d5d202986347706c7308
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1120329
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.
Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
Add NVGPU_IOCTL_TSG_EVENT_ID_CTRL API for channel
event id support to TSGs
This API will accept an event_id (like BPT.INT or
BPT.PAUSE), a command to enable
the event, and return a file descriptor on which
we can raise the event (if cmd=enable)
Events generated for TSGs will reuse file
operations "gk20a_event_id_ops"
Bug 200089620
Change-Id: I2f563c6d3a0988eb670caac2d3c7c6795724792c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030776
(cherry picked from commit 72b61fa266279038f013e582be80c21808e1038d)
Reviewed-on: http://git-master/r/1120319
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
With NVGPU_IOCTL_CHANNEL_EVENTS_CTRL, nvgpu can
raise events to User space. But user space
cannot distinguish between various types of events.
To overcome this, we need finer-grained API to
deliver various events to user space.
Remove old API NVGPU_IOCTL_CHANNEL_EVENTS_CTRL,
and all the support for this API (we can remove
this since User space has not started using this
API at all)
Add new API NVGPU_IOCTL_CHANNEL_EVENT_ID_CTRL
which will accept an event_id (like BPT.INT or
BPT.PAUSE), a command to enable the event,
and return a file descriptor on which
we can raise the event (if cmd=enable)
Event is disabled when file descriptor is closed
Add file operations "gk20a_event_id_ops"
to support polling on event fd
Also add API gk20a_channel_get_event_data_from_id()
to get event_data of event from its id
Bug 200089620
Change-Id: I5288f19f38ff49448c46338c33b2a927c9e02254
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1030775
(cherry picked from commit 5721ce2735950440bedc2b86f851db08ed593275)
Reviewed-on: http://git-master/r/1120318
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
include/linux/gk20a.h was missed while spliting the nvgpu driver off.
This patch imports it into the nvgpu repo.
bug 200187033
Change-Id: I0622091348c1b6e19f592a1807a19739dc1f9cd0
Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Reviewed-on: http://git-master/r/1119271
bug 1648908
This commit adds support for FECS ctxsw tracing. Code is compiled
conditionnaly under CONFIG_GK20_CTXSW_TRACE.
This feature requires an updated FECS ucode that writes one record to a ring
buffer on each context switch. On RM/Kernel side, the GPU driver reads records
from the master ring buffer and generates trace entries into a user-facing
VM ring buffer. For each record in the master ring buffer, RM/Kernel has
to retrieve the vmid+pid of the user process that submitted related work.
Features currently implemented:
- master ring buffer allocation
- debugfs to dump master ring buffer
- FECS record per context switch (with both current and new contexts)
- dedicated device for ctxsw tracing (access to VM ring buffer)
- SOF generation (and access to PTIMER)
- VM ring buffer allocation, and reconfiguration
- enable/disable tracing at user level
- event-based trace filtering
- context_ptr to vmid+pid mapping
- read system call for ctxsw dev
- mmap system call for ctxsw dev (direct access to VM ring buffer)
- poll system call for ctxsw dev
- save/restore register on ELPG/CG6
- separate user ring from FECS ring handling
Features requiring ucode changes:
- enable/disable tracing at FECS level
- actual busy time on engine (bug 1642354)
- master ring buffer threshold interrupt (P1)
- API for GPU to CPU timestamp conversion (P1)
- vmid/pid/uid based filtering (P1)
Change-Id: I8e39c648221ee0fa09d5df8524b03dca83fe24f3
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: http://git-master/r/1022737
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
As part of improving GPU scheduling, userspace can now set a
channel's timeslice, within reasonable limits imposed by the
kernel driver.
JIRA VFND-1312
Bug 1729664
Change-Id: I4c3430c43437889b8685f12988d4b967bb7877bb
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1020917
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Previously, only "high" priority bare channels were interleaved
between all other bare channels and TSGs. This patch decouples
priority from interleaving and introduces 3 levels for interleaving
a bare channel or TSG: high, medium, and low. The levels define
the number of times a channel or TSG will appear on a runlist (see
nvgpu.h for details).
By default, all bare channels and TSGs are set to interleave level
low. Userspace can then request the interleave level to be increased
via the CHANNEL_SET_RUNLIST_INTERLEAVE ioctl (TSG-specific ioctl will
be added later).
As timeslice settings will soon be coming from userspace, the default
timeslice for "high" priority channels has been restored.
JIRA VFND-1302
Bug 1729664
Change-Id: I178bc1cecda23f5002fec6d791e6dcaedfa05c0c
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/1014962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add IOCTL NVGPU_DBG_GPU_IOCTL_SET_NEXT_STOP_TRIGGER_TYPE
to set next stop_trigger type (either single SM or
broadcast to all SMs)
Also, expose below APIs to check and clear broadcast flag:
gk20a_dbg_gpu_broadcast_stop_trigger()
gk20a_dbg_gpu_clear_broadcast_stop_trigger()
Bug 200156699
Change-Id: I5e6cd4b84e601889fb172e0cdbb6bd5a0d366eab
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/925882
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Update last IOCTL number to point to GET_TIMEOUT to allow its use.
Bug 1706457
Change-Id: I9c8cc4e972fc25e14ca5aff075eca72bc1807a0b
Signed-off-by: Chris Dragan <kdragan@nvidia.com>
(cherry-picked from commit f73444b0f92737919dafd1623a2dde60c467b25b)
Reviewed-on: http://git-master/r/921453
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
SM locking & register reads Order has been changed.
Also, functions have been implemented based on gk20a
and gm20b.
Change-Id: Iaf720d088130f84c4b2ca318d9860194c07966e1
Signed-off-by: sujeet baranwal <sbaranwal@nvidia.com>
Signed-off-by: ashutosh jain <ashutoshj@nvidia.com>
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/837236
Add IOCTL NVGPU_IOCTL_CHANNEL_WDT to disable/enable
watchdog per-channel
Also, if watchdog is disabled, we currently schedule
the worker with MAX timeout.
Instead of this, do not schedule any worker if
watchdog is disabled
Bug 1683059
Bug 1700277
Change-Id: I7f6bec84adeedb74e014ed6d1471317b854df84c
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/837962
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
In job submission path, we always take refcount on all
the mapped buffers to safeguard against case where user
space releases the buffer early
But in case user space itself is doing proper buffer
management, kernel need not take refcounts on all the
buffers - which is also a overhead in submit path
Hence, provide a new submit flag
NVGPU_SUBMIT_GPFIFO_FLAGS_SKIP_BUFFER_REFCOUNTING to
optionally skip taking refcounts on all the buffers
Also, if we do not take refcounts, then no need to drop
any refcounts in gk20a_channel_update() as well
Bug 1698667
Bug 200141116
Change-Id: I81bb7a03240300b691c70bcec04ea1badd5934f4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824718
(cherry picked from commit 8c8978fa303ec4e6db0233becdbdcbad4a248173)
Reviewed-on: http://git-master/r/835801
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Implement NVGPU_GPU_IOCTL_ALLOC_AS_FLAGS_USERSPACE_MANAGED, which
enables creating userspace-managed GPU address spaces.
When an address space is marked as userspace-managed, the following
changes are in effect:
- Only fixed-address mappings are allowed.
- VA space allocation for fixed-address mappings is not required,
except to mark space as sparse.
- Maps and unmaps are always immediate. In particular, the mapping
ref increments at kickoffs and decrements at job completion are
skipped.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: I834fe19b3f65e9b02c268952383eddee0e465759
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/738558
Reviewed-on: http://git-master/r/833253
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Implement NVGPU_GPU_IOCTL_GET_BUFFER_INFO. The new IOCTL can be used
to identify buffers and retrieve their sizes. This allows the
userspace to be agnostic to the dmabuf implementation, as the generic
dmabuf fd interface does not have a reliable way for buffer
identification.
Bug 1614735
Bug 1623949
Bug 1660392
Change-Id: Ic3dd0a9385c9852778110ccb80636dd6f4f36208
Signed-off-by: Sami Kiminki <skiminki@nvidia.com>
Reviewed-on: http://git-master/r/822845
Reviewed-on: http://git-master/r/833252
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add new IOCTL NVGPU_IOCTL_TSG_SET_PRIORITY to allow
setting timeslice for entire TSG
Return error from channel specific IOCTL_CHANNEL_SET_PRIORITY
if the channel is part of TSG
Separate out API gk20a_channel_get_timescale_from_timeslice()
to get timeslice_timeout and scale from timeslice period
Use this API to get timeslice_timeout and scale for TSG and
store it in tsg_gk20a structure
Then trigger runlist update so that new timeslice values
will be re-written to runlist for TSG
Bug 200146615
Change-Id: I555467d034f81b372b31372f0835d72b1c159508
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/824206
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Add IOCTL NVGPU_DBG_GPU_IOCTL_TIMEOUT to support
disabling/re-enabling scheduler timeout from user space
If user space application is closed without re-enabling
the timeouts, kernel will restore the timeouts' state
while releasing the debug session
This is needed for debugging purpose
Bug 1514061
Change-Id: I32efb47ad09d793f3e7fd8f0aaa9720c8bc91272
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/788176
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Get rid of the duplicate gpfifo struct to emphasize the fact that
nvgpu_gpfifo is the only memory layout for gpfifo entries that works.
This is the same layout that HW uses.
Also, add a local pointer to the gpfifo memory in
gk20a_submit_channel_gpfifo to get rid of repeated typecasts.
Bug 1592391
Bug 1550886
Change-Id: I5432859ef8e7c1aab5907e44098994d7bb807f50
Signed-off-by: Janne Hellsten <jhellsten@nvidia.com>
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/677341
(cherry picked from commit 724c8c6228af81dd440e825bddf545dd6b2b8bd7)
Reviewed-on: http://git-master/r/822548
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Arto Merilainen <amerilainen@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
The server now exposes a new GMMU map interface that
can accept a scatter-gather list. This is needed to support
SMMU-bypass configurations.
Bug 1677153
JIRA VFND-689
Change-Id: I7b5af145db57dcebe2c9125ec90c689798d7e69e
Signed-off-by: Aingara Paramakuru <aparamakuru@nvidia.com>
Reviewed-on: http://git-master/r/792558
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Vladislav Buzov <vbuzov@nvidia.com>
Add support for CDE scatter buffers. When the bus addresses for
surfaces are not contiguous as seen by the GPU (e.g., when SMMU is
bypassed), CDE swizzling needs additional per-page information. This
information is populated in a scatter buffer when required.
Bug 1604102
Change-Id: I3384e2cfb5d5f628ed0f21375bdac8e36b77ae4f
Signed-off-by: Jussi Rasanen <jrasanen@nvidia.com>
Reviewed-on: http://git-master/r/789436
Reviewed-on: http://git-master/r/791243
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
this ioctl can be called only by a ctrlfd
created from the /dev/nvhost-ctrl-gpu node
therefore NVGPU_GPU_IOCTL_MAGIC, not
NVGPU_DBG_GPU_IOCTL_MAGIC should be used
for this
bug 200111987
Change-Id: I9fce7eae9f8203a15270ac1d25b575aebd9ccf88
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/755164
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
(cherry picked from commit 609d45ddd98c31ecd089d2e213ee1b6c560fc21e)
Reviewed-on: http://git-master/r/760830