The high level API for the timer unit is the same across all OSs, so
get rid of the slight code duplication by moving the timer init
functions under a new file in common code:
- nvgpu_timeout_init_cpu_timer
- nvgpu_timeout_init_cpu_timer_sw
- nvgpu_timeout_init_retry
Much of the timer logic is also duplicated, but it is mixed between OS
specific current time retrieval. With some refactoring and addition of
an OS independent time keeping layer, that logic could also be made
shared.
Change-Id: I75d02ceb0d32022b0ba7f3bcd9fdb13d47039dbc
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2669510
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
In DRIVE 6.0, NvGPU is allowed to report only 32-bit metadata to
Safety_Services. So, there is no need to have distinct APIs for
reporting errors from units like GR, MM, FIFO to SDL unit. All
these error reporting APIs will be replaced with a single API. To
meet this objective, this patch does the following changes:
- Replaces nvgpu_report_*_err with nvgpu_report_err_to_sdl.
- Removes the reporting of error messages.
- Replaces nvgpu_log() with nvgpu_err(), for error reporting.
- Removes error reporting to Safety_Services from nvgpu_report_*_err.
However, nvgpu_report_*_err APIs and their related files are not
removed. During the creation of nvgpu-mon, they will be moved under
nvgpu-rm, in debug builds.
Note:
- There will be a follow-up patch to fix error IDs.
- As discussed in https://nvbugs/3491596 (comment #12), the high
level expectation is to report only errors.
JIRA NVGPU-7450
Change-Id: I428f2a9043086462754ac36a15edf6094985316f
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662590
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The nvgpu timeout API has an internal override for presilicon mode by
default: in presi simulation environments the timeouts never trigger.
This behaviour is intended in the original usecase of the timer unit
with hardware polling loops. In pure software logic though, the timer
must trigger after the specified timeout even in presi mode so add a new
init function to produce a timer for software logic. Use this new kind
of timer in channel and scheduling worker threads.
The channel worker currently times out for just the purpose of the
channel watchdog timer which has its own internal timer. Although that's
just software, the general expectation is that the watchdog does not
trigger in presilicon tests that run slower than usual. The internal
watchdog timer thus keeps the non-sw mode.
Bug 3521828
Change-Id: I48ae8522c7ce2346a930e766528d8b64195f81d8
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2662541
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
CBC contig allocation requires mempool node in DT and the
node can be used for contig allocations. The code duplication
can be avoided by unifying the code from vgpu.
Change-Id: I6eaa1d0c9db47b158602bf0ba68ce4e09cf487a7
Signed-off-by: Dinesh T <dt@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2650459
Reviewed-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-by: Richard Zhao <rizhao@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
GVS: Gerrit_Virtual_Submit
WARN() and WARN_ON() are most useful when the log explains where they
happened. The posix implementation of these prints neither that nor the
warning message (if any). Extend the macros to include function name and
line number, and print those plus the format string.
Actually formatting the format string is problematic wrt. MISRA rules,
so the arguments are not formatted.
The implementation of BUG() already prints the function name and line
number.
Change-Id: Ie246a915f5e8420e1c606bb1555a7f9b498725fd
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2634105
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
nvgpu_timeout_init() returns an error code only when the flags parameter
is invalid. There are very few possible values for flags, so extract the
two most common cases - cpu clock based and a retry based timeout - to
functions that cannot fail and thus return nothing. Adjust all callers
to use those, simplfying error handling quite a bit.
Change-Id: I985fe7fa988ebbae25601d15cf57fd48eda0c677
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2613833
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
To enable userspace query about comptags allocation status of a buffer,
comptags are to be allocated only during buffer registration done by
nvrm_gpu. Earlier, they were allocated during map.
nvrm_gpu will be sending metadata blob to be associated with the buffer.
This will have to be stored in the dmabuf privdata for all the buffers
registered by nvrm_gpu.
This patch moves the privdata allocation to buffer registration ioctl.
Remove g->mm.priv_lock as it is not needed now. This lock was added
to protect dmabuf private data setup. That private data is now
handled through dmabuf->ops and setup of dmabuf->ops is done
under dmabuf->lock.
To support legacy userspace, this patch still allocates comptags on
demand on map calls for unregistered buffers.
Bug 200586313
Change-Id: I88b2ca04c733dd02a84bcbf05060bddc00147790
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2480761
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
common.cic unit is divided into common.cic.mon and common.cic.rm
based on rm and mon process split.
CIC-mon subunit includes the code which is utilized in critical
interrupt handling path like initialization, error detection and
error reporting path. CIC-rm subunit includes the code corresponding
to rest of interrupt handling(like collecting error debug data from
registers) and ISR status management (status of deferred interrupts).
Split the CIC APIs and data-members into above two subunits.
JIRA NVGPU-6899
Change-Id: I151b59105ff570607c4a62e974785e9c1323ef69
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2551897
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add REMAP ioctl and accompanying support to the linux nvgpu driver.
REMAP support provides per-page control over sparse VM areas using the
concept of a virtual memory pool.
The REMAP ioctl accepts a list of operations (each a map or unmap) that
modify the VM area pages tracked by the virtual mmemory pool.
Inclusion of REMAP support in the nvgpu build is controlled by the new
CONFIG_NVGPU_REMAP flag. This flag is enabled by default for linux builds.
A new NVGPU_GPU_FLAGS_SUPPORT_REMAP characteristics flag is added for use
in detecting when REMAP support is available.
When a VM allocation tagged with NVGPU_VM_AREA_ALLOC_SPARSE is made the
base virtual memory pool resources are allocated. Per-page resources are
later allocated when the NVGPU_AS_IOCTL_REMAP ioctl is issued. All REMAP
resources are released when the corresponding VM area is freed.
Jira NVGPU-6804
Change-Id: I1f2cdc0c06c1698a62640c1c6fbcb2f9db24a0bc
Signed-off-by: scottl <scottl@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2542178
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new Central Interrupt Controller(CIC) unit in common code.
The interrupt handling is done in a distributed manner currently.
The error handling policy for different errors resides in each unit's
ISR code. The goal is to converge this data under one central place -
the CIC unit.
This patch creates framework for CIC unit and moves the gv11b QNX
safety LUT to CIC unit. All the error reporting APIs from different
units are also moved to CIC.
New APIs are exposed by CIC unit to access its internal data like:
1. Struct err_desc - the static err handling /injection data per
error id
2. Num_hw_modules - the number of error reporting HW units
supported by CIC
Init and deinit of CIC unit:
1. CIC unit should be initialized earlyon during boot so that it
is available for any interrupt handling.
2. Initialize CIC just before the interrupts are enabled during
boot.
3. Similarly, CIC is disabled late during deinit cycle; right
after the interrupts are masked.
LUT:
1. LUT is currently used only for reporting error to safety
services in gv11b QNX safety build.
2. This error handling policy LUT currently has only two levels
of handing - correctable and quiecse.
3. Once, the error handling policy decision is moved from leaf
unit nodes to CIC, LUT will be updated to have additional levels
like fast recovery and full recovery.
4. Also, then a separate LUT will be added for each platform/build.
5. In current framework, the LUT is set to NULL for all
configurations except gv11b.
report_err() ops is added to report error to safety services.
This ops is only effective for gv11b qnx build; and set to NULL for
other configurations.
NVGPU-6521
NVGPU-6523
NVGPU-6750
NVGPU-6758
NVGPU-6760
NVGPU-6754
Change-Id: I24be7836a96d787741e37b732e19863ed8014635
Signed-off-by: Tejal Kudav <tkudav@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2518683
Reviewed-by: Ajesh K V <akv@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, there are few chip specific erratas present in nvgpu code.
For better traceability of the erratas and corresponding fixes,
introduce flags to indicate existing erratas on a chip. These flags
decide if a corresponding solution is applied to the chip(s).
This patch introduces below functions to handle errata flags:
- nvgpu_init_errata_flags
- nvgpu_set_errata
- nvgpu_is_errata_present
- nvgpu_print_errata_flags
- nvgpu_free_errata_flags
nvgpu_print_errata_flags: print below details of erratas present in chip
1. errata flag name
2. chip where the errata was first discovered
3. short description of the errata
Flags corresponding to erratas present in a chip are set during chip hal
init sequence.
JIRA NVGPU-6510
Change-Id: Id5a8fb627222ac0a585aba071af052950f4de965
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2498095
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
- moved reg fields to gk20a
- added os abstract register accessor in nvgpu/io.h
- defined linux register access abstract implementation
- hook up with posix. posix implementation of the register accessor uses
the high 4 bit of address to identify register apertures then call the
according callbacks.
It helps to unify code across OSes.
Bug 2999617
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: Ifcb737e4b4d5b1d8bae310ae50b1ce0aa04f750c
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2497937
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The simulator ring buffer DMA interface supports buffers of the following sizes:
4, 8, 12 and 16K. At present, it is configured to 4K and it happens to match
with the kernel PAGE_SIZE, which is used to wrap back the GET/PUT pointers once
4K is reached. However, this is not always true; for instance, take 64K pages.
Hence, replace PAGE_SIZE with SIM_BFR_SIZE.
Introduce macro NVGPU_CPU_PAGE_SIZE which aliases to PAGE_SIZE and replace
latter with former.
Bug 200658101
Jira NVGPU-6018
Change-Id: I83cc62b87291734015c51f3e5a98173549e065de
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2420728
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
* Removed unnecessary irqs_enabled flag, and
Replaced enable/disable irq logics with nvgpu variant functions.
* Added nvgpu_interrupts data structure to hold interrupt details.
* Interpret all stall irqs first and followed by nonstall irq from dt.
* Used interrupt size checks for enable/disable irqs instead of
comparing stall and nonstall interrupt lines.
Now adding new stall interrupt lines as easy as just updating macro.
Jira NVGPU-6019
Change-Id: I5a5eaa8d333c68ee87d25d2b45ec244ec8d7b297
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400777
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
The max values that the Linux nvhost driver tracks are adding some
complexity to our wrapper APIs. Max values are used only for internal
submit syncpoint tracking, so implement that tracking in the sync code
by just storing the last value that the syncpoing will reach after all
jobs are complete.
The value is a simple u32. It's accessed from functions in the submit
path that already is serialized, so there's no worrying about atomic
modifications.
Previously nvhost_syncpt_set_min_eq_max_ext() was used to reset the
syncpoint when necessary. Now with the internal max value we'll use
nvhost_syncpt_set_minval(), so add a wrapper for it.
The maxval reported with the user syncpoint allocation is just the
current value at allocation time since no jobs have affected it yet;
there is no means for the kernel to track the max value of user
syncpoints.
Jira NVGPU-5506
Change-Id: I34672eaa7fe3af36b2fbac92d11babe2bc6a2d2b
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Not sure if there's an actual bug or JIRA filed for this, but the
change here fixes a long standing bug in the MM code for unit tests.
Te GMMU programming code verifies that the CPU _physical_ address
programmed into the GMMU PDE0 is a valid Tegra SoC CPU physical
address. That means that it's not too large a value.
The POSIX imlementation of the nvgpu_mem related code used the CPU
virtual address as the "phys" address. Obviously, in userspace,
there's no access to physical addresses, so in some sense it's a
meaningless function. But the GMMU code does care, as described
above, about the format of the address.
The fix is simple enough: since the nvgpu_mem_get_addr() and
nvgpu_mem_get_phys_addr() values shouldn't actually be accessed by
the driver anyway (they could be vidmem addresses or IOVA addresses
in real life) ANDing them with 0xffffffff (e.g 32 bits) truncates
the potentially problematic CPU virtual address bits returned by
malloc() in the POSIX environment.
With this, a run of the unit test framework passes for me locally
on my Ubuntu 18 machine.
Also, clean up a few whitespace issues I noticed while I debugged
this and fix another long standing bug where the
NVGPU_DEFAULT_DBG_MASK was not being copied to g->log_mask during
gk20a struct init.
Change-Id: Ie92d3bd26240d194183b4376973d4d32cb6f9b8f
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395953
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
This patch updates nvgpu_assert as macro to print the information
about the calling function. Specifically, to print the function
name and the line number details.
This patch introduces misra violations (misra_c_2012_rule_10_1_violation)
in nvgpu_assert(). However, leaving misra violations unfixed has low
safety impact since misra violations are coming after fatal error is
hit where GPU driver is not expected to be serviceable thereafter.
Further, this patch provides debug benefit in quickly finding the
function that lead to the exit of NvGPU process.
Bug 2964898
Change-Id: Iba85f4a9226742a0bb08b045bcbfa26949bbe746
Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2342086
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Currently, nvgpu_writel_loop() writes to a register and immediately
checks if register value is updated. It might take some time for
hardware registers to get updated with value written by software.
Modify nvgpu_writel_loop() to accept number of retries to check if
register value is updated and assert with nvgpu_assert().
Also, move nvgpu_writel_loop() to common code and use generic
nvgpu_readl() and nvgpu_writel() APIs.
JIRA NVGPU-5490
Change-Id: Iaaf24203a91eee3d05de7d0c7dea18113367de5f
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2348628
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>