linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Lakshmanan M	d0bc8237e3	gpu: nvgpu: linux: Disable diversity related support SM and CE diversities are safety only features. Hence, we do not require to expose their ioctl and diversity related flags for Linux. JIRA NVGPU-4133 Bug 2776580 Change-Id: Icc3cc04734ffdcd901222206fca9a3594340d0e1 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2258872 Reviewed-by: Shashank Singh <shashsingh@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Lakshmanan M	a52ee77837	gpu: nvgpu: Add SM diversity gpu characteristic flag To achieve permanent fault coverage, the CTAs launched by each kernel in the mission and redundant contexts must execute on different hardware resources. This feature requires a change in software to make it possible to modify the virtual SM id to TPC mapping across mission and redundant contexts. This CL adds only SM diversity flags which are exposed to its clients through ioctl/devctl interfaces. Actual virtual SM id to TPC mapping implementation will be part of upcoming patch sets. Added NvGpu CFLAGS to identify the safety build "CONFIG_NVGPU_BUILD_CONFIGURATION_IS_SAFETY" JIRA NVGPU-4133 Change-Id: I5a18256780e6726e399e39c1c8d155d2ef07d7bd Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2250461 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Sagar Kamble	a8c9c800cd	gpu: nvgpu: reorganization of MC interrupts control Previously, unit interrupt enabling/disabling and corresponding MC level interrupt enabling/disabling was not done at the same time. With this change, stall and nonstall interrupt for units are programmed at MC level along with individual unit interrupts. Kept access to MC interrupt registers through mc.intr_lock spinlock. For doing this separated CE and GR interrupt mask functions. mc.intr_enable is only used when there is global interrupt control to be set. Removed mc_gp10b.c as mc_gp10b_intr_enable is now removed. Removed following functions - mc_gv100_intr_enable, mc_gv11b_intr_enable & intr_tu104_enable. Removed intr_pmu_unit_config as we can use the generic unit interrupt control function. JIRA NVGPU-4336 Change-Id: Ibd296d4a60fda6ba930f18f518ee56ab3f9dacad Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2196178 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Lakshmanan M	d6a20e31b3	gpu: nvgpu: tu10x: Add CE diversity gpu characteristic flag Tu104 has multiple async-LCE (3), GRCE (2) and PCE (4). So it is possible to use a different LCE/PCE during redundant execution. This will allow us to claim very high coverage for permanent fault. JIRA NVGPU-4370 Change-Id: Ib39013d8d4f377eb20820db100af57c57592c39d Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2243984 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Peter Daifuku	c58029ad24	gpu: nvgpu: fix race for nvgpu_thread_stop The pmu init thread typically returns immediately without calling nvgpu_thread_should_stop(). pmu_pg_kill_task() checks if the thread is running, and if it is, calls nvgpu_thread_stop(). However, there's a race condition where the init thread could have exited between the time that kill_task() checked the running flag and the time we actually stop the thread, leading to a kernel crash. Fix this by making the running flag in the nvgpu_thread struct atomic. Both the thread proxy function and the thread_stop() function will set the flag to false. In the case of nvgpu_thread_proxy(), if the flag is already false, then nvgpu_thread_stop() has already reset it, at which point we just wait for nvgpu_thread_should_stop() to return true. In the case of nvgpu_thread_stop(), if the flag is already false, then the thread proxy function has already exited, and there is nothing more to do. Bug 2591298 Change-Id: I9ba6b63c30a5c3e1df11e790094836b44373122b Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2230358 GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Debarshi Dutta	71bb9aae91	gpu: nvgpu: remove call to tegra_get_chip_id() tegra_get_chip_id is going to be deprecated soon and this patch removes calls to it. Chip-ID is already read via DT and call to tegra_get_chip_id() can be avoided by adding metadata to store the Chip-ID information in struct gk20a_platform. Bug 200524194 Bug 200551105 Change-Id: I5f9f5abf679cf9afe98840e20144d76eb0238426 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2236311 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Sagar Kamble <skamble@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Sagar Kamble	aa893ff77a	gpu: nvgpu: use GPL license for linux code Linux specific code should have GPL license. Bug 2755169 Change-Id: I8cbf96f4be2e77fde01ef976a79ec6c578185c23 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2237105 Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Abdul Salam	8d3427633f	gpu: nvgpu: Provide ability to select dgpu freq cap from DT Add support in nvgpu to parse and get the freq cap from DT. The patch does below Parse the DT and gets the freq cap value during probe. During clk_arb init compare this with P0.Max and takes the lowest. Send change_seq with the new value and set dgpu freq. Use the lowest for "get points","get default","set VF". Bug 200556366 Change-Id: Ie10243f9bf83cb5ae07ebcc4cdc8efaffa56c309 Signed-off-by: Abdul Salam <absalam@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2204644 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Preetham Chandru Ramchandra	fc71914b28	gpu: nvgpu: add support for PCI device 0x1eb0 Add support for PCI device with ID 0x1eb0. Bug 200559157 Change-Id: I9ca196a123636ad640ce89aa496f003cc55119e4 Signed-off-by: Preetham Chandru Ramchandra <pchandru@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2217302 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
rmylavarapu	54d2132b69	nvgpu: gpu: Remove usage of VOLT_RAIL_GET_VOLTAGE RPC VOLT_RAIL_GET_VOLTAGE RPC is no longer available for turing auto profile. Instead volt_rail_get_status cmd will fetch the required voltage values. NVGPU-4326 Change-Id: I3270c259b92effd13b3183e52af689ea2dc35c37 Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2233106 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
rmylavarapu	692a442e9d	nvgpu: gpu: Remove freq_controller support. Removed Freq_controller support as it is no longer supported in auto profile. NVGPU-4284 Change-Id: I276048e44cb8a33f303517da91cb6ea0f1612695 Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2211457 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Debarshi Dutta	51544b8a68	gpu: nvgpu: avoid double mapping of usermode mmap region gk20a_pm_runtime_suspend can fail and invoke gk20a_pm_finalize_poweron that can cause double mapping of the usermode mmap region via io_remap_pfn_range(). Avoid this by using a boolean variable to track whether the region is already mapped. Bug 2707416 Change-Id: I4d8cbe427400a5b986348a19af145367cc08ffc6 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2229312 GVS: Gerrit_Virtual_Submit Reviewed-by: Dinesh T <dt@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Preetham Chandru Ramchandra	a12c627574	gpu: nvgpu: define P-state as a platform variable move P-state enabling from chip level to platform level. Bug 200559157 Change-Id: Ie71dc801583678dc3a19f2a8438e477e46053591 Signed-off-by: Preetham Chandru Ramchandra <pchandru@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2223300 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Abdul Salam	767e505792	gpu: nvgpu: Execute pci settings deferred from Devinit. The devinit executes in parallel with PCIE link training to reduce exit latency. Therefore, all PCIE settings that normally occur during devinit after the PCIE link is up are deferred until nvgpu has resumed control. Bug 2661545 Change-Id: Ifdd4f645b2e1791d93567cc34d6ab0691a25d101 Signed-off-by: Abdul Salam <absalam@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2210625 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Seshendra Gadagottu	d8058743d7	gpu: nvgpu: prepare class unit for safety build Move graphics related defs and functions under CONFIG_NVGPU_GRAPHICS switch. Move classes not supported in GV11B under CONFIG_NVGPU_NON_FUSA switch. Add missing valid class numbers to gpu_class.is_valid HAL. Also remove un-used class defs from class.h header. Lot of qnx safety tests are still using graphics 3d class. Until those tests got fixed, allowing 3d graphics class as valid class for safety build. JIRA NVGPU-4301 Change-Id: Ifd2a13bee3210821799c2bca10e7245eb3c79121 Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2224658 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Sagar Kamble	f2b49f1c40	gpu: nvgpu: add doxygen for common MC unit Add doxygen details about Master Control (MC) common unit. Moved the interrupt handling related variables to new structure nvgpu_mc. JIRA NVGPU-2524 Change-Id: I61fb4ba325d9bd71e9505af01cd5a82e4e205833 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2226019 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Sagar Kamble	2edf3db10a	gpu: nvgpu: move mc gpu_ops out of gk20a.h and add doxygen comments for HALs gk20a.h will include gops_mc.h to contain the mc ops definitions. Add doxygen comments for the HAL functions that are called directly. Also move mc_gp10b_intr_pmu_unit_config to non-fusa HAL file. JIRA NVGPU-2524 Change-Id: I4f326332d7842211b004b372d79fac9fe6ed40e7 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2226017 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Sagar Kamble	6fe794bc98	gpu: nvgpu: prepare ce_app.h header In preparation for SWUD of CG unit, separate CE app related APIs into separate header ce_app.h. JIRA NVGPU-4143 Change-Id: I9be8a4f2eee3aaf3af71f5843f957052064d9651 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2221660 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Philip Elcan <pelcan@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Leon Yu	e5767d2e7e	nvgpu: fix railgate_enable_store Writing same value to railgate_enable_store should be treated as nop and made successfully. Doing so is not only an optimization for the operation but also convention that users expect for "settings". This change is primary for fixing a peculiar situation in the driver: root@localhost:/sys/devices/17000000.gp10b# cat railgate_enable 0 root@localhost:/sys/devices/17000000.gp10b# echo 0 > railgate_enable bash: echo: write error: Invalid argument Attempt to disable railgating on a platform where railgating isn't supported shouldn't be treated as 'invalid'. It's disabled after all. Bug 200562094 Change-Id: I3c04934bdbaf337c33d7de9cac6d53c96b4dacae Signed-off-by: Leon Yu <leoyu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2225476 (cherry picked from commit `10b3b5b1d5`) Reviewed-on: https://git-master.nvidia.com/r/2226185 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Bibek Basu <bbasu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
rmylavarapu	65a7896987	nvgpu: gpu: Implement PMU therm channel get status Currently nvgpu reads the temperature by reading the NV_THERM_I2CS_SENSOR_00 register. Below are the issues with current approach 1) NV_THERM_I2CS_SENSOR_00 doesn't support fractional precision which is POR. 2) It doesn't support negative temperatures which is required for Auto. 3) It doesn't take into account the right POR sensor in VFE VBIOS tables. From therm channel get status interface we can read the current temperature from PMU. NVBUG - 200549047 Change-Id: I2fb21926208876f3d3bebe3f2dee08edafedbc7d Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2196224 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Philip Elcan	9169e8c048	gpu: nvgpu: mc: move mc declarations to mc.h Move declarations that belong to mc from gk20a.h to mc.h where they belong. JIRA NVGPU-2532 Change-Id: I91934ff60e2735c61d16459c04507fed6e1c96d7 Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2214421 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Peter Daifuku	05c892f3f1	nvgpu: fix get_maxrate when no dvfs In nvgpu_linux_get_maxrate, if tegra_dvfs_get_maxrate returns 0 (a sign that there is no dvfs support), call nvgpu_clk_arb_get_arbiter_clk_range to get the max gpu frequency. Bug 200543218 Change-Id: I4f9bc0acaef98cd9dfa22f709656f4bb7e9fd349 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2215161 (cherry picked from commit `12202fbdcf`) Reviewed-on: https://git-master.nvidia.com/r/2217945 GVS: Gerrit_Virtual_Submit Reviewed-by: Luis Dib <ldib@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Philip Elcan	06fd513e1e	gpu: nvgpu: move common.unit into common.mc nvgpu.common.unit was just an enum used for passing to nvgpu.common.mc APIs. So, move the enum into mc.h, and replace the include of unit.h with mc.h where appropriate. And update the yaml arch. JIRA NVGPU-4144 Change-Id: I210ea4d3b49cd494e43add1b52f3fbcdb020a1e3 Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2216106 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Peter Daifuku	77e3704d3d	nvgpu: vgpu: no debugfs entries that rely on PMU When virtualized, the guest OS has no direct access to PMU functionality: - Don't create debugfs entries that rely on PMU access - Clean up PMU vgpu HAL entries that imply that PMU access is supported Bug 200543218 Change-Id: I12730b600802448a240f3de042760041d3ae7d29 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2213650 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Vedashree Vidwans	7c98fbba42	gpu: nvgpu: fix MISRA 17.1 in logging functions MISRA Rule 17.1 forbids use of stdarg.h features which are defined for variable arguments. This patch modifies logging macros to use slogf function for QNX builds. This avoids use of variable argument functions used for formatting log message. Jira NVGPU-4075 Change-Id: I5b6bb1107a7e431afaa960003858193a477b2ee6 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2192016 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Sagar Kamble	7a62265dde	gpu: nvgpu: enable irqs before nvgpu_finalize_poweron IRQs were not enabled before nvgpu_finalize_poweron, so debugging early init issues such as MMU fault, invalid PRIV ring or bus access etc. triggered during nvgpu power-on was cumbersome. Hence, Enable the IRQs before nvgpu_finalize_poweron is called. In HUB (MMU fault) ISR, MMU fault handling is only limited to snapped in priv reg in case of fault during nvgpu power-on. In HUB (MMU fault) ISR, access to fault buffers is synchronized as nvgpu driver reads the fault buffer registers before proceeding with fault handling. However, additional MMU fault handling needs to be synchronized with GR/FIFO/quiesce/recovery setup through nvgpu power-on state. JIRA NVGPU-1592 Change-Id: I8a5f2fcd79cb7ad8e215359e7a9fad50bfd46d67 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2203861 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Philip Elcan <pelcan@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Sagar Kamble	6c3c360462	gpu: nvgpu: protect nvgpu power state access using spinlock IRQs can get triggered during nvgpu power-on due to MMU fault, invalid PRIV ring or bus access etc. Handlers for those IRQs can't access the full state related to the IRQ unless nvgpu is fully powered on. In order to let the IRQ handlers know about the nvgpu power-on state gk20a.power_on_state variable has to be protected through spinlock to avoid the deadlock due to usage of earlier power_lock mutex. Further the IRQs need to be disabled on local CPU while updating the power state variable hence use spin_lock_irqsave and spin_unlock_irqrestore APIs for protecting the access. JIRA NVGPU-1592 Change-Id: If5d1b5e2617ad90a68faa56ff47f62bb3f0b232b Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2203860 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Sagar Kamble	1cd6ae945c	gpu: nvgpu: introduce nvgpu_enable_irqs Prepare function to enable the stall and non-stall kernel interrupts. Update the type of irq state irqs_enabled to bool. JIRA NVGPU-1592 Change-Id: I758794e0f230814a0bea2f3c035562e9a5c7e0ea Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2203859 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Philip Elcan <pelcan@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Philip Elcan	065f98f669	gpu: nvgpu: init: add return for all init APIs This adds return values for all init APIs. This make all the init APIs have the same signature. This is a prerequisite to making a table of init functions. JIRA NVGPU-3980 Change-Id: I5b71fd06ad248092af133ffe908e2930acb6d2b0 Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2202973 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Shashank Singh	6fd0d972ae	nvgpu: gpu: include qnx_init unit in doxygen documentation -Include qnx_init unit in doxygen documentation. -Add documentation for gk20a_busy/idle and similar functions. -Remove must_check return value as misra already reports violation for that. Jira NVGPU-2571 Change-Id: I9573cb61865677944809dcc494d92f63cc6e0f58 Signed-off-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2176755 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Abdul Salam	65ecd7a181	gpu: nvgpu: Remove fixed wait time for change seq completion Currently after sending change seq RPC, nvgpu waits for a fixed time of 20ms. This CL replaces this with pmu_wait_message_cond, which will return immediately after getting change seq completion event. Also added debug fs node to get the change seq execution time. Bug 200545366 Change-Id: Iba283f65d4949858be9cbff88de4d21a8c92ff81 Signed-off-by: Abdul Salam <absalam@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2202423 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Vedashree Vidwans	920b704ec7	gpu: nvgpu: put memory ref count Put dma buffer ref count for all vm buffer mapping fail conditions. Bug 200531152 Change-Id: I6bfad867eb9bd636a48b5ceb3a4417a80994a3ec Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Original Author: Bruce Xu <brucex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2194025 (cherry picked from commit f85504ae46d65d5346d9e2a5cc84ffb960ba9fb7) Reviewed-on: https://git-master.nvidia.com/r/2195439 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Sunny Li	516023e1e4	gpu: nvgpu: sysfs adding NULL pointer check golden image size will be set when memory allocated. See function: - nvgpu_gr_obj_ctx_init If golden image size is 0, gr_golden_image should be a NULL pointer in most cases. So add NULL pointer checking in tpc_pg_mask_store to avoid NULL pointer exception. Bug 2403210 Change-Id: I14df5cd94d7a4418c3089c5f84b6eab93c485ba6 Signed-off-by: Sunny Li <sunnyl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2161280 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Debarshi Dutta <ddutta@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Mahantesh Kumbar	525ff83910	gpu: nvgpu: Cleanup PMU unit header file pmu.h Moved PMU subunits specific defines from pmu.h to respective subunits header file by renaming properly as needed JIRA NVGPU-2457 Change-Id: Id29a2d5cb028fc69049738c735c5585b6276b115 Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2199547 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Rajesh Devaraj	935c5f6578	gpu: nvgpu: fix misra violations in SDL This patch addresses misra violations due to SDL error reporting callbacks. In particular, it addresses the following misra violation: - misra_c_2012_directive_4_7_violation: Calling function "nvgpu_report_*_err()" which returns error information without testing the error information. JIRA NVGPU-4025 Change-Id: Ia10b6b3fd9c127a8c5189c3b6ba316f243cedf04 Signed-off-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2196895 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Sagar Kamble	e53d24d6d2	gpu: nvgpu: fix MISRA Rule 8.6 violations ifdef function prototypes with CONFIG_* defines. This fixes MISRA rule 8.6 violations which complain about undefined functions. Also moved nvgpu_channel_get_from_file prototype to ioctl_channel.h & nvgpu_probe to driver_common.h as those are linux specific. Define nvgpu_init_soc_vars in posix/soc.c as it is implemented in QNX. JIRA NVGPU-3873 Change-Id: I5d2b238e1b5d1318867cd2416ac5f03cc6ab7c6a Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2196794 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Jeremy Ho	6118009b84	gpu: nvgpu: remove reversed ordering for deadlock In some cases, we would get deadlock issue due to there are two locks acquisition on common clk driver's lock and nvgpu driver's locks. At the bug, inconsistent lock ordering problem will come with one thread gets "nvgpu lock -> clk lock" and the other thread gets "clk lock -> nvgpu lock". Solve the latter path with one-time initializing clk_parent entry and use cached data afterward. Bug 2555115 Change-Id: I31c5c2728f406307e7cfd4e555f4db0c163234d8 Signed-off-by: Jeremy Ho <jeremyh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2146727 (cherry picked from commit `42c2bdfb9f`) Reviewed-on: https://git-master.nvidia.com/r/2160290 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Thomas Fleury	62d7c5641f	gpu: nvgpu: rename recovery capability Rename "recovery" capability to more specific "fault recovery": - NVGPU_SUPPORT_FAULT_RECOVERY in UAPI - NVGPU_GPU_FLAGS_SUPPORT_FAULT_RECOVERY in enabled flags. Jira NVGPU-3896 Change-Id: I2a60601a7c73ce15e08b65f377e8a27a526d5eb2 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2197427 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Preetham Chandru Ramchandra	1c1fd99faf	gpu: nvgpu: Enable big pages if PAGE_SIZE >= 64k Disable big pages only if iommu is not supported for the platform and if kernel page size is less than 64k Bug 2500080 Bug 2508793 Bug 2508677 Bug 2507041 Change-Id: I77dad7e54825e2cb36b5ca29e5d038a9bee293ff Signed-off-by: Preetham Chandru Ramchandra <pchandru@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2195084 GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Debarshi Dutta	06949c508f	gpu: nvgpu: Add support for XPU rail split Check if CPU/GPU rails are joint, disable railgating if they are. Add the DT support for T194 and T186 platforms. Disable railgate_enable sysfs node update in the above condition. Bug 200546450 Bug 200545711 Change-Id: I002488f6418805569b0ef0fc3032b58297adeafb Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2185221 (cherry picked from commit `1d532589b0` in rel-32) Reviewed-on: https://git-master.nvidia.com/r/2190402 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:01:38 -06:00
Thomas Fleury	9f0dff4a03	gpu: nvgpu: add recovery capability Add NVGPU_SUPPORT_RECOVERY and NVGPU_FLAGS_GPU_SUPPORT_RECOVERY, to indicate if recovery is supported. When true, an engine reset is performed in order to recover from an uncorrectable error. When false, the driver enters SW quiesce state. Jira NVGPU-3896 Change-Id: Iea809c13a844641e31ce6306fbd1630ef622bfe9 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2175447 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Philip Elcan <pelcan@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:01:38 -06:00
Thomas Fleury	f422aee393	gpu: nvgpu: use refcnt for ch mmu_debug_mode Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt. If channel is enabled multiple times by userspace, then ref count is updated accordingly. There is an expectation that enable/disable calls are balanced for setting channel's mmu debug mode. When unbinding the channel, decrease refcnt for the channel until it reaches 0. Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it can be retrieved from ch. Bug 2515097 Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2184702 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-28 16:54:51 -07:00
Thomas Fleury	8057514a9f	gpu: nvgpu: set FB/HSMMU debug mode Set NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and NV_PFB_PRI_MMU_DEBUG_CTRL in addition to NV_PGRAPH_PRI_GPCS_MMU_DEBUG_CTRL, in NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE Bug 2515097 Change-Id: I1763b43e79fac3edb68a35980683d58bfa89519f Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2115785 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-28 16:54:26 -07:00
Vedashree Vidwans	7bc3cdcf95	gpu: nvgpu: use vpr resize enabled API This patch adds nvgpu API in linux and posix to query vpr resize. The new API nvgpu_is_vpr_resize_enabled() is used in nvgpu_submit_channel_gpfifo(). Previously, if non-deterministic channel has timeout disabled and GPU cannot railgate on some platform, then channel doesn't power ref count and results in video freeze. To resolve non-deterministic channel job tracking needs to be enabled if vpr resize is supported or if GPU can railgate. Bug 200532122 Change-Id: Icfbff6253762b195b2f5955749343974b1a7a269 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2171093 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-28 14:24:19 -07:00
Thomas Fleury	95bb19827e	gpu: nvgpu: add sw quiesce For safety build, nvgpu driver should enter SW quiesce state in case an uncorrectable error has occurred. In this state, any activity on the GPU should be prevented, without powering off the GPU. Also, a minimal set of operations should be used to enter SW quiesce state. Entering SW quiesce state does the following: - set sw_quiesce_pending: when this flag is set, interrupt handlers exit after masking interrupts. This should help mitigate an interrupt storm. - wake up thread to complete quiescing. The thread performs the following: - set NVGPU_DRIVER_IS_DYING to prevent allocation of new resources - disable interrupts - disable fifo scheduling - preempt all runlists - set error notifier for all active channels Note: for channels with usermode submit enabled, userspace can still ring doorbell, but this will not trigger any work on engines since fifo scheduling is disabled. Jira NVGPU-3493 Change-Id: I639a32da754d8833f54dcec1fa23135721d8d89a Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2172391 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-27 10:37:21 -07:00
Thomas Fleury	36fbd3bf40	gpu: nvgpu: check Board ID and VBIOS version Check that current VBIOS meets minimal version requirement. Read VBIOS Board ID to identify the board SKU. Warn if VBIOS version is lower than expected version for this SKU. Warn if Board ID is unknown. Bug 200544064 Change-Id: I83176ab1342c9b8c8f5d273dd5ac00e6e26a0e7d Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2176974 (cherry picked from commit 621a10c123b9ba25e3cb89dee340741c4ad2cd8e) Reviewed-on: https://git-master.nvidia.com/r/2176931 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-23 04:18:13 -07:00
Shashank Singh	c4e29841e5	nvgpu: gpu: Fix misra rule 10.3 in vm unit For getting mapping kind is passed as signed 32 bit whereas it is stored as unsigned 32 bit. So, change the kind type to s16 in struct nvgpu_mapped_buf and also in the declaration from int to s16 to address that. This is a dependent change for qnx https://git-master.nvidia.com/r/#/c/2174451/. Jira NVGPU-3891 Change-Id: I0578409313442ad0e2f09c8019d2701b4da53ec9 Signed-off-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2176497 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-22 14:07:25 -07:00
vinodg	087d4d3df4	gpu: nvgpu: rmmod support in dgpu simulation Changes added to support "rmmod nvgpu" in dgpu simulation after gpu poweron. nvgpu_engine-wait_for_idle got stuck in busy mode for nvdec and nvec engines in simulation as simulation doesn't support timeout. These engines are not valid engines in nvgpu engine list. Add nvgpu_engine_check_valid_id before checking engine status. Simulation crash on accessing 0xb81604 top interrupt register. Add func_priv_cpu_intr_top__size_1_v() function to get the supported size rather than using default MAX_INTR_TOP_REGS. nvlink is not supported in dgpu simulation. Avoid warning for -ENODEV return. Avoid register read following gpu power off completion. Bug 2498574 Change-Id: I9f9f1cf1ac4620242bda1d2cc0f29f51f81a6711 Signed-off-by: vinodg <vinodg@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2179930 Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-21 23:38:56 -07:00
Sagar Kamble	2f95efd8d1	gpu: nvgpu: move CE app logic under CONFIG_NVGPU_DGPU CE app functionality from nvgpu is non-safe for igpu. CE engines init /reset/cg related functionality is required in safety. Hence move the CE app logic under CONFIG_NVGPU_DGPU flag and update the sources accordingly. JIRA NVGPU-3814 Change-Id: I37aa00b1184baccd5fe569ec315be60ac42dac9b Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2168956 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-19 07:55:57 -07:00
Konsta Holtta	6e2e4d0658	gpu: nvgpu: delete value tracking in syncpt wait API QNX nvhost_syncpt_wait_timeout_ext() no longer supports reporting the current syncpoint value (which nvgpu does not use either). Jira HOSTX-1347 Change-Id: I5108f19a53802df63df014dd0ec3a103e0c6531f Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2170180 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-19 07:07:18 -07:00

571 Commits