linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
Konsta Hölttä	e8201d6ce3	gpu: nvgpu: decouple channel watchdog dependencies The channel code needs the watchdog code and vice versa. Cut this circular dependency with a few simplifications so that the watchdog wouldn't depend on so much. When calling watchdog APIs that cause stores or comparisons of channel progress, provide a snapshot of the current progress instead of a whole channel pointer. struct nvgpu_channel_wdt_state is added as an interface for this to track gp_get and pb_get. When periodically checking the watchdog state, make the channel code ask whether a hang has been detected and abort the channel from within channel code instead of asking the watchdog to abort the channel. The debug dump verbosity flag is also moved back to the channel data. Move the functionality to restart all channels' watchdogs to channel code from watchdog code. Looping over active channels is not a good feature for the watchdog; it's better for the channel handling to just use the watchdog as a tracking tool. Move a few unserviceable checks up in the stack to the callers of the wdt code. They're a kludge but this will do for now and demonstrates what needs to be eventually fixed. This does not leave much code in the watchdog unit. Now the purpose of the watchdog is to only isolate the logic to couple a timer and progress snapshots with careful locking to start and stop the tracking. Jira NVGPU-5582 Change-Id: I7c728542ff30d88b1414500210be3fbaf61e6e8a Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369820 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	281006ae7d	gpu: nvgpu: fix error for userspace build - Fix syntax error in Makefile.sources - Add missing test_enqueue entry to required_tests.ini - Add nvgpu-next include path in Makefile.units.common.tmk. This will provide an option to include nvgpu-next files in userspace build. Bug 2920876 Change-Id: I5d34a89a66813aa39fb1dbdf19decfbb9c63c7eb Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2377295 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	8ec1ec4f69	gpu: nvgpu: add syncpt_get_name() posix stub Add stub for nvgpu_nvhost_syncpt_get_name() stub in posix. JIRA NVGPU-5363 Change-Id: I3a1826e47685d54bda63cf04aa327adcb3da422e Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369658 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Lakshmanan M	02bc54c6ef	gpu: nvgpu: Add some more validation for gr remap window sequence This CL covers the following minor modifications, * Added input validation in the beginning of gr remap window call. * Changed gr_syspipe_lock/unlock sequence to handle legacy GR remap window sequence. JIRA NVGPU-5645 JIRA NVGPU-5646 Change-Id: I48758444096d2a962dbf087bcb211b7f8eacf326 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399603 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	d020778c55	gpu: nvgpu: reserve pma stream for legacy profiler Legacy profiler does not reserve PMA stream resource with PM reservation system. Also, HWPM system reset is separately implemented in membuf disable path. And it does not even restore perf unit SLCG prod values. Allcoate a dummy profiler object for debug session in perfbuf map path. Free it in perfbuf unmap path. This has advantage of synchronizing PMA stream reservation with new profiler stack. And this also leverages HWPM system reset and SLCG handling code during resource reservation. Remove explicit HWPM reset from gops.perf.membuf_reset_streaming() HALs Bug 2510974 Jira NVGPU-5360 Change-Id: I54c5202b6251dea3d80a4dfc011e8a296339e07f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399595 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	d90b9a3d4e	gpu: nvgpu: reset HWPM system on reservation Hardware HWPM system should be reset when first reservation is made either for HWPM or PMA_STREAM resource. Support this with below changes - Add hwpm_refcount counter to track HWPM and PMA_STREAM resource reservation count - Increment counter on every HWPM/PMA resource reservation - Decrement counter on every resource reservation release - Reset HWPM system in MC and disable perf unit SLCG on first refcount increment - Reset HWPM system in MC and re-enable perf unit SLCG after last refcount decrement - Add nvgpu_cg_slcg_perf_load_enable() to manage perf unit SLCG Bug 2510974 Jira NVGPU-5360 Change-Id: I20d2927947c3e4d8073cd3131b7733791e9c9346 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399594 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	dfd9feace6	gpu: nvgpu: recover pbdma errors before ack When a pbdma fault needs a channel teardown, do the recovery/teardown process before acking the pbdma interrupt status back. Acking it causes the hardware to proceed which could release fences too early before the involved channel(s) have been found to be broken. With these host copyengine interrupts, the teardown sequence is light and proceeds even with the pbdma intr flag still set; there are no engines to reset when these pbdma launch check interrupts happen. The bad tsg is just disabled and the channels in it aborted. A few unit tests are so heavily affected by this refactor that they would need to be rewritten. They're not strictly needed at the moment, so do only half of the rewrite: just delete them. Bug 200611198 Change-Id: Id126fb158b6d05e46ba124cd426389046eedc053 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392669 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	370ac6cc98	gpu: nvgpu: Update grmgr code to use nvgpu_device struct Instead of the nvgpu_engine_get_ids() function that will shortly be deleted, use the new nvgpu_device_get_copies() function. JIRA NVGPU-5421 Change-Id: I2b778b7818e885c807dfa90f15d03cddba9e59fc Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399165 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	fba96fdc09	gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device Delete the struct nvgpu_engine_info as it's essentially identical to struct nvgpu_device. Duplicating data structures is not ideal as it's terribly confusing what does what. Update all uses of nvgpu_engine_info to use struct nvgpu_device. This is often a fairly straight forward replacement. Couple of places though where things got interesting: - The enum_type that engine_info uses is defined in engines.h and has a bit of SW abstraction - in particular the GRCE type. The only place this seemed to be actually relevant (the IOCTL providing device info to userspace) the GRCE engines can be worked out by comparing runlist ID. - Addition of masks based on intr_id and reset_id; those can be computed easily enough using BIT32() but this is an area that could be improved on. This reaches into a lot of extraneous code that traverses the fifo active engines list and dramtically simplifies this. Now, instead of having to go through a table of engine IDs that point to the list of all host engines, the active engine list is just a list of pointers to valid engines. It's now trivial to do a for-all-active-engines type loop. This could even be turned into a generic macro or otherwise abstracted in the future. JIRA NVGPU-5421 Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Alex Waterman	df9695bd13	gpu: nvgpu: Add copy engine query device APIs Add two new APIs to the device code to query copy engines. These APIs handle the annoying change from COPY0-2 to LCEs in the Pascal generation. nvgpu_device_get_copies() nvgpu_device_get_async_copies() The first function gets all copy engines; the latter queries only async copy engines: that is CEs that do not share a runlist with the GR engine. JIRA NVGPU-5421 Change-Id: I707d9b004994b91f9d77974133912af9b9955882 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398597 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Lakshmanan M	48f1da4dde	gpu: nvgpu: Add bundle skip sequence in MIG mode In MIG mode, 2D, 3D, I2M and ZBC classes are not supported by GR engine. So skip those bundle programming sequence in MIG mode. JIRA NVGPU-5648 Change-Id: I7ac28a40367e19a3e31e63f3e25991c0ed4d2d8b Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397912 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	27cd70afd8	gpu: nvgpu: unit: Fix long standing MM bug Not sure if there's an actual bug or JIRA filed for this, but the change here fixes a long standing bug in the MM code for unit tests. Te GMMU programming code verifies that the CPU _physical_ address programmed into the GMMU PDE0 is a valid Tegra SoC CPU physical address. That means that it's not too large a value. The POSIX imlementation of the nvgpu_mem related code used the CPU virtual address as the "phys" address. Obviously, in userspace, there's no access to physical addresses, so in some sense it's a meaningless function. But the GMMU code does care, as described above, about the format of the address. The fix is simple enough: since the nvgpu_mem_get_addr() and nvgpu_mem_get_phys_addr() values shouldn't actually be accessed by the driver anyway (they could be vidmem addresses or IOVA addresses in real life) ANDing them with 0xffffffff (e.g 32 bits) truncates the potentially problematic CPU virtual address bits returned by malloc() in the POSIX environment. With this, a run of the unit test framework passes for me locally on my Ubuntu 18 machine. Also, clean up a few whitespace issues I noticed while I debugged this and fix another long standing bug where the NVGPU_DEFAULT_DBG_MASK was not being copied to g->log_mask during gk20a struct init. Change-Id: Ie92d3bd26240d194183b4376973d4d32cb6f9b8f Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395953 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Tejal Kudav	71b005c1ef	gpu: nvgpu: Enter Quiesce if GPU drops off the bus Currently, we reboot the entire system using kernel_restart() if the GPU registers become inaccessible due to GPU disappearing from the bus. GPU hitting high temperatures is one of the reasons we might end up in above scenario. Replace kernel_restart() with quiesce call as a more graceful way of notifying about GPU's unavailability. While entering quiesce state, make sure we do not trigger any register accesses which are bound to fail in this case. Bug 2919899 Change-Id: Ia9d413e04c7d205752414ff3e892f055c4363cce Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398801 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	9963b94b4b	gpu: nvgpu: unbind resources during reservation release nvgpu_profiler_pm_resource_release() right now returns error if PM resources are already bound. Update this to unbind the resources explicitly as per the user requirement. Bug 2510974 Jira NVGPU-5360 Change-Id: Ib71e2d8d3caacd3bc5e29a06af0b90983468d33a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398354 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
rmylavarapu	4787220ffe	gpu: nvgpu: Create ELPG cmd functions In nvgpu-next ELPG unit support RPC calls and no longer support command calls to communicate to PMU. This change will create separate ELPG command functions which can be called for legacy chips and can be replaced by RPC functions for nvgpu-next chip. NVGPU-5195 Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Change-Id: Iddea0f46eb3506a4f2d44d664f610215b8f1b666 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2386923 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	ae25924393	gpu: nvgpu: print enabled_flags after poweron GPU enabled_flags indicate features supported by nvgpu. Add nvgpu_print_enabled() to print GPU enabled_flags. Print flag value after poweron complete to help during debug. Add verbose function to print flag name and status if gpu_dbg_info is set. JIRA NVGPU-5838 Change-Id: I3b0ddb8c6872f4f3b6101050da087ff553c16f84 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2383531 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Antony Clince Alex	16d54e83bf	gpu: nvgpu: remove nvgpu_next functions from nvgpu_mc unit At present nvgpu_mc unit contains nvgpu_next_mc function definitions under conditional compilation macro. Move these functions to nvgpu_next specific files. Jira NVGPU-6004 Change-Id: Ieef68dad3c20941fd5580cad7341f165880f08ad Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396323 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	c73a2bddc9	gpu: nvgpu: delete unused nvgpu_nvhost_get_syncpt_host_managed We're using client managed syncpoints only. Delete this historical artifact. Jira NVGPU-5506 Change-Id: I8ebe34310eb99fd1fec2b238500aa9f4502cf09a Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398406 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	87a8e8980a	gpu: nvgpu: Correct dGPU shutdown path Currently, we just deinitialize the nvlink in the shutdown path. This alone is not sufficient and can lead to someone trying to use dGPU while being shutdown. Avoid triggers to dGPU usage by - 1. Set NVGPU_DRIVER_IS_DYING to let users know that the driver is currently in the process of dying. 2. Disable IRQs 3. Prepare for poweroff using nvgpu_prepare_poweroff 4. Stop CPU from accessing GPU registers 5. Set GPU state to POWEROFF Bug 200601517 JIRA NVGPU-5991 Change-Id: Ie185516618678bb893bcc3c3dcb514701483ecf2 Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2393565 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	2012a6b558	gpu: nvgpu: add profiler api to execute regops Implement new API nvgpu_prof_ioctl_exec_reg_ops() to support regops on new profiler objects. Add two new staging buffers to hold regops copied from userspace, and to convert and execute regops in common code. Buffers are allocated and released along with the profiler object. New API will implements this : - copy regops data in chunks of 4K from userspace - store them in staging buffer - convert the new regop struct into common regop struct and also copy the content into second staging buffer - trigger gops.regops.exec_regops() with second staging buffer as operation pointer - convert common regop struct back into new regop struct and copy back to userspace Export bunch of helper functions from ioctl_dbg.h. e.g. nvgpu_get_regops_op_values_common() Update regop execution code to skip regop execution if regop status is not valid. This is only possible when userspace requests for CONTINUE_ON_ERROR mode. Add more documentation to some of the fields in UAPI header. Note that maximum atomic operations reported by new API are same as legacy API and are incorrect. This will be fixed up in upcoming patches. Bug 2510974 Jira NVGPU-5360 Change-Id: I9f82052b22143aec33f6e778c0784386744b699e Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394208 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	a439d3767d	gpu: nvgpu: silence coverity on fence code - use release instead of free for the fence destroy identifier - nvhost_dev is a struct name, so use nvhost_device - compare nvgpu_nvhost_syncpt_read_ext_check retval properly Also, if the syncpt read fails when checking for fence expiration, behave as if the wait isn't expired. Possibly getting stuck is safer than possibly continuing too early. Jira NVGPU-5617 Change-Id: Ied529e25f8c43f1c78fd9eac73b9cd6c3550ead5 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398399 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	010f818596	gpu: nvgpu: initialize gr struct in poweron path struct nvgpu_gr is right now initialized during probe and from OS specific code. To support multiple instances of graphics engine, nvgpu needs to initialize nvgpu_gr after number of engine instances have been enumerated in poweron path. Hence move nvgpu_gr_alloc() to poweron path and after gr manager has been initialized. Some of the members of nvgpu_gr are initialized in probe path and they too are in OS specific code. Move them to common code in nvgpu_gr_alloc() Add field fecs_feature_override_ecc_val to struct gk20a to store the override flag read from device tree. This flag is later copied to nvgpu_gr in poweron path. Update tpc_pg_mask_store() to check for g->gr being NULL before accessing golden image pointer. Update tpc_fs_mask_store() to return error if g->gr is not initialized. This path needs nvgpu_gr struct initialized. Also fix the incorrect NULL pointer check in tpc_fs_mask_store() which breaks the write path to this sysfs. Jira NVGPU-5648 Change-Id: Ifa2f66f3663dc2f7c8891cb03b25e997e148ab06 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397259 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Lakshmanan M <lm@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	a04525ece8	gpu: nvgpu: require deterministic for usermode Deterministic mode has always been a requirement for usermode submit; enforce it in the setup_bind path. Adjust tests to use the flag. QNX uses NVGPU_SETUP_BIND_FLAGS_SUPPORT_DETERMINISTIC only if CONFIG_NVGPU_IOCTL_NON_FUSA is set, so guard the check with that for now. Jira NVGPU-5582 Change-Id: Idedd01a3a24420b45195a472e8ca5c9f32f4ef46 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369818 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Lakshmanan M	b86d5461c3	gpu: nvgpu: Add gr remap window disable/enable sequence Added gr remap window disable/enable programming sequence to access the legacy GR PGRAPH space during MIG mode. JIRA NVGPU-5647 Change-Id: I11bb9b1ce90cc1b21440fa2efdd53ce71e5cd03e Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397400 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
rmylavarapu	641cc6a59c	gpu: nvgpu: Support Perfmon events for nvgpu-next Created Perfmon events handling for nvgpu-next. Nvgpu-next pmu send perfmon events in the form of rpc events. Events are: - Change event: This gives information of whether it is increase/decrease event. - Init event: This gives information of perfmon init done in PMU. NVGPU-5202 NVGPU-5205 NVGPU-5206 Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Change-Id: Ida7e77dbaf70d2b594a0801c91a168dcb4a860bd Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395358 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Prateek sethi	9937a40b60	gpu: nvgpu: Disable igpu virt flag for dgpu DCONFIG_NVGPU_IGPU_VIRT is being set for all the non-safety builds that means it gets enable for both standard igpu and dgpu builds. This patch skips to enable DCONFIG_NVGPU_IGPU_VIRT flag in case of dgpu. Bug 200645357 Change-Id: I243aeaddda4ee79e5bd0adaaee11043ef76e7f41 Signed-off-by: Prateek sethi <prsethi@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396542 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Lakshmanan M	2a6fcec078	gpu: nvgpu: add gr manager ops-2 and mig infra-2 This CL covers the code changes related to following support, - Enabled gr manager ops. - Added gr manager init/remove support. - Refactor in gpu instance config infra. - Refactor in gr syspipe gpcs config infra. JIRA NVGPU-5645 JIRA NVGPU-5646 Change-Id: Ib2fab2796d76fe105fc5a08f2c5f9bfa36317f7c Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2393550 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	3245d48736	gpu: nvgpu: forbid watchdog on deterministic mode The channel watchdog feature has always been a blocker for deterministic submits. Instead of waiting for a submit call to happen just to reject it, nack already the setup_bind ioctl if deterministic is set and the watchdog has not been disabled before. This can avoid confusion with usermode submits where leaving the watchdog set would have worked but the watchdog would never see updates from userspace. Disallow also any other watchdog adjustments than disabling it when the channel has been set up for deterministic mode. Jira NVGPU-5582 Change-Id: I0ba4584bbc035197d952e5b562197c36aa483867 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369819 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	91515d1b47	gpu: nvgpu: unify joblist api names Add the nvgpu_ prefix to the peek, add and delete functions to make them consistent with the rest of the joblist functions. Rename the "prealloc resources" alloc and free functions to joblist init and deinit; there are many other resources that are also preallocated, and these handle just the job tracking list. NVGPU-5772 Change-Id: Ie5e6ba4f4b17465d626f36a0239bddb03a0a2fcb Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397395 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	345eae584d	gpu: nvgpu: remove nvgpu_channel_joblist_is_empty channel_joblist_peek() returns NULL if the list is empty. nvgpu_channel_joblist_is_empty() has been used only together with that function; remove it and check against NULL to see whether there are jobs in flight. This removes some duplication, simplifies the call sites slightly, and gets rid of a Coverity nag about a possible NULL pointer from peek that really isn't (when the emptiness was already checked). Jira NVGPU-5772 Change-Id: I814e9c510d99b88e59539359992fb44d4e7ce2ea Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397394 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
rmylavarapu	1aa64ba899	gpu: nvgpu: Check for pmu dmem alloc/free On nvgpu-next all cmd/msg communication happens on fbq and pmu dmem allocation is not needed. An extra conditional check for pmu dmem alloc/free which will avoid null pointer handling error. NVGPU-5185 Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com> Change-Id: I003a754ee7e91cc5d18a73576dd775a444b72d6d Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395747 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	07bed63377	gpu: nvgpu: PMU ucode version update for nvgpu-next Updating PMU ucode version for nvgpu-next Made below changes to PMU ucode on top P4 CL #28892402 -Enabled ACR task support -Enabled PERFMON task support -Disabled some features/code to build and commands work correctly -Enabled INIT_APERTURE_SETTINGS feature -ACRLib changes for ACR task JIRA NVGPU-5180 Change-Id: Idc6975e2b7f3501fd377d7e99d8fb47adcb78a52 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396641 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Ramesh Mylavarapu <rmylavarapu@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
mkumbar	3cc0dec8e7	gpu: nvgpu: update pmu init ack to support new unit id Added new command management unit id which will be received as INIT ack from PMU ucode upon boot, For legacy chips its called as INIT id, now changed to command management id to initialize the command/message setup. JIRA NVGPU-5185 Change-Id: I85203b373cef032f75b053b903d8b6763585be1f Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396450 Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
mkumbar	880a639a86	gpu: nvgpu: skip simulation check for pmu-lsfm unit skip simulation check for pmu-lsfm unit as lsfm unit execution is required on simulation to support secure boot of ctxsw. JIRA NVPU-5200 Change-Id: I85b8896643551e782b59663b13c52df36169754c Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396449 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	b012d50d9b	gpu: nvgpu: remove zbc stencil ioctl query Currently, ctrl ioctl NVGPU_GPU_IOCTL_ZBC_QUERY_TABLE or NVGPU_GPU_IOCTL_ZBC_SET_TABLE for stencil type, nvgpu returns value for depth instead. Remove NVGPU_GR_ZBC_TYPE_STENCIL case from both ctrl ioctls. Bug 3077459 Change-Id: I394344e9b80c05df72d8f7e0a79371966c9aea4c Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394948 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	b9ce3d50fc	gpu: nvgpu: pmu: Add new command line args for nvgpu-next Added new command line args for nvgpu-next and made required changes to support new args JIRA NVGPU-5185 Change-Id: I26faa3b8498387421b798b7abf9e757ed188f7f4 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396494 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	27a64f2e23	gpu: nvgpu: enforce priv usage of fence Add a "priv" fence struct type and use that in the fence type to emphasize that the inner data is not meant to be seen. The fence unit needs to have an outside-visible fence type so that fences can be allocated directly as a struct field in job metadata for performance and simplicity, so hiding the type entirely wouldn't work. A couple of places need to touch the priv data directly in channel code. Those can be thought to be technically fence unit's code scattered outside the fence files, but they mean that the architecture is not perfect yet. Jira NVGPU-5773 Change-Id: Ifa3c95757ae31eef0e32f2605293e23e210b065f Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395071 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
mkumbar	be6b37ba50	gpu: nvgpu: add support for ls_falcon_ucode_desc_v1 igpu-next LSPMU ucode built with newer ucode descriptor which adds changes to ACR blob construction. Constructing ACR blob with legacy ucode descriptor by fetching required data from ucode using newer descriptor. JIRA NVGPU-5857 Change-Id: I6d830be1ec955242b95f522e648528a6b36e7cf5 Signed-off-by: mkumbar <mkumbar@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2382855 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	6df58938ad	gpu: nvgpu: gp10b: add beta cb default size define Currently, get_attrib_cb_default_size() return value is hardcoded with recommended beta cb default size value. Add a macro for the fixed buffer size and add description. JIRA NVGPU-5302 Change-Id: If415e8bc6bc15b2d2ed6875a49a1a23bbe3c740a Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2375623 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	baaf25f8b0	gpu: nvgpu: decouple async and immediate cleanup Split up nvgpu_channel_clean_up_jobs() on the clean_all parameter so that there's one version for the asynchronous ("deferred") cleanup and another for the synchronous deterministic cleanup that occurs in the submit path. Forking another version like this adds some repetition, but this lets us look at both versions clearly in order to come up with a coherent plan. For example, it might be feasible to have the light cleanup of pooled items in also the nondeterministic path, and deferring heavy cleanup to another, entirely separated job queue. Jira NVGPU-5493 Change-Id: I5423fd474e5b8f7b273383f12302126f47076bd3 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2346065 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Richard Zhao	3f81f1952d	gpu: nvgpu: vgpu: fix NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS crash vgpu currently does not support suspend gpu context and stall the whole gpu, because of safety concerns. So vgpu does not set HALs that are related to on-gpu context. This change unset gops.gr.clear_sm_errors. And the ioctl NVGPU_GPU_IOCTL_CLEAR_SM_ERRORS will return -ENOSYS. Bug 200469468 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Ie578495e175ad898994fe1c4184a0243d5541cd3 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395598 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	4f85b2b1d4	gpu: nvgpu: make os fence headers consistent Move the sema-specific APIs to the sema-specific os fence header. Do the same for syncpts. Stub out the os fence type and the initialized check if os fence support is not enabled in build time. Guard the sema header with CONFIG_NVGPU_SW_SEMAPHORE. Jira NVGPU-5773 Change-Id: I838debd66a800b00cde76e65458b13eee367b55f Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395070 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	7aa852b31c	gpu: nvgpu: emphasize fence syncpt/sema interfaces Sometimes the syncpt-based fences are not used, and often the sema-based fences are not used. Move code around to new files to make it easier to see what happens and to allow leaving code out of the build easily. Start using nvgpu_fence_ops::free again and move the fence release there. The syncpt data is not refcounted, so it doesn't have this. Jira NVGPU-5773 Change-Id: I991f91886c59cf2c2fbfd2e75305ba512b5d7371 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395069 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Antony Clince Alex	e02ea5456b	gpu: nvgpu: tu104: update offset calculation of gpccs ctxsw'ed priregs The ctxsw'ed registers have been moved to a separate list starting from nvgpu_next chip onwards. Hence, update gr_tu104_get_offset_in_gpccs_segment function to account for ctxsw'ed registers in nvgpu_next. Introduce functions: nvgpu_netlist_get_gpc_ctxsw_regs_count to compute the number of ctxsw'ed gpc registers. Bug 2916121 Change-Id: I69fcd8df883af62999d0fa8d1f9a398f8f5d7454 Signed-off-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2394684 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Petlozu Pravareshwar	069af02ff1	gpu: nvgpu: Fix compile issue when enabling Nvlink This change fixes compile issue seen when enabling Nvlink drivers in K5.9. Bug 200609273 Change-Id: Ided44c746a21f36ff867822010d6666b4cdd79e8 Signed-off-by: Petlozu Pravareshwar <petlozup@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2391560 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Deepak Nibade	5311132781	gpu: nvgpu: add profiler apis to bind/unbind PM resources Add new APIs to bind/unbind PM resources to/from profiler objects: nvgpu_profiler_bind_pm_resources() nvgpu_profiler_unbind_pm_resources() Implement support to bind/unbind SMPC/HWPM/HWPM_STREAMOUT in various functions in common/profiler/profiler.c. Unbind all the PM resources explicitly in nvgpu_profiler_unbind_context() while closing the profiler object. If resources are bound during a resource reservation request, unbind the resources explicitly before reserving new resource. It is responsibility of application to bind the PM resources again. Bug 2510974 Jira NVGPU-5360 Change-Id: Ib2a0e017eaa23d0d376438771e8bf4e340865f03 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389655 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Deepak Nibade	330cc7d0e5	gpu: nvgpu: add profiler apis for resource reservation Add two new functions to reserve/release PM resources : nvgpu_prof_ioctl_reserve_pm_resource() nvgpu_prof_ioctl_release_pm_resource() Add ctxsw field to struct nvgpu_profiler_object to store per-resource context switch enable flag. Force resource reservation release while unbinding the context from profiler object or while closing the profiler object. Add this code in nvgpu_profiler_unbind_context() since both above paths will call this function. Bug 2510974 Jira NVGPU-5360 Change-Id: If334148e8df86360fba4162d1611187f3f04d01b Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2389654 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Alex Waterman	7e99a68e34	gpu: nvgpu: Add basic recovery debugging messages Add basic recovery messages that describe what's happening during the recovery process. Hide this under a new recovery specific GPU debug log flag. The logs look like: [ 276.000733] nvgpu: 17000000.gv11b gv11b_fifo_recover:162 [DBG] REC \| Recovery starting [ 276.000737] nvgpu: 17000000.gv11b gv11b_fifo_recover:163 [DBG] REC \| ID = 0 [ 276.000741] nvgpu: 17000000.gv11b gv11b_fifo_recover:164 [DBG] REC \| id_type = TSG [ 276.000745] nvgpu: 17000000.gv11b gv11b_fifo_recover:165 [DBG] REC \| rc_type = MMU fault [ 276.000748] nvgpu: 17000000.gv11b gv11b_fifo_recover:166 [DBG] REC \| Engine bitmask: 0x0 [ 276.000753] nvgpu: 17000000.gv11b gv11b_fifo_recover:170 [DBG] REC \| Acquiring engines_reset_mutex [ 276.000756] nvgpu: 17000000.gv11b gv11b_fifo_recover:174 [DBG] REC \| Acquiring runlist_lock for active runlists [ 276.000764] nvgpu: 17000000.gv11b gv11b_fifo_recover:185 [DBG] REC \| Channels bound to this TSG: [ 276.000767] nvgpu: 17000000.gv11b gv11b_fifo_recover:190 [DBG] REC \| 0 \| chid 511 [ 276.001098] nvgpu: 17000000.gv11b gv11b_fifo_recover:222 [DBG] REC \| PBDMA Bitmask: 0x1 [ 276.001102] nvgpu: 17000000.gv11b gv11b_fifo_recover:228 [DBG] REC \| Runlist Bitmask: 0x1 [ 276.001106] nvgpu: 17000000.gv11b gv11b_fifo_recover:240 [DBG] REC \| Disabling RL scheduler now [ 276.001126] nvgpu: 17000000.gv11b gv11b_fifo_recover:246 [DBG] REC \| Disabling CG/PG now [ 276.189348] nvgpu: 17000000.gv11b gv11b_fifo_recover:259 [DBG] REC \| Clearing PBDMA_FAULTED, ENG_FAULTED in CCSR register [ 276.191972] nvgpu: 17000000.gv11b gv11b_fifo_recover:264 [DBG] REC \| Disabling TSG [ 276.191983] nvgpu: 17000000.gv11b gv11b_fifo_recover:279 [DBG] REC \| Preempting runlists for RC [ 276.192001] nvgpu: 17000000.gv11b gv11b_fifo_recover:288 [DBG] REC \| Polling for TSG to be off PBDMA [ 276.192012] nvgpu: 17000000.gv11b gv11b_fifo_recover:296 [DBG] REC \| Done! [ 276.192016] nvgpu: 17000000.gv11b gv11b_fifo_recover:306 [DBG] REC \| Resetting relevant engines [ 276.192020] nvgpu: 17000000.gv11b gv11b_fifo_recover:318 [DBG] REC \| Engine bitmask for RL 0: 0xd [ 276.192024] nvgpu: 17000000.gv11b gv11b_fifo_recover:323 [DBG] REC \| > Restting engine: ID=0 [ 276.209567] nvgpu: 17000000.gv11b gv11b_fifo_recover:347 [DBG] REC \| Done! [ 276.209572] nvgpu: 17000000.gv11b gv11b_fifo_recover:323 [DBG] REC \| > Restting engine: ID=2 [ 276.214290] nvgpu: 17000000.gv11b gv11b_fifo_recover:347 [DBG] REC \| Done! [ 276.214295] nvgpu: 17000000.gv11b gv11b_fifo_recover:323 [DBG] REC \| > Restting engine: ID=3 [ 276.224986] nvgpu: 17000000.gv11b gv11b_fifo_recover:347 [DBG] REC \| Done! [ 276.225013] nvgpu: 17000000.gv11b gv11b_fifo_recover:377 [DBG] REC \| Re-enabling runlists [ 276.225034] nvgpu: 17000000.gv11b gv11b_fifo_recover:383 [DBG] REC \| Re-enabling CG/PG [ 276.225134] nvgpu: 17000000.gv11b gv11b_fifo_recover:394 [DBG] REC \| Releasing engines reset mutex Note the "REC \|" which lets one easily do: $ dmesg \| grep "REC \|" To get a clear ubobstrructed view of the recovery progress in the dmesg log. JIRA NVGPU-5606 Change-Id: I183f2b5ac54edc60ee894a82111723e27aa5c46b Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392991 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Konsta Hölttä	fcbd807842	gpu: nvgpu: remove lockless allocator The lockless allocator that spins in alloc and free ops using cmpxchg to mitigate race conditions has only ever been used for the post fences in preallocated job resources. Now each post fence has a clear owner (the job struct which already is allocated well) and lifetime, so this allocator has no longer a purpose. Delete it to avoid bitrot. (The design of the job queue has always been such that there's minimal contention in any case.) Jira NVGPU-5773 Change-Id: Ied98d977c2c75bacfd3d010ce60c80fe709231e0 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392705 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Konsta Hölttä	e6c0d84683	gpu: nvgpu: allocate fences in job structs As the submit job metadata has been simplified, the fence pool for job tracking fences is now just complex code for very simple purposes, so delete it. It's enough to hold the fence memory in the job struct itself instead of having separately allocated objects with different lifetimes. Each channel is using preallocated job arrays based on the prespecified inflight job count. The fences are used for tracking job completion, and a new job cannot be submitted before a previous wait has completed. This means that even with a ringbuffer with space for only one job, the previous job memory cannot get reclaimed by a new submit because the submits are ordered. Jira NVGPU-5773 Change-Id: I0c777df700aa7cfda6f971efa47aa72c5462b53a Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392704 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00

1 2 3 4 5 ...

8023 Commits