linux-nvgpu

mirror of git://nv-tegra.nvidia.com/linux-nvgpu.git synced 2025-12-22 17:36:20 +03:00

Author	SHA1	Message	Date
srajum	069fe05dca	gpu: nvgpu: remove whitelisting for wrongly reported violations by tool - Earlier we whitelisted wrongly reported static analysis violations by tool, raised coverity tool bugs for these cases. - These bugs are fixed with new version of tool, so no need fo whitelisting. JIRA NVGPU-7119 Change-Id: Ib2341db0d46fa7fac4c0cc9a6c1bdc8704377ef1 Signed-off-by: srajum <srajum@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2604365 (cherry picked from commit dc2d8ddaa409aefe0e04e0bacb3a8a977f6dbd64) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2677523 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-03-10 16:01:06 -08:00
Konsta Hölttä	f10ee4ab0e	gpu: nvgpu: add domain name API Add nvgpu_nvs_domain_get_name() to minimize messing up with nvs internals and to help code organization when nvs is not built in yet. A stub to help compilation returns NULL because no domains can exist when the stub is built in, and thus it won't be used. Jira NVGPU-6788 Change-Id: If663f7c0e8434ef00dd3a3f40f6404a35b477f2b Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2673120 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-03-01 00:09:01 -08:00
Konsta Hölttä	2a8914619d	gpu: nvgpu: bind sched domains as fds Replace id-based lookup with fd-based lookup when binding a TSG to a domain. The device node based domain interface naturally provides access control; this way userspace tools can limit which uid/gid can access each domain. Also, explicitly disallow binding channels to a TSG that has no runlist domain yet. Normally a TSG is in the default domain if nothing else has been specified, but the default domain can be deleted. Jira NVGPU-6788 Change-Id: I2af96dfc002367d894eaf0c175006332f790c27f Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2651165 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-03-01 00:08:55 -08:00
Konsta Hölttä	359e83b45a	gpu: nvgpu: tsg: release default nvs domain ref A reference to the default scheduling domain is taken when a TSG is opened. Although the explicit bind is designed to support only one bind, the TSG is bound to the default one implicitly at that point. Release the reference to avoid leaking it. The domain might be null at that point if the default domain has been removed; in that case there's just no domain to put back. Change-Id: I7db43f7bbb2a8c86c391280eb7fa32431c8982da Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2663420 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-02-06 10:09:34 -08:00
Sagar Kamble	29a0a146ac	gpu: nvgpu: fix coverity defects Fix following coverity defects: ioctl_prof.c resource leak ioctl_dbg.c logically dead code global_ctx.c identical code for branches therm_dev.c resource leak pmu_pstate.c unused value nvgpu_mem.c dead default in switch tsg.c Dereference before null check nvlink_gv100.c logically dead code nvlink.c Out-of-bounds write fifo_vgpu.c Dereference null return value pmu_pg.c Dereference before null check fw_ver_ops.c Identical code for different branches boardobjgrp.c Dereference after null check boardobjgrp.c Dereference before null check boardobjgrp.c Dereference after null check engines.c Dereference before null check nvgpu_init.c Unused value CID 10127875 CID 10127820 CID 10063535 CID 10059311 CID 10127863 CID 9875900 CID 9865875 CID 9858045 CID 9852644 CID 9852635 CID 9852232 CID 9847593 CID 9847051 CID 9846056 CID 9846055 CID 9846054 CID 9842821 Bug 3460991 Change-Id: I91c215a545d07eb0e5b236849d5a8440ed6fe18d Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2657444 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com> GVS: Gerrit_Virtual_Submit	2022-01-28 04:50:12 -08:00
Richard Zhao	9ab1271269	gpu: nvgpu: common: fix compile error of new compile flags It's preparing to add bellow CFLAGS: -Werror -Wall -Wextra \ -Wmissing-braces -Wpointer-arith -Wundef \ -Wconversion -Wsign-conversion \ -Wformat-security \ -Wmissing-declarations -Wredundant-decls -Wimplicit-fallthrough Jira GVSCI-11640 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: Ia8f508c65071aa4775d71b8ee5dbf88a33b5cbd5 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2555056 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2022-01-13 12:36:14 -08:00
Konsta Hölttä	55afe1ff4c	gpu: nvgpu: improve nvs uapi - Make the domain scheduler timeslice type nanoseconds to future proof the interface - Return -ENOSYS from ioctls if the nvs code is not initialized - Return the number of domains also when user supplied array is present - Use domain id instead of name for TSG binding - Improve documentation in the uapi headers - Verify that reserved fields are zeroed - Extend some internal logging - Release the sched mutex on alloc error - Add file mode checks in the nvs ioctls. The create and remove ioctls require writable file permissions, while the query does not; this allows filesystem based access control on domain management on the single dev node. Jira NVGPU-6788 Change-Id: I668eb5972a0ed1073e84a4ae30e3069bf0b59e16 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2639017 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-12-15 06:05:25 -08:00
Konsta Hölttä	632644b44a	gpu: nvgpu: couple runlist domains and nvs Now that the main nvsched code exists in the nvgpu build, make it control the runlist domains. As a new nvs domain is created, create the relevant runlist data too. To support the default domain, create a default nvs domain at boot. The scheduling domain code owns the responsibility of domain lifetime, and runlist domains exist to serve that logic although the RL domains are directly used by channel and TSG logic. Add refcounting to the scheduler uapi level to make sure that busy domains (that still have TSG participants) do not get removed too early. Adjust error injection sensitive unit tests to match the updated logic. Jira NVGPU-6425 Jira NVGPU-6427 Change-Id: I1beec97c54c60ad334165b1c0acb5e827c24f2ac Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2632287 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-12-07 07:07:12 -08:00
Konsta Hölttä	1d14a4412f	gpu: nvgpu: scheduler management uapi Add ioctls for creating, removing and querying scheduling domains and interface with the "nvsched" entity that will be the core scheduler. Include the scheduler in the Linux build. The core scheduler code will ultimately hold data on and control what gets scheduled, but this intermediate layer in nvgpu-rm needs a bit of bookeeping to manage the userspace interface. To keep changes isolated, this does not touch the internal runlist domains yet. The core scheduler logic will eventually control the runlist domains. Jira NVGPU-6788 Change-Id: I7b4064edb6205acbac2d8c593dad019d517243ce Signed-off-by: Alex Waterman <alexw@nvidia.com> Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2463625 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-12-07 07:07:01 -08:00
Konsta Hölttä	fe7ae02f5f	gpu: nvgpu: add sched domain bind ioctl Support binding TSGs to some other scheduling domain than the default one. Binding happens by name until a more robust interface appears in the future, as the name is a natural identifier for users. No other domains are actually created anywhere yet; that will happen in next patches. Jira NVGPU-6788 Change-Id: I5abcdea869b525b0a0e9937302f106f7eee94ec2 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2628047 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-24 04:47:30 -08:00
Konsta Hölttä	9be8fb80a2	gpu: nvgpu: make tsgs domain aware Start transitioning from an assumption of a single runlist buffer to the domain based approach where a TSG is a participant of a scheduling domain that then owns has a runlist buffer used for hardware scheduling. Concretely, move the concept of a runlist domain up to the users of the runlist code. Modifications to a runlist need to specify which domain is modified. There is still only the default domain that is created at boot. Jira NVGPU-6425 Change-Id: Id9a29cff35c94e0d7e195db382d643e16025282d Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621213 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-11 20:39:42 -08:00
Konsta Hölttä	c0473460ea	gpu: nvgpu: don't check ch activity on bind Delete an unnecessary check of the active_channels bitmap when attempting to bind a channel to a TSG. There is already a verification that the channel must not be a part of a TSG; if it's not, it cannot be set in the bitmap. All channels become active via a parent TSG, but the activity check predates this design. A channel is bound to a TSG early before setting up its gpfifo etc. and mandatory membership of a TSG is one of the setup_bind prechecks. Jira NVGPU-6425 Change-Id: Id34686f198db0a0265ffd6a49a0b2e47c37fd5f7 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2621211 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-11-04 12:47:54 -07:00
Konsta Hölttä	3cf796b787	gpu: nvgpu: move active bitmaps to domain Move the active_channels and active_tsgs bitmaps from struct nvgpu_runlist to struct nvgpu_runlist_domain. A TSG and its channels are currently active as part of a runlist; in the future, a runlist may be switched from multiple domains that each are a collection of TSGs. The changes are still internal to the runlist code. Users of runlists need no modifications. Jira NVGPU-6425 Change-Id: I2d0e98e97f04b9716bc3f4890cf881735d0ab664 Signed-off-by: Konsta Hölttä <kholtta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2618387 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-11-03 20:55:08 -07:00
Debarshi Dutta	791dc18666	gpu: nvgpu: bvec for struct nvgpu_tsg_sm_error_state fields Add Setter and Getter methods for accessing tsg->sm_error_states. Getter returns a constant pointer for struct nvgpu_tsg_sm_error_state. This renders it unnecessary to add BVEC for above fields for the struct in multiple locations. The current design ensures that only a constant pointer is obtained from the owner unit i.e. FIFO. The following new methods are added. Both unit tests and BVEC tests are added for them as well. nvgpu_tsg_store_sm_error_state nvgpu_tsg_get_sm_error_state Jira NVGPU-6947 Change-Id: I82c22a2774862c8579baa41b6fb8292fa164704a Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> (cherry picked from commit 79574638671a0c6efe41cd3423668fcd1bd96826) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2556938 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-09-13 20:57:09 -07:00
Deepak Nibade	3c97f3b932	gpu: nvgpu: disallow binding more channels than MAX channels supported per TSG There is HW specific limit on number of channel entries that can be added for each TSG entry in runlist. Right now there is no checking to enforce this from SW and hence if User binds more than supported channels to same TSG, invalid TSG formation error interrupts are generated. Fix this by adding appropriate checks in below steps : - Add new field ch_count to struct nvgpu_tsg to keep track of channels bound to TSG. - Define new hal gops.runlist.get_max_channels_per_tsg() to retrieve HW specific maximum channel count per TSG. - Implement the HAL for gk20a and gv11b chips, and assign new HALs for all chips appropriately. - Increment ch_count while binding the channel to TSG and decrement it while unbinding. - While binding channel to TSG, Check if current channel count is already equal to max channel count. If yes, print an error and bail out. Bug 200763991 Change-Id: Ic5f17a52e0fb171d1c020bf4f085f57cdb95f923 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2582095 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-08-25 09:47:47 -07:00
Debarshi Dutta	200777b854	gpu: nvgpu: bvec for channel and tsg Below changes are added. 1) Added checks in nvgpu_channel_from_id__func, nvgpu_tsg_check_and_get_from_id 2) Added BVEC tests for nvgpu_channel_open_new, nvgpu_channel_from_id, nvgpu_tsg_check_and_get_from_id, nvgpu_tsg_set_error_notifier 3) Added common function get_random_u32. Jira NVGPU-6905 Change-Id: I374d6f5503dc05e3224213d772a1752d82cbdc91 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2548304 (cherry picked from commit 39b2529b3e96cfd3cbd3bb020f32ee2cca0ea363) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2554021 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Sachin Nikam <snikam@nvidia.com> Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-07-07 12:25:50 -07:00
Richard Zhao	4e08649b7f	gpu: nvgpu: move mem checking of gr_ctx to .alloc_obj_ctx Preparing for adding vgpu cmd .add_obj_ctx and memory will be allocated on server side. Outside of implementation of .alloc_obj_ctx, code should not check whether gr_ctx is valid by check gr_ctx mem. Jirs GVSCI-10977 Change-Id: I6b3d826e930fdfaaae517d204186642e49f5c2d7 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2546190 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-06-28 18:10:01 -07:00
Deepak Nibade	419a65965b	gpu: nvgpu: add mutex for gr_ctx initialization If user calls IOCTL to allocate object context for two channels in same TSG in parallel, nvgpu_gr_setup_alloc_obj_ctx() could end up racing and trying to allocate object context for both channels at the same time. This could result in corrupting object context. Fix this by introducing per-TSG mutex ctx_init_lock to serialize context initialization for all channels within TSG. In ideal scenario nvrm_gpu is the only caller of all the IOCTLs, and nvrm_gpu makes sure to initialize object context for each channel in serial order. Because of this new lock does not cause any contention. Jira NVGPU-6431 Change-Id: Ibb1cbb4878748929bb7f23e8666c283c39ecbf5a Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538333 (cherry picked from commit 8be447838dc1ecbd5637eb6bd13b8f338eaf33cd) Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2538773 Reviewed-by: svcacv <svcacv@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: svc_kernel_abi <svc_kernel_abi@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-06-03 15:59:43 -07:00
Sagar Kamble	89ec2afbd4	gpu: nvgpu: fix tsg unbind failure paths nvgpu_tsg_unbind_channel_common failure handling missed channel.clear & nvgpu_tsg_set_mmu_debug_mode calls. Bug 200711183 Change-Id: I19fd53be55db9df725b7cf467b2673e4cd29deb5 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2521972 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Antony Clince Alex <aalex@nvidia.com> Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-05-03 03:25:07 -07:00
Sagar Kamble	ff706e5456	gpu: nvgpu: handle ctx_reload when force unbinding the channel When force closing the channel, NEXT and CTX_RELOAD bits might be set. Currently CTX_RELOAD bit is ignored. However, due to this, the channel created after the erroneous unbind encounters FECS fault. If the channel is unbound while it is running, fifo unbind error happens and can lead to unspecified behavior. By moving CTX_RELOAD to other channel in the TSG, the channel can be unbound safely. In other cases, if the channel is truly running something when it is being unbound it should either get preempted or be handled through engine reset. Bug 200701444 Change-Id: Iba956544dcaa1144c6064247257c64cbe9a29ae6 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2515083 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-04-15 16:21:44 -07:00
Mayur Poojary	6277d57936	gpu: nvgpu: Add new api for setting longer timeslice on dbg node Add new ioctl api for setting longer timeslice and get timeslice inside 'dbg' dev node. Update ioctl gpu_get_characteristic to pass the max timeslice value Add debugfs to access and change the max timeslice value Bug 1842244 Change-Id: I7e80f59162cf5d90496f9752fc128f5fa8dcc7d2 Signed-off-by: Mayur Poojary <mpoojary@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2471569 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-04-06 04:37:38 -07:00
Alex Waterman	5bf229dcd5	gpu: nvgpu: Rename runlist_id to id Rename the runlist_id field in struct nvgpu_runlist to just id. The runlist part is redundant given that this id is already in 'struct nvgpu_runlist'. Change-Id: Ie2ea98f65d75e5e46430734bd7a7f6d6267c7577 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470306 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-02-19 15:16:46 -08:00
Alex Waterman	bd1b395b5c	gpu: nvgpu: Update runlist_id in TSG Update the runlist_id field in struct tsg to now be a pointer to the relevant runlist. This further cleans up the rampant use of runlist_ids throughout the driver. Change-Id: I3dce990f198d534a80caa9ca95982255dcf104ad Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470305 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-02-19 15:16:41 -08:00
Alex Waterman	77c0b9ffdc	gpu: nvgpu: Update runlist_update() to take runlist ptr Update the nvgpu_runlist_update_for_channel() function: - Rename it to nvgpu_runlist_update() - Have it take a pointer to the runlist to update instead of a runlist ID. For the most part this makes the code better but there's a few places where it's worse (for now). This starts the slow and painful process of moving away from the non-runlist code using runlist IDs in many places it should not. Most of this patch is just fixing compilation problems with the minor header updates. JIRA NVGPU-6425 Change-Id: Id9885fe655d1d750625a1c8aceda9e67a2cbdb7a Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470304 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-01-29 09:51:44 -08:00
Alex Waterman	11d3785faf	gpu: nvgpu: Rename struct nvgpu_runlist_info, fields in fifo Rename struct nvgpu_runlist_info to struct nvgpu_runlist; the info is not necessary. struct nvgpu_runlist is soon to be a first class object among the nvgpu object model. Also rename the fields runlist_info and active_runlist_info to simply runlists and active_runlists respectively. Again the info text is just not necessary and somewhat misleading. These structs _are_ the runlist representations in SW; they are not merely informational. Also add an rl_dbg() macro to print debug info specific to runlist management and some debug prints specifying the runlist topology for the running chip. Change-Id: Id9fcbdd1a7227cb5f8c75cca4abbff94fe048e49 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2470303 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2021-01-20 21:56:33 -08:00
Sagar Kamble	cf287a4ef5	gpu: nvgpu: retry tsg unbind if NEXT is set The NEXT bit can remain set for the channel if timeslice expires before scheduler clears it. Due to this nvgpu fails TSG unbind and in turn nvrm_gpu fails channel close. In this case, checking the channel hw state after some time can help see NEXT bit cleared by scheduler. Reenable the tsg and return -EAGAIN to nvrm_gpu for it to retry again. Bug 3144960 Change-Id: I35f417f02270e371a4e632986b73a00f8a4f921a Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2468391 Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2021-01-18 23:11:57 -08:00
Sagar Kamble	4d101a6303	gpu: nvgpu: do tsg unbind hw state check only for multi-channel TSG Host scheduler might be confused if more than one channels are present in TSG and one of the unbound channel has NEXT set. This is not so much of an issue if there is single channel in the TSG. So don't fail unbind in that case. ctx_reload and engine_faulted check can also be skipped for single channel TSG. Bug 3144960 Change-Id: I85eb9025ea53706ce8fda6d9b4bcf6a15a300d17 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2442970 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:48 -06:00
Sagar Kamble	842dec2470	gpu: nvgpu: unrailgate gpu during tsg release There is race condition between nvgpu runtime suspend and l2_flush or tlb_invalidate that happens as part of gmmu_unmap done during nvgpu_gr_ctx_free. Since l2_flush and tlb_invalidate does not do pm_runtime_get_sync, the suspend in progress can lead to registers getting locked and then l2_flush or tlb_invalidate can access the registers when registers are locked (GPU is railgated). Bug 3132891 Signed-off-by: Sagar Kamble <skamble@nvidia.com> Change-Id: If1696a9e9d3d9bc5fd55dd754be90a81114a75cc Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2425680 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	b2ff527d15	gpu: nvgpu: add channel.clear gops - Add channel.clear gops for nvgpu-next. - Do not return error if hw_state.next is set and channel.clear is not NULL. Bug 200650602 Bug 3109773 Change-Id: I4252691e4557351899e6fb9d85934e2d72517a36 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2414211 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Seema Khowala <seemaj@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Deepak Nibade	969b901999	gpu: nvgpu: create device/context profiler dev nodes Create new dev nodes for device and context profilers. Example of dev nodes on iGPU /dev/nvhost-prof-dev-gpu - device scope profiler /dev/nvhost-prof-ctx-gpu - context scope profiler Add below APIs to open/close above dev nodes : nvgpu_prof_dev_fops_open() nvgpu_prof_ctx_fops_open() nvgpu_prof_fops_release() Add common API nvgpu_prof_fops_ioctl() to handle IOCTL call on these dev nodes. Add IOCTL NVGPU_PROFILER_IOCTL_BIND_CONTEXT to bind the TSG to profiler objects. Add nvgpu_tsg_get_from_file() to retrieve TSG struct pointer from file descriptor. Also store profiler object pointer into TSG struct. Enable NVGPU_SUPPORT_PROFILER_V2_DEVICE capability on gv11b and tu104. Note that this is not yet enabled for vGPU. Keep NVGPU_SUPPORT_PROFILER_V2_CONTEXT capabiity disabled since this will take longer to support. Add new IOCTL NVGPU_PROFILER_IOCTL_UNBIND_CONTEXT so that userspace can explicitly unbind the context and release the resources before closing the profiler descriptor. Add context_init flag to profiler object for book keeping. Bug 2510974 Jira NVGPU-5360 Change-Id: Ie07e0cfd5a9da9d80008f79c955c7ef93b4bc60f Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2384354 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Tejal Kudav	ab2b0b5949	gpu: nvgpu: Set unserviceable flag early during RC During recovery, we set ch->unserviceable at the end after we preempt the TSG and reset the engines. It might be too late and user-space might submit more work to the broken channel which is not desirable. Move setting this unserviceable flag right at the start of recovery sequence. Another thread doing a submit can still read the unserviceable flag just before it is set here, leaving that submit stuck if recovery completes before the submit thread advances enough to set up a post fence visible for other threads. This could be fixed with a big lock or with a double check at the end of the submit code after the job data has been made visible. We still release the fences, semaphore and error notifier wait queues at the end; so user-space would not trigger channel unbind while channel is being recovered. Also, change the handle_mmu_fault APIs to return void as the debug_dump return value is not used in any of the caller APIs. JIRA NVGPU-5843 Change-Id: Ib42c2816dd1dca542e4f630805411cab75fad90e Signed-off-by: Tejal Kudav <tkudav@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2385256 Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:13:28 -06:00
Richard Zhao	98264f7505	gpu: nvgpu: call gops.tsg.unbind_channel on fail path When current context is busy, nvgpu_tsg_unbind_channel_common may fail because of preemption failed. In such case, the .unbind_channel hal still need to be called to notify vserver that the channel will be removed from tsg in teardown path. Bug 2833924 Signed-off-by: Richard Zhao <rizhao@nvidia.com> Change-Id: I9996202485429b4d9cba0c2f985f8e55fcdd3f29 Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2361977 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-by: automaticguardword <automaticguardword@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> GVS: Gerrit_Virtual_Submit	2020-12-15 14:13:28 -06:00
Vedashree Vidwans	2fb56f2cea	gpu: nvgpu: add bvec check for common.fifo input This patch adds boundary value check for common.fifo parameters as listed below. 1. nvgpu_channel_setup_bind() includes a condition to check that value of num_gpfifo_entries does not exceed 2^31. Otherwise prints message and returns error. 2. nvgpu_tsg_bind_channel() includes a condition to check if channel subctx had ASYNC id. If true, runqueue selector is set to 1 and 0 otherwise. This check is to be moved from devctl to common.fifo. Jira NVGPU-4817 Change-Id: Id1c9253945859c245e584b5c42b3285a6b620055 Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2278613 Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Scott Long	3b4b418330	gpu: nvgpu: fifo: misra 12.1 fixes MISRA Advisory Rule states that the precedence of operators within expressions should be made explicit. This change removes the Advisory Rule 12.1 violations from fifo code. Jira NVGPU-3178 Change-Id: I487d039c5be8024b21ec87d520d86763f9338d2a Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2276793 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Lakshmanan M	1c991a58af	gpu: nvgpu: Add SM diversity support To achieve permanent fault coverage, the CTAs launched by each kernel in the mission and redundant contexts must execute on different hardware resources. This feature proposes modifications in the software to modify the virtual SM id to TPC mapping across the mission and redundant contexts. The virtual SM identifier to TPC mapping is done by nvgpu when setting up the patch context. The recommendation for the redundant setting is to offset the assignment by one TPC, and not by one GPC. This will ensure that both GPC and TPC diversity. The SM and Quadrant diversity will happen naturally. For kernels with few CTAs, the diversity is guaranteed to be 100%. In case of completely random CTA allocation, e.g. large number of CTAs in the waiting queue, the diversity is 1 - 1/#SM, or 87.5% for GV11B, 97.9% for TU104. Added NvGpu CFLAGS to enable/disable the SM diversity support "CONFIG_NVGPU_SM_DIVERSITY". This support is only enabled on gv11b and tu104 QNX non safety build. JIRA NVGPU-4685 Change-Id: I8e3eaa72d8cf7aff97f61e4c2abd10b2afe0fe8b Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2268026 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Shashank Singh <shashsingh@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	67696c6870	gpu: nvgpu: conditionally compile tsg event ids event_id_list and event_id_list_locks fields are only needed in nvgpu_tsg when CONFIG_NVGPU_CHANNEL_TSG_CONTROL is defined. Conditionally compile those fields and related code, so that they are removed from safety build. Jira NVGPU-4376 Change-Id: I8678aa1b8cd4166aa37bcb42cda1eb9c703fd32f Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2273261 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	1fc9a427e0	gpu: nvgpu: tear down TSG on unbind HAL failure Currently nvgpu_tsg_unbind ignores return code from g->ops.tsg.unbind_channel. For consistency, tear down TSG in case an error occurs in the unbind HAL. Also make sure to restore valid ops for fifo.preempt_tsg in test_gr_setup_free_obj_ctx, to avoid unbind failure. Jira NVGPU-4387 Change-Id: I27a9c0daa365d05684149fc4bb17874d60ae1fde Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2248159 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Thomas Fleury	1865b2804b	gpu: nvgpu: bail out on HAL failure in TSG bind Currently nvgpu_tsg_bind adds that channel to TSG's channel list, even if g->ops.tsg.bind_channel fails. Instead, bail out from function, and return an error. Jira NVGPU-4387 Change-Id: I02dd836d9d499ddbe9b269856e39b2a7c9ccfe64 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2248158 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com> Reviewed-by: Deepak Nibade <dnibade@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:10:29 -06:00
Philip Elcan	9378675213	gpu: nvgpu: whitelist MISRA violations for WARN_ON/BUG_ON Whitelist false positive violations cause by a Coverity bug that that overrides the WARN_ON/BUG_ON macros. See nvbug 2277532 for details on the bug. JIRA NVGPU-4031 Change-Id: I395f97c89580195485e93275663a062f26ab6fc7 Signed-off-by: Philip Elcan <pelcan@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2207326 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Richard Zhao	e770573468	gpu: nvgpu: disable mmu debug mode before unbind ch from tsg disable mmu debug mode needs to reference tsg struct, so it must be called when ch can trace back to tsg. Bug 2586624 Change-Id: I050b557fb7abbf7e52faec242a1c290742e86c0d Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2206636 Reviewed-by: Kajetan Dutka <kdutka@nvidia.com> Tested-by: Kajetan Dutka <kdutka@nvidia.com> Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Thomas Fleury	e7e0879217	gpu: nvgpu: return int for tsg.init_eng_method_buffers nvgpu_kzalloc can fail in gv11b_init_eng_method_buffers. Added checks on returned pointer. Also changed g->ops.tsg.init_eng_method_buffers to return an int, and check return value in callers. Jira NVGPU-3788 Change-Id: Icb541665c40b89d512929cc9cf9f6a3e7a0033db Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2205851 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Adeel Raza	252ddc4f05	gpu: nvgpu: add coverity whitelisting support Add macros for whitelisting coverity violations. These macros use pragma directives. The pragma directives and whitelisting macros are only enabled when a coverity scan is being run. The whitelisting macros have been added to a new header called static_analysis.h. The contents of safe_ops.h (CERT C safe ops) have been moved into static_analysis.h because this will be the new header for static analysis related macros/defines/etc. JIRA NVGPU-3820 Change-Id: I9c63f20f670880b420415535738034619314b7c3 Signed-off-by: Adeel Raza <araza@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2180600 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2020-12-15 14:05:52 -06:00
Thomas Fleury	f422aee393	gpu: nvgpu: use refcnt for ch mmu_debug_mode Replaced ch->mmu_debug_mode_enabled with ch->mmu_debug_mode_refcnt. If channel is enabled multiple times by userspace, then ref count is updated accordingly. There is an expectation that enable/disable calls are balanced for setting channel's mmu debug mode. When unbinding the channel, decrease refcnt for the channel until it reaches 0. Also, removed tsg parameter from nvgpu_tsg_set_mmu_debug_mode as it can be retrieved from ch. Bug 2515097 Change-Id: If334e374a55bd14ae219edbfd3b1fce5ff25c226 Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2184702 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-28 16:54:51 -07:00
Thomas Fleury	8057514a9f	gpu: nvgpu: set FB/HSMMU debug mode Set NV_PFB_HSMMU_PRI_MMU_DEBUG_CTRL and NV_PFB_PRI_MMU_DEBUG_CTRL in addition to NV_PGRAPH_PRI_GPCS_MMU_DEBUG_CTRL, in NVGPU_DBG_GPU_IOCTL_SET_CTX_MMU_DEBUG_MODE Bug 2515097 Change-Id: I1763b43e79fac3edb68a35980683d58bfa89519f Signed-off-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2115785 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-28 16:54:26 -07:00
Seema Khowala	2f731c5fa8	gpu: nvgpu: Add doxygen documentation in tsg.h - Add doxygen documentation. - Remove unused fields of nvgpu_tsg struct: -- timeslice_timeout -- timeslice_scale - Remove unused functions: -- nvgpu_tsg_set_runlist_interleave - nvgpu_tsg_post_event_id is not supported in safety build. This function is moved under CONFIG_NVGPU_CHANNEL_TSG_CONTROL compiler flag. - Below functions are moved under CONFIG_NVGPU_KERNEL_MODE_SUBMIT nvgpu_tsg_ctxsw_timeout_debug_dump_state nvgpu_tsg_set_ctxsw_timeout_accumulated_ms - Rename gk20a_is_channel_active -> nvgpu_tsg_is_channel_active release_used_tsg -> nvgpu_tsg_release_used_tsg - nvgpu_tsg_unbind_channel_common declared static - Fix build issue when CONFIG_NVGPU_CHANNEL_TSG_CONTROL is disabled Remove CONFIG_NVGPU_CHANNEL_TSG_CONTROL for nvgpu_gr_setup_set_preemption_mode as it is needed in safety build. By default compute preemption mode will be set to WFI. CUDA will change it to CTA during context init time. JIRA NVGPU-3595 Change-Id: I8ff6cabc8b892c691d951c37cdc0721e820a0297 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2151489 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-26 16:06:42 -07:00
Scott Long	4277f65834	gpu: nvgpu: fix misra 2.7 violations Advisory Rule 2.7 states that there should be no unused parameters in functions. This patch removes unused function parameters from the following: * nvgpu_channel_ctxsw_timeout_debug_dump_state() * nvgpu_channel_destroy() * nvgpu_tsg_destroy() * nvgpu_rc_pdbma_fault() Jira NVGPU-3178 Change-Id: I12ad0d287fd7980533663a9776428ef5d4fd1fb9 Signed-off-by: Scott Long <scottl@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2176066 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-16 16:06:04 -07:00
Debarshi Dutta	5980d4c44f	gpu: nvgpu: fix cert-c issues in common.fifo unit Fix cert-c issues that violate the following rule for common/fifo/* INT30-C: Unsigned integer operation may wrap. Jira NVGPU-3881 Change-Id: Ifd1994960774cc0e190610c67d0e3f4334b73cf0 Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2166535 Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-08-07 04:06:07 -07:00
Peter Daifuku	5a08b51559	nvgpu: check if unbound tsg in timeslice/interleave in set_timeslice/set_interleave, the TSG may not have been bound yet; if so, just return after saving the new values. Bug 2661079 Change-Id: I4fb915674d2bc45c062dbd10b942408800f6f2a5 Signed-off-by: Peter Daifuku <pdaifuku@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2161683 Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com> Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Thomas Fleury <tfleury@nvidia.com> Reviewed-by: Alex Waterman <alexw@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-07-26 16:26:47 -07:00
Deepak Nibade	84fc6f37e0	gpu: nvgpu: add DEBUGGER flag for SM exception mask type Add CONFIG_NVGPU_DEBUGGER flag for SM exception mask type flag, lock and APIs to set the flag Jira NVGPU-3579 Change-Id: I7d82af11e31a8bc013b2b47e2bca939ae64aff29 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2155259 GVS: Gerrit_Virtual_Submit Reviewed-by: Vinod Gopalakrishnakurup <vinodg@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-07-18 10:36:29 -07:00
Debarshi Dutta	f6c96f620f	gpu: nvgpu: add CONFIG_NVGPU_KERNEL_MODE_SUBMIT flag The following functions belong to the path of kernel_mode submit and the flag CONFIG_NVGPU_KERNEL_MODE_SUBMIT is used to compile these out of safety builds. channel_gk20a_alloc_priv_cmdbuf channel_gk20a_free_prealloc_resources channel_gk20a_joblist_add channel_gk20a_joblist_delete channel_gk20a_joblist_peek channel_gk20a_prealloc_resources nvgpu_channel nvgpu_channel_add_job nvgpu_channel_alloc_job nvgpu_channel_alloc_priv_cmdbuf nvgpu_channel_clean_up_jobs nvgpu_channel_free_job nvgpu_channel_free_priv_cmd_entry nvgpu_channel_free_priv_cmd_q nvgpu_channel_from_worker_item nvgpu_channel_get_gpfifo_free_count nvgpu_channel_is_prealloc_enabled nvgpu_channel_joblist_is_empty nvgpu_channel_joblist_lock nvgpu_channel_joblist_unlock nvgpu_channel_kernelmode_deinit nvgpu_channel_poll_wdt nvgpu_channel_set_syncpt nvgpu_channel_setup_kernelmode nvgpu_channel_sync_get_ref nvgpu_channel_sync_incr nvgpu_channel_sync_incr_user nvgpu_channel_sync_put_ref_and_check nvgpu_channel_sync_wait_fence_fd nvgpu_channel_update nvgpu_channel_update_gpfifo_get_and_get_free_count nvgpu_channel_update_priv_cmd_q_and_free_entry nvgpu_channel_wdt_continue nvgpu_channel_wdt_handler nvgpu_channel_wdt_init nvgpu_channel_wdt_restart_all_channels nvgpu_channel_wdt_restart_all_channels nvgpu_channel_wdt_rewind nvgpu_channel_wdt_start nvgpu_channel_wdt_stop nvgpu_channel_worker_deinit nvgpu_channel_worker_from_worker nvgpu_channel_worker_init nvgpu_channel_worker_poll_init nvgpu_channel_worker_poll_wakeup_post_process_item nvgpu_channel_worker_poll_wakeup_process_item nvgpu_submit_channel_gpfifo_kernel nvgpu_submit_channel_gpfifo_user gk20a_userd_gp_get gk20a_userd_pb_get gk20a_userd_gp_put nvgpu_fence_alloc The following members of struct nvgpu_channel are compiled out of safety build. struct gpfifo_desc gpfifo; struct priv_cmd_queue priv_cmd_q; struct nvgpu_channel_sync *sync; struct nvgpu_list_node worker_item; struct nvgpu_channel_wdt wdt; The following files are compiled out of safety build. common/fifo/submit.c common/sync/channe1_sync_semaphore.c hal/fifo/userd_gv11b.c Jira NVGPU-3479 Change-Id: If46c936477c6698f4bec3cab93906aaacb0ceabf Signed-off-by: Debarshi Dutta <ddutta@nvidia.com> Reviewed-on: https://git-master.nvidia.com/r/2127212 GVS: Gerrit_Virtual_Submit Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com> Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>	2019-06-30 22:04:48 -07:00

1 2 3

138 Commits