mirror of
git://nv-tegra.nvidia.com/linux-nvgpu.git
synced 2025-12-24 18:42:29 +03:00
From gv11b onwards, FECS ucode returns an ACK for the set watchdog timeout
method. Failure to wait for this ACK was leading to races, and in some cases
the ACK could be mistaken for the reply to the next method. In particular,
this happened for the discover golden image size method, which is sent after
set watchdog timeout.

With instrumented FECS ucode, it takes longer for the ucode to process the
set watchdog timeout method, and the write that ACKs the method could happen
after the nvgpu driver clears the mailbox to send the discover image size
method. With an invalid golden context image size, FECS ended up causing an
MMU fault while attempting to save past the allocated buffer.

Added NVGPU_GR_FALCON_METHOD_SET_WATCHDOG_TIMEOUT to be used with
gops_gr_falcon.ctrl_ctxsw, and implemented 2 variants:
- gm20b_gr_falcon_ctrl_ctxsw, without ACK
- gv11b_gr_falcon_ctrl_ctxsw, with ACK

Added NVGPU_GR_FALCON_SUBMIT_METHOD_F_LOCKED flag to allow executing the
above method without re-acquiring the FECS lock. Longer term, the 'flags'
could be added to the gops_gr_falcon.ctrl_ctxsw parameters.

Use gops_gr_falcon.ctrl_ctxsw instead of register writes to invoke the set
watchdog timeout method in gm20b_gr_falcon_wait_ctxsw_ready.

Also replaced calls to gm20b_gr_falcon_ctrl_ctxsw with
gops_gr.falcon.ctrl_ctxsw where appropriate, since there are multiple
variants (gm20b, gp10b and gv11b).

Last, fixed clearing of mailbox 0 in gm20b_gr_falcon_bind_instblk.

Bug 200586923

Change-Id: I653b9a216555eec8cd4bb01d6f202bc77b75a939
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2287340
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
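
Note: the mailbox race described in this change can be illustrated with a
small standalone model. This is only a sketch and is not nvgpu code: the
struct, the sim_* functions, the ACK value and the image size are all
hypothetical, and the ucode is modelled as completing pending methods in
submission order, one per step. It shows that if the driver clears the
mailbox and submits the next method without first waiting for the watchdog
ACK, the late ACK is read back as the image size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from nvgpu. */
#define SIM_ACK_VALUE  0x1u     /* assumed value the ucode writes as ACK */
#define SIM_IMAGE_SIZE 0x4000u  /* assumed golden context image size */

/* Toy model of the reply mailbox shared by the driver and FECS ucode. */
struct sim_fecs {
	uint32_t mailbox0;        /* reply mailbox, 0 means "no reply yet" */
	bool     wdt_ack_pending; /* set watchdog timeout not yet ACKed */
	bool     size_pending;    /* discover image size not yet answered */
};

/* Ucode side: complete the oldest pending method and write its reply. */
static void sim_fecs_step(struct sim_fecs *f)
{
	assert(f != NULL);
	if (f->wdt_ack_pending) {
		f->mailbox0 = SIM_ACK_VALUE;
		f->wdt_ack_pending = false;
	} else if (f->size_pending) {
		f->mailbox0 = SIM_IMAGE_SIZE;
		f->size_pending = false;
	}
}

/* Driver side: poll the mailbox until the ucode writes a non-zero reply. */
static uint32_t sim_wait_mailbox(struct sim_fecs *f)
{
	int tries;

	for (tries = 0; tries < 8 && f->mailbox0 == 0u; tries++) {
		sim_fecs_step(f);
	}
	return f->mailbox0; /* 0 if nothing was written within the budget */
}

/*
 * Driver sequence: set watchdog timeout, then discover image size.
 * wait_for_wdt_ack selects between the "without ACK" and "with ACK"
 * behaviours described in this change.
 */
static uint32_t sim_discover_image_size(bool wait_for_wdt_ack)
{
	struct sim_fecs f = { 0u, false, false };

	/* Submit set watchdog timeout. */
	f.mailbox0 = 0u;
	f.wdt_ack_pending = true;
	if (wait_for_wdt_ack) {
		(void)sim_wait_mailbox(&f); /* consume the ACK here */
	}

	/* Clear the mailbox, then submit discover image size. */
	f.mailbox0 = 0u;
	f.size_pending = true;
	return sim_wait_mailbox(&f);
}
```

In this model, sim_discover_image_size(false) returns the stale ACK value
instead of the image size, while sim_discover_image_size(true) returns the
expected size, which mirrors the difference between the two ctrl_ctxsw
variants on chips whose ucode sends the ACK.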