gpu: nvgpu: handle ctx_reload when force unbinding the channel

When a channel is force closed, the NEXT and CTX_RELOAD bits might be
set. Currently the CTX_RELOAD bit is ignored. As a result, a channel
created after the erroneous unbind encounters a FECS fault.

If the channel is unbound while it is running, a fifo unbind error
occurs and can lead to unspecified behavior.

By moving CTX_RELOAD to another channel in the TSG, the channel can be
unbound safely. In other cases, if the channel is truly running
something when it is being unbound, it should either get preempted or
be handled through engine reset.
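
The CTX_RELOAD handoff described above can be illustrated with a small
standalone sketch. This is a toy model, not the nvgpu implementation:
the toy_channel and toy_tsg types and the toy_move_ctx_reload() helper
are hypothetical names introduced only for illustration, and the real
driver works on hardware state through its own HAL ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the nvgpu structures. */
struct toy_channel {
	int chid;
	bool bound;      /* channel is still a member of the TSG */
	bool ctx_reload; /* models the CTX_RELOAD hardware state bit */
};

struct toy_tsg {
	struct toy_channel *channels;
	size_t num_channels;
};

/*
 * If the channel being unbound carries CTX_RELOAD, hand the bit to
 * another bound channel in the same TSG so the pending reload is not
 * lost. Returns true when ch no longer carries CTX_RELOAD, false when
 * no other channel was available to take it.
 */
static bool toy_move_ctx_reload(struct toy_tsg *tsg, struct toy_channel *ch)
{
	size_t i;

	assert(tsg != NULL && ch != NULL);

	if (!ch->ctx_reload) {
		return true; /* nothing to move */
	}

	for (i = 0; i < tsg->num_channels; i++) {
		struct toy_channel *other = &tsg->channels[i];

		if (other != ch && other->bound) {
			other->ctx_reload = true;
			ch->ctx_reload = false;
			return true;
		}
	}

	return false; /* ch is the only bound channel in this toy TSG */
}
```

In this sketch the caller would invoke toy_move_ctx_reload() before
removing the channel from the TSG. What to do when it returns false is
left open here.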

Bug 200701444

Change-Id: Iba956544dcaa1144c6064247257c64cbe9a29ae6
Signed-off-by: Sagar Kamble <skamble@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2515083
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Sagar Kamble
2021-04-14 23:40:27 +05:30
committed by mobile promotions
parent 848d80e4d7
commit ff706e5456

@@ -338,7 +338,7 @@ int nvgpu_tsg_unbind_channel_check_hw_state(struct nvgpu_tsg *tsg,
{
struct gk20a *g = ch->g;
struct nvgpu_channel_hw_state hw_state;
-int err;
+int err = 0;
nvgpu_rwsem_down_read(&tsg->ch_list_lock);
g->ops.channel.read_state(g, ch, &hw_state);
@@ -346,9 +346,6 @@ int nvgpu_tsg_unbind_channel_check_hw_state(struct nvgpu_tsg *tsg,
if (g->ops.tsg.unbind_channel_check_hw_next != NULL) {
err = g->ops.tsg.unbind_channel_check_hw_next(ch, &hw_state);
-if (err != 0) {
-return err;
-}
}
if (g->ops.tsg.unbind_channel_check_ctx_reload != NULL) {
@@ -360,7 +357,7 @@ int nvgpu_tsg_unbind_channel_check_hw_state(struct nvgpu_tsg *tsg,
&hw_state);
}
-return 0;
+return err;
}
void nvgpu_tsg_unbind_channel_check_ctx_reload(struct nvgpu_tsg *tsg,