gpu: nvgpu: hold power ref for deterministic channels

To support deterministic channels even on platforms where railgating
is supported, have each deterministic-marked channel hold a power
reference for its lifetime, and skip taking per-job power refs in the
submit path for those channels.
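
The power reference scheme described above can be sketched roughly as
follows. This is a minimal user-space model, not the driver code: every
sketch_* name and both structs are hypothetical stand-ins, and the
counter only imitates what gk20a_busy()/gk20a_idle() do for railgating.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the nvgpu structures. */
struct sketch_gpu {
	int power_refs;	/* the GPU may be railgated only when this is 0 */
};

struct sketch_channel {
	struct sketch_gpu *g;
	bool deterministic;
};

/* Models taking a power ref (may be slow if the GPU is off). */
static void sketch_busy(struct sketch_gpu *g)
{
	g->power_refs++;
}

/* Models dropping a power ref. */
static void sketch_idle(struct sketch_gpu *g)
{
	assert(g->power_refs > 0);
	g->power_refs--;
}

static void sketch_channel_open(struct sketch_channel *ch,
				struct sketch_gpu *g, bool deterministic)
{
	ch->g = g;
	ch->deterministic = deterministic;
	if (deterministic)
		sketch_busy(g);	/* held for the channel's lifetime */
}

static void sketch_channel_close(struct sketch_channel *ch)
{
	if (ch->deterministic)
		sketch_idle(ch->g);
}

/* Returns true if this submit took a per-job power ref. */
static bool sketch_submit_begin(struct sketch_channel *ch)
{
	if (ch->deterministic)
		return false;	/* GPU is already kept on by the channel */
	sketch_busy(ch->g);
	return true;
}

/* Drops the per-job power ref, if one was taken at submit time. */
static void sketch_job_done(struct sketch_channel *ch, bool took_ref)
{
	if (took_ref)
		sketch_idle(ch->g);
}
```

In this model a deterministic submit never touches the power refcount,
so its latency does not depend on the GPU's power state.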

Previously, railgating blocked deterministic submits in general because
the gk20a_busy()/gk20a_idle() calls in the submit path could take time
and, more significantly, because the GPU might need to be powered on,
which takes a long and nondeterministic amount of time.

As an exception, gk20a_do_idle() can still block deterministic submits
until gk20a_do_unidle() is called. Add a rwsem to guard this. VPR resize
needs do_idle, which conflicts with deterministic channels' requirement
to keep the GPU on. This is now documented in the ioctl header.
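
The interaction between do_idle and deterministic submits might look
roughly like the sketch below. It is a single-threaded model of
reader/writer semantics with try-style operations, not a real lock and
not the kernel rwsem API. It assumes that the idle path takes the write
side and that deterministic submits take the read side; all sketch_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of reader/writer state; not struct rw_semaphore. */
struct sketch_rwsem {
	int readers;
	bool writer;
};

static bool sketch_try_read(struct sketch_rwsem *s)
{
	if (s->writer)
		return false;	/* a real reader would sleep here */
	s->readers++;
	return true;
}

static void sketch_read_unlock(struct sketch_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

static bool sketch_try_write(struct sketch_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;	/* a real writer would sleep here */
	s->writer = true;
	return true;
}

static void sketch_write_unlock(struct sketch_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

/* Idle path: hold the write side until the matching unidle. */
static bool sketch_do_idle(struct sketch_rwsem *det_busy)
{
	return sketch_try_write(det_busy);
}

static void sketch_do_unidle(struct sketch_rwsem *det_busy)
{
	sketch_write_unlock(det_busy);
}

/* Returns 0 on success, -1 if the submit would have to wait. */
static int sketch_det_submit(struct sketch_rwsem *det_busy)
{
	if (!sketch_try_read(det_busy))
		return -1;
	/* ... the work would be queued here ... */
	sketch_read_unlock(det_busy);
	return 0;
}
```

Many submits can hold the read side at once, so they do not serialize
against each other. Only the idle/unidle window excludes them.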

Make NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_NO_JOBTRACKING always
set in the GPU characteristics now that it's supported. The only thing
still blocking NVGPU_GPU_FLAGS_SUPPORT_DETERMINISTIC_SUBMIT_FULL is
the sync framework.

Make the channel debug dump show which channels are deterministic.

Bug 200291300
Jira NVGPU-70

Change-Id: I47b6f3a8517cd6e4255f6ca2855e3dd912e4f5f3
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1483038
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Konsta Holtta
2017-05-16 13:47:58 +03:00
committed by mobile promotions
parent 3c3c39dfe0
commit 7680fd689e
9 changed files with 189 additions and 29 deletions

@@ -3494,10 +3494,11 @@ void gk20a_dump_channel_status_ramfc(struct gk20a *g,
 syncpointa = inst_mem[ram_fc_syncpointa_w()];
 syncpointb = inst_mem[ram_fc_syncpointb_w()];
-gk20a_debug_output(o, "%d-%s, pid %d, refs: %d: ", hw_chid,
+gk20a_debug_output(o, "%d-%s, pid %d, refs %d%s: ", hw_chid,
 g->name,
 ch_state->pid,
-ch_state->refs);
+ch_state->refs,
+ch_state->deterministic ? ", deterministic" : "");
 gk20a_debug_output(o, "channel status: %s in use %s %s\n",
 ccsr_channel_enable_v(channel) ? "" : "not",
 gk20a_decode_ccsr_chan_status(status),
@@ -3576,6 +3577,7 @@ void gk20a_debug_dump_all_channel_status_ramfc(struct gk20a *g,
 ch_state[chid]->pid = ch->pid;
 ch_state[chid]->refs = atomic_read(&ch->ref_count);
+ch_state[chid]->deterministic = ch->deterministic;
 nvgpu_mem_rd_n(g, &ch->inst_block, 0,
 &ch_state[chid]->inst_block[0],
 ram_in_alloc_size_v());
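
To illustrate what the new format string in the first hunk produces,
here is a small standalone sketch. sketch_format_channel() is a
hypothetical helper that uses snprintf() in place of
gk20a_debug_output(), and the argument values in the usage note are
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats the channel header the way the new
 * format string in the hunk above does, appending ", deterministic"
 * only for deterministic channels.
 */
static int sketch_format_channel(char *buf, size_t len, int chid,
				 const char *name, int pid, int refs,
				 bool deterministic)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "%d-%s, pid %d, refs %d%s: ",
			chid, name, pid, refs,
			deterministic ? ", deterministic" : "");
}
```

With chid 3, name "gpu.0", pid 1234 and 2 refs, a deterministic channel
is formatted as "3-gpu.0, pid 1234, refs 2, deterministic: " and a
regular one as "3-gpu.0, pid 1234, refs 2: ".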