gpu: nvgpu: allow skipping pramin barriers

A wmb() next to each gk20a_mem_wr32() via PRAMIN may be overly careful,
so support skipping these barriers for performance in cases where they
are not necessary because the caller does an explicit barrier after a
batch of writes.
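
The opt-out described above can be modeled in userspace roughly as
follows. This is only a sketch, not the driver code: struct mem_model,
model_wr32(), model_wmb() and barrier_count are hypothetical names, and
a C11 release fence stands in for the kernel's wmb().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical counter so the number of barriers can be observed. */
static unsigned int barrier_count;

/* Userspace stand-in for wmb(); assumption, not the kernel primitive. */
static void model_wmb(void)
{
	atomic_thread_fence(memory_order_release);
	barrier_count++;
}

/* Simplified model of struct mem_desc: only what this sketch needs. */
struct mem_model {
	uint32_t words[16];
	bool skip_wmb;
};

/* One 32-bit write; the barrier is issued unless the caller opted out. */
static void model_wr32(struct mem_model *mem, uint32_t w, uint32_t data)
{
	mem->words[w] = data;
	if (!mem->skip_wmb)
		model_wmb();
}

/*
 * A caller that opts out, writes every word, and then issues a single
 * explicit barrier. Returns how many barriers this call used.
 */
static unsigned int write_many_one_barrier(struct mem_model *mem)
{
	unsigned int before = barrier_count;
	uint32_t i;

	mem->skip_wmb = true;
	for (i = 0; i < 16; i++)
		model_wr32(mem, i, i * 2);
	model_wmb();
	mem->skip_wmb = false;

	return barrier_count - before;
}
```

With the flag set, sixteen writes cost one barrier instead of sixteen;
the caller takes on the responsibility of issuing it.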

Also, move those optional wmb()s out of the per-batch subloops, which
may run multiple times, to the end of the whole internally batched
write for gk20a_mem_{wr_n,memset}.
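
A rough sketch of that barrier placement, under assumptions: the window
size WINDOW_WORDS, struct batch_mem and batch_wr_n() are hypothetical,
the real batching in gk20a_mem_wr_n() is not shown in this commit page,
and a C11 release fence again stands in for wmb().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdatomic.h>

#define WINDOW_WORDS 4u	/* hypothetical window size in 32-bit words */

/* Hypothetical counter so the number of barriers can be observed. */
static unsigned int batch_barriers;

struct batch_mem {
	uint32_t words[32];
	bool skip_wmb;
};

/*
 * Write count words starting at offset, split into batches that never
 * cross a window boundary. The optional barrier is issued once, after
 * the last batch, instead of once per batch. The caller must keep
 * offset + count within words[]. Returns the number of batches used.
 */
static unsigned int batch_wr_n(struct batch_mem *mem, uint32_t offset,
			       const uint32_t *src, uint32_t count)
{
	unsigned int batches = 0;

	while (count > 0) {
		uint32_t room = WINDOW_WORDS - (offset % WINDOW_WORDS);
		uint32_t n = count < room ? count : room;
		uint32_t i;

		for (i = 0; i < n; i++)
			mem->words[offset + i] = src[i];

		offset += n;
		src += n;
		count -= n;
		batches++;
	}

	/* Single barrier for the whole write, skipped on request. */
	if (!mem->skip_wmb) {
		atomic_thread_fence(memory_order_release);
		batch_barriers++;
	}

	return batches;
}
```

A ten-word write starting at word 2 runs three batches here but issues
at most one barrier, and none when skip_wmb is set.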

Jira DNVGPU-23

Change-Id: I61ee65418335863110bca6f036b2e883b048c5c2
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/1225149
(cherry picked from commit d2c40327d1995f76e8ab9cb4cd8c76407dabc6de)
Reviewed-on: http://git-master/r/1227474
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Konsta Holtta
2016-09-12 12:37:30 +03:00
committed by mobile promotions
parent 718af968f0
commit a8e260bc8d
2 changed files with 7 additions and 12 deletions


@@ -74,6 +74,7 @@ struct mem_desc {
 	bool user_mem; /* vidmem only */
 	struct gk20a_allocator *allocator; /* vidmem only */
 	struct list_head clear_list_entry; /* vidmem only */
+	bool skip_wmb;
 };
 struct mem_desc_sub {