gpu: nvgpu: userd slab allocator

We had to force allocation of physically contiguous memory for
USERD in nvlink case, as a channel's USERD address is computed as
an offset from fifo->userd address, and nvlink bypasses SMMU.

With 4096 channels, it can become difficult to allocate 2MB of
physically contiguous sysmem for USERD on a busy system.

PBDMA does not require any sort of packing or contiguous USERD
allocation, as each channel has a direct pointer to that channel's
512B USERD region. When BAR1 is supported we only need the GPU VAs
to be contiguous, to setup the BAR1 inst block.

- Add slab allocator for USERD.
- Slabs are allocated in SYSMEM, using PAGE_SIZE for slab size.
- Contiguous channels share the same page (16 channels per slab).
- ch->userd_mem points to related nvgpu_mem descriptor
- ch->userd_offset is the offset from the beginning of the slab

- Pre-allocate GPU VAs for the whole BAR1
- Add g->ops.mm.bar1_map() method
  - gk20a_mm_bar1_map() uses fixed mapping in BAR1 region
  - vgpu_mm_bar1_map() passes the offset in TEGRA_VGPU_CMD_MAP_BAR1
  - TEGRA_VGPU_CMD_MAP_BAR1 is called for each slab.

Bug 2422486
Bug 200474793

Change-Id: I202699fe55a454c1fc6d969e7b6196a46256d704
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1959032
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
This commit is contained in:
Thomas Fleury
2018-11-20 16:34:21 -08:00
committed by mobile promotions
parent 0eb555c5c0
commit 7e68e5c83d
22 changed files with 264 additions and 147 deletions

View File

@@ -507,10 +507,11 @@ int vgpu_remove(struct platform_device *pdev)
bool vgpu_is_reduced_bar1(struct gk20a *g)
{
struct fifo_gk20a *f = &g->fifo;
struct nvgpu_os_linux *l = nvgpu_os_linux_from_gk20a(g);
struct fifo_gk20a *f = &g->fifo;
u32 size = f->num_channels * f->userd_entry_size;
return resource_size(l->bar1_mem) == (resource_size_t)f->userd.size;
return resource_size(l->bar1_mem) == size;
}
int vgpu_tegra_suspend(struct device *dev)