Commit Graph

2471 Commits

Author SHA1 Message Date
Deepak Nibade
ebb66b5d50 gpu: nvgpu: add macros to get current GR instance
Add macros to get current GR instance id and the pointer
nvgpu_gr_get_cur_instance_ptr()
nvgpu_gr_get_cur_instance_id()

This approach makes sure that the caller is getting GR instance pointer
under mutex g->mig.gr_syspipe_lock in MIG mode. Trying to access
current GR instance outside of this lock in MIG mode dumps a warning.

Return 0th instance in case MIG mode is disabled.

Use these macros in nvgpu instead of direct reference to
g->mig.cur_gr_instance.

Store instance id in struct nvgpu_gr. This is to retrieve GR instance
id in functions where struct nvgpu_gr pointer is already available.

Jira NVGPU-5648

Change-Id: Ibfef6a22371bfdccfdc2a7d636b0a3e8d0eff6d9
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2413140
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
db20451d0d gpu: nvgpu: fix pmm chiplet offsets
gr_gv100_init_hwpm_pmm_register() and gr_gv100_set_pmm_register() right
now assume common chiplet stride for all sys/fbp/gpc and use common API
g->ops.perf.get_pmm_per_chiplet_offset() to get the stride.

Chiplet strides are same for all partitions only by chance, and future
chip might change that.

Hence add and use below 3 separate HALs to get appropriate strides.
g->ops.perf.get_pmmsys_per_chiplet_offset()
g->ops.perf.get_pmmgpc_per_chiplet_offset()
g->ops.perf.get_pmmfbp_per_chiplet_offset()

Also store sys/fbp/gpc perfmon count in struct gk20a after first query
instead of querying them again and again. Querying the counts from HW
is time consuming.

Bug 2510974
Jira NVGPU-5360

Change-Id: I186009221009780d561617c0cd6f535854db585f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2413108
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
7a937a6190 gpu: nvgpu: add debug logs for common.gr debugging
Add separate flag gpu_dbg_gr to enable common.gr specific debugging.
Add this flag to all the existing debug logs that use gpu_dbg_fn or
gpu_dbg_info for debugging. Also add many other debugging logs that
might be helpful in debugging.

Removing debug log in gv11b_gr_init_get_nonpes_aware_tpc() as it dumps
too much data that does not seem useful.

Batch all interrupt enable functions in gr_init_setup_hw() together for
readability.

Jira NVGPU-5648

Change-Id: I0b857650122cdb1f974b452d28c26e7f142baf61
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2411740
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kadamati
19e9b38385 gpu: nvgpu: added new argument to nvhost function
* Added struct gk20a as input argument, which help in chip detection
   nvgpu_nvhost_syncpt_unit_interface_get_byte_offset

Jira NVGPU-6068

Change-Id: I76f342b1c9f51632f72bc00f0328c7cac5991956
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410872
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Dinesh T <dt@nvidia.com>
Reviewed-by: Ankur Kishore <ankkishore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
e0dd79cd43 gpu: nvgpu: rearch mc reset and enable hals
Remove current mc hals
- mc.reset()
- mc.enable()
- mc.disable()
- mc.reset_mask()
- mc.reset_engine()
- mc.reset_engine_enable()

Add new mc hals
- mc.enable_units(g, units, enable)
  > enable/disable given unit(s)
- mc.enable_dev(g, dev, enable)
  > enable/disable engine represented by given device pointer
- mc.enable_devtype(g, devtype)
  > enable/disable all engines of given devtype

Move common mc intr functions to common/mc/mc_intr.c.
Add below common mc functions
- nvgpu_mc_reset_units(g, units)
  > reset given logical OR of nvgpu unit bitmap
- nvgpu_mc_reset_dev(g, dev)
  > reset given single engine via dev
  > if engine is graphics, reset gpcs for nvgpu_next
- nvgpu_mc_reset_devtype(g, devtype)
  > reset all engines of given devtype
  > if devtype is graphics, reset gpcs for nvgpu_next

Bug 200648985
Bug 3109773

Change-Id: Idc67a14a0a7cde83de44fbfbec13007fead3ed5c
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2408523
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
deepak goyal
215403552f gpu: nvgpu: falcon2 core loading support
- Added ops for new core.
- Added firmware structs for new core.

JIRA NVGPU-5736

Change-Id: Ifebc8987bf3a749803c1c5539e7d08716c1842a4
Signed-off-by: deepak goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2372104
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Srirangan Madhavan <smadhavan@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
47dc015b86 gpu: nvgpu: Add physical gpu instance support
This patch added the physical gpu intance support when MIG
is enabled.

JIRA NVGPU-5647

Change-Id: Ic642b88ebc70ea6114e63c2287db8bca00860c67
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2410698
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
c8add76c8d gpu: nvgpu: increment gr instance id in macros
Increment gr instance id in loop implemented for below macros
nvgpu_gr_exec_with_ret_for_each_instance
nvgpu_gr_exec_for_each_instance

Ensure remap window is disabled in case function returns error
in nvgpu_gr_exec_with_ret_for_each_instance

Jira NVGPU-5648

Change-Id: I72d34bbfd4067e3448883b5daeee45c614ee029f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2409638
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
3b746dce0c gpu: nvgpu: use a falcon flag instead of enabled bit
common.gr unit right now makes use of a capability bit
NVGPU_PMU_FECS_BOOTSTRAP_DONE to ensure the recovery path hits a
different routine. This is actually needless and a common check
cannot be used for all GR instances anyways.

Delete this capability bit. Add and use a new flag
coldboot_bootstrap_done added under struct nvgpu_gr_falcon

Jira NVGPU-5648

Change-Id: I46faea6f07cf054f17a3215d4cbbe0fc8a6382ae
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2409533
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Antony Clince Alex
f5f9cd9cba gpu: nvgpu: update nvgpu_(err/ecc) to support nvgpu-next errors
Update nvgpu_err and nvgpu_ecc units to support nvgpu-next errors.

Jira NVGPU-5286

Change-Id: Iaa36c408a46aec163ba3e375bb7c363aa96bdc8d
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2407382
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Vaibhav Kachore <vkachore@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
fc12a284bf gpu: nvgpu: initialize per GR instance config
Expose below two new APIs from common.grmgr unit
nvgpu_grmgr_get_gr_num_gpcs() - get per instance number of GPCs
nvgpu_grmgr_get_gr_gpc_phys_id() - get physical GPC id for MIG engine
local id in corresponding instance

Execute gr_init_config() for each GR instance.
Add gr_config_init_mig_gpcs() to initialize GPC data in case MIG is
enabled. Separate out gr_config_init_gpcs() for legacy GPC data
initialization.

These functions will inititialize below data in struct nvgpu_gr_config:
max_gpc_count
gpc_count
gpc_mask
gpc_tpc_mask[gpc_count]
max_tpc_per_gpc_count

Rest of the values in struct nvgpu_gr_config are either based on above
values, or read from HW after setting GPC PRI window.

In gr_config_alloc_struct_mem(), rename total_gpc_cnt to total_tpc_cnt
since it represents total TPC count and not GPC. Remove use of temp3
variable since it does not give any idea on usage.

Jira NVGPU-5648

Change-Id: I646cac2ddc312e72b241b1b2a0e51a5cce141535
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406390
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
002edb782a gpu: nvgpu: move cur_gr_instance tracking to MIG infra
Move cur_gr_instance from struct gk20a to struct nvgpu_mig since this
tracking is really MIG specific.

Jira NVGPU-5648

Change-Id: I27b124925c2291e352ef9456c7189da0bc447a42
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2406389
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
2b48aa5b0c gpu: nvgpu: Add device for_each macro
Add a macro to iterate over a device list; it is just a wrapper to
the nvgpu_list_for_each() macro. It lets code iterate over the
list of detected devices without being aware of the underlying
instance IDs.

This also removes the need to do a separate nvgpu_device_get()
and subsequent NULL checking. This will reduce overhead for
unit testing!

Change-Id: If41dbee30a743d29ab62ce930a819160265b9351
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2404914
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
717921a274 gpu: nvgpu: return intr mask of all GR engine instances
nvgpu_gr_engine_interrupt_mask() earlier returned mask of all GR engine
instance interrupts. During device refactor series, this got changed to
return interrupt of only first instance.

Change this again to return interrupt mask of all the GR engine
instances since common.mc unit does not yet support APIs to enable
interrupt of individual GR instance.

Update nvgpu_gr_get_syspipe_id() API to take gr_instance_id as parameter
instead of struct nvgpu_gr pointer. Definition of struct nvgpu_gr is not
available outside of common.gr unit.

Jira NVGPU-5648

Change-Id: I5320d1515eea6054150dc14706a16475bd650da7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2405409
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
e579a708f7 gpu: nvgpu: add macros to run function on given gr instance in SMC mode
Add below new macros to execute given function on given gr instance
in SMC mode. If SMC mode is not enabled or supported, function is
executed as is.

1) nvgpu_gr_exec_for_instance
Execute a function for given GR instance by configuring GR remap
window for that instance. Function being executed returns void.

2) nvgpu_gr_exec_with_ret_for_instance
Execute a function for given GR instance by configuring GR remap
window for that instance. Function being executed returns an error.

Jira NVGPU-5648

Change-Id: I3f051bcfdac297b7e5216da9eeee06a46878e7b8
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2405407
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Deepak Nibade
6745b0685e gpu: nvgpu: support resetting each GR instance
Add a new header file <nvgpu/gr/gr_instances.h> that supports below
macros to execute various functions for GR instances

1) nvgpu_gr_exec_for_each_instance
   Execute a function for each GR instance by configuring GR remap
   window for that instance. Function being executed returns void.

2) nvgpu_gr_exec_with_ret_for_each_instance
   Execute a function for each GR instance by configuring GR remap
   window for that instance. Function being executed returns an error.

3) nvgpu_gr_exec_for_all_instances
   Execute a function for all GR instances at once. For this GR remap
   window needs to be disabled temporarily.

If CONFIG_NVGPU_MIG is disabled, all above macros will turn into simple
funciton calls.
If CONFIG_NVGPU_MIG is disabled or if runtime flag  NVGPU_SUPPORT_MIG is
disabled, all above macros will turn into simple function calls that
configure single GR instance.

Separate out GR engine reset code into new API gr_reset_engine() and
execute it with nvgpu_gr_exec_with_ret_for_each_instance().

PROD values need to be loaded in legacy mode, hence call
nvgpu_cg_init_gr_load_gating_prod() inside
nvgpu_gr_exec_for_all_instances().

Rename gr_init_prepare_hw() to more appropriate
gr_reset_hw_and_load_prod()

Moe gops.gr.init.fifo_access() call to gr_init_reset_enable_hw().

Add new API nvgpu_grmgr_get_gr_syspipe_id() to query GR instance syspipe
id from common.grmgr unit. Add nvgpu_gr_get_syspipe_id() that returns
same value stored in nvgpu_gr struct.

Add cur_gr_instance field to struct nvgpu_gr to track current GR
instance being programmed under remap window.

Jira NVGPU-5648

Change-Id: I86920303427a6e6547ebf195daa37438365bb38e
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2403550
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Debarshi Dutta
38ce6fa717 gpu: nvgpu: change unnamed structs to named structs
Following changes are made in this patch.
1) Change unnamed structs within gpu_ops to named structs
with the prefix gops_*.

2) Each named struct gops_ are moved into a separate gops specific file
under include/nvgpu/gops/

3) struct gpu_ops is moved into a separate file include/nvgpu/gpu_ops.h
and all other dependent struct gops_* are included in this header.

4) Direct references to include/nvgpu/gops are removed from files as its enough
to include gk20a.h.

Change-Id: Ieb22cb853be567e3bef14f5f8a04674eebd902ea
Signed-off-by: Debarshi Dutta <ddutta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398776
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Rajesh Devaraj <rdevaraj@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
rmylavarapu
d0c01fc14c gpu: nvgpu: Support ELPG feature on nvgpu-next
Changes:
 -Implemented pg init_send ops for legacy chips.
 -Implemented RPC response handler.
 -Added pg rpc function call macros for nvgpu-next.

NVGPU-5192
NVGPU-5195
NVGPU-5196

Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Change-Id: I4e99d3929d7db796434aaeaa6f5773e9aac9fd32
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2391029
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
a2809088eb gpu: nvgpu: remove unnecessary hal gops.gr.gr_enable_hw()
gops.gr.gr_enable_hw() is a common function and not referred on vGPU.
Remove HAL pointer and directly use nvgpu_gr_enable_hw() instead.

Jira NVGPU-5648

Change-Id: Id031024ed01f9d890cffb5902cc433800810b219
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2403548
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Sagar Kadamati
e773cb6087 gpu: nvgpu: re-organize interrupt logic
* Removed unnecessary irqs_enabled flag, and
   Replaced enable/disable irq logics with nvgpu variant functions.
 * Added nvgpu_interrupts data structure to hold interrupt details.
 * Interpret all stall irqs first and followed by nonstall irq from dt.
 * Used interrupt size checks for enable/disable irqs instead of
   comparing stall and nonstall interrupt lines.

Now adding new stall interrupt lines as easy as just updating macro.

Jira NVGPU-6019

Change-Id: I5a5eaa8d333c68ee87d25d2b45ec244ec8d7b297
Signed-off-by: Sagar Kadamati <skadamati@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400777
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
1117ea1286 gpu: nvgpu: ce: check address ranges before exec
The source and destination addresses are masked to low 40 bits only.
Make sure that the input params don't cross that; it would mean a bug
somewhere in the caller side. Silently truncating values could cause
unexpected behaviour, but no device even has that much memory.

Also rename the src_buf and dst_buf to src_paddr and dst_paddr to
emphasize that the addresses are gpu physical.

Jira NVGPU-5172

Change-Id: I30653bf93791517991d04e4ba43220b5b541f581
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2402031
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
82b4a8e825 gpu: nvgpu: ce: allocate exact cmdbuf size
Avoid the magic value 256 by basing the constant max cmdbuf bytes per
submit on the actual data used in the submits. Each submit contains a
setclass header and at most two transfer or memset operations.

Jira NVGPU-5172

Change-Id: I66d715fe5e7fcfc676c0d78a3cf35c2c6197a342
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2402029
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
a54e4f1d74 gpu: nvgpu: ce: use clear upper bound for op size
The copyengine code to do big transfers or memsets supports a 64-bit
size. Each copy is done as a rectangle with either side being at most
2GB, so a size that does not align nicely is split into multiple ops. It
turns out that there are at most two of these ops, so structure the code
to not loop but do two ops explicitly.

The first copyengine operation works with the first chunk that is less
than two gigabytes long. That leaves the remaining size to be a multiple
of two gigabytes, so it's sufficient to do just another operation as a
2D rectangle whose width is two gigabytes; the remaining size determines
the height, i.e. the number of two-gig lines.

The loop did just this already, but now with at most two operations per
submit the required pushbuf length is seen more easily from the code.

Jira NVGPU-5172

Change-Id: I6bca3b1204db3b79e131898c07018a1337d85774
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2402028
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
4351978013 gpu: nvgpu: ce: make payload param u32
The payload word used for copyengine memsets is written to an u32
buffer, so use the correct type from the beginning.

Jira NVGPU-5172

Change-Id: Id813e042b609cb9d0705ba32d3cc03351bded413
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2402027
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
2427d45102 gpu: nvgpu: initialize gr ecc counters for each instance
Add new API nvgpu_ecc_counter_init_per_gr() to initialize ECC counters
per GR instance.
Switch NVGPU_ECC_COUNTER_INIT_GR macro to use
nvgpu_ecc_counter_init_per_gr() instead of nvgpu_ecc_counter_init().

Fix error handling path in nvgpu_gr_alloc().

Jira NVGPU-5648

Change-Id: I18f1bf8b245956bdb5a3e4bb6b03114282366ce6
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2402025
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
f4cc6bf7b9 gpu: nvgpu: add wrapping_sub_u32
Add nvgpu_wrapping_sub_u32() to perform static analysis safe arithmetic
where unsigned wraparound is expected. nvgpu_safe_sub_u32() expects that
the result does not wrap, so it cannot be used in such cases.

Jira NVGPU-5506

Change-Id: I904bd749da0eb44ad6d5a4f00490eaec7fa55839
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2401291
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
1487401072 gpu: nvgpu: remove nvhost syncpt max apis
The nvhost-tracked max values are no longer used now that the channel
sync code tracks the values when needed. Delete the wrappers.

Jira NVGPU-5506

Change-Id: Ia0da1d7529bc560895e7d58647abeb5659478c58
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400636
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
5e570610b3 gpu: nvgpu: track syncpt max internally
The max values that the Linux nvhost driver tracks are adding some
complexity to our wrapper APIs. Max values are used only for internal
submit syncpoint tracking, so implement that tracking in the sync code
by just storing the last value that the syncpoing will reach after all
jobs are complete.

The value is a simple u32. It's accessed from functions in the submit
path that already is serialized, so there's no worrying about atomic
modifications.

Previously nvhost_syncpt_set_min_eq_max_ext() was used to reset the
syncpoint when necessary. Now with the internal max value we'll use
nvhost_syncpt_set_minval(), so add a wrapper for it.

The maxval reported with the user syncpoint allocation is just the
current value at allocation time since no jobs have affected it yet;
there is no means for the kernel to track the max value of user
syncpoints.

Jira NVGPU-5506

Change-Id: I34672eaa7fe3af36b2fbac92d11babe2bc6a2d2b
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400635
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
d4fb476e70 gpu: nvgpu: remove joblist cleanup lock
The joblist cleanup lock exists to synchronize the submit job cleanup
and the abort cleanup that may run in separate threads concurrently.
This concurrency is no problem anymore, so delete the lock.

The lock was added in commit f1072a28be ("gpu: nvgpu: add worker for
watchdog and job cleanup") when the abort cleanup still went through
each job in the pending list and released their semaphores; ordinary job
cleanup from the worker thread also accesses the jobs. Commit d20a501dcb
("gpu: nvgpu: simplify job semaphore release in abort") deleted the
entire loop because the semaphore, if any, is now reset in one go (via
the "set_min_eq_max" ch sync op), but the lock stayed.

With aggressive sync destroy enabled the sync object under the cleanup
lock can still disappear if the job cleanup runs, but that's already
guarded with the sync lock.

Jira NVGPU-5998

Change-Id: I6554eb2065b003c6fdf83f66f97067b59aa272f5
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2401467
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
8cccb49bd2 gpu: nvgpu: collapse nvgpu_gr_prepare_sw into nvgpu_gr_alloc
common.gr unit exports a separate API nvgpu_gr_prepare_sw to
initialize some SW pieces required for nvgpu_gr_enable_hw().
A separate API is really unnecessary since same initialization
can be performed in nvgpu_gr_alloc().

Remove nvgpu_gr_prepare_sw() and HAL gops.gr.gr_prepare_sw().
Initialize falcon and interrupt structures in loop from
nvgpu_gr_alloc().

Move nvgpu_netlist_init_ctx_vars() from nvgpu_gr_prepare_sw() to
common init path since netlist parsing need not be done from
common.gr unit. It just needs to happen before nvgpu_gr_enable_hw().

Also, trigger nvgpu_gr_free() from gr_remove_support() instead
of OS specific paths. Also remove nvgpu_gr_free() calls from
probe error paths since nvgpu_gr_alloc is no longer called in
probe path.

Move interrupt and falcon data structure free calls to nvgpu_gr_free().

Also remove corresponding unit testing code that tests
nvgpu_gr_prepare_sw() specifically.
Update some unit tests to initialize ecc counters and netlist.
Disable some unit tests that fail for reasons unknown.

Jira NVGPU-5648

Change-Id: I82ec8160f76530bc40e0c11a9f26ba1c8f9cf643
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400166
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
cfa360f5b8 gpu: nvgpu: allocate struct nvgpu_gr based on enumerated gr count
Add new API nvgpu_grmgr_get_num_gr_instances() that returns number of
GR instance enumerated by GR manager. This just returns number of sys
pipes enabled since it is same as number of GR instances.

For consistency until common.gr supports multiple GR instances
completely, add a temporary macro NVGPU_GR_NUM_INSTANCES and set it
to 1. If this macro is changed to 0 (for local MIG testing), fall
back to use nvgpu_grmgr_get_num_gr_instances() to get enumerated number
of GR instances.

Use a for loop to initialize other variables of struct nvgpu_gr.

Remove unnecessary NULL check in nvgpu_gr_alloc() since struct gk20a
pointer can never be NULL in this path. Also remove corresponding unit
test code.

Jira NVGPU-5648

Change-Id: Id151d634a23235381229044f2a9af89e390886f2
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2400151
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Richard Zhao
625d6520b8 gpu: nvgpu: vgpu: use mempool to get constants
As the number of engines increases, the constants structure exceeds 512B
which is cpu cacheline size and IVC queue frame size. So move to use
mempool to get constants.

Also increases max engine number.

Jira GVSCI-4645

Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Change-Id: I301386fb8bcc9bb90a30e40f40ba1ecfaa311514
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399829
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Shashank Singh
cdc96f900f gpu: nvgpu: do sm id programming early
Move sm id programming before loading ctxsw and gpccs firmwares. This
is the actual sequence expected by ctxsw ucode. Legacy chips will use
the same old sequence.

Bug 200631350

Change-Id: I3cc1384982b238475af47da6a25e2acd6616fd84
Signed-off-by: Shashank Singh <shashsingh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398300
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
e8201d6ce3 gpu: nvgpu: decouple channel watchdog dependencies
The channel code needs the watchdog code and vice versa. Cut this
circular dependency with a few simplifications so that the watchdog
wouldn't depend on so much.

When calling watchdog APIs that cause stores or comparisons of channel
progress, provide a snapshot of the current progress instead of a whole
channel pointer. struct nvgpu_channel_wdt_state is added as an interface
for this to track gp_get and pb_get.

When periodically checking the watchdog state, make the channel code ask
whether a hang has been detected and abort the channel from within
channel code instead of asking the watchdog to abort the channel. The
debug dump verbosity flag is also moved back to the channel data.

Move the functionality to restart all channels' watchdogs to channel
code from watchdog code. Looping over active channels is not a good
feature for the watchdog; it's better for the channel handling to just
use the watchdog as a tracking tool.

Move a few unserviceable checks up in the stack to the callers of the
wdt code. They're a kludge but this will do for now and demonstrates
what needs to be eventually fixed.

This does not leave much code in the watchdog unit. Now the purpose of
the watchdog is to only isolate the logic to couple a timer and progress
snapshots with careful locking to start and stop the tracking.

Jira NVGPU-5582

Change-Id: I7c728542ff30d88b1414500210be3fbaf61e6e8a
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369820
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
8ec1ec4f69 gpu: nvgpu: add syncpt_get_name() posix stub
Add stub for nvgpu_nvhost_syncpt_get_name() stub in posix.

JIRA NVGPU-5363

Change-Id: I3a1826e47685d54bda63cf04aa327adcb3da422e
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2369658
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
d020778c55 gpu: nvgpu: reserve pma stream for legacy profiler
Legacy profiler does not reserve PMA stream resource with PM
reservation system. Also, HWPM system reset is separately implemented
in membuf disable path. And it does not even restore perf unit SLCG
prod values.

Allcoate a dummy profiler object for debug session in perfbuf map
path. Free it in perfbuf unmap path.

This has advantage of synchronizing PMA stream reservation with new
profiler stack. And this also leverages HWPM system reset and SLCG
handling code during resource reservation.

Remove explicit HWPM reset from gops.perf.membuf_reset_streaming()
HALs

Bug 2510974
Jira NVGPU-5360

Change-Id: I54c5202b6251dea3d80a4dfc011e8a296339e07f
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399595
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
d90b9a3d4e gpu: nvgpu: reset HWPM system on reservation
Hardware HWPM system should be reset when first reservation is made
either for HWPM or PMA_STREAM resource. Support this with below changes

- Add hwpm_refcount counter to track HWPM and PMA_STREAM resource
  reservation count
- Increment counter on every HWPM/PMA resource reservation
- Decrement counter on every resource reservation release
- Reset HWPM system in MC and disable perf unit SLCG on first refcount
  increment
- Reset HWPM system in MC and re-enable perf unit SLCG after last
  refcount decrement
- Add nvgpu_cg_slcg_perf_load_enable() to manage perf unit SLCG

Bug 2510974
Jira NVGPU-5360

Change-Id: I20d2927947c3e4d8073cd3131b7733791e9c9346
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399594
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
dfd9feace6 gpu: nvgpu: recover pbdma errors before ack
When a pbdma fault needs a channel teardown, do the recovery/teardown
process before acking the pbdma interrupt status back. Acking it causes
the hardware to proceed which could release fences too early before the
involved channel(s) have been found to be broken.

With these host copyengine interrupts, the teardown sequence is light
and proceeds even with the pbdma intr flag still set; there are no
engines to reset when these pbdma launch check interrupts happen. The
bad tsg is just disabled and the channels in it aborted.

A few unit tests are so heavily affected by this refactor that they
would need to be rewritten. They're not strictly needed at the moment,
so do only half of the rewrite: just delete them.

Bug 200611198

Change-Id: Id126fb158b6d05e46ba124cd426389046eedc053
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2392669
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
370ac6cc98 gpu: nvgpu: Update grmgr code to use nvgpu_device struct
Instead of the nvgpu_engine_get_ids() function that will shortly
be deleted, use the new nvgpu_device_get_copies() function.

JIRA NVGPU-5421

Change-Id: I2b778b7818e885c807dfa90f15d03cddba9e59fc
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2399165
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Alex Waterman
fba96fdc09 gpu: nvgpu: Replace nvgpu_engine_info with nvgpu_device
Delete the struct nvgpu_engine_info as it's essentially identical to
struct nvgpu_device. Duplicating data structures is not ideal as it's
terribly confusing what does what.

Update all uses of nvgpu_engine_info to use struct nvgpu_device. This
is often a fairly straight forward replacement. Couple of places though
where things got interesting:

  - The enum_type that engine_info uses is defined in engines.h and
    has a bit of SW abstraction - in particular the GRCE type. The only
    place this seemed to be actually relevant (the IOCTL providing device
    info to userspace) the GRCE engines can be worked out by comparing
    runlist ID.
  - Addition of masks based on intr_id and reset_id; those can be
    computed easily enough using BIT32() but this is an area that
    could be improved on.

This reaches into a lot of extraneous code that traverses the fifo
active engines list and dramtically simplifies this. Now, instead of
having to go through a table of engine IDs that point to the list of
all host engines, the active engine list is just a list of pointers to
valid engines. It's now trivial to do a for-all-active-engines type
loop. This could even be turned into a generic macro or otherwise
abstracted in the future.

JIRA NVGPU-5421

Change-Id: I3a810deb55a7dd8c09836fd2dae85d3e28eb23cf
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2319895
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Alex Waterman
df9695bd13 gpu: nvgpu: Add copy engine query device APIs
Add two new APIs to the device code to query copy engines. These APIs
handle the annoying change from COPY0-2 to LCEs in the Pascal generation.

  nvgpu_device_get_copies()
  nvgpu_device_get_async_copies()

The first function gets all copy engines; the latter queries only async
copy engines: that is CEs that do not share a runlist with the GR engine.

JIRA NVGPU-5421

Change-Id: I707d9b004994b91f9d77974133912af9b9955882
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398597
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Lakshmanan M
48f1da4dde gpu: nvgpu: Add bundle skip sequence in MIG mode
In MIG mode, 2D, 3D, I2M and ZBC classes are not supported
by GR engine. So skip those bundle programming sequence in MIG mode.

JIRA NVGPU-5648

Change-Id: I7ac28a40367e19a3e31e63f3e25991c0ed4d2d8b
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397912
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
rmylavarapu
4787220ffe gpu: nvgpu: Create ELPG cmd functions
In nvgpu-next ELPG unit support RPC calls and no
longer support command calls to communicate to PMU.
This change will create separate ELPG command
functions which can be called for legacy chips and
can be replaced by RPC functions for nvgpu-next chip.

NVGPU-5195

Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Change-Id: Iddea0f46eb3506a4f2d44d664f610215b8f1b666
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2386923
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Vedashree Vidwans
ae25924393 gpu: nvgpu: print enabled_flags after poweron
GPU enabled_flags indicate features supported by nvgpu.
Add nvgpu_print_enabled() to print GPU enabled_flags. Print flag value
after poweron complete to help during debug.
Add verbose function to print flag name and status if gpu_dbg_info is
set.

JIRA NVGPU-5838

Change-Id: I3b0ddb8c6872f4f3b6101050da087ff553c16f84
Signed-off-by: Vedashree Vidwans <vvidwans@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2383531
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Antony Clince Alex
16d54e83bf gpu: nvgpu: remove nvgpu_next functions from nvgpu_mc unit
At present nvgpu_mc unit contains nvgpu_next_mc function definitions under
conditional compilation macro. Move these functions to nvgpu_next specific
files.

Jira NVGPU-6004

Change-Id: Ieef68dad3c20941fd5580cad7341f165880f08ad
Signed-off-by: Antony Clince Alex <aalex@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2396323
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00
Konsta Hölttä
c73a2bddc9 gpu: nvgpu: delete unused nvgpu_nvhost_get_syncpt_host_managed
We're using client managed syncpoints only. Delete this historical
artifact.

Jira NVGPU-5506

Change-Id: I8ebe34310eb99fd1fec2b238500aa9f4502cf09a
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398406
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-misra <svc-mobile-misra@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Konsta Hölttä
a439d3767d gpu: nvgpu: silence coverity on fence code
- use release instead of free for the fence destroy identifier
- nvhost_dev is a struct name, so use nvhost_device
- compare nvgpu_nvhost_syncpt_read_ext_check retval properly

Also, if the syncpt read fails when checking for fence expiration,
behave as if the wait isn't expired. Possibly getting stuck is safer
than possibly continuing too early.

Jira NVGPU-5617

Change-Id: Ied529e25f8c43f1c78fd9eac73b9cd6c3550ead5
Signed-off-by: Konsta Hölttä <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2398399
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Deepak Nibade
010f818596 gpu: nvgpu: initialize gr struct in poweron path
struct nvgpu_gr is right now initialized during probe and from OS
specific code. To support multiple instances of graphics engine,
nvgpu needs to initialize nvgpu_gr after number of engine instances
have been enumerated in poweron path.
Hence move nvgpu_gr_alloc() to poweron path and after gr manager has
been initialized.

Some of the members of nvgpu_gr are initialized in probe path and they
too are in OS specific code. Move them to common code in
nvgpu_gr_alloc()

Add field fecs_feature_override_ecc_val to struct gk20a to store the
override flag read from device tree. This flag is later copied to
nvgpu_gr in poweron path.

Update tpc_pg_mask_store() to check for g->gr being NULL before
accessing golden image pointer.
Update tpc_fs_mask_store() to return error if g->gr is not initialized.
This path needs nvgpu_gr struct initialized. Also fix the incorrect
NULL pointer check in tpc_fs_mask_store() which breaks the write path
to this sysfs.

Jira NVGPU-5648

Change-Id: Ifa2f66f3663dc2f7c8891cb03b25e997e148ab06
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2397259
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Lakshmanan M <lm@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
rmylavarapu
641cc6a59c gpu: nvgpu: Support Perfmon events for nvgpu-next
Created Perfmon events handling for nvgpu-next.
Nvgpu-next pmu send perfmon events in the form of
rpc events. Events are:
- Change event: This gives information of whether
  it is increase/decrease event.
- Init event: This gives information of perfmon init
  done in PMU.

NVGPU-5202
NVGPU-5205
NVGPU-5206

Signed-off-by: rmylavarapu <rmylavarapu@nvidia.com>
Change-Id: Ida7e77dbaf70d2b594a0801c91a168dcb4a860bd
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2395358
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
2020-12-15 14:13:28 -06:00
Lakshmanan M
2a6fcec078 gpu: nvgpu: add gr manager ops-2 and mig infra-2
This CL covers the code changes related to following support,
 - Enabled gr manager ops.
 - Added gr manager init/remove support.
 - Refactor in gpu instance config infra.
 - Refactor in gr syspipe gpcs config infra.

JIRA NVGPU-5645
JIRA NVGPU-5646

Change-Id: Ib2fab2796d76fe105fc5a08f2c5f9bfa36317f7c
Signed-off-by: Lakshmanan M <lm@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvgpu/+/2393550
Reviewed-by: automaticguardword <automaticguardword@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: svc-mobile-cert <svc-mobile-cert@nvidia.com>
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
GVS: Gerrit_Virtual_Submit
2020-12-15 14:13:28 -06:00