- Commit c61e21c868 fixed a race condition in PMU state transition.
- The race condition is that the PMU response (the interrupt callback for
messages) can run faster than the kthread posting commands to the PMU,
and thus the PMU message callback may skip an important PMU state change.
- Commit c61e21c868 fixed this by updating the PMU state only from the
callback, while other places could only update the pmu_state variable.
- However, this commit introduced a regression:
- When the PMU state is PMU_STATE_INIT_RECEIVED, we loop over every
engine supported by the GPU. For each engine, if the state is
PMU_STATE_INIT_RECEIVED, we change it to PMU_STATE_ELPG_BOOTING and
init ELPG; if the state is anything else, we throw the error
"PMU INIT not received".
- Now, if the GPU supports multiple engines, the first engine sees that
pmu_state is PMU_STATE_INIT_RECEIVED and changes it to
PMU_STATE_ELPG_BOOTING. From the second engine onwards the state has
already changed to PMU_STATE_ELPG_BOOTING, so every engine except the
first throws the error "PMU INIT not received".
- This patch fixes the issue by changing the PMU state from
PMU_STATE_INIT_RECEIVED to PMU_STATE_ELPG_BOOTING only once (see the
sketch below).
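A minimal sketch of the fixed flow, assuming the state check can be
hoisted out of the per-engine loop; names other than pmu_state and the
PMU_STATE_* values are illustrative, not the exact nvgpu code:

    /* Transition out of PMU_STATE_INIT_RECEIVED exactly once, before
     * iterating over the engines, so engines after the first do not
     * observe PMU_STATE_ELPG_BOOTING and report a spurious error. */
    if (pmu->pmu_state == PMU_STATE_INIT_RECEIVED) {
            pmu->pmu_state = PMU_STATE_ELPG_BOOTING;
            for (engine_id = 0; engine_id < num_engines; engine_id++)
                    init_elpg_for_engine(pmu, engine_id); /* hypothetical helper */
    } else {
            nvgpu_err(g, "PMU INIT not received");
    }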
Bug 200372838
JIRA EVLR-2164
Change-Id: Ic8c954d14acb1d6ec3adcbc4bcf4d4745542d9f0
Signed-off-by: Deepak Bhosale <dbhosale@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1769814
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-by: Aparna Das <aparnad@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: svccoveritychecker <svccoveritychecker@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
While removing the nvgpu driver, devm-mapped reg mappings are
released on driver_unregister. For iGPU, these regs are also
explicitly unmapped with iounmap(). This results in "Trying to
vfree() nonexistent vm area" warnings on driver removal.
Address this by using the devm_* variants to map all IO regions of
both iGPU and dGPU, and let driver unregister release these mappings
(see the sketch below).
Also, lock out GPU register accesses in the driver removal path.
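A sketch of the device-managed mapping pattern this change moves to;
the wrapper shown is illustrative, not the actual nvgpu helper:

    #include <linux/device.h>
    #include <linux/io.h>
    #include <linux/ioport.h>

    /* Map a register aperture with the device-managed variant so the
     * mapping is released automatically on driver unregister; no
     * explicit iounmap() (and hence no double unmap) is needed on
     * removal. */
    static void __iomem *map_gpu_regs(struct device *dev,
                                      struct resource *res)
    {
            return devm_ioremap(dev, res->start, resource_size(res));
    }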
Bug 1987855
Change-Id: I0388daf90bea3eaf8752255059cfd3ceabf66e7d
Signed-off-by: Nitin Kumbhar <nkumbhar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1730539
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In some use cases the client will disable and preempt a TSG and then
re-enable it using the IOCTLs provided.
If only one context is being re-enabled and there is no other job
submission in parallel, the runlist fetcher will just sleep until the
next doorbell is received.
This causes the above-mentioned use cases to stall after re-enabling
the TSG until someone submits a new job and triggers a doorbell.
Fix this by explicitly ringing the doorbell from
gv11b_fifo_enable_tsg() after we enable all channels in the TSG, as
in the sketch below.
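A sketch of the shape of the fix, assuming the doorbell is exposed as
a fifo HAL op; the iteration helpers and field names follow nvgpu
conventions but are assumptions here:

    /* Enable every channel bound to the TSG, remember the last one,
     * then ring its doorbell so the runlist fetcher wakes up even
     * when no new job submission follows. */
    int gv11b_fifo_enable_tsg(struct tsg_gk20a *tsg)
    {
            struct gk20a *g = tsg->g;
            struct channel_gk20a *ch, *last_ch = NULL;

            nvgpu_rwsem_down_read(&tsg->ch_list_lock);
            nvgpu_list_for_each_entry(ch, &tsg->ch_list,
                                      channel_gk20a, ch_entry) {
                    g->ops.fifo.enable_channel(ch);
                    last_ch = ch;
            }
            nvgpu_rwsem_up_read(&tsg->ch_list_lock);

            if (last_ch != NULL)
                    g->ops.fifo.ring_channel_doorbell(last_ch);

            return 0;
    }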
Bug 2205192
Change-Id: I08e70e3d0f7e4dc6471e63809e246430cc4200c1
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1772378
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
gv11b_mm_fault_info_mem_destroy() and gv11b_mm_mmu_hw_fault_buf_deinit()
serve a similar purpose of disabling hub interrupts and deinitializing
memory related to MMU fault handling.
Of the two, the latter was called from BAR2 deinitialization, and the
former from nvgpu_remove_mm_support().
Combine the functions and keep only the call from
nvgpu_remove_mm_support(). This way BAR2 deinitialization can be
combined with the gp10b version (see the sketch below).
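A sketch of the combined teardown; the HAL name and the helpers inside
are hypothetical stand-ins for the real gv11b code:

    /* One function now disables hub interrupts and frees the MMU
     * fault handling memory; previously this was split between
     * gv11b_mm_mmu_hw_fault_buf_deinit() and
     * gv11b_mm_fault_info_mem_destroy(). */
    void gv11b_mm_fault_info_mem_destroy(struct gk20a *g)
    {
            nvgpu_mutex_acquire(&g->mm.hub_isr_mutex);
            g->ops.fb.disable_hub_intr(g);     /* assumed HAL name */
            free_hw_fault_buffers(g);          /* hypothetical */
            free_fault_info_mem(g);            /* hypothetical */
            nvgpu_mutex_release(&g->mm.hub_isr_mutex);
    }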
JIRA NVGPU-714
Change-Id: I4050865eaba404b049c621ac2ce54c963e1aea44
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1769627
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Rename os/linux/vidmem.c to os/linux/dmabuf_vidmem.c. The code mainly
deals with interfacing with the Linux dma-buf framework, and its
responsibilities were getting confused with common/mm/vidmem.c.
Also move the header include/nvgpu/linux/vidmem.h to
os/linux/dmabuf_vidmem.h. It does not expose any interface to code
outside Linux.
Change-Id: I2cb1057a8934d5cb5c5860023aa12f8f048a6684
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1768261
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
vidmem.h had a forward declaration for the Linux-specific struct
work_struct. Removed that.
vidmem.h also #included nvgpu_mem.h even though there was no use for
it. As a follow-up, css_gr_gk20a.h referred to nvgpu_mem but did not
#include it, so that include was added.
Change-Id: Ifea88adae86ed95302465641821fbb107d7cc233
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1768260
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In nvgpu_dma_alloc_flags_vid_at(), we check the vidmem bytes yet to
be cleared by reading g->mm.vidmem.bytes_pending.atomic_var.
If there is something left to clear we return EAGAIN, otherwise we
return ENOMEM.
But we store this value in an "int before_pending", which truncates
to zero for sizes like 4 GB, so we end up returning ENOMEM instead
of EAGAIN.
Fix this by declaring the before_pending variable as u64, as in the
sketch below.
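A sketch of the fixed check; the failure flag is a hypothetical
stand-in for the allocator's error path, and the 64-bit atomic read is
assumed to match the nvgpu accessor for bytes_pending:

    /* Before the fix, "int before_pending" truncated a 4 GB
     * (1ULL << 32) pending count to 0, so allocation failures
     * returned ENOMEM even while clears were still pending. A u64
     * keeps the full count, so EAGAIN is returned and the caller
     * can retry. */
    u64 before_pending = nvgpu_atomic64_read(&g->mm.vidmem.bytes_pending);

    if (alloc_failed)  /* hypothetical */
            return before_pending != 0ULL ? -EAGAIN : -ENOMEM;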
Bug 200427361
Change-Id: I6ffe977e3663a5135fa17699ecafe78ac90d9314
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1770384
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add a new HAL gops.fb.mmu_invalidate_replay() to invalidate replayable
MMU faults.
Set the existing API gv11b_fb_mmu_invalidate_replay() as this HAL on
all Volta chips, as sketched below.
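A sketch of the HAL wiring; the struct layout is abridged and the
signature is assumed from the gv11b implementation:

    /* New fb HAL op for invalidating replayable MMU faults. */
    struct gpu_ops {
            struct {
                    int (*mmu_invalidate_replay)(struct gk20a *g,
                                    u32 invalidate_replay_val);
            } fb;
    };

    /* On all Volta chips the HAL points at the existing API: */
    gops->fb.mmu_invalidate_replay = gv11b_fb_mmu_invalidate_replay;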
Bug 2228914
Jira NVGPU-838
Jira NVGPUT-73
Change-Id: I394901857d41499f3ea44023393fe271fb664260
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1767970
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gk20a_split_ppc_broadcast_addr() we convert a PPC broadcast
address to its corresponding unicast address list.
But we use gr.pe_count_per_gpc instead of the actual number of PPCs,
which leads to generating an incorrect list of addresses.
Fix this by using gr.gpc_ppc_count[gpc_num], which gives the correct
PPC count (see the sketch below).
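A sketch of the corrected loop; the address helpers are assumed from
the gk20a pri-address utilities and the index handling is simplified:

    /* Iterate over the PPCs that actually exist in this GPC
     * (gr.gpc_ppc_count[gpc_num]) rather than gr.pe_count_per_gpc. */
    for (ppc_num = 0; ppc_num < g->gr.gpc_ppc_count[gpc_num]; ppc_num++)
            priv_addr_table[t++] = pri_ppc_addr(g,
                            pri_ppccs_addr_mask(addr),
                            gpc_num, ppc_num);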
Jira NVGPUT-117
Change-Id: If7e7c19244b90cb3c405dcba4ae7a86c782972f7
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1767838
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
- Added the prefix gp106_ to sec2_wait_for_halt()
& sec2_clear_halt_interrupt_status() for the gp106
SEC2 HAL.
- Changed gp106_sec2_wait_for_halt() to read the
SEC2 falcon mailbox using the common falcon
mailbox access functions (see the sketch below).
- Added a define for the falcon mailbox.
- These changes are done to reuse the gp106 HALs
for GPU_NEXT.
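A sketch of the mailbox read through the common falcon interface; the
define name and the falcon handle field are assumptions:

    #define FALCON_MAILBOX_0 0U  /* assumed define added by this change */

    /* Read the SEC2 halt status via the common falcon accessor
     * instead of a gp106-specific register read. */
    data = nvgpu_flcn_mailbox_read(&g->sec2_flcn, FALCON_MAILBOX_0);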
Change-Id: Id32a7636d775b482684212ed4ef5d01c8ea65335
Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1755618
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
In gr_gk20a_find_priv_offset_in_buffer() we currently calculate the
offset of a register in the gpccs segment based on the register
address type.
Separate out the sequence that finds the offset in the gpccs segment
and move it to a new API, gr_gk20a_get_offset_in_gpccs_segment().
Introduce a new HAL gops.gr.get_offset_in_gpccs_segment() and set the
above API as this HAL.
Call the HAL from gr_gk20a_find_priv_offset_in_buffer() instead of
calling the API directly, as sketched below.
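A sketch of the dispatch through the HAL; the parameter list is
assumed for illustration:

    /* gr_gk20a_find_priv_offset_in_buffer() now goes through the HAL
     * so that chips can override the gpccs-segment offset math. */
    err = g->ops.gr.get_offset_in_gpccs_segment(g, addr_type,
                    num_tpcs, num_ppcs, reg_list_ppc_count,
                    &offset_in_segment);
    if (err)
            return -EINVAL;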
Jira NVGPUT-118
Change-Id: I0df798456cf63e3c3a43131f3c4ca7990b89ede0
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1761669
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_mem_rd*() functions were implemented per OS. They also used
nvgpu_pramin_access_batched() and implemented a large portion of the
logic for using PRAMIN in OS-specific code.
Make the implementation of these functions generic. Move all PRAMIN
logic into the PRAMIN code and simplify the interface it provides.
Change-Id: I1acb9e8d7d424325dc73314d5738cb2c9ebf7692
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1753708
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
When requesting a gpc2clk below the lowest V/f point, the clock
arbiter did not properly adjust the target value to the nearest V/f
point. This could lead to a lower than expected effective frequency.
Fixed the logic to adjust to the nearest V/f point (see the sketch
below).
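A sketch of the clamping behavior; the field names are illustrative,
not the arbiter's actual variables:

    /* Round requests below the lowest V/f point up to it, so the
     * effective frequency never lands below the V/f curve. */
    if (gpc2clk_target < table->min_vf_point_mhz)  /* hypothetical */
            gpc2clk_target = table->min_vf_point_mhz;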
Bug 200412996
Change-Id: I36c24b4c081931e2ac54da14d49e46fcb14503e3
(cherry picked from commit 7ed1f8fb39f76208922daa91d00905cdb96b2304)
Signed-off-by: Thomas Fleury <tfleury@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1763641
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
The logic used for selecting frequencies from the achievable
frequencies of the GPU clock picks one out of every set of 8.
This reduces the number of available frequencies when the number of
achievable frequencies is small. Change the implementation to choose
all frequencies when the achievable frequency list is small (see the
sketch below).
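A sketch of the selection change; the threshold name and value are
assumptions, not the actual cutoff chosen by the change:

    #define FREQ_SELECT_CUTOFF 32U  /* hypothetical cutoff */

    /* Expose every achievable frequency when the list is small;
     * otherwise keep the old behavior of picking one in every 8. */
    step = (num_achievable <= FREQ_SELECT_CUTOFF) ? 1U : 8U;
    for (i = 0; i < num_achievable; i += step)
            freq_table[n++] = achievable_freqs[i];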
Bug 200381453
Change-Id: Ib280d7ccf9b75f88f6c7c6d2666f05e92a0343bd
Signed-off-by: Vishruth <vishruthj@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1753289
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
nvgpu_mem_get_addr() returns a virtual or physical address depending
on the platform.
But we need to use physical addresses explicitly to configure PCI
simulation support, since the simulator expects physical addresses
only.
Hence, use nvgpu_mem_get_phys_addr() explicitly to configure the
msg/send/recv buffers needed for PCI simulation support, as sketched
below.
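A sketch of the buffer programming; the sim buffer field and the
register accessor names are assumptions for illustration:

    /* The simulator consumes physical addresses, so bypass the
     * platform-dependent nvgpu_mem_get_addr() and query the physical
     * address of each simulation buffer directly. */
    u64 phys = nvgpu_mem_get_phys_addr(g, &g->sim->msg_bfr);

    sim_writel(g->sim, sim_msg_addr_lo_r(), u64_lo32(phys));  /* assumed regs */
    sim_writel(g->sim, sim_msg_addr_hi_r(), u64_hi32(phys));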
Jira NVGPUT-41
Change-Id: I6870feef35fe81d43189fa048dc2f7052926bcc4
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1756843
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Now that GR buffers always have a kernel mapping, remove the unnecessary
calls to nvgpu_mem_begin() and nvgpu_mem_end() on these buffers:
- global ctx buffer mem in gr
- gr ctx mem in a tsg
- patch ctx mem in a gr ctx
- pm ctx mem in a gr ctx
- ctx_header mem in a channel (subctx header)
Change-Id: Id2a8ad108aef8db8b16dce5bae8003bbcd3b23e4
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1760599
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Alex Waterman <alexw@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Add support for `rfr' to send emails directly to the nvgpu-core
mailing list. Copying the email contents and then manually updating
it is annoying. This streamlines that process greatly!
Since most people write something extra beyond what the `rfr' script
generates, you can edit the message before it is sent. If you don't
specify a message explicitly with '--msg', an editor based on your
$EDITOR setting is opened with the contents of the email. Whatever
you edit in there will be sent verbatim. If for some reason you want
nothing added, there's the '--no-msg' option.
Change-Id: Icb0ccba936357cf5e3a93001a5c22aeacf906e11
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1595544
GVS: Gerrit_Virtual_Submit
Reviewed-by: Konsta Holtta <kholtta@nvidia.com>
Tested-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
To finish the OS unification of the submit path, move the
gk20a_submit_channel_gpfifo* functions to a file that is accessible
outside the Linux code as well.
Also change the prefix of the submit functions from gk20a_ to nvgpu_.
Jira NVGPU-705
Change-Id: I8ca355d1eb69771fb016c7a21fc7f102ca7967d7
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1760421
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Remove the GR engine reset during ELPG entry.
The engine reset causes the clock gating logic to get reset, so clock
gating gets disabled during the ELPG entry sequence.
This leads to higher power numbers observed at light graphics loads.
Removing the GR reset during ELPG entry helped save power.
Bug 2180198
Change-Id: I957951eb93f9d044f4d9a908f2b56a4903dfbfad
Signed-off-by: Deepak Goyal <dgoyal@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/1757695
Reviewed-by: svc-mobile-coverity <svc-mobile-coverity@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Alex Waterman <alexw@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Vijayakumar Subbu <vsubbu@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>