Existing implementation uses a single page_pool
for all Rx DMA channels. As by default all irqs
are on CPU 0, this does not cause any issue.
Routing an irq to some other CPU causes race
conditon for page allocation from page_pool which
leads to random memory corruption.
[ 158.416637] Call trace:
[ 158.416644] arm_lpae_map_pages+0xb4/0x1e0
[ 158.416649] arm_smmu_map_pages+0x8c/0x160
[ 158.416661] __iommu_map+0xf8/0x2b0
[ 158.416677] iommu_map_atomic+0x58/0xb0
[ 158.416683] __iommu_dma_map+0xac/0x150
[ 158.416687] iommu_dma_map_page+0xf4/0x220
[ 158.416690] dma_map_page_attrs+0x1e8/0x260
[ 158.416727] page_pool_dma_map+0x48/0xd0
[ 158.416750] __page_pool_alloc_pages_slow+0xc4/0x390
[ 158.416757] page_pool_alloc_pages+0x64/0x90
[ 158.416762] ether_padctrl_mii_rx_pins+0x164/0x9d0 [nvethernet]
[ 158.416807] ether_padctrl_mii_rx_pins+0x478/0x9d0 [nvethernet]
[ 158.416822] osi_process_rx_completions+0x284/0x4d0 [nvethernet]
[ 158.416832] 0xffffa26218b8f71c
[ 158.416855] __napi_poll+0x48/0x230
[ 158.416871] net_rx_action+0xf4/0x290
[ 158.416875] __do_softirq+0x130/0x3e8
[ 158.416889] __irq_exit_rcu+0xe8/0x110
This change creates a page_pool per Rx DMA channel.
This ensures there is no race conditon for page
alloc/dealloc from each DMA napi context.
Bug 4541158
Change-Id: I12668ee7d824fd30d54a874bbbdf190d02943478
Signed-off-by: Aniruddha Paul <anpaul@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3093534
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
If the userspace service 'irqbalance' is installed then the nvethernet
driver crashes when there is network activity. To avoid this crash set
the IRQF_NOBALANCING flash for the VM interrupts. No performance
degradation is observed when running iperf3 with a 1Gbps link.
Long-term the nvethernet driver still needs to be fixed to allow IRQ
balancing.
Bug 4293378
Bug 4541158
Change-Id: I0aa4ee28e36c7d273f14ff043544e72d3e988bd3
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3087525
Reviewed-by: Brad Griffis <bgriffis@nvidia.com>
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Add tests to conftest for detecting if the mii_bus structure has the
read_c45 and write_c45 function pointers and use the definitions
generated by conftest in the nvethernet driver.
This fixes support for nvethernet in 3rd party Linux kernels that have
backported the mii_bus structure changes to their kernel that have a
kernel version prior to Linux v6.3.
Bug 4014315
Change-Id: I5ae98fc5077337286921da6e9347df9781565a70
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3018935
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Issue:
The L2 Orin bridge use case is encountered
an issue on AV+L NDAS builds for VLAN ping
from HOST1 to HOST2 over L2 Orin bridge.
As HOST1/HOST2 VLAN packets are dropped
by EQOS/MGBE MAC due to VLAN VID feature enable.
Fix:
To support the L2 Orin bridge VLAN
HOST1 to HOST2 data usecase, disable
the VLAN VID filters in the nvethernet driver.
Bug 4230250
Change-Id: I02514fcc15b7764fe93118a9f90c16716368c73c
Signed-off-by: Mohan Thadikamalla <mohant@nvidia.com>
(cherry picked from commit 86cb37f933846a012ba6d8cdac275a2edda0aa42)
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/3003075
Reviewed-by: Bhadram Varka <vbhadram@nvidia.com>
Reviewed-by: Srinivas Ramachandran <srinivasra@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
In Linux v6.4, commit
3948b05950fd ("net: introduce a config option to tweak MAX_SKB_FRAGS")
added a kernel config option for MAX_SKB_FRAGS and change the type of
the definition to integer. This causes the nvethernet driver build to
fail and the following errors are seen ...
drivers/net/ethernet/nvidia/nvethernet/ether_linux.c:5949:33: error:
format ‘%ld’ expects argument of type ‘long int’, but argument 4
has type ‘nveu32_t’ {aka ‘unsigned int’} [-Werror=format=]
"invalid tx-frames, must be inrange %d to %ld",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/nvidia/nvethernet/ethtool.c:950:28: error: format
‘%ld’ expects argument of type ‘long int’, but argument 4 has
type ‘nveu32_t’ {aka ‘unsigned int’} [-Werror=format=]
"invalid tx-frames, must be in the range of"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by changing the type in the respective prints to integer. Note
that his is also to make this change for kernel prior to Linux v6.4.
Bug 4183168
Bug 4221847
Change-Id: Ibd9160ad08e65e0b85cae1cf4903075c31a1cd77
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2989002
Reviewed-by: Shanker Donthineni <sdonthineni@nvidia.com>
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Instead of relying on kernel version to determine if functions or
specific versions of functions are present in the kernel, add compile
time tests to the conftest.sh script to determine this at compile time
for the kernel being used. This is beneficial for working with 3rd party
Linux kernels that may have back-ported upstream changes into their
kernel and so the kernel version checks do not work.
Bug 4119327
Change-Id: I79e701940ca70ca4d66500c75b5992f9d92b54b0
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2985744
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
ether_get_tx_ts, pdata->tx_ts_ref_cnt is an atomic variable
that is used as a mutex in this function and should not return 0
if the function fails to acquire the mutex.
The workqueue is scheduled when the return value of the
judgment function is <0. If one CPU core execution softirq
is running in ether_get_tx_ts function and another CPU core
softirq also calls ether_get_tx_ts function to get hardware
timestamp, then the acquisition of mutex fails and return 0.
After returning 0, the workqueue is not scheduled.
The timestamp cannot be obtained in time. So ether_get_tx_ts
should returns -1 on failure to acquire the mutex.
Bug 4150416
Change-Id: Icffd409b349d8bb8dbf5a483124b3bd3d7ef6cc8
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2934350
Reviewed-by: Bitan Biswas <bbiswas@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Compiling the nvethernet driver without the ccflag CONFIG_TEGRA_NVPPS
defined generates the following warning ...
drivers/net/ethernet/nvidia/nvethernet/ether_linux.c:2936:5:
warning: "CONFIG_TEGRA_NVPPS" is not defined, evaluates to 0 [-Wundef]
2936 | #if CONFIG_TEGRA_NVPPS
| ^~~~~~~~~~~~~~~~~~
Fix this by using '#ifdef' to determine if CONFIG_TEGRA_NVPPS is
defined instead of '#if'.
Bug 4190030
Change-Id: Id1523de6f9ef9f4c72669584efb77ffd9ddf20f3
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2938375
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Issue:
MGBE0 ASID is getting programmed for all the MAC instances
which inturn causing the data transfer failures.Macsec is not enable
and hence instance_id is not getting updated to the corresponding mac
instance number and hence all mac instance_id's are initialized with 0.
Since ASID programming is based on mac instance_id, all MACs are
programmed with MGBE0 MAC instance id.
Fix:
Moved reading of instance_id from DT out of macsec_probe and hence the
instance_id gets updated irrespective of MAC sec enable/disable.
Bug 4124937
Change-Id: I500ed8d3db402995488260af99ad190217ff6dd2
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2925252
Reviewed-by: Narayan Reddy <narayanr@nvidia.com>
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Commit db1a63aed89c ("net: phy: Remove fallback to old C45 method")
removes a fallback C45 method in the MDIO bus driver. This breaks
nvethernet support for Linux v6.3 and the following errors are observed.
failed to register MDIO bus (nvethernet_mdio_bus)
net eth0: failed to register MDIO bus
Fix this by adding the necessary read/write_c45 callbacks in the
nvethernet driver. Note these callbacks are only supported for Linux
v6.3+ kernels.
Bug 4014315
Change-Id: Ia6ab753941db0515799657da8522f32996d0852a
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2924619
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-by: Narayan Reddy <narayanr@nvidia.com>
The nvethernet driver has a dependency on the NVPPS driver which in
turn has a dependency on the GTE driver. The GTE driver is being
transitioned to the new upstream HTE driver and while this transition is
on-going, allow the nvethernet driver to be distributed and built
without NVPPS and GTE.
Note that by default the dependency on the NVPPS driver is enabled by
setting CONFIG_TEGRA_NVPPS to 'y'. Only if the user sets
CONFIG_TEGRA_NVPPS to 'n' will it be disabled.
Bug 3918941
Bug 3961133
Change-Id: I019e5ebfddfcb158b0a684d2eacc404231577166
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2913273
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-by: Bhadram Varka <vbhadram@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Issue: skb_vlan_tag_get gives only the VLAN_ID for
the earlier kernel versions (K4.9) and hence added
a logic to take care of adding the VLAN priority
explicitly and hence the VLAN priority getting
modified in a newer kernel version where VLAN PRIO is
already included
Fix: remove the logic of explicit adding of VLAN
priority.
Bug 3788862
Bug 4088361
Change-Id: I0ac9ebfe21d3e696ac1b0f6ba540c010928f775f
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2806468
(cherry picked from commit 799b17fc35f3bc3ec75ea93452c03bd52ed28527)
Signed-off-by: Narayan Reddy <narayanr@nvidia.com>
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2895043
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Issue:
During eth0 down case, the link change gets called twice,
1. due to ndo close
2. due to interrupt generated by phy.
The ether_adjust_link callback also configures the EEE settings
and calls phy_init_eee (a phy framework API) for the same.
Here a race condition occur, when ether_close sets phydev to NULL
before the phy_init_eee gets called (from ether_conf_eee) which tries
to deference the phydev causing kernel panic.
[35779.541305] Call trace:
[35779.544018] phy_init_eee+0x24/0x290
[35779.547455] ether_conf_eee+0x110/0x1310 [nvethernet]
[35779.552525] ether_conf_eee+0x4e0/0x1310 [nvethernet]
[35779.557840] phy_link_change+0x40/0x90
[35779.561603] phy_state_machine+0x190/0x240
[35779.565643] process_one_work+0x1c4/0x4d0
[35779.569396] worker_thread+0x54/0x430
[35779.573072] kthread+0x148/0x180
[35779.576586] ret_from_fork+0x10/0x34
[35779.580069] Code: a90153f3 d5384113 a9025bf5 12001c36 (f941b801)
[35779.586195] ---[ end trace 2ea5b6492f36b0cb ]---
[35779.590737] Kernel panic - not syncing: Oops: Fatal exception
Fix:
Add a NULL check and return -ENODEVICE in EEE configuration
Bug 3877272
Bug 3920560
Bug 3865666
Bug 4088361
Change-Id: I2b26c6eeabb7a109b9d14b3bd2a035f5c3b3c6fa
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2854404
(cherry picked from commit 40606dc247eacf2a8987687777874575d9037058)
Signed-off-by: Sushil Kumar Singh <sushilkumars@nvidia.com>
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2855237
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Issue: If the supplicant is killed for some reason Data would flow
plain on that interface, this needs to be avoided
Fix: Update bypass LUT such that if the frames from the VF(on which
supplicant is launched) is received on MACSEC either authenticate the
same or drop. Along with this handles below items as well. All the VFs MACIDs
are obtained in OSI to update the bypass LUTs to decide on which VF frames
to be authenticated and which VF frames needs to be bypassed.
1. Remove osi_macsec_en API and have single API to init and deinit
2. Remove explicit command from supplicant to set control port and
set protected frames. Handle the same in osi_macsec_init
Bug 3984665
Change-Id: I8bc8aa95d1e21e99e992b471fb70ed58073163f7
Signed-off-by: Sanath Kumar Gampa <sgampa@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2878515
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Upstream commit 99d5fe9c7f3d ("net: mdio: Remove support for building C45
muxed addresses") removed the definition MII_ADDR_C45 breaking the build
for the nvethernet driver for Linux v6.3. Update the driver to use the
OSI_MII_ADDR_C45 instead.
Bug 4014315
Bug 4075345
Change-Id: I98f8d10ce1a093d458d293b286a4e5a544d48d04
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2889876
Reviewed-by: Rohit Khanna <rokhanna@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Issue: MACSEC is not enabled in DT for eqos_1 and mgbe2_1
interfaces which makes macsec_pdata pointer to NULL.
During suspend/resume macsec_pdata pointer accessed without
NULL check which resulted in crash.
Fix:
o Add NULL pointer checks for macsec_pdata in suspend and resume.
o Remove unnecessary checks for macsec in suspend and resume.
Bug 4060718
Change-Id: I00e042d979e115c088696a8964496ed0054360e5
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2888289
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Ethernet driver expect the VM IRQ configuration. The VM IRQ
is provided with the property "interrupt" which have multiple
VM irq numbers and their configurations via the vm irq
configuration node which is referred by property
"nvidia,vm-irq-config".
The vm irq configuration node have the configuration of
all VM IRQ and each configuration is separated with the
child node like below.
ethernet@6800000 {
nvidia,vm-irq-config = <&mgbe_vm_irq_config>;
}
mgbe_vm_irq_config: mgbe-vm-irq-config {
nvidia,num-vm-irqs = <5>;
vm_irq1 {
nvidia,num-vm-channels = <2>;
nvidia,vm-channels = <0 1>;
nvidia,vm-num = <0>;
nvidia,vm-irq-id = <0>;
};
vm_irq2 {
nvidia,num-vm-channels = <2>;
nvidia,vm-channels = <2 3>;
nvidia,vm-num = <1>;
nvidia,vm-irq-id = <1>;
};
vm_irq3 {
nvidia,num-vm-channels = <2>;
nvidia,vm-channels = <4 5>;
nvidia,vm-num = <2>;
nvidia,vm-irq-id = <2>;
};
vm_irq4 {
nvidia,num-vm-channels = <2>;
nvidia,vm-channels = <6 7>;
nvidia,vm-num = <3>;
nvidia,vm-irq-id = <3>;
};
vm_irq5 {
nvidia,num-vm-channels = <2>;
nvidia,vm-channels = <8 9>;
nvidia,vm-num = <4>;
nvidia,vm-irq-id = <4>;
};
};
};
The child 1(vm_irq1) is link with the VM IRQ 1 listed in
interrupt property, next child which is vm_irq2 is link
with the 2nd VM irq. So, the order of VM configuration
child must be in same order as VM IRQ provided in the
property interrupt.
If the irq configuration order i..e child order is not
matching with VM irq number then there is interrupt issue.
Now, if the VM IRQ config child nodes are provided in base
DT file in same order then the ethernet works fine.
But when applying via overlay, it does not work.
This is because of overlay technique. The overlay logic is
such that it iterates all child node in overlay and
apply the child as first child of target node.
Hence the child sequence provided in overlay fragment
is applied in reverse order in target node. This causes
the irq configuration order not matching with VM IRQs.
To address the overlay issue, read the sequence of
interrupt in property "interrupt" from the newly added property
"nvidia,vm-irq-id" of each child. This way child node
sequence does not matter, and sequence is identified
by property.
Bug 3956724
Change-Id: I5c08edcd8a7e8d3112618dd23050d8b5c10ddc59
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2855639
Reviewed-by: Bhadram Varka <vbhadram@nvidia.com>
Reviewed-by: Narayan Reddy <narayanr@nvidia.com>
Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
The nvethernet driver fails to build for Linux v5.14 and Linux v5.17+
kernels. In Linux v5.15 the return type of the get/set_coalesce
callbacks were updated. In Linux v5.17 the parameters for the
get/set_ringparam callbacks were updated. In Linux v6.1 the
weight parameter for the netif_napi_add() function was removed and the
function netif_napi_add_weight() was added for passing the weight
parameter. Add the necessary kernel version checks to ensure that the
drivers build with the different versions of the Linux kernel.
Bug 3793131
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Change-Id: I56738eddd27eb4ebf7db5a2d0d031a64379482e8
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nv-oot/+/2820712
Reviewed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Reviewed-by: Bibek Basu <bbasu@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>
Issue:
1) dev_queue_xmit acquired the per queue tx lock and called driver
transmit routine.
2) During the transmit - common interrupt asserted which tries to
acquire the same tx lock which resulted in lock recursion.
Fix: Move the lock acquire to tasklet context under common isr.
Bug 3773016
Change-Id: I7cfd49beb1238286d3bccd9e4b9ccc054c4f6d30
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Reviewed-on: https://git-master.nvidia.com/r/c/linux-nvidia/+/2770227
Reviewed-by: Narayan Reddy <narayanr@nvidia.com>
Reviewed-by: Bitan Biswas <bbiswas@nvidia.com>
Reviewed-by: svcacv <svcacv@nvidia.com>
GVS: Gerrit_Virtual_Submit <buildbot_gerritrpt@nvidia.com>