#How to get the 2nd 10GbE port of the Mezzanine card to work on a FAS2240?
Define "works"? If you tag one of them for clustering, it won’t show up for data.
Switchless cluster or switched?
Tried both, currently switched. One (e1b) is tagged cluster. The other (e1a) is intended for data use, but the interface is always down. Somehow only one of e1a or e1b is up at a time, no matter what.
I can tell from the switch (on FAS reboot) that both ports are active and the lights are also on, but once ONTAP finishes booting, one port is turned off
You’re attempting to configure it in a supported method as per page 7 of https://library.netapp.com/ecm/ecm_download_file/ECMP1139842 so it should work
e1a should be cluster and e1b data though
afaik my engineer has tried all sorts of variations by now and none of them worked. Even now the interface unexpectedly goes down and there is no way to wake up the interface
The intended configuration is "Single, non-redundant cluster connection", like here
This is a standalone deployment and the interfaces are not coming up either.
is there any way I could manually turn on the link?
even in diag shell (systemshell) with ifconfig the "network stack cannot be modified".
Here is an old post @opal heart with some reply from you as well.
https://community.netapp.com/t5/ONTAP-Hardware/Network-Port-Roles-for-FAS2240-With-SFP-Running-clustermode-9-1/td-p/135948
Is any portset required like andris mentioned it or is a "normal" cluster configuration ok?
We're in the process of Migrating a FAS2240 to run ontap 9.1 that has onboard 2 x SFP+ (e1a, e1b) and 4 x 1Gb (e0a - e0d), We Would Like to know if it is possible to run it with the interconnects partly on 1gb so we have redundant links but also still have some 10gb available for data can the belowp...
Again, after running cluster-setup i suspect that port e1a (now directly connected as shown in the guide) is DOWN. Thus the setup wizard shows e1b as cluster port:
Whichever port is used for the cluster should be in the cluster IPSpace, on both nodes. Is this a fresh deployment of ontap? I would suggest doing a special boot menu 4 on both nodes, with the cable in e1a (not e1b like that chump Alex says on the community 🤣) and start fresh. It does not look like you have cluster LIFs currently which is.. not what I’d expect. Without them the RDBs, which ontap needs to properly manage networking, will not run properly
What cable are you using for the crossover?
🙂 very fresh deployment. This is as of right now
It's a Twinax SFP+ DAC
Which vendor?
I think Arista
That is not supported. I say that not to be a jerk, but DAC compatibility is limited
Use Cisco or Intel
Even Intel isn’t supported, but I know it works with those mezz cards
I could exchange with what is available here. I tried with a Fiber as well. The SFPs work. I already had them up and running once and was able to ping the other cluster. Just not both ports e1a + e1b at the same time.
This was when e1a once worked
Hmm
I've run out of ideas for troubleshooting why both ports don't work at the same time (considering that they work independently, but not together). Do you think that could be an SFP issue?
Does “run local netdiag -v” work?
Netdiag used to be a 7mode command
I don’t know if it’s still in cdot
Haven't tried, let me try. I would love to manually turn on the interfaces and see if the link comes up or if it's a HW issue of some sort, because when booting the system (BIOS stage and loader) both ports are turned on
is there a command to mess with the network stack?
I think the mezz card is a single asic so it’s not out of the question it’s an SFP issue, but I’ll agree this is a head scratcher
I don’t remember off the top of my head
I just finished setup of the new cluster:
FAS2240::*> run local netdiag -v
netdiag not found. Type '?' for a list of commands
FAS2240::*> run -node local netdiag -v
netdiag not found. Type '?' for a list of commands
FAS2240::*> node run -node local netdiag -v
netdiag not found. Type '?' for a list of commands
Blast
Intel P/N: XDACBL2M SFP+ Copper Twinax cables are suitable for very short distances and offer a very cost-effective way to connect in racks and over adjacent racks. SFP+ cable provides a cost-effective SFP+ solution. SFP+ directly connected twinax cable assemblies support 10 Gigabit Ethernet,...
I think this is the Twinax
But there are various vendors around
Usually the port turns on when I reboot both systems at the same time
At this point I’d setup the cluster with two 1G links for cluster interconnect then try setting up the 10G for data
Or have you done that?
I have done that and that works as the 4x1G interfaces work. However, not what the customer wants. One port for data usually works. Still a bummer
Is there a way I could turn the ports on via diag commands or systemshell?
Btw, the NVRAM bat is complaining as well. Could that cause any issues related to the 10GbE card?
Nvram bat shouldn’t cause an issue no. But it also should be working by now..
I don’t know of any commands to do that. I believe the up/down detection is pretty low level. That it looks like they work at bios/startup could just be that the card is not actually initialised
I’ve seen enough freaky DAC issues that I wouldn’t spend more time until there are Cisco cables being used
FAS2240-01% ifconfig e1a
e1a: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
uuid: 6cb224ea-a0e9-11ed-80c8-00a0981c09f8
options=6c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:a0:98:xx:xx:xx
media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
status: no carrier
FAS2240-01% ifconfig e1b
e1b: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
uuid: 6cb277ad-a0e9-11ed-80c8-00a0981c09f8
options=6c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
ether 00:a0:98:xx:xx:yy
media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
status: no carrier
FAS2240-01% ifconfig e1b up
ifconfig: Modifying networking stack is not allowed at this point!
FAS2240-01% ifconfig e1a up
ifconfig: Modifying networking stack is not allowed at this point!
FAS2240-01%
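A minimal sketch for reading the ifconfig output above programmatically, for example when comparing several captures taken across reboots. It only assumes the FreeBSD-style layout shown in the paste; the function name `parse_ifconfig` and the trimmed sample text are illustrative, not part of ONTAP. It makes explicit what the paste already shows: both ports carry the UP flag (administratively up) but report "no carrier".

```python
import re


def parse_ifconfig(text):
    """Return {interface: {"flags": set of flag names, "status": str or None}}."""
    ports = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Header line, e.g. "e1a: flags=8803<UP,BROADCAST,...> metric 0 mtu 1500"
        header = re.match(r"^(\w+): flags=[0-9a-fA-F]+<([^>]*)>", line)
        if header:
            current = header.group(1)
            ports[current] = {
                "flags": set(header.group(2).split(",")),
                "status": None,
            }
        elif current and line.startswith("status:"):
            ports[current]["status"] = line.split(":", 1)[1].strip()
    return ports


# Trimmed copy of the output pasted in the thread (hypothetical capture file).
SAMPLE = """\
e1a: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
status: no carrier
e1b: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
status: no carrier
"""

if __name__ == "__main__":
    for name, info in parse_ifconfig(SAMPLE).items():
        admin_up = "UP" in info["flags"]
        print(f"{name}: admin_up={admin_up} status={info['status']}")
```

Since the interfaces are already flagged UP, an `ifconfig e1a up` would not be expected to change much even if the shell allowed it; the missing piece is carrier.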
(And make sure they aren’t V01 cables)
OK. Thx! Let me try that then
What is really freaky is that the e1b (data) port connected to the switch can be seen as running (RS=Running) and later shut (S=Shut) after boot. Some part of ONTAP shuts it down when the system gets close to the 'wrote key file "/tmp/rndc.key"' message:
[201] module_register_init: MOD_LOAD (NVMeOF, 0xffffffff83e12750, 0) error 45
[271] e1a: Could not setup receive structures
[271] e1b: ixgbe_check_mac_link_generic: **LINK UP** link_up = 0x00000001
[271] e1b: ixgbe_check_mac_link_generic: **LINK DOWN** link_up = 0x00000000
[271] e1b: Could not setup receive structures
FAS2240-01%
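A rough sketch for pulling the link transitions and receive-setup failures out of a saved boot log, so captures from different cables or reboots can be compared. The regular expressions are written against the exact message text pasted above; the function names and the sample constant are assumptions for illustration only.

```python
import re

# Patterns matched against the console lines exactly as pasted in the thread.
LINK_RE = re.compile(
    r"^\[(\d+)\]\s+(\w+): ixgbe_check_mac_link_generic: \*\*LINK (UP|DOWN)\*\*",
    re.MULTILINE,
)
RX_FAIL_RE = re.compile(
    r"^\[(\d+)\]\s+(\w+): Could not setup receive structures",
    re.MULTILINE,
)


def link_transitions(log_text):
    """Return (port, state) tuples in the order they appear in the log."""
    return [(m.group(2), m.group(3)) for m in LINK_RE.finditer(log_text)]


def rx_setup_failures(log_text):
    """Return the sorted list of ports that logged a receive-setup failure."""
    return sorted({m.group(2) for m in RX_FAIL_RE.finditer(log_text)})


SAMPLE_LOG = """\
[201] module_register_init: MOD_LOAD (NVMeOF, 0xffffffff83e12750, 0) error 45
[271] e1a: Could not setup receive structures
[271] e1b: ixgbe_check_mac_link_generic: **LINK UP** link_up = 0x00000001
[271] e1b: ixgbe_check_mac_link_generic: **LINK DOWN** link_up = 0x00000000
[271] e1b: Could not setup receive structures
"""

if __name__ == "__main__":
    print("transitions:", link_transitions(SAMPLE_LOG))
    print("rx setup failures:", rx_setup_failures(SAMPLE_LOG))
```

In this sample, e1b goes up and then down again, and both ports log the receive-setup failure, which fits the observation that the link is alive early in boot and is dropped later.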
You think this one could work?
https://www.amazon.de/Cisco-SFP-H10GB-CU1M-Copper-Networking-Cable/dp/B001L5829E?ref_=ast_sto_dp
The Cisco 10G direct-attach Twinax SFP+ cable is an efficient and cost-effective solution for connectivity within and between adjacent racks. This passive Twinax copper cable has SFP+ connectors at both ends and offers a variety of 10 Gigabit Ethernet ...
That’s from the ixgbe driver in FreeBSD
Nvmeof? What version of ontap is this?
Ok. Thanks Alex. Ordered. I took 3. 2x Data to a switch and 1x Direct
9.8P14
On a 2240?
yes. seems so. Not good?
No.. FAS2240 supports 9.1 at maximum
Whatever you or the customer have done to get 9.8 on there.. 😮
The fact it even gets this far is.. surprising
Was already installed when we got there. Cool 🙂 Happy to report up and running but just without the 10GbE. Maybe with the SFP cable then. However, performance on a single node was awful. ~150 MB/s with all CPUs at 100%, from what I was told, on the 10GbE.
Well, that may be why we don't support it on that platform 😉
interestingly though we do support it on the FAS255x which is.. not substantially different in CPU, only in RAM
but there's all sorts of things that get turned on/off according to platform, so.. if it's running 9.8 it may not turn off some stuff on a FAS2240 because it's not expecting it to run on there
ahjaja. I don't even know if the setup will work or what the h%$! has been going on before on that platform. But from my past with NetApp I couldn't imagine you would sell a system supporting 10GbE that only does ~200 MB/s, no matter if it is 10 years old
I was told AES-NI is not available and thus no encryption
yeah
that's blocked by the CPU, but there's other things like how often tasks run, how much priority they have, etc, which are platform dependent
I don't know how WAFL evolved and if it's performing, but I will try to find out why 9.8 is on there and if 9.1 could perform better
For the sake of knowing, I will wait for the cables to get delivered first
both the FAS255x and FAS2240 are C3528 Processors but the FAS255x has 3x the RAM
(and no, RAM upgrades don't work)
we had a few customers try it back years ago, so at boot it checks if the right DIMMs are in place for the platform
🤣 glad you mentioned that. Saves me further questions. And I guess the CPU cant be exchanged either 😆
no, it's a BGA CPU I think 😉
challenge accepted
actually looks like it isn't.. hah
joke, I would be glad if I only get the interfaces up and running
nod
and then if the performance can saturate the 10GbE interface
I.. would not be expecting that 🙂
i.e. 9500 Mbps was measured between 10G. That would give a theoretical 1,187.5 MB/s, minus overhead.
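The arithmetic behind that figure, as a small sketch. It assumes decimal units (1 Mbps = 10^6 bits/s, 1 MB = 10^6 bytes) and ignores protocol overhead; the helper name is made up for illustration. The 150 MB/s value is the single-node throughput reported earlier in the thread.

```python
def mbps_to_mbytes_per_s(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


MEASURED_LINK_MBPS = 9500   # measured 10G throughput quoted in the thread
OBSERVED_MBYTES_S = 150     # single-node throughput reported in the thread

ceiling = mbps_to_mbytes_per_s(MEASURED_LINK_MBPS)
utilisation = OBSERVED_MBYTES_S / ceiling

if __name__ == "__main__":
    print(f"link ceiling: {ceiling} MB/s")
    print(f"observed share of link: {utilisation:.1%}")
```

So the reported 150 MB/s is roughly an eighth of what the link itself could carry, which points at the controller rather than the network as the limit.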
yeah, I think more than about 400MB/sec is probably unlikely on that platform
There is an additional disk shelf with 21x 6TB performance disks SAS attached.
it's been a while since I've seen people performance test it
I wouldn't do it, but if it tops out at 150 MB/s the customer won't be happy
well, with respect to your customer, if they want performance, starting with a platform released 10 years ago isn't where we suggest they look for solving problems 🙂
nod they just bought it out of a bankruptcy and were told it's performing well
However, what do you think they should expect (in case we get the 10GbE card running)?
internal drives are .. 600GB? or 450?
afaik 450 and the ds4243 with 21x 6TB
I'd say 250-300MByte/sec out of each controller is not unlikely
for sequential reads
of large files
random reads, small files and you'll get metadata read penalties etc
ok, starting the day here in Perth Australia. Let me know how the Cisco twinax's go
Will do! thx a lot for the help and have a good one. Cheers from Germany
One final thought. Could this be related to a power issue maybe? I mean everything is up and running but the power LEDs are orange?
Maybe the system goes after boot into some power saving state that cuts the energy for the SFPs
It’s entirely possible, I don’t know for sure though. Until it’s running 9.1 though, I don’t recommend further troubleshooting