For the past little while, I’ve been quite involved in hardware automation. Whether it’s spinning up cloud environments (with OpenStack) or working with services on bare-metal machines outside of a cloud-like environment, the common requirements are:
- Power
- Internet (sometimes optional)
- Networking switches
- Servers (for compute, storage, or both)
- Cables (sometimes numerous)
This blog post will focus on some of my recent experiences with Cumulus Linux, as I’ve had to deal with it as part of a deployment for an OpenStack cloud.
Note
I want to state very clearly that I am by no means a networking expert!
Cumulus Linux
I’ve got to tell you, there’s something really cool about connecting to a console port of a switch and seeing a GRUB bootloader operating system selection menu appear during the early phases of booting.
Background

Cumulus Linux, now referred to as NVIDIA Cumulus Linux/NVIDIA Linux following NVIDIA’s acquisition of Cumulus Networks, is a Debian-based operating system for bare-metal switching equipment. That means you essentially get a fully fledged Linux distribution with certain tweaks/changes to make it more suitable as a network switch OS. As a result, it offers both a more traditional switch CLI tool and a full bash shell.
The Working Environment
The environment that I’m working with in this blog post is as follows:
Brand | Model | Description | Purpose |
---|---|---|---|
Cisco | ??? | Unknown | External VLAN trunk |
Edge-core | AS4610-54T | 1GbE RJ45 w/ 4x SFP+ | Management (BMC) |
NVIDIA | SN3700C | 100GbE QSFP28 | LAG member 1 |
NVIDIA | SN3700C | 100GbE QSFP28 | LAG member 2 + Provisioning |
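The gist of the networking setup is that the external Cisco switch connects to the Edge-core, which in turn connects to both SN3700C switches, and the servers hang off the SN3700Cs.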
Port Configuration
Some switches come with RJ45, some with SFP or SFP+ or SFP28 or QSFP or… you get the idea.
Switches with ports that can handle more than 10Gbps, typically SFP28 and faster, can change the rate at which those ports operate. Ports can also be configured as splitter (breakout) ports (e.g. 100GbE -> 4x 25GbE). This functionality is provided through the /etc/cumulus/ports.conf file. Here’s an example from my second SN3700C, creatively dubbed SN3700C-1, running Cumulus 4.3.0:
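A minimal sketch of the relevant entries, assuming the usual `<port>=<mode>` syntax of ports.conf. It only shows the ports discussed in this post rather than the full file, and any port not listed keeps its default speed:

```text
# /etc/cumulus/ports.conf (sketch, modified entries only)

# Ports 1-11: 100GbE broken out into 4x 25GbE
1=4x25G
2=4x25G
3=4x25G
4=4x25G
5=4x25G
6=4x25G
7=4x25G
8=4x25G
9=4x25G
10=4x25G
11=4x25G

# Ports 15-22: 100GbE broken out into 4x 10GbE
15=4x10G
16=4x10G
17=4x10G
18=4x10G
19=4x10G
20=4x10G
21=4x10G
22=4x10G
```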
This configuration splits each of the first 11 ports (1-11) of the SN3700C from 100GbE to 4x 25GbE, for use with splitter cables. These ports are intended for high-performance networking operations, such as an OpenStack API and tenant data plane.
The second set of modified ports, 15-22, are split from 100GbE to 4x 10GbE SFP+ ports, to be used for slower network operations such as one-shot node provisioning via PXE.
Note
In the working case here, the split out to 4x 10GbE also facilitates the connection between the Edge-core and the SN3700Cs.
Interface Configuration (Bridges and VLANs)
Context
Internet access needed to be provided to the servers. To do this, we needed to trunk a VLAN (2) from the current external Cisco switch to the Edge-core and then trunk it further to the two SN3700C switches.
Unfortunately, we couldn’t directly connect the Cisco switch to the SN3700C switches because they use different port form factors. The Cisco switch uses RJ45, which is not compatible with the QSFP28 ports on the SN3700C switches.
It’s important to note that Cumulus Linux denotes interfaces as swp<n>, with n being the number that corresponds to the port number.
Configuration
As I mentioned, Cumulus comes with two main interaction methods:
- The traditional-like switch CLI tool called “Network Command Line Utility” or nclu.
- Editing /etc/network/interfaces. Yes. Debian or old-Ubuntu users will feel very at home with this.
Because I’m lazy and more comfortable with interfaces, we’ll be going with option 2.
Let’s look at a default interfaces file:
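As a rough sketch, an unconfigured file typically contains little more than the loopback and the out-of-band management port. The exact contents vary by release, so treat this as an approximation rather than a copy from my switch. Depending on the version, eth0 may also be placed in a management VRF.

```text
# /etc/network/interfaces (typical unconfigured state, sketch)

# Pull in any extra stanzas dropped into interfaces.d
source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback

# The out-of-band management port
auto eth0
iface eth0 inet dhcp
```

Note that none of the front-panel swp ports appear here yet.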
Edge-core (entrypoint)
The Edge-core is responsible for carrying the trunked traffic from the Cisco to the SN3700Cs and vice versa. By default, none of the interfaces needed for the VLAN are present in the file. Let’s begin by creating a virtual bridge interface that can be made aware of the VLAN.
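A minimal sketch of such a bridge stanza follows. The port facing the Cisco switch is shown as swp1, which is a hypothetical port number used for illustration only, while swp51 and swp52 are the SFP+ ports facing the SN3700Cs. In Cumulus, dropping untagged frames is a per-port setting (bridge-allow-untagged no), so it belongs on the member ports rather than in the bridge stanza itself.

```text
# VLAN-aware bridge carrying the internet VLAN (sketch)
auto internet
iface internet
    bridge-vlan-aware yes
    # Only VLAN ID 2 is allowed on this bridge
    bridge-vids 2
    # swp1 is a hypothetical Cisco-facing port; swp51/swp52 face the SN3700Cs
    bridge-ports swp1 swp51 swp52
```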
In the above config, we’ve prepared a bridge named internet, which allows tagged traffic for VLAN ID 2 and does not allow any untagged traffic to flow.
Regarding the bridge-vlan-aware parameter, suffice it to say that this mode allows us to reduce configuration by having the bridge define the VLANs. It also scales to a larger number of connected devices, because only a single instance of the spanning-tree protocol runs and MAC address handling is improved. The NVIDIA documentation on VLAN-aware bridge mode says it best, though.
Now we need to add the port configurations.
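A sketch of what the per-port stanzas could look like, under the same assumptions as before: swp1 is a hypothetical number for the Cisco-facing port, and swp51/swp52 face the SN3700Cs. Each port is limited to VLAN 2 and told to drop untagged frames.

```text
# Uplink from the Cisco switch (swp1 is a hypothetical port number)
auto swp1
iface swp1
    bridge-vids 2
    bridge-allow-untagged no

# Downlinks to the two SN3700Cs
auto swp51
iface swp51
    bridge-vids 2
    bridge-allow-untagged no

auto swp52
iface swp52
    bridge-vids 2
    bridge-allow-untagged no
```

After saving the file, the changes can be applied with `sudo ifreload -a`.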
And that’s pretty much it for trunking the VLAN through the respective ports. On to the destination switch configuration.
SN3700Cs
On the SN3700Cs, the /etc/network/interfaces file looks much the same as the Edge-core’s when unconfigured. The interfaces on the Edge-core that connect to the SN3700Cs are two SFP+ ports, swp51 and swp52.
The SN3700Cs need a 100GbE -> 4x 10GbE SFP+ splitter to connect to the Edge-core’s SFP+ ports. This results in the ports being represented inside the OS as swp<n>s<m>, where n is the mapping to the physical port on the switch and m is the identifier for an individual cable from the splitter.
In my case, since I used port 32 with a 4x splitter, it results in swp32s[0-3], as Cumulus numbers the breakout interfaces from 0. I am only using the first cable from the splitter (swp32s0), so it’s not necessary to consider the other three.
Let’s look at the configuration:
We start with the same bridge creation; however, since I want to carry other VLANs to my servers, I won’t be dedicating this bridge to the internet alone. Because of that, I simply named it bridge.
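A minimal sketch of that bridge, assuming port 32 has been broken out as 4x10G in ports.conf. Cumulus numbers breakout interfaces from 0, so the first cable of the splitter is assumed to be swp32s0 here.

```text
# General-purpose VLAN-aware bridge on the SN3700C (sketch)
auto bridge
iface bridge
    bridge-vlan-aware yes
    # VLAN 2 is the internet VLAN trunked from the Edge-core
    bridge-vids 2
    # swp32s0: first breakout interface of port 32, facing the Edge-core
    bridge-ports swp32s0
```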
I have a use case for allowing untagged traffic with this switch, but I am not going to detail it in this post. I wanted to mention this in case you were confused as to why the untagged restriction was left out. In terms of VLANs, adding more is as easy as appending a space followed by the next VLAN ID to the bridge-vids parameter.
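For instance, a bridge carrying the internet VLAN plus two more could look like the sketch below. VLAN IDs 10 and 20 are hypothetical and only there to show the syntax, and swp32s0 is assumed to be the breakout interface facing the Edge-core.

```text
auto bridge
iface bridge
    bridge-vlan-aware yes
    # VLAN 2 is the internet VLAN; 10 and 20 are hypothetical extra VLANs
    bridge-vids 2 10 20
    bridge-ports swp32s0
```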
Now we can define our ports:
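A sketch of the uplink port stanza, again assuming the Edge-core is reached through swp32s0 (the first breakout interface of port 32). Listing bridge-vids on the port limits which of the bridge’s VLANs it carries. Without it, the port inherits the bridge’s VLAN list.

```text
# Uplink to the Edge-core via the first cable of the splitter
auto swp32s0
iface swp32s0
    bridge-vids 2
```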
All that’s left is to repeat the interface configuration for each node connected to the SN3700Cs that we want to provide with internet access.
For example, if we had a compute node connected to port swp25, we would change the configuration to look like this:
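A sketch of the result, assuming the compute node expects VLAN 2 as tagged traffic and that swp32s0 is the uplink to the Edge-core. If the node should instead receive VLAN 2 untagged, `bridge-access 2` on the port would be the setting to look at.

```text
# Compute node port, carrying the internet VLAN as tagged traffic
auto swp25
iface swp25
    bridge-vids 2

auto bridge
iface bridge
    bridge-vlan-aware yes
    bridge-vids 2
    # The node port joins the bridge alongside the uplink
    bridge-ports swp32s0 swp25
```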
Conclusion
It’s interesting working with different networking equipment, and the more I do, the more I realise that I need to learn more about computer networking in general.
I really do enjoy using a familiar Linux-like interface to my networking switches, because I believe that it enables easy-to-read documentation and means automation tooling (like Ansible) doesn’t have to cater specially to the switch. You could potentially use widely used Ansible modules that configure Debian networking on your switches, just the same as you do with your servers.
All in all, the VLAN configuration on Cumulus Linux is not overly complicated. I hope this post was helpful in some way.