Building loadfront: Owning our network

As we continue building loadfront, we have taken an essential step toward controlling what happens on the network our users depend on. I posted some of these milestones on my Twitter account, and I now want to describe them in more detail.

After wanting it for a long time, we now own our network: we no longer depend on the network of ASPA.cloud, our colocation/housing provider. That independence also has its drawbacks.

Past 🔗

Our connection to the outside world used to go through ASPA.cloud. We placed the servers, connected them to a switch on their network, and assigned the IP address(es) we wanted. It just worked.

RIPE assigned the IP address ranges to us, both IPv4 and IPv6. They are “ours,” but we did not announce them to the outside world ourselves; our provider did it on our behalf, on demand, through their routers.

This setup has its advantages and its drawbacks.

On the plus side, we didn’t have to worry about maintaining the network hardware, configuring it, or replacing a device when one failed.

We also didn’t have to negotiate with the providers that connected us to the global Internet, or worry about interruptions in their service, since the network was redundant at our provider’s level.

As for drawbacks, there are several.

The first is lack of control. We didn’t decide how the network was segmented, who our “neighbors” on the switches were, or when maintenance was carried out.

Another drawback is the price. Each port you connect to on the provider’s switch has a cost, and usually only one is included. If you want redundancy, or a separate management network, you need extra ports for each server. Each independent network (using VLANs) usually has an associated cost as well. That was not our case, though, which is something we greatly appreciate from ASPA.cloud.

Turnaround time also matters. Connecting to your own devices and making changes whenever you prefer is not the same as requesting a modification and waiting for it to be carried out.

But what worried me most was not being able to manage how we wanted the Internet to see our network through the BGP protocol, which lets us make routing decisions ourselves.

Present 🔗

Since January 29, we have had our own network equipment in the data center: a router, a switch, and an additional server.

Shortly after, we began migrating to our own network, using our own equipment.

The first network we migrated was the IPv6 one. One of my goals is for the entire loadfront network to be 100% IPv6, with IPv4 kept only on public services for those users (or remote services) that do not have IPv6 connectivity. Since we were not yet providing any service over IPv6 beyond some tests, we could proceed on this network without fear.
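As a small illustration of that dual-stack goal, here is a minimal Python sketch that checks which address families a public hostname resolves to; the hostname is a placeholder, not one of our real services:

```python
import socket

def address_families(host: str, port: int = 443) -> set:
    """Return the IP versions (IPv4/IPv6) that a host resolves to."""
    families = set()
    for family, _, _, _, _ in socket.getaddrinfo(host, port):
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families

# "example.com" is a placeholder; a dual-stacked public service should
# resolve to both address families here.
print(address_families("example.com"))
```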

We took the opportunity to enroll in the community services offered by Team Cymru, such as Nimbus and UTRS, and then migrated the Bogons service connection we already had. Being part of these communities gives us greater visibility into threats, improves our DDoS attack filtering, and lets us block connections from IPs that should never appear on the public Internet.
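As a rough illustration of that last idea only (the real fullbogons feed is delivered dynamically over BGP, not hardcoded), here is a minimal Python sketch that checks an address against a few well-known bogon ranges using the standard ipaddress module:

```python
import ipaddress

# A handful of well-known bogon ranges (private, loopback, link-local).
# Team Cymru’s fullbogons feed is far larger and changes over time;
# this static list is only for illustration.
BOGONS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("169.254.0.0/16"),
]

def is_bogon(address: str) -> bool:
    """Return True if the address falls inside a known bogon range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BOGONS)

print(is_bogon("192.168.1.10"))  # True: private space, drop it at the edge
print(is_bogon("8.8.8.8"))       # False: a public, routable address
```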

On February 10, we asked ASPA.cloud to relocate the server we already had there, so it would sit next to the new devices. The server was down for about 4 hours; we had warned our users a few days in advance.

A few days later, on February 15, we migrated the IPv4 network. This network is connected both to our router and to ASPA.cloud, so traffic to and from the Internet has two paths: our network and the ASPA.cloud network.

Using “more specific prefixes,” we tell the global Internet that we prefer all traffic to come and go through our network, keeping the ASPA.cloud network as a “backup.” Routers always pick the route with the longest (most specific) matching prefix, so announcing more specific prefixes from our own router makes that path win. BGP stuff.
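A minimal Python sketch of that longest-prefix-match rule, using made-up prefixes from documentation space rather than our real announcements:

```python
import ipaddress

# Hypothetical routing table: the covering /24 comes via the backup
# path, while two more specific /25s come via our own router. The
# prefixes are from documentation space, not our real announcements.
ROUTES = [
    (ipaddress.ip_network("203.0.113.0/24"), "ASPA.cloud (backup)"),
    (ipaddress.ip_network("203.0.113.0/25"), "our router (primary)"),
    (ipaddress.ip_network("203.0.113.128/25"), "our router (primary)"),
]

def best_route(address: str) -> str:
    """Pick the matching route with the longest prefix, as routers do."""
    ip = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in ROUTES if ip in net]
    net, hop = max(matches, key=lambda match: match[0].prefixlen)
    return f"{ip} -> {net} via {hop}"

print(best_route("203.0.113.42"))
# 203.0.113.42 -> 203.0.113.0/25 via our router (primary)
```

If our router ever stops announcing the more specific prefixes, the covering prefix from the backup path takes over automatically, which is exactly the failover behavior we want.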

The switch to our network caused a loss of service of about 7 minutes for one of the network segments (the shared hosting one). I had misconfigured the gateway: I forgot to propagate its VLAN properly through the network switch.

We have set up a small status page on UptimeRobot, where you can see the interruptions that occurred during these works. And through HE’s BGP Toolkit, you can see how we talk to the world, both in IPv4 and IPv6.

Now that we have complete control of our network, we have joined MANRS. This is important: we have committed to implementing, and demonstrating that we implement, the actions that keep the global Internet working as it should.

Future 🔗

We have just entered the world of providers that run their own networks. It is a small step compared to any established ISP/CSP, but a big one for us, and it is only the beginning of this path.

Currently, our network has Cogent as its primary transit provider and ASPA.cloud as a backup. In the middle of this year we will stop using the ASPA.cloud network, so our efforts will focus on adding at least one additional transit link for redundancy.

We also want a presence at Internet Exchange Points (IXPs) such as DE-CIX or Global Peer Connect (the latter run by Cogent). Being present there gives us better agreements and lower latency to other providers: everyone meets in the same place over direct links (“peering”), which reduces our dependence on IP transit because less of its capacity gets used. That, in turn, lowers the final cost, since we need less capacity on the transit links, as the back-of-the-envelope sketch below shows.
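Here is that back-of-the-envelope calculation with entirely invented numbers; none of these figures are loadfront’s real traffic levels, port fees, or transit prices:

```python
# Invented figures for illustration only: how offloading traffic to an
# IXP can shrink the transit bill. These are not loadfront’s numbers.
peak_traffic_mbps = 2000        # hypothetical peak traffic
peering_share = 0.4             # fraction of traffic reachable via IXP peers
transit_price_per_mbps = 0.50   # hypothetical monthly price per Mbps of transit
ixp_port_fee = 250.0            # hypothetical flat monthly IXP port fee

transit_only = peak_traffic_mbps * transit_price_per_mbps
with_peering = (peak_traffic_mbps * (1 - peering_share)
                * transit_price_per_mbps + ixp_port_fee)

print(f"transit only: {transit_only:.2f}/month")   # 1000.00/month
print(f"with peering: {with_peering:.2f}/month")   # 850.00/month
```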

The future of the loadfront network is to keep growing it, improve our connection to the world, and pass those benefits on to our end users.

Lessons learned 🔗

  • Having your own network provides flexibility, both at the management and the operational level, that does not exist otherwise.
  • A single misconfiguration can affect your users for anywhere from minutes to hours.
  • A good data center service, including remote hands, is critical for handling any eventuality.

Timeline 🔗

  • January 29, 2022: Installed a router, a switch, and an additional server in the data center.
  • January 31: Established IPv6 connectivity and announced our range through the installed router.
  • February 1: Established flow monitoring with Team Cymru’s Nimbus service.
  • February 8: Established a multihop BGP peering session with Team Cymru’s UTRS service.
  • February 10: Relocated the server “vms1” next to the rest of the equipment.
  • February 11: Re-established multihop BGP sessions with Team Cymru’s fullbogons BGP service for IPv4 and IPv6.
  • February 15: Migrated the IPv4 network to our devices, maintaining a backup link through the ASPA.cloud network.