Just have a segregated network, and let the VPC/DHCP do all the hard stuff.
Have your hosts on the default VLAN (or interface, if you're in the cloud) with its own subnet (subnets should only exist in one VLAN). Then, if you are in cloud land, add a second network adaptor on a different subnet. If you are running real steel, you can use a bonded network adaptor with multiple VLANs on the same interface. (A VLAN isn't that critical inside a VPC, because there are other tools to impose network segregation.)
Then use macvtap or macvlan (or whichever mechanism gives each container its own MAC address) to give each container its own IP. This means your container is visible on the entire subnet, both inside and outside the host.
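With Docker on real steel, a minimal macvlan setup looks something like this (the parent interface, subnet, gateway, and network name are all placeholders — adjust to your own network):

```shell
# Create a macvlan network bound to a physical interface (here eth0, assumed).
# Containers on this network get their own MAC and a first-class IP on the subnet.
docker network create -d macvlan \
  --subnet=192.168.10.0/24 \
  --gateway=192.168.10.1 \
  -o parent=eth0 \
  lan_net

# This container is now directly reachable at 192.168.10.50 from the whole subnet.
docker run --rm --network lan_net --ip 192.168.10.50 alpine ip addr
```

One known quirk: by default the host itself can't talk to its own macvlan containers over the parent interface; traffic has to hairpin through the switch or a host-side macvlan shim.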
There is no need to faff with routing; it comes for free with your VPC/network or similar. Each container automatically gets a hostname, an IP, and a route. It will also be fast. As a bonus, it can all be created up front using CloudFormation or Terraform.
You can have multiple adaptors on a host, so you can separate different classes of container.
Look, the more networking that you can offload to the actual network the better.
If you are ever re-creating DHCP/routing/DNS in your project, you need to take a step back and think hard about how you got there.
70% of the networking modes in k8s are batshit insane. A large number are basically attempts at vendor lock-in, or worse, someone's experiment that's got out of hand. I know networking has always been really poor in docker land, but there are ways to beat the stupid out of it.
The golden rule is this:
Always. Avoid. Network. Overlays.
I have bare metal servers tied together with L3 routing via Free Range Routing running BGP/VxLAN. It Just Works.
No hard-coded VLANs between physical machines, just point-to-point L3 links. VLANs are tortuous between machines, being a Layer 2 protocol, given spanning tree and all of its slow-to-converge madness.
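For a flavour of what that looks like, here's a sketch of an FRR config for BGP unnumbered over point-to-point interfaces with EVPN carrying the VXLAN routes (the ASN and interface names are hypothetical, not from the setup described above):

```
! frr.conf sketch -- BGP unnumbered on point-to-point L3 links
router bgp 65001
 neighbor swp1 interface remote-as external
 neighbor swp2 interface remote-as external
 !
 address-family l2vpn evpn
  neighbor swp1 activate
  neighbor swp2 activate
  advertise-all-vni
 exit-address-family
```

The "unnumbered" part is the trick: sessions run over IPv6 link-local addresses, so no per-link subnet planning and nothing for spanning tree to do.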
Therefore a different Golden Rule:
Always. Overlay. Your. Network.
Leave a note if you'd like more details.
> Always. Avoid. Network. Overlays.
What do you think VPC is?
But unless you have an actual reason, why put another layer on top? Especially given the performance and tooling hit.
We encrypt 100% of our machine-to-machine traffic with TLS. There's a lot of shuffling certs around to get some webapp to talk to Postgres, then have that webapp serve HTTPS to HAProxy, and so on.
It'd be awesome if there was a way your cloud servers could just talk to each other over WireGuard by default. We looked at setting it up, but it'd need to be automated somehow for anything above a handful of systems :/
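The per-host config itself is tiny, which is why people reach for templating; here's a sketch of one side of a mesh (all keys, addresses, and hostnames are placeholders — the real pain is distributing the public keys, which is exactly the automation problem mentioned above):

```
# /etc/wireguard/wg0.conf -- placeholder keys/addresses, brought up with wg-quick
[Interface]
PrivateKey = <this-host-private-key>
Address = 10.100.0.1/24
ListenPort = 51820

# one [Peer] block per other machine in the mesh
[Peer]
PublicKey = <peer-public-key>
AllowedIPs = 10.100.0.2/32
Endpoint = peer-host.internal.example:51820
```

Generating one of these per host and pushing new `[Peer]` blocks on every membership change is the part that needs tooling (Ansible/Terraform templates, or a coordination layer like Tailscale/Netmaker) once you're past a handful of systems.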
I don't understand why you'd want to do this?
I use wireguard to join machines on disparate networks into one.
Doing it inside the same VPC, though, I just don't get. If you don't trust your VPC, surely you need to be moving off the cloud?
Edit: also, the OSI layer model was specified in the eighties, and in 2019 it isn't all that accurate a description of how our networks actually work.
A subnet should only be in one vlan, but there are networks where there is more than one subnet in a vlan.
Whether that is appropriate or not, that would be a different topic.
In my experience, IPv4 has the strong advantage of being familiar and well-supported, which means that when (not if) your network infrastructure starts to act up, it's easier to figure out what's going on. IPv6 works great if you have robust, reliable multicast support on all your devices and nothing ever goes wrong.
In IPv4 you're going to need RFC1918 addresses, and then you're going to have to make sure that _your_ RFC1918 addresses don't conflict with any _other_ RFC1918 addresses that inevitably absolutely everything else is using or else you'll get hard-to-debug confusion. No need in IPv6, you should use globally unique addresses everywhere, there are plenty and you will not run out.
Everybody who has ever used a single byte to store a value they were convinced would never exceed a few dozen, only to have it blow up when somebody figured 300 ought to fit, already knows in their heart that they shouldn't be using IPv4 in 2019.
ZeroTier supports a mode where it emulates NDP for v6 and works without having to do multicast or broadcast at all. It does this by embedding its cryptographic VL1 addresses into v6 addresses.
Huh? Are you assuming large flat L2 networks addressed with IPv6?
IPv6 works great at scale, just route everything everywhere, stick with unicast & anycast, and don't roll large L2 domains. Multicast is entirely unnecessary aside from the small amount needed for ND/RA between host and ToR.
And, for operations, a routed IPv6 network without NAT, VXLAN, or VLANs spanned across switches is much easier to troubleshoot and generally has fewer moving parts to fail.
Quagga is available in the default package managers of most distros, so it's a good place to start.
And FRR has its own apt repo at deb.frrouting.org for other Debian-based systems.
Microsoft runs FRR on SONiC.
Vyos runs FRR.
6wind runs FRR.
Cumulus Networks runs FRR.
Juniper runs FRR in certain products.
VMware runs FRR.
Broadcom is integrating it.
I don't think you are very familiar with the scope of changes that have gone in since Quagga. Not to detract from BIRD - which is a great, solid BGP implementation - but it is disingenuous to say FRR isn't used in production.
Would you trust a comparison of two TCP implementations using those stats, as well?
For something simple like this post, using Quagga is completely fine and probably much better than using the latest Swiss Army knife.
The Quagga source repo[1]'s certificate expired over 6 months ago. Looking at the Bugzilla[2] report (also with an expired certificate), there are 14 blocker, 49 critical, and 69 further issues that have not been resolved.
So no, I'd agree with the parent comment that using a project as seemingly dead as Quagga for something as critical as BGP routing is putting yourself on shaky ground at the very least.
1. https://gogs.quagga.net/Quagga
2. https://bugzilla.quagga.net/report.cgi?x_axis_field=bug_seve...
Not entirely correct.
Linux has had unicast vxlan for quite some time.
Flannel is doing unicast and works pretty much anywhere.
See "Unicast with dynamic L3 entries" section: https://vincent.bernat.ch/en/blog/2017-vxlan-linux
Historically vxlan was a multicast thing, but not anymore.
Flannel (popular among the container networking solutions) maintains its state in etcd by watching the Kubernetes resources, then programs the Linux data plane with static unicast entries for the neighbors.
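For reference, the unicast variant from that article boils down to a couple of iproute2 commands (interface names, the VNI, and the VTEP addresses here are placeholders):

```shell
# Create a VXLAN device with no multicast group and no source-address
# learning; forwarding entries will be programmed statically instead.
ip link add vxlan0 type vxlan id 100 dstport 4789 \
   local 192.0.2.1 nolearning

# Static FDB entry: flood frames for unknown destinations to this remote
# VTEP. Add one "append" per remote VTEP in the mesh.
bridge fdb append 00:00:00:00:00:00 dev vxlan0 dst 192.0.2.2

ip link set up dev vxlan0
```

This is essentially what Flannel (or a BGP EVPN control plane) automates: something has to keep those static FDB/neighbor entries in sync as hosts come and go.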