Amazon VPC CNI vs Calico CNI vs Weave Net CNI on EKS

Setting Up CNI

Amazon VPC CNI

Create the EKS cluster:

eksctl create cluster --name awsvpccnitest --ssh-access=true
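The Amazon VPC CNI ships with EKS by default, so no extra installation is needed. As a quick sanity check that it is running, the CNI runs as the aws-node DaemonSet:

```shell
# The Amazon VPC CNI runs as the aws-node DaemonSet in kube-system
kubectl get daemonset aws-node -n kube-system
```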

Calico CNI

Create the EKS cluster with 0 nodes so that the Amazon VPC CNI doesn’t get configured on any EC2 instances:

eksctl create cluster --name calicocnitest --ssh-access=true --nodes 0
# Get the node group name for the cluster
eksctl get nodegroups --cluster calicocnitest
# Take the value in the NODEGROUP column and place it into this command to scale to 1 node
eksctl scale nodegroup --cluster calicocnitest --name <node group name> --nodes 1
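Note that between creating the cluster and scaling the node group up, Calico itself needs to be installed so the first node comes up with Calico networking. A sketch, assuming Calico's generic install manifest (check the Calico documentation for the EKS-specific instructions matching your version):

```shell
# Install Calico while the cluster still has 0 nodes; this manifest URL
# is Calico's generic install and may differ for your Calico version.
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```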

Weave Net CNI

Create the EKS cluster with 0 nodes so that the Amazon VPC CNI doesn’t get configured on any EC2 instances:

eksctl create cluster --name weavenetcnitest --ssh-access=true --nodes 0
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
# Get the node group name for the cluster
eksctl get nodegroups --cluster weavenetcnitest
# Take the value in the NODEGROUP column and place it into this command to scale to 1 node
eksctl scale nodegroup --cluster weavenetcnitest --name <node group name> --nodes 1
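Before deploying any workloads, it's worth confirming that Weave Net came up on the new node:

```shell
# The weave-net DaemonSet pods should be Running on every node
kubectl get pods -n kube-system -l name=weave-net -o wide
```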

Bootstrap the Cluster

Now that we have 3 running EKS clusters with 3 different CNIs, we will install the gRPC ping-pong client/server as well as the Prometheus Operator to collect metrics. I chose to do this using Weave Flux, which makes installation really simple.

kubectl -n kube-system create sa tiller

kubectl create clusterrolebinding tiller-cluster-rule \
--clusterrole=cluster-admin \
--serviceaccount=kube-system:tiller

helm init --service-account tiller --wait
git clone https://github.com/jwenz723/flux-grpcdemo
cd flux-grpcdemo
./scripts/flux-init.sh git@github.com:jwenz723/flux-grpcdemo
  • installs Prometheus Operator Helm Release
  • installs grpcdemo-client Helm Release
  • installs grpcdemo-server Helm Release
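Flux reconciles the repository asynchronously, so the three Helm releases can take a minute or two to appear. One way to watch progress (the resource names here are assumptions; they depend on the Flux and Helm Operator versions used by the repo's scripts):

```shell
# Watch Flux and the Helm operator sync the repo
kubectl get pods -n flux
# The three releases from flux-init.sh should eventually show up
kubectl get helmreleases --all-namespaces
```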

Visualizing the Data

With everything now running, I used Grafana to visualize the metrics. I chose to run Grafana on my own computer and connect to the Prometheus instance in each of the 3 EKS clusters using kubectl port-forward. You can install Grafana by following the Grafana installation instructions.

kubectl config use-context eks-weavenetCNI
kubectl port-forward -n promop svc/prometheus-operated 9090:9090 &
kubectl config use-context eks-calicoCNI
kubectl port-forward -n promop svc/prometheus-operated 9091:9090 &
kubectl config use-context eks-awsvpcCNI
kubectl port-forward -n promop svc/prometheus-operated 9092:9090 &
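With the three port-forwards running, each Prometheus can be added to Grafana as its own data source. This can be done in the UI, or scripted against Grafana's data source API; a sketch assuming a local Grafana on port 3000 with the default admin/admin credentials:

```shell
# Register each port-forwarded Prometheus as a Grafana data source
for port in 9090 9091 9092; do
  curl -s -u admin:admin -H 'Content-Type: application/json' \
    -d "{\"name\":\"prometheus-$port\",\"type\":\"prometheus\",\"url\":\"http://localhost:$port\",\"access\":\"proxy\"}" \
    http://localhost:3000/api/datasources
done
```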

Requests/Sec Results

1 Node

When running the EKS clusters each with only 1 node, the Amazon VPC CNI was able to perform slightly better than the other 2 CNIs.

  1. Amazon VPC CNI: 10.9116k
  2. Weave Net CNI: 10.6719k
  3. Calico CNI: 10.5539k
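The requests/sec numbers come from rating a counter exported by the grpcdemo server. The exact metric name depends on the grpcdemo instrumentation, so treat grpc_server_handled_total below as a placeholder; a query like this against any of the port-forwarded Prometheus instances reproduces the graph:

```shell
# Requests/sec handled by the gRPC server (metric name is an assumption)
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(grpc_server_handled_total[1m]))'
```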

2 Nodes

The first step of this test was to scale each cluster to 2 nodes, which is accomplished with the following commands:

# Scale AWS VPC EKS Cluster
eksctl get nodegroups --cluster awsvpccnitest
# Replace <node group name> with NODEGROUP value from previous cmd
eksctl scale nodegroup --cluster awsvpccnitest --name <node group name> --nodes 2
# Scale Calico EKS Cluster
eksctl get nodegroups --cluster calicocnitest
# Replace <node group name> with NODEGROUP value from previous cmd
eksctl scale nodegroup --cluster calicocnitest --name <node group name> --nodes 2
# Scale Weave Net EKS Cluster
eksctl get nodegroups --cluster weavenetcnitest
# Replace <node group name> with NODEGROUP value from previous cmd
eksctl scale nodegroup --cluster weavenetcnitest --name <node group name> --nodes 2
# Confirm which node each grpcdemo pod is currently scheduled on
kubectl get pods -n grpcdemo -o yaml
# Find each node's availability zone
kubectl get node <node name> -o yaml | grep failure-domain.beta.kubernetes.io/zone

To ensure the client and server land on different nodes, I added a nodeSelector to each deployment pinning it to a specific availability zone:

kubectl edit deploy -n grpcdemo grpcdemo-server

spec:
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: us-west-2a

kubectl edit deploy -n grpcdemo grpcdemo-client

spec:
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: us-west-2b

Verify that the pods are now running on separate nodes:

$ kubectl get pods -n grpcdemo -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE                                          NOMINATED NODE
grpcdemo-client-79bb44ddbb-jcdln   1/1     Running   0          8s    192.168.1.153   ip-192-168-1-14.us-west-2.compute.internal    <none>
grpcdemo-server-9fc5cd7d-k8vng     1/1     Running   0          42s   192.168.1.154   ip-192-168-1-154.us-west-2.compute.internal   <none>

With the client and server on separate nodes, the requests/sec results were (best to worst):
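The same nodeSelector change can be applied non-interactively with kubectl patch, which is handy for repeating the test; this is equivalent to the kubectl edit steps above:

```shell
# Pin server and client to different availability zones without an editor
kubectl patch deploy -n grpcdemo grpcdemo-server --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"failure-domain.beta.kubernetes.io/zone":"us-west-2a"}}}}}'
kubectl patch deploy -n grpcdemo grpcdemo-client --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"failure-domain.beta.kubernetes.io/zone":"us-west-2b"}}}}}'
```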
  1. Weave Net CNI: 3.968k
  2. Amazon VPC CNI: 3.518k
  3. Calico CNI: 1.902k

System Utilization Results

1 Node

CPU Utilization as cpu seconds used per second, not cpu % used (best to worst):

  1. Weave Net CNI: 0.00064
  2. Amazon VPC CNI: 0.00103
  3. Calico CNI: 0.01146

Memory Utilization (best to worst):

  1. Calico CNI: 31.7 MiB
  2. Amazon VPC CNI: 32.7 MiB
  3. Weave Net CNI: 102.9 MiB
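The CPU and memory figures can be reproduced with queries against the cAdvisor metrics scraped by the Prometheus Operator. The namespace selector below is an assumption; narrow it to each CNI's pod names (aws-node, calico-node, weave-net) as needed:

```shell
# CPU seconds used per second, per pod, in kube-system
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total{namespace="kube-system"}[5m])) by (pod)'
# Working-set memory per pod, in kube-system
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(container_memory_working_set_bytes{namespace="kube-system"}) by (pod)'
```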

2 Nodes

CPU Utilization as cpu seconds used per second, not cpu % used (best to worst):

  1. Weave Net CNI: 0.00162
  2. Amazon VPC CNI: 0.00184
  3. Calico CNI: 0.02404

Memory Utilization (best to worst):

  1. Calico CNI: 63.2 MiB
  2. Amazon VPC CNI: 65.1 MiB
  3. Weave Net CNI: 202.0 MiB

Conclusion

I am in no way claiming that the results shown above are conclusive. From my very naive benchmarking, I was able to conclude that the Amazon VPC CNI is at least comparable in performance to the other CNIs available. I believe any of the 3 CNIs compared here would be an excellent option.
