Docker Swarm Webinar Q&A: Long Live Docker Swarm!
In our recent webinar Long Live Docker Swarm!, Sr. Solutions Engineer Jason Epstein and Sr. Solutions Architect Avinash Desireddy discussed the current state of Docker Swarm and how its use has changed with the emergence of Kubernetes. This proved to be an extremely popular webinar, with tons of interest in where Docker Swarm stands today, and many questions about how to take advantage of Swarm now and in the future.
There were so many interesting and important questions raised during our talk that we wanted to transcribe them for you below. We didn’t have time to get to all the questions from the chat during the live presentation, so we’ve answered as many as we could here. If you are interested in seeing the full webinar, save your seat today, as we will be presenting on the same topic again at the end of this month.
Is Swarm being sunsetted or not? I thought Mirantis had only planned to support it for two years after the acquisition of Docker Enterprise?
It all stems from two years ago, when Mirantis first acquired the Docker Enterprise business and put out a blog post saying, "We're going to support Swarm for at least two more years, and we're going to take that time to see what the need is, what the market dictates…"
Unfortunately, some third parties took that and said, "Oh, they're only going to support it for two years and then they're going to sunset it." That's not true at all. This is important to us and to our future direction. We continue to invest in it and we have lots of large enterprise customers using Swarm in production at scale.
Swarm and Kubernetes are part of our future strategy and there are some cases where Swarm might be a better choice and other cases where Kubernetes is. All we're looking to do is just make our customers successful, no matter what orchestrator they're going to be on, now and in the future.
What are some of Swarm's scalability limitations?
Swarm has pretty good numbers at scale. It depends on what you're looking to do, but we actually have one customer who ran a test with 1,200 worker nodes on a Swarm cluster: 3 manager nodes, 1,200 worker nodes, as a deliberate "scale test." What they found was that the orchestration layer worked just fine. They did start seeing crashes and out-of-memory exceptions on the manager nodes, but that turned out to be because the application they were running could not handle that scale; the orchestration part itself held up fine.
Having 1,200 worker nodes is very rare. I've never seen a customer do it outside of deliberately testing the limits, but I think it tells you that whatever scale you're looking for, Swarm will most likely handle your needs. Certainly a few hundred nodes should be fine.
To add onto that, it completely depends on how these applications are deployed. We have thousands of applications deployed on small clusters, where the limiting factor is the manager nodes. It depends on how frequently applications are being restarted and how well they are managed. We are looking not just at horizontal scalability, but also at vertical scalability and at how the applications are managed.
Are there any existing deployment patterns with Swarm having, say, a Dev environment, a QA environment, and a production environment? Is there an example of how to implement Swarm in this way?
Right within Swarm, if you have to put all three environments on the same cluster, you can separate them using node labels. Mirantis also has a custom solution built on top of that, achieved through collections, called Swarm Collections: you attach a collection to a group of nodes, and all of your workloads are isolated to that specific collection of nodes.
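As a minimal sketch of the node-label approach (the label name env, the node names, and the service name web are illustrative, not from the webinar):

```shell
# Label nodes by environment (run from a manager node)
docker node update --label-add env=dev worker-1
docker node update --label-add env=prod worker-2

# Pin a service to nodes carrying a given label
docker service create --name web \
  --constraint 'node.labels.env==dev' \
  --replicas 2 nginx:alpine
```

The same pattern extends to QA and production by adding more label values and constraints.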
What can happen if you lose all of your managers, or what might happen if your host machine crashes and reboots? Will they automatically restart?
So, first of all, your worker nodes are going to continue to run even if your manager nodes go down, and then once the manager nodes come back up, they'll catch up. It's a similar idea to Kubernetes. If your manager nodes are offline or down and something happens with the containers on your worker nodes, they're not going to get automatically restarted. There's no facility to do that, but as long as everything is normal, they'll continue to run as you would expect. If your worker node dies, the Swarm manager will reschedule that workload on another machine that has the resources and is up. It's, again, very similar to what's done in Kubernetes.
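To watch this recovery behavior yourself, the standard health-check commands are (run from a manager node; the service name web is a placeholder):

```shell
docker node ls           # MANAGER STATUS column shows Leader / Reachable / Unreachable
docker service ps web    # shows where each task runs and any reschedules after a node failure
```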
What kind of databases or file storage might Swarm use to keep your configuration persistent?
There's a concept of persistent volumes in Swarm. So you might know there's already volumes at the container level, Docker volumes, but also we're implementing CSI so you have cluster-wide volumes in Swarm, so that you can persist your data that way, similar to what you would do in Kubernetes.
We also have plug-ins for different storage drivers that can be leveraged when spinning up a service. When you create the service, you pass extra arguments saying which volume you want to connect and which storage driver that volume is attached to. That is how the data an application stores can be persisted and reused.
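A minimal sketch of attaching a volume to a service (the volume name appdata and the local driver are placeholders; a CSI or vendor plug-in would be named with --driver instead):

```shell
# Create a named volume, then mount it into a service's tasks
docker volume create --driver local appdata
docker service create --name db \
  --mount type=volume,source=appdata,target=/var/lib/data \
  postgres:14
```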
Do you know if cloud providers offer managed Docker Swarm functionality or if, every time, it is necessary to have virtual machines running Docker in Swarm mode?
Cloud providers aren't going to have managed Swarm services. If you think about EKS, AKS, and GKE, where they offer managed Kubernetes services, there's no equivalent for Swarm. But if those machines are running the Docker Engine (or if you deploy Mirantis Container Runtime), then you'll have Swarm mode available on them. And, of course, if you use Mirantis Kubernetes Engine (MKE), which is our orchestration engine, then you're going to have enterprise-level Swarm that's fully supported and has all the bells and whistles that you'd expect from an enterprise product.
The second part of the question is: would it be necessary to have virtual machines running Docker in Swarm mode? You could obviously run Swarm on either virtual machines or bare metal, and it all works just fine. Although the question might be if I had an EC2 instance, would it be necessary to have that running in Swarm mode? And, yes, if you wanted to have Swarm, you could either do it that way, in Swarm mode, or you can use Mirantis Kubernetes Engine, which works on any of the cloud providers, and you can deploy the enterprise version of Swarm that way, whichever makes more sense for you.
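Turning a set of VMs or bare-metal hosts into a Swarm takes just two commands; the IP address and token below are placeholders:

```shell
# On the first machine (it becomes a manager)
docker swarm init --advertise-addr 10.0.0.10

# On each additional machine, paste the join command that init prints, e.g.:
docker swarm join --token SWMTKN-1-<token> 10.0.0.10:2377
```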
Is it possible to set up mutual TLS (two-way SSL) without having to add this feature inside of the microservice source code?
Interesting. The underlying question is that in Kubernetes, we can use a service mesh that'll do that without having to implement it into the source code. Can we do that in Swarm?
It cannot be done on top of plain open source Docker Swarm, but it can be done on top of the Mirantis stack. We have a component called Interlock with which we can set up mutual TLS between applications. That means if you deployed containers that don't use SSL (say, two microservices talking to each other over port 80, in the clear), Interlock can add that encryption for you. The feature is in its very early stages, so it will be available in the latest version of Mirantis Kubernetes Engine; it is not and will not be available in legacy versions.
What protocols are possible for communication between manager nodes and worker nodes?
By default, it's going to be gRPC. In the Kubernetes world, it's HTTPS, and I don't think that's something you can change. The traffic is encrypted, going over TLS, but gRPC is the protocol used for the communication.
You mentioned that Kubernetes uses a flat network. Does Swarm also use a flat network?
No. In Swarm, by default, when you deploy a new stack it automatically partitions that network and only the containers or the tasks that are part of that stack are going to be on that network partition. This is very deliberate. It gives you an additional security layer because, by default, you have this overlay network and only the containers deployed to that overlay network can communicate with each other instead of having every container able to communicate with every other container. So, by default, it's not a flat network.
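A quick sketch of that isolation (the service, image, and network names are illustrative):

```shell
# Services on the same overlay network can reach each other by name...
docker network create --driver overlay app1-net
docker service create --name api    --network app1-net myorg/api:latest
docker service create --name worker --network app1-net myorg/worker:latest

# ...while a service on a different overlay is isolated from them by default
docker network create --driver overlay app2-net
docker service create --name other --network app2-net nginx:alpine
```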
How does Swarm handle failure for tasks and nodes?
First of all, if a task fails, which would usually happen if the container goes down, Swarm will restart it generally on that same node. I don't think it's guaranteed that it's going to be on the same node, but generally as long as that node has sufficient resources and is up - this is very similar in concept to how Kubernetes works - then the task will be restarted on the same node. If the whole node goes down, Swarm will then reschedule that task on a different node that has resources to handle it and is up, again, similar to how Kubernetes works.
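The restart behavior can be tuned per service; the flags below are from the docker service create reference, with illustrative values:

```shell
docker service create --name web \
  --restart-condition on-failure \
  --restart-delay 5s \
  --restart-max-attempts 3 \
  nginx:alpine
```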
What's the autoscaling solution in Swarm? Kubernetes provides horizontal pod autoscaling, as well as cluster-node autoscaling, so what kind of equivalents or analogs might we find in Swarm?
In Swarm, you would scale your services manually. And you could script that, too, with a script that watches, say, resource utilization and then sends a command to either the Swarm API or the command line to scale up. It's really easy to do; there's just one command to increase the replicas of your containers. It's a little different from how Kubernetes works, though.
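The one-command scale-up looks like this, and a watch script in the spirit described above might look like the loop below (the service name web and the 80% CPU threshold are assumptions, not a production policy):

```shell
# One-off scaling
docker service scale web=5

# Naive "autoscaler" sketch: add a replica when the busiest container
# on this node exceeds 80% CPU
while true; do
  cpu=$(docker stats --no-stream --format '{{.CPUPerc}}' | tr -d '%' | sort -rn | head -1)
  if [ "${cpu%.*}" -gt 80 ]; then
    current=$(docker service inspect web -f '{{.Spec.Mode.Replicated.Replicas}}')
    docker service scale web=$(( current + 1 ))
  fi
  sleep 30
done
```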
How do you use Mirantis Kubernetes Engine (MKE) in Azure? Could you give any kind of example of that?
Using MKE in Azure, from our perspective, is going to work the same way as running MKE on-prem, or on GCP, or on AWS, or on Equinix, or on VMs, or anywhere. In the case of Azure, like anywhere else, you would use your IaaS layer. You would provision some nodes or spin up some virtual nodes and then install MKE on those nodes, the same way you would do it in your own data center.
Now, an even easier way is with Mirantis Container Cloud. It is kind of a federation product that will actually use Azure's API. Basically, you can just click on a button saying I want three manager nodes and four worker nodes, and it will go make the API calls to Microsoft Azure and spin up those nodes and install all the pieces on a MKE cluster.
Can you talk about the Swarm-only mode that is coming for Mirantis Kubernetes Engine? Is this just a Swarm-like interface for Kubernetes?
Swarm-only mode in the UI is going to look exactly like MKE - it just won't have the Kubernetes pieces. It's only going to have the Swarm pieces. The features that you get when using Swarm in MKE are exactly the same as Swarm-only mode. The benefit is that it’s a little bit easier to use because if you're just using Swarm, you don't need to worry about pods and stateful sets, or deployments - all those things present in Kubernetes. Also, you're not running Kubernetes components, so it's going to be a little more optimal for resource utilization, as an example.
In the case that you're beginning a container orchestration and you start with Docker Swarm, and later on, while you're coming up to speed on your knowledge of Kubernetes, then you have a choice. You can either stay with Docker Swarm, and then you use that in production, or development, wherever you want. If you really need the extensibility and the customizations available in Kubernetes, then you can move from Swarm to Kubernetes and there's a migration path for that. Mirantis would be able to help you with that migration process.
Is there an open source UCP for Swarm?
At this time, Mirantis Kubernetes Engine (formerly called Docker Enterprise/UCP) is only available as an enterprise, production-ready product from Mirantis.
Can Swarm manage nodes other than Docker nodes?
Currently, Swarm requires either Docker or Mirantis Container Runtime (MCR) as the container runtime.
What is the Docker Swarm licensing model?
Docker, Inc. offers a free, unsupported version. The enterprise version that includes full support and production capabilities is available as a paid version through Mirantis.
What is the drawback of having workloads on manager nodes?
For a single-node development environment, you can do this. However, for production or on any critical deployments, we recommend against this. This is mainly to avoid resource contention between manager node components and application workloads.
Will Swarm be open source “forever”?
That is our intention. Of course, "forever" is a very long time and hard to predict.
Are there plans for a Swarm cloud service inside of Azure? AWS?
We are not aware of any managed Swarm service provided by a cloud provider, but you can easily run Swarm on any cloud provider with Mirantis Kubernetes Engine.
Swarm runs on Windows containers; does it run on Linux containers as well?
Yes, Swarm runs on both Windows and Linux containers.
In Mirantis Kubernetes Engine, is Swarm being used for the control plane?
Yes, Mirantis Kubernetes Engine runs on top of Docker Swarm.
Is there a specific example you can provide in which Kubernetes is preferred to Swarm? Maybe there are technical limitations to Swarm not present in Kubernetes?
This would be something for more advanced or custom use cases that require things like CRDs, a service mesh, Helm charts, etc, anything specific to a Kubernetes architecture.
Is there a tool to switch from Swarm to Kubernetes? Vice versa?
You can run Docker Compose in Kubernetes mode to move simple configs from Swarm to Kubernetes, but for anything more complex, we recommend engaging with our professional services team to aid in your migration. Contact us today if you're interested in learning more.
Is Swarm still implemented with SwarmKit? Is there any relationship with Kubernetes now?
Swarm is still implemented with SwarmKit, and there is no relationship with Kubernetes at this point in time.
How do you configure a dashboard for Docker Swarm?
Dashboards are available by enabling enterprise components on Swarm using Mirantis Kubernetes Engine. There are a few open source projects like Swarmpit that provide nice visualization of Swarm components.
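As one example, Swarmpit publishes a one-line installer that runs against the local Docker socket (the image tag below may have moved since this was written, so treat it as an assumption):

```shell
docker run -it --rm --name swarmpit-installer \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  swarmpit/install:1.9
```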
Is it possible for Swarm to work in multi-region on a geo-scale?
This is a really interesting use case, but not one currently in scope for Docker Swarm: there are various timeouts within Swarm mode that do not handle high latency well, and the cluster may start to assume nodes are down, lose quorum, or develop networking issues.
Is it a good idea to use global services with an AWS autoscaling group?
Yes, global services will work seamlessly with the AWS autoscaling group.
When running Swarm in production, why is there a limit to the amount of containers that can be run on a given overlay network? Are there plans to increase/eliminate this limit?
The overlay networks Swarm creates use a /24 subnet by default, which caps the number of container IPs on each network. The subnet can be customized when creating an overlay network. For example:

docker network create --driver overlay --subnet 192.168.0.0/16 --gateway 192.168.0.1 my_network

This creates a my_network overlay network with room for roughly 65,000 containers.
What's the best way to manage LetsEncrypt SSL certificates in Swarm? In Kubernetes you have cert-manager; what about in Swarm?
There is no out-of-the-box integration with LetsEncrypt certificates in Swarm. However, it is possible to create a LetsEncrypt container in a service, and expose the generated certs to target Swarm containers using volume mounts.
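A rough sketch of that pattern, with certbot writing certificates into a shared volume that the web service mounts read-only (the image names, domain, email, and paths are all illustrative assumptions):

```shell
docker volume create letsencrypt

# Certificate issuance into the shared volume
docker service create --name certbot --publish 80:80 \
  --mount type=volume,source=letsencrypt,target=/etc/letsencrypt \
  certbot/certbot certonly --standalone \
  -d example.com -m admin@example.com --agree-tos -n

# Web service consumes the certs read-only
docker service create --name web --publish 443:443 \
  --mount type=volume,source=letsencrypt,target=/etc/letsencrypt,readonly \
  nginx:alpine
```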
Are there different tasks/service update strategies?
Yes, Swarm supports several update and rollback strategies. These options can be set when the docker service create command is performed, or changed afterward with docker service update. See the MKE documentation for more information.
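For example, a rolling-update policy can be set or changed with flags like these (values are illustrative; the service name web is a placeholder):

```shell
docker service update \
  --update-parallelism 2 \
  --update-delay 10s \
  --update-failure-action rollback \
  --update-order start-first \
  web
```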
Are any Mirantis customers running Windows containers on Kubernetes at production scale? Are they stable?
Yes, we do have customers running Windows containers on Kubernetes (and Swarm) in production. There were a few challenges in the beginning, but it is much more stable now.
How does Mirantis message its customers that don't accept Docker but prefer Kubernetes given that Mirantis Kubernetes Engine supports both?
Docker is the core building block of Mirantis Kubernetes Engine. First, we would like to understand the challenges that lead a customer not to accept Docker. Based on that discussion, we help educate them and migrate their workloads accordingly.
Save Your Seat to Learn more about Swarm
We hope that you found this Q&A helpful. If you are interested in watching the full version of this webinar, sign up for our November 24th presentation here. You will also be able to access the full version on demand after this date.
If you still have questions regarding Docker Swarm, Kubernetes, or any of the topics discussed in the Q&A section above, please don’t hesitate to contact us. Our team is very knowledgeable and is standing by to assist you.