Moving to Cloud Native: How to Move Apps from Monolithic to Microservices
Enterprises face the challenge of consistently deploying and managing applications in production, at scale. Fortunately, there are more technologies and tools available today than ever before. However, transitioning from a traditional, monolithic architecture to a cloud native one comes with its own unique challenges. Below, you will find a list of the critical first steps you need to take when migrating applications from a monolithic to a microservices-based architecture.
Note: the following post is an excerpt from our latest eBook: From Virtualization to Containerization: A Guide for VMware Admins and Other Smart People by Bruce Basil Mathews. Download the full eBook here.
Table of Contents
- Logical Steps to Consider for Migration
- Step 1: Define Boundaries
- Step 2: Identify Coupling
- Step 3: Move to RPC
- Step 4: Define Data Ownership
- Step 5: Implement Asynchronous Messaging
- Cloud-Aware Languages for Cloud Native Development
Logical Steps to Consider for Migration
Compared to monolithic applications, microservices are small, autonomous units that address individual functions and work with others to help an application function. Operating with these distributed components brings several benefits, but also its own unique set of issues.
Maintaining the quality of your software during the move from your monolithic or other legacy system can be tough. Too often, it’s holding back teams from even starting the transition, but it can be accomplished with a little planning, preparation and perspiration. The process can be broken down into multiple steps. Let’s go through them together.
Step 1: Define Boundaries
The first step in the series is to define the boundaries and capabilities of your application. This will help you ferret out how much coupling you have within your monolith, which is the biggest factor in determining how challenging the process will be for this particular application.
The term spaghetti code is used far too often in reference to monolithic applications. The snide reference likely arises because almost all codebases accumulate more and more coupling in their logic as they grow in size and age. That is the difficult part of the transition: decoupling that which is too tightly coupled. However, the age and size of a codebase are not always indicative of this coupling problem. A monolithic application does not necessarily need to be tightly coupled.
We have defined coupling as the enemy. But, if good coding practices were employed during the creation of the monolith, the developer found a balance between cohesion and coupling. If the proper boundaries were defined in the monolith, then the transition will be much easier to accomplish. Microservices are single units that contain a certain set of capabilities within the system. The same boundaries can and should be applied within a monolithic application.
Step 2: Identify Coupling
Where step one involved defining the capabilities of the system, step two defines the boundaries (Bounded Contexts), each of which is a collection of those capabilities. For example, the shopping cart in a point-of-sale application may touch an inventory Bounded Context on one side and an identity Bounded Context on the other. The cart object is the vehicle for holding them together.
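To make this concrete, here is a minimal Python sketch of a point-of-sale monolith's capabilities grouped into Bounded Contexts. The capability names and the `Cart` fields are hypothetical, invented purely for illustration; your own application will have its own.

```python
from dataclasses import dataclass, field

# Hypothetical capabilities of a point-of-sale monolith, grouped by boundary.
BOUNDED_CONTEXTS = {
    "inventory": {"check_stock", "reserve_item", "release_item"},
    "identity": {"register_customer", "authenticate"},
    "cart": {"add_to_cart", "remove_from_cart", "checkout"},
}


def context_of(capability: str) -> str:
    """Return the single Bounded Context that owns a capability."""
    owners = [name for name, caps in BOUNDED_CONTEXTS.items() if capability in caps]
    if len(owners) != 1:
        raise ValueError(f"{capability!r} must have exactly one owner, found {owners}")
    return owners[0]


@dataclass
class Cart:
    """The cart refers to the other contexts by ID only, never by their internal objects."""

    customer_id: str  # owned by the identity context
    item_ids: list[str] = field(default_factory=list)  # owned by the inventory context
```

A capability that turns out to have two owners, or none, is usually a sign that a boundary still needs work.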
Step 3: Move to RPC
In step three, we need to discover and document which Bounded Contexts are coupled to other Bounded Contexts, that is, which capabilities in one Bounded Context need to call a function in another, usually to perform an action or get data.
This type of coupling breaks down into two categories:
- Function/API Calls
- Database Schema
In the first case, function/API calls were previously done in-process, but, since you’re separating out the function/API to a new service, you can’t make these calls in the same way. Instead, you need to make some type of Remote Procedure Call (RPC) over the network. The RPC is usually made via HTTP, gRPC, or a similar protocol.
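As a rough sketch of that change, the Python below keeps the original in-process function and adds an HTTP endpoint plus a client-side replacement for the old call. The `/stock/<sku>` route, the `STOCK` data, and every function name here are assumptions made for illustration, and it uses only the standard library rather than any particular RPC framework.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

STOCK = {"sku-1": 3}  # hypothetical data owned by the inventory service


def check_stock(sku: str) -> int:
    """The original in-process function inside the monolith."""
    return STOCK.get(sku, 0)


class InventoryHandler(BaseHTTPRequestHandler):
    """Exposes check_stock at GET /stock/<sku> (a made-up route)."""

    def do_GET(self):
        sku = self.path.rsplit("/", 1)[-1]
        body = json.dumps({"sku": sku, "quantity": check_stock(sku)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example quiet
        pass


def start_inventory_service() -> tuple[HTTPServer, str, int]:
    """Run the inventory service on a background thread and return its address."""
    server = HTTPServer(("127.0.0.1", 0), InventoryHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    return server, host, port


def check_stock_remote(host: str, port: int, sku: str) -> int:
    """What the cart calls instead of check_stock: a synchronous RPC over HTTP."""
    conn = HTTPConnection(host, port, timeout=2)
    try:
        conn.request("GET", f"/stock/{sku}")
        return json.load(conn.getresponse())["quantity"]
    finally:
        conn.close()
```

The caller now has to think about timeouts and network failures that simply did not exist when the call was in-process.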
Synchronous RPC is NOT a good long-term solution for any distributed system. However, it can be an intermediate step towards moving portions of a system, in phases, toward being independent and autonomous. Eventually, the synchronous RPC calls should be replaced with an asynchronous messaging system of some sort.
In the second case, each boundary must retain its own data. Shared databases are not used directly within a microservices context because they create an undesired coupling between services. Once decoupled, you do not want to have one Bounded Context query the schema of another Bounded Context.
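One way to picture this rule is with two Bounded Contexts that each hold a private database. The sketch below is only an illustration under assumed names (`InventoryContext`, `CartContext`, the table layouts), using in-memory SQLite so it runs anywhere. The cart never reads the inventory tables; it asks through the inventory context's public method.

```python
import sqlite3


class InventoryContext:
    """Owns its own database; nothing outside this class touches its tables."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE items (sku TEXT PRIMARY KEY, price_cents INTEGER)")
        self._db.execute("INSERT INTO items VALUES ('sku-1', 1250)")

    def price_of(self, sku: str) -> int:
        """Public interface: the only way other contexts learn a price."""
        row = self._db.execute(
            "SELECT price_cents FROM items WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None:
            raise KeyError(sku)
        return row[0]


class CartContext:
    """Owns cart data only and asks inventory for prices through its interface."""

    def __init__(self, inventory: InventoryContext) -> None:
        self._inventory = inventory
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE cart_lines (sku TEXT, quantity INTEGER)")

    def add(self, sku: str, quantity: int) -> None:
        self._db.execute("INSERT INTO cart_lines VALUES (?, ?)", (sku, quantity))

    def total_cents(self) -> int:
        lines = self._db.execute("SELECT sku, quantity FROM cart_lines").fetchall()
        return sum(self._inventory.price_of(sku) * qty for sku, qty in lines)
```

Because the inventory schema is hidden behind `price_of`, the inventory team could change its tables without breaking the cart.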
So far, by completing the steps outlined, we’ve created a distributed monolith. We have moved all communication that was in-process to an RPC call over the network. One small step for man; one giant leap toward a cloud native application.
Step 4: Define Data Ownership
In this step, you need to remove the tight coupling of data and the stateful storage mechanism employed to maintain the state of the application. This will enable you to move toward a more stateless model where the datagram passed between microservices only contains the data needed to support the task being performed within the boundaries of the microservice itself. Instead of going back and forth to a persistent storage area to get and update data in a shopping cart, for example, the cart object may maintain the identity object for the purchaser and the inventory IDs for the products in the cart until both are used at checkout time.
Step 5: Implement Asynchronous Messaging
Finally, in step five, you have a bit more work to do to get your services to be autonomous. Having carved off a Bounded Context, it’s a matter of applying the same concepts to other boundaries and capabilities in your system until you have transitioned all of them. Commands and Events will be sent and published to a Message Broker to remove RPC. This eliminates the complexity of retaining the access needed to execute remote procedure calls from the interaction between services, even if they live on different hosts. Role-Based Access Controlled certificates replace all of that interaction. This is a giant step in gaining autonomy of a service.
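For a feel of what publishing Events to a Message Broker looks like, here is a toy in-memory broker in Python. It is a stand-in for a real broker, not a replacement for one; the class, the `OrderPlaced` topic used in the usage note, and the payload shape are all hypothetical.

```python
import queue
import threading
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy message broker: publishers never call subscribers directly."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self._messages: queue.Queue = queue.Queue()
        threading.Thread(target=self._deliver, daemon=True).start()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        """Queue the message and return immediately, without waiting for handlers."""
        self._messages.put((topic, payload))

    def _deliver(self) -> None:
        # Background loop: hand each queued message to the topic's subscribers.
        while True:
            topic, payload = self._messages.get()
            try:
                for handler in list(self._subscribers.get(topic, [])):
                    handler(payload)
            finally:
                self._messages.task_done()

    def drain(self) -> None:
        """Block until everything published so far is handled (for demos and tests)."""
        self._messages.join()
```

An inventory service might call `broker.subscribe("OrderPlaced", reserve_items)` while the cart service calls `broker.publish("OrderPlaced", {...})`; neither needs to know where the other lives or whether it is currently running.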
To recap, the steps are:
- Define Boundaries
- Identify Coupling
- (Intermediate) Move to RPC
- Define Data Ownership
- Implement Asynchronous Messaging
Modern Cloud-Aware Languages for Cloud Native Development
Now let's look at some of the programming languages you might consider for this new development.
Since I’ve been at this a while, I will start with a few oldies but goodies that have stood the test of time and then go a bit deeper into the new kids on the block who will undoubtedly assume a more prominent position on the list as more and more students are taught them in school.
Let’s start with a few that grew up out of the development of the internet:
Java is widely known as a general-purpose programming language. Today, it has positioned itself as one of the best programming languages for cloud computing; it is used by millions of developers and runs on billions of devices across the globe. Java is also highly versatile: it is one of the few languages used to create applications for websites, desktops, mobile devices and video games.
This programming language provides many benefits, including:
- It is object-oriented.
- It can be used without complications from dependencies, etc.
- It is truly platform independent.
- It is fairly easy to learn.
- Cloud computing programs created using Java can run on different operating systems, including Windows, macOS, Linux and more, all running the same compiled bytecode on the Java Virtual Machine (JVM).
Java has security features built in that are robust and easy to use. If you want to build a serverless architecture, Java is one of the languages you can do it with. Several Java frameworks support AOT (ahead-of-time) compilation, which helps address two common serverless concerns: a large distribution size and a long cold start.
All major cloud providers, including Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP), offer first-class support for Java in their SDKs.
Python is one of the hottest languages in the cloud industry today. The Python language is geared towards novices so almost anyone can learn to program in it. Python provides exceptional features like third-party modules, immense support libraries, and open-source and community development options, to name a few. Python is a high-level, interpreted and very interactive object-oriented scripting language with a well-defined hierarchical indented format. Python was designed to be easily readable and uses English syntax and keywords frequently where other languages use punctuation. The Python language has fewer syntactic constructs than other languages, making it simpler to learn.
Python combines various high-tech features such as speed, productivity, community and open source development, extensive support libraries, third-party modules and more to improve programming. Whether you want to create business applications, games, operating systems, computational and scientific applications, or graphic design and image processing applications, Python has got you covered.
Python is commonly used for:
- Web frameworks and applications
- Scientific and computational applications
- GUI-based desktop applications
- Language development
Python is used extensively in the AWS Cloud and is natively supported by AWS Lambda. This is a great language to use for developing serverless applications on Amazon Web Services.
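To give a sense of how little code a serverless function needs, here is a minimal sketch of a Python handler for AWS Lambda. It assumes the function is invoked through an API Gateway proxy integration, which is why it reads `queryStringParameters` and returns a `statusCode`/`body` dictionary; the handler name is configurable in Lambda and the greeting logic is made up.

```python
import json


def lambda_handler(event, context):
    """Entry point Lambda invokes: `event` carries the request, `context` runtime metadata."""
    # queryStringParameters can be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, you can call it locally with a hand-built `event` dictionary before deploying anything.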
Okay, ASP.NET is not my favorite for a lot of reasons, mostly because it has traditionally been slanted toward use with the Windows operating system and its development is governed by Microsoft, but it still holds a prominent position in programming for the web. Strictly speaking, it is a web framework rather than a language.
ASP.NET is mostly used to develop web applications and websites with multiple functions. One of the reasons it has positioned itself as a fantastic cloud computing language is its ability to provide dynamic web pages and cutting-edge solutions that can be viewed across different browsers.
Beginning developers will find the ASP.NET framework fairly easy to use. ASP.NET comes with a lot of built-in features, including:
- It reduces the amount of code needed to build large applications.
- It is effective in the development of dynamic web pages.
- It is language-independent and extremely easy to use.
- It separates logic and content to keep application development inconveniences to a bare minimum.
- It uses built-in Windows authentication to secure applications.
PHP is a popular language used by programmers primarily for such purposes as website automation. It is fairly easy to learn and manipulate. PHP is a top choice for many programmers because it helps create applications with dynamic elements. PHP is an object-oriented language that can be used to develop complex and large web applications.
PHP has the ability to run on UNIX and Windows servers. It is also worth mentioning PHP’s powerful output buffer feature that makes it popular. Its remarkable speed, low cost, reliability, and security are also other features that make it worth investigation for cloud native applications.
PHP is integrated with several popular database management systems. It can connect to MySQL and manipulate the database fairly straightforwardly, for instance, to perform MySQL backups. For non-DBAs, backing up databases can be tedious and time-consuming. The task is made much easier with PHP.
PHP is reliable, safe, fast, and affordable. It is definitely a cloud computing language you should consider using to fulfill unique development needs.
Node.js is a JavaScript runtime rather than a language in its own right, but it is easy to work with and highly effective for developing end-to-end applications. Node.js features a non-blocking, evented, asynchronous communication pattern that allows applications to handle a huge number of connections. Because it runs on Google's V8 JavaScript engine, it is extremely fast, which makes it a favorite among many modern developers.
Some of the key benefits of Node.js include:
- Cross-platform compatibility
- The convenience of using one language (JavaScript) on both the client and the server
- Facilitating quick deployment and microservice development
- Highly Scalable
- Commendable data processing capability
- Active open-source community
- Additional functionality of Node Package Manager (NPM)
- Advanced hosting ability of NodeJs
- Fast data streaming
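The non-blocking, evented pattern described above is not unique to Node.js. As a rough illustration of the idea (this is Python's `asyncio`, not Node.js code), the sketch below handles many simulated connections on a single thread. The 0.1-second sleep is an assumed stand-in for a network or disk wait.

```python
import asyncio
import time


async def handle_connection(conn_id: int) -> int:
    """Simulate one connection that waits on I/O without blocking the event loop."""
    await asyncio.sleep(0.1)  # stands in for a network or disk wait
    return conn_id


async def serve(connections: int) -> float:
    """Handle all connections concurrently on one thread; return elapsed seconds."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_connection(i) for i in range(connections)))
    assert len(results) == connections
    return time.perf_counter() - start
```

Calling `asyncio.run(serve(100))` should take roughly as long as one wait rather than one hundred of them, because the waits overlap instead of queuing up behind each other.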
Now let's look at some of the newer offerings that have sprouted up recently to fill this cloud native programming need.
The Google-born language Go (often called Golang) is rapidly becoming the language of choice for many cloud native operations. Go has played a significant role in the creation of Docker, Kubernetes, Istio, and many of the other cloud-related technologies. Simply put, Go is the language of cloud infrastructure.
Go has gained popularity among programmers. It is one of the easiest-to-use languages for cloud native infrastructure work. It was originally used chiefly on Google Cloud Platform (GCP), but it is equally applicable on other cloud platforms. The Go team has also developed a portable library for the cloud, the Go Cloud Development Kit (Go CDK), which currently supports the three major cloud providers: Amazon, Azure, and Google Cloud Platform.
Many enterprises that work on the cloud today are utilizing Go. Its features, such as deep integration, libraries, authentication, and so on, make it an excellent tool for these enterprises. Programmers choose it because it helps them build fast, secure, efficient, and scalable applications.
Go has emerged as an alternative to some of the older programming languages. As a compiled language, it also tends to run faster than many interpreted languages.
Ballerina is an open-source programming language for the cloud that makes it easier to use, combine, and create network services. WSO2, an open-source integration software vendor founded in 2005, initiated the Ballerina language project in 2017. WSO2 had a strong presence at KubeCon 2018 in Copenhagen, showing how to use Ballerina on Kubernetes.
Cloud native programming inherently involves working with remote network endpoints: microservices, serverless, APIs, WebSockets, SaaS apps and more. Ballerina is a programming language that is designed to make coding network constructs an inherent part of the language, and to bring code-first agility to the challenge of integrating across endpoints.
Ballerina has first-class support for distributed transactions, circuit-breaker patterns, stream processing, data-access, JSON, XML, gRPC and many other network endpoints. It deploys directly onto Docker and Kubernetes. It integrates with common IDEs including IntelliJ and Visual Studio Code.
Ballerina empowers developers to write code to integrate items, rather than use complex configuration-based integration schemes. Ballerina’s underlying value type system makes JSON, XML, tables, records, maps and errors primitives, eliminating the need for libraries to work with these fundamental data structures. As a result, developers can do a lot of data structure manipulation using simple constructs within the source code. Sequence diagrams can then be automatically generated from code.
The language is designed around the following core design principles:
- Sequence diagram generation
- Granular Observability
- Concurrency workers are multithreaded
- Network aware
- Environment aware
- DevOps ready
- Secure by Design
- Built-in container support
The Ballerina language is an open source project that can be found on GitHub.
Pulumi is an infrastructure-as-code platform rather than a language: a downloadable CLI, runtime, libraries, and a hosted service work together to deliver a robust way of provisioning, updating, and managing cloud infrastructure using general-purpose programming languages. Pulumi is targeted to satisfy the following infrastructure as code needs:
- Build: Build cloud applications and infrastructure by combining the safety and reliability of infrastructure as code with the power of familiar programming languages and tools.
- Deploy: Deploy cloud applications and infrastructure faster and with confidence, using one shared approach that works for day one and beyond for the entire team.
- Manage: Manage cloud applications and infrastructure with a shared platform that helps teams adopt Cloud Engineering through collaboration, visibility, and policies and controls.
Pulumi uses a desired state model for managing infrastructure. A Pulumi program is executed by a language host to compute a desired state for a stack’s infrastructure. The deployment engine compares this desired state with the stack’s current state and determines what resources need to be created, updated or deleted. The engine uses a set of resource providers (such as AWS, Azure, and Kubernetes) to manage the individual resources. As it operates, the engine updates the state of your infrastructure with information about all resources that have been provisioned, as well as any pending operations.
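The desired state comparison can be pictured with a few lines of plain Python. To be clear, this is not Pulumi's code or API, just a simplified sketch of the idea; it assumes resources are keyed by name and described by a dictionary of properties, and the `plan` function is hypothetical.

```python
def plan(desired: dict[str, dict], current: dict[str, dict]) -> dict[str, list[str]]:
    """Compare desired and current resource properties, keyed by resource name."""
    shared = desired.keys() & current.keys()
    return {
        # In the desired state but not yet provisioned.
        "create": sorted(desired.keys() - current.keys()),
        # Provisioned, but with properties that differ from the desired ones.
        "update": sorted(name for name in shared if desired[name] != current[name]),
        # Provisioned, but no longer part of the desired state.
        "delete": sorted(current.keys() - desired.keys()),
    }
```

A resource that appears in both states with identical properties shows up in none of the three lists, which is why re-running an unchanged program results in no changes.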
Pulumi executes resource operations in parallel whenever possible but understands that some resources may have dependencies on other resources. If an output of one resource is provided as an input to another, the engine records the dependency between these two resources as part of the state and uses these when scheduling operations. By default, if a resource must be replaced, Pulumi will attempt to create a new copy of the resource before destroying the old one.
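The dependency-aware scheduling described here resembles a topological sort. The sketch below uses Python's standard `graphlib` module to group resources into batches that could run in parallel; it is only an analogy for how such an engine might order work, not Pulumi's implementation, and the resource names in the usage note are invented.

```python
from graphlib import TopologicalSorter


def schedule(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group resources into batches; everything within a batch can run in parallel.

    `dependencies` maps each resource to the set of resources it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # resources whose dependencies are all done
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Given `{"vpc": set(), "bucket": set(), "subnet": {"vpc"}, "server": {"subnet"}}`, the VPC and the bucket land in the first batch, the subnet in the second, and the server in the third.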