This week's news: Cloudy (cost) forecasts
Every Wednesday, Nick Chase and Eric Gregory from Mirantis go over the week’s cloud native and industry news on the Radio Cloud Native podcast.
This week, Eric discussed:
Public cloud customers moving to control cloud costs
The state of HTTP
How eBay Engineering uses OpenTelemetry
And more on the podcast
You can download the podcast from Apple Podcasts, Spotify, or wherever you get your podcasts. If you'd like to tune into the next show live, follow Mirantis on LinkedIn to receive our announcement of the next broadcast.
Public cloud customers moving to control cloud costs
Eric: On the show we talked a good deal last year about repatriation from the cloud and optimizing cloud spend. That was part of a wider trend, which in turn led to some lower-than-expected end-of-year earnings reports for the big public cloud providers. Now that’s gotten some interesting general media coverage in the Financial Times, in a piece by Richard Waters and Dave Lee entitled “Big Tech under pressure from cost-conscious customers.”
The write-up details how many companies have become leery of skyrocketing cloud costs and how that has led to slower-than-expected growth:
Revenue at Microsoft’s Azure cloud platform grew 42 per cent before the effects of foreign currency moves, a point below expectations, while Amazon Web Services sales grew 27 per cent, the slowest quarterly growth rate since Amazon started breaking out cloud sales from overall revenue.
For their part, both Amazon and Microsoft explained these earnings as the results of customer initiatives to optimize cloud spending. They also attempted to take a long-term view, essentially arguing that they’ve made it easier for customers to control their spend through various dashboard tools and offerings, which will theoretically slow growth in the near term but foster goodwill from customers over the long term. In a comment to analysts, Microsoft CEO Satya Nadella sums up the line of thinking like this: “In this particular period, I think we are going to optimize for long-term customer loyalty.”
The big cloud providers aren’t putting all of their chips on customer goodwill. The Financial Times piece also nods to efforts from Amazon and Microsoft to woo customers into longer-term deals with more attractive pricing. Of course, that lock-in comes with its own world of headaches, whether you’re talking compliance questions or disaster recovery or simply matching the right service to your needs.
This is something I know we’ll be talking a lot more about over the coming weeks and months, both here on the show and in other outlets. Kubecost and Mirantis are running a live workshop on Kubernetes cost reduction on April 13, and you can sign up for that if you’d like to dig deeper on controlling Kubernetes costs this year. We’ll put a link for the sign-up in the chat, and podcast and YouTube listeners, you can find that link in the description.
The state of HTTP
The Cloudflare blog published a nice piece reviewing the year in HTTP. And it was a big year for the protocol, with a brand new specification in HTTP/3. The big idea here was to continue the work of HTTP/2 in unblocking network traffic and making the protocol as efficient as possible for the network-heavy demands of modern use cases. In this case, that largely meant replacing TCP with QUIC, which feels sort of like retiring a reliable old car that just isn't fit for the task anymore.
The issue here is that TCP is an in-order protocol where one lost packet can domino to block all the packets waiting in line behind it, a problem known as head-of-line blocking. QUIC, which originally stood for Quick UDP Internet Connections, is exactly that: UDP-based and stream-multiplexing. Instead of one big orderly queue for packets, you have multiple independent streams leading to a given endpoint at once. That can have a higher initial bandwidth overhead, but it nets performance improvements under many circumstances because one rude customer doesn't hold up the entire line.
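To make the "one rude customer" idea concrete, here is a toy sketch in Python. It is not real TCP or QUIC, just a simplified model under stated assumptions: packets arrive one time unit apart, a single packet is lost and its retransmission shows up later, and the names used here (Packet, make_packets, retransmit_delay, and so on) are hypothetical and chosen only for illustration. It compares when each packet can be handed to the application with one ordered queue versus per-stream ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int      # global send order
    stream: int   # logical stream the packet belongs to
    arrival: int  # time the packet (or its retransmission) reaches the receiver


def delivery_times_single_queue(packets):
    """TCP-like model: one ordered queue, so a packet is delivered
    only once every earlier packet has arrived."""
    times = {}
    latest = 0
    for p in sorted(packets, key=lambda p: p.seq):
        latest = max(latest, p.arrival)
        times[p.seq] = latest
    return times


def delivery_times_per_stream(packets):
    """QUIC-like model: ordering is enforced only within each stream,
    so a loss delays that stream and no other."""
    times = {}
    latest_by_stream = {}
    for p in sorted(packets, key=lambda p: p.seq):
        latest = max(latest_by_stream.get(p.stream, 0), p.arrival)
        latest_by_stream[p.stream] = latest
        times[p.seq] = latest
    return times


def make_packets(n=6, streams=3, lost_seq=1, retransmit_delay=10):
    """Build n packets spread round-robin over the streams; the packet
    with sequence number lost_seq arrives late (assumed retransmission)."""
    packets = []
    for seq in range(n):
        arrival = seq + 1
        if seq == lost_seq:
            arrival += retransmit_delay
        packets.append(Packet(seq, seq % streams, arrival))
    return packets


packets = make_packets()
tcp_like = delivery_times_single_queue(packets)
quic_like = delivery_times_per_stream(packets)

for p in packets:
    print(f"seq={p.seq} stream={p.stream} "
          f"single-queue={tcp_like[p.seq]} per-stream={quic_like[p.seq]}")
```

In this model, every packet behind the lost one waits for the retransmission in the single-queue case, while with per-stream ordering only the packets that share a stream with the lost one are held up.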
In addition to HTTP/3, other big milestones for HTTP included extensions focused on privacy, including UDP tunneling that can be used with HTTP/3 and QUIC, and work on security specifications such as Digest Fields and HTTP Message Signatures.
For much more information and a treasure trove of resource links, check out the blog.
How eBay Engineering uses OpenTelemetry
There’s another nice blog from eBay Engineering entitled “Why and how eBay Pivoted to OpenTelemetry,” which may be of interest to anyone pondering their observability strategy, especially at scale. Here’s an excerpt from the blog to set the stage:
eBay’s observability platform Sherlock.io provides developers and Site Reliability Engineers (SREs) with a robust set of cloud-native offerings to observe the various applications that power the eBay ecosystem. Sherlock.io supports the three pillars of observability — metrics, logs and traces. The platform’s metricstore is a clustered and sharded implementation of the Prometheus storage engine. We use the Metricbeat agent to scrape around 1.5 million Prometheus endpoints every minute, which are ingested into the metricstores. These endpoints along with recording rules result in ingesting around 40 million samples per second. The ingested samples result in 3 billion active series being stored on Prometheus. As a result, eBay’s observability platform operates at an uncommonly massive scale, which brings with it new challenges.
The post goes on to break down, in pretty exacting detail, how and why eBay’s approach to observability has evolved, first from metrics scraping via DaemonSets to cluster-local scrapes via StatefulSet, then to replacing that scraper with OpenTelemetry Collector. In the genre of big companies’ engineering blogs, it’s a particularly good example that gives you a lot of context and really walks you through the problem-solving process step by step. Definitely worth a read.
Check out the podcast for more of this week's stories.