Happy May everyone! I hope this month was a great one for you, and that those of you here in the United States got to take a nice break on Memorial Day. If your part of the world had holidays in May, I hope you got to enjoy those.
I managed to get a few ski days in before the season officially ended and we had to move out of our ski lease. Now comes the time where I put my ski gear away until December. But hey, serverless news is always fresh (admittedly not my best transition)!
I opted to use a photo of my dog getting caught in the snow (don’t worry, she was fine) to signify the serverless storm. This month I read a lot about how many cloud native application businesses are moving to serverless, and about a new era of enterprise efficiency being led by AI and serverless. This seems to indicate that AI is going to bring in a serverless storm, so let’s talk about it.
AI as a {Serverless} Service?
Rafay has released a Serverless Inference offering for NVIDIA Cloud Partners. The idea here is that it will allow these NVIDIA Cloud Partners to improve their margins on GPUs. The providers no longer have to focus on just being “GPU-as-a-Service” but can become “AI-as-a-Service” (I have called it Inference-as-a-Service in the past).
What does this mean? Well, today cloud providers offer GPUs for their customers to use to either train or serve their models. Nothing wrong with that setup, but the margins on hardware are pretty low. With the new Rafay software, the providers can now better monetize the act of inference itself.
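To make that margin argument concrete, here is a toy back-of-the-envelope sketch. Every number below is made up purely for illustration, not a real provider rate; the point is just the shape of the math: renting raw GPU hours earns a thin markup, while selling inference on the same hardware prices in the value-added service.

```python
# Toy margin model: GPU-as-a-Service vs. Inference-as-a-Service.
# All prices are illustrative assumptions, not real provider rates.

GPU_COST_PER_HOUR = 2.00           # what the provider pays to run one GPU
GPU_RENTAL_PRICE_PER_HOUR = 2.40   # what a GPU-as-a-Service customer pays

TOKENS_SERVED_PER_HOUR = 1_000_000  # assumed throughput of that same GPU
PRICE_PER_MILLION_TOKENS = 5.00     # what an inference customer pays

def margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue

gpu_margin = margin(GPU_RENTAL_PRICE_PER_HOUR, GPU_COST_PER_HOUR)
inference_revenue = (TOKENS_SERVED_PER_HOUR / 1_000_000) * PRICE_PER_MILLION_TOKENS
inference_margin = margin(inference_revenue, GPU_COST_PER_HOUR)

print(f"GPU rental margin: {gpu_margin:.0%}")
print(f"Inference margin:  {inference_margin:.0%}")
```

With these assumed numbers the rental margin works out to roughly a sixth of revenue while the inference margin is more than half, which is the whole pitch behind moving up the stack.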
One way to think of it is like going to a restaurant. I say that I want a burger. With the GPU-as-a-Service model, the restaurant gives me the hamburger meat, lettuce, tomato, pickles, and bun, then a table to assemble it on and a stove to cook it.
AI(Inference)-as-a-Service is the restaurant giving me a ready-made hamburger, where they can charge extra for labor, atmosphere, whatever. As the end user, I get what I wanted: food.
Sounds a lot like serverless, right? Instead of having to create my own server with inferencing software and a GPU attached to run my model, I can just use the inferencing software that my cloud provider gives me.
I predicted that inference-as-a-service was here to stay, and this move by Rafay and NVIDIA shows how it is happening.
Rafay had a great presentation on this concept if you want to dig in further.
Rise of Serverless MCP
In April I talked about the rise of the Model Context Protocol (MCP). Cloudflare has announced an MCP Server but so have many others!
As part of Google I/O, Google announced Cloud Run MCP Servers and an AI Studio integration. Cloud Run is Google’s serverless platform that relies on containers for its runtime.
AWS also announced their own Serverless MCP server. This of course integrates with AWS Lambda. While I didn’t find a specific product from Azure, they do have a blog post that shows how to run MCP servers on Azure Functions. It’s like a DIY kit for MCP Servers.
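To give a feel for why FaaS fits this use case, here is a minimal sketch of an MCP-style endpoint written as an AWS-Lambda-style handler. This is NOT the official MCP SDK or any of the products above; it only illustrates the shape of the problem: MCP messages are JSON-RPC 2.0 requests (such as `tools/list`), and a stateless function can answer them without a long-running server. The `echo` tool is hypothetical.

```python
import json

# Hypothetical tool catalog this server would advertise.
TOOLS = [
    {"name": "echo", "description": "Echo back the input text"},
]

def handler(event, context=None):
    """Handle one JSON-RPC 2.0 request, Lambda-invocation style.

    `event["body"]` carries the JSON payload, as it would behind an
    HTTP trigger; the return value mimics a proxy-style HTTP response.
    """
    request = json.loads(event["body"])
    if request.get("method") == "tools/list":
        result = {"tools": TOOLS}
    else:
        # A real MCP server also handles initialization, tools/call, etc.
        result = {"error": f"unsupported method: {request.get('method')}"}
    response = {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
    return {"statusCode": 200, "body": json.dumps(response)}
```

Each invocation is self-contained, which is exactly why the hyperscalers reach for serverless runtimes here: the real SDKs layer protocol details (sessions, streaming, tool invocation) on top of this same request-in, response-out pattern.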
The fact that “the big three” are validating MCP servers can signal that this is a concept that is here to stay… for now. AI is a space that is constantly changing but I think that MCP will evolve into something else rather than go away completely.
The other interesting fact is that they are focusing on the serverless use-case. Using MCP should be easy. Making someone spin up a Kubernetes cluster or a VM to deploy an MCP server seems a bit backwards in 2025. Of course you want to make that runtime easier to deploy, and that’s why serverless is the common choice for MCP servers among the cloud hyperscalers.
More on Serverless Startups
In the past I have talked about Neon, the serverless database often used for RAG implementations. Well, DataBricks is buying them for $1B. DataBricks has become one of the leaders in AI and data and could be seen as a late stage startup by many. They haven’t been acquired or filed for an IPO themselves, but my guess is that they will within 3 years.
DataBricks themselves are a serverless company in the sense that they have some first-party serverless offerings. Buying Neon would ultimately open up new options for their platform. This will help them have native RAG options on their platform allowing their customers to have a more customized experience. They could also offer it as a service to the hyperscalers.
Another startup, HostColor.com, has created a serverless WordPress offering. WordPress is an open source Content Management System (CMS) and is said to power roughly 40% of the internet. My site, jasonsmith.io, was a WordPress site up until a year and a half ago (read more about my migration here). Great product, but more than what I needed.
Anyway, I thought this was pretty cool as most deployments of WordPress that I have seen were monolithic. While there are many WordPress hosts out there, what I am curious to see is if this helps them provide better performance or an overall better offering.
They appear to be using a FaaS-based solution, so I am curious what that looks like from an architecture standpoint. My inclination is that they decomposed all of the application’s functions into actual function-based microservices.
If you want to see a serverless container-based solution, feel free to check out my demo here.
Guarding of Serverless
As more and more people use serverless technology, securing that technology will become important. Now there are two personas to consider here: the serverless provider and the serverless user. Ideally, with serverless, the user has less security to worry about, at least when it comes to infrastructure, but they are still responsible for their applications.
Serverless runtimes require people to rethink Application Security (AppSec). I found a neat article on HelpNetSecurity.com that talks a little about the considerations.
I want to encourage people to read this article on their own but I wanted to highlight something interesting that was said.
“With serverless, you’re no longer dealing with long-lived servers or predictable runtime behavior. Instead, you’re working with short-lived functions triggered by events, often running in unpredictable sequences. This makes it much harder to apply traditional security tools that expect stable infrastructure. What’s needed instead is a dynamic approach, one that understands how these functions behave in real time, within the context of the application’s actual workflows.”
Now they specifically called out functions but the same rules would apply to serverless containers and WASM. Security for serverless needs to be thought about in a different way.
And just because you don’t have access to the infrastructure doesn’t mean that you are hack proof. After all, there was recently an attack that targeted a serverless platform. While I don’t have specific news about a new serverless security platform or tool, I want to remind people to always take their security seriously when on the cloud.
Closing Thoughts
Serverless is growing; this is a fact. I believe that this AI revolution has been something of a tipping point for the serverless movement. It is now reshaping FinOps as compute becomes even more “pay-as-you-go” than it has been in previous years.
This past month we saw a major acquisition of a serverless firm, someone taking 20-year-old software and making it serverless, and serverless MCP servers being provided by the major clouds. A lot of this, again, is being driven by the rise of generative AI.
I look forward to seeing what we have next month.
By the way, if you want a crash course on serverless AI/ML, check out this post from KD Nuggets.