March is a mad time for me. Not only was it my birthday, but I also managed to get some amazing skiing in after a large snowstorm. As a south Texan, I am not used to digging a car out of the snow, so that was definitely an experience! I think a place like the Bay Area is best for me. No snow, but not too far to get to snow.
Anyway, you didn’t come here to listen to me talk about skiing and snow. You came to learn about serverless!
There have been a number of interesting stories as of late.
Agentic AI and Serverless
I have spoken often about Serverless AI and Inference as a Service. I even published a blog post with Google on inference-as-a-service last month. One thing that I often state is that serverless is the perfect platform for generative AI. Not so much the training (though who knows what the future holds) but definitely with regards to serving/inferencing.
As Generative AI has matured, we need to look past LLMs and look more towards agents. At least this is what Marc Benioff said in a recent interview. Generative AI is going to be less about building “bigger and badder” LLMs and more about building software that can leverage LLMs to perform tasks with little to no human input.
Needless to say, the best way to do this would be with serverless runtimes. Let the developers build agents with the least friction possible. We are seeing a rise of serverless runtimes built for AI. Even NVIDIA launched their DGX Serverless Interface recently. They see the value in serverless for AI.
We need to think beyond simply building chatbots and toward building agents capable of doing more than chatting. Agents need to be able to scale up and scale down as needed, because these will arguably be very “bursty” workloads, and maintaining dedicated infrastructure for them just doesn’t make sense.
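To make the “bursty agent on a serverless runtime” idea concrete, here is a minimal sketch of what such a handler could look like. This is purely illustrative: `runAgent`, the event shape, and the handler signature are all my own placeholder names, not any particular platform’s API.

```typescript
// Hypothetical request/response shapes for an agent invocation.
type AgentRequest = { task: string };
type AgentResponse = { result: string };

// Placeholder for an LLM-backed agent loop. In practice this would call a
// model API and whatever tools the agent needs to complete the task.
async function runAgent(task: string): Promise<string> {
  return `completed: ${task}`;
}

// The serverless platform spins instances of this up during a burst of
// traffic and scales back to zero when agents are idle -- no standing
// infrastructure to maintain between bursts.
export async function handler(event: AgentRequest): Promise<AgentResponse> {
  const result = await runAgent(event.task);
  return { result };
}
```

The point is how little is here: the developer writes the agent logic, and the runtime owns scaling, which is exactly the low-friction model the agent use case calls for.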
Some Raises, Some Developments, All Serverless
Featherless.ai just raised money from Airbus Ventures. They focus mainly on serverless inference for AI. This space is truly growing, as developers want simple ways to leverage AI without taking on unnecessary technical debt.
In the past I talked about Fermyon and their investment in WASM as a runtime. Well, just recently they teamed up with Akamai to bring serverless AI to the edge. (If you want to learn more about edge computing and serverless, feel free to review this IEEE article.)
You could, in theory, have mobile applications or kiosk applications that connect to the internet and leverage AI runtimes on a CDN. That’s pretty wild when you think about it. I mean, Cloudflare has been doing similar things with their functions and their global CDN, but this feels a bit different. For one, it’s WASM, which is proving to be a very strong runtime. I haven’t heard of someone bringing WASM to the edge before, so this can really change the game. If WASM is as fast as they claim, this could not only address cold-start times but also deliver a faster compute experience.
This article talks about serverless payment gateways and honestly, serverless-edge may be the best way to execute!
Vercel, a truly serverless company, has now announced Vercel Fluid. In short, it’s a way to allow serverless functions to behave more like a VM. You still get the elasticity of serverless functions, but each instance can handle multiple requests. This can also help address the cold-start issues that plague FaaS. Now, I am a tad skeptical of the claims, but I am very interested to see where this goes. I don’t know that it will be as scalable as pure functions, but I am also happy to be proven wrong.
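To illustrate the “multiple requests per instance” idea, here is a rough sketch of the execution model as I understand it. This is not the actual Vercel Fluid API; the names and structure are my own, and the in-memory state is just there to show what one warm instance serving concurrent requests looks like.

```typescript
// Module-level state is shared by every request that lands on this instance.
// In classic one-request-per-instance FaaS, inFlight would never exceed 1.
let warmedAt: number | null = null;
let inFlight = 0;

// Expensive setup (model clients, connection pools) is paid once per warm
// instance instead of once per request -- this is how cold starts amortize.
async function expensiveInit(): Promise<void> {
  if (warmedAt === null) {
    warmedAt = Date.now();
  }
}

export async function handler(req: { id: number }): Promise<string> {
  await expensiveInit();
  inFlight++;
  try {
    // Simulate I/O wait; while this request is idle, the same instance
    // can be serving other requests concurrently.
    await new Promise((resolve) => setTimeout(resolve, 10));
    return `req ${req.id} served by instance warmed at ${warmedAt}`;
  } finally {
    inFlight--;
  }
}
```

The trade-off my skepticism is aimed at: shared state and concurrency inside one instance buy efficiency, but they also reintroduce some of the reasoning burden (contention, per-instance memory) that pure one-shot functions let you ignore.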
Journey to Serverless
In Forbes, Kirimgeray Kirimli, CEO and cofounder of Snapshot Reviews and director of Flatiron Software, talked about how he moved Flatiron Software to serverless.
He does go over a few of the pain points and the “trial and error” they went through to get where they were going. Still, it’s a great story about how the overall simplicity of serverless made them more productive. I encourage you to read it on your own, but let me leave you with a quote…
“At the end of the day, infrastructure should work so seamlessly that you almost forget it’s there. And that’s the beauty of serverless—it clears the path for building what’s next, letting us focus on solutions that drive impact rather than roadblocks. As the technology continues to evolve, I’m confident serverless will play an even bigger role in shaping the future of how we design and maintain systems.”
Closing Thoughts
Serverless is PERFECT for Agentic AI. “SaaS is dead, long live Agentic AI” is what I have been hearing lately. While I think that may be a bit hyperbolic, there is a lot of truth to that.
People will begin interfacing more with agents vs. click-ops through a website to get the information they need. It’s possible that the web browser, as we know it, will no longer be used. We may instead have agent portals: you make a request to an agent and it returns an answer.
At the end of the day, serverless is what will ease the deployment of Agentic Applications. Stop worrying about infra and start worrying about your apps. The major investments into serverless for AI should be proof that this is a viable market!