Five unconventional predictions
As I sit here at year’s end, I’m reminded of the ancient Swedish tradition of årsgång - the ritual winter walk taken on New Year’s Eve to divine the fortunes of the coming year. The practice required one to walk alone in complete silence, visiting places of significance while remaining carefully alert to any signs or omens.1 While I may be rather ill-equipped for mystical midnight wanderings, I’ve spent enough time observing the enterprise AI space to develop my own form of augury.
1 I gather that, from time to time, certain mind-altering substances were involved.
2 Note that I did not come up with the idea of agentic AI – in fact, a good part of what I believe is that there’s much less novelty to agentic AI than some would like to pretend. Rather, my point was that this more mature framework of agents interacting with agents will dominate over simple human-machine interactions.
In 2023, I predicted the rise of agentic systems, back when chatbots were still seen as the dominant form of LLM usage.2 What at the time felt like a wild-ass guess is now almost received wisdom. And so, throwing my customary conservatism and restraint to the wind, in the spirit of årsgång, let me share five predictions for 2025.
AI governance gets productised
A few months ago, I shared a drink with an acquaintance whose work is in the Responsible AI field. It was pretty obvious he needed that drink a whole lot more than I needed my Diet Coke.
“Nobody wants or needs us,” he bemoaned.
“What are you talking about? Everybody is talking about Responsible AI and AI governance.” I was puzzled.
“No, I meant us,” he pointed at himself. “They want the ideas, the manuals, the guidelines… just not the people. They want governance as a service.”
Thinking back to several conversations I had this past year, I could see his point. There is plenty of interest in AI governance and Responsible AI, especially in the regulated sectors, where I spend most of my working (waking?) hours. This is unsurprising – in those sectors more than any other, the success of GenAI initiatives hinges to a fairly significant extent on regulatory tolerability. But I hear much more enthusiasm for neatly packaged, productised, almost SaaS-like governance products than for Responsible AI as a function.
And so, while everyone’s been politely nodding along to principles and frameworks for most of 2024, the conversations all seemed to end with the same question – “great, can I have this on a SaaS model?” What’s emerging is a clear pattern: organisations want Responsible AI practices, but they want them delivered as a service. This is at least partly due to buyers’ perception that many of those practices sound lofty and abstract, when their main concern is keeping the board and the regulators happy. You can’t feed a starving belly with high-minded principles.
We’re already seeing this in how cloud providers are starting to package governance features, in the rise of tools to facilitate this and, most tellingly, in how procurement departments are writing RFPs that specifically ask for such services in product-ish or service-ish terms – documents, procedures, TTPs and as-a-services. This shift signals the operationalisation of AI governance as it matures from theoretical frameworks to practical, subscription-based implementations. Because let’s face it: nothing says “we take ethics seriously” quite like a monthly fee.
My bet is that by the time the slowly emerging regulatory flora and fauna of AI bear their first fruits (say, late 2025 to early 2027), we’ll have entire platforms dedicated to automated governance, continuous monitoring and “RAI middleware” that sits between models and applications. The real winners will be those who can package the complex requirements of Responsible AI into digestible, subscription-based services that make compliance and governance feel as natural as running a CI/CD pipeline. And, of course, we’ll get a new buzzword out of it. I’ll go and grab the domains for “AIGovOps” after this, but I’m sure someone will come up with something even more cringe-worthy.
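To make the middleware idea concrete, here is a minimal sketch in Python of what such an “RAI middleware” layer might look like. Every name and policy rule in it is hypothetical – a toy stand-in for what actual products in this space would do:

# A minimal sketch of "RAI middleware": a governance layer that sits
# between an application and a model endpoint. All class names and
# policy rules are hypothetical -- an illustration, not a product.
import datetime
import json
from typing import Callable


class GovernanceMiddleware:
    def __init__(self, model_call: Callable[[str], str], banned_topics: list[str]):
        self.model_call = model_call        # the underlying model endpoint
        self.banned_topics = banned_topics  # a stand-in for real policy rules
        self.audit_log = []                 # in practice: an append-only store

    def __call__(self, prompt: str) -> str:
        # Pre-flight policy check on the way in...
        if any(topic in prompt.lower() for topic in self.banned_topics):
            self._record(prompt, verdict="blocked")
            return "Request declined by governance policy."
        response = self.model_call(prompt)
        # ...and an audit record on the way out, for the regulator's benefit.
        self._record(prompt, verdict="allowed", response=response)
        return response

    def _record(self, prompt: str, **fields) -> None:
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        self.audit_log.append({"ts": now, "prompt": prompt, **fields})


# Usage: wrap any model callable, then call it as before.
guarded = GovernanceMiddleware(model_call=lambda p: f"(model output for: {p})",
                               banned_topics=["insider trading"])
print(guarded("Summarise our Q3 filings."))
print(json.dumps(guarded.audit_log, indent=2))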
Small is beautiful (at the very least when it comes to language models)
In a paper I published earlier in 2024, I hijacked my audience for a few paragraphs’ worth of musings about the ethical, environmental, pragmatic and financial cases for small language models. At the time, one must recall, fine-tuning GPT models had just become feasible and, as happens with AI hype, instantly turned into a status symbol. You know an industry has jumped the shark when you buy a night light and it comes with a companion app with its own fine-tuned GPT. Just no.
Fortunately, the arc of AI development has largely bent the other way – towards small language models (which I believe is the correct direction). Turns out not every task needs a model that’s read all of Wikipedia and can write Shakespearean sonnets about your cat. Instead, we’re seeing enterprises discover that they can develop small, domain-specific models even for very narrow sets of terminology, with better results than the large generalist models – which can then be relegated to acting as dispatchers over these specialists. LLMs will be the general practitioners, while SLMs will increasingly take the specialist’s role.
It’s perhaps worth noting at this juncture that in AI, quantity has a quality all of its own. I’m somewhat reminded of a realisation that came to me this year when putting together a training plan. I hold multiple adaptive world records in a fairly esoteric sport called the SkiErg,3 at distances ranging from 100m (the shortest distance eligible for a record) to marathons and half-marathons (the longest record-eligible distances). My heart is mostly with longer distances, so I had to put quite a bit of thought into figuring out how to train for shorter, explosive sprints. A 100m sprint is not just a ‘shorter 2k’. You need to approach it as a distance of its own, with its own challenges and merits. I was amused to see the same in SLMs – these aren’t pared-down LLMs. Successful SLMs are created as SLMs to begin with, not as reduced afterthoughts of larger models.4 And so, just like I had to create a completely new training plan for sprint distances, SLM developers have to keep in mind that they’re not building LLMs writ small, but a different type of model with different desiderata.
3 Think of it as an indoor rower rotated by 90 degrees around its coronal plane, replicating not the ‘draw’ of rowing but the ‘pull’ of… something, I guess, having to do with skiing? Here’s a video.
4 That approach and attitude may be appropriate for quantisation, however, which is an entirely different story.
I expect we’ll see a proliferation of specialised SLMs in 2025, each trained on narrow domains but designed to work in concert. The art will be in the orchestration – how to route queries to the right specialist model, how to combine their outputs and how to maintain this constellation of smaller models efficiently. This parallels the evolution of microservices in software architecture, and we’re about to rediscover all the same lessons about service discovery, orchestration, and system design, but this time with AI models. The emergence of “ModelOps” is inevitable, and the best we can hope for is that we haven’t forgotten too much from last time.
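As a minimal sketch of that orchestration problem, consider the dispatcher pattern in toy form. The “models” below are stubs, and the keyword-overlap router is a deliberately crude stand-in for what would in practice be an LLM-driven or learned routing step:

# A toy sketch of the generalist-dispatches-to-specialists pattern.
# The "models" are stubs; the vocabulary-overlap router is a crude
# stand-in for a learned or LLM-based routing step.


def clinical_slm(query: str) -> str:
    return f"[clinical SLM] answer to: {query}"


def legal_slm(query: str) -> str:
    return f"[legal SLM] answer to: {query}"


def generalist_llm(query: str) -> str:
    return f"[generalist LLM] answer to: {query}"


SPECIALISTS = {
    "clinical": (clinical_slm, {"diagnosis", "dosage", "contraindication"}),
    "legal": (legal_slm, {"contract", "liability", "indemnity"}),
}


def dispatch(query: str) -> str:
    """Route a query to the best specialist, falling back to the generalist."""
    words = set(query.lower().split())
    for name, (model, vocabulary) in SPECIALISTS.items():
        if words & vocabulary:       # naive routing: vocabulary overlap
            return model(query)
    return generalist_llm(query)     # the GP handles everything else


print(dispatch("What is the standard dosage for amoxicillin?"))
print(dispatch("Does this indemnity clause survive termination?"))
print(dispatch("Write me a limerick about orchestration."))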
Plot twist: the new AI kingmakers
Here’s a plot twist for 2025: your next AI project will likely live or die based on a decision made by some actuary who’s never written a line of code in their life. The insurance industry – that most conservative of institutions – may well become the de facto regulator of enterprise AI deployment. We’re already seeing the early signs in how cyber insurance policies are evolving to cover AI incidents, and how underwriters are starting to ask increasingly sophisticated questions about model governance and deployment practices. Munich Re now offers AI insurance not only for commercial providers but also, essentially, for in-house AI work – and apparently, business is booming. Meanwhile, other insurers remain rather less sanguine: Lloyd’s commissioned a report last March that is noticeably less upbeat. At least they didn’t compare it to asbestos.
But AI insurance is coming, and it may well become the kingmaker of AI solutions. This will particularly affect startups, which often do not have the funds to pursue certification before pitching to clients. Alas, the comfort that an insurance and indemnity policy buys enterprise executives comes at a cost: market access will be rather less feasible for those who cannot obtain such insurance – who, paradoxically enough, are the ones who would need it most.
And since that leaves insurers with a ‘shadow governance’ function, I can well imagine the insurance industry creating de facto standards for AI governance faster than any standards body or government regulator could dream of. And let’s be honest – the actuaries will probably do a better job than most regulators could anyway.
Cross Estate AI (XEAI) is the future
Agentic AI was an interesting development, but to me, that’s primarily about the how. The really interesting story is of the what – that is, of what we can do with agents that we couldn’t do without them. And by far one of the most interesting such applications is what I call Cross Estate AI (XEAI). In XEAI, information crosses boundaries between enterprises as AI agents from different places are composed together to build an agentic structure. Think of API calling, but for AI agents – and with way more sophistication. A company’s agentic model may reach out to multiple other companies that offer their own agents, and collaborate with them. We are now relatively comfortable with notions like our AI agents calling an external API – after all, REST API calling has been around for a long, long time. But my company’s AI agent ‘calling up’ a specialised AI agent from another company to, say, engage in a discussion to refine the wording in a regulatory submission is something that enterprise stakeholders might need some time to get their head around.
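No such protocol exists yet – more on that below – so the following Python sketch of one estate handing a task to another estate’s agent is entirely hypothetical. The envelope format, the organisations and the shared-secret ‘signature’ are all invented for illustration; real implementations would need far stronger machinery, of which more in a moment:

# A purely hypothetical sketch of a cross-estate agent call. No such
# protocol exists today; every name and mechanism here is invented.
import hashlib
import json


def make_envelope(sender_org: str, recipient_org: str, task: str, shared_secret: str) -> dict:
    """Wrap a task in metadata the receiving estate can verify."""
    payload = {"from": sender_org, "to": recipient_org, "task": task}
    # A toy integrity tag standing in for real signatures and attestation.
    digest = hashlib.sha256((json.dumps(payload, sort_keys=True) + shared_secret).encode())
    return {"payload": payload, "tag": digest.hexdigest()}


def receive_envelope(envelope: dict, shared_secret: str) -> str:
    """The remote agent verifies provenance before doing any work."""
    payload = envelope["payload"]
    digest = hashlib.sha256((json.dumps(payload, sort_keys=True) + shared_secret).encode())
    if digest.hexdigest() != envelope["tag"]:
        raise PermissionError("Envelope failed verification; refusing task.")
    return f"[{payload['to']} agent] revised draft for: {payload['task']}"


secret = "out-of-band shared secret"   # in reality: PKI, enclaves, trust scoring
env = make_envelope("AcmePharma", "RegWordSmith Ltd",
                    "refine wording of s.5 of our regulatory submission", secret)
print(receive_envelope(env, secret))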
The technical challenges here are fascinating – we need dynamic trust scoring, cryptographic proofs of model lineage and secure compute enclaves. But the real challenge is the social architecture: a very complex dance of trust, verification and governance that will have to be mediated. We’re essentially speed-running the development of diplomatic protocols that took human societies centuries to develop. And just like real diplomacy, it’s all about managing relationships between different systems with different organizational cultures and governance models.
GenAI does not yet have a lingua franca akin to REST to allow systems to talk to each other, least of all one that allows not just a query-response format to be conveyed but also various conventions on trust and governance to be exchanged. The real potential of agentic AI lies in how various agents can interact and together create emergent structures that are ultimately more than the sum of their parts. Function calling and an AI agent being able to look up something on Google are neat, but not really anything spectacular that a simple RPA actor or a 10-line Python script starting with import requests couldn’t do just as well. To paraphrase Kipling: the strength of the agent is the ecosystem. The big challenge, then, is to figure out how we can allow such ecosystems to exist across the boundaries of corporate networks.
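For the avoidance of doubt, here is roughly the ten-line import requests script I have in mind – a plain lookup against Wikipedia’s public opensearch endpoint, no agency required:

# The unglamorous baseline: a web lookup in ten-ish lines of Python,
# no agents involved. Uses Wikipedia's public opensearch API.
import requests


def look_up(term: str) -> str:
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "opensearch", "search": term, "format": "json"},
        timeout=10,
    )
    resp.raise_for_status()
    titles = resp.json()[1]        # opensearch returns [query, titles, ...]
    return titles[0] if titles else "no result"


print(look_up("Rudyard Kipling"))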
Model metabolomics takes over
Energy efficiency is about to become the dominant factor in AI deployment economics. We’re not just talking about cost savings – we’re talking about a fundamental shift in how we think about AI system design. Just as biological systems optimise for metabolic efficiency rather than raw performance, we’re about to see AI architecture undergo a similar evolution.
This shift will upend the current obsession with inference speed and model size. Organisations will start optimising models for what I call “computational metabolomics” – the total energy cost of getting useful work done. We’ll see the emergence of new architectures that might be slower in raw terms but dramatically more efficient in their use of resources.
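As a back-of-the-envelope illustration of what such a metric might look like – all numbers below are invented – a “joules per useful output” calculation is almost embarrassingly simple:

# A back-of-the-envelope "computational metabolomics" metric:
# joules per useful output, rather than latency alone.
# All numbers are invented for illustration.

def joules_per_useful_output(avg_power_watts: float,
                             wall_time_s: float,
                             outputs: int,
                             acceptance_rate: float) -> float:
    """Total energy spent divided by outputs that were actually useful."""
    useful = outputs * acceptance_rate
    return (avg_power_watts * wall_time_s) / useful


# A big, fast model vs a small, slower but frugal one (made-up figures):
big = joules_per_useful_output(avg_power_watts=700, wall_time_s=3600,
                               outputs=120_000, acceptance_rate=0.9)
small = joules_per_useful_output(avg_power_watts=40, wall_time_s=7200,
                                 outputs=110_000, acceptance_rate=0.85)
print(f"big model:   {big:.2f} J per useful output")    # ~23.3 J
print(f"small model: {small:.2f} J per useful output")  # ~3.1 J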
The really controversial part? Many organisations will opt for slower but more energy-efficient approaches, fundamentally challenging the industry’s obsession with real-time everything. I expect to see in 2025 more and more architectures that prioritise efficiency over speed, and a new set of metrics focused on energy consumption per useful output. The hype cycles of 2023-24 were all about who could build the biggest model – 2025 will be about who can build the most efficient one.

Especially with the rise of SLMs, we’re slowly approaching the point where increasing the parameter size of generalist LLMs is no longer going to yield any useful business benefits, or get developers any more free drinks. Such diminishing returns are, of course, part and parcel of every evolving system, and while the usual suspects will conceptualise this as the beginning of a new AI winter (it is not!), it is a good thing. For now that we have reached a model size we’re comfortable with, we can start making it more metabolically efficient.

This is, of course, not limited to language models – in fact, this development has been going on for a long time in the computer vision world, where a good deal of processing occurs on edge devices. Much of the lacking enthusiasm for wholesale replacement of computer vision models with vision-language models like LLaVA, CogVLM or DeepSeek boils down to the simple economics of the matter: the vast majority of computer vision challenges can, in practice, be solved quite well with a 30-year-old algorithm that can be implemented in fifty lines of C. The same, incidentally, is true for LLMs. I wince when I see GPT-based approaches to problems that a bag-of-words classifier can solve with comparable accuracy, for essentially no cost and in a fraction of the time. And once the power of cool wears off, I expect a renaissance of many of those solutions.
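And to show just how cheap that baseline is, here is the sort of bag-of-words classifier I mean, in a dozen lines of scikit-learn (with toy training data, obviously – a real deployment would want rather more than four examples):

# The kind of near-free bag-of-words baseline I mean: a CountVectorizer
# feeding a naive Bayes classifier. Training data is a toy stand-in.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "refund my order immediately", "where is my parcel",   # complaints
    "great service, thank you", "love the new feature",    # praise
]
labels = ["complaint", "complaint", "praise", "praise"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)

print(clf.predict(["my parcel never arrived, refund me"]))  # -> ['complaint']
print(clf.predict(["thank you for the great feature"]))     # -> ['praise']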
As I complete this digital årsgång, what strikes me most is how 2025 looks to be the year when enterprise AI grows up. The trends all point toward practicality over pizzazz: smaller models over larger ones, efficiency over raw power, productised governance over philosophical frameworks, and insurance actuaries over innovation evangelists. Perhaps that’s not as exciting as the breathless predictions of AI singularities and digital transformations that dominated 2023, and maybe closer to the relatively sober atmosphere of 2024. But then again, maybe that’s exactly the point – real progress tends to be more about making things work than making headlines. And personally, I find that far more interesting.
Note: These are my personal (and somewhat tongue-in-cheek) views, and may not reflect the views of any organisation, company or board I am associated with, in particular HCLTech or HCL America Inc. My day-to-day consulting practice is complex, tailored to client needs and informed by a range of viewpoints and contributors. Click here for a full disclaimer.
Citation
@misc{csefalvay2024,
  author = {{Chris von Csefalvay}},
  title = {Five Unconventional Predictions},
  date = {2024-12-31},
  url = {https://chrisvoncsefalvay.com/posts/five-wild-guesses/},
  langid = {en-GB}
}