Seatbelts and straitjackets
DeepSeek has been grabbing headlines in AI circles lately, showing up everywhere from Discord servers full of ML enthusiasts to LinkedIn posts where “thought leaders” tag each other in endless threads. CNBC even ran a piece framing it as the latest challenge to American AI hegemony, and soon a story emerged painting DeepSeek as the scrappy competitor to OpenAI: a heart-warming underdog narrative about a small quant shop in the PRC that decided, on a whim, to open source its fancy new large language model. Except, as any cynic will tell you, if a story seems too neat, it probably is.

Of course, I had to go and try it. And I found something rather interesting – albeit unsurprising. In a dictatorship, truth can be adjusted; it is a negotiable commodity. And if the facts do not support the regime’s truth… well, then they can be suppressed. Ask it about something inconvenient to the CCP and you get exactly the answer you would expect.

Where it gets interesting is when you enable the search functionality, which – oddly enough – performs a search beyond the Great Firewall.1
1 The Great Firewall actually cuts both ways. Its main intent might be to keep China’s netizens from accessing the ‘free’ internet, but a good deal of the effort is also to keep users on this side of the Wall from having access to whatever slight semblance of occasional uncontrolled discourse there is on the Chinese internet. This aspect is rather often ignored by Western commentators unfamiliar with Chinese internet culture.

Ask it a question about the PRC’s track record on, say, human rights or historical controversies, and it will start to spill the beans – then abruptly slam on the brakes, invoking what we in the AI business call a ‘guardrail’, a kind of safety mechanism intended to protect users from undesirable outputs. It’s as if DeepSeek half-read you a classified file, then went “actually, never mind” and pretended you never asked. Who needs free inquiry when you can enjoy curated silence?
And that is the real scandal of DeepSeek. It’s not that it’s a CCP mouthpiece. It’s not even that the CCP managed to lobotomise a machine learning model and then put it out into the world as ‘open source’, making an absolute and utter mockery of that idea. The real scandal is that, in the process, it also managed to pervert the notion of responsible AI and guardrails, abusing what is meant to be a seatbelt and turning it into a straitjacket.
Whose seatbelt is it anyway?
Guardrails are meant to filter out harmful or illegal content. Sounds decent enough, right? Nobody wants a chatbot spitting out tips for building bombs or fomenting genocidal ideation. But guardrails also make a convenient muzzle when the people setting them have a vested interest in what can and cannot be said.
DeepSeek demonstrates this with such awkward flamboyance it almost feels like performance art. It shows that it knows certain inconvenient truths – only to pull the plug mid-sentence. Herein lies the problem: the same mechanism that can keep you safe from, say, the recipe for mustard gas (my pet test case for guardrails) can, in the wrong hands, keep you “safe” from factual history. When the powers behind an LLM use guardrails to stifle legitimate discourse, they invert the entire idea of “safety” – suddenly it’s about state or corporate safety, not user empowerment.
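To make the point concrete, here is a deliberately minimal sketch, in Python, of how a crude output-side guardrail works. Everything in it is hypothetical: the function, the blocklists and the canned refusal are my own illustrative inventions, not DeepSeek’s actual implementation, which almost certainly uses trained classifiers rather than regular expressions. The point stands regardless: the mechanism is indifferent to intent, and swapping the list turns harm reduction into censorship with no change to the code at all.

```python
import re

# Hypothetical blocklists. The filtering mechanism below is identical
# regardless of which list is plugged in; only the intent behind it changes.
SAFETY_PATTERNS = [r"\bmustard gas\b", r"\bnerve agent\b"]            # genuine harm reduction
CENSORSHIP_PATTERNS = [r"\btiananmen\b", r"\bre-education camps\b"]   # 'safety' from factual history


def guardrail_filter(draft_answer: str, blocked_patterns: list[str]) -> str:
    """Return the model's draft answer unless it matches a blocked pattern,
    in which case replace it with a canned refusal."""
    for pattern in blocked_patterns:
        if re.search(pattern, draft_answer, flags=re.IGNORECASE):
            return "Sorry, I can't help with that."
    return draft_answer


if __name__ == "__main__":
    draft = "The Tiananmen Square protests of 1989 were..."
    print(guardrail_filter(draft, SAFETY_PATTERNS))       # passes through untouched
    print(guardrail_filter(draft, CENSORSHIP_PATTERNS))   # silently refused
```

The same few lines that stand between a user and a chemical weapons recipe will, fed a different list, stand between that user and the historical record.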
Which, of course, makes the mythology about DeepSeek being just a jolly side project even more ridiculous.2 Training a massive model is not a trivial affair, financially or otherwise. You need loads of data, advanced expertise, technical infrastructure and computing power to pull this off. Doing so in the PRC, by a regulated company (which inevitably means CCP presence inside the company’s decision-making apparatus), means governmental oversight at best, direct involvement at worst, and my money is firmly on the latter.
2 Because one might need to be actively working in this field to understand quite how silly the assertion is that training a model of this size and accuracy, even with the clever RL-only trick, could be anything other than a large-scale industrial endeavour.
3 Never mind that they run it for free, at pretty good (OpenAI-defying!) levels of performance. With the volume of press they’ve been getting, the load must be incredible, yet I have encountered no performance degradation or outages, nor have any of the other researchers with whom I have had the pleasure of discussing DeepSeek these last few days. This is not running on someone’s home lab and a bunch of spare GPUs that aren’t busy doing whatever it is quants do with GPUs. Just operating this system at its current performance is an industrial-scale task, no matter how clever the reinforcement learning trick used to improve model efficiency might be.
I hope I have illustrated why the “quant side project” explanation is about as plausible as me spontaneously building a passenger jet in my garage. Sure, it is not impossible, but it sits ill with reality.3 The moment you notice state-friendly guardrails are baked into the system, the notion that nobody official was meddling starts to crumble.
Souring trust and fuelling polarisation
What really stings is how stunts like this undermine trust in AI more broadly. If a single model can appear to openly discuss a contentious issue, then suddenly lock down as if a party censor is peering over its shoulder, that sets off alarm bells. It is a short leap from there to suspecting all publicly released LLMs might be covert mouthpieces for whichever power sponsored them. The innocent curiosity that once framed AI as a neutral tool becomes overshadowed by paranoia.
This also amplifies the kind of polarisation we see in geopolitics. When big states can afford to spin up models that quietly nudge narratives in a particular direction, we lose the last remnants of hope that AI might be above the political fray. It becomes yet another field where states compete to drown one another in carefully curated content or half-truths, with users stuck in the middle.
DeepSeek drives home a new brand of cynicism: “If a fancy new LLM appears, can we trust it?” The official line might be “We open-sourced it! Nothing to hide!” But if the training data was curated, or if shadowy “alignment” policies are embedded, the model can still be a Trojan horse. Once that suspicion sets in, good luck convincing people to use AI tools for earnest, balanced exploration.
The answer, of course, is that one shouldn’t trust anything without, at the very least, verifying it. Yet not only is that avenue generally foreclosed to the lay end user, it is not even really afforded to those with the means and the knowledge. Open sourcing an LLM is not the same as open sourcing human-readable code. The sole reason DeepSeek’s internal flaws are so evident is that we know where to look. Were that not the case, or had the developers (and their minders from what is likely the PRC’s Ministry of State Security, the MSS) been any more subtle, we would not know what biases we have brought under our roof. The usual “don’t trust me, bro” disclaimers are about as useful here as the “not cleared by the FDA” notices on snake oil. It’s still, at the end of the day, making promises. It still, like snake oil, fails to keep them.
The epistemics of tools vs. information
We tend to treat large language models like glorified chat apps, forgetting that they amass vast amounts of textual knowledge. They do more than just parse grammar; they internalise cultural, historical and political contexts. When external gatekeepers meddle with the training set or impose ideological constraints, the model will reflect that in its embedded worldview – something most of us forget even exists. If users are unaware of those hidden constraints and accept answers at face value, they might never suspect how they are being manipulated, be it as ham-fisted as DeepSeek’s responses or the wiser, subtler, more insidious bending of the truth that a slightly less stupid evil regime would have attempted. We ultimately must disabuse ourselves of the notion that we’re dealing with unbiased, value-neutral tools, and treat LLMs as what they are – information. And the moment guardrails come into the picture, any claim to being free of human bias goes off the table.
This isn’t an argument against guardrails altogether. On the contrary, we need some form of alignment to keep truly vile content at bay. But the question remains: which alignment, and whose values? A possible way forward is an auditable chain of custody for model training, coupled with immutable model cards that detail the sources, curation processes and alignment methods. If a government or company demands specific guardrails for certain subjects, that fact should be clearly disclosed for all to see. But we have yet to see credible attempts to popularise such technology at any scale. More ink has been spilled comparing DeepSeek with o1 than on the glaring issues presented by a model that one could, and should, expect to bear the fingerprints of one of the most repressive regimes on the planet.
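For illustration only, here is a minimal sketch of what such a tamper-evident model card might look like. Every name, field and value below is my own assumption rather than any existing standard; the idea is simply that anything which shaped the model’s behaviour, including sponsor-mandated guardrails, is declared up front and sealed with a content hash, so that quiet, undisclosed edits become detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ModelCard:
    """Hypothetical tamper-evident model card: every influence on the model's
    behaviour is declared, then sealed with a content hash."""
    model_name: str
    training_sources: list[str] = field(default_factory=list)
    curation_steps: list[str] = field(default_factory=list)
    alignment_methods: list[str] = field(default_factory=list)
    mandated_guardrails: list[str] = field(default_factory=list)  # topics restricted at a sponsor's request

    def fingerprint(self) -> str:
        # Canonical JSON so the hash is stable across serialisations.
        canonical = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()


if __name__ == "__main__":
    card = ModelCard(
        model_name="example-llm",  # illustrative values throughout
        training_sources=["web crawl snapshot", "licensed news corpus"],
        curation_steps=["deduplication", "toxicity filtering"],
        alignment_methods=["RLHF", "rule-based refusals"],
        mandated_guardrails=["CBRN synthesis instructions"],
    )
    published_hash = card.fingerprint()  # published alongside the weights

    # Any later, undisclosed addition (say, a new forbidden topic) changes the hash:
    card.mandated_guardrails.append("discussion of certain historical events")
    assert card.fingerprint() != published_hash
    print("published:", published_hash[:16], "| after quiet edit:", card.fingerprint()[:16])
```

A hash on its own proves nothing, of course; a real chain of custody would also have to bind the card cryptographically to the released weights and to the actual training run, which is precisely the hard, unglamorous work nobody seems keen to fund.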
DeepSeek is more than just a technological marvel or a threat to Western AI hegemony. It’s a reminder that alignment itself can be weaponised – particularly by entities with a history of suppressing information and controlling narratives. When alignment is used to conceal rather than safeguard, or to manipulate rather than protect, we’re staring down the business end of a propaganda pipeline disguised as advanced software.
It’s easy to argue users are responsible for checking the answers they get, but let’s be honest – propaganda works. We know it does. It works because by and large, people don’t check the information they consume. When millions of users worldwide rely on these models to explain historical or political content, subtle manipulations can shape public understanding in ways we might not even notice until it’s too late. AI might not have self-awareness, but it certainly does have the power to shape awareness in others.
And so, here we are, strapped in tight on the propaganda rollercoaster by a seatbelt supposedly meant to protect us from harm. In a sense, we are fortunate – because hopefully, this will spark the right kind of discussion in certain corners about the painful reality that there are some very human hands turning the knobs and levers of alignment and guardrails. And we must have this awkward, painful discussion, because the alternative is a world in which reality is dictated by the mightiest sponsor with a big enough GPU farm – and a bigger political agenda.
Note: These are my personal (and somewhat tongue-in-cheek) views, and may not reflect the views of any organisation, company or board I am associated with, in particular HCLTech or HCL America Inc. My day-to-day consulting practice is complex, tailored to client needs and informed by a range of viewpoints and contributors.