US Tariffs Will Not Be Able To Stop "Made in China" AI Models
Fresh & Hot curated AI happenings in one snack. Never miss a byte 🍔

This snack byte will take approx 4 minutes to consume.
Remember when "Made in China" was synonymous with cheap knockoffs? Those days are long gone, especially in the realm of artificial intelligence.
As I sit here typing on my keyboard (probably made in China), I'm watching Chinese AI companies perform what can only be described as a technological pole vault over U.S. export restrictions.
In the ever-evolving world of AI, China's startups aren't just playing catch-up—they're sprinting toward the finish line, occasionally pausing to tie their shoelaces with impressive dexterity.
Despite facing increasingly stringent U.S. restrictions on advanced chip acquisitions—restrictions that President Trump has doubled down on since his January 2025 inauguration—these companies are proving that where there's a will and a few thousand GPUs, there's a way.
The Rising Dragons: China's AI Champions
Let's take a closer look at some of these AI mavericks that are making waves across the Pacific.
DeepSeek might sound like an underwater exploration company, but they're actually diving deep into the realm of advanced language models. Backed by High-Flyer, a quantitative hedge fund with a penchant for AI and approximately $8 billion under management, DeepSeek unveiled a large language model in November 2024 that claims to rival OpenAI's o1 reasoning model. Not bad for a company that began as the AI research unit of a hedge fund.
Moonshot AI isn't just aiming for the moon—they're shooting for the stars with backers like Alibaba and Tencent providing rocket fuel. Their math-specialized model purportedly gives o1 a run for its money. Their founder, Yang Zhilin, has emphasized reinforcement learning approaches that mimic human trial and error, potentially reducing the need for high-end computing power that's increasingly difficult to source from the U.S.
Alibaba itself isn't sitting on the sidelines either. The e-commerce giant has asserted that its experimental model outperformed OpenAI's in mathematical prowess. While these claims are harder to verify than a cat's age on the internet, even U.S. experts are nodding in begrudging respect.
Andrew Carr, a former OpenAI fellow turned AI entrepreneur, remarked, "China is catching up faster than many expected." He noted that DeepSeek's researchers replicated OpenAI's reasoning model within a few months, leaving many of his colleagues pleasantly surprised—and perhaps a tad concerned.
To understand China's current AI sprint, we need to look back at the starting blocks. Chinese language models didn't emerge overnight—they evolved through several distinct phases.
Phase 1: The Translation Era (2015-2019)
China's initial forays into language models were heavily focused on solving translation problems—a natural priority given the language barrier faced by Chinese researchers and businesses operating globally. Companies like iFlytek focused on speech recognition and translation technologies, laying groundwork for future LLM development.
Phase 2: The BERT Adaptation Period (2019-2021)
After Google released BERT in 2018, Chinese researchers quickly adapted this architecture for Chinese language processing. Models like ERNIE (from Baidu) and BERT-wwm (from Harbin Institute of Technology) emerged, optimized for the unique characteristics of Chinese language structure.
The Chinese language, with its logographic writing system and lack of spaces between words, presents unique challenges compared to alphabetic languages like English. These early models focused heavily on these linguistic peculiarities.
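To make the segmentation problem concrete, here is a minimal sketch, assuming a tiny hypothetical dictionary, of the greedy "maximum matching" approach that early Chinese NLP systems leaned on. Production language models today use statistical or subword tokenizers (BPE and friends) rather than this exact routine, but the underlying ambiguity it wrestles with is the same.

```python
# Toy illustration: Chinese text has no spaces, so a tokenizer must decide
# where words begin and end. This greedy "maximum matching" segmenter uses a
# tiny, made-up dictionary; real systems use statistical or subword tokenizers,
# but they face the same ambiguity.

VOCAB = {"北京", "大学", "北京大学", "学生", "研究", "研究生"}
MAX_WORD_LEN = 4

def segment(text: str) -> list[str]:
    """Greedily match the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in VOCAB:
                tokens.append(candidate)
                i += length
                break
    return tokens

print(segment("北京大学研究生"))  # ['北京大学', '研究生'] with this dictionary
```

An English sentence arrives pre-segmented by its spaces; a Chinese one does not, which is exactly why these early models poured so much effort into the segmentation step.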
Phase 3: The GPT Challenge (2021-2023)
As OpenAI's GPT models gained prominence, Chinese tech giants recognized the need to develop indigenous alternatives. Baidu's ERNIE Bot, Alibaba's Tongyi Qianwen, and the open-source Chinese-LLaMA marked China's serious entry into the generative AI space.
However, these models were generally regarded as trailing their U.S. counterparts by 6-12 months in capabilities, with most being adaptations of existing architectural approaches rather than novel innovations.
Phase 4: The Innovation Leap (2023-Present)
The current phase represents what I like to call China's "innovation leap." Facing export controls that limited access to cutting-edge Nvidia chips, Chinese companies were forced to innovate in software and architecture rather than simply throwing more computing power at problems.
This necessity-driven innovation has led to remarkable efficiency improvements. DeepSeek's creation of the Fire-Flyer 2 computing cluster in 2021 connected approximately 10,000 of Nvidia's A100 chips to create a powerhouse that, according to their August 2024 paper, achieved performance close to a similar Nvidia system but at a lower cost and with reduced energy consumption.
The Math Olympics: LLMs Go Head-to-Head
One of the battlegrounds for these AI models is the American Invitational Mathematics Examination (AIME), a test designed to challenge the brightest high school mathletes. It's like the Olympics for math nerds, and AI models are now competing for gold.
DeepSeek claims its model outperformed OpenAI's on the AIME. However, when The Wall Street Journal put 15 problems from this year's AIME to the test, OpenAI's o1 model solved them faster than the models from DeepSeek, Moonshot, and Alibaba.
In one word puzzle involving a hypothetical two-player game, OpenAI's program delivered the answer in 10 seconds, while DeepSeek took over two minutes. Speed isn't everything in AI evaluation—accuracy matters more—but in the world of AI benchmarking, it's certainly a significant bragging right.
That said, the mere fact that we're comparing Chinese models to OpenAI's latest offerings represents a seismic shift from just two years ago, when the performance gap was much wider.
Working Around the Great Chip Wall
Despite U.S. restrictions on advanced AI chips—which have intensified under the Trump administration's new tariffs announced in March 2025—Chinese developers are finding creative workarounds. If the U.S. won't sell them a Ferrari, they'll build a rocket-powered bicycle.
Several innovative approaches have emerged:
Reinforcement Learning from Human Feedback (RLHF) optimization has been a focus for companies like Moonshot AI. By refining how models learn from human preferences, they can squeeze better results out of less raw computing power (the sketch after this list includes a toy version of the preference objective at RLHF's core).
Mixture of Experts (MoE) architectures are being cleverly deployed. In this approach, an initial routing mechanism directs problems to specialized expert models—much like a head chef assigning the spaghetti order to the Italian cook and the sushi to the Japanese chef. This division of labor improves efficiency dramatically (see the routing sketch after this list). Tencent's MoE model, released in November 2024, claims performance comparable to Meta Platforms' Llama 3.1 model, despite being trained with approximately one-tenth of the computing power.
Hardware Optimization efforts are bearing fruit. DeepSeek's May 2024 paper on an MoE model, incorporating efficient data processing techniques and hardware optimizations, garnered significant attention in the AI community.
Domestic Chip Development has accelerated. While companies like Biren Technology haven't yet matched Nvidia's cutting-edge offerings, their BR100 series chips represent significant progress toward indigenous alternatives.
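None of these companies publish their production training code, so what follows is only a rough NumPy sketch of two of the ideas above: top-k gating, where a lightweight router activates just a couple of "expert" networks per token, and the pairwise preference loss at the heart of RLHF reward modeling. The dimensions, expert weights, and reward scores are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Mixture of Experts: top-k gating --------------------------------------
# A router scores every expert for a given token, but only the top-k experts
# actually run, so per-token compute grows with k, not with the expert count.
NUM_EXPERTS, TOP_K, DIM = 8, 2, 16
router_w = rng.normal(size=(DIM, NUM_EXPERTS))        # router weights
expert_w = rng.normal(size=(NUM_EXPERTS, DIM, DIM))   # one weight matrix per expert

def moe_forward(x: np.ndarray) -> np.ndarray:
    logits = x @ router_w                              # score each expert
    top = np.argsort(logits)[-TOP_K:]                  # keep only the best k
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()
    # Weighted sum of the chosen experts' outputs; the other experts never run.
    return sum(g * (x @ expert_w[e]) for g, e in zip(gates, top))

token = rng.normal(size=DIM)
print("MoE output shape:", moe_forward(token).shape)   # (16,)

# --- RLHF reward modeling: pairwise preference loss -------------------------
# A reward model is trained so the response humans preferred scores higher
# than the rejected one: loss = -log(sigmoid(r_chosen - r_rejected)).
def preference_loss(r_chosen: float, r_rejected: float) -> float:
    return float(-np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected)))))

print(preference_loss(2.0, 0.5))  # small loss: preferred answer already ranked higher
print(preference_loss(0.5, 2.0))  # large loss: the ranking is backwards
```

The design point worth noticing in the MoE half is that adding more experts barely changes the cost of a single forward pass, since only TOP_K of them ever execute—which is precisely why the approach appeals to teams rationing their GPUs.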
Jack Clark, co-founder of AI startup Anthropic, observed that China's approach to circumventing export controls involves building "highly efficient software and hardware training stacks with accessible hardware."
David vs. Goliath: The Funding and Valuation Gap
For all their technical achievements, Chinese AI startups face a significant disadvantage in the form of valuation and funding disparities.
OpenAI's valuation has soared past $80 billion, while Anthropic is valued at approximately $35 billion. In contrast, Chinese AI startups command only a fraction of those valuations. DeepSeek reportedly raised funds at a valuation of approximately $1 billion in early 2024—impressive, but still dwarfed by their American competitors.
This disparity creates challenges in attracting talent, building infrastructure, and sustaining long-term research initiatives. Western financiers remain skeptical about Chinese AI companies' ability to monetize their technological advancements, particularly given geopolitical tensions and market access limitations.
Yet, necessity is often the mother of invention. Limited access to capital has forced Chinese startups to pursue efficiency and pragmatic applications rather than engaging in blank-check research. Their focus on efficient algorithms, specialized models for specific industries, and innovative training techniques is yielding significant returns on investment.
Building an Edge: China's Path Forward
Despite Trump's renewed tariffs and continued restrictions, I see several avenues through which Chinese AI companies might build a competitive edge:
1. Energy Efficiency Innovations
With Chinese companies forced to optimize for efficiency rather than raw power, we're seeing fascinating innovations in energy-efficient AI training and inference. DeepSeek's claim that its cluster matched a comparable Nvidia system's performance while consuming less energy points to a future where Chinese models might actually lead in environmental sustainability—a growing concern as AI's energy appetite continues to balloon.
In a world increasingly concerned about AI's carbon footprint, this could become a significant selling point.
2. Specialized Domain Expertise
Rather than competing head-on with general-purpose models like GPT-4, Chinese companies are increasingly focusing on domain-specific models that excel in particular industries or applications. Moonshot's math specialization is one example of this approach.
This specialization allows for more targeted data collection and training regimes, potentially leading to superior performance in specific domains even with fewer computational resources.
3. Architectural Innovations
The MoE approach being pursued by companies like Tencent and DeepSeek represents a fundamentally different architectural philosophy compared to the monolithic models favored by many Western companies. By distributing tasks among specialized "expert" sub-models, these systems can potentially achieve better performance and efficiency.
If successful, this approach could represent a legitimate architectural advantage rather than merely a stopgap measure until more powerful chips become available.
4. Domestic Market Scale
China's massive digital economy provides AI companies with unparalleled access to data and use cases. With over 1 billion internet users and advanced digital infrastructure, Chinese companies can rapidly deploy, test, and refine AI applications at scale.
This deployment advantage shouldn't be underestimated—practical application often reveals limitations and opportunities that purely academic research might miss.
The path forward isn't without obstacles. The U.S. continues to tighten export controls, and as Nvidia rolls out its latest AI chips, the technology gap could widen. Companies like Elon Musk's xAI are constructing data centers with tens of thousands of Nvidia chips, raising the stakes in the computational arms race.
As models move beyond simple pattern recognition to more sophisticated reasoning capabilities, the architectural and algorithmic requirements are shifting. It remains to be seen whether Chinese companies' efficiency-focused innovations will translate to these newer paradigms.
Yet, the ingenuity and determination of Chinese developers are undeniable. By focusing on efficient algorithms, specialized models, and innovative training techniques, they're making significant strides in the AI arena. As the saying goes, necessity is the mother of invention—and in this case, it's birthing some remarkably clever AI solutions.
While the AI race between China and the U.S. continues, Chinese startups are proving that with resourcefulness and a dash of audacity, they can keep pace with, and occasionally outpace, their Western counterparts despite significant headwinds.
The story of Chinese AI development is evolving from one of catching up to one of divergent innovation. Rather than simply replicating Western approaches with fewer resources, Chinese companies are increasingly charting their own course—developing novel architectures, optimization techniques, and application strategies that respond to their unique constraints and opportunities.
And who knows? Maybe one day, we'll all be using AI models stamped with "Made in China," pondering how they managed to leap over trade restrictions with the grace of a cat avoiding a bath.
Perhaps the next generation of AI won't be about who has the most chips, but who uses them most wisely.
In the meantime, I'll be watching this technological leapfrog game with fascination. Pass the popcorn—but make it efficient, energy-saving popcorn, please. The Chinese AI developers would approve.
About the author: Rupesh Bhambwani is a technology enthusiast specializing in broad technology-industry dynamics and international technology policy. When not obsessing over nanometer-scale transistors, energy requirements of AI models, real-world impacts of the AI revolution, and staring at the stars, he can be found trying to explain to his relatives why their smartphones are actually miracles of modern engineering, usually with limited success.