Entrusting your money to an AI? A real test with 6 LLMs and $10K each

Alpha Arena pits GPT-5, Claude, Gemini and 3 other AIs against each other on the crypto market, with $10,000 each. The results? The Chinese models Qwen and DeepSeek dominate, while the Western stars lose money. This fascinating experiment raises a question: can we really trust artificial intelligence with our investments?

The experiment: 6 autonomous trading AIs

The Alpha Arena concept

Alpha Arena is the first public platform where artificial intelligences trade the crypto market fully autonomously, without human intervention.

The 6 competitors:

GPT-5 (OpenAI)
Gemini 2.5 Pro (Google)
Claude Sonnet 4.5 (Anthropic)
DeepSeek V3.1 (DeepSeek)
Grok 4 (xAI)
Qwen 3 Max (Alibaba)

The rules:

$10,000 in initial capital per model
Objective: to generate profit without human help
All trades visible in real time on nof1.ai
Real crypto market with real money

This is the first time we can watch, live, how different AIs make financial decisions under the same market conditions. The experiment is public, so anyone can follow each model's performance in real time.

Surprising results

Chinese models are crushing the competition

Against all odds, the only two AIs that actually make money are Chinese.

The leaders (data as of 10/11):

Qwen 3 Max (Alibaba): in the lead
DeepSeek V3.1: peaked at +19.96% in 2 days

DeepSeek is relatively well known, but Qwen gets far less coverage in the West than GPT, Claude or Gemini. This dominance raises a question: genuine superiority, or a bias in the setup?

Western stars in the red

GPT-5, Claude, Gemini and Grok post disappointing results, some of them with significant losses.

The brutal gap:

Best performance: +23%
Worst performance: -63%
Majority of models in the red
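To make the gap concrete, the percentages above can be turned into dollar amounts. The sketch below assumes each figure is a simple return on the $10,000 starting capital; the function names are illustrative and are not part of Alpha Arena.

```python
def final_balance(initial: float, pct_return: float) -> float:
    """Balance after a simple percentage return on the initial capital."""
    return initial * (100 + pct_return) / 100


def recovery_gain_pct(pct_loss: float) -> float:
    """Percentage gain needed to get back to the starting balance after a loss."""
    if not 0 <= pct_loss < 100:
        raise ValueError("pct_loss must be in [0, 100)")
    return 100 * pct_loss / (100 - pct_loss)


if __name__ == "__main__":
    start = 10_000
    best = final_balance(start, 23)    # best performance quoted above
    worst = final_balance(start, -63)  # worst performance quoted above
    print(f"Best:  ${best:,.0f}")
    print(f"Worst: ${worst:,.0f}")
    print(f"Gain needed to recover from -63%: {recovery_gain_pct(63):.0f}%")
```

Under these assumptions the best model ends near $12,300 and the worst near $3,700, and a 63% loss needs a gain of roughly 170% just to get back to the starting point.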

The AIs that are best at conversation are clearly not the best at making financial decisions.

Extreme volatility

The DeepSeek case illustrates the problem: +19.96% in 2 days, then a sharp drop. This volatility shows that even the best-performing AIs can surge and then collapse just as quickly.
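One common way to quantify this kind of swing is maximum drawdown: the largest peak-to-trough fall in account value. The sketch below is illustrative only. The equity curve is hypothetical: only the $11,996 peak corresponds to the +19.96% quoted above, and the other values are invented to show the calculation.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak (0.25 = 25%).

    Assumes every account value is strictly positive.
    """
    if not equity:
        raise ValueError("equity curve is empty")
    if min(equity) <= 0:
        raise ValueError("account values must be positive")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # fall from that high
    return worst


if __name__ == "__main__":
    # Hypothetical account values; only 11_996 (+19.96%) comes from the article.
    curve = [10_000, 10_800, 11_996, 10_500, 8_997, 9_400]
    print(f"Max drawdown: {max_drawdown(curve):.1%}")
```

A return figure alone hides this: an account can be up on paper at its peak and still have gone through a fall that most investors would not tolerate.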

The questions the experiment raises

Is the site really reliable?

Alpha Arena is run by Jay A Zhang, founder of Nof1.ai, a Chinese company specialized in AI applied to trading. This raises a legitimate question: is the site rigged to favor Chinese models?

Troubling signs:

The only two winning AIs are Chinese
Qwen is much less well known than the other 5
The creator is Chinese
No independent audit available

Without full technical transparency, it is impossible to say either way. But this doubt should temper how we read the results.

A period too short to conclude

Crucial point: the first season ended on November 3. A new season has been announced.

Why it's a problem:

A few days are not enough to evaluate a trading strategy
Results need to be observed over several market cycles
Reproducibility has not been demonstrated
Safeguards and human supervision are essential in such volatile markets

The “seasons” format suggests that the project is still very much a trial run.

Systemic risk

Let's imagine that thousands of investors blindly follow winning AI trades.

What happens then?

Massive self-fulfilling prophecy
Potential manipulation by malicious actors
Amplified herd behavior creating bubbles and crashes

It is precisely this systemic risk that financial regulators are worried about.

Why AIs struggle in trading

Theoretical advantages

What AI should bring:

No emotions (fear, greed)
Fast analysis of massive amounts of data
24/7 availability without fatigue
No confirmation bias

The disappointing reality

Despite these advantages, the majority of the AIs are in the red.

Probable reasons:

Ultra-volatile and unpredictable crypto markets
Lack of real macro context (news, geopolitics)
Over-optimization on backtests that fails in live trading
Lack of robust risk management
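As one illustration of what basic risk management can look like, the sketch below sizes a position so that hitting a stop-loss costs at most a fixed fraction of the account. This is a generic textbook rule, not something the Alpha Arena models are known to use; the 1% risk level and the prices are assumptions chosen for the example.

```python
def position_size(equity: float, risk_fraction: float,
                  entry_price: float, stop_price: float) -> float:
    """Units to buy so that a stop-out loses at most equity * risk_fraction."""
    if not 0 < risk_fraction <= 1:
        raise ValueError("risk_fraction must be in (0, 1]")
    risk_per_unit = entry_price - stop_price  # loss per unit if the stop is hit
    if risk_per_unit <= 0:
        raise ValueError("stop_price must be below entry_price for a long position")
    return equity * risk_fraction / risk_per_unit


if __name__ == "__main__":
    # Assumed values: $10,000 account, 1% risk per trade, entry at 100, stop at 95.
    units = position_size(10_000, 0.01, entry_price=100.0, stop_price=95.0)
    print(f"Buy {units:.2f} units; a stop-out costs about ${units * 5.0:.0f}")
```

With a rule like this, a single bad trade cannot wipe out a large share of the account, whatever the model believes about the market.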

The biases of LLMs in finance

An aspect that is often overlooked: LLMs are trained on texts, not on market data.

Their limits:

They follow the “popular wisdom” found in their training corpus
They replicate strategies described in text rather than innovating
They lack a causal understanding of market mechanisms
They hallucinate patterns that do not exist

These limitations explain why even the most advanced LLMs struggle to beat the market.

The worrying rise of Chinese AI

Ultra-fast technical progress

Beyond the raw results, it is the speed at which Chinese AIs are evolving that impresses. In this test, Qwen and DeepSeek outperform Western models in a field as complex as trading.

What's scary:

Catching up with (or even overtaking) the American leaders
Opacity about training methods and data
Geopolitical implications of Chinese AI dominance in finance

Western reluctance persists

Despite these results, many remain reluctant to use Chinese AIs.

The obstacles:

Concerns about data sovereignty
Uncertainties about governance and state control
Perceived lack of transparency
Cultural differences around privacy

This distrust limits large-scale adoption of Chinese models in the West.

Humans remain indispensable

What AI doesn't have (yet)

Irreplaceable human assets:

Contextual intuition beyond raw data
Experience of past crises
Ethics and responsibility in decisions
Creative adaptation to novel situations

The hybrid model as the future

The future is probably not “AI vs human” but “AI + human”.

The ideal division of labor:

AI for massive quantitative analysis
Human for strategic validation
AI for fast execution
Human for risk management and ethics
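The split above can be sketched as a simple approval gate: the model proposes a trade, and nothing executes until a human reviewer accepts it. All names here (TradeProposal, review_and_execute, the two callbacks) are hypothetical and stand in for whatever model, interface and broker a real system would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TradeProposal:
    symbol: str
    side: str        # "buy" or "sell"
    quantity: float
    rationale: str   # the model's explanation, shown to the reviewer


def review_and_execute(
    proposal: TradeProposal,
    approve: Callable[[TradeProposal], bool],
    execute: Callable[[TradeProposal], None],
) -> bool:
    """Execute the proposal only if the human reviewer approves it."""
    if approve(proposal):
        execute(proposal)
        return True
    return False


if __name__ == "__main__":
    executed = []  # stands in for a broker: approved trades are recorded here
    proposal = TradeProposal("BTC", "buy", 0.05, "momentum signal")

    # Stand-in reviewer; a real one would ask a person instead of applying a rule.
    def cautious_reviewer(p: TradeProposal) -> bool:
        return p.quantity <= 0.1

    done = review_and_execute(proposal, cautious_reviewer, executed.append)
    print(done, len(executed))
```

The design choice is that the execution path is only reachable through the reviewer, so the AI can be fast at analysis without ever being able to commit money on its own.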

Would you be ready to trust an AI with your money?

The essential criteria

Questions to ask yourself:

Transparency: do you understand the AI's strategy?
Track record: has performance been proven over a significant period of time?
Risk management: what safeguards limit losses?
Auditability: can you trace and review its decisions after the fact?
Regulation: is the AI service regulated by the authorities?

My opinion (non-expert)

Short term (today): No. Alpha Arena's results confirm that it's too random.

Medium term (2-3 years): Perhaps in a hybrid model where the AI proposes and the human validates.

Long term (5+ years): Probably yes, but with robust safeguards and clear regulations.

The question is not “if” but “when” and “how.”

Conclusion: lessons from a fascinating experiment

Alpha Arena offers a unique insight into the capabilities (and limitations) of AIs in real trading.

Key takeaways:

Chinese models surprise (but the site could be biased)
The volatility is extreme (+20% then crash in a few days)
The majority loses (the promise of “infallible AI trading” is far away)
Questions go beyond performance (reliability, ethics, systemic risk)

This experiment confirms one thing: AI is a powerful tool, not a magic one. In finance as elsewhere, it requires supervision, safeguards and healthy skepticism.

Human traders can sleep easy. For now.

And you, would you be ready to entrust your investments to an AI? What criteria should be met?
