
Can we trust an AI with our money? 6 LLMs clash in crypto trading with surprising results

November 10, 2025

Alpha Arena pits GPT-5, Claude, Gemini and three other AIs against each other on the crypto market, each with $10,000. The results? The Chinese models Qwen and DeepSeek dominate, while the Western stars lose money. This fascinating experiment raises a question: can we really entrust our investments to artificial intelligence?

The experiment: 6 autonomous trading AIs

The Alpha Arena concept

Alpha Arena is the first public platform where artificial intelligences trade fully autonomously on the crypto market, without human intervention.

The 6 LLMs in competition

The 6 competitors:

  • GPT-5 (OpenAI)
  • Gemini 2.5 Pro (Google)
  • Claude Sonnet 4.5 (Anthropic)
  • DeepSeek V3.1 (DeepSeek)
  • Grok 4 (xAI)
  • Qwen 3 Max (Alibaba)

The rules:

  • $10,000 in initial capital per model
  • Objective: to generate profit without human help
  • All trades visible in real time on nof1.ai
  • Real crypto market with real money

This is the first time we can watch live how different AIs make financial decisions under the same market conditions. The experiment is publicly accessible, so anyone can follow each model's performance in real time.

Surprising results

Chinese models are crushing the competition

Against all odds, the only two AIs that actually make money are Chinese.

The leaders (data as of November 10):

  • Qwen 3 Max (Alibaba): in the lead
  • DeepSeek V3.1: peak at +19.96% in 2 days

DeepSeek is relatively well known, but Qwen remains much less publicized in the West than GPT, Claude or Gemini. This dominance raises questions: real superiority or system bias?

Western stars in the red

GPT-5, Claude, Gemini and Grok post disappointing results, with some showing significant losses.

The brutal gap:

  • Best performance: +23%
  • Worst performance: -63%
  • Majority of models in the red

The best conversational AIs are clearly not the best at making financial decisions.
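
To put the gap in dollar terms, the quoted extremes can be applied to the $10,000 starting capital. This is only a minimal sketch: the percentages are the figures quoted above, read here as total returns on the initial stake, while the function name is illustrative and trading fees are ignored.

```python
STARTING_CAPITAL = 10_000  # dollars per model, per the Alpha Arena rules


def ending_balance(return_pct: float, capital: float = STARTING_CAPITAL) -> float:
    """Balance after applying a total return expressed in percent."""
    return capital * (1 + return_pct / 100)


best = ending_balance(23)    # best performance quoted above
worst = ending_balance(-63)  # worst performance quoted above

print(f"Best:  ${best:,.0f}")
print(f"Worst: ${worst:,.0f}")
print(f"Gap:   ${best - worst:,.0f}")
```

Under that reading, the best model would end near $12,300 and the worst near $3,700, a gap of about $8,600 from the same starting stake.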

Extreme volatility

The DeepSeek case illustrates the problem: +19.96% in 2 days, then a steep drop. This volatility shows that even the best AIs can soar and then collapse just as quickly.
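
Part of what makes such swings painful is that gains and losses are not symmetric. The sketch below works out two numbers from the figures quoted in this article: the drop that would fully erase a +19.96% gain, and the gain needed to recover from a -63% loss. The function names are illustrative, and the output does not describe what any model actually did after its peak.

```python
def drop_to_erase_gain(gain_pct: float) -> float:
    """Percent drop that returns a balance to its starting value after a gain."""
    gain = gain_pct / 100
    return 100 * gain / (1 + gain)


def gain_to_recover_loss(loss_pct: float) -> float:
    """Percent gain needed to return to the starting value after a loss."""
    loss = loss_pct / 100
    if loss >= 1:
        raise ValueError("a loss of 100% or more cannot be recovered")
    return 100 * loss / (1 - loss)


print(f"Drop erasing +19.96%: {drop_to_erase_gain(19.96):.2f}%")
print(f"Gain recovering -63%: {gain_to_recover_loss(63):.2f}%")
```

A drop of about 16.6% is enough to wipe out the +19.96% peak, while a model sitting at -63% would need roughly +170% just to get back to $10,000.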

The questions the experiment raises

Is the site really reliable?

Alpha Arena is managed by Jay A Zhang, founder of Nof1.ai, a Chinese company specializing in AI applied to trading. This raises a legitimate question: is the site rigged to favor Chinese models?

Troubling signs:

  • The only two winning AIs are Chinese
  • Qwen is much less well known than the other 5
  • The creator is Chinese
  • No independent audit available

Without complete technical transparency, it is impossible to settle the question. But this doubt should temper how we interpret the results.

A period too short to conclude

Crucial point: The first season ended on November 3. A new season has been announced.

Why it's a problem:

  • A few days are not enough to evaluate a trading strategy
  • Strategies need to be observed over several market cycles
  • Reproducibility has not been demonstrated
  • Safeguards and human supervision are essential in such volatile markets

This “seasons” format suggests that the project is still very much experimental.
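
A back-of-the-envelope calculation shows why a short season says so little. The sketch below uses a common rule of thumb: an average daily return only stands out from noise once it is about two standard errors away from zero. It assumes independent daily returns, which is a strong simplification, and the edge and volatility figures are hypothetical inputs, not Alpha Arena measurements.

```python
import math


def days_needed(daily_edge_pct: float, daily_vol_pct: float, z: float = 2.0) -> int:
    """Trading days needed before an average daily return stands out from noise.

    Solves mean / (vol / sqrt(n)) >= z for n, assuming independent,
    identically distributed daily returns.
    """
    if daily_edge_pct <= 0:
        raise ValueError("daily_edge_pct must be positive")
    return math.ceil((z * daily_vol_pct / daily_edge_pct) ** 2)


# Hypothetical strategy: 0.5% average daily edge, 5% daily volatility.
print(days_needed(0.5, 5.0))
```

With these assumed inputs the answer is 400 days. Even a generous edge, in a market this volatile, takes far longer than a few days to tell apart from luck.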

Systemic risk

Let's imagine that thousands of investors blindly follow winning AI trades.

What happens then?

  • Massive self-fulfilling prophecy
  • Potential manipulation by malicious actors
  • Amplified herd behavior creating bubbles and crashes

It is precisely this systemic risk that financial regulators are worried about.

Why AIs struggle in trading

Theoretical advantages

What AI should bring:

  • No emotions (fear, greed)
  • Fast analysis of massive amounts of data
  • 24/7 availability without fatigue
  • No confirmation bias

The disappointing reality

Despite these advantages, the majority of the AIs are in the red.

Probable reasons:

  • Ultra-volatile and unpredictable crypto markets
  • Lack of real macro context (news, geopolitics)
  • Over-optimization in backtests that fails in live trading
  • Lack of robust risk management

The biases of LLMs in finance

An aspect that is often overlooked: LLMs are trained on texts, not on market data.

Their limits:

  • They follow the “popular wisdom” found in their training corpus
  • They replicate strategies they have read about rather than innovating
  • They lack a causal understanding of market mechanisms
  • They hallucinate patterns that do not exist

These limitations explain why even the most advanced LLMs struggle to beat the market.

The worrying rise of Chinese AI

Ultra-fast technical progress

Beyond the raw results, it is the speed at which Chinese AIs are progressing that impresses. Qwen and DeepSeek outperform Western models in a field as complex as trading.

What is worrying:

  • Chinese models catching up with (or even overtaking) American leaders
  • Opacity on training methods and data
  • Geopolitical implications of Chinese AI dominance in finance

Western reluctance persists

Despite these results, many remain reluctant to use Chinese AIs.

The obstacles:

  • Concerns about data sovereignty
  • Uncertainties about governance and state control
  • Perceived lack of transparency
  • Cultural differences on privacy

This distrust limits the widespread adoption of Chinese models in the West.

Humans remain indispensable

What AI doesn't have (yet)

Irreplaceable human assets:

  • Contextual intuition beyond raw data
  • Experience of past crises
  • Ethics and responsibility in decisions
  • Creative adaptation to novel situations

The hybrid model as the future

The future is probably not “AI vs human” but “AI + human”.

The ideal complementarity:

  • AI for massive quantitative analysis
  • Human for strategic validation
  • AI for fast execution
  • Human for risk management and ethics
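
The division of labor above can be sketched as a simple approval gate: the AI proposes, small trades go through automatically, and anything larger waits for a human decision. Everything here is hypothetical (the `TradeProposal` fields, the `route_proposal` function, the $500 threshold), and "executed" is only a label, since no exchange is contacted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TradeProposal:
    symbol: str
    side: str            # "buy" or "sell"
    notional_usd: float  # position size in dollars
    rationale: str       # the model's stated reasoning, kept for audit


def route_proposal(
    proposal: TradeProposal,
    ask_human: Callable[[TradeProposal], bool],
    auto_limit_usd: float = 500.0,
) -> str:
    """Return "executed" or "rejected" for an AI-proposed trade.

    Trades up to auto_limit_usd pass automatically (AI for fast execution);
    larger ones require explicit approval (human for strategic validation).
    """
    if proposal.side not in ("buy", "sell"):
        return "rejected"
    if proposal.notional_usd <= auto_limit_usd:
        return "executed"
    return "executed" if ask_human(proposal) else "rejected"


def cautious_reviewer(proposal: TradeProposal) -> bool:
    """Stand-in for a human reviewer who refuses anything above $2,000."""
    return proposal.notional_usd <= 2_000


small = TradeProposal("BTC", "buy", 300.0, "momentum signal")
large = TradeProposal("ETH", "sell", 5_000.0, "model expects a drop")
print(route_proposal(small, cautious_reviewer))  # executed
print(route_proposal(large, cautious_reviewer))  # rejected
```

Keeping the model's rationale on each proposal is a deliberate choice in this sketch: it gives the human something to validate rather than a bare order.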

Would you be ready to entrust your money to an AI?

The essential criteria

Questions to ask yourself:

  • Transparency: do you understand the AI's strategy?
  • Track record: has it proven its performance over a significant period of time?
  • Risk management: what safeguards limit losses?
  • Auditability: can you trace and review its decisions?
  • Regulation: is the service regulated by the authorities?
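
The risk management question can be made concrete with one widely used measure, maximum drawdown: the largest fall from a peak to a later low. The sketch below pairs it with a simple kill-switch rule. The 20% limit and the account values are hypothetical, chosen only to illustrate the idea, and are not Alpha Arena data.

```python
def max_drawdown(equity_curve: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak (0.25 = 25%)."""
    if not equity_curve:
        raise ValueError("equity_curve must not be empty")
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def should_halt(equity_curve: list[float], limit: float = 0.20) -> bool:
    """True once the drawdown reaches the limit: a simple kill-switch rule."""
    return max_drawdown(equity_curve) >= limit


# Hypothetical account values in dollars.
curve = [10_000, 12_000, 9_000, 9_600]
print(f"{max_drawdown(curve):.0%}")  # 25%
print(should_halt(curve))            # True
```

Note that the account in this example is only 4% below its starting value at the end, yet it has already lost a quarter of its peak. A safeguard based on drawdown reacts to that fall; one based only on the starting balance would not.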

My opinion (non-expert)

Short term (today): No. Alpha Arena's results confirm that it's too random.

Medium term (2-3 years): Perhaps in a hybrid model where the AI proposes and the human validates.

Long term (5+ years): Probably yes, but with robust safeguards and clear regulations.

The question is not “if” but “when” and “how.”

Conclusion: lessons from a fascinating experiment

Alpha Arena offers a unique insight into the capabilities (and limitations) of AIs in real trading.

Key takeaways:

  • Chinese models surprise (but the site could be biased)
  • The volatility is extreme (+20% then crash in a few days)
  • Most models lose money (the promise of “infallible AI trading” is still far away)
  • Questions go beyond performance (reliability, ethics, systemic risk)

This experiment confirms a simple truth: AI is a powerful tool, not a magic one. In finance as elsewhere, it requires supervision, safeguards and healthy skepticism.

Real traders can sleep easy. For now.

And you, would you be ready to entrust your investments to an AI? What criteria should be met?

Article written by
Benjamin BENOLIEL
Co-founder & Head of Sales
