I Tried DeerFlow, the Quietly Trending AI Agent Framework
Hands-on review of ByteDance's DeerFlow 2.0, a GitHub 30k+ star AI agent framework. Setup, real-world testing, and comparison with Claude Code CLI.
kkm
Backend Engineer / AWS / Django
Struggling to Build Your Party?
Building the perfect RPG party is an eternal dilemma, right?
Do you go with the jack-of-all-trades, or stack specialists? The "I can do everything" generalist versus the "I only do one thing, but I'm the best at it" expert. Most of the time, the specialist wins.
The AI agent landscape in 2026 is exactly this.
CrewAI, AutoGen, LangGraph, and now ByteDance's DeerFlow. A new framework pops up every month, and every time I scroll through GitHub Trending I think, "Is this finally the one?"
DeerFlow pitched itself as the "all-in-one." Sandbox, long-term memory, sub-agents, Web UI, Slack integration -- the works. Not a framework, but a "finished product."
I'm someone who spends $100 to $200 per month on Claude Code. I've tuned my skill files and sub-agents so that research and implementation all run through a single CLI. Honestly, I have no reason to switch tools.
But 30k+ stars got my attention. So I gave it a spin.
What is DeerFlow
DeerFlow (Deep Exploration and Efficient Research Flow) is an open-source AI agent platform by ByteDance.
v1 was apparently a research-focused tool, but v2.0, released on February 28, 2026, was a complete ground-up rewrite. Zero shared code. Bold move.
GitHub stars stood at 30.7k at the time of writing. Reaching that within two weeks of release is seriously impressive momentum.
Not a framework. A ready-to-use finished product.
This is where DeerFlow differs from CrewAI and AutoGen. Frameworks say "here are the parts, assemble it yourself." DeerFlow says "it's already assembled, just use it." Web UI, sandbox, memory, Slack integration -- all included.
Here's the tech stack.
| Layer | Technology |
|---|---|
| Frontend | React 19 + Next.js |
| Gateway | Python 3.12+ / FastAPI + Uvicorn |
| Agent Runtime | LangGraph + LangChain |
| Package Management | uv (Python), pnpm (JS) |
| Sandbox | Docker / Kubernetes / Local |
A Look Under the Hood
DeerFlow uses an Nginx reverse proxy as its entry point, routing to four services.
```mermaid
graph TB
    subgraph Nginx["Nginx :2026"]
        direction LR
    end
    subgraph Services["Services"]
        FE["Next.js Frontend :3000"]
        GW["FastAPI Gateway :8001"]
        LG["LangGraph Server :2024"]
        PV["Provisioner :8002"]
    end
    Nginx --> FE
    Nginx --> GW
    Nginx --> LG
    Nginx -.-> PV
```
11-Stage Middleware Pipeline
Agent execution is controlled through an 11-stage pipeline. Honestly, this was the most interesting part.
- 01 ThreadData creation
- 02 Upload tracking
- 03 Sandbox acquisition
- 04 Tool call cleanup
- 05 Message summarization
- 06 Todo list management
- 07 Conversation title generation
- 08 Memory fact injection
- 09 Image preparation
- 10 Sub-agent concurrency control
- 11 Clarification interrupt
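The stage list above maps naturally onto a compose-the-stages pattern. Here's a minimal sketch of that idea in Python -- the function names and the `dict` context are mine, not DeerFlow's actual middleware API:

```python
from typing import Callable

# A stage takes the mutable request context and returns it;
# stages run in a fixed order, mirroring the 11-step list above.
Stage = Callable[[dict], dict]

def make_pipeline(stages: list[Stage]) -> Stage:
    """Compose stages left-to-right into a single callable."""
    def run(ctx: dict) -> dict:
        for stage in stages:
            ctx = stage(ctx)
        return ctx
    return run

# Two toy stages standing in for the real ones (names are illustrative).
def create_thread_data(ctx: dict) -> dict:
    ctx["thread_id"] = ctx.get("thread_id", "thread-001")
    return ctx

def inject_memory_facts(ctx: dict) -> dict:
    ctx["prompt"] = "\n".join(ctx.get("facts", [])) + "\n" + ctx["prompt"]
    return ctx

pipeline = make_pipeline([create_thread_data, inject_memory_facts])
result = pipeline({"prompt": "hello", "facts": ["user prefers Python"]})
```

The appeal of this shape is that each concern (sandbox acquisition, summarization, memory) stays isolated in its own stage and can be reordered or disabled independently.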
The standout is stage 08, Memory Fact Injection. DeerFlow maintains three types of long-term memory and automatically injects the top 15 facts (scored 0 to 1 by confidence) into the prompt. Updates are processed asynchronously via a 30-second debounce queue, so the conversation flow isn't interrupted.
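A rough sketch of what confidence-ranked fact selection plus a debounce gate could look like. Only the top-15 cap, the 0-to-1 confidence scores, and the 30-second delay come from DeerFlow; every name and data shape below is my own guess:

```python
import heapq
import time

def top_facts(facts: list[tuple[str, float]], k: int = 15) -> list[str]:
    """Pick the k highest-confidence facts (confidence scored 0-1)
    for injection into the prompt, highest confidence first."""
    return [text for text, _ in heapq.nlargest(k, facts, key=lambda f: f[1])]

class DebouncedUpdater:
    """Collect memory updates and flush only after `delay` seconds of quiet,
    so fact extraction never blocks the conversation (the 30 s figure is DeerFlow's)."""
    def __init__(self, delay: float = 30.0):
        self.delay = delay
        self.pending: list[str] = []
        self.last_push = 0.0

    def push(self, update: str) -> None:
        self.pending.append(update)
        self.last_push = time.monotonic()

    def ready_to_flush(self) -> bool:
        return bool(self.pending) and time.monotonic() - self.last_push >= self.delay

facts = [("prefers dark mode", 0.9), ("develops on macOS", 0.95), ("time zone JST", 0.4)]
selected = top_facts(facts, k=2)  # the two highest-confidence facts win
```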
Sub-agents can run up to three in parallel under a lead agent. They poll at 2-second intervals with a 5-minute timeout. These parameters are configurable via config.yaml.
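Those three parameters -- 3-way parallelism, 2-second polling, 5-minute timeout -- are easy to picture as a polling loop over worker futures. A hypothetical sketch, not DeerFlow's actual implementation:

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

MAX_PARALLEL = 3     # a lead agent runs at most 3 sub-agents at once
POLL_INTERVAL = 2.0  # seconds between status checks
TIMEOUT = 300.0      # give up after 5 minutes

def wait_for(futures: list[Future], poll: float = POLL_INTERVAL,
             timeout: float = TIMEOUT) -> list:
    """Poll sub-agent futures until all finish or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(f.done() for f in futures):
            return [f.result() for f in futures]
        time.sleep(poll)
    raise TimeoutError("sub-agents did not finish in time")

def sub_agent(task: str) -> str:
    return f"done: {task}"  # stand-in for a real sub-agent run

with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
    futures = [pool.submit(sub_agent, t) for t in ("search", "summarize", "cite")]
    # short poll here so the demo finishes fast; DeerFlow's defaults are above
    results = wait_for(futures, poll=0.05, timeout=5.0)
```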
Sandbox
You can switch between Local, Docker, and Kubernetes modes. Everything runs on a unified virtual filesystem (/mnt/user-data/workspace), so switching modes requires no code changes.
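One way to deliver "no code changes across modes" is to resolve every virtual path through a backend-specific mapper. A toy sketch of that idea -- the interface is hypothetical; only the /mnt/user-data/workspace path is DeerFlow's:

```python
from abc import ABC, abstractmethod

WORKSPACE = "/mnt/user-data/workspace"  # the unified virtual path from the article

class Sandbox(ABC):
    """Agents only ever see WORKSPACE paths; each backend maps them
    to wherever files actually live (hypothetical interface, not DeerFlow's API)."""
    @abstractmethod
    def resolve(self, virtual_path: str) -> str: ...

class LocalSandbox(Sandbox):
    def __init__(self, root: str = "/tmp/deerflow"):
        self.root = root
    def resolve(self, virtual_path: str) -> str:
        # strip the virtual prefix and anchor under the local root
        return self.root + virtual_path.removeprefix(WORKSPACE)

class DockerSandbox(Sandbox):
    def resolve(self, virtual_path: str) -> str:
        # inside the container, the virtual path is the real path
        return virtual_path

path = f"{WORKSPACE}/report.md"
local = LocalSandbox().resolve(path)    # resolved under /tmp/deerflow
docker = DockerSandbox().resolve(path)  # unchanged inside the container
```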
How Does It Differ from Other Frameworks
AI agent frameworks are proliferating, but here's a rough breakdown.
| Aspect | DeerFlow | CrewAI | AutoGen | LangGraph Standalone |
|---|---|---|---|---|
| Sandbox | Docker/K8s built-in | None | None | None |
| Long-term Memory | 3 types built-in | Basic | Basic | None (DIY) |
| Web UI | Full-featured | CLI-focused | Studio | None |
| IM Integration | Telegram/Slack/Feishu | None | None | None |
| Supported LLMs | Anything OpenAI-compatible | Major LLMs | Major LLMs | All LangChain-supported |
| Positioning | Finished product | Framework | Framework | Library |
The biggest difference is the "Positioning" row. CrewAI and AutoGen say "here are the parts, build what you want." DeerFlow says "everything's included, just use it." Which is better depends on your use case, but if you just want to get something running, DeerFlow is by far the easiest.
Setting It Up on macOS
I actually got it running on macOS (Apple Silicon). Here's the process, gotchas included.
Basic Steps
```bash
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Generate config files
python3 ./scripts/configure.py

# Set API keys in .env (see below)
# Set model in config.yaml (see below)

# Start Docker
make docker-init
make docker-start

# Open http://localhost:2026
```

Gotcha #1: python not found
On macOS, the python command doesn't exist by default. Running make config fails with python: No such file or directory.
alias python=python3 doesn't work inside Makefiles. Running the script directly is the reliable fix.
```bash
# NG: make config
# OK:
python3 ./scripts/configure.py
```

Gotcha #2: Gemini model names
The model names in the config.yaml examples don't always work as-is. In my case, gemini-2.5-flash-preview-05-20 threw an error, but gemini-2.5-flash worked.
You can check available model names with the following.
```bash
curl -s "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY" \
  | python3 -c "import sys,json;[print(m['name']) for m in json.load(sys.stdin).get('models',[]) if 'flash' in m['name']]"
```

Configuration Files
You need at minimum two entries in .env. I used Gemini this time, not for any particular reason -- I just happened to have an API key from a recent project. OpenAI, Claude, DeepSeek, or anything with an OpenAI-compatible endpoint works.
```bash
# .env
TAVILY_API_KEY=your-tavily-key   # Web search (free 1,000 calls/month)
GEMINI_API_KEY=your-gemini-key   # LLM
```

Add a model definition to the models: section in config.yaml.
```yaml
# config.yaml (models section)
models:
  - name: gemini-2.5-flash
    display_name: Gemini 2.5 Flash
    use: langchain_google_genai:ChatGoogleGenerativeAI
    model: gemini-2.5-flash
    google_api_key: $GEMINI_API_KEY
    max_tokens: 8192
    supports_vision: true
```

Hands-On Impressions
Research is fast
Research tasks combining web search (Tavily) and LLM are impressively quick. I asked "Compare FastAPI vs Django vs Flask as of 2026" and got back a well-organized response with benchmark data.
No "can't live without it" moments
The GUI has almost the same feel as Claude's desktop app -- well-made, but nothing novel. I didn't find any scenario where DeerFlow was the only viable option during this review.
As someone paying a hefty Claude Code subscription, it honestly covers everything I need. I'm just not struggling with anything.
Streaming is unstable
This was the biggest pain point. Responses cut off mid-stream. Disappear. Fail to render.
The framework comparison response broke off every time I tried. Eventually I asked it to "output to a file" and it properly wrote a Markdown file that I could download. Workarounds exist, but needing them just to use the tool normally is rough.
It generated code but didn't execute it
I asked "Build a CLI tool that fetches the top 10 articles from the HN API" and it generated Python code and displayed it. The code itself was decent. But I was expecting it to run in the sandbox and show me the results, so that was a bit underwhelming.
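For comparison, here's roughly what I was hoping to see it execute -- a minimal HN top-10 CLI of my own. The Firebase endpoints are Hacker News's real public API; the code is my sketch, not DeerFlow's output:

```python
import json
import urllib.request

API = "https://hacker-news.firebaseio.com/v0"  # HN's public Firebase API

def fetch_json(url: str):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def format_story(rank: int, story: dict) -> str:
    """One story per line: ' 1. Title (123 pts)'."""
    return f"{rank:2d}. {story.get('title', '?')} ({story.get('score', 0)} pts)"

def top_stories(n: int = 10) -> list[str]:
    ids = fetch_json(f"{API}/topstories.json")[:n]
    return [format_story(i + 1, fetch_json(f"{API}/item/{sid}.json"))
            for i, sid in enumerate(ids)]

if __name__ == "__main__":
    print("\n".join(top_stories()))
```

Running something like this in the sandbox and showing the output is exactly the step DeerFlow skipped.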
The experience isn't bad. But there's no scenario where it's the only option. That's my honest take.
So, Will I Actually Use It?
I'm on a $100 Claude Code subscription, bumping it to $200 in heavy months. For coding within a repo, CLI beats GUI. So DeerFlow isn't going to beat Claude Code for my coding workflow. I'm simply not struggling with anything.
But not everyone is paying $100 to $200 a month for a subscription.
For people without a paid plan, or those who burn through the $20 plan too quickly, offloading research tasks to DeerFlow for an extra $10 or so seems viable. You get a decent UI at a good price.
Cost Estimate
| Item | Unit Price | Monthly Estimate (10/day) | Monthly Cost |
|---|---|---|---|
| Gemini 2.5 Flash Input | $0.30/1M tokens | 1.5M tokens | $0.45 |
| Gemini 2.5 Flash Output | $2.50/1M tokens | 0.9M tokens | $2.25 |
| Tavily Web Search | Free 1,000 calls/mo | ~900 calls | $0 |
| Total (light usage) | | | $3-7/mo |

| Scenario | Claude Code Only | Claude Code + DeerFlow | Savings |
|---|---|---|---|
| Normal month | $100 | $100 (not needed) | -- |
| Heavy month | $200 | $103-133 | $67-97 |
Whether Gemini Flash's research quality matches Claude Opus is a separate question -- you'll need to accept that trade-off. But getting a research agent for $3 to $10 a month isn't a bad deal.
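For what it's worth, the light-usage numbers in the table reduce to simple arithmetic. The per-run token sizes (5k in / 3k out) are my own rough assumption about what a "run" costs; the per-token prices are Gemini 2.5 Flash's list prices from the table:

```python
# Light-usage estimate from the table: ~10 research runs/day for 30 days.
RUNS = 10 * 30
input_tokens = RUNS * 5_000    # = 1.5M tokens, matching the table
output_tokens = RUNS * 3_000   # = 0.9M tokens

input_cost = input_tokens / 1e6 * 0.30    # Gemini 2.5 Flash input, $/1M tokens
output_cost = output_tokens / 1e6 * 2.50  # Gemini 2.5 Flash output, $/1M tokens
total = input_cost + output_cost          # $2.70/month before any Tavily overage
```

That lands just under the table's $3 floor; the $3-7 range presumably adds headroom for heavier days and the occasional paid search call.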
Who is it for?
Q1. What do you mainly use AI for?
- → Primarily coding → Claude Code CLI, no contest. No GUI can beat a CLI that runs inside your repo
- → Mostly research, with some light development → Go to Q2
Q2. How much do you spend on AI tools per month?
- → $100+ → Honestly, DeerFlow isn't necessary. Claude Code handles everything
- → You burn through the $20 plan quickly / No paid plan → DeerFlow + Gemini Flash. A polished research UI for $3-10/month
- → Some months you push $100-$200 → Offloading research to DeerFlow is worth considering
A Curious In-Between
That said, is it non-engineer friendly? Not really. You need Docker, you have to manually edit config.yaml, and you need to obtain API keys yourself. It bills itself as a "complete all-in-one product" but the setup requires a fair amount of technical literacy.
Ultimately, I think it fits best for people who do some light development alongside heavy research. Serious developers are better served by the CLI, and complete non-engineers will be put off by the Docker setup. For the middle ground, getting a solid AI agent UI for under $10/month is genuinely appealing.
The age of the generalist hasn't arrived yet. For now, mastering specialists is still the winning move.
That said, if the streaming issues get fixed, I'll probably give it another shot. If the all-in-one truly stabilizes, the calculus could change.