Showing posts with label web search agents.

3.7.25

Baidu’s “AI Search Paradigm” Unveils a Four-Agent Framework for Next-Generation Information Retrieval

 

A Blueprint for Smarter Search

Traditional RAG pipelines handle simple fact look-ups well but struggle when queries require multi-step reasoning, tool use, or synthesis. In response, Baidu Research has introduced the AI Search Paradigm, a unified framework in which four specialized LLM-powered agents collaborate to emulate human research workflows. 

Agent       Role                                                    Key Skills
Master      Classifies query difficulty & launches a workflow       Meta-reasoning, task routing
Planner     Breaks the problem into ordered sub-tasks               Decomposition, tool selection
Executor    Calls external APIs or web search to gather evidence    Retrieval, browsing, code execution
Writer      Consolidates evidence into fluent, cited answers        Synthesis, style control

The architecture adapts on the fly: trivial queries may bypass planning, while open-ended questions trigger full agent collaboration.
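The four-agent hand-off can be sketched in a few lines of Python. The paper does not publish code, so the function names, the toy difficulty classifier, and the stub evidence strings below are all illustrative assumptions, not Baidu's actual implementation:

```python
# Hypothetical sketch of the Master -> Planner -> Executor -> Writer flow.
# All heuristics and return values are stand-ins for real LLM calls.

def master_classify(query: str) -> str:
    """Route trivial queries straight to the Writer; complex ones get a plan."""
    return "simple" if len(query.split()) < 6 and "?" not in query else "complex"

def planner(query: str) -> list[str]:
    """Decompose the query into ordered sub-tasks (toy heuristic)."""
    return [f"search: {query}", f"verify: {query}"]

def executor(subtask: str) -> str:
    """Stand-in for a tool call that returns evidence."""
    return f"evidence for '{subtask}'"

def writer(query: str, evidence: list[str]) -> str:
    """Consolidate evidence into a cited answer."""
    return f"Answer to '{query}' [sources: {'; '.join(evidence)}]"

def ai_search(query: str) -> str:
    # Trivial queries bypass planning entirely, as the paradigm describes.
    if master_classify(query) == "simple":
        return writer(query, ["direct lookup"])
    evidence = [executor(t) for t in planner(query)]
    return writer(query, evidence)
```

Note how the adaptive routing lives entirely in the Master: the Planner and Executor are only invoked when the query is classified as complex.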

Technical Innovations

  • Dynamic Workflow Graphs – Agents spawn or skip steps in real time based on intermediate results, avoiding rigid “one-size-fits-all” chains.

  • Robust Tool Layer – Executor can invoke search APIs, calculators, code sandboxes, and custom enterprise databases, all via a common interface.

  • Alignment & Safety – Reinforcement learning with human feedback (RLHF) plus retrieval-grounding reduce hallucinations and improve citation accuracy.
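The "common interface" idea behind the tool layer is essentially a registry mapping tool names to callables with one shared signature. The registry, decorator, and tool names below are assumptions for illustration, not Baidu's API:

```python
from typing import Callable

# Illustrative tool layer: every tool is registered under a common
# string-in/string-out signature, so the Executor invokes search APIs,
# calculators, or databases uniformly.
TOOLS: dict[str, Callable[[str], str]] = {}

def register_tool(name: str):
    def wrap(fn: Callable[[str], str]):
        TOOLS[name] = fn
        return fn
    return wrap

@register_tool("calculator")
def calculator(expr: str) -> str:
    # Restrict eval to arithmetic characters (demo-grade sandboxing only).
    if not set(expr) <= set("0123456789+-*/(). "):
        raise ValueError("unsupported expression")
    return str(eval(expr))

@register_tool("web_search")
def web_search(query: str) -> str:
    return f"[stub] top results for: {query}"

def execute(tool: str, payload: str) -> str:
    """Executor entry point: one call shape for every tool."""
    if tool not in TOOLS:
        raise KeyError(f"unknown tool: {tool}")
    return TOOLS[tool](payload)
```

Adding a custom enterprise database then amounts to registering one more function; the Executor's call site never changes.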


Benchmark Results

On a suite of open-web reasoning tasks, the system (dubbed Baidu ASP in the paper) surpasses state-of-the-art open-source baselines and even challenges proprietary models that rely on massive context windows alone.

Benchmark                           Prior Best (RAG)    Baidu ASP
Complex QA (avg. F1)                     46.2              57.8
Multi-hop HotpotQA (Exact Match)         41.5              53.0
ORION Deep-Search                        37.1              49.6

Practical Implications

  • Enterprise Knowledge Portals – Route user tickets through Planner→Executor→Writer to surface compliant, fully referenced answers.

  • Academic Research Assistants – Decompose literature reviews into sub-queries, fetch PDFs, and synthesize summaries.

  • E-commerce Assistants – From “Find a laptop under $800 that runs Blender” to a shoppable list with citations in a single interaction.

Because each agent is modular, organisations can fine-tune or swap individual components—e.g., plugging in a domain-specific retrieval tool—without retraining the entire stack.
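That modularity is easiest to see as an interface contract: any retriever matching a small protocol can be dropped in without touching the rest of the pipeline. The class and method names here are hypothetical, chosen only to illustrate the swap:

```python
from typing import Protocol

class Retriever(Protocol):
    """Any retrieval backend matching this interface can be plugged in."""
    def retrieve(self, query: str) -> list[str]: ...

class WebRetriever:
    """Default open-web retrieval (stubbed)."""
    def retrieve(self, query: str) -> list[str]:
        return [f"web hit for {query}"]

class LegalDocRetriever:
    """Hypothetical domain-specific replacement, e.g. an internal case-law index."""
    def retrieve(self, query: str) -> list[str]:
        return [f"case-law match for {query}"]

def answer(query: str, retriever: Retriever) -> str:
    # The downstream pipeline only sees the interface, never the backend.
    evidence = retriever.retrieve(query)
    return f"{query} -> {len(evidence)} sources"
```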


Looking Ahead

The team plans to open-source a reference implementation and release an evaluation harness so other researchers can benchmark new agent variants under identical conditions. Future work focuses on:

  • Reducing latency by parallelising Executor calls

  • Expanding the Writer’s multimodal output (tables, charts, code diffs)

  • Hardening the Master agent’s self-diagnosis to detect and recover from tool failures
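The latency item on that list is straightforward to picture: independent, I/O-bound Executor calls can run concurrently rather than one after another. A minimal sketch with the standard library (where `call_tool` is a stand-in for a real tool invocation):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def call_tool(subtask: str) -> str:
    """Stand-in for an I/O-bound Executor call (e.g. a search API request)."""
    time.sleep(0.05)  # simulate network latency
    return f"result:{subtask}"

def run_parallel(subtasks: list[str]) -> list[str]:
    # Threads suit I/O-bound calls; map preserves input order in the output.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(call_tool, subtasks))
```

With eight workers, eight sub-tasks finish in roughly the time of one, instead of eight times the single-call latency.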


Takeaway
Baidu’s AI Search Paradigm reframes search as a cooperative, multi-agent process, merging planning, tool use, and natural-language synthesis into one adaptable pipeline. For enterprises and researchers seeking deeper, trustable answers—not just blue links—this approach signals how tomorrow’s search engines and internal knowledge bots will be built.

9.6.25

Google Open-Sources a Full-Stack Agent Framework Powered by Gemini 2.5 & LangGraph

Google has unveiled an open-source, full-stack agent framework that combines Gemini 2.5 and LangGraph to create conversational agents capable of multi-step reasoning, iterative web search, self-reflection, and synthesis, all wrapped in a React-based frontend and a Python backend.


🔧 Architecture & Workflow

The system integrates these components:

  • React frontend: User interface built with Vite, Tailwind CSS, and Shadcn UI.

  • LangGraph backend: Orchestrates the agent workflow, using FastAPI for API handling and Redis/PostgreSQL for state management.

  • Gemini 2.5 models: Power each stage—dynamic query generation, reflection-based reasoning, and final answer synthesis.
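The state that Redis/PostgreSQL persists and that LangGraph threads between nodes can be pictured as one shared record. The quickstart defines its own schema, so the field names below are an assumed shape for illustration only:

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed shape of the shared agent state passed between graph nodes;
# the actual quickstart repo defines its own schema.
@dataclass
class AgentState:
    question: str
    search_queries: list[str] = field(default_factory=list)  # generated queries
    documents: list[str] = field(default_factory=list)       # fetched evidence
    reflections: list[str] = field(default_factory=list)     # identified gaps
    answer: Optional[str] = None                             # final synthesis
    loop_count: int = 0                                      # iteration guard
```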


🧠 Agent Reasoning Pipeline

  1. Query Generation
    The agent kicks off by generating targeted web search queries via Gemini 2.5.

  2. Web Research
    Uses Google Search API to fetch relevant documents.

  3. Reflective Reasoning
    The agent analyzes results for "knowledge gaps" and determines whether to continue searching—essential for deep, accurate answers.

  4. Iterative Looping
    It refines queries and repeats the search-reflect cycle until satisfactory results are obtained.

  5. Final Synthesis
    Gemini consolidates the collected information into a coherent, citation-supported answer.
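The five stages above reduce to a loop that alternates search and reflection until no gaps remain. In this plain-Python sketch the helper functions are stubs standing in for Gemini 2.5 calls and the Google Search API; their logic is illustrative, not the quickstart's actual code:

```python
def generate_queries(question: str, gaps: list[str]) -> list[str]:
    """Stage 1: turn the question (or open gaps) into search queries."""
    return gaps or [question]

def search(query: str) -> list[str]:
    """Stage 2: stub for the Google Search API."""
    return [f"doc about {query}"]

def reflect(question: str, docs: list[str]) -> list[str]:
    """Stage 3: return remaining knowledge gaps; empty means stop."""
    return [] if len(docs) >= 2 else [f"more detail on {question}"]

def research(question: str, max_loops: int = 3) -> str:
    """Stages 4-5: iterate search/reflect, then synthesize."""
    docs: list[str] = []
    gaps: list[str] = []
    for _ in range(max_loops):
        for q in generate_queries(question, gaps):
            docs.extend(search(q))
        gaps = reflect(question, docs)
        if not gaps:
            break
    return f"Synthesized answer from {len(docs)} documents"
```

The `max_loops` cap mirrors a common safeguard in such loops: reflection decides when to stop, but the bound guarantees termination even if gaps never close.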


🚀 Developer-Friendly

  • Hot-reload support: Enables real-time updates during development for both frontend and backend.

  • Full-stack quickstart repo: Available on GitHub with a Docker Compose setup for local deployment using Gemini and LangGraph.

  • Robust infrastructure: Built with LangGraph, FastAPI, Redis, and PostgreSQL for scalable research applications.


🎯 Why It Matters

This framework provides a transparent, research-grade AI pipeline: query ➞ search ➞ reflect ➞ iterate ➞ synthesize. It serves as a foundation for building deeper, more reliable AI assistants capable of explainable and verifiable reasoning—ideal for academic, enterprise, or developer research tools.


⚙️ Getting Started

To get hands-on:

  • Clone the Gemini Fullstack LangGraph Quickstart from GitHub.

  • Add .env with your GEMINI_API_KEY.

  • Run make dev to start the full-stack environment, or use docker-compose for a production-style setup.

This tooling lowers the barrier to building research-first agents, making multi-agent workflows more practical for developers.


✅ Final Takeaway

Google’s open-source agent stack is a milestone: it enables anyone to deploy intelligent agents capable of deep research workflows with citation transparency. By combining Gemini's model strength, LangGraph orchestration, and a polished React UI, this stack empowers users to build powerful, self-improving research agents faster.
