
6.7.25

LangGraph Rollout: how VeRL leveled-up multi-turn Agent RL

 

Why this matters

If you’ve ever tried to train an LLM-powered agent with many tool calls spread across a genuine back-and-forth conversation, you’ve probably discovered that “multi-turn” means different things to different frameworks. Yanbin Jiang’s latest post shows how the VeRL team punched through that ceiling by grafting LangGraph directly onto VeRL’s reinforcement-learning rollout engine. The result is a training loop that speaks the same language as production code. 


1. Where they started

  • Native VeRL multi-turn – great for quick experiments. You enable multi_turn: True, write a YAML schema for each tool, implement an async Python class, and you’re off; the team had their GSM8K benchmark running in two days.

  • Pain points

    1. Double bookkeeping: every tool had to be declared twice (YAML + Python).

    2. Drift: schema and code fell out of sync, and prod tools (written for LangChain/LangGraph) diverged from the “training” clones. 


2. A quick stop-gap: automatic tool wrapping

Yanbin added BaseTool.from_callable(), which introspects any plain Python function with transformers.utils.get_json_schema, then fabricates a VeRL-compatible wrapper on the fly. One list of callables (tool_list = [multiply, add, …]) now powers both training and prod. 
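
To make the mechanism concrete, here is a minimal sketch of that introspection step. `get_json_schema` is the real transformers utility; the `multiply` function and the printed fields are my illustration, and the actual `BaseTool.from_callable()` wrapper lives in the PR, not here.

```python
from transformers.utils import get_json_schema

def multiply(a: int, b: int) -> int:
    """Multiply two integers.

    Args:
        a: The first factor.
        b: The second factor.
    """
    return a * b

# get_json_schema reads the signature and the Google-style docstring and
# emits an OpenAI-style tool schema -- no hand-written YAML required.
schema = get_json_schema(multiply)
print(schema["function"]["name"])        # "multiply"
print(schema["function"]["parameters"])  # {"type": "object", "properties": ...}
```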

My dev take: this is the same pattern I use in LangChain when I decorate business logic with @tool. Nice to see VeRL admit “if you can’t beat reflection, join it.”


3. The real blocker: orchestration power

Research quickly outgrew VeRL’s built-in rollout:

  • Dynamic branches & backtracking – the native graph was too rigid.

  • True multi-turn dialogue (user follow-ups) – any assistant message without tool calls ended the conversation.

  • Per-node sampling / chat-template tweaks – global settings only.

Enter LangGraph: a lightweight DAG engine already shipping in production.

4. Architectural insight: separation of concerns

“Let VeRL manage actor weights & hardware; let LangGraph drive the conversation.” 

So they built a LangChain-compatible chat-model client for VeRL’s SGLang server. Training now works like this:

  1. VeRL hands the initial messages + model handle to the user’s LangGraph.

  2. The graph does its thing—branching, retrying, invoking tools—using the exact actor weights being optimized.

  3. When the graph stops, VeRL collects the message history and rewards. 
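
The model handle in step 1 is the interesting part. Since SGLang exposes an OpenAI-compatible API, a stock LangChain chat model can already talk to the rollout server. The sketch below assumes that setup; the URL, key, and model name are placeholders, not VeRL’s actual wiring.

```python
from langchain_openai import ChatOpenAI

# SGLang serves an OpenAI-compatible endpoint, so a standard LangChain
# chat model can point at the very actor weights being optimized.
actor_model = ChatOpenAI(
    base_url="http://localhost:30000/v1",  # assumed SGLang rollout server
    api_key="EMPTY",                       # local server; no real key needed
    model="actor",                         # whatever the server registers
    temperature=0.7,
)

# VeRL hands a handle like this (plus the seed messages) to the user's
# LangGraph; the graph drives the conversation, and VeRL later reads
# back the message history to compute rewards.
reply = actor_model.invoke("What is 12 * 7?")
print(reply.content)
```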

The PR shows a short YAML snippet that swaps the old rollout for:

```yaml
multi_turn:
  chat_template_kwargs: {enable_thinking: false}
  langgraph:
    path: /path/to/graph.py
    graph_config: {recursion_limit: 100}
```

…and a 60-line example graph that binds tools, counts turns, and lets you vary temperature node-by-node. 
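
For flavor, here is a compressed sketch of what such a graph can look like: a tool-calling loop with an explicit turn counter, so a plain assistant reply no longer kills the episode. It reuses the OpenAI-compatible client idea from above; the node names, turn cap, and temperature are my own choices, not the PR’s.

```python
from typing import Annotated
from typing_extensions import TypedDict

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

class State(TypedDict):
    messages: Annotated[list, add_messages]
    turns: int

# Instantiating the client per node is one way to get the per-node
# sampling control the native rollout couldn't offer.
hot = ChatOpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY",
                 model="actor", temperature=1.0).bind_tools([multiply])

def agent(state: State):
    reply = hot.invoke(state["messages"])
    return {"messages": [reply], "turns": state["turns"] + 1}

def route(state: State):
    last = state["messages"][-1]
    # Keep looping on tool calls, under a turn budget that guards
    # against runaway episodes.
    if last.tool_calls and state["turns"] < 8:
        return "tools"
    return END

builder = StateGraph(State)
builder.add_node("agent", agent)
builder.add_node("tools", ToolNode([multiply]))
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", route)
builder.add_edge("tools", "agent")
graph = builder.compile()

# graph.invoke({"messages": [("user", "What is 6 times 7?")], "turns": 0})
```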


5. Why I’m excited

  • One graph to rule them all – deployment and training share code; no more “but it worked in prod!”

  • Easier ablations – want to test a new branch strategy? Edit the graph script; RL pipeline stays untouched.

  • Framework-agnostic future – the same bridge pattern could plug VeRL into OpenAI Function Calling, Microsoft’s AutoGen, or whatever framework wins next year.


My takeaway

VeRL just became a lot more attractive for serious agent RL work. By leaning on LangGraph instead of extending an in-house orchestration DSL, the team keeps VeRL laser-focused on fast rollouts, leaves graph logic to a dedicated library, and—crucially—lets devs iterate on one codebase. If you’re juggling duplicate tool definitions or fighting mismatch between training and production, clone Yanbin’s PR and breathe easier.

Explore it more here: https://jybsuper.github.io/posts/langgraph_rollout/ 

9.6.25

Google Open‑Sources a Full‑Stack Agent Framework Powered by Gemini 2.5 & LangGraph

Google has unveiled an open-source full-stack agent framework that combines Gemini 2.5 and LangGraph to create conversational agents capable of multi-step reasoning, iterative web search, self-reflection, and synthesis, all wrapped in a React-based frontend and Python backend.


🔧 Architecture & Workflow

The system integrates these components:

  • React frontend: User interface built with Vite, Tailwind CSS, and Shadcn UI.

  • LangGraph backend: Orchestrates the agent workflow, using FastAPI for API handling and Redis/PostgreSQL for state management.

  • Gemini 2.5 models: Power each stage—dynamic query generation, reflection-based reasoning, and final answer synthesis.


🧠 Agent Reasoning Pipeline

  1. Query Generation
    The agent kicks off by generating targeted web search queries via Gemini 2.5.

  2. Web Research
    Uses Google Search API to fetch relevant documents.

  3. Reflective Reasoning
    The agent analyzes results for "knowledge gaps" and decides whether to continue searching, which is essential for deep, accurate answers.

  4. Iterative Looping
    It refines queries and repeats the search-reflect cycle until satisfactory results are obtained.

  5. Final Synthesis
    Gemini consolidates the collected information into a coherent, citation-supported answer.
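
As a control-flow shape, this loop maps naturally onto a LangGraph conditional edge. The sketch below is not the quickstart’s actual code: the stub functions stand in for the Gemini 2.5 and Google Search API calls, and exist only to show the query ➞ search ➞ reflect ➞ iterate ➞ synthesize structure.

```python
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END

class ResearchState(TypedDict):
    question: str
    findings: list[str]
    gaps: list[str]
    loops: int

def search(state: ResearchState):
    # Real repo: Gemini 2.5 generates queries; Google Search fetches docs.
    doc = f"stub result for: {state['question']}"
    return {"findings": state["findings"] + [doc], "loops": state["loops"] + 1}

def reflect(state: ResearchState):
    # Real repo: Gemini inspects the findings for knowledge gaps.
    gaps = [] if state["loops"] >= 2 else ["coverage still thin"]
    return {"gaps": gaps}

def route(state: ResearchState):
    # Gaps remaining -> search again; otherwise move on to synthesis.
    return "search" if state["gaps"] else "synthesize"

def synthesize(state: ResearchState):
    # Real repo: Gemini writes the final, citation-supported answer.
    return {"findings": state["findings"] + ["final synthesized answer"]}

builder = StateGraph(ResearchState)
builder.add_node("search", search)
builder.add_node("reflect", reflect)
builder.add_node("synthesize", synthesize)
builder.add_edge(START, "search")
builder.add_edge("search", "reflect")
builder.add_conditional_edges("reflect", route)
builder.add_edge("synthesize", END)
graph = builder.compile()

out = graph.invoke({"question": "What is LangGraph?",
                    "findings": [], "gaps": [], "loops": 0})
print(out["findings"][-1])  # "final synthesized answer"
```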


🚀 Developer-Friendly

  • Hot-reload support: Enables real-time updates during development for both frontend and backend.

  • Full-stack quickstart repo: Available on GitHub with a Docker‑Compose setup for local deployment using Gemini and LangGraph.

  • Robust infrastructure: Built with LangGraph, FastAPI, Redis, and PostgreSQL for scalable research applications.


🎯 Why It Matters

This framework provides a transparent, research-grade AI pipeline: query ➞ search ➞ reflect ➞ iterate ➞ synthesize. It serves as a foundation for building deeper, more reliable AI assistants capable of explainable and verifiable reasoning, making it well suited to academic, enterprise, or developer research tools.


⚙️ Getting Started

To get hands-on:

  • Clone the Gemini Fullstack LangGraph Quickstart from GitHub.

  • Add .env with your GEMINI_API_KEY.

  • Run make dev to start the full-stack environment, or use docker-compose for a production-style setup.

This tooling lowers the barrier to building research-first agents, making agentic research workflows more practical for developers.


✅ Final Takeaway

Google’s open-source agent stack is a milestone: it enables anyone to deploy intelligent agents capable of deep research workflows with citation transparency. By combining Gemini's model strength, LangGraph orchestration, and a polished React UI, this stack empowers users to build powerful, self-improving research agents faster.
