Showing posts with label function calling. Show all posts
Showing posts with label function calling. Show all posts

21.6.25

Mistral Elevates Its 24B Open‑Source Model: Small 3.2 Enhances Instruction Fidelity & Reliability

 Mistral AI has released Mistral Small 3.2, an optimized version of its open-source 24B-parameter multimodal model. This update refines rather than reinvents: it strengthens instruction adherence, improves output consistency, and bolsters function-calling behavior—all while keeping the lightweight, efficient foundations of its predecessor intact.


🎯 Key Refinements in Small 3.2

  • Accuracy Gains: Instruction-following performance rose from 82.75% to 84.78%—a solid boost in model reliability.

  • Repetition Reduction: Instances of infinite or repetitive responses dropped nearly twofold (from 2.11% to 1.29%)—ensuring cleaner outputs for real-world prompts.

  • Enhanced Tool Integration: The function-calling interface has been fine-tuned for frameworks like vLLM, improving tool-use scenarios.


🔬 Benchmark Comparisons

  • Wildbench v2: Nearly 10-point improvement in performance.

  • Arena Hard v2: Scores jumped from 19.56% to 43.10%, showcasing substantial gains on challenging tasks.

  • Coding & Reasoning: Gains on HumanEval Plus (88.99→92.90%) and MBPP Pass@5 (74.63→78.33%), with slight improvements in MMLU Pro and MATH.

  • Vision benchmarks: Small trade-offs: overall vision score dipped from 81.39 to 81.00, with mixed results across tasks.

  • MMLU Slight Dip: A minor regression from 80.62% to 80.50%, reflecting nuanced trade-offs .


💡 Why These Updates Matter

Although no architectural changes were made, these improvements focus on polishing the model’s behavior—making it more predictable, compliant, and production-ready. Notably, Small 3.2 still runs smoothly on a single A100 or H100 80GB GPU, with 55GB VRAM needed for full-floating performance—ideal for cost-sensitive deployments.


🚀 Enterprise-Ready Benefits

  • Stability: Developers targeting real-world applications will appreciate fewer unexpected loops or halts.

  • Precision: Enhanced prompt fidelity means fewer edge-case failures and cleaner behavioral consistency.

  • Compatibility: Improved function-calling makes Small 3.2 a dependable choice for agentic workflows and tool-based LLM work.

  • Accessible: Remains open-source under Apache 2.0, hosted on Hugging Face with support in frameworks like Transformers & vLLM.

  • EU-Friendly: Backed by Mistral’s Parisian roots and compliance with GDPR/EU AI Act—a plus for European enterprises.


🧭 Final Takeaway

Small 3.2 isn’t about flashy new features—it’s about foundational refinement. Mistral is doubling down on its “efficient excellence” strategy: deliver high performance, open-source flexibility, and reliability on mainstream infrastructure. For developers and businesses looking to harness powerful LLMs without GPU farms or proprietary lock-in, Small 3.2 offers a compelling, polished upgrade.

9.6.25

Enable Function Calling in Mistral Agents Using Standard JSON Schema

 This updated tutorial guides developers through enabling function calling in Mistral Agents via the standard JSON Schema format Function calling allows agents to invoke external APIs or tools (like weather or flight data services) dynamically during conversation—extending their reasoning capabilities beyond text generation.


🧩 Why Function Calling?

  • Seamless tool orchestration: Enables agents to perform actions—like checking bank interest rates or flight statuses—in real time.

  • Schema-driven clarity: JSON Schema ensures function inputs and outputs are well-defined and type-safe.

  • Leverage MCP Orchestration: Integrates with Mistral's Model Context Protocol for complex workflows 


🛠️ Step-by-Step Implementation

1. Define Your Function

Create a simple API wrapper, e.g.:

python
def get_european_central_bank_interest_rate(date: str) -> dict: # Mock implementation returning a fixed rate return {"date": date, "interest_rate": "2.5%"}

2. Craft the JSON Schema

Define the function parameters so the agent knows how to call it:

python
tool_def = { "type": "function", "function": { "name": "get_european_central_bank_interest_rate", "description": "Retrieve ECB interest rate", "parameters": { "type": "object", "properties": { "date": {"type": "string"} }, "required": ["date"] } } }

3. Create the Agent

Register the agent with Mistral's SDK:

python
agent = client.beta.agents.create( model="mistral-medium-2505", name="ecb-interest-rate-agent", description="Fetch ECB interest rate", tools=[tool_def], )

The agent now recognizes the function and can decide when to invoke it during a conversation.

4. Start Conversation & Execute

Interact with the agent using a prompt like, "What's today's interest rate?"

  • The agent emits a function.call event with arguments.

  • You execute the function and return a function.result back to the agent.

  • The agent continues based on the result.

This demo uses a mocked example, but any external API can be plugged in—flight info, weather, or tooling endpoints 


✅ Takeaways

  • JSON Schema simplifies defining callable tools.

  • Agents can autonomously decide if, when, and how to call your functions.

  • This pattern enhances Mistral Agents’ real-time capabilities across knowledge retrieval, action automation, and dynamic orchestration.

  Anthropic Enhances Claude Code with Support for Remote MCP Servers Anthropic has announced a significant upgrade to Claude Code , enablin...