Showing posts with label Bug Fixing. Show all posts

29.6.25

Code Graph Model (CGM): A Graph-Integrated LLM that Tackles Repository-Level Software Tasks without Agents

 

From Functions to Full Repositories

Recent LLMs excel at function-level generation, yet falter when a task spans an entire codebase. To close that gap, researchers from Tsinghua University, Shanghai Jiao Tong University and Shanghai AI Lab introduce Code Graph Model (CGM), a graph-integrated large language model that reasons over whole repositories without relying on tool-calling agents.

How CGM Works

Component | Purpose
Graph Encoder–Adapter | Extracts control-flow, call-graph and dependency edges from every file, converting them into node embeddings.
Graph-Aware Attention | Blends token context with structural edges so the model “sees” long-range relationships across files.
Staged Training | 1) text-only warm-up on permissive code; 2) graph-enhanced fine-tuning on 20 K curated repos; 3) instruction tuning for tasks like bug repair and doc generation.
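To make the graph-extraction step concrete, here is a minimal sketch of pulling "contains", "imports" and "calls" edges out of a handful of Python files with the standard-library ast module. It is an illustration only, not the CGM encoder: the function name build_code_graph, the edge labels and the node-id format "path::function" are all assumptions made for this example, and calls are resolved by bare function name, which a real system would need to do far more carefully.

```python
import ast


def build_code_graph(files):
    """Build a tiny repository graph (illustrative sketch, not CGM's encoder).

    files: dict mapping file path -> Python source text.
    Returns a set of (source_node, edge_type, target_node) triples.
    """
    edges = set()
    defined = {}  # bare function name -> "path::name" node id (naive resolution)
    trees = {path: ast.parse(src) for path, src in files.items()}

    # Pass 1: record where each function is defined.
    for path, tree in trees.items():
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                node_id = f"{path}::{node.name}"
                defined[node.name] = node_id
                edges.add((path, "contains", node_id))

    # Pass 2: add import edges per file and call edges per function.
    for path, tree in trees.items():
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    edges.add((path, "imports", alias.name))
            elif isinstance(node, ast.ImportFrom) and node.module:
                edges.add((path, "imports", node.module))
            elif isinstance(node, ast.FunctionDef):
                caller = f"{path}::{node.name}"
                for inner in ast.walk(node):
                    # Only direct calls like helper(); method calls are skipped.
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        callee = defined.get(inner.func.id)
                        if callee:
                            edges.add((caller, "calls", callee))
    return edges
```

Given two files where b.py imports and calls a function defined in a.py, the returned set contains a cross-file "calls" edge from "b.py::main" to "a.py::helper". Edges of this kind are the long-range structure that the table above says is turned into node embeddings.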

The result is a 72-billion-parameter Mixture-of-Experts checkpoint (CodeFuse-CGM-72B) plus a lighter 13 B variant, both released under Apache 2.0 on Hugging Face. 

Benchmark Highlights

Task (RepoBench) | GPT-4o (agent) | DeepSeek-R1 | CGM-72B
Bug Fix (pass@1) | 62.3 % | 55.8 % | 64.7 %
Refactor-Large | 58.1 % | 48.9 % | 61.4 %
Doc Generation | 71.5 % | 66.2 % | 72.1 %

On these tasks CGM matches or beats proprietary agent stacks while running single-shot, with no tool chaining and no external memory.
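The post does not say how the pass@1 figure was computed. As a point of reference, the sketch below implements the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per task and c of them pass. The function names and the sample data are assumptions for illustration and are not the benchmark's actual results.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one task: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over tasks; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With a single-shot model (n = 1, k = 1), this reduces to the fraction of tasks whose one generated answer passes, which is the setting the paragraph above describes.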

Why It Matters

  • Agent-Free Reliability – Removes the non-determinism and overhead of multi-call agent frameworks.

  • Whole-Project Context – Graph attention lets the model track cross-file types, imports and call chains.

  • Self-Hosted Friendly – Open weights mean enterprises can audit and finetune without data-privacy worries.
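One way to picture how structural edges could steer attention across files is a mask derived from the code graph: a token may attend to another token only if both belong to the same graph node or their nodes are linked by an edge. The sketch below is a plain-Python illustration under that assumption. The post does not spell out CGM's attention mechanism, so the functions graph_attention_mask and masked_softmax are hypothetical names, not the model's implementation.

```python
import math


def graph_attention_mask(token_nodes, edges):
    """Build a boolean attention mask from graph structure (illustrative).

    token_nodes[i] is the graph node that token i belongs to.
    edges is a set of undirected (node_a, node_b) pairs.
    mask[i][j] is True when token i may attend to token j.
    """
    linked = set(edges) | {(b, a) for a, b in edges}
    n = len(token_nodes)
    return [
        [
            token_nodes[i] == token_nodes[j]
            or (token_nodes[i], token_nodes[j]) in linked
            for j in range(n)
        ]
        for i in range(n)
    ]


def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, zeroing masked-out positions."""
    allowed = [s for s, ok in zip(scores, mask_row) if ok]
    peak = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]
```

In this toy setup, tokens from a function in one file can attend to tokens of a function it calls in another file, while tokens from unrelated nodes receive zero weight no matter how high their raw score is.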

Limitations & Roadmap

The authors note performance drops on repos exceeding 50 K lines; future work targets hierarchical graphs and sparse attention to scale further. They also plan IDE plug-ins that stream live graph embeddings to CGM for interactive code assistance. 


Takeaway
Code Graph Model shows that marrying graph structure with LLMs can unlock repository-scale intelligence, providing a transparent, open alternative to closed-source agent pipelines for everyday software engineering.

Paper: https://huggingface.co/papers/2505.16901

28.5.25

Google Unveils Jules: An Asynchronous AI Coding Agent to Streamline Developer Workflows

Google has introduced Jules, an experimental AI coding agent aimed at automating routine development tasks and enhancing productivity. Built upon Google's Gemini 2.0 language model, Jules operates asynchronously within GitHub workflows, allowing developers to delegate tasks like bug fixes and code modifications while focusing on more critical aspects of their projects.



Key Features

  • Asynchronous Operation: Jules functions in the background, enabling developers to continue their work uninterrupted while the agent processes assigned tasks.

  • Multi-Step Planning: The agent can formulate comprehensive plans to address coding issues, modify multiple files, and prepare pull requests, streamlining the code maintenance process. 

  • GitHub Integration: Seamless integration with GitHub allows Jules to operate within existing development workflows, enhancing collaboration and efficiency. 

  • Developer Oversight: Before executing any changes, Jules presents proposed plans for developer review and approval, ensuring control and maintaining code integrity. 

  • Real-Time Updates: Developers receive real-time progress updates, allowing them to monitor tasks and adjust priorities as needed. 
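The features above describe a plan, approve, execute loop that runs in the background. The sketch below models that control flow with asyncio. It is purely hypothetical: Task, draft_plan, run_task and the approval callback are invented for illustration and do not correspond to any Jules or GitHub API.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str
    plan: list = field(default_factory=list)
    log: list = field(default_factory=list)  # progress updates
    status: str = "pending"


async def draft_plan(task):
    # Stand-in for a model call that proposes steps (hypothetical).
    await asyncio.sleep(0)
    task.plan = [
        f"locate code for: {task.description}",
        "edit files",
        "open pull request",
    ]
    task.status = "awaiting_approval"


async def run_task(task, approve):
    """Draft a plan, ask the reviewer callback, and execute only if approved."""
    await draft_plan(task)
    if not approve(task.plan):
        task.status = "rejected"
        return task
    for step in task.plan:
        await asyncio.sleep(0)  # yield control so other tasks keep running
        task.log.append(f"done: {step}")
    task.status = "completed"
    return task


async def main():
    tasks = [Task("fix null check in parser"), Task("rewrite auth module")]
    # Toy reviewer: rejects any plan that touches authentication code.
    approve = lambda plan: "auth" not in plan[0]
    return await asyncio.gather(*(run_task(t, approve) for t in tasks))
```

Running asyncio.run(main()) processes both tasks concurrently. Only the approved task records progress entries, which mirrors the oversight step in which nothing is changed until the developer signs off on the plan.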

Availability

Currently, Jules is in a closed preview phase, accessible to a select group of developers. Google plans to expand availability in early 2025. Interested developers can sign up for updates and request access through the Google Labs platform.

If large language models have one redeeming feature for safety researchers, it’s that many of them think out loud. Ask GPT-4o or Claude 3....