Self-Improving Agents

Programming Intelligence: Using Symbolic Rules to Steer and Evolve AI

Programming Intelligence: Using Symbolic Rules to Steer and Evolve AI

🧪 Summary

“What if AI systems could learn how to improve themselves not just at the level of weights or prompts, but at the level of strategy itself? In this post, we show how to build such a system, powered by symbolic rules and reflection.

The paper Symbolic Agents: Symbolic Learning Enables Self-Evolving Agents introduces a framework where symbolic rules guide, evaluate, and evolve agent behavior.

Adaptive Reasoning with ARM: Teaching AI the Right Way to Think

Adaptive Reasoning with ARM: Teaching AI the Right Way to Think

Summary

Chain-of-thought is powerful, but which chain? Short explanations work for easy tasks, long reflections help on hard ones, and code sometimes beats them both. What if your model could adaptively pick the best strategy, per task, and improve as it learns?

The Adaptive Reasoning Model (ARM) is a framework for teaching language models how to choose the right reasoning format direct answers, chain-of-thoughts, or code depending on the task. It works by evaluating responses, scoring them based on rarity, conciseness, and difficulty alignment, and then updating model behavior over time.

General Reasoner: The smarter Local Agent

General Reasoner: The smarter Local Agent

🔧 Summary

The General Reasoner paper shows how we can train LLMs to reason across domains using diverse data and a generative verifier. In this post, I walk through our open-source implementation showing how we built a modular reasoning agent capable of generating multiple hypotheses, evaluating them with an LLM-based judge, and selecting the best answer.


🧠 What We Built

We built a GeneralReasonerAgent that:

  • Dynamically generates multiple hypotheses using different reasoning strategies (e.g., cot, debate, verify_then_answer, etc.)
  • Evaluates each pair of hypotheses using either a local LLM judge or our custom MR.Q evaluator
  • Classifies the winning hypothesis using rubric dimensions
  • Logs structured results to a PostgreSQL-backed system

All of this was integrated with our existing co_ai framework, which includes: