Chain of Rubrics

Dimensions of Thought: A Smarter Way to Evaluate AI

📖 Summary

This post introduces a multidimensional reward modeling pipeline built on top of the CO_AI framework. It covers:

  • ✅ Structured Evaluation Setup: How to define custom evaluation dimensions using YAML or database-backed rubrics (a sample rubric is sketched after this list).

  • 🧠 Automated Scoring with LLMs: Using the ScoreEvaluator to produce structured, rationale-backed scores for each dimension (see the scoring sketch below).

  • 🧮 Embedding-Based Hypothesis Indexing: Embedding hypotheses efficiently and comparing them by similarity for contrastive learning.

  • 🔄 Contrast Pair Generation: Creating training pairs in which one hypothesis outperforms another on a given dimension (see the pairing sketch after this list).
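
To make the rubric item concrete, here is a minimal sketch of what a dimension definition might look like. The dimension names, fields, and score scale are illustrative assumptions, not CO_AI's actual schema.

```python
import yaml  # requires PyYAML

# Hypothetical rubric: names, descriptions, and scales are illustrative only.
RUBRIC_YAML = """
dimensions:
  - name: correctness
    description: Is the hypothesis factually and logically sound?
    scale: [1, 5]
  - name: novelty
    description: Does the hypothesis offer a non-obvious insight?
    scale: [1, 5]
"""

rubric = yaml.safe_load(RUBRIC_YAML)
for dim in rubric["dimensions"]:
    print(f'{dim["name"]}: {dim["description"]} (scale {dim["scale"]})')
```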
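
For the automated-scoring item, the sketch below assumes a generic callable LLM client that returns a JSON reply; it is a stand-in for the idea behind ScoreEvaluator, not its real interface.

```python
import json

def score_hypothesis(llm, hypothesis: str, dimension: dict) -> dict:
    """Ask an LLM for a structured score on one rubric dimension.

    `llm` is assumed to be any callable that takes a prompt string and
    returns the model's text reply; swap in your own client here.
    """
    prompt = (
        f"Score the hypothesis below on '{dimension['name']}' "
        f"({dimension['description']}), from {dimension['scale'][0]} "
        f"to {dimension['scale'][1]}. Reply with JSON only: "
        '{"score": <int>, "rationale": "<one sentence>"}\n\n'
        f"Hypothesis: {hypothesis}"
    )
    # Parse the structured score and rationale returned by the model.
    return json.loads(llm(prompt))
```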
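
For the indexing and pairing items, this sketch embeds hypotheses, compares them by cosine similarity, and keeps pairs whose scores on a dimension diverge. The data shapes, thresholds, and pairing rule are assumptions about the pipeline, not its actual implementation.

```python
import numpy as np

def cosine(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contrast_pairs(scored, embed, dimension, min_gap=1, min_sim=0.7):
    """Build (chosen, rejected) pairs for one rubric dimension.

    `scored` is assumed to be a list of (text, {dimension: score}) tuples and
    `embed` a function mapping text to a vector; these shapes are guesses
    about the pipeline's data, not CO_AI's actual types.
    """
    pairs = []
    for i, (h1, s1) in enumerate(scored):
        for h2, s2 in scored[i + 1:]:
            gap = s1[dimension] - s2[dimension]
            if abs(gap) < min_gap:
                continue  # scores too close to express a clear preference
            if cosine(embed(h1), embed(h2)) < min_sim:
                continue  # keep topically similar pairs so the contrast reflects quality
            chosen, rejected = (h1, h2) if gap > 0 else (h2, h1)
            pairs.append({"dimension": dimension, "chosen": chosen, "rejected": rejected})
    return pairs
```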