Given the recent news that CD Projekt Red is currently recruiting multiplayer-savvy designers for its much-anticipated sequel, I find myself thinking back to the maligned launch of Cyberpunk 2077. We ...
RULER (Relative Universal LLM-Elicited Rewards) eliminates the need for hand-crafted reward functions by using an LLM-as-judge to automatically score agent trajectories. Simply define your task in the ...