Rectifying Shortcut Behaviors in Preference-Based Reward Learning (arxiv.org)
1 point by PaulHoule 8 hours ago