Rectifying Shortcut Behaviors in Preference-Based Reward Learning (arxiv.org)
1 point by PaulHoule 6 hours ago | 0 comments
121. 1 point by PaulHoule 6 hours ago | 0 comments
121. 1 point by DaemonCoreApp 6 hours ago | 0 comments
122. 231 points by Daiz 6 hours ago | 75 comments
123. 5 points by nradov 6 hours ago | 1 comment
124. 1 point by venuur 6 hours ago | 2 comments
125. 3 points by delichon 6 hours ago | 0 comments
126. 5 points by coloneltcb 6 hours ago | 0 comments
127. 97 points by chaosprint 6 hours ago | 27 comments
128. 4 points by wordhydrogen 6 hours ago | 1 comment
129. 2 points by iansltx 6 hours ago | 1 comment
130. 4 points by Marshferm 6 hours ago | 0 comments
131. 102 points by softwaredoug 6 hours ago | 13 comments
132. 2 points by ltratt 6 hours ago | 0 comments
133. 2 points by donsupreme 6 hours ago | 0 comments
134. 2 points by alamortsubite 6 hours ago | 0 comments
135. 11 points by bpierre 6 hours ago | 0 comments
136. 2 points by caspel26 6 hours ago | 0 comments
137. 2 points by wadamczyk 6 hours ago | 0 comments
138. 1 point by geox 6 hours ago | 0 comments
139. 1 point by moondistance 6 hours ago | 0 comments
140. 3 points by logannyeMD 6 hours ago | 0 comments
141. 1 point by senorqa 6 hours ago | 0 comments
142. 2 points by janpio 6 hours ago | 0 comments
143. 2 points by mooreds 6 hours ago | 0 comments
144. 5 points by oldgradstudent 6 hours ago | 2 comments
145. 2 points by high_byte 6 hours ago | 1 comment
146. 8 points by donsupreme 6 hours ago | 1 comment
147. 2 points by simonmic 6 hours ago | 0 comments
148. 5 points by paulpauper 6 hours ago | 0 comments
149. 6 points by rbanffy 6 hours ago | 0 comments