Better leaderboards could be cool #2637
Replies: 1 comment
-
Thank you for your ideas @benshindel! First, I agree that Peer leaderboards currently have a problem. See recent discussion here. Regarding your suggestions:
-
Current behavior
Currently, baseline accuracy scales almost linearly with the number of questions forecasted, to the point that you could easily win the baseline accuracy leaderboard just by forecasting on every single question on the platform and constantly mimicking the community forecast. This doesn't seem optimal. Also, peer accuracy rewards you for participating only in very LARGE tournaments where people don't update their forecasts. For example, if I forecasted on the ACX tournament and then also forecasted on 50 coin-flip questions, I would have a far lower peer accuracy than someone who only forecasted on the ACX tournament, despite possibly being a better forecaster (relative to my peers).
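To make the two effects concrete, here is a rough sketch. It assumes (this is a simplification, not the platform's actual scoring code) that baseline accuracy behaves like a plain sum of per-question scores and that peer accuracy behaves like a plain average of per-question peer scores, ignoring any prior. The function names and all numbers are hypothetical.

```python
def baseline_total(scores):
    """Assumed aggregation: baseline accuracy as a plain sum of per-question scores."""
    return sum(scores)


def peer_average(scores):
    """Assumed aggregation: peer accuracy as a plain mean of per-question peer scores."""
    return sum(scores) / len(scores)


# Hypothetical numbers, chosen only for illustration.
copycat_few = [5.0] * 100    # small positive score on 100 questions
copycat_many = [5.0] * 1000  # same small score, but on 1000 questions

tournament = [20.0] * 50     # strong peer scores in one large tournament
coin_flips = [0.0] * 50      # coin-flip questions, peer score of about 0 each

# A sum grows with the number of questions forecasted.
print(baseline_total(copycat_few), baseline_total(copycat_many))

# A mean gets diluted by the extra near-zero questions.
print(peer_average(tournament), peer_average(tournament + coin_flips))
```

Under these assumptions, forecasting on ten times as many questions gives ten times the baseline total, and adding the 50 coin-flip questions halves the peer average even though no forecast got worse.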
Expected/desired behavior
The peer accuracy prior should be like 50 or 75 questions or something, to actually incentivize doing well across multiple tournaments. Alternatively, the peer accuracy scores could be normalized by volatility (within tournaments or between them), so that participating in tournaments with higher- or lower-than-average peer accuracy scores doesn't guarantee you a spot on the leaderboard or lock you out of it. I'm currently 1st on peer accuracy so I think I'm unbiased in saying it's gameable, lol.
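A minimal sketch of both ideas, under stated assumptions. It assumes the prior works by padding a forecaster's record with `prior` imaginary questions scored 0, and it reads "normalized by volatility" as dividing each score by the standard deviation of all participants' scores in the same tournament. Both readings, the function names, and the numbers are illustrative guesses, not how the platform computes anything.

```python
from statistics import pstdev


def peer_accuracy_with_prior(scores, prior=50):
    """Mean peer score after padding with `prior` imaginary questions scored 0."""
    return sum(scores) / (len(scores) + prior)


def scale_by_volatility(tournament_scores):
    """Divide one tournament's peer scores by that tournament's score spread."""
    sigma = pstdev(tournament_scores)
    if sigma == 0:
        # Nobody differed from anybody else, so there is nothing to rescale.
        return [0.0 for _ in tournament_scores]
    return [s / sigma for s in tournament_scores]


# Hypothetical forecasters.
specialist = [20.0] * 50   # one big tournament, high peer scores
generalist = [15.0] * 200  # many tournaments, slightly lower peer scores

for prior in (10, 75):
    print(prior,
          peer_accuracy_with_prior(specialist, prior),
          peer_accuracy_with_prior(generalist, prior))

# Two tournaments with very different score spreads end up on the same scale.
print(scale_by_volatility([2.0, -2.0]), scale_by_volatility([40.0, -40.0]))
```

With these made-up numbers the specialist ranks higher at a prior of 10, and the generalist ranks higher at a prior of 75, which is the incentive shift described above.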
It might be nice to have a leaderboard that captures both of these in some way. A graph plotting peer accuracy against baseline accuracy for the top x users or whatever would be really cool?!
Y'all love h-indexes (as should everyone). What about an h-index for peer accuracy? Something like (# of questions with a certain peer accuracy score)? I feel like that could capture some unique new information. That might also work with baseline accuracy. All 4 scores could be h-indexes, that would be neat lol.
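One possible reading of the h-index idea, as a sketch: the largest h such that you have at least h questions with a peer score of at least h. That definition is borrowed from the citation h-index and is only an assumption about what is meant here. The function name and the scores are hypothetical.

```python
def score_h_index(scores):
    """Largest h such that at least h questions have a score of at least h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score >= rank:
            h = rank
        else:
            break  # scores only get smaller from here on
    return h


# Hypothetical per-question peer scores.
scores = [30.0, 12.0, 8.0, 4.0, 3.0, -5.0]
print(score_h_index(scores))

# Extra coin-flip questions with a score of 0 leave the index unchanged.
print(score_h_index(scores + [0.0] * 50))
```

Unlike a plain average, this index is not dragged down by extra near-zero questions, and unlike a plain sum, it cannot be raised by piling up lots of tiny positive scores.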