
Yes. The math is much more subtle than people realize. Bayesian (meaning conditional) probability is required for any solution to be passable. The problem to solve is: given the number of upvotes, downvotes, and views, what is the probability that the next random reader would want to upvote the comment? Then sort on that probability. There are well-established formulas for this kind of thing.

The one you link treats the upvotes and downvotes as the sample, constructs the probability that you will upvote the comment, and ranks on that. This allows late posts to overtake early posts if they have a better ratio, even with significantly fewer votes, but only if the ratio is "enough" better to make up for the lower confidence.
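The linked formula is not reproduced in this thread, so as an assumption, here is a minimal sketch of one common approach that behaves the way described: ranking by the lower bound of the Wilson score interval on the upvote ratio. The function name and the default `z = 1.96` (roughly 95% confidence) are illustrative choices, not something taken from the link.

```python
import math

def wilson_lower_bound(upvotes: int, downvotes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote ratio.

    Assumption: this stands in for the linked formula, which is not shown here.
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0  # no votes yet, so no evidence either way
    p = upvotes / n                      # observed upvote ratio
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

# A late comment at 10/0 outranks an early one at 60/40,
# but a comment at 2/0 does not: its ratio is better, yet not "enough" better.
late_strong = wilson_lower_bound(10, 0)
early_mixed = wilson_lower_bound(60, 40)
late_thin = wilson_lower_bound(2, 0)
```

Sorting comments by this value in descending order gives the ranking: a small sample needs a clearly better ratio before it can pass a large one.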

This fails to address a few things here:

* It completely ignores people that read the post but didn't vote. There really is no way to get a perfect count of this no matter how much scroll logging, but you could approximate it and then include it in the calculation.

* On HN many people can't downvote.

* As others have pointed out, HN doesn't necessarily start with a base assumption of all commenters being equal, nor should it.
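One rough way to fold these three points into a single score, sketched under stated assumptions: treat each (approximate) view as a trial and each upvote as a success, and start from a Beta-style prior that can differ per commenter. Downvotes are left out entirely, which sidesteps the fact that many users can't cast them. The function name, the prior values, and the idea of a per-commenter prior are all hypothetical, not HN's actual method.

```python
def upvote_probability(upvotes: int, views: int,
                       prior_up: float = 1.0, prior_views: float = 10.0) -> float:
    """Posterior mean probability that the next reader upvotes.

    Assumptions: `views` is only an estimate (e.g. from scroll logging),
    and the prior acts like `prior_up` upvotes seen in `prior_views` views.
    """
    views = max(views, upvotes)  # an undercounted view estimate can't be below upvotes
    return (prior_up + upvotes) / (prior_views + views)

# Same upvote count, very different scores once non-voting readers are counted.
few_readers = upvote_probability(5, 20)    # 6 / 30
many_readers = upvote_probability(5, 500)  # 6 / 510

# A hypothetical stronger prior for a commenter with a good track record.
trusted_new_comment = upvote_probability(0, 0, prior_up=3.0)
```

The prior matters most while a comment is new; as views accumulate, the observed upvote rate dominates.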



Yes, but you are almost guaranteed fewer votes and lower confidence if you tie voting and comment viewing into the same function, because readers have to scroll past other comments to reach the low-confidence ones that need more votes to establish reasonable intervals.

For example, you would be better off also showing each user, somewhere on the first page, a randomly chosen comment that needs more votes to establish good confidence bounds.

This enables you to get better results quicker over the entire group, and gives you better results than taking into account how many people read but didn't vote.
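A minimal sketch of that idea, with hypothetical names and thresholds: rank by whatever score is in use, then place one randomly chosen under-voted comment at a random position on the first page.

```python
import random

def first_page(comments, page_size=10, min_votes=5, rng=None):
    """Return the comment ids for page one.

    `comments` is a list of (comment_id, score, vote_count) tuples.
    `min_votes` is an assumed cutoff for "needs more voting".
    """
    rng = rng or random.Random()
    ranked = sorted(comments, key=lambda c: c[1], reverse=True)
    page = ranked[:page_size]
    # Candidates: comments that missed page one and still have few votes.
    uncertain = [c for c in ranked[page_size:] if c[2] < min_votes]
    if uncertain:
        pick = rng.choice(uncertain)
        page.insert(rng.randrange(page_size), pick)  # random slot on the page
        page = page[:page_size]                      # lowest-ranked entry drops off
    return [c[0] for c in page]

# 20 well-voted comments plus one new comment with a single vote.
comments = [(i, 100 - i, 50) for i in range(20)] + [(99, 0.0, 1)]
page = first_page(comments, rng=random.Random(0))
```

Because the choice is made per page load, each reader sees a different under-voted comment, so votes get spread across the uncertain ones instead of piling onto whatever is already at the top.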


I have always disliked that assumption; it assumes too close a correlation between "content someone wants to read" and "content someone will upvote". I would add to your list that it biases content towards posts that have popular opinions that the majority thinks are minority opinions.

It fits well with how people act and it fits well with what people complain about in the comments. Hard to measure though. Although if something as rough as scroll logging can still make an improvement I'm sure you could find something rough along these lines that would.

(It's so fun figuring out how to make our filter bubbles more harmful isn't it?)


"* It completely ignores people that read the post but didn't vote. There really is no way to get a perfect count of this no matter how much scroll logging, but you could approximate it and then include it in the calculation."

Would it help if there were an incentive to vote? You could, for example, increase a user's score by a point for each vote given but weight the vote with more points. This way users who write good comments and active readers could both benefit.

But then again, this could tempt some users to abuse the voting system. I can't predict which of these two reactions would prevail.



