Hacker News

How to Read an Unlabeled Sales Chart (2013) (evanmiller.org)

74 points | Tomte posted 5 days ago | 7 Comments
axlee said 5 days ago:

I remember "How Not To Sort By Average Rating" by the same author, which I am confident helped shape thousands of recommendation and quality-sorting algorithms (including those at Reddit and Yelp) in the years after it was written.

https://www.evanmiller.org/how-not-to-sort-by-average-rating...

gowld said 4 days ago:

I doubt it. Evan's algorithm optimizes for user satisfaction. Merchants optimize for engagement and conversion.

Darkphibre said 5 days ago:

Oh, this is a great read! I fondly recall a meeting with extremely bright individuals, and several of us were able to deduce the scale on the fly using the Riemann Zeta Function trick. The presenter was not pleased (the deduced number was not exactly complimentary to their messaging).

gowld said 4 days ago:

How does that work, since in the common case the number of pixels would be rounded to nearest integer?

The article explains that this only works if you assume that there are far fewer sales than pixels, so that each bar's pixel height is a whole multiple of its sales count.
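For anyone following along, here is a rough sketch of how I read the divisibility idea, not the article's own code. If every bar height is a whole number of pixels per unit sold, the greatest common divisor of the heights recovers that factor whenever the underlying counts share no common factor. The chance of that for k "random" integers is the standard 1/zeta(k) result. The function names and the example chart (3 pixels per sale) are made up for illustration.

```python
import math
from functools import reduce

def recover_counts(pixel_heights):
    """Divide out the common factor of bar heights given in whole pixels.

    Assumes a non-empty list with at least one non-zero height.
    """
    pixels_per_unit = reduce(math.gcd, pixel_heights)
    return pixels_per_unit, [h // pixels_per_unit for h in pixel_heights]

def coprime_probability(k, terms=100_000):
    """Approximate 1/zeta(k), the chance that k random integers share no common factor."""
    zeta_k = sum(1.0 / n**k for n in range(1, terms + 1))
    return 1.0 / zeta_k

# Hypothetical chart: 3 pixels per sale, daily sales of 12, 7, 9, 15, 4.
heights = [36, 21, 27, 45, 12]
scale, counts = recover_counts(heights)
print(scale, counts)
print(round(coprime_probability(len(heights)), 3))
```

With only two bars the coprime chance is about 61%, so the GCD can easily overstate the scale. With five or more bars it is above 96%. Rounding to the nearest pixel breaks the exact divisibility, which is the concern raised in the question above.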

Darkphibre said 4 days ago:

If I remember correctly, the metrics were in millions (and clearly rounded to the nearest million).

And it wasn't sales data; I believe it had to do with chip design or market share (during the design phase of the current generation of consoles).

fyp said 5 days ago:

> In the absence of calamity, fortuitous events, or brilliant new marketing strategies, sale counts are well-described by a Poisson process. That is, you can think of there being an underlying average number of sales per day, and each day will be a realization of a Poisson distribution with that average.

Can someone give a bit more justification for this? It seems like the average rate shouldn't be constant and is heavily dependent on time/date.

If not, is there another justification for why sales mean should equal sales variance?

gowld said 4 days ago:

> shouldn't be constant and is heavily dependent on time/date.

A priori it is assumed that sales are independent of time. That's part of what a Poisson process means -- a constant rate of rolls of a fixed weighted die. The assumption could be wrong.
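A quick way to see what goes wrong when the rate is not constant is to simulate it. This is a minimal sketch: the rates (20 per day versus alternating 10 and 30) are invented, and the sampler is Knuth's textbook method rather than anything from the article. With a constant rate the variance-to-mean ratio sits near 1. With a rate that depends on the day, the ratio climbs well above 1.

```python
import math
import random
from statistics import mean, pvariance

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def dispersion(samples):
    """Variance-to-mean ratio; close to 1 for a constant-rate Poisson series."""
    return pvariance(samples) / mean(samples)

rng = random.Random(0)
days = 20_000

# Constant average of 20 sales per day.
constant = [poisson_sample(rng, 20) for _ in range(days)]

# Same overall average, but the rate alternates between 30 and 10 by day.
alternating = [poisson_sample(rng, 10 if d % 2 else 30) for d in range(days)]

print(round(dispersion(constant), 2))
print(round(dispersion(alternating), 2))
```

For the alternating series the theoretical ratio is (20 + 100) / 20 = 6, since the spread of the rate itself adds to the Poisson variance. So a strong day-of-week effect would make a variance-over-mean estimate misleading.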

> justification for why sales mean should equal sales variance

https://proofwiki.org/wiki/Variance_of_Poisson_Distribution
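
For those who don't want to click through, the standard derivation is short. The last line is my own addition, under the assumption that the chart plots Y = cX for some unknown scale c (a symbol introduced here, not taken from the article): the ratio of variance to mean then gives c directly.

```latex
\begin{align*}
P(X = k) &= \frac{e^{-\lambda}\lambda^{k}}{k!}, \qquad k = 0, 1, 2, \dots \\
E[X] &= \sum_{k \ge 1} k\,\frac{e^{-\lambda}\lambda^{k}}{k!}
      = \lambda \sum_{k \ge 1} \frac{e^{-\lambda}\lambda^{k-1}}{(k-1)!} = \lambda \\
E[X(X-1)] &= \sum_{k \ge 2} k(k-1)\,\frac{e^{-\lambda}\lambda^{k}}{k!}
      = \lambda^{2} \sum_{k \ge 2} \frac{e^{-\lambda}\lambda^{k-2}}{(k-2)!} = \lambda^{2} \\
\operatorname{Var}(X) &= E[X(X-1)] + E[X] - E[X]^{2}
      = \lambda^{2} + \lambda - \lambda^{2} = \lambda \\
Y = cX \;&\Rightarrow\; \frac{\operatorname{Var}(Y)}{E[Y]}
      = \frac{c^{2}\lambda}{c\,\lambda} = c
\end{align*}
```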