

Look, I survived statistics class. I will strive to defend some of my post.
but it doesn’t explain what alternative hypothesis you’re leaning toward—high engagement versus low engagement isn’t inherently “good” or “bad” without further context.
Namely, much of the aim was to show that a metric like comment count doesn’t tell you whether a post was good or bad, hence the bizarre engagement bait at the end. That is also why every “good post” was in quotes.
you might add a step that actually calculates the p-value for an observed comment count. This would give you a clearer measure of how “unusual” your observation is under your model.
I’m under the impression that whilst you can do a hypothesis test by calculating the probability of a result at least as extreme as the test statistic, you can also do it by showing that the result falls in the critical region. That is useful if you want to judge whether a result is meaningful from the number itself, rather than having to calculate probabilities. For a post of this nature, it makes no sense to find a p-value for a specific post, since I want comment counts that anyone can compare against for any post. A p-value for one observed comment count would be meaningless to basically everyone on this platform.
Using critical regions based on the Poisson distribution can be useful to flag unusual observations. However, you need to be careful that the interpretation of those regions aligns with the hypothesis test framework. For instance, simply saying that fewer than 4 comments falls in the “critical region” implies that you reject the null when observing such counts
Truthfully, I wasn’t doing a hypothesis test, and I don’t say I am in the post, although your original reply confused me into thinking I was. I was finding critical regions and interpreting them. I’m also under the impression that you can do two-tailed tests, although I did make a mistake by not splitting the significance level in half between the two tails. :( I should have been clearer that I was calculating critical regions rather than doing a hypothesis test.
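For anyone who wants to redo the numbers with the split applied, here is a rough Python sketch of two-tailed critical regions for a Poisson model, with half the significance level in each tail. The function names, the 5% significance level, and the mean of 10 comments are all made up for illustration; they are not the figures from the post.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k); returns 0 for negative k."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

def critical_regions(lam, alpha=0.05):
    """Return (lower, upper): counts <= lower or >= upper are 'unusual'.

    Each tail gets alpha / 2. lower is None when even a count of 0
    is too likely to fall in the lower tail.
    """
    tail = alpha / 2

    # Lower tail: largest k with P(X <= k) <= alpha / 2.
    lower = None
    k = 0
    while poisson_cdf(k, lam) <= tail:
        lower = k
        k += 1

    # Upper tail: smallest k with P(X >= k) <= alpha / 2.
    upper = 0
    while 1 - poisson_cdf(upper - 1, lam) > tail:
        upper += 1

    return lower, upper

# Hypothetical mean of 10 comments per post (an assumption, not real data).
lower, upper = critical_regions(10)
print(f"Unusually few: {lower} or fewer; unusually many: {upper} or more")
```

With the assumed mean of 10, this flags 3 or fewer comments and 18 or more comments. Because the Poisson distribution is discrete, each tail usually ends up holding somewhat less than half the significance level, so the regions are a bit conservative.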
It doesn’t seem like you are saying I’m wrong, rather that my model sucks (which is true) and that my workings are weird (it’s a Lemmy post, not a science paper). That said, I didn’t quite expect this post to do so well, so I’ve edited the middle section to be clearer about what I was trying to do.
Presumably where you posted it, given that local feeds show posts based not on whether someone is on the instance, but on which instance the post was made on. The model I used is literally the most basic thing in the world, so I just cobbled something together that was somewhat meaningful. I only took college stats, so complex models are out of my range.