p-curve

If we conduct an experiment where there is no real effect, we would expect, by random chance, to obtain a p-value of 0.05 or lower around 5% of the time and a p-value of 0.01 or lower about 1% of the time. That is precisely what these significance levels mean: when there is no effect, p-values are spread evenly, so a plot of the number of experiments against the p-value they yield is, on average, flat. In contrast, where there is a real effect we would hope to see a considerably higher proportion of low p-values. Because of the file drawer effect, not all experimental results are published, so we are unlikely to see the negative results. However, Simonsohn et al. have suggested that the shape of the p-curve, that is, the graph of the number of published results at each p-value, can give a sense of the health of a discipline and the impact of the file drawer effect. If this curve is flat, it suggests that there are few real results and most are due to pure chance. If it is strongly curved towards low p-values, with a ratio of less than 5:1 between results significant at the 5% level and results significant at the 1% level, then it is likely that the discipline is using studies with sufficient statistical power and that most or many of the published results do in fact correspond to real effects.
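The contrast between a flat and a curved distribution of p-values can be illustrated with a small simulation. The sketch below is not taken from the book or from Simonsohn et al.; it assumes a simple two-sided z-test with known standard deviation 1, an illustrative true mean of 0.5 for the "real effect" case, and 20 participants per experiment. All function names are hypothetical.

```python
import math
import random


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming a known standard deviation."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(true_mean, n_experiments=20000, n_per_experiment=20, seed=1):
    """Run many simulated experiments and return one p-value per experiment."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n_per_experiment)])
        for _ in range(n_experiments)
    ]


def fraction_below(p_values, threshold):
    """Proportion of experiments with a p-value at or below the threshold."""
    return sum(p <= threshold for p in p_values) / len(p_values)


# Assumed scenarios: no effect (mean 0) and an illustrative real effect (mean 0.5).
null_p = simulate_p_values(true_mean=0.0)
effect_p = simulate_p_values(true_mean=0.5)

for label, p_values in [("no effect", null_p), ("real effect", effect_p)]:
    at_05 = fraction_below(p_values, 0.05)
    at_01 = fraction_below(p_values, 0.01)
    print(f"{label}: p<=0.05 in {at_05:.3f}, p<=0.01 in {at_01:.3f}, "
          f"ratio {at_05 / at_01:.1f}:1")
```

With no effect, the ratio of results at the 5% level to results at the 1% level should come out close to 5:1; with the assumed real effect it should be markedly lower, because low p-values are over-represented. The simulation includes every experiment, whereas a p-curve drawn from the literature only sees the published ones.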

Used on page 87

Terms from Statistics for HCI: Making Sense of Quantitative Data
