In principle, the significance level should reflect the costs and benefits of correct and incorrect decisions (true and false positives, and true and false negatives). In practice, such informed threshold choices are rare. Instead, practitioners usually pick a conventional significance level, such as 0.05, 0.01, 0.001, or 0.0001. Pre-specifying the level may help ensure that the choice is not engineered to deliver a preferred decision. In election forensics, picking a good level is perhaps harder than usual, because elections are typically one-off events held years apart, and much hinges on the outcome and its perceived legitimacy. This is especially true for new or troubled democracies, which are more likely to be inspected for signs of electoral fraud. It is impossible to know in advance what costs and benefits follow from raising flags for elections whose results are accurate, or from failing to raise them for elections with irregularities.
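To make the cost-benefit reasoning concrete, the following minimal sketch shows how the preferred level among the conventional choices would shift if the costs were known. Everything in it is an illustrative assumption rather than part of the analysis: the one-sided z-test power formula, the standardized effect size `effect`, the prior probability of irregularities `prior_fraud`, the two cost figures, and the function names are all hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution
LEVELS = [0.05, 0.01, 0.001, 0.0001]  # conventional significance levels


def power(alpha, effect):
    """Power of a one-sided z-test at level alpha (assumed test, for illustration)."""
    critical = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(critical - effect)


def expected_cost(alpha, prior_fraud, cost_false_alarm, cost_miss, effect):
    """Expected cost of testing at level alpha under hypothetical costs."""
    # Accurate election flagged: happens with probability alpha.
    false_alarm = (1 - prior_fraud) * alpha * cost_false_alarm
    # Irregular election not flagged: happens with probability 1 - power.
    miss = prior_fraud * (1 - power(alpha, effect)) * cost_miss
    return false_alarm + miss


def best_level(levels, **scenario):
    """Level with the lowest expected cost among the candidates."""
    return min(levels, key=lambda a: expected_cost(a, **scenario))


# Hypothetical scenario 1: missing an irregular election is far costlier.
lenient = best_level(LEVELS, prior_fraud=0.1, cost_false_alarm=1,
                     cost_miss=100, effect=3.0)
# Hypothetical scenario 2: a false alarm is far costlier.
strict = best_level(LEVELS, prior_fraud=0.1, cost_false_alarm=100,
                    cost_miss=1, effect=3.0)
print(lenient, strict)
```

Under these made-up inputs, the first scenario favors the most lenient level and the second the strictest. The point is only that the preferred level moves with the assumed costs and prior, which are exactly the quantities that are unknown in election forensics.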