Editor’s Note: The following explains the methodology used by Columbia Journalism Investigations to produce the article and interactive graphic 2020 Election Could Hinge on Whose Votes Don’t Count, companion stories to FRONTLINE’s forthcoming documentary Whose Vote Counts. Together, this collection of investigative reporting is a partnership between Columbia Journalism Investigations, Columbia Journalism School, USA Today Network and FRONTLINE.
This November we can expect historic levels of voting by mail. Experts believe at least half of the electorate will vote absentee, millions for the first time. If the rate of rejected mail ballots from 2016 is about the same this time around, the CJI analysis found that more than one million voters are likely to have their ballots tossed out. This analysis was completed in collaboration with Merlin Heidemanns, a Political Science PhD student at Columbia University.
We used the only source of national election administration data, the Election Administration and Voting Survey (EAVS), from the 2016 general election. We rely on the 2016 data because it is the most complete. Although turnout among minority voters was lower in 2016 than in 2008 and 2012, the data from those earlier years is less complete and less representative of absentee voting.
The purpose of our analysis is to see—assuming that turnout, absentee ballot submission and rejection rates remain the same—how an increase in absentee requests would impact the number of ballots rejected. We focused our analysis only on civilian mail ballots (Section C of EAVS) using the EAVS Data Public Release Version 4.
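The scenario described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used for the analysis: it treats transmitted ballots (C1a) as the measure of absentee requests, holds the submission rate (C1b / C1a) and the rejection rate (C4b / C1b) at their 2016 values, and scales transmitted ballots by a hypothetical factor. The function name, the `scale` parameter and the sample numbers are invented for the example.

```python
def project_rejections(transmitted, submitted, rejected, scale):
    """Project rejected mail ballots if transmitted ballots grow by `scale`.

    Holds the observed submission rate (submitted / transmitted) and
    rejection rate (rejected / submitted) constant.
    """
    if transmitted <= 0 or submitted <= 0:
        raise ValueError("need positive transmitted and submitted counts")
    submission_rate = submitted / transmitted
    rejection_rate = rejected / submitted
    projected_transmitted = transmitted * scale
    projected_submitted = projected_transmitted * submission_rate
    return projected_submitted * rejection_rate


# Made-up jurisdiction: 10,000 ballots sent, 8,000 returned, 80 rejected.
projected = project_rejections(10_000, 8_000, 80, scale=3.0)
print(round(projected))  # 240
```

With both rates fixed, the projection is linear in the scale factor: tripling the ballots sent out triples the expected rejections.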
Elections across the U.S. are run at the local level. Counties most commonly run elections, but in a few states they are administered at a hyper-local level: by parish, ward or municipality. Wisconsin, for example, runs elections by ward and has over 1,800 election offices. For a few states, we chose to aggregate the data up to the county level, which provided a more robust sample size and made it possible to merge in demographic data. Our final dataset has 3,114 jurisdictions.
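A minimal sketch of that county-level roll-up follows. It assumes each sub-county jurisdiction carries a FIPS string whose first five digits identify the county, and that unreported counts are stored as `None`; both are assumptions about the file layout, and the FIPS codes and counts below are made up.

```python
from collections import defaultdict


def aggregate_to_county(rows, count_fields=("C1a", "C1b", "C4b")):
    """Sum ballot counts for all jurisdictions that share a county FIPS prefix.

    A county total stays None only if no jurisdiction in it reported the
    field; otherwise it is the sum of the reported values.
    """
    totals = defaultdict(lambda: {field: None for field in count_fields})
    for row in rows:
        county = row["fips"][:5]  # assumed: first five digits = county
        bucket = totals[county]
        for field in count_fields:
            value = row.get(field)
            if value is None:
                continue  # skip unreported counts rather than adding zero
            bucket[field] = (bucket[field] or 0) + value
    return dict(totals)


# Made-up sub-county records, two of them in the same county.
rows = [
    {"fips": "5500100001", "C1a": 100, "C1b": 80, "C4b": 2},
    {"fips": "5500100002", "C1a": 50, "C1b": 40, "C4b": None},
    {"fips": "5500300001", "C1a": 30, "C1b": 25, "C4b": 1},
]
counties = aggregate_to_county(rows)
print(counties["55001"])  # {'C1a': 150, 'C1b': 120, 'C4b': 2}
```

Note the design choice: a county with partly missing data gets a partial sum, which understates its true total. A stricter version could mark the whole county as missing instead.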
While EAVS 2016 is the most complete data set we have for a general election year, it still has missing data. Across eight states, 167 of the 3,114 jurisdictions did not report rejected ballot data (C4b). Our data for battleground states is almost complete, however, with only Arkansas reporting some missing data. Submitted ballots (C1b) and transmitted ballots (C1a) also had missing data, but in only 0.8% and 0.7% of jurisdictions, respectively.
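Shares like these are simple to compute once missing values are identifiable. The sketch below assumes missing values are represented as `None` (the raw file may encode them differently) and uses a made-up four-row sample; the last line applies the same arithmetic to the two figures stated above.

```python
def missing_share(rows, field):
    """Fraction of jurisdictions with no reported value for `field`."""
    if not rows:
        raise ValueError("no jurisdictions supplied")
    missing = sum(1 for row in rows if row.get(field) is None)
    return missing / len(rows)


# Made-up sample of four jurisdictions.
sample = [
    {"fips": "00001", "C4b": 3},
    {"fips": "00002", "C4b": None},
    {"fips": "00003", "C4b": 0},  # a reported zero is not missing
    {"fips": "00004"},            # field absent entirely
]
share = missing_share(sample, "C4b")
print(f"{share:.0%}")  # 50%

# The figures in the text: 167 of 3,114 jurisdictions lacked C4b.
print(f"{167 / 3114:.1%}")  # 5.4%
```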
We supplemented the EAVS data with data collected from state and local election offices, merged in by FIPS code. If fewer than 85% of a state's jurisdictions reported data on rejected ballots, we contacted the jurisdictions and the state to fill in the missing data. We based the 85% threshold on the EPI methodology. In the following states, fewer than 85% of jurisdictions reported data in 2016: Texas, Arkansas, New Mexico, Alabama and Vermont.
- Alabama reported no rejected ballot data.
- Vermont did not report any data at all. The state of Vermont provided a full dataset upon request.
- New Mexico reports that “data may not add up because it comes from different, unidentified sources.” We were unable to get clarification from the state despite multiple attempts to reach it.
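The supplementing step can be sketched as a merge by FIPS code followed by a reporting-rate check. The sketch takes one reading of the threshold, namely that a state is flagged when fewer than 85% of its jurisdictions reported rejected-ballot data (C4b). The function names, the `None` encoding of missing values, the state labels and all of the numbers are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 0.85  # assumed reading: minimum share of jurisdictions reporting


def fill_from_supplement(rows, supplement, field="C4b"):
    """Return copies of rows with a missing `field` filled in by FIPS match."""
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row.get(field) is None and row["fips"] in supplement:
            new_row[field] = supplement[row["fips"]]
        filled.append(new_row)
    return filled


def states_below_threshold(rows, field="C4b", threshold=THRESHOLD):
    """List states where the share of jurisdictions reporting is too low."""
    reported = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row["state"]] += 1
        if row.get(field) is not None:
            reported[row["state"]] += 1
    return sorted(s for s in total if reported[s] / total[s] < threshold)


# Made-up records for two placeholder states, "AA" and "BB".
rows = [
    {"fips": "0000100000", "state": "AA", "C4b": 5},
    {"fips": "0000200000", "state": "AA", "C4b": None},
    {"fips": "0000300000", "state": "BB", "C4b": 1},
    {"fips": "0000400000", "state": "BB", "C4b": 0},
]
supplement = {"0000200000": 2}  # value obtained directly from an office

before = states_below_threshold(rows)
merged = fill_from_supplement(rows, supplement)
after = states_below_threshold(merged)
print(before, after)  # ['AA'] []
```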