1. Change Log Update : I've spent some time going through the data to get a better understanding of the relationships between the rows, and how to identify batches of votes that were loaded into the system. I focused on the largest batches in CD-1, CD-10, and CD-11.
2. I was particularly focused on batches that had both Dem and Rep candidates for Prez, Senate, and House. The purpose was to compare the votes across the ballot. I expected to see Prez with the largest numbers, and the Senate / House with lower numbers.
3. Well, this is true for the Dem candidates, but not for Rep candidates. I haven't manually checked all the batches, but this tends to hold true for smaller batches (less than 50K). Here is an example of what I mean for this unusual trend in the votes of these batches.
4. This example is a batch of votes that was loaded on 11/3 @ 9:44PM. This batch is restated / updated on 11/6 @ 4:31PM. The bold outlines represent the batch. You can see how Dems go from largest number to smallest as you go down ballot. However, for Reps it's reversed. Odd.
5. I don't have an explanation for this, but, in a presentation by Dr. Shiva, he explains how election software can improperly count ballot votes. Since this data shows us the voting results of specific batches of ballots, this dataset might be useful in analyzing these patterns.
6. The presentation was at the precinct level, but I think batch specific results would be helpful in determining that. Since we have these batches at the county / district level (these are absentee ballots), it could help home in on areas where such calculation issues come from.
7. In the example I showed, the batch was in Loudoun County / District 10. From the smaller batches I've checked from other counties / districts, this skewing doesn't seem to happen. That suggests it wasn't present in all places. Here's one more example.
8. This is a small batch example. Caroline County / District 1. Batch was loaded on 11/3 @ 8:35PM. Same batch was updated on 11/6 @ 4:03PM. Same trend as the large batch example. Dem goes large to small going down ballot. Rep small to large. Why?
9. While these examples relate to odd patterns in the batches, I've also spent some time looking in other areas. One is "tabulation errors" to see if there is anything unusual. I tweeted earlier about VA-7 Spanberger, and looked more closely at the batches for this race.
10. In this example, we follow two updates. The first has no votes because this appears to be a preloaded entry in preparation for the election. First update : 11/4 @ 1:45AM. Second update : 11/4 @ 4:20PM. Both candidates show increases, but Spanberger +9000. More on that next
11. When I look at other "tabulation errors", the adjustments are normally very small as a percentage of the votes. This could be because not all ballots were run through, but it's not clear why this tabulation was so much larger than others in the dataset.
12. One note on the "ChangeReason" field: this appears to be a drop down that users select. "Change Comments" are free text fields that are optional. As you can see in the example, the bottom two rows have the user note "late arriving ballots added". Some users add longer notes
13. such as "post marked 11/3/2020 or before". I've not spent much time looking through these. But it is interesting to see some explanation related to the changes. I'll keep going through this, but for now that's what I have to share.
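For anyone who wants to repeat the down-ballot comparison from tweets 2-8 on their own batches, here is a minimal Python sketch. It assumes a batch has already been reduced to three vote counts per party (Prez, Senate, House). The function name `down_ballot_trend` and the numbers in `batch` are made up for illustration; they are not taken from the Change Log.

```python
def down_ballot_trend(prez, senate, house):
    """Classify how one party's votes move from Prez to Senate to House."""
    if prez == senate == house:
        return "flat"
    if prez >= senate >= house:
        return "decreasing"   # the expected pattern: fewer votes down ballot
    if prez <= senate <= house:
        return "increasing"   # the reversed pattern described in the thread
    return "mixed"

# Hypothetical batch for illustration only, NOT real data.
batch = {
    "Dem": (5000, 4800, 4600),
    "Rep": (3000, 3200, 3400),
}

trends = {party: down_ballot_trend(*votes) for party, votes in batch.items()}
print(trends)
```

A batch where one party comes back "decreasing" and the other "increasing" is the kind of case shown in the two examples above. The label alone says nothing about the cause.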
Deep Dive Explanation of Approach / Analysis : I want to provide the context of the work I am doing, and the reasoning behind it. I started this with a question : Why did R's do so well down ballot, and not at the top? In order to answer this question, I needed a suitable dataset
2. The ideal dataset would allow us to see, for each batch of ballots, the landscape of the votes (e.g. which candidates received the votes, and by what proportions). And not just any batch would suffice to perform this analysis.
3. Batches of ballots needed to be large (e.g. many ballots), and must include votes for Prez, Senate, and House Rep for both Dems and Reps (3 races x 2 parties). The batch is a self contained example of voting behavior, that would be random within a district.
1. Change Log Update : Batch Analysis. As far as I know, there isn't a data source available of how batches of ballots voted. We see the batches in the Edison data feed, and can see their impact on the Dem / Rep %s. But that's for the President. What about down ballot? Let's see!
2. Seeing how Dem & Rep candidates performed down ballot compared to the Prez is a powerful insight that can tell us more about what these batches contain. @va_shiva had a presentation about weighted race voting, and while I'm not familiar, the pattern I'm seeing might
3. be what his analysis showed in MI. Now these batches are in VA. I've focused on CD-1, CD-10, and CD-11. The other criteria was that the batch of votes had to be >1000 for both Biden and Trump. The big batches are more meaningful in terms of trends.
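The batch selection criteria can be written down as a small filter. This is only a sketch in Python: the layout of `batch` (a dict keyed by race and party) and the names `RACES`, `PARTIES` and `qualifies` are assumptions for illustration, not the actual structure of the VA file. The >1000 threshold is the one stated above.

```python
RACES = ("President", "Senate", "House")
PARTIES = ("Dem", "Rep")
MIN_PREZ_VOTES = 1000  # from the thread: >1000 for both Biden and Trump

def qualifies(batch):
    """Return True if a batch meets the criteria described in the thread.

    `batch` maps (race, party) -> votes. This layout is hypothetical.
    """
    # All 3 races x 2 parties must be present in the batch.
    for race in RACES:
        for party in PARTIES:
            if (race, party) not in batch:
                return False
    # Both Prez candidates must have more than the minimum vote count.
    return all(batch[("President", party)] > MIN_PREZ_VOTES for party in PARTIES)

# Hypothetical batches for illustration only, NOT real data.
big = {(race, party): 2000 for race in RACES for party in PARTIES}
small = {(race, party): 500 for race in RACES for party in PARTIES}
no_house = {key: votes for key, votes in big.items() if key[0] != "House"}

print(qualifies(big), qualifies(small), qualifies(no_house))
```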
Found another file on VA website. This one is a "Change Control" log for votes in the system. I've only begun to explore this, but there's some interesting activity for VA-7 for Spanberger. On Oct 30th, a preload change was initiated that would expire at 11/4 @ 4:13AM
At 4:13AM, 66,498 votes are assigned to her with an expiration of 11/5 @ 11:26 AM. When this comes around, the total is adjusted to 63,687. Reason given is "Tabulation Error in Precinct". The last change has no expiration date. These changes affected Chesterfield County.
This lines up with the raw vote total for Spanberger in Chesterfield County. There's also a change record assigning votes to Rashid (VA-1) in Stafford County. See both pics attached :
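To put the Chesterfield adjustment in terms of a percentage of the votes, here is the arithmetic as a short Python sketch. The two totals (66,498 and 63,687) are the ones quoted above; the helper name `adjustment` is just for illustration.

```python
def adjustment(before, after):
    """Return the change in votes and that change as a percent of `before`."""
    change = after - before
    pct = 100.0 * change / before
    return change, pct

# Totals quoted in the thread for Spanberger in Chesterfield County.
change, pct = adjustment(66_498, 63_687)
print(f"{change:+,} votes ({pct:+.2f}%)")
```

That works out to a reduction of 2,811 votes, roughly 4.2% of the first total.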
@WontMarch4Soros @bedivere_knight @ColdPotatoSpud After thinking about the data / analyses I've been doing on the raw data vs website and other reports, I believe I have an explanation for why Rashid has so many more votes than Wittman. This will be a long explanation :
1. This explanation may bounce around a bit, but touches on different aspects of the pics I've posted across different threads. Ask questions if things don't seem to follow, since they are related.
If we think about how elections work, different precincts will have diff ballots
2. The ballots are different because the down ballot races (House reps) are different in the various districts. There are only two races that will be on every ballot across every district / precinct: Senate and President. This is important.
@ColdPotatoSpud @bedivere_knight @Peoples_Pundit 4) The other odd thing in the turnout file is that Stafford County has a huge number of absentee ballots that were cast "In Person", which you don't see in many of the other counties. Here are the details :
@ColdPotatoSpud @bedivere_knight @Peoples_Pundit The number of absentee ballots reported in the turnout file is bizarre (high), but is nothing close to what the raw data says are the absentee ballots associated with Rashid in Stafford County. See the details :
@ColdPotatoSpud @bedivere_knight @Peoples_Pundit Those pics show a large number of votes associated with Rashid in the raw data, but in the turnout file, there is a much smaller number of ballots. Since each vote for Rashid must equal 1 ballot, we should see a minimum of 200K ballots in the Turnout file, but again it doesn't