pycastvoterecords's Introduction

CastVoteRecord in Python

This is the beginning of a Python package for working with election results in the NIST Cast Vote Record Common Data Format. It uses data classes and type hints, so you're probably best off with Python 3.7 or later. The license is BSD.
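As a rough illustration of the data-class approach (not the package's actual classes), a few CVR objects might be modeled as below. The class names echo element names from the spec, but every field name and default here is an assumption made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SelectionPosition:
    # Hypothetical fields: whether the mark counted, and for how many votes.
    has_indication: bool
    number_votes: int


@dataclass
class CVRContestSelection:
    # None stands in for a selection with no linked ContestSelection.
    contest_selection_id: Optional[str] = None
    positions: List[SelectionPosition] = field(default_factory=list)


@dataclass
class CVRContest:
    contest_id: str
    selections: List[CVRContestSelection] = field(default_factory=list)
    undervotes: int = 0


# Hand-build one contest with a single marked selection.
contest = CVRContest(contest_id="contest-1")
contest.selections.append(
    CVRContestSelection("selection-1", [SelectionPosition(True, 1)])
)
```

Using `field(default_factory=list)` gives each instance its own list, which a plain `= []` default would not.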

For now, the library uses only the standard library, but the validate.py program (which will be part of the test suite) uses lxml to validate output against an XML Schema.

The static_cvr.py example is very simple and is just meant to verify that I'm interpreting the format correctly, by hand-constructing a CVR and then serializing it. It takes no arguments, and the results of a sample run are in handbuilt.xml. That file validates against the NIST XSD file, which is copied right out of the NIST GitHub repo.
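A minimal sketch of hand-building and serializing a CVR with only the standard library could look like the following. This is not the code in static_cvr.py: the namespace URI is an assumption to check against the XSD, the `HasIndication` and `NumberVotes` children are illustrative, and the output is far too sparse to validate.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the NIST CVR format; confirm it against the XSD.
NS = "http://itl.nist.gov/ns/voting/1500-103/v1"


def q(name: str) -> str:
    """Return the namespace-qualified tag for an element name."""
    return f"{{{NS}}}{name}"


def build_report() -> ET.Element:
    """Build a skeletal report holding one CVR with one marked selection."""
    root = ET.Element(q("CastVoteRecordReport"))
    cvr = ET.SubElement(root, q("CVR"))
    snapshot = ET.SubElement(cvr, q("CVRSnapshot"))
    contest = ET.SubElement(snapshot, q("CVRContest"))
    selection = ET.SubElement(contest, q("CVRContestSelection"))
    position = ET.SubElement(selection, q("SelectionPosition"))
    # Illustrative leaf elements, not checked against the schema.
    ET.SubElement(position, q("HasIndication")).text = "yes"
    ET.SubElement(position, q("NumberVotes")).text = "1"
    return root


# Serialize with NS as the default namespace instead of an ns0: prefix.
ET.register_namespace("", NS)
xml_text = ET.tostring(build_report(), encoding="unicode")
```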

Included in this repo is a CSV of the votes as recorded by the tabulating machine at Ward 9 (where I live) of the City of Madison in the Fall 2018 General Election. The Dane County Clerk publishes this data after elections in Election Audit Central. I downloaded the Excel spreadsheet, imported the full set into SQLite, and then dumped Ward 9 from SQLite into the CSV file. The data is an export from ESS DS200 tabulating machines, which scan paper ballots; ESS ElectionWare then aggregates all of the scans and exports the TIFF image files as well as a set of Excel spreadsheets of the CVRs. Note that the Excel spreadsheets are the union of everything on the ballot across Dane County, whereas the included CSV keeps only the results from City of Madison Ward 9, so there are a lot of NULLs in the CSV and most columns aren't used.
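Because most columns are unused for a single ward, one way to get a feel for the file is to drop every column that is empty in all rows. The sketch below assumes unused cells hold the literal string `NULL` (or are blank); the column names and sample rows are invented and do not come from the real export.

```python
import csv
import io

# Cell values treated as "no data"; the literal "NULL" is an assumption.
EMPTY = {"", "NULL"}

# Invented sample standing in for the Ward 9 CSV.
SAMPLE = (
    "ballot_id,Governor,Alder District 2,Town Board\n"
    "1,Candidate A,Candidate X,NULL\n"
    "2,NULL,Candidate Y,NULL\n"
)


def load_used_columns(handle):
    """Return (used column names in header order, rows restricted to them)."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    seen = {
        name
        for row in rows
        for name, value in row.items()
        if value not in EMPTY
    }
    used = [name for name in (reader.fieldnames or []) if name in seen]
    return used, [{name: row[name] for name in used} for row in rows]


used, rows = load_used_columns(io.StringIO(SAMPLE))
```

A contest column that is `NULL` for one ballot but filled for another is kept, since a blank there may be an undervote rather than a contest that was not on the ballot.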

convert_to_cvr.py reads from this file and writes out a NIST CVR. By default it converts only the first 10 rows, which is enough to get an undervote, but you can run it with --all to produce an 11MB result file, which includes undervotes, overvotes, and write-ins (though not the actual write-in: the ESS report doesn't include what was written in, only that there was a write-in).
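The "first 10 rows unless --all" behavior can be sketched with argparse and itertools.islice. This is an illustration rather than the script's actual code; the function names and the `limit` parameter are hypothetical.

```python
import argparse
from itertools import islice


def parse_args(argv=None):
    """Parse the command line; --all is the only flag described above."""
    parser = argparse.ArgumentParser(description="Convert ballot rows to a CVR")
    parser.add_argument(
        "--all",
        action="store_true",
        help="convert every row instead of only the first 10",
    )
    return parser.parse_args(argv)


def select_rows(rows, convert_all, limit=10):
    """Yield every row, or only the first `limit` rows by default."""
    return iter(rows) if convert_all else islice(rows, limit)


# With no flag only 10 of the 25 stand-in rows are selected.
default_rows = list(select_rows(range(25), parse_args([]).all))
all_rows = list(select_rows(range(25), parse_args(["--all"]).all))
```

Slicing the iterator lazily means the default run never reads past the rows it needs.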

Later comes reading a CVR report. The big questions are what data structures would anyone want after reading the file, and depending on how big the CVR reports get, can one get away with basic, single pass, in-memory XML parsing?
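If a report turns out to be too large for a full in-memory tree, one single-pass option in the standard library is `xml.etree.ElementTree.iterparse`, clearing each CVR element once it has been handled. The sketch below only counts CVRs; the namespace URI is an assumption and the sample document is invented.

```python
import io
import xml.etree.ElementTree as ET

# Assumed namespace for the NIST CVR format; confirm it against the XSD.
NS = "http://itl.nist.gov/ns/voting/1500-103/v1"


def count_cvrs(source) -> int:
    """Stream through a report and count its CVR elements."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == f"{{{NS}}}CVR":
            count += 1
            # Drop the element's children and text so memory stays small.
            # The emptied element itself remains attached to the root.
            elem.clear()
    return count


# Invented two-CVR document for demonstration.
sample = (
    f'<CastVoteRecordReport xmlns="{NS}">'
    "<CVR><CVRSnapshot/></CVR><CVR><CVRSnapshot/></CVR>"
    "</CastVoteRecordReport>"
).encode("utf-8")

total = count_cvrs(io.BytesIO(sample))
```

Whether this is needed depends on the report size; for an 11MB file a plain `ET.parse` may well be enough.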

Questions

  • Are CVR Snapshot IDs globally unique, or just unique within a CVR element?
  • This is usually easy to figure out from common sense or the examples, but whenever the spec says 'reference' or 'link' to a different object, is that always encoded as 'OBJECT_TYPE' with 'Id' concatenated onto it? It would be nice if the PDF spec were more specific about this and the names, without having to go read the XSD file.
  • The spec says "ContestSelection contains one attribute, Code, that can be used to identify the contest selection and thereby eliminate the need to identify it using the subclasses." - but the example files don't have a 'Code' in the ContestSelection element.
  • Why does CVRContest have a status for Overvote and Undervote, and a count for overvotes and undervotes and writeins (but no status for writeins?)
  • The example_1.xml sample file includes an example of a writein in a SelectionPosition, but the CVRContestSelection doesn't link to a ContestSelection
  • The basic example in section 5.2 of the spec includes a 'cdf:Position' element in the CVRContestSelection element, but the spec seems to call that OptionPosition?
  • The basic example in section 5.2 includes a 'TotalNumberVotes' in the CVRContest (as well as in the CVRContestSelection element) but that doesn't appear to be legal.
  • Could the XML Schema not use sequence, which seems to force ordering in the XML entities? It's not a huge deal, and if it's a limitation of XML Schema I guess whatever, but it was annoying to have to go back and reorder elements in the serialization.
  • ReportGeneratingDeviceIds in example_2.xml looks wrong - it's pointing at an election?
  • The description of Party in CastVoteRecordReport suggests that you're only supposed to use it for a primary, but is it OK to use in the general too?
  • What is the purpose of 'value' in a Code type?
  • Why is code in both candidate and contest?
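On the sequence-ordering question above, one workaround is to build elements in any order and sort the children just before serializing. The sketch below is not how this package does it, and the `ORDER` list is hypothetical; the real child order has to be read from the XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical child order for one element type; take the real one from the XSD.
ORDER = ["ContestId", "CVRContestSelection", "Overvotes", "Undervotes"]


def reorder_children(elem: ET.Element, order) -> None:
    """Sort an element's children in place to follow `order`.

    Tags not listed in `order` go last. sorted() is stable, so repeated
    elements keep their relative order.
    """
    rank = {name: index for index, name in enumerate(order)}
    elem[:] = sorted(elem, key=lambda child: rank.get(child.tag, len(order)))


# Children added out of order, then fixed up before serialization.
contest = ET.Element("CVRContest")
ET.SubElement(contest, "Undervotes").text = "1"
ET.SubElement(contest, "CVRContestSelection")
ET.SubElement(contest, "ContestId").text = "contest-1"
reorder_children(contest, ORDER)
tags = [child.tag for child in contest]
```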

pycastvoterecords's People

Contributors

epaulson, nealmcb
