deardurham / ciprs-reader
Python library for reading CIPRS PDFs
License: BSD 3-Clause "New" or "Revised" License
The CaseDetails parser fails to parse county names with special characters, like:
No match: Case Summary for Court Case: GUILFORD-GR 15CR000000
The petition tool then groups all unmatched county names under UNKNOWN, which is a bug.
AC:
(?P<county>.+)
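A minimal sketch of how a permissive county group could behave, assuming the header line always ends with the file number. The pattern name, the `fileno` group, and the `parse_county` helper are hypothetical and are not taken from the library:

```python
import re

# Hypothetical pattern: the county is everything between the fixed prefix
# and the trailing file number, so hyphens and spaces are both accepted.
CASE_HEADER_RE = re.compile(
    r"Case (?:Details|Summary) for Court Case:\s*"
    r"(?P<county>.+)\s+(?P<fileno>\d+\w+)\s*$"
)


def parse_county(line):
    """Return the county name from a case header line, or None if no match."""
    match = CASE_HEADER_RE.search(line)
    return match.group("county").strip() if match else None
```

Because the file number anchors the end of the line, the greedy `.+` can only stop at the last run of whitespace, which keeps multi-word and hyphenated counties intact.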
Rows like the following are skipped currently because the regex fails to match:
CONVICTED CITY/TOWN VIOLATION (I) INFRACTION LOCAL ORDINANCE
AC
The DEAR team uses disposition codes to fill in the disposition column in the offense section. Update petition generator to use these codes.
ACIS codes.pdf
The CIPRS reader only recognizes CR (DISTRICT COURT) and CRS (SUPERIOR COURT) values in the CIPRS record file numbers, causing the Summary Document to put the infraction offenses in a NOT AVAILABLE jurisdiction. The infraction offenses should be put in the DISTRICT jurisdiction.
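One way to sketch the mapping, assuming file numbers consist of a two-digit year, a letter code, and a sequence number. The `IF` code for infractions is an assumption (the issue names only CR and CRS), and the function and constant names are hypothetical:

```python
import re

# Assumed file-number layout: two-digit year, letter code, digits.
FILE_NO_RE = re.compile(r"^\d{2}(?P<code>[A-Z]+)\d+$")

# CR and CRS come from the issue text; "IF" for infractions is an assumption.
JURISDICTION_BY_CODE = {
    "CR": "DISTRICT COURT",
    "CRS": "SUPERIOR COURT",
    "IF": "DISTRICT COURT",
}


def jurisdiction_for(file_no):
    """Map a file number to a jurisdiction, falling back to NOT AVAILABLE."""
    match = FILE_NO_RE.match(file_no.strip())
    if not match:
        return "NOT AVAILABLE"
    return JURISDICTION_BY_CODE.get(match.group("code"), "NOT AVAILABLE")
```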
For testing/demo: Can use CIPRS record of person with initials JR or MJ
Export JSON should contain:
{
"Offense Record": {
"Plea": "blah",
"Verdict": "blah",
}
}
There are two types of court records: Case Details and Case Summaries. The petition generator was built for Case Details. Make this parser work for Case Summaries as well.
Ideally we should be pulling the Offense Date from the Case Information section, which is the first section at the top. Right now the parser gets it from the citation information section, which doesn't exist for some records.
The name DEANS,TEST'TEST,TYNAHJ gets parsed as Shaqui Deans rather than Shaqui'awan Deans because it stops at the '.
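A sketch of a name pattern that tolerates apostrophes, assuming the raw value is comma-separated as LAST,FIRST,MIDDLE. The character class and the `split_name` helper are hypothetical, not the parser's actual code:

```python
import re

# Hypothetical character class: word characters plus apostrophe, hyphen, space.
NAME_PART = r"[\w'\- ]+"
NAME_RE = re.compile(
    rf"^(?P<last>{NAME_PART}),(?P<first>{NAME_PART})(?:,(?P<middle>{NAME_PART}))?$"
)


def split_name(raw):
    """Split a LAST,FIRST[,MIDDLE] string into its parts, or return None."""
    match = NAME_RE.match(raw.strip())
    return match.groupdict() if match else None
```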
Disposition Method: WAIVER - MAGISTRATE
is currently parsed as WAVIER.
AC
Summary records can contain the following language:
Additional offenses exist. To see a complete breakdown, view full case detail.
If it exists, this should be noted in the JSON so dear-petition can pick it up and display a message to the user:
{
"Additional offenses exist": "true"
}
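A minimal sketch of detecting the notice, assuming the extracted PDF text may wrap the sentence across lines. The helper names are hypothetical, and this version emits a JSON boolean rather than the string "true" shown above, which is a design choice that dear-petition would need to agree on:

```python
import json

ADDITIONAL_OFFENSES_NOTICE = "Additional offenses exist."


def has_additional_offenses(text):
    """Return True if the summary text contains the notice."""
    # Collapse whitespace so a notice wrapped across lines still matches.
    normalized = " ".join(text.split())
    return ADDITIONAL_OFFENSES_NOTICE in normalized


def summary_flags_json(text):
    """Serialize the flag under the key used in the issue."""
    return json.dumps({"Additional offenses exist": has_additional_offenses(text)})
```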
Some CIPRS records look like this:
Offense Record 1 of 2 (Line Number 1)
...
Offense Record 2 of 2 (Line Number 2)
...
We are only capturing the charges from the first offense record and not the second.
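One possible approach is to split the text on every offense record header before parsing charges, so each record is handled separately. This is a sketch under the assumption that headers appear on their own line in exactly the form shown above; `split_offense_records` is a hypothetical helper:

```python
import re

RECORD_HEADER_RE = re.compile(
    r"^Offense Record (?P<index>\d+) of (?P<total>\d+) "
    r"\(Line Number (?P<line>\d+)\)\s*$",
    re.MULTILINE,
)


def split_offense_records(text):
    """Return one dict per offense record, each with its own body text."""
    headers = list(RECORD_HEADER_RE.finditer(text))
    records = []
    for i, header in enumerate(headers):
        # A record's body runs until the next header or the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        records.append(
            {
                "index": int(header.group("index")),
                "line_number": int(header.group("line")),
                "body": text[header.end():end].strip(),
            }
        )
    return records
```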
Removed unnecessary steps. We should release now to resolve Jessica's issue with Open Container description not getting parsed
Right now date parsers return datetime strings, which get converted to MM-DD-YYYY format for display, but this results in a lot of duplicated code because it always needs to be converted to MM/DD/YYYY (%m/%d/%Y) format in the end. We can DRY our code by making the parser return that format in the first place.
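A sketch of centralizing the conversion in one helper, assuming each parser knows the format of the text it captured. `DISPLAY_DATE_FORMAT` and `parse_date` are hypothetical names:

```python
from datetime import datetime

DISPLAY_DATE_FORMAT = "%m/%d/%Y"


def parse_date(raw, source_format=DISPLAY_DATE_FORMAT):
    """Validate a captured date and return it in MM/DD/YYYY form."""
    return datetime.strptime(raw.strip(), source_format).strftime(DISPLAY_DATE_FORMAT)
```

Callers then receive the display format directly and no longer need their own conversion step.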
The offense row parser is designed to capture data from offense rows that look like this:
ACTION DESCRIPTION SEVERITY LAW CODE
However, on some records they look like this:
# ACTION DESCRIPTION SEVERITY LAW
With # being either a zero padded number or blank
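A sketch of peeling off the optional leading number before the existing row parsing runs, assuming the number, when present, is a run of digits followed by whitespace. `split_row_number` is a hypothetical helper:

```python
import re

# Optional leading number (for example "01"), then the rest of the row.
ROW_NUMBER_RE = re.compile(r"^\s*(?:(?P<number>\d+)\s+)?(?P<rest>\S.*)$")


def split_row_number(row):
    """Return (number or None, remaining row text), or None for blank rows."""
    match = ROW_NUMBER_RE.match(row)
    if not match:
        return None
    return match.group("number"), match.group("rest").rstrip()
```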
Attorneys manually combine CIPRS record PDFs to ease organization of their documents, so a single PDF may contain many CIPRS records.
AC
The CaseDetails parser doesn't expect county names to contain spaces, so a line like
Case Summary for Court Case: NEW HANOVER 20CR000000
will result in a county name of NEW.
AC
A machine-readable record type will be useful for dear-petition.
AC:
Works:
Matched: {'value': '08/04/2004 01:00 AM'} in Offense Date/Time: 08/04/2004 01:00 AM • Date: 04/05/2005
Doesn't work:
No match: Offense Date/Time: 03/04/2005 00:00 • Date: 02/16/2006
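A sketch that accepts both the 12-hour form with AM/PM and the 24-hour form without it, based only on the two sample lines above. The pattern and the `parse_offense_datetime` helper are hypothetical:

```python
import re
from datetime import datetime

# The AM/PM suffix is optional so 24-hour values also match.
OFFENSE_DATETIME_RE = re.compile(
    r"Offense Date/Time:\s*"
    r"(?P<value>\d{2}/\d{2}/\d{4} \d{2}:\d{2}(?: [AP]M)?)"
)
DATETIME_FORMATS = ("%m/%d/%Y %I:%M %p", "%m/%d/%Y %H:%M")


def parse_offense_datetime(line):
    """Return the offense datetime from a line, or None if it cannot be parsed."""
    match = OFFENSE_DATETIME_RE.search(line)
    if not match:
        return None
    for fmt in DATETIME_FORMATS:
        try:
            return datetime.strptime(match.group("value"), fmt)
        except ValueError:
            continue
    return None
```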