palewire / django-calaccess-raw-data

A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database

Home Page: http://django-calaccess.californiacivicdata.org/

License: MIT License

Python 99.91% Makefile 0.09%
california data-journalism django etl journalism news opendata python

django-calaccess-raw-data's People

Contributors

aboutaaron, amandabee, anthonyjpesce, armendariz, benlk, bllchmbrs, burtherman, dankeemahill, duner, enactdev, ewr, gordonje, hcharley, hunterowens, joegermuska, karkinosw, malon, mattmontgomery, mclawges22, mhkeller, mthatcherclark, myersjustinc, nbedi, palewire, ryanpitts, ryanvmenezes, sourcedouglas, stevenrich, tocateunvals, yujiap

django-calaccess-raw-data's Issues

campaign_finance management command or commands

Need to move the campaign_finance.load stuff into a management command. I just did it that way while I figured it all out. The management command stuff is still pretty new to my feeble mind.

We probably need a load-from-scratch command as well as an update command that loads only the new records.

Getting error on user input from downloadcalaccess mgmt command

Problem

python manage.py downloadcalaccess raises a UnicodeEncodeError at line 124 of downloadcalaccess.py.

Reproduce the problem

  1. Create a new virtualenv, install django and create a new django project
  2. git clone this repo, cd into it and checkout the rewrite branch
  3. build the django-calaccess-parser package with python setup.py sdist
  4. Install the parser into your new virtualenv with pip install /path/to/django-calaccess-parser/dist/django-calaccess-parser-0.1.tar.gz
  5. Add calaccess to your INSTALLED_APPS and python manage.py syncdb
  6. Create a MySQL database and user and give them privileges
  7. Set a CALACCESS_DOWNLOAD_DIR path in settings.py

Breathe.

  8. Run python manage.py downloadcalaccess to see the error

Possible solutions

I don't think I adjusted any code when I started porting things over, but we should look into that. Also, I don't know how well this command works now that it has moved from an app living explicitly inside the project to a package install.

Stacktrace

Traceback (most recent call last):
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/home/aaron/.envs/test/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 399, in execute_from_command_line
    utility.execute()
  File "/home/aaron/.envs/test/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 392, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/aaron/.envs/test/local/lib/python2.7/site-packages/django/core/management/base.py", line 242, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/aaron/.envs/test/local/lib/python2.7/site-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/home/aaron/.envs/test/local/lib/python2.7/site-packages/calaccess/management/commands/downloadcalaccess.py", line 124, in handle
    confirm = input(self.prompt)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 73: ordinal not in range(128)
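
The traceback points at a non-breaking space (u'\xa0') in the prompt string, which the ASCII codec cannot encode. A minimal sketch of one possible workaround, assuming the prompt is plain text where a non-breaking space can safely become a regular space. The helper name ascii_safe_prompt and the sample prompt text are hypothetical and not part of the package.

```python
def ascii_safe_prompt(prompt):
    """Return a copy of prompt that can be encoded as ASCII.

    Hypothetical helper: swaps non-breaking spaces (U+00A0) for regular
    spaces, then replaces any other non-ASCII character with "?".
    """
    cleaned = prompt.replace(u"\xa0", u" ")
    return cleaned.encode("ascii", "replace").decode("ascii")


# An invented prompt containing the same character named in the traceback.
prompt = u"Do you want to download the latest data?\xa0[y/N] "
safe = ascii_safe_prompt(prompt)
```

Passing the cleaned string to input() avoids the implicit ASCII encoding failure, at the cost of losing any characters that have no ASCII equivalent.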

Refactoring for speed, flexibility, usability and completeness

Dreams

Here is a running list of ideas I have for improving this repository at the upcoming code convening.

  • The load method should have a more explicit connection between CSV files and tables rather than making namespace-based connections. And it should throw errors if things don't link up.
  • The clean method seems to be too slow. How can we make it go faster? Figuring out #30 would help us here.
  • The database models need docstrings, verbose names or something to spell out what they are more clearly to newbies. They also need __unicode__ methods and other prettifying things like ordering defaults.
  • There are a number of source data files that do not yet have a corresponding Django model. We need to figure out which ones we want and get them into the system.
  • LOAD DATA INFILE is great, but it only works for MySQL. Is there a way we could write the code so that it could work for other database backends supported by Django, particularly PostgreSQL?
  • Row count checks now run at the tail end of the downloadcalaccess management command. It would be great if they were prettified, expanded and made available on demand -- maybe in a separate management command or on a dedicated dashboard webpage.
  • We need unit tests -- but we can't ask our testing suite to download and install this gigantic data dump. Any ideas about how to work that out?
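
The first idea in the list, an explicit link between CSV files and tables that fails loudly on a mismatch, could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's loader: the EXPECTED_COLUMNS mapping, the file name acronyms_cd.csv, the STANDS_FOR column and the sample row are all invented. Because it runs against a tiny in-memory file, a check like this could also be unit tested without downloading the full data dump.

```python
import csv
import io

# Hypothetical mapping from CSV file name to the columns its table expects.
# The file name and the STANDS_FOR column are invented for illustration.
EXPECTED_COLUMNS = {
    "acronyms_cd.csv": ["ACRONYM", "STANDS_FOR"],
}


def check_csv_header(file_name, csv_file):
    """Raise ValueError unless the CSV header matches the declared columns."""
    if file_name not in EXPECTED_COLUMNS:
        raise ValueError("No table is mapped to %s" % file_name)
    header = next(csv.reader(csv_file), [])
    expected = EXPECTED_COLUMNS[file_name]
    if header != expected:
        raise ValueError(
            "%s header %r does not match expected %r"
            % (file_name, header, expected)
        )
    return header


# A small in-memory sample stands in for a real download.
sample = io.StringIO("ACRONYM,STANDS_FOR\nABC,Example value\n")
columns = check_csv_header("acronyms_cd.csv", sample)
```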

LOAD DATA INFILE failing with error saying it needs default values for empty cells. WTF?

Why didn't this happen before? Does anybody else get it? cc @aboutaaron @mikejcorey

python example/manage.py loadcalaccessfile AcronymsCd
- Loading AcronymsCd
Traceback (most recent call last):
  File "example/manage.py", line 9, in <module>
    execute_from_command_line(sys.argv)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 399, in execute_from_command_line
    utility.execute()
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 392, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/core/management/base.py", line 242, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/core/management/base.py", line 385, in handle
    label_output = self.handle_label(label, **options)
  File "calaccess_raw/management/commands/loadcalaccessfile.py", line 16, in handle_label
    self.load(label)
  File "calaccess_raw/management/commands/loadcalaccessfile.py", line 68, in load
    cnt = c.execute(bulk_sql_load)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/db/backends/util.py", line 69, in execute
    return super(CursorDebugWrapper, self).execute(sql, params)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/db/backends/util.py", line 51, in execute
    return self.cursor.execute(sql)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 124, in execute
    return self.cursor.execute(query, args)
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 207, in execute
    if not self._defer_warnings: self._warning_check()
  File "/home/ben/Code/ccdc/django-calaccess-parser/local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 117, in _warning_check
    warn(w[-1], self.Warning, 3)
_mysql_exceptions.Warning: Field 'ACRONYM' doesn't have a default value
make: *** [loadtable] Error 1
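
The last frames of the traceback show the MySQL message passing through Python's warnings.warn and surfacing as an exception, which happens when a warning filter is set to "error". A minimal sketch using only the standard warnings module shows the difference between recording such a warning and raising it. LoadWarning and fake_bulk_load are hypothetical stand-ins for the driver's warning class and the real cursor call. This does not explain why the empty cells trigger the warning in the first place.

```python
import warnings


class LoadWarning(Warning):
    """Stand-in for the database driver's warning class (hypothetical)."""


def fake_bulk_load():
    """Stand-in for the cursor call; emits the message from the traceback."""
    warnings.warn("Field 'ACRONYM' doesn't have a default value", LoadWarning)
    return 1


def load_and_collect_warnings():
    """Run the load, returning its result and any warning messages emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record warnings of this class instead of letting a filter raise them.
        warnings.simplefilter("always", LoadWarning)
        result = fake_bulk_load()
    return result, [str(w.message) for w in caught]


def load_strict():
    """Run the load with an "error" filter, so the warning is raised."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", LoadWarning)
        return fake_bulk_load()


result, messages = load_and_collect_warnings()
```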

CSV folder not getting created after unzip, clean fails

Unfortunately the zip file got blown up so I don't have a copy. Will download and check if this is just a bad copy of the zip file.

  File "/home/michaelcorey/.Envs/calaccess/local/lib/python2.7/site-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/home/michaelcorey/.Envs/calaccess/local/lib/python2.7/site-packages/django/core/management/base.py", line 385, in handle
    label_output = self.handle_label(label, **options)
  File "/home/michaelcorey/.Envs/calaccess/local/lib/python2.7/site-packages/calaccess_raw/management/commands/cleancalaccessfile.py", line 20, in handle_label
    self.clean(label)
  File "/home/michaelcorey/.Envs/calaccess/local/lib/python2.7/site-packages/calaccess_raw/management/commands/cleancalaccessfile.py", line 40, in clean
    csv_file = open(csv_path, 'wb')
IOError: [Errno 2] No such file or directory: '/bulk/calaccess/csv/cvr_campaign_disclosure_cd.csv'
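
The IOError says the csv directory does not exist when the clean step opens its output file. One defensive option, sketched here under the assumption that the command is allowed to create the directory itself, is to make the parent directory before opening the file. open_csv_for_writing is a hypothetical helper, and the example writes to a temporary directory rather than /bulk/calaccess. It uses Python 3's text mode with newline="" where the original code used 'wb'.

```python
import os
import tempfile


def open_csv_for_writing(csv_path):
    """Open csv_path for writing, creating its parent directory if missing."""
    csv_dir = os.path.dirname(csv_path)
    if csv_dir and not os.path.isdir(csv_dir):
        os.makedirs(csv_dir)
    return open(csv_path, "w", newline="")


# Demonstrate against a throwaway directory that has no csv folder yet.
base_dir = tempfile.mkdtemp()
csv_path = os.path.join(base_dir, "csv", "cvr_campaign_disclosure_cd.csv")
with open_csv_for_writing(csv_path) as csv_file:
    csv_file.write("placeholder\n")
```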

Support for PostgreSQL

This might be a good excuse to develop a Python wrapper that is able to access MySQL's LOAD DATA INFILE as well as PostgreSQL's COPY.
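
A rough sketch of what the core of such a wrapper might look like: one function that returns either a LOAD DATA INFILE or a COPY statement depending on the backend. The function name build_bulk_load_sql, the table name and the file path are hypothetical, and the CSV options shown (comma-delimited, quoted fields, one header row) are assumptions about the cleaned files. Both statements read a path that must be visible to the database server.

```python
def build_bulk_load_sql(vendor, table_name, csv_path):
    """Return a bulk-load statement for the given backend.

    Hypothetical helper. table_name and csv_path are interpolated directly,
    so they must come from trusted code, never from user input.
    """
    if vendor == "mysql":
        return (
            "LOAD DATA INFILE '%s' INTO TABLE %s "
            "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
            "IGNORE 1 LINES" % (csv_path, table_name)
        )
    if vendor == "postgresql":
        return "COPY %s FROM '%s' WITH (FORMAT csv, HEADER true)" % (
            table_name,
            csv_path,
        )
    raise NotImplementedError("No bulk loader for backend %r" % vendor)


# Invented table name and path, for illustration only.
mysql_sql = build_bulk_load_sql("mysql", "ACRONYMS_CD", "/tmp/acronyms_cd.csv")
postgres_sql = build_bulk_load_sql(
    "postgresql", "ACRONYMS_CD", "/tmp/acronyms_cd.csv"
)
```

The vendor strings match the values Django exposes as connection.vendor, so a loader could pass that attribute straight through.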

problem with calendar view on folks with bad or no data

routes:
No Data
/committee/calderon-legal-defense-fund-ron/1131/

Bad/missing data
committee/perez-for-assembly-2010-john-a/2848/

Errors:
For no data
In Cal Heatmap: Uncaught TypeError: Cannot call method 'getFullYear' of undefined

setup.py can't seem to find calaccess parser on Pypi

So, I'm working to get django-calaccess-campaign-finance ready, but it looks like setuptools can't find the parser on PyPI.

Reproduce

$ git clone https://github.com/california-civic-data-coalition/django-calaccess-campaign-finance.git
$ cd django-calaccess-campaign-finance
$ python setup.py install

Output:

Installed /home/aaron/.envs/calaccess/lib/python2.7/site-packages/django_calaccess_campaign_finance-0.2-py2.7.egg
Processing dependencies for django-calaccess-campaign-finance==0.2
Searching for django-calaccess-parser>=0.2
Reading http://pypi.python.org/simple/django-calaccess-parser/
No local packages or download links found for django-calaccess-parser>=0.2
error: Could not find suitable distribution for Requirement.parse('django-calaccess-parser>=0.2')

Possible reason

It looks like setuptools searches for the package at http://pypi.python.org/simple/django-calaccess-parser/ rather than the actual PyPI index.

I'll drop the dependency for now just to check out some other stuff, but thought I'd let you know before we get started Wednesday.

Move campaign finance app out of repo

In an effort to keep things simple, we're gonna move the campaign finance app and ui into a separate repo. This way, we can focus on making this parser simple and let individual newsrooms decide on the ui and app features they want.

Django 1.7 failing hard on Travis

Traceback (most recent call last):
  File "setup.py", line 59, in <module>
    cmdclass={'test': TestCommand}
  File "/opt/python/2.7.8/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/opt/python/2.7.8/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/opt/python/2.7.8/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "setup.py", line 28, in run
    call_command('test', 'calaccess_raw')
  File "/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/Django-1.7-py2.7.egg/django/core/management/__init__.py", line 93, in call_command
    app_name = get_commands()[name]
  File "/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/Django-1.7-py2.7.egg/django/utils/lru_cache.py", line 101, in wrapper
    result = user_function(*args, **kwds)
  File "/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/Django-1.7-py2.7.egg/django/core/management/__init__.py", line 73, in get_commands
    for app_config in reversed(list(apps.get_app_configs())):
  File "/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/Django-1.7-py2.7.egg/django/apps/registry.py", line 137, in get_app_configs
    self.check_apps_ready()
  File "/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/Django-1.7-py2.7.egg/django/apps/registry.py", line 124, in check_apps_ready
    raise AppRegistryNotReady("Apps aren't loaded yet.")
AppRegistryNotReady: Apps aren't loaded yet.
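
Django 1.7 introduced an app registry that has to be populated before management commands are called from a standalone script, and django.setup() is the call that does it. A minimal sketch of how the test step in setup.py might be adjusted, assuming settings are already configured before it runs (for example through DJANGO_SETTINGS_MODULE). run_tests is a hypothetical name, and the hasattr guard is there so the same code still works on releases older than 1.7.

```python
def run_tests():
    """Populate the app registry if needed, then run the test command."""
    # Imported lazily so this module can be loaded without Django installed.
    import django
    from django.core.management import call_command

    # django.setup() exists from Django 1.7 onward; older releases lack it.
    if hasattr(django, "setup"):
        django.setup()
    call_command("test", "calaccess_raw")
```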
