
alderman_machine's Introduction

  • I'm @JohnCRuf, a pre-doctoral researcher under Jonathan Dingel at UChicago Booth and an ex-engineer.
  • 👀 I'm interested in economic research, particularly Political Economy, Urban Economics, and Innovation and Productivity from an empirical IO perspective (although I do enjoy a good causal inference paper).
  • 📫 How to reach me: Email me at [email protected] or [email protected]

alderman_machine's People

Contributors

jdingel, johncruf, tmalthouse


alderman_machine's Issues

Address 11/15 RP Seminar comments

Intro:

  • The audience did not like the 'misallocation' wording because I do not present an optimal spending benchmark to compare against.
  • Alternatives include 'political favoritism' and 'politically influenced welfare weighting'. The point is fair in my opinion, but 'politically influenced welfare weighting' is a mouthful.

Data Description:

  • Audience wants to see spending per capita figures in the data
  • People want to see maps and distributions of spending categories as well
  • Junbiao in particular had some nice comments here; he framed his thinking in terms of 'rates of return' on projects. I have no real objection to this framing, but it may be worth discussing if I dig into the spending categories more.
  • Audience: It would be good to consider state-level investment (I don't have the time to get that data, so I'll set this aside).
  • Audience: It may be good to go into the somewhat gory details of data processing for applications; it shows drive and problem-solving skills.
  • Lots of curiosity about potential border design. One audience member mentioned Bordeu's JMP!

Bernie Stone Case Study:

The audience went wild for this, with audible gasps at the reveal.

  • My first comment is to put this case study on the front page of the paper, as it's "catchy"
  • Basically, everyone thought this was an extremely impressive part of the paper and that it made the case that this was a worthwhile and interesting project, regardless of the results

Results

  • Need to be clearer about the sample
  • One recommendation for a TWFE approach where I weight the treatment by the amount of support; this would increase my sample size fivefold because I could use every precinct
  • One recommendation for using a synthetic control design
  • One recommendation for DiD placebo tests
  • Multiple questions on the best way to interpret the treatment effect.
  • A lot of questions and recommendations afterward; overall, people were very supportive and fairly impressed. I wasn't expecting people to be impressed - I thought they would be bored by how local the setting is.

Why not a joint test of significance?

Had an idea while interpreting an issue for work.

Why can't I do a joint test between the top and bottom models? In how many specifications does the joint test pass?
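One way this could look, as a minimal sketch: stack the top and bottom samples, estimate both treatment effects in a single regression, and run a joint Wald test. The data frame `df` and its columns (spending, treated, group) are placeholders, not names from the repo.

```r
# Minimal sketch of a joint test across the top/bottom models, assuming a
# stacked data frame `df` with placeholder columns: spending (outcome),
# treated (0/1), group ("top"/"bottom"). Fitting both slopes in one
# regression makes the joint hypothesis a standard Wald test.
library(car)

m <- lm(spending ~ group + treated:group, data = df)

# H0: the treatment effect is zero in both groups simultaneously.
# Coefficient names follow the factor levels in the data.
linearHypothesis(m, c("treated:groupbottom = 0", "treated:grouptop = 0"))
```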

Gather 311 Complaint Data

Gather complaint data and match it to political data using voting-precinct maps. Make sure to include duplicates.
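A rough sketch of the spatial match with sf; file paths and coordinate column names are placeholders.

```r
# Sketch: point-in-polygon join of 311 complaints onto voting precincts.
library(sf)
library(dplyr)

complaints <- read.csv("311_complaints.csv") |>        # placeholder path
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

precincts <- st_read("voting_precincts.shp") |>        # placeholder path
  st_transform(4326)

# st_join keeps one row per complaint, so duplicate complaints survive the
# merge as required
matched <- st_join(complaints, precincts, join = st_within)
```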

Merge Thomas Malthouse's work into this repo

This should be done via a pull request so I can review the work. It needs to follow the task-based format of this repo.

Key questions

  1. Why is he getting a significant sorting effect in elections? It could be a composition-based effect from merging general and runoff elections. If the composition selectively removes general elections when a runoff exists, you have a mechanism for inducing a significant discontinuity.
  2. Is it possible to merge contracting data with the menu data? If so, how?

Data Description Improvements

I want a few new figures:

  • Maps of spending categories: what areas get the most spending on streets, parks, sidewalks and alleys, policing, and misc? (A plotting sketch follows this list.)
  • Take the 50 wards for each cycle from 2003-2022. That is 5 cycles total: (04-07, 08-11, 12-15, 16-19, 19-22). For each of the 50 wards in these five periods, look at the distribution of the fraction of the total budget spent on the top X incumbent-supporting precincts in the previous general election. This gives 250 numbers. What does this look like? If it is bimodal, we have a strong case for bringing in Dixit-Londregan results.
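As referenced above, a sketch of the category maps. The shapefile path and the long `ward_spend` table (ward, category, amount) are placeholders.

```r
# Sketch: choropleth of ward spending, faceted by spending category.
library(sf)
library(dplyr)
library(ggplot2)

wards <- st_read("ward_boundaries.shp")   # placeholder path

wards |>
  left_join(ward_spend, by = "ward") |>   # ward_spend: ward, category, amount
  ggplot() +
  geom_sf(aes(fill = amount)) +
  facet_wrap(~ category) +
  scale_fill_viridis_c(name = "Spending ($)")
```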

Data processing for 2021 and 2022 PDFs do not work

The multi-line conjoin step in the 2016-2022 menu processing script does not align with the formatting of the 2021 and 2022 files. A separate script is needed for 2021 and 2022 that can handle their multi-line row format.
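One plausible shape for that script, assuming the PDF text has already been extracted to lines and that every true record starts with a detectable token (the dollar-amount pattern below is a placeholder; the real pattern has to come from the 2021/2022 layout):

```r
# Hypothetical sketch: treat any line that does not start a new record as a
# continuation of the previous line, then paste the pieces back together.
lines <- readLines("menu_2021_raw.txt")                   # placeholder path

starts_record <- grepl("^\\$[0-9,]+\\.[0-9]{2}", lines)   # placeholder pattern
row_id <- cumsum(starts_record)                           # 0 = preamble lines

rows <- tapply(lines, row_id, paste, collapse = " ")
```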

Custom DiD Estimator?

I need to consider using a custom-built estimator for this "close elections DiD" thing.

Under the close election assumption, I have treatment randomization over wards. However, because the "treatment" is having a "tied" alderman replaced, the dosage correlates with net votes. Furthermore, under the close election assumption, precinct-level net votes don't significantly change between treated and untreated.

Thus, I effectively see the "counterfactual dosage" each precinct would have gotten if its alderman had been reelected. I'm effectively matching on this "dosage" and then running a DiD, but this gives lower power than if I could use all 50 precincts available in each ward.

Traditional DiD estimators (e.g., https://bcallaway11.github.io/posts/five-minute-did-continuous-treatment) assume that you can't see this "counterfactual dosage," but in this case, we can.
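Under that logic, one minimal sketch is to put the observed dosage directly into the regression: interact the post-period treated-ward indicator with each precinct's dosage, so every precinct enters. `panel` and all column names below are placeholders, and this is not the Callaway-style continuous-treatment estimator; it leans entirely on the close-election claim that dosage is as good as observed for untreated wards.

```r
# Sketch: dosage-interacted DiD using every precinct. Placeholder columns:
# spending, post (0/1), treated_ward (0/1), dosage (net votes for the ousted
# incumbent), precinct, year, ward.
library(fixest)

est <- feols(
  spending ~ post:treated_ward:dosage | precinct + year,
  data = panel, cluster = ~ ward
)
summary(est)
```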

Racial Redistricting Study

Wards tend to be majority White, Hispanic, or Black, and racial balancing of wards seems to be expected. In the 2015 redistricting, identify majority-race census tracts that get redistricted (mostly) into wards where they are a minority (preferably controlled by an alderman who does not share their race). Do their menu funds decrease? Do their city services decrease?

First, count the number of such occurrences in the 2015 redistricting, by racial category: White -> Hispanic, Black -> White, etc.

Then, run a diff-in-diff study to see what happens.

E.g., a Black census tract gets redistricted to a Hispanic-controlled ward; what happens before/after relative to adjacent tracts that remained in the original ward?
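The counting step is simple once each tract carries its own majority race and its new ward's majority race. `tracts` and its columns are placeholders.

```r
# Sketch: count 2015 redistricting moves into a different-majority ward,
# broken out by racial category.
library(dplyr)

tracts |>
  filter(redistricted_2015,
         tract_majority_race != new_ward_majority_race) |>
  count(tract_majority_race, new_ward_majority_race, name = "n_tracts")
```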

Verify that spending totals align. If they don't align, why?

The PDFs contain accurate expenditure totals, but some of the totals I obtain by summing the line items are much lower. For example, in the collected dataset, the 49th ward's total expenditures are $365,839.80; in the 2005 PDF, the total is $1,320,000.00. What gives?
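A reconciliation table would make the gaps easy to scan. This assumes the PDF totals get transcribed into a `pdf_totals` table; all names are placeholders.

```r
# Sketch: compare summed line items against the PDF totals by ward and year.
library(dplyr)

items |>
  group_by(ward, year) |>
  summarise(summed = sum(amount), .groups = "drop") |>
  left_join(pdf_totals, by = c("ward", "year")) |>
  mutate(gap = pdf_total - summed) |>
  arrange(desc(abs(gap)))
```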

DiD Spending Model: Close Elections

There are 9 close runoffs in 2019 and another 9 in 2015, defining "close" as a <10% vote-share gap. I need to run a diff-in-diff on this, and probably an RDD to boot. The goal is to determine whether the precincts that supported an incumbent in a given election experience a drop in spending when that incumbent is booted out of office.
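A minimal version of the DiD, sketched with fixest under placeholder names: `close_panel` restricted to the close-runoff wards, `supported_incumbent` flagging precincts that backed the ousted incumbent, and `post` flagging years after the runoff.

```r
# Sketch: two-way fixed effects DiD on the close-runoff sample.
library(fixest)

did <- feols(
  spending ~ supported_incumbent:post | precinct + year,
  data = close_panel, cluster = ~ ward
)
summary(did)
```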

Writing improvements

  • Be clearer about how I construct samples
  • Be clearer about "fraction of observed spending" concept
  • Remove references to misallocation; talk about political favoritism instead

Replace current theoretical model with something with a little more panache

The current model is very basic. It would be advisable to create a more interesting model that better reflects the situation at hand.

E.g., citizens have some probability of having a concern, and utility depends on how many concerns are addressed and how much rent the politician extracts. The politician can spend time addressing concerns (to get reelected) or extracting rents. Political experience lowers the DWL of rent extraction and increases productivity in addressing concerns. Over time, the politician gets so good at extracting rents that they address only a minimal number of concerns.

Because a new politician's initial strategy is to extract maximal rents at maximum inefficiency, people choose the devil they know instead of the devil they don't.

R Merge Asserts

There's a dangerous amount of merging going on in the repo without appropriate asserts. Need to figure that out.
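One pattern that would work here, sketched with placeholder table names (`spend`, `lookup`): assert uniqueness on the many-to-one side before the join and assert the row count after it. Newer dplyr (>= 1.1) can also enforce this directly via `relationship = "many-to-one"`.

```r
library(dplyr)

stopifnot(!anyDuplicated(lookup$precinct_id))  # right side must be unique

n_before <- nrow(spend)
merged <- left_join(spend, lookup, by = "precinct_id")
stopifnot(nrow(merged) == n_before)            # the join must not fan out
```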

Gather spatial data using Census and Google APIs

Develop an R or Python script that takes the location information from the menu data, processes it so that the census and/or Google location APIs can handle it, and obtains geographic coordinates. Then match this data with Chicago Political Map data, especially voting precinct data.
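For the Census side, a minimal sketch hitting the public US Census geocoding endpoint; batching, rate limiting, and error handling are omitted, and the example address is only illustrative.

```r
# Sketch: query the free Census geocoder for one cleaned address and return
# (lon, lat), or NAs if there is no match.
library(httr)
library(jsonlite)

geocode_census <- function(address) {
  resp <- GET(
    "https://geocoding.geo.census.gov/geocoder/locations/onelineaddress",
    query = list(address   = address,
                 benchmark = "Public_AR_Current",
                 format    = "json")
  )
  matches <- fromJSON(content(resp, as = "text"))$result$addressMatches
  if (length(matches) == 0) return(c(lon = NA, lat = NA))
  c(lon = matches$coordinates$x[1], lat = matches$coordinates$y[1])
}

geocode_census("6430 N California Ave, Chicago, IL")  # illustrative address
```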

Fix missing locations in 2016-2022 data

Many off-menu expenditures from 2016-2022 seem to have no location attached to them (because the location is not present in the underlying data). I need to edit the menu_money_data_cleaner.R script to fix them.

DiD Spending Model: Retirement Due to Corruption

So, the close elections DiD estimator found robust null results.
This could be because close elections necessarily drive aldermen toward median-voter-style outcomes.
In a typical city, we'd be SOL, with no way to boot out entrenched incumbents exogenously.
Luckily, this is all happening in Chicago, so we can use investigation-forced retirements as potential treatments.

The idea is to take the population of "entrenched" incumbents (i.e., those who are unchallenged or win by extremely large margins) and compare them to the sample of aldermen forced into retirement by corruption allegations/investigations.

Quantifying geolocation error

I've observed roughly 24 "obvious errors" from my work on matching df_with_2_ands.csv to the 2003-2011 precinct map; an error rate of 25/1500 is roughly 1.7%. Not bad, but not great. Obvious errors are projects that span > 2 wards. Only 3 project ids in the dataset are from 2011 on.

We need a task that quantifies these errors across years and across wards. We need to show that this error rate is actually random.
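A sketch of what that task could compute, with `matched` and all column names as placeholders: flag project ids matched to more than 2 wards, tabulate the rate by year, and run a crude chi-square test of whether errors are spread randomly across wards.

```r
library(dplyr)

flags <- matched |>
  group_by(project_id, year) |>
  summarise(n_wards = n_distinct(ward), .groups = "drop") |>
  mutate(obvious_error = n_wards > 2)

# error rate by year
flags |> group_by(year) |> summarise(error_rate = mean(obvious_error))

# crude randomness check: is the error flag independent of the ward?
by_ward <- matched |>
  distinct(project_id, ward) |>
  left_join(flags, by = "project_id")
chisq.test(table(by_ward$ward, by_ward$obvious_error))
```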

Bernie Stone Geocoding Improvements

[Screenshot 20230812_151654: geomatched menu spending by year]

As seen here, we do a very good job of geomatching every year except 2005, where a $700k entry labeled "parks" is the culprit.

[Screenshot 20230812_153013: 2009 and 2010 geocoding gaps]

However, there is no reason we can't bump up the 2009 and 2010 numbers by manually fixing whatever small typos are causing the geocoding script to miss ~$500k.

Re-run BLP and discrete choice models using voting precinct data

Remove the old, drastically underpowered BLP models and replace them with modestly underpowered BLP models that exploit the voting precinct data.

There are 46 runoffs in the dataset. At 40 precincts per ward-year, that means a BLP model with ~1,800 observations rather than ~200: no longer drastically underpowered, now merely modestly underpowered.

Bernie Stone Case Study

“Well, I grew up in the 50th Ward and you know, God bless [the late former Ald.] Bernie Stone, may he rest in peace, but I remember crossing California going west, every street was resurfaced almost every year,” Ramirez-Rosa says. “They always had brand new lighting and then east of California, where he would lose the precincts consistently, I mean the streets were in shambles. Many people felt he was spending the bulk of the menu money west of California, where he was getting the bulk of the vote.” - Alderman Carlos Ramirez-Rosa

Let's test this explicitly. With the new data, I have Bernie Stone's menu allocations for his last six years in office. Divide the 50th ward into a grid: does funding drop discontinuously across California? (A sketch of the test follows.)
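A sketch of the grid test with sf. `ward50` (the ward polygon) and `menu_points` (geocoded expenditures with an `amount` column) are placeholders, and the longitude used for California Ave is a guess to be verified against a map before use.

```r
# Sketch: overlay a grid on the 50th ward, sum menu spending per cell, and
# compare cells east vs. west of California Ave.
library(sf)
library(dplyr)

grid <- st_sf(geometry = st_make_grid(ward50, cellsize = 0.005)) |>
  mutate(cell = row_number(),
         lon  = st_coordinates(st_centroid(geometry))[, "X"])

cell_totals <- st_join(menu_points, grid) |>
  st_drop_geometry() |>
  group_by(cell) |>
  summarise(total = sum(amount), .groups = "drop")

california_lon <- -87.695   # placeholder; check against the street grid

comparison <- grid |>
  st_drop_geometry() |>
  left_join(cell_totals, by = "cell") |>
  mutate(total = coalesce(total, 0),
         west  = lon < california_lon)

t.test(total ~ west, data = comparison)
```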

Grab Campaign Contribution Data

After addressing #38, if I find a positive effect, it could be due to an intertemporal tradeoff between current election odds and the next election. The basic theory would be that giving to current supporters induces campaign contributions, of which individual contributions matter most. Thus aldermen face a tradeoff -- they can give to their supporters to grow their war chest and roll it over to the next election, or they can target the median voter now to secure this year's win.

This would rationalize the (hypothetical) finding that swapping out secure aldermen has an effect while swapping out aldermen in competitive elections does not. You only care about tomorrow when you're confident you'll win today.

Motivating National Data

This won't be interesting unless we can tie it to some broader national trends. For the DiD portion at least, we can tie it to the "community input" movement.

Is there any research showing that the use of community input has been rising or gaining importance recently?

Try out WIP - Chicago Geolocation API

Sean MacMullen's comment is here:

Should you need to do more geocoding at some point, I've had relative success using the Chicago street center lines data set to find the coordinates of intersecting streets.

Dataset: https://data.cityofchicago.org/Transportation/Street-Center-Lines/6imu-meau
WIP API implementation by @kollerbud: https://github.com/smacmullan/chicago-participatory-urbanism/blob/main/chicago_participatory_urbanism/geocoder_api.py

It would be a good idea to update the current location code to use this API.
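The linked WIP implementation is in Python, so as an interim step the same street-center-lines idea can be sketched in R with sf: intersect the two named streets' center lines to recover the intersection point. The street-name column and the example street names are placeholders to check against the actual dataset schema.

```r
# Sketch: recover an intersection's coordinates from the Chicago street
# center lines dataset.
library(sf)

centerlines <- st_read("street_center_lines.geojson")   # placeholder path

intersection_point <- function(street_a, street_b) {
  a <- st_union(centerlines[centerlines$street_name == street_a, ])
  b <- st_union(centerlines[centerlines$street_name == street_b, ])
  st_intersection(a, b)
}

intersection_point("W DEVON AVE", "N CALIFORNIA AVE")   # illustrative names
```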
