uscensusbureau / census2020-das-2010ddp
2020 Census 2010 Demonstration Data Products Disclosure Avoidance System
Is there an Amazon Machine Image publicly available with the configuration used to generate the DDPs?
In the config file, there is a field named detailed_prop. What does it do, and what is it used for?
Thank you for posting such a wonderful collection of code for exploring the effects of differential privacy algorithms on census data. I'm trying to understand what is captured by the L1 metric that is applied to compare the post-processed data with the incoming data. There is some pretty good documentation in programs/validation.py but I wanted to ask these questions to make sure I'm understanding it as well as I can.
TIA....
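For readers following the L1 question, here is a minimal sketch of an L1 distance between two histograms of counts: the sum of absolute cell-by-cell differences. This assumes the usual textbook definition; whether programs/validation.py normalizes or aggregates the metric differently is not confirmed here, and the function name `l1_distance` is hypothetical.

```python
def l1_distance(original, protected):
    """Sum of absolute differences between two equal-length count vectors.

    Assumption: both inputs are flattened histograms with cells in the
    same order. This is an illustration, not the code in validation.py.
    """
    if len(original) != len(protected):
        raise ValueError("histograms must have the same number of cells")
    return sum(abs(a - b) for a, b in zip(original, protected))


# Example: three cells, with the protected counts off by 2, 1, and 0.
example_error = l1_distance([10, 0, 5], [8, 1, 5])
```

An L1 value of 0 would mean the post-processed counts match the input counts in every cell; larger values mean more total displacement across cells.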
What kind of when the noise is introduced at the top, national level? Isn't it all one big bucket?
The documentation in census2020-das-2010ddp/das_decennial/README.md says that there is a script, das_decennial/etc/setup_external, but the only thing I can find in das_decennial/etc/ is a README.md file.
Is there any good documentation for the best way to configure the code so it can find a way to dasexperimental?
Traceback (most recent call last):
  File "/usr/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/usr/lib/python3.6/http/client.py", line 1254, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1300, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1249, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 974, in send
    self.connect()
  File "/usr/lib/python3.6/http/client.py", line 946, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/usr/lib/python3.6/socket.py", line 724, in create_connection
    raise err
  File "/usr/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "das2020_driver.py", line 133, in <module>
    dashboard.das_log(mission_name + ' starting', extra={'start':'now()'})
  File "/home/pcw/Census/census2020-das-2010ddp/das_decennial/programs/dashboard.py", line 87, in das_log
    'instanceId': aws.instanceId()}, **extra}
  File "/home/pcw/Census/census2020-das-2010ddp/das_decennial/das_framework/ctools/aws.py", line 78, in instanceId
    return instance_identity()['instanceId']
  File "/home/pcw/Census/census2020-das-2010ddp/das_decennial/das_framework/ctools/aws.py", line 63, in instance_identity
    return get_url_json('http://169.254.169.254/latest/dynamic/instance-identity/document')
  File "/home/pcw/Census/census2020-das-2010ddp/das_decennial/das_framework/ctools/aws.py", line 57, in get_url_json
    return json.loads(get_url(url, **kwargs))
  File "/home/pcw/Census/census2020-das-2010ddp/das_decennial/das_framework/ctools/aws.py", line 53, in get_url
    with urllib.request.urlopen(url, context=context) as response:
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/usr/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 113] No route to host>
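The traceback ends in aws.instanceId() requesting the instance-identity document from 169.254.169.254, the EC2 instance metadata address, which explains the "No route to host" on a machine outside AWS. One possible workaround is a guarded lookup that falls back to a placeholder. This is only a sketch: `fetch_url`, `safe_instance_id`, and the "not-on-ec2" default are hypothetical names and are not part of das_framework/ctools/aws.py.

```python
import json
import urllib.request

# URL copied from the traceback above.
METADATA_URL = "http://169.254.169.254/latest/dynamic/instance-identity/document"


def fetch_url(url, timeout=2.0):
    """Fetch a URL as text, with a short timeout so off-AWS runs fail fast."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def safe_instance_id(fetch=fetch_url, default="not-on-ec2"):
    """Return the EC2 instance ID, or `default` when metadata is unavailable.

    URLError and socket timeouts are subclasses of OSError; malformed JSON
    raises ValueError; a document without the key raises KeyError.
    """
    try:
        return json.loads(fetch(METADATA_URL))["instanceId"]
    except (OSError, ValueError, KeyError):
        return default
```

Passing the fetcher as a parameter keeps the function testable without network access; a caller such as the dashboard logging code would need to be changed to use it.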
Running the program gives me this error:
FileNotFoundError: [Errno 2] No such file or directory: '4j/__init__.py'
when it is trying to make a release file. The --print_bom option also lists 4j and ark directories that are not in this repo.
Are they missing, or should I be getting them from somewhere else?
In the .ini file, these lines are in topdown order:
#budget in topdown order (e.g. County, Tract, Block Group, Block)
geolevel_budget_prop: 0.25,0.25,0.25,0.25
But these are in bottom up order. So the far right above corresponds to the far left below and vice-versa?
# Names of smallest to largest geocode (no spaces)
geolevel_names: Enumdist,County,State,National
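To make the ordering question concrete, here is a sketch of the pairing the question proposes: reverse the bottom-up geolevel_names list so it lines up with the top-down geolevel_budget_prop list. Whether the DAS code actually pairs the two settings this way is exactly what is being asked and is not confirmed here; the helper names are hypothetical.

```python
def split_csv(value):
    """Split a comma-separated config value into trimmed items."""
    return [item.strip() for item in value.split(",")]


def budget_by_geolevel(geolevel_names, geolevel_budget_prop):
    """Pair bottom-up level names with top-down budget proportions.

    Assumption (unconfirmed): the first proportion belongs to the largest
    geography, so the names list is reversed before pairing.
    """
    names_bottom_up = split_csv(geolevel_names)
    props_top_down = [float(p) for p in split_csv(geolevel_budget_prop)]
    if len(names_bottom_up) != len(props_top_down):
        raise ValueError("expected one budget proportion per geolevel")
    names_top_down = list(reversed(names_bottom_up))
    return dict(zip(names_top_down, props_top_down))


# Values taken from the .ini lines quoted above.
pairing = budget_by_geolevel(
    "Enumdist,County,State,National",
    "0.25,0.25,0.25,0.25",
)
```

With an equal split of 0.25 per level the pairing makes no numerical difference, which is why the ambiguity only matters once the proportions are changed.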
--
My understanding is that the Census Edited File is a format from within the census and it is not made public.
Is it possible to get a sample CEF file in order to inspect what it looks like?
My specific use case is this:
I have a bunch of reconstructions of a 2010 state that I would like to run through DDP. I was planning to build them as suggested by the format at the bottom of this file: https://github.com/uscensusbureau/census2020-das-e2e/blob/master/programs/reader/e2e_reader.py
Even a key to what these fields mean would be helpful, I think.
However, I don't understand where I would enter geographic data such as the Block Group a person is in.
I am familiar with the 1940s IPUMS data format, where the Household lines contain the geographic information.
When using the DAS E2E example AMI, I noticed that the documentation embedded in the CONFIG.ini file in das_decennial says that the privacy budget is split "in topdown order (e.g. County, Tract, Block Group, Block)". But the certificate produced at the end of the run says that it is split between Enumdist, County, State, and National. Which one is correct?
Is it correct to say that blocks that have a population of 0 are not treated with the DP mechanism?
It seems like the reader works by reading in the person lines from the input file, which means that 0 population blocks never get represented anywhere in the system. This also means that 0 population blocks do not get noised at all, and stay 0.
Is this the correct way to interpret the code, and if not, where are the 0 population blocks being accounted for?
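The concern can be illustrated with a small sketch: if block counts are built only from person records, a block with no persons never appears, whereas seeding the counts from a separate list of all blocks keeps it present with a count of 0. This is a toy illustration of the question, not the DAS reader; the function names and the idea of a separate block list are assumptions.

```python
from collections import Counter


def block_counts_from_persons(person_blocks):
    """Count persons per block using only the person records."""
    return Counter(person_blocks)


def block_counts_with_geography(person_blocks, all_blocks):
    """Count persons per block, keeping every known block (0 if empty)."""
    counts = Counter(person_blocks)
    return {block: counts.get(block, 0) for block in all_blocks}


# Three person records in blocks A and B; block C has no residents.
persons = ["A", "A", "B"]
from_persons = block_counts_from_persons(persons)
with_geography = block_counts_with_geography(persons, ["A", "B", "C"])
```

In the first result block C is absent, so nothing downstream could add noise to it; in the second it is present with a count of 0 and could be treated like any other block.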