Giter VIP home page Giter VIP logo

list-n's Introduction

EPA List of COVID-19 Disinfectants (List N)

Installation

brew install datasette sqlite-utils
pip3 install datasette-publish-vercel 
  or 
datasette install datasette-publish-vercel

Import data

cd Projects/Advocacy/list-N/list-N
sqlite-utils insert list-N.db listN list-N.csv --csv
    or
curl "https://cfpub.epa.gov/wizards/disinfectants/includes/queries.cfc?method=getDisData&Keyword=&RegNum=&ActiveIng=All&ContactTime=&UseSite=&SurfType="

(was: 
curl 'https://cfpub.epa.gov/giwiz/disinfectants/includes/queries.cfc?method=getDisData&Keyword=&RegNum=&ActiveIng=All&ContactTime=&UseSite=&SurfType=' | python transform.py | jq . | sqlite-utils insert disinfectants.db listN - --pk ID
)

Enable Full-Text Search

sqlite-utils enable-fts disinfectants.db listN 'Surface_type' 'Active_ingredient' 'Risk_level' 'Date_on_List_N'  'Company' 'Contact_time' 'Use_site' 'Product_name' 'Formulation_type' 'Follow_directions_for_this_virus' 'Why_on_List_N' 'EPA_reg_num' --create-triggers --tokenize=porter

Update column order

sqlite-utils transform disinfectants.db listN \
--column-order EPA_reg_num \
--column-order Risk_level \
--column-order Active_ingredient \
--column-order Product_name \
--column-order Company \
--column-order Use_site \
--column-order Surface_type \
--column-order Contact_time \
--column-order Formulation_type \
--column-order Follow_directions_for_this_virus \
--column-order Date_on_List_N \
--column-order Why_on_List_N \
--column-order ID

Publish locally

datasette disinfectants.db -m metadata.json --setting default_page_size 2000 --setting max_returned_rows 3000 --setting default_facet_size 35 --static static:static/  --template-dir templates/ --plugins-dir plugins -p 8001 -o
datasette disinfectants.db -m metadata.json \
--setting default_page_size 3000 \
--setting max_returned_rows 3000 \
--setting default_facet_size 35 -o \
--static static:static/ \
--template-dir templates/ \
--plugins-dir plugins

Publish to Vercel

Visit: https://vercel.com/download to get CLI tool.

Run: vercel login to login to Vercel, then you can do this:

datasette publish vercel disinfectants.db --project "list-n" --title "Disinfectants Used for Addressing COVID" --source "List N Tool COVID-19 Disinfectants" --source_url "https://cfpub.epa.gov/giwiz/disinfectants/index.cfm" --install datasette-vega --static static:static/ --metadata metadata.json --setting default_page_size 3000 --setting max_returned_rows 3000 --setting default_facet_size 35 --template-dir templates --plugins-dir plugins
datasette publish vercel disinfectants.db \
--project "list-n" \
--title "Disinfectants Used for Addressing COVID" \
--source "List N Tool COVID-19 Disinfectants" \
--source_url "https://cfpub.epa.gov/giwiz/disinfectants/index.cfm" \
--install datasette-vega \ 
--setting default_page_size 3000 \
--setting max_returned_rows 3000 \
--setting default_facet_size 35 \
--static static:static/ \
--template-dir templates/
--plugins-dir plugins/
--metadata metadata.json \

Utilities and Miscellaneous

sqlite-utils tables disinfectants.db --counts --columns
sqlite-utils analyze-tables disinfectants.db listN
sqlite-utils disable-fts disinfectants.db listN
open /Applications/DB\ Browser\ for\ SQLite.app disinfectants.db

Datasette Session Notes - https://docs.google.com/document/d/1f61st8AXtpXvjeHB3UlmUhSCG1Ddiwih9nr-nO8LTEY/edit

Margie's original notes - https://docs.google.com/document/d/1RHv_Twe7gzUMcfAeHZ-RlVQU-ZwshpEalZ4NmF9-ISk/edit Simon's original notes - https://docs.google.com/document/d/1Ck4Gopt8ssumGUjH1TeqASvHpFAPBJpTpDE_bvf4bCI/edit

sqlite> .schema listN
CREATE TABLE [listN] (
   [EMER_PATH] TEXT,
   [REGI_NUM] TEXT,
   [INST_VIRUS] TEXT,
   [COMPANY] TEXT,
   [USE_SITE] TEXT,
   [CONT_TIME] FLOAT,
   [ACTI_ING] TEXT,
   [USE_SURF] TEXT,
   [COMPANY_URL] TEXT,
   [DATE_ON_LIST_N] TEXT,
   [FORM_TYPE] TEXT,
   [ID] INTEGER PRIMARY KEY,
   [PROD_NAME] TEXT
);

Now:

CREATE TABLE [listN] (
   [Surface_type] TEXT,
   [Active_ingredient] TEXT,
   [Risk_level] TEXT,
   [Date_on_List_N] TEXT,
   [Company] TEXT,
   [Contact_time] FLOAT,
   [Use_site] TEXT,
   [Product_name] TEXT,
   [Active_ingredient] TEXT,
   [Formulation_type] TEXT,
   [Follow_directions_for_this_virus] TEXT,
   [Why_on_List_N] TEXT,
   [ID] INTEGER PRIMARY KEY,
   [EPA_reg_num] TEXT
);

Special URLs

Key documentation

list-n's People

Contributors

mroswell avatar kiplip avatar heemayl avatar

Watchers

 avatar James Cloos avatar  avatar

Forkers

heemayl

list-n's Issues

Understand templating system

/usr/local/Cellar/datasette/0.55/libexec/lib/python3.9/site-packages/datasette/templates

permissions_debug.html
messages_debug.html
index.html
patterns.html
_codemirror_foot.html
base.html
_table.html
row.html
database.html
table.html
_close_open_menus.html
_description_source_license.html
allow_debug.html
query.html
show_json.html
logout.html
error.html
_codemirror.html
_footer.html

How to maintain health data

active_ingredients_health_impact will be a table. Each active ingredient could have zero to many references. how to maintain it?

Implement foreign key detection

I'll have three tables, one of which is routinely updated, "ListN"

Second table signals if Active Ingredients are toxic or not. This will be incorporated directly into ListN, each time EPA updates "List N" So I won't want that table to display.

Third table could have some health information related to active ingredients, a one-to-many table, since an active ingredient may show up in the scientific literature in multiple places (Possible PubMed link...?)

Related issue is here: simonw/datasette#85
But my case is more complicated because
A) the field is an array
B) I want to use the whole name of the active ingredient and not the primary key to connect.. (?)

Remove even the notice of the hidden tables

simonw/datasette#129

 datasette/templates/database.html 
@@ -65,13 +65,19 @@ <h3>Query parameters</h3>
{% endif %}

{% for table in tables %}
{% if show_hidden or not table.hidden %}
<div class="db-table">
    <h2><a href="/{{ database }}-{{ database_hash }}/{{ table.name|quote_plus }}">{{ table.name }}</a></h2>
    <h2><a href="/{{ database }}-{{ database_hash }}/{{ table.name|quote_plus }}">{{ table.name }}</a>{% if table.hidden %}<em> (hidden)</em>{% endif %}</h2>
    <p><em>{% for column in table.columns[:9] %}{{ column }}{% if not loop.last %}, {% endif %}{% endfor %}{% if table.columns|length > 9 %}...{% endif %}</em></p>
    <p>{{ "{:,}".format(table.table_rows) }} row{% if table.table_rows == 1 %}{% else %}s{% endif %}</p>
    <p>{{ "{:,}".format(table.count) }} row{% if table.count == 1 %}{% else %}s{% endif %}</p>
</div>

Figure out how to properly rename column

sqlite> .schema listN
CREATE TABLE [listN] (
   [Why on List N] TEXT,
   [Registration number] TEXT,
   [Company URL] TEXT,
   [Formulation type] TEXT,
   [Contact time] FLOAT,
   [Active ingredients] TEXT,
   [Use site] TEXT,
   [Compnay] TEXT,
   [Follow directions for this virus] TEXT,
   [Date on List N] TEXT,
   [Product name] TEXT,
   [ID] INTEGER PRIMARY KEY,
   [Active ingredient] TEXT,
   [Surface type] TEXT
);
sqlite> alter table listN rename column Compnay to Company;
sqlite> .schema listN
CREATE TABLE [listN] (
   [Why on List N] TEXT,
   [Registration number] TEXT,
   [Company URL] TEXT,
   [Formulation type] TEXT,
   [Contact time] FLOAT,
   [Active ingredients] TEXT,
   [Use site] TEXT,
   "Company" TEXT,
   [Follow directions for this virus] TEXT,
   [Date on List N] TEXT,
   [Product name] TEXT,
   [ID] INTEGER PRIMARY KEY,
   [Active ingredient] TEXT,
   [Surface type] TEXT
);

Disable Full-Text Search

Disable Full-Text Search. Not necessary now that we've got everything on one page.

sqlite-utils disable-fts disinfectants.db listN

Rename all columns

CREATE TABLE [listN] (
[Why on List N] TEXT,
[Registration number] TEXT,
[Company URL] TEXT,
[Formulation type] TEXT,
[Contact time] FLOAT,
[Active ingredients] TEXT,
[Use site] TEXT,
"Company" TEXT,
[Follow directions for this virus] TEXT,
[Date on List N] TEXT,
[Product name] TEXT,
[ID] INTEGER PRIMARY KEY,
[Active ingredient] TEXT,
[Surface type] TEXT
);

CREATE TABLE [listN] (
[EMER_PATH] TEXT,
[REGI_NUM] TEXT,
[INST_VIRUS] TEXT,
[COMPANY] TEXT,
[USE_SITE] TEXT,
[CONT_TIME] FLOAT,
[ACTI_ING] TEXT,
[USE_SURF] TEXT,
[COMPANY_URL] TEXT,
[DATE_ON_LIST_N] TEXT,
[FORM_TYPE] TEXT,
[ID] INTEGER PRIMARY KEY,
[PROD_NAME] TEXT
);

Append Is_Safer field to JSON (or maybe is_Toxic field?)

might be helpful: https://www.kite.com/python/answers/how-to-check-for-multiple-substrings-in-a-string-in-python

UPDATE: created csv, which I used to generate two python lists in transform.py
https://github.com/mroswell/list-N/blob/main/data/ActiveIngredientsSafeOrNot.csv

The list below came from the facets, and included counts, but I didn't include those in the csv, because those counts are always changing, and can be accessed via SQL (except for the faceting issue...)
T = Toxic. S = Safer

T | Quaternary ammonium 251
S | Hydrogen peroxide 83
T | Sodium hypochlorite 75
S | Ethanol (Ethyl alcohol) 36
Peroxyacetic acid (Peracetic acid) 30
Phenolic 29
S | Isopropanol (Isopropyl alcohol) 22
Hypochlorous acid 21
S | Citric acid 17
Sodium chlorite 11
Chlorine dioxide 9
L-Lactic acid 8
Sodium dichloroisocyanurate 7
Thymol 7
Hydrochloric acid 5
Sodium chloride 5
Glycolic acid 4
Octanoic acid 4
Potassium peroxymonosulfate 4
Ammonium bicarbonate 3
Ammonium carbonate 3
Glutaraldehyde 3
Silver 3
Sodium carbonate peroxyhydrate 3
Dodecylbenzenesulfonic Acid 2
Dodecylbenzenesulfonic acid 2
PHMB 2
Peroxyoctanoic acid 2
Silver ion 2
Tetraacetyl ethylenediamine 2

Clean up dates

Remove " 00:00:00" from times. and if EPA won't do it, fix formatting of dates.

Disable FTS

Disable Full-Text Search. Not necessary now that we've got everything on one page.

sqlite-utils disable-fts disinfectants.db listN

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.