data-nasa-gov-frontpage's Introduction

data-nasa-gov-frontpage

Data.nasa.gov holds metadata harvested from more than 88 different NASA sites that share public data.

This front-end-only page serves as an easy-to-change front page for the system that hosts the catalog of datasets in data.nasa.gov (the catalog system itself is not on GitHub).

Contributing

If you find any factual errors or places where additions could be made, please open an issue or submit a pull request. We accept pull requests from the public.

Deployment

This page is live under https://data.nasa.gov.

It used to be live at https://nasa.github.io/data-nasa-gov-frontpage/ as a GitHub Pages site, but that address now redirects to https://data.nasa.gov as well.

The catalog itself is proxied in from a third-party SaaS.

After changes are made here, the code can be updated on the appropriate server using the update bash script. Look internally for a more detailed description of the workflow. Changes appear on data.nasa.gov only after the server has been updated.

Potentially Reusable Code Assets Leveraged by this Repository

  1. This page uses nasawds-2.0.7, which you can find here: https://github.com/nasa/nasawds. It is based on GSA's U.S. Web Design System. Both are open-source projects on GitHub.

  2. The footer is reused across several open-innovation pages.

  3. The page data_visualization.html contains a treemap that displays an aggregate visualization of the contents of the data.nasa.gov data catalog. The data is represented by rectangles scaled by the number of datasets. Each rectangle reflects a unique combination of source, category, and keyword. This data is extracted from the metadata in data.json.

The Treemap Data Visualization Page

Data Source:

The aggregate data visualization code depends on a JSON file that follows a valid data.json schema as defined by the open data project here: https://project-open-data.cio.gov/ and here https://project-open-data.cio.gov/v1.1/schema/.

Theoretically, you should be able to drop in any data.json from any federal agency. Your mileage may vary, however. The visualization may not look as nice if the structure is different. For example, if source is shown as the first breakdown and 95% of your datasets are from a single source, then that rectangle will take up the majority of the space and the rest of the treemap will be hard to read.

Data Processing:

data_processing.py is responsible for processing the data and can be found inside the data_processing_scripts folder. To run it, make sure your data source is in the same folder as the script and run python3 data_processing.py. This will produce processed_data.json, which is formatted as follows:

{
   "name": "dataset",
   "children": [
       {
           "name": <source>,
           "children": [
               {
                   "name": <category>,
                   "children": [
                       {
                           "name": <keyword>,
                           "value": <count>
                       }, ...
                   ]
               }, ...
           ]
       }, ...
   ]
}

NOTE: Category refers to "theme" in the original schema.
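
The nesting above can be illustrated with a minimal Python sketch. This is not the code in data_processing.py; it only shows one way records could be aggregated into that shape. The function name build_tree is hypothetical, and reading the source from publisher.name is an assumption (the actual script may use a different field). The theme and keyword fields come from the data.json v1.1 schema.

```python
from collections import Counter


def build_tree(datasets):
    """Aggregate dataset records into a source -> category -> keyword tree."""
    counts = Counter()
    for ds in datasets:
        # Assumption: the source is taken from publisher.name.
        source = ds.get("publisher", {}).get("name", "Unknown")
        for category in ds.get("theme", []):
            for keyword in ds.get("keyword", []):
                counts[(source, category, keyword)] += 1

    # Regroup the flat counts into nested dictionaries.
    nested = {}
    for (source, category, keyword), count in counts.items():
        nested.setdefault(source, {}).setdefault(category, {})[keyword] = count

    # Convert the nested dictionaries into the name/children/value layout.
    return {
        "name": "dataset",
        "children": [
            {
                "name": source,
                "children": [
                    {
                        "name": category,
                        "children": [
                            {"name": kw, "value": n} for kw, n in keywords.items()
                        ],
                    }
                    for category, keywords in categories.items()
                ],
            }
            for source, categories in nested.items()
        ],
    }


# Made-up sample records, for illustration only.
sample = [
    {"publisher": {"name": "NASA"}, "theme": ["Earth Science"], "keyword": ["ocean"]},
    {"publisher": {"name": "NASA"}, "theme": ["Earth Science"], "keyword": ["ocean", "ice"]},
]
tree = build_tree(sample)
```

In this sketch a dataset with several themes and keywords contributes one count to every (source, category, keyword) combination it matches.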

Duplicates:

If your source data lists the same source under more than one name, you may map the alternate name to the preferred one in duplicates.json, found inside the data_processing_scripts folder.

Example:

{
  "National Aeronautics Space Administration": "NASA",
  ...
}

This example will group both "National Aeronautics Space Administration" and "NASA" under the source name "NASA" in data_processing.py.
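
As a rough sketch of how such a mapping could be applied (the helper name canonical_source is hypothetical, and the mapping is inlined here rather than read from duplicates.json):

```python
import json

# Inline stand-in for the contents of duplicates.json.
duplicates = json.loads('{"National Aeronautics Space Administration": "NASA"}')


def canonical_source(name, duplicates):
    """Return the preferred name for a source, or the name unchanged."""
    return duplicates.get(name, name)


long_form = canonical_source("National Aeronautics Space Administration", duplicates)
short_form = canonical_source("NASA", duplicates)
```

Both calls yield the same source name, so the two spellings end up in one group.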

Ignoring Data:

If your source data has any source, category or keyword you want ignored, you may add it to ignoreData.json found inside the data_processing_scripts folder.

Example:

{
  "source": ["NASA"],
  "category": ["Earth Science", "Geospatial"],
  "keyword": []
}

This example will ignore all entries in which "NASA" is the source, and all entries in which "Earth Science" or "Geospatial" is the category. NOTE: Category refers to "theme" in the original schema.
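
A minimal sketch of the filtering rule, assuming an entry is dropped when any one of its source, category, or keyword appears in the matching list. The function name is_ignored is hypothetical and the configuration is inlined instead of being loaded from ignoreData.json.

```python
# Inline stand-in for the contents of ignoreData.json.
ignore = {
    "source": ["NASA"],
    "category": ["Earth Science", "Geospatial"],
    "keyword": [],
}


def is_ignored(source, category, keyword, ignore):
    """Return True if any part of the entry is on an ignore list."""
    return (
        source in ignore.get("source", [])
        or category in ignore.get("category", [])
        or keyword in ignore.get("keyword", [])
    )


dropped_by_source = is_ignored("NASA", "Aerospace", "rockets", ignore)
dropped_by_category = is_ignored("JPL", "Geospatial", "maps", ignore)
kept = is_ignored("JPL", "Aerospace", "rockets", ignore)
```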

Keyword Count Minimum:

keyword_count_threshold sets the minimum count a keyword must reach to be included in processed_data.json. It can be changed inside data_processing.py.
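
The effect of the threshold can be sketched as follows. The value 5 and the helper name apply_threshold are hypothetical, as are the sample counts; only the variable name keyword_count_threshold comes from data_processing.py.

```python
keyword_count_threshold = 5  # hypothetical value, for illustration only


def apply_threshold(keyword_counts, threshold):
    """Keep only keywords whose count is at least the threshold."""
    return {kw: n for kw, n in keyword_counts.items() if n >= threshold}


# Made-up counts.
counts = {"ocean": 12, "ice": 5, "aerosol": 2}
kept_keywords = apply_threshold(counts, keyword_count_threshold)
```

Raising the threshold reduces the number of small rectangles in the treemap, at the cost of hiding rarely used keywords.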

Data Visualization:

The code used to visualize the data is visualizations.js and can be found inside nasawds-2.0.7/js.

Expanding Acronyms to be Displayed in Treemap Legend:

In acronyms.json, found inside nasawds-2.0.7/json, you may select acronyms you wish to be expanded for the purpose of displaying them in the treemap legend. Each acronym must have a type (either source or category), and a name (the acronym's expansion).

Example:

{
  "NASA": { "type": "source", "name": "National Aeronautics Space Administration"},
  "GPM": {"type": "category", "name": "Global Precipitation Measurement"}
}
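
The lookup this file supports is implemented in JavaScript by swap_acronyms() in visualizations.js. The Python sketch below only illustrates the behavior described in this README (return the expansion if listed, otherwise the original string). It is not a port of the actual function, and passing the mapping as a second parameter is an assumption made to keep the example self-contained.

```python
# Inline stand-in for the contents of acronyms.json.
acronyms = {
    "NASA": {"type": "source", "name": "National Aeronautics Space Administration"},
    "GPM": {"type": "category", "name": "Global Precipitation Measurement"},
}


def swap_acronym(acronym, acronyms):
    """Return the expansion for an acronym, or the acronym itself if unlisted."""
    entry = acronyms.get(acronym)
    return entry["name"] if entry else acronym


expanded = swap_acronym("GPM", acronyms)
unchanged = swap_acronym("JPL", acronyms)
```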

Treemap Rectangle and Legend Key Links:

When a user clicks on either a treemap rectangle or a legend key, they are redirected to data.nasa.gov's data catalog page. Changing where a user is redirected can be done inside nasawds-2.0.7/js/visualizations.js, which sets both the treemap rectangle link and the legend key link.

Functions:

The create_treemap(<data>, <format>) function takes the contents of processed_data.json as the first argument, and the treemap tiling format (d3.treemapSquarify, d3.treemapBinary, d3.treemapSlice, d3.treemapDice, d3.treemapSliceDice) as the second. This function is responsible for rendering the treemap and appending it to the site.

The changeNesting(<data>) function takes the contents of processed_data.json as an argument and returns the same data with the grouping/nesting order changed from source -> category -> keyword to category -> source -> keyword.
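
The regrouping can be sketched in Python as below. The real changeNesting is JavaScript in visualizations.js and may differ in detail. The name change_nesting and the sample data are hypothetical.

```python
def change_nesting(data):
    """Regroup source -> category -> keyword as category -> source -> keyword."""
    regrouped = {}
    for source in data["children"]:
        for category in source["children"]:
            by_source = regrouped.setdefault(category["name"], {})
            # Keyword leaves are reused as-is, not copied.
            by_source.setdefault(source["name"], []).extend(category["children"])

    return {
        "name": data["name"],
        "children": [
            {
                "name": category,
                "children": [
                    {"name": source, "children": keywords}
                    for source, keywords in sources.items()
                ],
            }
            for category, sources in regrouped.items()
        ],
    }


# Made-up input in the processed_data.json layout.
data = {
    "name": "dataset",
    "children": [
        {"name": "NASA", "children": [
            {"name": "Earth Science", "children": [{"name": "ocean", "value": 2}]},
        ]},
        {"name": "JPL", "children": [
            {"name": "Earth Science", "children": [{"name": "ice", "value": 1}]},
        ]},
    ],
}
renested = change_nesting(data)
```

Here the two sources that share the "Earth Science" category end up as siblings under a single category node.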

The clean_data_treemap(<data>) function takes the contents of processed_data.json as an argument. It removes sources and categories that have no children, and swaps out the source and category acronyms listed in acronyms.json (found inside nasawds-2.0.7/json) using swap_acronyms(). The cleaned dataset is returned.
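
A Python sketch of the same two steps, pruning empty nodes and expanding acronyms, is given below. It assumes the source -> category -> keyword nesting and takes the acronym mapping as an explicit parameter. Both are assumptions for illustration, and the name clean_data is hypothetical. The actual implementation is the JavaScript function in visualizations.js.

```python
def clean_data(data, acronyms):
    """Drop empty sources and categories, and expand listed acronyms."""

    def expand(name):
        # Fall back to the original name when no expansion is listed.
        entry = acronyms.get(name)
        return entry["name"] if entry else name

    cleaned_sources = []
    for source in data["children"]:
        categories = [
            {"name": expand(cat["name"]), "children": cat["children"]}
            for cat in source["children"]
            if cat["children"]  # skip categories with no keywords
        ]
        if categories:  # skip sources left with no categories
            cleaned_sources.append(
                {"name": expand(source["name"]), "children": categories}
            )
    return {"name": data["name"], "children": cleaned_sources}


# Made-up input and acronym table.
acronyms = {"GPM": {"type": "category", "name": "Global Precipitation Measurement"}}
raw = {
    "name": "dataset",
    "children": [
        {"name": "NASA", "children": [
            {"name": "GPM", "children": [{"name": "rain", "value": 3}]},
            {"name": "Empty Category", "children": []},
        ]},
        {"name": "Empty Source", "children": []},
    ],
}
cleaned = clean_data(raw, acronyms)
```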

The swap_acronyms(<acronym>) function takes an acronym string and returns its expansion if listed in acronyms.json. If no expansion is found, the original acronym string is returned.

Other D3.js visualizations that will work with same processed data format:

Collapsible Tree

Circle Packing

Sunburst

Icicle

Cluster Dendrogram

data-nasa-gov-frontpage's People

Contributors

brianlobo, danielcrendon, darith27, evantayloryates, jmhorn2015, justingosses, open-nasa-robot, rubiojeffery

data-nasa-gov-frontpage's Issues

change how sign in is handled

The sign-in button is far enough down the page that it could be missed. We could move it back to the top right corner, but then it wouldn't have the related information text near it. Without that text, people get confused about how to sign up so that they can also add datasets.

Maybe we could have clicking on a new top right sign-in button take them to the lower sign-in button? Alternatively, it might bring up a pop-up with similar information to what is farther down the page?

I'm not sure what is the best way to do this. We definitely don't want people to sign up and think they can upload data with a normal account.

Code restructuring needed

Hi,
I went through the code and noticed that most of the JavaScript for the index.html page is inline in the HTML file, which is not good practice. I also noticed a console.log for a modal in the browser console and some inline button handlers. I forked the repo and will be making a pull request. I hope you will accept the changes. Thank you.

Rework the language and add links in this paragraph

Open-Innovation Program
Data.nasa.gov is the dataset-focused site of NASA's OCIO (Office of the Chief Information Officer) open-innovation program. There are also API.nasa.gov and Code.nasa.gov for APIs and code respectively. Open.nasa.gov is the central page for open-innovation sites and acts as a home for the Datanauts program, a public outreach program in which members of the public work with NASA datasets.
