harvardopendata / harvardopendata.github.io
Harvard's first open data catalog
Home Page: http://hodp.org
License: MIT License
Put on the About page. Also include their profile picture, name, year, what they're interested in, and what they've done on HODP.
Add stuff from here to our data csv
We have some duplication
note the tools listed here https://project-open-data.cio.gov
This thread is for discussion of which data sets should be included in launch
One example:
I've started it in the landingpage branch, but we need to add a Bootstrap thumbnail for each one.
Our bootstrap frontend doesn't look very sexy. Consider using a new one from StartBootstrap, WrapBootstrap, HTML5Up, etc.
If you want to take this on, comment here so we can figure out a new theme before you start implementing it!
This thread is to discuss license options for Harvard's Open Data website (not the underlying datasets).
For comparison, here is Data.gov's license: https://github.com/GSA/data.gov/blob/master/LICENSE.md
Here is a straw man for discussion:
Public Domain
We waive copyright and related rights in the work worldwide through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest. See CONTRIBUTING for more information.
GNU General Public License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Visit http://www.gnu.org/licenses/ to learn more about the GNU General Public License.
Other Information
In no way are the patent or trademark rights of any person affected by CC0, nor are the rights that other persons may have in the work or in how the work is used, such as publicity or privacy rights.
Unless expressly stated otherwise, the person who associated a work with this deed makes no warranties about the work, and disclaims liability for all uses of the work, to the fullest extent permitted by applicable law. When using or citing the work, you should not imply endorsement by the author or the affirmer.
Consider the dataset "Universal Harvard Events Calendar". It has multiple URLs so it doesn't work! Yet some datasets will have multiple URLs. We should enable support for that! Either have url1, url2, url3, etc. fields, or allow for arbitrarily many URLs separated by, say, a pipe, in the CSV file.
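The pipe-separated option could be sketched roughly as follows. This is only an illustration: the column names `title` and `url`, the `example.org` links, and the helper names `split_urls` and `load_datasets` are assumptions, not the project's actual CSV layout or code.

```python
import csv
import io

URL_SEPARATOR = "|"  # assumed delimiter, as floated in this thread


def split_urls(cell):
    """Split one CSV cell into a list of URLs, dropping empty pieces."""
    return [part.strip() for part in cell.split(URL_SEPARATOR) if part.strip()]


def load_datasets(csv_text):
    """Parse catalog rows; the single 'url' column becomes a 'urls' list."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["urls"] = split_urls(row.pop("url", "") or "")
        rows.append(row)
    return rows


# Hypothetical sample rows; the URLs are placeholders, not real dataset links.
SAMPLE = (
    "title,url\n"
    "Universal Harvard Events Calendar,"
    "https://example.org/a.ics|https://example.org/b.ics\n"
    "Single Source,https://example.org/only.csv\n"
)

if __name__ == "__main__":
    for dataset in load_datasets(SAMPLE):
        print(dataset["title"], dataset["urls"])
```

A single delimited column keeps the CSV header stable no matter how many URLs a dataset has, whereas url1, url2, url3 fields cap the count at whatever the header allows.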
This thread is for a discussion about Dataverse, CKAN, Socrata, and other possible solutions.
@capooti is a collaborator/core committer on the GeoNode project, now working on Harvard's WorldMap, which utilizes pyCSW, the same web catalog service CKAN uses to catalog/manage/harvest/serve metadata...
See slide deck:
https://github.com/DistributedOpenUnifiedGovernmentNetwork
Currently, once you set a category/term, there's no way to undo it besides visiting the page again. Add a way to clear it.
It gets really huge on small screens b/c fixed height
Hey @jdhe1120 — in trying to store CSS locally, I think we've excluded the icon font, because these icons aren't loading:
https://choosealicense.com/licenses/cc-by-4.0/
note that the code should still be under the MIT license, but data should be CC-BY.
I want to be sure that we're always #1 on the Google rankings for queries like "harvard open data". We're already doing a pretty good job, but we need to keep improving:
Ideas:
I would love to see syllabi as a target data set. Because Harvard does not have an open syllabus project, these would have to be gathered via opt ins ... or possibly by downloading the syllabi the Open Syllabus Project at Columbia U has been gathering off the Web. The OSP is also likely to be producing a useful schema; the people running it are pragmatic, not Schema Infinite Perfectionists.
Why would I love this so much? 1. Syllabi are an insanely useful resource for faculty creating new courses. 2. Encouraging open syllabi would result in a cross-university dataset that would be a gold mine for researchers seeking to understand the patterns of ed in this country and beyond. 3. It seems to me to be totally in line with Harvard's commitment to openness. 4. Harvard syllabi could be imported into the H2O project that treats them like playlists to be learned from and mashed up.
We need to find metadata for our datasets before we publish them, including a description of the dataset, where to download it, who published it, etc.
Everyone should find one dataset on this shared Google Doc, find the relevant metadata by poking around the internet, and fill in the rest of the dataset's row.
The metadata schema can be found on this thread.
If you're interested, you can find more potential datasets here. Or, if you have any more ideas about potential datasets, feel free to add information about them!
Post here if you have any questions or comments!
YAML is more flexible and powerful than CSV, and also easier for humans to read and write. It's a little harder to parse, but there's a library for that.
However, YAML is more space-intensive and not as well suited for huge collections of data as CSV. Harder to learn, too.
From the meeting yesterday we had the idea to create a Dataverse for the Open Data Project, store some data on it, and write a wrapper webapp (using whatever stack we want) that simply calls the Dataverse APIs behind the scenes, allowing users to call APIs (which in turn call the Dataverse APIs) or download files directly from the Dataverse.
I think the benefit of this is that Dataverse contains lots of useful functionality, and with a wrapper we can add some useful features on top of that.
What's everyone think? If this sounds interesting, I can throw together a quick proof-of-concept.
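As a starting point for that proof-of-concept, here is a minimal sketch of the wrapper idea in Python. It assumes the Dataverse Search API endpoint `/api/search` with the `q`, `type`, `subtree`, and `per_page` parameters, and a response of the form `{"data": {"items": [...]}}`; both should be checked against the Dataverse API guide. The collection alias `harvardopendata` and the helper names are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://dataverse.harvard.edu"  # assumed Dataverse host
COLLECTION = "harvardopendata"  # assumed collection alias


def build_search_url(query, per_page=10):
    """Build a Search API URL restricted to datasets in our collection."""
    params = {
        "q": query,
        "type": "dataset",
        "subtree": COLLECTION,
        "per_page": per_page,
    }
    return f"{BASE_URL}/api/search?{urlencode(params)}"


def simplify(payload):
    """Reduce a search response to the few fields our own API would expose."""
    items = payload.get("data", {}).get("items", [])
    return [
        {
            "title": item.get("name"),
            "url": item.get("url"),
            "description": item.get("description"),
        }
        for item in items
    ]


def search(query, per_page=10):
    """Call Dataverse behind the scenes and return simplified results."""
    with urlopen(build_search_url(query, per_page)) as response:
        return simplify(json.load(response))
```

Keeping `build_search_url` and `simplify` separate from the network call means the wrapper's own behavior can be tested without hitting Dataverse.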
Instead of downloading Font Awesome, D3, jQuery, etc. individually, we can use a tool like Yarn to automatically download them and keep them updated for us.
pforzheimer-house.jpg
may be useful
This thread can be for a discussion of the license options for the underlying datasets listed on Harvard's Open Data website.
Would suggest 3 popular options, with the default and recommended choice being CC0.
Brainstorming
Like in data.gov, let people search by text, filter by category and datatype, etc.
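To make the brainstorm concrete, text search plus category and datatype filters could look something like this sketch. The field names (`title`, `description`, `category`, `datatype`) and the sample rows are made up for illustration and are not the catalog's actual schema or contents.

```python
def matches(dataset, text="", category=None, datatype=None):
    """Return True if a dataset passes every filter that is set."""
    haystack = " ".join(
        [dataset.get("title", ""), dataset.get("description", "")]
    ).lower()
    if text and text.lower() not in haystack:
        return False
    if category and dataset.get("category") != category:
        return False
    if datatype and dataset.get("datatype") != datatype:
        return False
    return True


def search_catalog(datasets, text="", category=None, datatype=None):
    """Filter the catalog; leaving a filter empty or None means 'no filter'."""
    return [d for d in datasets if matches(d, text, category, datatype)]


# Hypothetical catalog rows, for demonstration only.
CATALOG = [
    {"title": "Course Enrollment", "description": "Enrollment by term",
     "category": "academics", "datatype": "csv"},
    {"title": "Dining Menus", "description": "Daily menus",
     "category": "campus life", "datatype": "json"},
    {"title": "Course Evaluations", "description": "Survey results",
     "category": "academics", "datatype": "json"},
]

if __name__ == "__main__":
    for hit in search_catalog(CATALOG, text="course", datatype="json"):
        print(hit["title"])
```

Because an unset filter simply matches everything, clearing a filter in the UI is just a matter of calling the search again with that argument left out.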
Here is our current mission statement, any recommendations for it?
The goal of the Harvard Open Data Project (HODP) is to leverage open data to foster community, efficiency, and student innovation. Making data public and easily accessible allows us all to unlock its potential. Data-driven progress unites people, organizations, and departments as we all try to make daily life better. Aggregating, maintaining, and publicizing open data has been and will continue to be a global trend, and we want Harvard to be at its forefront. Our goal is to give that progress a home with centralization of available data, integration with existing systems, and showcases of data-inspired products.
To solicit general UX feedback or feedback on what datasets should be included in Droid at launch or after (#8, #10), it'd be great to partner with Boston's new library project on open data: http://www.cityofboston.gov/doit/knight.asp
For reference: Pittsburgh has similarly partnered with the University of Pittsburgh on a regional open data catalog: http://ucsur.pitt.edu/programs/urban-regional-analysis/regional-data-center/
See also these engagement strategies:
This thread is for a discussion about metadata standards. Would suggest starting with a global standard like DCAT and reducing it to a lightweight set of required fields (e.g. title, description, keywords, point of contact name, point of contact email, URL, license).
Here is the current Data.gov schema: https://project-open-data.cio.gov/v1.1/schema/
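A lightweight required-field check along those lines might look like the sketch below. The key names (`contact_name`, `contact_email`, and so on) are an assumed spelling of the fields suggested in this thread, not DCAT or Project Open Data property names, and the sample record is hypothetical.

```python
# Assumed key names for the required fields suggested in this thread.
REQUIRED_FIELDS = (
    "title",
    "description",
    "keywords",
    "contact_name",
    "contact_email",
    "url",
    "license",
)


def is_blank(value):
    """Treat None, whitespace-only strings, and empty lists as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    return len(value) == 0


def missing_fields(record):
    """List the required fields that a metadata record has not filled in."""
    return [field for field in REQUIRED_FIELDS if is_blank(record.get(field))]


if __name__ == "__main__":
    # Hypothetical record with a few fields left out on purpose.
    record = {
        "title": "Example Dataset",
        "description": "  ",
        "keywords": ["example"],
        "url": "https://example.org/data.csv",
        "license": "CC0-1.0",
    }
    print(missing_fields(record))
```

Running a check like this before publishing would flag incomplete rows early, while leaving room to adopt more of a full standard later.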
alphadata.fas.harvard.edu
data.harvard.edu
Emma found a cool one that used our faculty diversity data
Right now there's lots of whitespace and very little text. Let's see if we can make these look nicer with pictures, better labeling, or a more efficient use of all that space.
Also the "download" buttons don't always make that much sense so we should change them to "view" or something.
We have several more datasets we need to import to our catalog!
Attachments: https://dataverse.harvard.edu/dataverse/harvardopendata | https://docs.google.com/a/college.harvard.edu/spreadsheets/d/1_bn6tBtj0MqndkhQWxB3vPpZYQdmTvrVaccF-YHzEK8/edit?usp=sharing