ood's Introduction

โ— This repository has moved to https://github.com/ocaml/v3.ocaml.org-server. Please open issues and PRs there.

Typed and Versioned Data for OCaml.org v3

Status: Not yet open for contributions. Contact @avsm.

This repository contains data for the OCaml.org website along with a suite of tools for managing that data. In particular:

  • data: stores all of the data. There are two kinds of data: those stored as YAML files and those stored in a Jekyll-style format (a YAML metadata section followed by a Markdown body). In addition, the tutorials are written using mdx to ensure they stay up to date.
  • src: contains the code for three separate tools.
    • ood: is almost exclusively OCaml modules generated by parsing and slightly modifying the data stored in this repository.
    • ood-gen: contains a suite of CLI tools for generating, parsing and fetching data. For example, ood-gen/bin/lint.ml reads all of the different items in data and ensures they are correct. This tool is run whenever you run make test.
    • ood-preview: contains a simple dream server to provide a playground for experimenting with HTML rendering of the data.

For more information about the ocaml.org site, please see the main repository at https://github.com/ocaml/v3.ocaml.org and the server at https://github.com/ocaml/v3.ocaml.org-server.

If you wish to contribute, we have some contributing documents and plenty of issues.

Current OCaml Version: 4.10.2. For the mdx tests to be consistent (for example, some list the functions available from the List module), you should only run them with the version of OCaml this repository is currently using.

ood's People

Contributors

avsm, f-ludlam, guptadiksha307, kayceesrk, maiste, patricoferris, rdavison, shreyaswikriti, tmattio

ood's Issues

Default collection type?

ReScript suggests using array as the default collection type. Some domain types in this project use list as the default collection type. I assume that we will use the OCaml convention of using list throughout?

Generate OCaml modules containing ood data

Currently, ood only provides types for the different data stored in the repository. This has a few consequences:

  • The data has to be read and processed by the user. This is particularly problematic as we are using two different processors: omd in ood-preview to test that everything is ok, and another JS library (?) in v3.ocaml.org. This discrepancy opens the door for inconsistencies between the data that was QAed during the import and the data available in v3.
  • Errors in the data are only caught at runtime.
  • The added complexity on the user side of working with the different processors and the filesystem is huge.

I propose that we write a preprocessor that will output OCaml modules containing the data available in data/.
The output module will not have any dependency, so it will be usable, regardless of the setup, and the data will be consistent with all of the projects that use ood.
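
As a rough illustration of the proposal above, here is a minimal sketch of such a preprocessor. It assumes the data has already been parsed into records; the `book` type, its fields, and the function names are hypothetical and not taken from the repository.

```ocaml
(* Hypothetical record standing in for one parsed entry of data/. *)
type book = { title : string; authors : string list }

(* Render one record as an OCaml expression.
   %S prints a string as an escaped OCaml string literal. *)
let book_to_ocaml (b : book) : string =
  let authors =
    b.authors |> List.map (Printf.sprintf "%S") |> String.concat "; "
  in
  Printf.sprintf "{ title = %S; authors = [ %s ] }" b.title authors

(* Produce the source text of a module that depends on nothing
   but the OCaml standard library. *)
let generate_module (books : book list) : string =
  let items = books |> List.map book_to_ocaml |> String.concat ";\n    " in
  Printf.sprintf
    "type t = { title : string; authors : string list }\n\n\
     let all : t list =\n  [ %s ]\n"
    items

let () =
  print_string
    (generate_module
       [ { title = "Example Book"; authors = [ "First Author"; "Second Author" ] } ])
```

Because the generated file is plain OCaml, any malformed entry would surface when the output is compiled rather than at runtime.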

Import Releases

Import the ocaml.org freeform releases into a structured collection.

Remove NetlifyCMS

I don't think we're going to go for this approach for now, and the extra library and values in modules are just getting in the way of things. We can easily reconstruct any of this information if we need it later on by doing an archaeological dig of the repository.

Add book links as a metadata

Currently the books have any external links (say to Amazon or O'Reilly) listed at the bottom of the markdown. As suggested by @tmattio in #21, a better strategy would be to add these to the metadata. The simplest approach would be to add:

type t = {
  ...
  links: string list;
}

We could also use variants to distinguish things like `Amazon | `OReilly but that might be overkill seeing as we could probably extract it from the URL itself.
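
To illustrate the "extract it from the URL" idea, here is a minimal sketch using only the standard library. The `vendor` type, the `` `Other `` fallback, the helper names, and the substring matching rule are all assumptions made for this example.

```ocaml
(* Hypothetical classification; `Amazon and `OReilly follow the text above,
   `Other is an assumed fallback. *)
type vendor = [ `Amazon | `OReilly | `Other ]

(* Return the host part of a URL, e.g. "www.example.org" for
   "https://www.example.org/path". *)
let host_of_url (url : string) : string =
  let without_scheme =
    match String.index_opt url ':' with
    | Some i when i + 2 < String.length url && String.sub url i 3 = "://" ->
        String.sub url (i + 3) (String.length url - i - 3)
    | _ -> url
  in
  match String.index_opt without_scheme '/' with
  | Some i -> String.sub without_scheme 0 i
  | None -> without_scheme

(* Naive substring search; sufficient for short host names. *)
let contains ~(sub : string) (s : string) : bool =
  let n = String.length sub and m = String.length s in
  let rec go i = i + n <= m && (String.sub s i n = sub || go (i + 1)) in
  go 0

(* Guess the vendor from the host name only. *)
let vendor_of_url (url : string) : vendor =
  let host = String.lowercase_ascii (host_of_url url) in
  if contains ~sub:"amazon." host then `Amazon
  else if contains ~sub:"oreilly." host then `OReilly
  else `Other
```

With something like this, `links : string list` could stay as plain strings in the metadata and the vendor would be derived only at rendering time.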

Import Books

Import all of the OCaml books from ocaml.org: https://github.com/ocaml/ocaml.org/blob/master/site/learn/books.md

This will need a new type, something along the lines of:

type t = {
  title : string;
  description: string;
  authors: string list;
  language: string;
  published: string option; (* Estimate of publish date *)
  isbn: string option;
}

We might want to make these "collections" à la tutorials to allow more free form description with links, markdown things like emphasis or bullet point lists etc.

Missing tutorials

While working on the preview of the tutorials, I noticed there might be some missing tutorials. In particular, Up and Running seems to be missing.

Create NPM package

As the primary goal of this library is to be a repository containing both the data for the ocaml.org project and a suite of tools for managing that data, it should be easy to integrate that data. We should begin by adding a package.json file and ensuring that this project can be added as a dependency of ocaml.org.

Import rest of the workshops

If someone is keen it would be good to get the important information from the other workshops into ood now that #48 is merged which lays out the general type information and structure of the files which can be copied. The workshops can be found here: https://github.com/ocaml/ocaml.org/tree/master/site/meetings/ocaml

Concretely, this requires adding more markdown files, following the structure of data/workshops, for the different years that the OCaml workshop has been running, doing your best to extract the information from ocaml.org (link above) into this structured format. Finding the links to papers, videos etc. is also very helpful, and these can be added directly into the metadata section of the markdown. Any extra content can simply be added as markdown.

Add non-empty list type?

Some of the domain type fields could make use of a non-empty list type, which might also be enforceable with netlify. Let's consider whether to introduce such a type and whether to implement the type locally within this project.
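
For discussion, a locally implemented non-empty list could look roughly like the following sketch. The module name and its functions are hypothetical; the representation (a head element plus an ordinary list) is one common choice, not a decision made in this project.

```ocaml
module Non_empty_list : sig
  type 'a t

  val make : 'a -> 'a list -> 'a t
  val of_list : 'a list -> 'a t option
  val to_list : 'a t -> 'a list
  val hd : 'a t -> 'a
end = struct
  (* At least one element is guaranteed by construction. *)
  type 'a t = 'a * 'a list

  let make x xs = (x, xs)

  (* Conversion from a plain list fails only on the empty list. *)
  let of_list = function [] -> None | x :: xs -> Some (x, xs)

  let to_list (x, xs) = x :: xs

  (* Total, unlike List.hd, because the head always exists. *)
  let hd (x, _) = x
end
```

A field such as a list of authors could then be declared as `string Non_empty_list.t`, with `of_list` acting as the validation step when the data is parsed.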

Handling large media files

This topic has been mentioned in passing in previous PRs. Consider implementing a new server for image serving called image.ocaml.org using a system like thumbor. This will reduce the amount of data that needs to be transferred between ood and v3 when v3 updates its ood copy, and it will also allow v3 to request scaled images on the fly.

Images for tutorials

The images for the tutorials need to be imported. From what I've seen, only functionnal_programming uses an image, along with the missing Up and Running tutorial reported in #28.

Uniform date representation

Currently we have quite a few dates stored in the metadata. Should we provide a better representation for these than string? Or, to keep things simple, we could keep them as strings but ensure the strings are rfc3339 formatted.

Some examples include:

  • The year in the watch.ocaml.org data could just be the actual retrieved value as these are already in the correct format.
  • The tutorials already use the rfc3339 format indicating when they were last checked.
  • The papers have a year, and for some of them we know the month too. We could convert these to rfc3339 and always round down to the earliest moment of the known period (i.e. 1st day of the month, or January if only the year is known, at 00:00:00).

What do you think @tmattio ?
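
A minimal sketch of the rounding rule for papers, assuming dates are kept as strings. The `partial_date` type and the function name are hypothetical, and using UTC (the `Z` suffix) is an assumption, since the issue does not specify a time zone.

```ocaml
(* Hypothetical input shape: a year, optionally refined with a month. *)
type partial_date = Year of int | Year_month of int * int

(* Round down to the first instant of the known period and format it as
   an rfc3339 string. UTC is assumed here. *)
let to_rfc3339 (d : partial_date) : string =
  let year, month =
    match d with
    | Year y -> (y, 1)
    | Year_month (y, m) when m >= 1 && m <= 12 -> (y, m)
    | Year_month (_, m) ->
        invalid_arg (Printf.sprintf "month out of range: %d" m)
  in
  Printf.sprintf "%04d-%02d-01T00:00:00Z" year month
```

For example, `to_rfc3339 (Year 2020)` gives `"2020-01-01T00:00:00Z"` and `to_rfc3339 (Year_month (2019, 9))` gives `"2019-09-01T00:00:00Z"`.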

Import academic data

This issue is for adding data about OCaml's use in teaching, primarily at universities.

A suggested type would be:

(* Academic_institution.ml (or some name like that...) *)

type location = { long: float; lat: float } (* In case we want to put it on a map ? *)

type t = {
  name: string;               (* name of the institution *) 
  description: string;        (* short description of the institution *)
  link: string;               (* Some link to the course or inst. *) 
  location : location option  (* optional location of inst. *)
}

The data we need to import can be found here https://github.com/ocaml/ocaml.org/blob/master/site/learn/teaching-ocaml.md

Add industrial user logo meta data?

It would be nice if each industrial user entry indicated whether the logo included the name of the company or not. This would allow the rendering to conditionally add the name when displaying the logo.
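
One possible shape for this, as a sketch only: the record below is hypothetical (the existing industrial user type is not shown here), and `logo_includes_name` is an assumed field name.

```ocaml
(* Hypothetical, reduced version of an industrial user entry. *)
type industrial_user = {
  name : string;
  logo : string; (* path to the logo image *)
  logo_includes_name : bool; (* true if the company name is part of the logo *)
}

(* The text to render next to the logo, if any. *)
let caption (u : industrial_user) : string option =
  if u.logo_includes_name then None else Some u.name
```

The rendering code would then only need to check `caption` to decide whether to print the company name.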

Add link checker?

Consider checking that links point to valid URLs that are actively responding with a proper HTTP status code, either as part of Netlify CMS edits or as part of the data linting process.

NetlifyCMS string list doesn't support whitespaces

Currently the default NetlifyCMS widget for lists and strings only allows comma-separated values but crucially the strings cannot contain spaces. This is problematic for author/people lists which need spaces. There are two workarounds:

  1. Extend lib_netlify to support custom widgets and use decaporg/decap-cms#4646
  2. Turn an author or person into an object with a name field. This seems the best and easiest solution and would leave the door open for adding extra information, for example an optional affiliation field for authors in papers.yml.

Resolving this issue with (2) would also be a nice experiment in the workflow of modifying existing data that is currently being used by v3, updating type definitions and porting the changes back to the web front-end.
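
As a sketch of what option (2) could mean on the type side, assuming authors are currently a `string list`: the `person` and `paper` records and the migration helper below are hypothetical names, not the project's actual definitions.

```ocaml
(* Hypothetical replacement for a bare author string. *)
type person = {
  name : string; (* may contain spaces, e.g. "First Last" *)
  affiliation : string option; (* optional extra information *)
}

(* Hypothetical, reduced paper entry using the new author type. *)
type paper = { title : string; authors : person list }

(* Migrate existing data: each author string becomes a person
   with no affiliation yet. *)
let migrate_authors (names : string list) : person list =
  List.map (fun n -> { name = String.trim n; affiliation = None }) names
```

Existing entries could be converted mechanically with `migrate_authors`, and affiliations filled in later where they are known.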

List Kyoto University on the academic page

The great news is that OCaml is already in use at Kyoto University Undergraduate Course Program of Computer Science, and listing in v3 could be a good road sign for some students. That said, there are a few things to do before we add them.

Import the rest of the meetings

As part of #23 we need to bring in all of the data associated with meetings and events:

In fact I think "meetings" and "events" are the same, and here we have events. The problem is that the individual meeting pages (where they exist) tend to be very different across the board. For example, the OCaml Workshop pages have a fairly straightforward structure in that they contain papers, presentations, organising committee etc. They can be found here: https://github.com/ocaml/ocaml.org/tree/master/site/meetings/ocaml

Other meetings are simply a link to a meetup page. They don't necessarily need all of the data that the OUD ones do. In addition to this a lot of that data can be derived simply from the tags and years of the papers.yml entries and the videos.yml entries by filtering for the ocaml-workshop and the year 2020 for instance.
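
The filtering described above can be sketched as follows. The `paper` record and its fields are assumptions about what a papers.yml entry looks like once parsed; only the ocaml-workshop tag and the year 2020 come from the text.

```ocaml
(* Hypothetical, reduced shape of a parsed papers.yml entry. *)
type paper = { title : string; year : int; tags : string list }

(* Select the papers that belong to one edition of a meeting. *)
let for_meeting ~(tag : string) ~(year : int) (papers : paper list) :
    paper list =
  List.filter (fun p -> p.year = year && List.mem tag p.tags) papers

let () =
  let papers =
    [ { title = "Paper A"; year = 2020; tags = [ "ocaml-workshop" ] };
      { title = "Paper B"; year = 2019; tags = [ "ocaml-workshop" ] };
      { title = "Paper C"; year = 2020; tags = [ "other" ] } ]
  in
  for_meeting ~tag:"ocaml-workshop" ~year:2020 papers
  |> List.iter (fun p -> print_endline p.title)
```

The same function would apply to videos.yml entries if they carry the same tag and year fields.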

The meetings probably have enough additional content to be represented as a collection in a folder rather than as a single yaml file as they are currently.

Perhaps the OUD entries just need a relation to papers and videos to make that link explicit? @tmattio any thoughts on the best way to approach this? Maybe we keep an events.yml for tracking meetups and the like that don't need their own individual page on the front-end side and then generate a meetings/en/{meeting}.md collection for primarily the OCaml workshop pages?

Consider implementing ood-fixtures repo

As the amount of data grows in this repository, developing on v3 and deploying PR previews will continue to get slower (once we stop vendoring ood). Consider implementing a companion repository to ood, ood-fixtures, which has the same structure as ood, but only contains a small, fixed set of fixture data, sufficient to exercise the website. Ood-fixtures could be used in development and PR builds, while ood could be used in production deploys. Any changes to types would have to be duplicated in ood and ood-fixtures, but I expect the vast majority of changes to be data additions.

The fixture generation functions and the shell script to update ood-fixtures could reside in ood.

Consider migrating preview to front end

In the long run, it might be easier to use the front end directly to preview the data while editing. If all contributors are using the front end in their editing workflow, then the integrated ood and front end developer experience will get more attention. The added attention will result in smoothing out and speeding up the overall workflow - more integrated watchers, nicer git workflow, etc. It may even result in pushing for enhancements to the ocurrent pipelines for PR reviews.

I assume the main hesitation about using the front end for data previews is related to new data sets. When creating a new data set, the page design may not be ready. In such a case, it seems reasonable to just present the data in a page with no real design, using one of the simpler components in the component library, until the design is ready.
