ocaml / ood

OCaml.org v3 data repository

License: Other

ood's Introduction

❗ This repository has moved to https://github.com/ocaml/v3.ocaml.org-server. Please open issues and PRs there.

Typed and Versioned Data for OCaml.org v3

Status: Not yet open for contributions. Contact @avsm.

This repository contains data for the OCaml.org website along with a suite of tools for managing that data. In particular:

data: stores all of the data. There are two kinds of data: those stored as YAML files and those stored in a Jekyll-style format (a metadata section of YAML followed by a body of Markdown). In addition, the tutorials are written using mdx so that they stay up to date.
src: contains the code for three separate tools.
- ood: consists almost exclusively of OCaml modules generated by parsing and slightly modifying the data stored in this repository.
- ood-gen: contains a suite of CLI tools for generating, parsing and fetching data. For example, ood-gen/bin/lint.ml reads all of the different items in data and ensures they are correct. This tool is run whenever you run make test.
- ood-preview: contains a simple Dream server that provides a playground for experimenting with HTML rendering of the data.
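
As a rough illustration of the Jekyll-style format described above, the sketch below splits a document into its YAML metadata and Markdown body. The function name `split_front_matter` and the `---` delimiters are assumptions made for this example; this is not the parser used by ood-gen.

```ocaml
(* Hypothetical helper, not taken from the ood sources: split a
   Jekyll-style document into (metadata, body). Assumes the metadata
   section is delimited by lines containing exactly "---". *)
let split_front_matter (doc : string) : (string * string) option =
  match String.split_on_char '\n' doc with
  | "---" :: rest ->
    (* Collect metadata lines until the closing "---" delimiter. *)
    let rec go acc = function
      | [] -> None (* no closing delimiter found *)
      | "---" :: body ->
        Some (String.concat "\n" (List.rev acc), String.concat "\n" body)
      | line :: tl -> go (line :: acc) tl
    in
    go [] rest
  | _ -> None (* the document does not start with a metadata section *)

(* Made-up example document. *)
let example = "---\ntitle: Up and Running\n---\nSome *markdown* body."
```

The metadata string would then be handed to a YAML parser and the body to a Markdown processor; both steps are left out here.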

For more information about the ocaml.org site, please see the main repository at https://github.com/ocaml/v3.ocaml.org and the server at https://github.com/ocaml/v3.ocaml.org-server.

If you wish to contribute, we have some contributing documents and plenty of issues.

Current OCaml version: 4.10.2. For the mdx tests to be consistent (for example, some of them list the functions available from the List module), you should only run them with the version of OCaml this repository is currently using.

ood's People

patricoferris avsm tmattio shreyaswikriti guptadiksha307 f-ludlam kayceesrk

ood's Issues

Default collection type?

ReScript suggests using array as the default collection type. Some domain types in this project use list as the default collection type. I assume that we will follow the OCaml convention of using list throughout?

Curated package lists and explanations

Incorporating something like Ecosystem from https://ocamlverse.github.io/

Import remaining data from teaching-ocaml.md

The page contents after the RESOURCES section have not been imported, as mentioned in issue #24.
TODO: import

Suggested books
Teaching tools
OCaml installation
Tutorials and exercises
Mailing list
Import content from https://github.com/ocaml/ocaml.org/blob/master/site/learn/teaching-ocaml.md

Import Successes

Import the OCaml successes from https://github.com/ocaml/ocaml.org/blob/master/site/learn/success.md -- this should most likely be collections like the tutorials where the markdown body is an explanation, the meta data could look like:

type t = {
  title : string;
  description : string;
  company : string option; 
  tags : string list; (* Could be academic, industry etc. ? *)
}

Generate OCaml modules containing ood data

Currently, ood only provides types for the different data stored in the repository. This has a few consequences:

The data has to be read and processed by the user. This is particularly problematic because we are using two different processors: omd in ood-preview to test that everything is OK, and another JS library (?) in v3.ocaml.org. This discrepancy opens the door to inconsistencies between the data that was QA'd during the import and the data available in v3.
Errors in the data are only caught at runtime.
The added complexity on the user side of working with the different processors and the filesystem is huge.

I propose that we write a preprocessor that will output OCaml modules containing the data available in data/.
The output module will not have any dependencies, so it will be usable regardless of the setup, and the data will be consistent across all of the projects that use ood.
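
A minimal sketch of what such a preprocessor could emit, assuming a hypothetical two-field record `item`; the real types, and the step that reads the files in data/, are not shown. The idea is only that data is rendered as OCaml source text, so the resulting module needs no parser at runtime.

```ocaml
(* Hypothetical record standing in for one kind of data in data/. *)
type item = { title : string; description : string }

(* Render one record as OCaml source. The %S conversion prints a string
   as an escaped OCaml string literal. *)
let record_to_source (i : item) : string =
  Printf.sprintf "{ title = %S; description = %S }" i.title i.description

(* Render a whole dependency-free module that contains the data. *)
let module_source (items : item list) : string =
  String.concat "\n"
    [ "type t = { title : string; description : string }";
      "";
      "let all : t list = [";
      String.concat ";\n" (List.map (fun i -> "  " ^ record_to_source i) items);
      "]";
      "" ]
```

Because the generated file is ordinary OCaml, a malformed entry would surface as a compile-time error rather than at runtime.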

Import Releases

Import the ocaml.org freeform releases into a structured collection.

Remove NetlifyCMS

I don't think we're going to go for this approach for now, and the extra library and values in modules are just getting in the way of things. We can easily reconstruct any of this information if we need it later on by doing an archaeological dig of the repository.

Add book links as a metadata

Currently, any external links for a book (say, to Amazon or O'Reilly) are listed at the bottom of the markdown. As suggested by @tmattio in #21, a better strategy would be to add them to the metadata. The simplest approach would be to add:

type t = {
  ...
  links: string list;
}

We could also use variants to distinguish things like `Amazon | `OReilly but that might be overkill seeing as we could probably extract it from the URL itself.
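
To illustrate the "extract it from the URL itself" option, here is a small sketch that classifies a link by its host. The `vendor` type, the helper names and the substring rules are assumptions made for this example, not part of ood.

```ocaml
(* Hypothetical classification of a book link; not part of ood. *)
type vendor = [ `Amazon | `OReilly | `Other ]

(* Return the lowercased host of a URL, i.e. the text between "://" and
   the next '/', or None if the URL has no "://". *)
let host_of_url (url : string) : string option =
  let len = String.length url in
  let rec find i =
    if i + 3 > len then None
    else if String.sub url i 3 = "://" then Some (i + 3)
    else find (i + 1)
  in
  match find 0 with
  | None -> None
  | Some start ->
    let stop =
      match String.index_from_opt url start '/' with
      | Some j -> j
      | None -> len
    in
    Some (String.lowercase_ascii (String.sub url start (stop - start)))

(* Naive substring test, sufficient for short host names. *)
let contains ~sub s =
  let n = String.length sub and m = String.length s in
  let rec go i = i + n <= m && (String.sub s i n = sub || go (i + 1)) in
  go 0

let vendor_of_url (url : string) : vendor =
  match host_of_url url with
  | Some h when contains ~sub:"amazon." h -> `Amazon
  | Some h when contains ~sub:"oreilly.com" h -> `OReilly
  | _ -> `Other
```

With something like this, the metadata could stay a plain `string list` and the rendering code could still pick a vendor-specific label.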

Periodic update of dependencies

v3.ocaml.org recently upgraded to bs-platform 9.0.2. We performed the same upgrade here in PR #9.

Import Books

Import all of the OCaml books from ocaml.org: https://github.com/ocaml/ocaml.org/blob/master/site/learn/books.md

This will need a new type, something along the lines of:

type t = {
  title : string;
  description: string;
  authors: string list;
  language: string;
  published: string option; (* Estimate of publish date *)
  isbn: string option;
}

We might want to make these "collections" à la tutorials to allow more free form description with links, markdown things like emphasis or bullet point lists etc.

Missing tutorials

While working on the preview of the tutorials, I noticed there might be some missing tutorials. In particular, Up and Running seems to be missing.

List University of Tsukuba on the academic page

1. Get approval from them.
2. Check the accurate usage status.
- @aigarashi has already confirmed this with them, but it would be good to re-confirm here.

Create NPM package

As the primary goal of this library is to be a repository containing both the data for the ocaml.org project and a suite of tools for managing that data, it should be easy to integrate that data. We should begin by adding a package.json file and ensuring that this project can be added as a dependency of ocaml.org.

Import rest of the workshops

If someone is keen, it would be good to get the important information from the other workshops into ood now that #48 is merged. That PR lays out the general type information and the structure of the files, which can be copied. The workshops can be found here: https://github.com/ocaml/ocaml.org/tree/master/site/meetings/ocaml

Concretely, this requires adding more markdown files, following the structure of data/workshops, for the different years that the OCaml workshop has been running, doing your best to extract the information from ocaml.org (link above) into this structured format. Finding the links to papers, videos etc. is also very helpful, and these can be added directly into the metadata section of the markdown. Any extra content can simply be added as markdown.

Add non-empty list type?

Some of the domain type fields could make use of a non-empty list type, which might also be enforceable with netlify. Let's consider whether to introduce such a type and whether to implement the type locally within this project.
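
One possible local implementation, sketched here for discussion: a non-empty list represented as a head element plus an ordinary list, hidden behind an abstract type. The module name `Non_empty` and its functions are made up for this example.

```ocaml
(* Hypothetical module; a non-empty list is a head plus a regular tail. *)
module Non_empty : sig
  type 'a t
  val make : 'a -> 'a list -> 'a t
  val of_list : 'a list -> 'a t option
  val to_list : 'a t -> 'a list
  val head : 'a t -> 'a
end = struct
  type 'a t = 'a * 'a list

  let make x xs = (x, xs)

  (* The only way to build a value from a plain list forces the caller
     to handle the empty case. *)
  let of_list = function [] -> None | x :: xs -> Some (x, xs)

  let to_list (x, xs) = x :: xs

  (* Total, unlike List.hd, because the head always exists. *)
  let head (x, _) = x
end
```

A field such as an authors list could then use `string Non_empty.t`, so that an entry with no authors is rejected when the data is parsed.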

List Ochanomizu University on the academic page

1. Get approval from them.
2. Check the accurate usage status.
- @aigarashi has already confirmed this with @kenichi-asai, but it would be good to re-confirm here.

Handling large media files

This topic has been mentioned in passing in previous PRs. Consider implementing a new server for image serving, called image.ocaml.org, using a system like thumbor. This will reduce the amount of data that needs to be transferred between ood and v3 when v3 updates its ood copy, and it will also allow v3 to request scaled images on the fly.

Update the README

Since #27 was merged, the readme no longer reflects the project structure.

Refactor and import 99 problems

The current format of the 99 problems solutions doesn't appear to work with mdx. It will need to be imported and slightly modified to make the tests run.

Images for tutorials

The images for the tutorials need to be imported. From what I've seen, only functionnal_programming uses an image, along with the missing Up and Running tutorial reported in #28.

Fix the forallsecure svg

I think it is actually XML at the moment, so it doesn't render properly.

Uniform date representation

Currently we have quite a few dates stored in the metadata. Should we provide a better representation for these than string? Or, to keep things simple, we could keep them as strings but ensure the strings are RFC 3339 formatted.

Some examples include:

The year in the watch.ocaml.org data could just be the actual retrieved value, as these are already in the correct format.
The tutorials already use the RFC 3339 format to indicate when they were last checked.
The papers have a year, and for some of them we know the month too. We could convert these to RFC 3339, always rounding to the earliest point in the year (i.e. 1st day of the month, January, 00:00:00).

What do you think @tmattio ?
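
For the papers case, the rounding rule could look like the sketch below. The function name is made up for this example, and using the UTC offset `Z` is an assumption, since the issue does not say which offset to use.

```ocaml
(* Hypothetical helper: build an RFC 3339 timestamp from a year and an
   optional month, rounding down to the first day at 00:00:00.
   The "Z" (UTC) offset is an assumption made for this sketch. *)
let rfc3339_of_year_month ?(month = 1) (year : int) : string =
  if month < 1 || month > 12 then invalid_arg "month must be in 1..12";
  Printf.sprintf "%04d-%02d-01T00:00:00Z" year month
```

For instance, a paper known only by its year 2020 would become `2020-01-01T00:00:00Z`, and one known to be from September 2014 would become `2014-09-01T00:00:00Z`.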

Import academic data

This issue is for adding data about OCaml's use in teaching, primarily at universities.

A suggested type would be:

(* Academic_institution.ml (or some name like that...) *)

type location = { long: float; lat: float } (* In case we want to put it on a map ? *)

type t = {
  name: string;               (* name of the institution *) 
  description: string;        (* short description of the institution *)
  link: string;               (* Some link to the course or inst. *) 
  location : location option  (* optional location of inst. *)
}

The data we need to import can be found here https://github.com/ocaml/ocaml.org/blob/master/site/learn/teaching-ocaml.md

List The University of Tokyo on the academic page

1. Get approval from them.
2. Check the accurate usage status.
- @aigarashi has already confirmed this with them, but it would be good to re-confirm here.

Add industrial user logo meta data?

It would be nice if each industrial user entry indicated whether the logo included the name of the company or not. This would allow the rendering to conditionally add the name when displaying the logo.
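
A sketch of how such a flag might be used when rendering, assuming a hypothetical record with a `logo_includes_name` field; the actual industrial user type in ood may differ.

```ocaml
(* Hypothetical shape of an industrial user entry; field names are
   assumptions made for this example. *)
type industrial_user = {
  name : string;
  logo : string; (* path to the logo image *)
  logo_includes_name : bool;
}

(* Return the text to show next to the logo, if any: the company name
   is added only when the logo does not already contain it. *)
let label_for_logo (u : industrial_user) : string option =
  if u.logo_includes_name then None else Some u.name
```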

Add link checker?

Consider checking that links point to valid URLs that are actively responding with a proper HTTP status code, either as part of Netlify CMS edits or as part of the data linting process.
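
On the linting side, the check could be structured roughly as follows. The HTTP request itself is abstracted behind a hypothetical `fetch_status` callback, since no HTTP client is named in the issue; treating 2xx and 3xx codes as healthy is also an assumption.

```ocaml
(* [fetch_status] is a hypothetical callback returning the HTTP status
   code for a URL, or None if the request failed altogether. *)
let broken_links ~(fetch_status : string -> int option) (urls : string list)
    : string list =
  List.filter
    (fun url ->
      match fetch_status url with
      | Some code -> code < 200 || code >= 400 (* assumption: 2xx/3xx are fine *)
      | None -> true (* unreachable links count as broken *))
    urls

(* Stub used only to exercise the function without network access. *)
let stub_status = function
  | "https://ocaml.org" -> Some 200
  | "https://example.org/missing" -> Some 404
  | _ -> None
```

Keeping the network call behind a function argument would let the linter be tested offline with a stub like the one above.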

Expose different languages of the content

Success stories
Tutorials

NetlifyCMS string list doesn't support whitespaces

Currently the default NetlifyCMS widget for lists and strings only allows comma-separated values but crucially the strings cannot contain spaces. This is problematic for author/people lists which need spaces. There are two workarounds:

1. Extend lib_netlify to support custom widgets and use decaporg/decap-cms#4646.
2. Turn an author or person into an object with a name field. This seems the best and easiest solution and would leave the door open for adding extra information, for example an optional affiliation field for authors in papers.yml.

Resolving this issue with (2) would also be a nice experiment with the workflow of modifying existing data that is currently being used by v3, updating type definitions and porting back to the web front-end.
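
Option (2) might translate into a type along these lines. The record, its field names and the migration helper are assumptions for illustration; they are not the definitions used in ood.

```ocaml
(* Hypothetical author object with room for extra information. *)
type author = { name : string; affiliation : string option }

(* Wrap a bare name; names may contain spaces. *)
let author_of_string (s : string) : author =
  { name = String.trim s; affiliation = None }

(* Migrate an existing comma-separated author string to the new shape,
   dropping empty entries. *)
let migrate_authors (csv : string) : author list =
  String.split_on_char ',' csv
  |> List.map author_of_string
  |> List.filter (fun a -> a.name <> "")
```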

Introduce semantic versioning for ood npm package?

This library is now an npm package. Should we introduce semantic versioning to the workflow of this repository?

Original News Content

#42 added some scraped blog posts. We also want to import the blog posts from https://github.com/ocaml/platform-blog and discuss. The blog posts from these two sources should be clearly demarcated as original content.

List Kyoto University on the academic page

The great news is that OCaml is already in use at Kyoto University Undergraduate Course Program of Computer Science, and listing in v3 could be a good road sign for some students. That said, there are a few things to do before we add them.

1. Get approval from them based on the identity guidebook.
- https://www.kyoto-u.ac.jp/en/about/profile/emblems/our-emblem-and-school-color#kyoto-university-visual-identity-guidebook
2. Check the accurate usage status.
- I have already confirmed this with @aigarashi, but it would be good to re-confirm here.

Import the rest of the meetings

As part of #23 we need to bring in all of the data associated with meetings and events:

Meetings: https://github.com/ocaml/ocaml.org/blob/master/site/meetings/index.md

In fact, I think "meetings" and "events" are the same, and here we have events. The problem is that the individual meeting pages (where they exist) tend to be very different across the board. For example, the OCaml Workshop pages have a fairly straightforward structure in that they contain papers, presentations, the organising committee, etc. They can be found here: https://github.com/ocaml/ocaml.org/tree/master/site/meetings/ocaml

Other meetings are simply a link to a meetup page. They don't necessarily need all of the data that the OUD ones do. In addition, a lot of that data can be derived simply from the tags and years of the papers.yml and videos.yml entries, for instance by filtering for the ocaml-workshop tag and the year 2020.
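
The derivation by tag and year could be as simple as the following sketch. The `paper` record is a simplified, hypothetical stand-in for a papers.yml entry.

```ocaml
(* Simplified, hypothetical view of a papers.yml entry. *)
type paper = { title : string; year : int; tags : string list }

(* Select the papers that belong to a given event and year. *)
let for_event ~(tag : string) ~(year : int) (papers : paper list) : paper list =
  List.filter (fun p -> p.year = year && List.mem tag p.tags) papers

(* Made-up sample data. *)
let sample =
  [ { title = "Paper A"; year = 2020; tags = [ "ocaml-workshop" ] };
    { title = "Paper B"; year = 2019; tags = [ "ocaml-workshop" ] };
    { title = "Paper C"; year = 2020; tags = [ "other" ] } ]
```

Calling `for_event ~tag:"ocaml-workshop" ~year:2020 sample` keeps only the first entry; the same filter would apply to video entries with a matching record type.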

The meetings probably have enough additional content to be represented as a collection in a folder rather than as a single yaml file as they are currently.

Perhaps the OUD entries just need a relation to papers and videos to make that link explicit? @tmattio any thoughts on the best way to approach this? Maybe we keep an events.yml for tracking meetups and the like that don't need their own individual page on the front-end side, and then generate a meetings/en/{meeting}.md collection primarily for the OCaml workshop pages?

History of OCaml

Not sure if this should be in ood or in the frontend, but there is a good history of OCaml in the v1 site:

https://caml.inria.fr/about/history.en.html

Consider implementing ood-fixtures repo

As the amount of data grows in this repository, developing on v3 and deploying PR previews will continue to get slower (once we stop vendoring ood). Consider implementing a companion repository to ood, ood-fixtures, which has the same structure as ood, but only contains a small, fixed set of fixture data, sufficient to exercise the website. Ood-fixtures could be used in development and PR builds, while ood could be used in production deploys. Any changes to types would have to be duplicated in ood and ood-fixtures, but I expect the vast majority of changes to be data additions.

The fixture generation functions and the shell script to update ood-fixtures could reside in ood.

Generate a variant with all tutorial names

Please generate a variant with all tutorial names as part of ood gen.
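
A rough sketch of what that generation step could look like, assuming tutorial names are available as slugs such as `up-and-running`. The function names and the slug-to-constructor rule are assumptions; slugs that start with a digit would need extra handling that is not shown.

```ocaml
(* Turn a slug such as "up-and-running" into a constructor name such as
   "Up_and_running". Any character that is not a letter or a digit is
   replaced by '_'. Slugs starting with a digit are not handled here. *)
let constructor_of_slug (slug : string) : string =
  String.map
    (fun c ->
      match c with
      | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' -> c
      | _ -> '_')
    slug
  |> String.capitalize_ascii

(* Emit the source of a variant type with one constructor per tutorial. *)
let variant_source (slugs : string list) : string =
  "type t =\n"
  ^ String.concat ""
      (List.map (fun s -> "  | " ^ constructor_of_slug s ^ "\n") slugs)
```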

Consider migrating preview to front end

In the long run, it might be easier to use the front end directly to preview the data while editing. If all contributors are using the front end in their editing workflow, then the integrated ood and front end developer experience will get more attention. The added attention will result in smoothing out and speeding up the overall workflow - more integrated watchers, nicer git workflow, etc. It may even result in pushing for enhancements to the ocurrent pipelines for PR reviews.

I assume the main hesitation about using the front end for data previews is related to new data sets. When creating a new data set, the page design may not be ready. In such a case, it seems reasonable to just present the data in a page with no real design, using one of the simpler components in the component library, until the design is ready.