arrow-site's Introduction

Apache Arrow Website

Overview

Jekyll is used to generate HTML files from the Markdown + templates in this repository. The built version of the site is kept on the asf-site branch, which gets deployed to https://arrow.apache.org.

Adding Content

To add a blog post, create a new markdown file in the _posts directory, following the model of existing posts. In the front matter, you should specify an "author". This should be your Apache ID if you have one, or it can just be your name. To add additional metadata about yourself (GitHub ID, website), add yourself to _data/contributors.yml. This object is keyed by apacheId, so use that as the author in your post. (It doesn't matter if the ID actually exists in the ASF; all metadata is local to this project.)

Prerequisites

With a recent version of Ruby (i.e. one that does not have an End-Of-Life (EOL) status) installed, run the following commands to install Jekyll.

gem install bundler
bundle install

We also need Node.js, because webpack is used to manage the JavaScript and CSS libraries the site depends on.

The commands below for previewing or building the site install webpack and those libraries automatically, so Node.js is the only additional thing to install here.

Previewing the site

Run the following and open http://localhost:4000/ to preview the generated site locally:

bundle exec rake

Deployment

apache/arrow-site

On a commit to the main branch of apache/arrow-site, the rendered static site will be published to the asf-site branch using GitHub Actions.

Forks

When implementing changes to the website on a fork, the GitHub Actions workflow behaves differently.

On a commit to the main branch, the rendered static site will be published to a branch named gh-pages (rather than asf-site). If it doesn't already exist, a gh-pages branch will be automatically created by the GitHub Actions workflow when it succeeds.

The gh-pages branch is intended to be used with GitHub Pages. Deploying changes on the gh-pages branch to GitHub Pages is a useful way to preview changes to the website. It can also be a helpful way to share changes that are still in progress with others, since they can easily view them by navigating to the GitHub Pages URL in their web browser.

For the changes on the gh-pages branch to be deployed to GitHub Pages, the Source branch for GitHub Pages deployment must be set to gh-pages in the repository Settings of your fork (by default, the Source branch should be set to asf-site). Instructions on how to configure the Source branch can be found in the GitHub Pages documentation.

FYI: You can also generate the site for https://arrow.apache.org/ into _site/ locally with the following command:

JEKYLL_ENV=production bundle exec rake generate

Using Docker

If you don't want to install or change Ruby and Node.js locally, you can use Docker to build and preview the site with commands like:

docker run -v `pwd`:/arrow-site -p 4000:4000 -it ruby bash
cd arrow-site
apt-get update
apt-get install -y npm
gem install bundler
bundle install
# Serve using local container address
bundle exec rake HOST=0.0.0.0

Then open http://localhost:4000 in a browser on your host machine.

arrow-site's People

Contributors

alamb, alenkaf, amoeba, amol-, andygrove, bkmgit, dependabot[bot], ianmcook, jorgecarleitao, jorisvandenbossche, kevingurney, kou, kszucs, lidavidm, mrkn, nealrichardson, paddyhoran, paleolimbot, pitrou, raulcd, returnstring, rvernica, siddharthteotia, thisisnic, tustvold, wesm, westonpace, willayd, xhochy, zeroshade

arrow-site's Issues

[Website] Add a link to the Ruby End-Of-Life (EOL) schedule in the Prerequisites section of README.md

Describe the bug, including details regarding any error messages, version, and platform.

The following text appears in the Prerequisites section of the README.md for apache/arrow-site:

With non-EOL-ed Ruby installed, run the following commands to install Jekyll.

For those who aren't familiar with the Ruby ecosystem, it may not be immediately obvious what "non-EOL-ed Ruby" means.

To make things clearer, it would be helpful to link directly to the Ruby EOL schedule, as well as to spell out EOL as End-Of-Life for those who might not be familiar with the acronym EOL.

Component(s)

Website

[R] version selector is broken

The development version is displayed on the release version site, and I cannot go from the release version to the development version site. (It is possible to go to the development version site from past versions.)

Is this because the following lines were not updated by #273?

  {
    "name": "10.0.0.9000 (dev)",
    "version": "dev/"
  },
  {
    "name": "10.0.0 (release)",
    "version": ""
  },
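
If these two entries are meant to track the latest release, they could be generated rather than edited by hand. The following is a minimal sketch, assuming the naming pattern visible above (the dev build is the release version with .9000 appended). The helper name selector_entries is hypothetical and is not part of the site's tooling.

```python
import json


def selector_entries(release: str) -> list:
    """Build the dev and release entries for the version selector.

    Assumes the dev version is the release version plus a ".9000" suffix,
    as in the snippet quoted in this issue.
    """
    return [
        {"name": f"{release}.9000 (dev)", "version": "dev/"},
        {"name": f"{release} (release)", "version": ""},
    ]


# Reproduces the two entries quoted above for release 10.0.0.
print(json.dumps(selector_entries("10.0.0"), indent=2))
```

Calling the helper with the newly released version number would yield the pair of entries that the selector needs after each release.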

Add GitHub Id to list of committers.yml

The committers.yml file on the arrow-site repository contains information about the PMC and committers for the Arrow project. The current information is:

- name: committer name
  role: VP/PMC/committer
  alias: ASF alias
  affiliation: Work affiliation

We have implemented a PR automation workflow on the Arrow repository to better keep track of the state of PRs.
We assumed the alias in the committers file was the GitHub ID rather than the ASF ID. In general this works, because many members use the same ID for ASF and GitHub.
In some cases, however, we are still not identifying committers correctly.
I would like to add each committer's GitHub ID to the list of committers, similar to contributors.yml:

- name: committer name
  role: VP/PMC/committer
  alias: ASF alias
  githubId: The github ID
  affiliation: Work affiliation
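
To illustrate why the extra field helps, here is a minimal Python sketch of the lookup a PR automation could perform: match on githubId first and fall back to alias only when githubId is absent. The function name find_committer and the sample entries are hypothetical; the actual workflow code is not shown in this issue.

```python
from typing import Optional

# Hypothetical sample entries mirroring the proposed committers.yml layout.
COMMITTERS = [
    {"name": "Committer One", "role": "PMC", "alias": "cone",
     "githubId": "committer-one", "affiliation": "Example Corp"},
    # An entry without githubId: the ASF alias is assumed to match GitHub.
    {"name": "Committer Two", "role": "committer", "alias": "ctwo",
     "affiliation": "Example Org"},
]


def find_committer(github_login: str, committers: list) -> Optional[dict]:
    """Return the entry for a GitHub login, or None if there is no match."""
    wanted = github_login.lower()  # GitHub logins are case-insensitive
    for entry in committers:
        # Prefer the explicit githubId; fall back to the ASF alias.
        candidate = entry.get("githubId") or entry.get("alias", "")
        if candidate.lower() == wanted:
            return entry
    return None
```

With this approach, a committer whose ASF alias differs from their GitHub login is matched only through githubId, so the alias can no longer produce a false match or a miss.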

[Website] Can't build with latest ruby version "logger.rb:384:in `level': undefined method `[]' for nil (NoMethodError)"

When building with a Docker image, following https://github.com/apache/arrow-site?tab=readme-ov-file#using-docker:

docker run -v `pwd`:/arrow-site -p 4000:4000 -it ruby bash
cd arrow-site
apt-get update
apt-get install -y npm
gem install bundler
bundle install
# Serve using local container address
bundle exec rake HOST=0.0.0.0

jekyll fails to run:

root@2a57a0f41e29:/arrow-site# bundle exec rake HOST=0.0.0.0
jekyll serve --incremental --livereload --host 0.0.0.0
/usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll.rb:28: warning: csv was loaded from the standard library, but will no longer be part of the default gems since Ruby 3.4.0. Add csv to your Gemfile or gemspec. Also contact author of jekyll-4.2.0 to add csv into its gemspec.
/usr/local/bundle/gems/safe_yaml-1.0.5/lib/safe_yaml/transform.rb:1: warning: base64 was loaded from the standard library, but will no longer be part of the default gems since Ruby 3.4.0. Add base64 to your Gemfile or gemspec. Also contact author of safe_yaml-1.0.5 to add base64 into its gemspec.
/usr/local/bundle/gems/liquid-4.0.4/lib/liquid/standardfilters.rb:2: warning: bigdecimal was loaded from the standard library, but will no longer be part of the default gems since Ruby 3.4.0. Add bigdecimal to your Gemfile or gemspec. Also contact author of liquid-4.0.4 to add bigdecimal into its gemspec.
jekyll 4.2.0 | Error:  undefined method `[]' for nil
/usr/local/lib/ruby/3.3.0/logger.rb:384:in `level': undefined method `[]' for nil (NoMethodError)

    @level_override[Fiber.current] || @level
                   ^^^^^^^^^^^^^^^
	from /usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll/log_adapter.rb:45:in `adjust_verbosity'
	from /usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll/configuration.rb:143:in `config_files'
	from /usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll.rb:118:in `configuration'
	from /usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll/command.rb:44:in `configuration_from_options'
	from /usr/local/bundle/gems/jekyll-4.2.0/lib/jekyll/commands/serve.rb:83:in `block (2 levels) in init_with_program'
	from /usr/local/bundle/gems/mercenary-0.4.0/lib/mercenary/command.rb:221:in `block in execute'
	from /usr/local/bundle/gems/mercenary-0.4.0/lib/mercenary/command.rb:221:in `each'
	from /usr/local/bundle/gems/mercenary-0.4.0/lib/mercenary/command.rb:221:in `execute'
	from /usr/local/bundle/gems/mercenary-0.4.0/lib/mercenary/program.rb:44:in `go'
	from /usr/local/bundle/gems/mercenary-0.4.0/lib/mercenary.rb:21:in `program'
	from /usr/local/bundle/gems/jekyll-4.2.0/exe/jekyll:15:in `<top (required)>'
	from /usr/local/bundle/bin/jekyll:25:in `load'
	from /usr/local/bundle/bin/jekyll:25:in `<main>'
rake aborted!
Command failed with status (1): [jekyll serve --incremental --livereload --host 0.0.0.0]
/arrow-site/rakefile:42:in `block in <top (required)>'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli/exec.rb:58:in `load'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli/exec.rb:58:in `kernel_load'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli/exec.rb:23:in `run'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli.rb:451:in `exec'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/vendor/thor/lib/thor/command.rb:28:in `run'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/vendor/thor/lib/thor.rb:527:in `dispatch'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli.rb:34:in `dispatch'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/vendor/thor/lib/thor/base.rb:584:in `start'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/cli.rb:28:in `start'
/usr/local/bundle/gems/bundler-2.5.4/exe/bundle:28:in `block in <top (required)>'
/usr/local/bundle/gems/bundler-2.5.4/lib/bundler/friendly_errors.rb:117:in `with_friendly_errors'
/usr/local/bundle/gems/bundler-2.5.4/exe/bundle:20:in `<top (required)>'
/usr/local/bundle/bin/bundle:25:in `load'
/usr/local/bundle/bin/bundle:25:in `<main>'
Tasks: TOP => default => serve
(See full trace by running task with --trace)

Workaround

If I use an older Ruby image from https://hub.docker.com/_/ruby, then the site builds fine:

docker run -v `pwd`:/arrow-site -p 4000:4000 -it ruby:3.2.2-slim-bullseye bash

Update arrow-site governance page

The check_committers list indicates that our governance page is out of date:

$ ../.venv/bin/python check_committers.py 
Missing PMCs in local list: ['jakevin']
Missing committers in local list: ['akurmustafa', 'avantgardner', 'gangwu', 'kevingurney', 'wayne']
Unexpected members in local list: ['avantgardnerio', 'jackwener', 'mustafasrepo', 'waynexia']
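
The "missing" and "unexpected" lists in that output are what set differences between an official roster and the local governance list would produce. The following is a minimal sketch, assuming both sides are plain collections of IDs. The function compare_rosters and the sample IDs are hypothetical and are not taken from check_committers.py.

```python
def compare_rosters(official, local):
    """Return (missing, unexpected) as sorted lists of IDs."""
    official_ids, local_ids = set(official), set(local)
    missing = sorted(official_ids - local_ids)      # in the roster, not listed locally
    unexpected = sorted(local_ids - official_ids)   # listed locally, not in the roster
    return missing, unexpected


# Hypothetical IDs for illustration only.
missing, unexpected = compare_rosters(
    official=["alice", "bob", "carol"],
    local=["bob", "carol", "dave"],
)
print("Missing committers in local list:", missing)
print("Unexpected members in local list:", unexpected)
```

An ID that is spelled differently on the two sides shows up once in each list, which is worth checking before treating every entry as a real addition or removal.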

[Website] Deploy step on deploy GitHub action has been failing for the last commits

The deploy step for the Arrow site is failing. Changes are currently not being deployed after they are merged:

See: https://github.com/apache/arrow-site/actions/workflows/deploy.yml
and a specific job: https://github.com/apache/arrow-site/actions/runs/8330658484/job/22795808856#step:9:52

 From https://github.com/apache/arrow-site
 * [new branch]              add-bryce  -> deploy/add-bryce
 * [new branch]              asf-site   -> deploy/asf-site
 * [new branch]              main       -> deploy/main
Switched to a new branch 'asf-site'
branch 'asf-site' set up to track 'deploy/asf-site'.
rsync: [sender] change_dir "/home/runner/work/arrow-site/arrow-site/../build/docs" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1338) [sender=3.2.7]
Error: Process completed with exit code 23.

[Website] Create a custom Docker container for running the website build and deployment steps

This is a follow up to #326.

As discussed in #326 (comment), it would be helpful if the website deployment script (.github/workflows/deploy.yml) ran the website build steps inside of a custom Docker container.

This would enable easier reproducibility for local debugging workflows (i.e. https://github.com/apache/arrow-site#using-docker) and allow for more fine grained control over the deployment environment.

Ideally, the Docker container would have the following properties:

  1. Based on ubuntu:22.04 (i.e. FROM ubuntu:22.04).
  2. Uses nvm to install the latest available LTS version of Node.js based on an .nvmrc file.
  3. Uses rbenv to install the latest available LTS version of Ruby based on a .ruby-version file.
  4. All environment setup / dependency installation steps are extracted into a standalone Bash script (e.g. install_website_dependencies.sh) so that the script can also be run directly on an Ubuntu 22.04 host machine with no container requirement.
  5. Stored in the GitHub Container Registry (ghcr.io).

[Website] Installation page refers to arrow-cpp from conda which has been deprecated in favor of libarrow

On the install page: https://arrow.apache.org/install/
We refer to installing arrow-cpp with conda:

Install them with:

conda install arrow-cpp=16.1.* -c conda-forge
conda install pyarrow=16.1.* -c conda-forge
conda install r-arrow=16.1.* -c conda-forge

This has been deprecated. We should refer to libarrow, and maybe to the different components like libarrow-all, libparquet, etc., or at least to pyarrow-core, pyarrow and pyarrow-all.

[Website] Website deployment workflow (`deploy.yml`) is failing due to Node.js 18 version bump in `ubuntu-latest` GitHub Actions runner image and Webpack usage of `md4` hashing algorithm

Describe the bug, including details regarding any error messages, version, and platform.

See the following comment on apache/arrow#322.

A few weeks ago, the apache/arrow-site deployment workflow (.github/workflows/deploy.yml) started failing with the following output:

.
.
.
npm ci
npm WARN deprecated [email protected]: You can find the new Popper v2 at @popperjs/core, this package is dedicated to the legacy v1

added 132 packages, and audited 133 packages in 2s

17 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities
npx webpack --mode=production
rm -f javascript/main.js
node:internal/crypto/hash:71
  this[kHandle] = new _Hash(algorithm, xofLen);
                  ^

Error: error:0308010C:digital envelope routines::unsupported
    at new Hash (node:internal/crypto/hash:71:19)
    at Object.createHash (node:crypto:133:10)
    at BulkUpdateDecorator.hashFactory (/home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/util/createHash.js:144:18)
    at BulkUpdateDecorator.update (/home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/util/createHash.js:46:50)
    at RawSource.updateHash (/home/runner/work/arrow-site/arrow-site/node_modules/webpack-sources/lib/RawSource.js:64:8)
    at NormalModule._initBuildHash (/home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/NormalModule.js:753:17)
    at handleParseResult (/home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/NormalModule.js:817:10)
    at /home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/NormalModule.js:908:4
    at processResult (/home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/NormalModule.js:640:11)
    at /home/runner/work/arrow-site/arrow-site/node_modules/webpack/lib/NormalModule.js:692:5 {
  opensslErrorStack: [ 'error:03000086:digital envelope routines::initialization error' ],
  library: 'digital envelope routines',
  reason: 'unsupported',
  code: 'ERR_OSSL_EVP_UNSUPPORTED'
}

Node.js v18.14.1
rake aborted!
Command failed with status (1): [npx webpack --mode=production...]
/home/runner/work/arrow-site/arrow-site/Rakefile:37:in `block in <top (required)>'
/home/runner/work/arrow-site/arrow-site/vendor/bundle/ruby/3.0.0/gems/rake-13.0.6/exe/rake:27:in `<top (required)>'
/opt/hostedtoolcache/Ruby/3.0.2/x64/bin/bundle:23:in `load'
/opt/hostedtoolcache/Ruby/3.0.2/x64/bin/bundle:23:in `<main>'
Tasks: TOP => generate => javascript/main.js
(See full trace by running task with --trace)
Error: Process completed with exit code 1.

This appears to be related to the use of Webpack by apache/arrow-site and the following issues:

  1. webpack/webpack#14532 (comment)
  2. webpack/webpack#13572
  3. webpack/webpack#14306

My high-level understanding is that in Node 18 (the build output above shows Node.js v18.14.1 is being used), the md4 hashing algorithm is deprecated (more specifically, it seems that Node 18 uses OpenSSL 3.0, which has deprecated md4), and the version of Webpack used by apache/arrow-site (v5.21.2) seems to default to using md4.

Webpack v5.61.0 added a WASM md4 implementation as a fallback. However, the advice in webpack/webpack#14532 (comment) is to set output.hashFunction in the Webpack config to use an alternative hashing algorithm instead. Specifically, it recommends using xxhash64 (which is planned to be the default hashing algorithm when Webpack 6 is released).

It seems that the version of Node.js in the ubuntu-latest GitHub Actions runner image (used by deploy.yml) was bumped to v18 on February 13, 2023. This would explain why this issue started appearing a few weeks ago.
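
The underlying OpenSSL behaviour can be observed outside Node.js as well. As an analogy only (this is Python and not part of the site's build), the sketch below checks whether the local crypto backend will still construct a given digest. The helper name digest_available is hypothetical. On systems linked against OpenSSL 3.x without the legacy provider enabled, md4 is typically reported as unavailable, while on older systems it is usually still present.

```python
import hashlib


def digest_available(name: str) -> bool:
    """Return True if hashlib can construct the named digest on this system."""
    try:
        hashlib.new(name)
        return True
    except ValueError:
        # Raised for unknown digests and for ones the OpenSSL build has disabled.
        return False


# The md4 result depends on the local OpenSSL build; sha256 is always present.
for algo in ("md4", "sha256"):
    print(algo, "available:", digest_available(algo))
```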

Workarounds

There are a few different approaches we could pursue to address this issue:

  1. We could work around this issue by pinning Node.js to v16 in the workflow using the actions/setup-node action. Of course, this would mean continuing to rely on an outdated version of Node.js, which doesn't seem ideal in the long term.

  2. We could follow the advice in webpack/webpack#14532 (comment) and set output.hashFunction in the Webpack config to use an alternative hashing algorithm, like xxhash64.

  3. We could follow the advice of @avantgardnerio in #322 (comment) and move away from relying on the proprietary ubuntu-latest image, which is subject to sudden updates like the Node.js one that caused this issue. Instead, we can use the official ubuntu:latest container image (this is the approach followed by arrow-ballista). ubuntu:latest wouldn't have unexpected library updates, and it would also be possible to run the container image locally for debugging purposes.

Component(s)

Website

[QUESTION] Submitting blogposts

Hello! The README mentions how to go about adding a blog post, but I was wondering what the editorial process is like. Can anyone submit a blog post, provided it is relevant to the Arrow project?

[Website] Add note about the need to set GitHub Pages deployment source branch to `gh-pages` when previewing website changes on a fork of `apache/arrow-site`

Describe the bug, including details regarding any error messages, version, and platform.

This is a follow up to this comment.

It would be helpful to add an explicit note to the Deployment section of the README.md about the need to set the GitHub Pages deployment source branch to gh-pages when previewing website changes on a fork of apache/arrow-site.

It isn't immediately obvious that this is a required step. As far as I can tell, the default deployment source branch will normally be set to asf-site for a fork, which means that any changes to the website won't deploy on a successful run. This could potentially be quite confusing.

Component(s)

Website
