Brainstorm/questions for "esy fast build", about esy/esy

Comments (8)

andreypopp commented on June 14, 2024

I wonder how the esy my command environment will play with this "fast build
command" feature we've been discussing. esy build creates a full build of
all dependencies, and a totally clean build of the current project, into a
local ./node_modules/.cache directory. esy my command then augments the
current environment with all the dependencies as well as the current
project's full clean build in the ./node_modules/.cache directory. If we
have a "fast build command", it wouldn't create a full clean build, but
rather do one in the current source tree. You could imagine doing both esy
build and then also the "fast build". What would esy my command augment the
environment with? The full clean build in node_modules/.cache? The locally
built project? Or perhaps whichever one was last executed?

I don't think we could augment the environment with paths from the locally executed build. We just don't know where the project's build system places artifacts — jbuilder places it into _build/default/ and ocamlbuild straight into _build.

I think we need to rethink what we use esy <anycommand> for (and also esy shell, in fact we should consider esy shell and esy <anycommand> equivalent from the point of view of the environment).

There could be different use cases for esy <anycommand>/esy shell:

1. You execute commands during development. Those commands use the dependencies you have installed, such as ocamlmerlin, refmt.
2. You execute commands to demo/test/use your project and you want to have your top-level project's bin/ in $PATH and your lib/ in $OCAMLPATH.
3. You execute commands in a build environment. We have a separate esy build-shell command for that (and no esy <anycommand> equivalent btw) but I'd just list it for the sake of completeness.

So (1) is (almost) well supported right now with the current esy shell/esy <anycommand> implementation. I say almost b/c it also includes top level package's paths in the environment which can mess things up sometimes (example: you develop Merlin and have a dependency on another Merlin version which you use for development).

Use case (2) is supported by the current esy shell/esy <anycommand> only if we use esy build. It won't work if we introduce some kind of esy fastbuild command which doesn't place artifacts into stores.

Regarding (3), I think we already have good support for it, but maybe we need to add an esy <anycommand> equivalent. Maybe in the form of:

esy build-run <anycommand>

which will execute <anycommand> within the build environment.

Ok, let's focus on (1) and (2) then.

I think (1) is much more often used than (2) and therefore I want to keep esy shell/esy <anycommand> UI for that use case.

The only addition I'd make is esy run <anycommand>, which works exactly the same as the esy <anycommand> command but allows executing commands shadowed by esy subcommands (example: esy run install).

We need to change the environment for (1) though, by removing the top level package's path from it, so that the case of developing Merlin with another Merlin in dependencies can be solved. It will look more like the environment of build-shell, but not exactly, as the build-shell environment is still more strict (overridden $SHELL and sandboxed using sandbox-exec on macOS).
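
To make the "remove the top level package's path" idea concrete, here is a minimal sketch in plain shell. It is not how esy builds its environment; the helper name path_without and the example store paths are made up for illustration.

```sh
# path_without PATH_VALUE ENTRY
# Prints PATH_VALUE with every entry equal to ENTRY removed.
# Hypothetical helper: the body runs in a subshell so that the IFS and
# globbing changes do not leak into the caller.
path_without() (
  IFS=:
  set -f
  result=""
  for dir in $1; do
    [ "$dir" = "$2" ] && continue
    result="${result:+$result:}$dir"
  done
  printf '%s\n' "$result"
)

# Example with made-up store paths: drop the root package's bin directory.
path_without "/store/dep/bin:/store/root/bin:/usr/bin" "/store/root/bin"
```

The same filtering would presumably apply to other path-like variables such as $OCAMLPATH.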

As for (2) I think we can make it more general. What if we could have a command which creates a new sandboxed environment with some listed packages w/o the need to define package.json and w/o the need to have those listed packages installed and/or built? That could look like:

esy sandbox-run -p reason refmt

That would create a new sandboxed environment with a package reason installed and built and then execute refmt binary (which happens to be in $PATH after we have reason). Of course if we have reason in store then such sandbox would be created w/o installing or building reason — we'd need just to set the correct paths from store into env.

How does that solve (2)? Simple:

esy sandbox-run -p ./ hello.native

That could build and install the local project and execute hello.native (example from esy-ocaml-project) which happens to be in $PATH.

To an end user that esy sandbox-run command would work similar to npx which is by the way now shipped with npm and so with the Node distribution.

We need to have an esy shell equivalent too — esy sandbox-shell.

esy sandbox-shell -p reason

Names of commands are not final, of course, let's bikeshed that.

Thoughts?

Since esy my command augments the environment with a few more env variables
(so it works with editors etc), I think you were correct that for "fast
rebuilds", we will need something other than esy make build. We want the
"fast build" feature to have the exact same environment as esy build but
without all the file copying.

Agree. Let's brainstorm on the esy fastbuild.

Requirements for esy fastbuild:

1. It should be fast.

   a. It shouldn't spawn the Node.js runtime on each invocation. It could spawn it on the first invocation though.

   b. It should perform a build without source relocation.
2. It should perform a build in an environment similar to that of esy build.

   This is why esy <your-build-command> doesn't work, as it includes the project's own bin directory in the $PATH.
3. It should only execute commands which build the project, not commands which install the built artifacts into $cur__install.

Therefore I suggest the following implementation plan:

1. We modify the Esy specification to split esy.build in package.json into esy.build and esy.install parts. esy.build just builds into $cur__target_dir and esy.install installs artifacts from $cur__target_dir into $cur__install.
2. We add an esy fastbuild command (let's assume it's called that and bikeshed the name later).
   - The environment will be as strict as with esy build, but instead of building into the store it will build with $cur__target_dir=..
   - On first invocation:
     - It ejects a shell script with commands from package.json's esy.build.
     - It builds dependencies.
     - It executes the ejected build.
   - On later invocations we check whether the ejected build is stale (same check as we do with command-env by stat-ing package.json files) and either regenerate it or execute it.
   - (optional) We could have a mode which executes the build commands and waits for input from the user: Enter to re-execute the build, q to quit.
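
The "eject a shell script" step could be sketched roughly as below. This is only an illustration under assumptions: the function name eject_build_script is hypothetical, and the build commands are passed as arguments instead of being read from package.json's esy.build.

```sh
# eject_build_script OUT CMD...
# Writes an executable script to OUT that runs each CMD in order and
# stops at the first failing command (set -e).
eject_build_script() {
  out=$1
  shift
  {
    printf '%s\n' '#!/bin/sh' 'set -e'
    for cmd in "$@"; do
      printf '%s\n' "$cmd"
    done
  } > "$out"
  chmod +x "$out"
}

# Example: eject two placeholder commands and run the result.
demo_dir=$(mktemp -d)
eject_build_script "$demo_dir/build.sh" 'echo configure' 'echo compile'
"$demo_dir/build.sh"
```

Later invocations would then only need to check that the ejected script is not stale before running it directly, without going through Node.js.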

That way the dev workflow would look like:

esy install
esy fastbuild
./_build/default/bin/hello.exe

Or if we rename esy fastbuild to esy build:

esy install
esy build
./_build/default/bin/hello.exe

Thoughts?


jordwalke commented on June 14, 2024

Regarding `esy fastbuild`:

It should be fast.
a. It shouldn't spawn the Node.js runtime on each invocation. It could spawn it on the first invocation though.

Sounds great!

b. It should perform a build without source relocation.

Agreed.

It should perform a build in an environment similar to that of esy build.
This is why esy <your-build-command> doesn't work, as it includes the project's own bin directory in the $PATH.

Agreed. Shouldn't the environment for fast build be exactly the same environment that we have for esy build right now, right?

You mention that we have mode 2 right now, but that it wouldn't work based off of the results of a "fast build" since the fast build wouldn't copy install artifacts. I actually think both modes 1 and 2 could be made to work based on a fast build, if we also implement your other proposal of having a separate esy.install field:

esy fastbuild could do exactly as you say and nothing more. As you suggested, it wouldn't even run the esy.install field to copy artifacts over. However, when it comes time to run Mode 2 commands we can lazily run the esy.install field, if it has not been run yet since the last time we ran esy build/fastbuild. If Mode 2 was made into an esy sandbox-run (like npx) then esy sandbox-run cmd would download dependencies if needed, build dependencies if needed, run the current project's fast build if needed, run the esy.install if needed, and then run cmd. I have a bike-shedding table below which shows how this could simplify the entire CLI interface.
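
A lazy install like the one described above could be tracked with two stamp files. The sketch below is only an illustration: the stamp files and the function names needs_install and lazy_install are assumptions, not anything esy currently does.

```sh
# needs_install BUILD_STAMP INSTALL_STAMP
# Succeeds when the install step has never run, or when the last build
# is newer than the last install.
needs_install() {
  [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# lazy_install BUILD_STAMP INSTALL_STAMP CMD...
# Runs CMD (standing in for the esy.install commands) only when needed,
# and records a successful run by touching INSTALL_STAMP.
lazy_install() {
  build_stamp=$1
  install_stamp=$2
  shift 2
  if needs_install "$build_stamp" "$install_stamp"; then
    "$@" && touch "$install_stamp"
  fi
}
```

A Mode 2 command would call lazy_install right before running cmd, so a plain build never pays for the copy.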

Another question: Just to simplify, do you think we could get away with only having the fast build mode? A justification is that once you build a fast mode, you pollute artifacts in your tree anyways, and so there's no real way to get a true esy build that is perfectly clean anyways from that point forward. Any future esy build after an esy fastbuild would basically be as pure/reliable as a second esy fastbuild. Another benefit is that it becomes easier to teach people the workflow.

About the various modes:

I would say that mode 2 has actually been the most useful for me (and while documenting esy starter projects) because it lets you run commands from the perspective of someone who installed your project globally - without having to statefully pin it globally like with opam. After successfully building a project, usually the first thing you want to do is test it like an end user. I don't feel strongly about which mode gets which cli invocation, but I don't think mode 2 should be too heavy. Perhaps we can name the esy sandbox-run as esx, to reflect the similarity to npx. Nice observation about the ability to extend sandbox-run to arbitrary packages with -p.

Bike-shedding:

I think I have found a way to cram everything we've discussed into a very small
CLI surface area. It makes assumptions that I explained above:

1. That esy build is "fast" by default.
2. That the install (copying final artifacts) can be done lazily.
3. That because of 1 and 2, esy build can be used as a prefix for arbitrary build commands that run in the same build environment as esy build itself, and they too will run fast. That means you can do esy build make clean - this helps prevent esy from turning into a task runner.
4. That esx is like your proposed esy sandbox-run and can be loosely inspired by npx.

| Command | Shell equivalent | Description | Environment Mode | Inherits Current Shell | Includes Top Package Bin/Lib |
| --- | --- | --- | --- | --- | --- |
| `esy build` | `esy build shell` | Fast builds dependencies and project. Does not do install copying. | Mode 3 | NO | NO |
| `esy build cmd` | N/A | Fast arbitrary build commands such as `esy build make clean` | Mode 3 | NO | NO |
| `esx cmd` | `esx shell` | Esy execute `cmd` in current project. Equivalent to `esx . cmd` | Mode 2 | YES | YES |
| `esx -p reactjs cmd` | `esx -p reactjs shell` | Esy execute `cmd` in temp workspace with package `reactjs` | Mode 2 | YES | YES |
| `esy cmd` | `esy shell` | Esy Command | Mode 1 | YES | NO |

Does this more or less match your understanding of the various mode's behaviors? We can bike shed the calling API but I want to make sure I understand the environment Modes correctly.

Another Question: What is the primary use case for Mode 1, where we inherit the current shell, but don't include the top level packages? Mode 3 seems better for running commands to debug the build. Is Mode 1 primarily for IDEs?

Random Suggestion:

Suggestion: I would use different terminology than install, since install (unfortunately) means to download sources. Perhaps we could have esy.build and esy.buildinstall.


andreypopp commented on June 14, 2024

Shouldn't the environment for fast build be exactly the same environment that we have for esy build right now, right?

Not the same as $cur__target_dir won't be inside Esy store but equal to source dir.

Another Question: What is the primary use case for Mode 1, where we inherit the current shell, but don't include the top level packages? Mode 3 seems better for running commands to debug the build. Is Mode 1 primarily for IDEs?

Yes, I thought about IDEs and also about trying fastbuild builds: esy ./_build/default/hello.native — this is how I use Esy now. Maybe that could be replaced with esx ... I don't know.

What I dislike about the name esx is that it seems it will often be used within the same Esy workflow, but it is a different command. With npx this is understandable because you wouldn't use npx while you develop a package; its use case is completely separate. In our case I'd prefer to have it under the esy top level command. Maybe esy x if we want it to be short.


jordwalke commented on June 14, 2024

Not the same as $cur__target_dir won't be inside Esy store but equal to source dir.

Ah, right.

What I dislike about name ...

yeah, those are good points.


jordwalke commented on June 14, 2024

Updates from offline discussion:

We believe that the install command (copying files) perhaps isn't that expensive, and that the real price is copying source files before compiling, because that is what ruins incremental compilation, and the time spent copying and recompiling from scratch blocks actionable error messages. The install copy only occurs when everything succeeds so it's not such a big deal.

This means that yes, we can have one kind of build - and that it can be as fast as possible all the time - however, not all packages can safely be compiled without that copy. Only packages marked buildsInSource:false can. The reason is that these packages could be symlinked from another package, and if we perform a fast non-source-copying build it will create build artifacts in the repo that other projects linking to us will end up seeing. Then, "Inconsistent Interface" compilation errors will abound. So what we have today with the buildsInSource:false is pretty close to the optimal performance we can achieve while safely supporting symlinks.

We can still reevaluate the environment modes and how we access them via esy/esx/blah.

It would be nice to have some command like esy build cmd that runs a specific command in the same build environment that esy build runs in. This would be an escape hatch for doing in source, non-copying builds for projects that we shouldn't be doing them for.
(Andrey says he doesn't like the idea of overloading esy build - that's fine - I just need an escape hatch of some kind to run builds in source even when they technically shouldn't be and esy build make build would satisfy that).


andreypopp commented on June 14, 2024

Updates from discussions in Discord:

Motivation

We want to optimize the development workflow with Esy.

Currently the esy build command does too much for every package in a sandbox:

1. SLOW: Spawn node process
2. SLOW: Copy project sources to a clean location (this invalidates incremental builds)
3. Run build & install commands
4. SLOW: Relocate from stage (s) directory to install (i) tree within the store.

We are going to optimize workflow for projects which use jbuilder as their
build system. Specifically we want to optimize the build time of the root
project.

Proposals

Enabling incremental builds (addresses (2))

We've decided to give the buildsInSource package config three states:

- false (default value if omitted): the build process produces artifacts only in $cur__target_dir and/or $cur__install.
- '_build': the build process can produce artifacts in $cur__root/_build (in addition to $cur__target_dir and $cur__install).
- true: the build process can produce artifacts in any location within $cur__root (in addition to $cur__target_dir and $cur__install).

The _build variant is specifically to accommodate jbuilder behaviour.

The esy build command is going to have the following semantics regarding buildsInSource, expressed in terms of two operations:

- build: execute build commands
- copy: copy sources to a build location

The table of "buildsInSource" x package type:

| `buildsInSource` | `false` | `'_build'` | `true` |
| --- | --- | --- | --- |
| root package | build | build | copy, build |
| linked dependency | build | copy¹, build | copy, build |
| regular dependency | build | copy, build | copy, build |

¹: we cannot pollute $cur__root/_build for linked dependencies as they might be linked to more than a single sandbox (we also make sure we ignore the _build dir when copying sources to a build location).

That effectively means that running esy build will reuse the _build
directory for a root project and thus enable incremental builds with jbuilder
for root projects.
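
The table can be read as a small decision function. A sketch in shell, where the function name build_plan and the package type labels root/linked/regular are just names chosen for this example:

```sh
# build_plan PACKAGE_TYPE BUILDS_IN_SOURCE
# PACKAGE_TYPE is one of: root, linked, regular.
# BUILDS_IN_SOURCE is one of: false, _build, true.
# Prints the operations esy build would perform according to the table.
build_plan() {
  case "$2" in
    false)
      echo "build" ;;
    _build)
      # Only the root package may reuse its own _build directory.
      if [ "$1" = root ]; then echo "build"; else echo "copy, build"; fi ;;
    true)
      echo "copy, build" ;;
    *)
      echo "unknown buildsInSource value: $2" >&2
      return 1 ;;
  esac
}

build_plan root _build
build_plan linked _build
```

The only row where the package type matters is '_build', which is exactly the incremental-build case for the root project.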

Can this be made simpler?

Yes, three states for buildsInSource is suboptimal. We'll see if jbuilder is
going to get out of source builds — in that case we'd deprecate "_build"
variant.

Introducing fastpath for root builds (addresses (1))

I propose to introduce a fastpath for running root builds in a similar fashion
to how we do fastpaths for esy shell and esy <anycmd>.

Let the esy build invocation execute
$cur__root/node_modules/.cache/_esy/build.sh if it exists. If it doesn't
exist, the slowpath is executed, which runs the entire build and then
generates build.sh.

What build.sh does:

1. Checks if its own mtime is later than the mtimes of all packages' package.json/esy.json and the root's esy.lock (a similar check is done for the build-eject and shell fastpaths).
   - If true then proceed
   - If false then bail to slowpath
2. Executes build commands one by one in a build env.
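
The mtime check in step 1 could look roughly like this. It is a sketch, not the actual build.sh: the function name is_fresh is hypothetical, and it relies on the -nt file comparison, which bash, dash and most other shells support even though it is not strictly POSIX.

```sh
# is_fresh SCRIPT MANIFEST...
# Succeeds only if SCRIPT exists and is newer than every MANIFEST that
# exists. Missing manifests (e.g. a package without esy.json) are skipped.
is_fresh() {
  script=$1
  shift
  [ -e "$script" ] || return 1
  for manifest in "$@"; do
    if [ -e "$manifest" ] && ! [ "$script" -nt "$manifest" ]; then
      return 1
    fi
  done
  return 0
}
```

The fastpath would run the build commands when is_fresh succeeds for build.sh against the package.json/esy.json files and esy.lock, and bail to the slowpath otherwise.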

Separating build & install commands (addresses (4))

I propose separating build & install commands into esy.build and esy.install
keys in package.json.

If we do so then esy build for a root project will only execute the build
part of the process and thus won't need to relocate artifacts from stage to
install location.

The installation process usually consists of calls to ocamlfind install,
opam-installer, jbuilder install — these are not usually needed during dev.

Workflow description

Given the proposals above are implemented we are going to end up with the
following workflow.

% esy install

# first invocation is slowish
% esy build

% esy vim ./bin/program.re

# later invocations are fast as they use fastpath
% esy build

% esy vim package.json
# again this won't hit the cache as package.json was changed
% esy build

# now to test compiled executables
% esy ./_build/default/bin/program.exe

Note that the esy <anycmd> env won't contain _build/default/bin in $PATH, and
therefore not every aspect of an arbitrary executable can be evaluated that
way (program.exe might depend on a collocated another.exe, for example). We
are going to introduce a tool called esx (described in previous comments
above) which will perform the install commands too and construct the env
which includes the root package.

% esx program

This will be slower than esy ./_build/default/bin/program.exe though (it does
more things). esx is out of scope of these proposals and will be explained
in future proposals.


jordwalke commented on June 14, 2024

The general workflow you described at the end is amazing. I'm glad we decided
that esx would be its own separate command. It's a better way to divide up
the CLI API (more consistent design).

I really like that this allows us to have a single command, esy build, that
automatically balances default speed without compromising stability.

Yes, three states for buildsInSource is suboptimal. We'll see if jbuilder is
going to get out of source builds — in that case we'd deprecate "_build"
variant.

I really hope this doesn't get in our way, but I believe that having this third
mode can help us achieve the world's best workflow for developing several local
packages that are linked together. We'd be able to just remain in our topmost
application and run esy build every time we make any change to any one of our
projects! I will lobby very hard for pure out of source jbuilder builds.

I propose to introduce a fastpath for running root builds in a similar fashion
to how we do fastpaths for esy shell and esy <anycmd>.
...
What build.sh does:
..

Executes build commands one by one in a build env.

Yeah, I can see why this is important. Right now I do esy jbuilder build
which doesn't have to pay the node tax, but it can create the wrong environment
if one of my dependencies has changed (which requires running esy build).

Would esy build recursively traverse the dependency graph in pure bash and
run each project's $cur__root/node_modules/.cache/_esy/build.sh? (or perhaps
it should store a cached simplified .txt file of the dependency traversal
order/locations - so it will make it easier to implement that esy build
process in a cross platform manner later).

I propose separating build & install commands into esy.build and esy.install
keys in package.json.

That sounds great! It's probably not the bottleneck, but it's awesome when we
can cut out all wasted time. esy is going to be so fast!

The installation process usually consists of calls to ocamlfind install,
opam-installer, jbuilder install — these are not usually needed during dev.

Yeah, they would only be needed for a hypothetical esx . cmd in the future -
and in that case, it could be "installed" lazily.


andreypopp commented on June 14, 2024

Implemented in [email protected].

