Required Functionality Currently, users must initialize a "worksp

UX: Make synth init obsolete about synth HOT 9 CLOSED

shuttle-hq commented on August 19, 2024

UX: Make synth init obsolete

from synth.

Comments (9)

llogiq commented on August 19, 2024

I should note that it's just an idea at this moment, and we need some discussion to validate the concept.

brokad commented on August 19, 2024

I have hacked together a quick example of what that would look like in the hope it will help fuel the discussion. Feel free to give it a quick spin.

synth init is mostly redundant. It was introduced as a quick and dirty way of "locking" a directory to belong to synth (in order to prevent inadvertent overwrites on potentially destructive actions like synth import). Maybe we also thought of it as an "anchor" to specify the root of a repository of synth schema files? I however cannot find references to that in the code: I believe the current implementation just checks it exists at $PWD and moves on.

In the branch, I have simply added more consistency to the treatment of paths in CLI arguments to compensate for the absence of synth init. For example, you can run

$ synth generate examples/bank/bank_db/users.json

or the equivalent

$ cd examples/bank/bank_db; synth generate users.json

to generate only the users data. Or you can run

$ synth generate examples/bank/bank_db/

to generate the whole bank_db example.

On import, synth will prompt for confirmation before overwriting existing files but otherwise behaves as before (minus the synth init step).

$ mkdir -p /tmp/test_import; synth generate examples/bank/bank_db | synth import /tmp/test_import

christos-h commented on August 19, 2024

Thanks for bringing this up - it is a slightly contentious issue.

So first of all, what do we have now? When you run synth init, a sub-directory .synth is created, which marks the parent directory as a workspace for synth. This also creates an as-of-now empty config.toml.
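To make the marker mechanism concrete, here is a minimal sketch in Rust. It is not taken from the synth codebase: the function names `init_workspace` and `is_workspace` are hypothetical, and placing config.toml inside .synth is an assumption, since the comment above does not say where the file lives.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Name of the marker sub-directory described in the comment above.
const MARKER_DIR: &str = ".synth";
/// Assumption: the (currently empty) config file sits inside the marker directory.
const CONFIG_FILE: &str = "config.toml";

/// Hypothetical stand-in for `synth init`: mark `dir` as a workspace.
pub fn init_workspace(dir: &Path) -> io::Result<()> {
    let marker = dir.join(MARKER_DIR);
    fs::create_dir_all(&marker)?;
    let config = marker.join(CONFIG_FILE);
    // Only create the empty config when it is missing, so that running
    // the function twice never clobbers existing settings.
    if !config.exists() {
        fs::write(&config, "")?;
    }
    Ok(())
}

/// A directory counts as a workspace when the marker directory is present.
pub fn is_workspace(dir: &Path) -> bool {
    dir.join(MARKER_DIR).is_dir()
}
```

With this shape, a check such as `is_workspace(Path::new("."))` is all a command would need in order to decide whether it is running inside a workspace.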

A comparison with other tools

This concept of workspaces is clearly not original. For example, terraform has a similar concept: running terraform init initialises a directory and creates a marker sub-directory or file. The same is true for some IDEs like IntelliJ, which creates a .idea sub-directory when you create a project. This too serves as a marker and holds metadata about the project.

Other tools take slightly different approaches - take kubectl for example and gcloud. They have a global state in ~/.config (or something like that) which is 'global' for the user.

The key difference between the two paradigms is that the former are deeply tied to the underlying files, while the latter are not. For example, terraform cares deeply about the files in a given subdirectory, which are literally a textual representation of your infrastructure, whereas gcloud couldn't care less about the files in $PWD when a gcloud command is run. Furthermore, tools like terraform or your IDE have configuration that is deeply coupled with $PWD; it makes sense that you don't want to be manually switching config constantly. Finally, both terraform and IDEs have free rein to modify files (in a sensible way) in their respective workspaces, which is also the case for synth; by not prompting on file modification and deletion (as @brokad proposed) and having well-defined behaviour, they offer a better user experience.

The case for state

Okay, so all these tools we've talked about have some notion of state, from the local tfstate to your IDE's settings. synth currently doesn't have state.

Well, this is almost true: synth does have state in the form of whether telemetry is enabled. This is held in an OS-specific config directory similar to .config/synth. I'm not sure why that decision was made (I think it should now be per-workspace), since some users may want to share telemetry data for some workspaces and not for others (which they may consider sensitive).

Furthermore as the project is built out we will have more state to deal with which will be workspace specific. I am not going to elaborate too much here, as much of this is TBD, but for example:

credentials for generating on a synth cluster
saving credentials for connecting with a database (so you don't always specify them on the command line)
ML models (weights) that are generated on an ad-hoc basis as per conversations with @fretz12 about understanding binary payload distributions
holding metadata across generation sessions (i.e. if generation is aborted and then resumed)

and many more.

The software engineering experience

Finally w.r.t quality of life for our end users, I think a little structure and opinionated design can help users implicitly pick up what we believe are best practices. This is mostly opinion, but a few points are:

By arranging synth files into workspaces, developers will always know where those files are, and code repositories won't get littered with random files all over the place
Diffs and PRs will be more localised when they pertain to a change of data model
Developers won't accidentally put random text files, scratch files, temporary files or whatever in a directory which is dedicated to their data models

tldr

synth is tightly coupled to the contents of a directory, thus the concept of a workspace can provide a superior user experience as the binary has free rein in that directory.
synth will eventually have workspace level state, and that state should be stored in the workspace directory. synth init is currently initialisation of that state
organising synth schema files in a workspace hierarchy will provide a better SE experience

brokad commented on August 19, 2024

Thanks @christoshadjiaslanis for outlining your thinking on this! It clears up some of the confusion I had about why this exists in the first place 😄

It seems like the decision to have synth init is tied to the desire to have a notion of project or workspace and, later on, a local state. Both seem important to the long-term roadmap. In that way, synth init is there for future-proofing and forces users to think in terms of workspace/project.

I wonder if it would still be possible to allow users to run simple Synth schema files outside of workspaces. For example, running documentation snippets like this one without having to set up a new workspace. Maybe mounting the workspace-type workflows under a subcommand like synth workspace and otherwise leaving the top-level CLI able to deal with individual JSON files outside of workspaces?

brokad commented on August 19, 2024

Another thing that could work is allowing generation from individual files (viewed as regular Content) when the .synth directory is not found at $PWD, and otherwise behaving as we do currently (traversing a given path and assembling it into a Namespace).

Any thoughts @christoshadjiaslanis?

christos-h commented on August 19, 2024

@brokad these seem like good ideas. Basically it boils down to:

if we run synth generate my_file.json, treat my_file.json as a regular Content (we can use the code from the playground here)
If it is pointed to a directory, treat that directory as a Namespace
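The two rules above amount to a dispatch on the kind of path given. A minimal sketch, assuming nothing about synth's real types: the `Target` enum and `classify` function are hypothetical names, and only the terms Content and Namespace come from the thread.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// What a path argument to `synth generate` would be treated as.
/// The variant names mirror the thread; the enum itself is hypothetical.
#[derive(Debug, PartialEq)]
pub enum Target {
    /// A single schema file, treated as a regular Content.
    Content(PathBuf),
    /// A directory, treated as a Namespace.
    Namespace(PathBuf),
}

/// Decide how to treat `path`; a missing path is reported as an I/O error.
pub fn classify(path: &Path) -> io::Result<Target> {
    let meta = fs::metadata(path)?;
    if meta.is_dir() {
        Ok(Target::Namespace(path.to_path_buf()))
    } else {
        Ok(Target::Content(path.to_path_buf()))
    }
}
```

Under this sketch, `classify(Path::new("examples/bank/bank_db/users.json"))` would yield a `Target::Content` and `classify(Path::new("examples/bank/bank_db/"))` a `Target::Namespace`, provided those paths exist.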

So I guess the next step of this line of inquiry would be to ask:

what happens on import outside a workspace?
what happens if my_file.json above refers to some content that requires state (hence requiring a workspace).

brokad commented on August 19, 2024

if we run synth generate my_file.json, treat my_file.json as a regular Content (we can use the code from the playground here)

If it is pointed to a directory, treat that directory as a Namespace

Indeed.

what happens on import outside a workspace?

As @llogiq suggested above, nothing special. We could prompt for user confirmation before overwriting existing files, or we could proceed as we do now - i.e. outright refuse the operation if the target path is non-empty/exists.
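The two options can be put side by side in a small sketch. All names here (`OverwritePolicy`, `ImportDecision`, `decide_import`) are hypothetical and not from the synth codebase. One assumption to flag: an existing but empty directory is treated as free, which is only one reading of "non-empty/exists".

```rust
use std::fs;
use std::io;
use std::path::Path;

/// The two policies floated above for a target that already has content.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OverwritePolicy {
    /// Ask the user before overwriting existing files.
    Prompt,
    /// Refuse the operation outright.
    Refuse,
}

/// Outcome of the pre-import check (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ImportDecision {
    Proceed,
    AskUser,
    Reject,
}

/// Assumption: a missing path or an empty directory counts as free;
/// an existing file or a non-empty directory counts as occupied.
fn is_occupied(target: &Path) -> io::Result<bool> {
    match fs::metadata(target) {
        Ok(meta) if meta.is_dir() => Ok(fs::read_dir(target)?.next().is_some()),
        Ok(_) => Ok(true),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    }
}

/// Decide what an import into `target` should do under the given policy.
pub fn decide_import(target: &Path, policy: OverwritePolicy) -> io::Result<ImportDecision> {
    if !is_occupied(target)? {
        return Ok(ImportDecision::Proceed);
    }
    Ok(match policy {
        OverwritePolicy::Prompt => ImportDecision::AskUser,
        OverwritePolicy::Refuse => ImportDecision::Reject,
    })
}
```

Keeping the decision separate from the actual prompt or write makes either policy easy to swap in without touching the import logic itself.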

what happens if my_file.json above refers to some content that requires state (hence requiring a workspace).

I'm not sure what you mean. Can you give me an example of something like this?

christos-h commented on August 19, 2024

I'm not sure what you mean. Can you give me an example of something like this?

Not currently - it is more thinking ahead..

I'll play around with your fork tomorrow and get back to you :)

christos-h commented on August 19, 2024

@brokad thanks for your work on the fork.

After playing around with it - it is clear that not having workspaces offers a better UX as of today. In the future, if we have some concept of namespace/workspace specific state to hold things like model weights or database credentials, we can add workspaces back in.

Next steps:

Merge relevant parts of fork/no-init into getsynth/synth:master
Update the documentation to not mention workspaces any more
Update asciinema on website to no longer have synth init
Update tutorial content to no longer mention workspaces
Create new release
