llogiq commented on August 19, 2024

I should note that it's just an idea at this moment, and we need some discussion to validate the concept.

from synth.

brokad commented on August 19, 2024

I have hacked together a quick example of what that would look like in the hope it will help fuel the discussion. Feel free to give it a quick spin.

synth init is mostly redundant. It was introduced as a quick and dirty way of "locking" a directory to belong to synth (in order to prevent inadvertent overwrites on potentially destructive actions like synth import). Maybe we also thought of it as an "anchor" to specify the root of a repository of synth schema files? However, I cannot find references to that in the code: I believe the current implementation just checks that it exists at $PWD and moves on.

In the branch, I have simply added more consistency to the treatment of paths in CLI arguments to compensate for the absence of synth init. For example, you can run

$ synth generate examples/bank/bank_db/users.json

or the equivalent

$ cd examples/bank/bank_db; synth generate users.json

to generate only the users data. Or you can run

$ synth generate examples/bank/bank_db/

to generate the whole bank_db example.

On import, synth will prompt for confirmation before overwriting existing files but otherwise behaves as before (minus the synth init step).

$ mkdir -p /tmp/test_import; synth generate examples/bank/bank_db | synth import /tmp/test_import


christos-h commented on August 19, 2024

Thanks for bringing this up - it is a slightly contentious issue.

So first of all: what do we have now? When you run synth init, a sub-directory .synth is created, which marks the parent directory as a workspace for synth. This also creates a currently empty config.toml.
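As a rough illustration of the marker-directory idea, here is a minimal Python sketch. It is not synth's actual implementation: the function names are hypothetical, and placing config.toml inside .synth is an assumption.

```python
from pathlib import Path

MARKER = ".synth"  # marker sub-directory described above


def init_workspace(root: Path) -> Path:
    """Mark `root` as a workspace by creating the marker directory."""
    marker = root / MARKER
    marker.mkdir(parents=True, exist_ok=True)
    # Assumption: the empty config.toml lives inside the marker directory.
    (marker / "config.toml").touch(exist_ok=True)
    return marker


def is_workspace(root: Path) -> bool:
    """Return True if `root` carries the workspace marker directory."""
    return (root / MARKER).is_dir()
```

A command could then call `is_workspace(Path.cwd())` before doing anything destructive.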

A comparison with other tools

This concept of workspaces is clearly not original. To give an example, terraform has a similar concept: running terraform init initialises a directory and creates a marker sub-directory or file. The same is true for some IDEs like IntelliJ, which creates an .idea sub-directory when you create a project. This too serves as a marker and holds metadata about the project.

Other tools take slightly different approaches: take kubectl and gcloud, for example. They keep their state in ~/.config (or something like that), which is 'global' for the user.

The key difference between the two paradigms is that the former are deeply associated with the underlying files, while the latter do not care. For example, terraform cares deeply about the files in a given subdirectory, which are literally a textual representation of your infrastructure, whereas gcloud couldn't care less about the files in $PWD when a gcloud command is used. Furthermore, there is configuration that is deeply coupled with $PWD in the case of tools like terraform or your IDE. It makes sense that you don't want to be manually switching config constantly. Finally, both terraform and IDEs have free rein to modify files (in a sensible way) in their respective workspaces, which is also the case for synth; by not prompting on file modification and deletion (as @brokad proposed) and having well-defined behaviour, they offer a better user experience.

The case for state

Okay, so all these tools we've talked about have some notion of state, from the local tfstate to your IDE's settings. synth currently doesn't have state.

Well, this is almost true: synth does have state in the form of whether telemetry is enabled or not. This is held in an OS-specific config directory similar to .config/synth. I'm not sure why that decision was made (I think it should now be on a per-workspace basis), since some users may want to share telemetry data on some workspaces and not on others (which they may consider sensitive).

Furthermore as the project is built out we will have more state to deal with which will be workspace specific. I am not going to elaborate too much here, as much of this is TBD, but for example:

  • credentials for generating on a synth cluster
  • saving credentials for connecting with a database (so you don't always specify them on the command line)
  • ML models (weights) that are generated on an ad-hoc basis as per conversations with @fretz12 about understanding binary payload distributions
  • holding metadata across generation sessions (i.e. if generation is aborted and then resumed)

and many more.

The software engineering experience

Finally w.r.t quality of life for our end users, I think a little structure and opinionated design can help users implicitly pick up what we believe are best practices. This is mostly opinion, but a few points are:

  • By arranging synth files into workspaces, developers will always know where those files are, and code repositories won't get littered with random files all over the place
  • Diffs and PRs will be more localised where they pertain to a change in a data model
  • Developers won't accidentally put random text files, scratch files, temporary files or whatever in a directory which is dedicated to their data models

tldr

  • synth is tightly coupled to the contents of a directory, thus the concept of a workspace can provide a superior user experience as the binary has free rein in that directory.
  • synth will eventually have workspace-level state, and that state should be stored in the workspace directory. synth init is currently the initialisation of that state.
  • organising synth schema files in a workspace hierarchy will provide a better SE experience


brokad commented on August 19, 2024

Thanks @christoshadjiaslanis for outlining your thinking on this! It clears up some of the confusion I had about why this exists in the first place 😄

It seems like the decision to have synth init is tied to the desire to have a notion of project or workspace and, later on, a local state, both of which seem important to the long-term roadmap. In that way, synth init is there for future-proofing and forces users to think in terms of workspaces/projects.

I wonder if it would still be possible to allow users to run simple Synth schema files outside of workspaces. For example, running documentation snippets like this one without having to set up a new workspace. Maybe we could mount the workspace-type workflows under a subcommand like synth workspace and otherwise leave the top-level CLI able to deal with individual JSON files outside of workspaces?


brokad commented on August 19, 2024

Another thing that could work is allowing generation from individual files (viewed as regular Content) when the .synth directory is not found at $PWD, and otherwise behaving as we do currently (traversing a given path and assembling it into a Namespace).

Any thoughts @christoshadjiaslanis?


christos-h commented on August 19, 2024

@brokad these seem like good ideas. Basically it boils down to:

  • if we run synth generate my_file.json, treat my_file.json as a regular Content (we can use the code from the playground here)
  • If it is pointed to a directory, treat that directory as a Namespace
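The two rules above could be sketched roughly as follows. This is a Python illustration only, not synth's code: the `Content` and `Namespace` classes here are hypothetical stand-ins for the real types, and treating each `.json` file directly inside the directory as one collection is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Content:
    """Hypothetical stand-in for a single schema file."""
    path: Path


@dataclass
class Namespace:
    """Hypothetical stand-in for a directory of schema files."""
    path: Path
    collections: list


def resolve_target(path: Path):
    """Pick Content for a file and Namespace for a directory."""
    if path.is_dir():
        # Assumption: one collection per JSON file directly inside the directory.
        files = sorted(p for p in path.iterdir() if p.is_file() and p.suffix == ".json")
        return Namespace(path, [Content(p) for p in files])
    if path.is_file():
        return Content(path)
    raise FileNotFoundError(f"no such file or directory: {path}")
```

With this shape, neither branch needs a .synth marker: the kind of path given on the command line decides which one runs.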

So I guess the next step of this line of inquiry would be to ask:

  • what happens on import outside a workspace?
  • what happens if my_file.json above refers to some content that requires state (hence requiring a workspace).


brokad commented on August 19, 2024

  • if we run synth generate my_file.json, treat my_file.json as a regular Content (we can use the code from the playground here)
  • If it is pointed to a directory, treat that directory as a Namespace

Indeed.

what happens on import outside a workspace?

As @llogiq suggested above, nothing special. We could prompt for user confirmation before overwriting existing files, or we could proceed as we do now - i.e. outright refuse the operation if the target path is non-empty/exists.
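To make the two options concrete, here is a minimal Python sketch of such a policy check. The function name and the `confirm` callback are hypothetical, not part of synth.

```python
from pathlib import Path


def may_write(target: Path, confirm=None) -> bool:
    """Decide whether an import may write to `target`.

    With `confirm=None` the operation is refused whenever the target
    exists and is non-empty. Otherwise `confirm` is a callable that is
    asked before anything is overwritten.
    """
    if not target.exists():
        return True
    if target.is_dir() and not any(target.iterdir()):
        return True  # an empty directory has nothing to overwrite
    if confirm is None:
        return False  # strict policy: refuse outright
    return bool(confirm(f"{target} already exists, overwrite? [y/N] "))
```

An interactive CLI could pass something like `confirm=lambda msg: input(msg).strip().lower() == "y"`, while a non-interactive run would leave `confirm` unset and get the strict behaviour.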

what happens if my_file.json above refers to some content that requires state (hence requiring a workspace).

I'm not sure what you mean. Can you give me an example of something like this?


christos-h commented on August 19, 2024

I'm not sure what you mean. Can you give me an example of something like this?

Not currently; it is more a case of thinking ahead.

I'll play around with your fork tomorrow and get back to you :)


christos-h commented on August 19, 2024

@brokad thanks for your work on the fork.

After playing around with it, it is clear that not having workspaces offers a better UX as of today. In the future, if we have some concept of namespace- or workspace-specific state to hold things like model weights or database credentials, we can add workspaces back in.

Next steps:

  • Merge relevant parts of fork/no-init into getsynth/synth:master
  • Update the documentation to not mention workspaces any more
  • Update asciinema on website to no longer have synth init
  • Update tutorial content to no longer mention workspaces
  • Create new release

