
Comments (2)

ivan-aksamentov commented on June 8, 2024

The fact that you can use Nextclade JSON to seed your particular database at all is a little miracle, and I would not recommend relying on it going forward. As mentioned in the docs, the JSON output is unstable. Also, there will be massive breaking changes in the coming weeks in Nextclade v3.

The JSON format is used for internal communication between different parts of Nextclade, and as you've discovered, it is just a serialized internal struct. It naturally changes during routine development.

As a small research lab, we are focusing on science. We don't have time to commit to maintaining a stable external JSON format at this point, and we will not have the resources to adjust to the requirements of downstream projects. We experiment and break things a lot, and we reserve the right to change the JSON format at any time without warning.

So while you can submit a PR to change the format now (assuming there is no loss of functionality or correctness, we will likely accept it), I don't see it helping much in the long term.

One thing that we considered to facilitate usage of JSON output is to provide a JSON schema for the format, but this would not help much in your use case.

Perhaps writing a middleware tool to ingest the TSV output is a better solution for downstream projects? The TSV output is much more stable: it follows semantic versioning. You can then maintain a stable output format of your liking, and open-source the tool for the community that happens to use your particular toolset.
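To make the middleware idea concrete, here is a minimal sketch using only the Python standard library. It reads TSV text and re-emits records in a schema that the downstream project controls. The column names (`seqName`, `clade`, `substitutions`), the comma-separated list convention, and the output field names are assumptions for illustration; check them against the header of the TSV that your Nextclade version actually writes.

```python
import csv
import io
import json

# Hypothetical mapping from TSV column names (left) to the field names of a
# downstream-owned schema (right). The TSV column names are assumptions;
# verify them against the header of your own Nextclade TSV file.
COLUMN_MAP = {
    "seqName": "sequence_name",
    "clade": "clade",
    "substitutions": "substitutions",
}

# Output fields assumed to hold comma-separated lists in the TSV.
LIST_FIELDS = {"substitutions"}


def convert_tsv(tsv_text):
    """Convert TSV text into a list of dicts in the downstream schema."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")

    # Fail loudly if an expected column is absent, so that an upstream
    # change is noticed instead of silently producing empty fields.
    missing = [c for c in COLUMN_MAP if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"TSV is missing expected columns: {missing}")

    records = []
    for row in reader:
        record = {}
        for column, field in COLUMN_MAP.items():
            value = row[column] or ""  # short rows yield None for missing cells
            if field in LIST_FIELDS:
                record[field] = [item for item in value.split(",") if item]
            else:
                record[field] = value
        records.append(record)
    return records


if __name__ == "__main__":
    sample = (
        "seqName\tclade\tsubstitutions\n"
        "seq1\t20A\tC241T,A23403G\n"
        "seq2\t19B\t\n"
    )
    print(json.dumps(convert_tsv(sample), indent=2))
```

Keeping the mapping in one table means that a renamed TSV column only requires a one-line change in the middleware, while consumers of the converted records are unaffected.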

Also, Spark seems like massive overkill to me. Internally, our scientists use TSV with pandas/polars and it works decently well. Maybe this could also fit your project?

If you have other ideas, let us know.

from nextclade.

mitochon commented on June 8, 2024

Thanks for your comments and suggestions.
I discussed this with a few of our team members, and we will look into using the TSV output in lieu of JSON.
We do want to thank you for your work and for making this tool available.
It has enabled us to do research and helped us make some contributions in the public health space.

