handbook's Introduction

DANDI Handbook

Handbook for interacting with the DANDI Archive.

DANDI Style Guidelines

Follow the guidelines below when creating and revising text in the DANDI Handbook:

  • dandi- repositories — hyphenate the names of DANDI GitHub repositories (e.g. dandi-archive); "Dandisets" is an exception because it is a complete word
  • Dandiset — use single, unformatted, capitalized word (not dandiset or code-formatted Dandiset)
  • file names — use lower case (e.g. development.md)
  • headings — use Title Capitalization (for 1st and 2nd levels) and follow with an intro sentence
  • GitHub — use camel case (not github or Github)
  • instructional language — should be direct, imperative, active, straightforward (e.g. "Install the files in your Python environment", not "Files could be installed in your Python environment")
  • JupyterHub — use camel case (not Jupyterhub)
  • license (not licence); in general, prefer American spelling
  • limited use of "please"
  • steps should start with 1, not 0
  • DANDI Archive - capitalize "archive" if it follows DANDI (not DANDI archive)
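
Some of these spelling guidelines can be checked mechanically. The following is a minimal sketch, not an existing handbook tool: the `RULES` table and the `check_style` helper are hypothetical names, and the rule set covers only a handful of the guidelines above.

```python
import re

# Each rule pairs a pattern for a discouraged spelling with the preferred form.
# Illustrative only: a few guidelines are covered, matching is case-sensitive,
# and occurrences inside URLs (e.g. github.com) are flagged as well.
RULES = [
    (re.compile(r"\b(?:github|Github)\b"), "GitHub"),
    (re.compile(r"\bJupyterhub\b"), "JupyterHub"),
    (re.compile(r"\blicence\b"), "license"),
    (re.compile(r"\bDANDI archive\b"), "DANDI Archive"),
    (re.compile(r"\bdandisets?\b"), "Dandiset"),
]


def check_style(text):
    """Return (line_number, found, preferred) for each discouraged spelling."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern, preferred in RULES:
            for match in pattern.finditer(line):
                findings.append((number, match.group(0), preferred))
    return findings
```

Calling `check_style` on the text of a markdown file returns an empty list when none of the listed spellings occur.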

HOWTO

This handbook uses mkdocs to render a collection of markdown files into a website. To render it locally, create and configure a Python environment according to the requirements.txt file, e.g. via

python3 -m venv venv && source venv/bin/activate && python3 -m pip install -r requirements.txt

Your current session will then use that virtual Python environment, which you can deactivate by running the deactivate command. To activate it again later, run source venv/bin/activate.

After that you can either

  • do a one-time manual build using mkdocs build and find the built website under the site/ folder, or
  • run mkdocs serve, which builds the website, starts a local webserver so you can view the rendered version at e.g. http://0.0.0.0:8000/, and automatically rebuilds when you change any source markdown or configuration file.

handbook's Issues

#117 needs fix up

#117 was needed, but the build now fails:

raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: .icons/fontawesome/brands/x-twitter.svg

We need to establish CI to build/test it in PRs...
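
A CI check for PRs could be as small as running the build and failing on a nonzero exit code, since an exception like the one above already aborts mkdocs build. The sketch below is one hypothetical way to wrap that: `build_command` and `run_build` are illustrative names, and the default mkdocs.yml path is an assumption about the repository layout. The `--strict` flag additionally turns mkdocs warnings into errors.

```python
import subprocess
import sys


def build_command(config_file="mkdocs.yml", strict=True):
    """Return the argument list for a non-interactive mkdocs build."""
    cmd = [sys.executable, "-m", "mkdocs", "build", "--config-file", config_file]
    if strict:
        # Treat warnings (e.g. broken internal links) as build failures.
        cmd.append("--strict")
    return cmd


def run_build(config_file="mkdocs.yml"):
    """Run the build and return its exit code (0 means success)."""
    return subprocess.run(build_command(config_file)).returncode
```

A CI step could then end with `sys.exit(run_build())`, so that a failed build marks the PR check as failed.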

Edit new subsections of Using DANDI

  • Apply style guidelines to the 5 new subsections added recently to Using DANDI (View, Download, Upload, Publish) and add them to the nav TOC under a new umbrella heading "User Guide" (as a parallel to the "Developer Guide" heading).
  • Make the subheadings parallel by using the imperative (Download, not Downloading).
  • In the Debugging section, flatten the hierarchy by removing the Acquiring Debugging Info subsection; only one heading is needed.

hard to find upload instructions

The new organization of the handbook makes it much harder for a new user to find instructions for uploading data. Can we please add a link to this on the left or right panel?

Add instructions about dandi-cli credentials management

dandi/dandi-cli#267 introduces support in dandi-cli for a fallback when a system-wide credentials store is not available: specifying an alternative to the keyring's default backend, namely an encrypted keyring. It would require a password.

We should have a subsection in https://github.com/dandi/handbook/blob/master/docs/10_using_dandi.md which describes that and also provides a way to support "fully automated" setups, e.g. via an env variable or by establishing a plain-text credential store (which would not require a password).
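
For the "fully automated" case, one option is to take the key from an environment variable so that no interactive prompt is needed. This is a minimal sketch under assumptions: it assumes the variable is named DANDI_API_KEY (the name dandi-cli is understood to read, which should be verified against its documentation), and `resolve_api_key` is a hypothetical helper that is not part of dandi-cli.

```python
import os


def resolve_api_key(environ=None):
    """Return the API key from the environment, or None if unset or empty.

    Assumption: the key is stored in DANDI_API_KEY. Pass a dict as
    ``environ`` to test the lookup without touching the real environment.
    """
    env = os.environ if environ is None else environ
    key = env.get("DANDI_API_KEY", "").strip()
    return key or None
```

A script can call `resolve_api_key()` first and fall back to the keyring-based lookup only when it returns None.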

Updating the website to include how to stream Dandi datasets

Hello! I really appreciate what you guys have done!

I noticed that the Download Data and Dandisets part of the website only covers how to download data, not how to stream it. Thanks to the hackathon, I know there is a tutorial on streaming data. Perhaps incorporate that into the main website as well?

Thanks

Refactor the About section

Either make Terms and Policies separate sections and eliminate About, or make it an Appendix, or rename it to About DANDI. Having both About and About this Doc is rather confusing.

Refactor the About This Doc section

The Welcome section links people to About This Doc for info on how to contribute to the doc, but it only tells you how to serve the docs locally. Need additional sections prior to that one.

Establish best practices for PRs

Relates heavily to best practices on commit messages and PR/feature branches.

I thought about formalizing one here, but I think we just need to find an existing one that we agree with or can build upon.

Challenges/solutions table improvements

In the "introduction" there is a challenges and solutions table, which has some formatting errors, and the language within it has grammar (and other) inconsistencies.

Add some kind of an "Assisted do-ocracy pledge" toward contributions (from within and outside the team)

I am thinking along these lines

  • This is a collaborative project and contributions to any component from any developer and outside contributor(s) are welcome
  • The DANDI project encourages submitting PRs to resolve outstanding issues, in particular those with severity-important and severity-critical labels
  • The primary developer team of the corresponding component pledges to provide timely feedback to bring submitted PRs to an acceptable state (and to resort to providing an alternative implementation only if significant changes are required)

WDYT @dandi/dandiarchive @dandi/archive-maintainers @dandi/dandi-cli (we need to come up with @dandi/people super-team... will do some time)

Provide a more exhaustive list of methods to download data

Triggered by @TheChymera

By default we just tell people to use dandi CLI. But there are also

  • datalad datasets
  • webdav interface
    • allows unified access to the entire archive; people could in principle wget the entire archive. For an individual release, e.g.: wget -r -nH --cut-dirs=3 --no-parent --reject "index.html*" https://webdav.dandiarchive.org/dandisets/000027/releases/0.210831.2033/
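
The WebDAV example generalizes to other Dandisets and releases. The following is a rough sketch that only assembles the URL and the wget argument list. The URL layout and the wget flags are copied from the example above; `release_url` and `wget_args` are hypothetical helper names.

```python
from urllib.parse import quote

# Base URL taken from the example above.
WEBDAV_BASE = "https://webdav.dandiarchive.org"


def release_url(dandiset_id, version):
    """Build the WebDAV URL for one released version of a Dandiset."""
    return f"{WEBDAV_BASE}/dandisets/{quote(dandiset_id)}/releases/{quote(version)}/"


def wget_args(url):
    """Return the wget invocation from the example as an argument list."""
    return ["wget", "-r", "-nH", "--cut-dirs=3", "--no-parent",
            "--reject", "index.html*", url]
```

The list returned by `wget_args(release_url("000027", "0.210831.2033"))` can be passed to `subprocess.run` to perform the actual download.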

Provide explicit guidance/expectations for zarr uploads

Zarr uploads still remain in Draft mode, as they are yet to be versioned.

Our handbook does not provide documentation for "what to expect" for Zarr uploads in the DANDI ecosystem. (e.g. https://github.com/dandi/handbook/blob/master/docs/13_upload.md does not contain any reference)

The goal of this Issue is to capture follow-up work to communicate the Zarr workflow correctly in this handbook, dandi-cli and dandi-archive

Relates to dandi/dandi-archive#1811

usability notes

In Uploading a Dandiset, Setup:

  • Provide screenshot of where the API key is
  • Add link to instructions for Python virtualenv
  • Change to simply “pip install -U dandi”

In Uploading, Data upload/management workflow:

  • “New Dataset” -> “NEW DANDISET”; even better, show a picture of this button inline in the instructions
  • “Reach out to us for help” should link to the GitHub helpdesk chooser in a new window
  • Add instructions for non-NWB files, break it into NWB and non-NWB
    • In the NWB-specific subsection, make sure you have an updated version of PyNWB
    • If you are having trouble with validation, make sure conversions were run with the most recent version of PyNWB and MatNWB

In download section,

  1. Downloading a Dandiset.
  2. Downloading a subject.
  3. Downloading a file.

needs to be expanded out

Enhance the Using DANDI section for new users

  • Lacks a straightforward set of “how to use” instructions. Do you really download a Dandiset before creating an account? An overview of typical major steps would be helpful.
  • Perhaps add a Quick Start: you don’t need an account to use DANDI, but if you want to create and contribute your own Dandiset, you DO need an account -> 6/24/22
  • Relocate the schematic figure to Developer Guide section, too technical for new users -> 7/11/22

Developer guide: add more about validation

ATM we have quite a lot of "heterogeneity" in how we do validation. We have multiple layers

  • pynwb

  • bidsschematools

  • nwbinspector

  • dandischema

  • dandi-cli glueing: uses above + adding more (zarr checking) - relies on content to be available, not just validation of extracted metadata

  • dandi-archive: invokes validations of dandischema, validates extracted metadata, not relying on content being present

We also have two "dataset layouts": DANDI and BIDS, with DANDI being our "ad-hoc" layout which is instrumented in code in dandi-cli.

NB We might move DANDI layout into dandischema

We have https://www.dandiarchive.org/handbook/135_validation/, which outlines for users only a very limited set of validations. We should add a more detailed description of the "validation framework" and the "DANDI layout" here.

edit: and then worth getting some meeting/presentation for DANDI team to sync our knowledge etc on all these aspects.

add clearer descriptions of the archive

based on the questions below, it would be useful for us to create a page in the handbook that directly addresses this.

questions from priyanka subhash at USC

  • Data storage - Cloud-based platform; user must upload data through Github
  • Type of data - The archive accepts cellular neurophysiology data including electrophysiology, optophysiology, and behavioral time-series, and images from immunostaining experiments
  • Data collection - User must create a DANDI account, create a Python environment, install the DANDI CLI into Python environment, register a dandiset to generate an identifier, convert data to NWB
  • Data maintenance - Data is maintained on Github platform
  • Data Upload Process - User must create a DANDI account, create a Python environment, install the DANDI CLI into Python environment, register a dandiset to generate an identifier, convert data to NWB
  • Data access - A Github account is required to access/download public datasets and to be given access to private datasets through DANDI
  • Data download - Using the Web application - Each Dandiset has a View Data option. This provides a folder-like view to navigate a Dandiset. Any file in the Dandiset has a download icon next to it. You can click this icon to download a file to your device where you are browsing or right click to get the download URL of the file. You can then use this URL programmatically or in other applications such as the NWB Explorer or in a Jupyter notebook on Dandihub. Using the Python CLI - Install Python client using DANDI code
  • Accepted Data File Formats - NWB, BIDS, NDM
  • In-house analytical tools - Dandihub provides a Jupyter environment to interact with the DANDI archive. To use the hub, you will need to register an account using the DANDI Web application. Please note that Dandihub is not intended for significant computation, but provides a place to introspect Dandisets and files.

Refactor the Welcome section

  • Fix broken/incorrect links
  • The 3rd bullet in “How to Use this Documentation” is just an fyi
  • The link to the project structure is for devs, right? Should be a general bullet that includes this for the Developer Guide
  • Reorder subsections to be more logical (How to use this Doc first, then how to communicate, Contributing/Feedback, License)

Separate "Developer User Guide" or just "Developer Guide" and rename into "DANDI Developer Guide"?

We have User Guide which is mostly user oriented -- users who aim to access or upload data, not write software to integrate with DANDI.

I wonder if we should make a more explicit separation here and have a section for people who are developing on top of DANDI, e.g. using the API, lower-level Python APIs, etc.

E.g. ATM in the light of

I was looking for a place to add advice/requests on using it efficiently, with lower impact on our services, e.g.

  • use max page size if aiming for a full list of assets in dandiset
  • use glob for /assets/ listing whenever aiming for specific file types (not to fetch all and filter on the client).

and that should possibly be accompanied by examples of how to do it both in API calls and in the Python interfaces.
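
As a starting point for such an example, here is a minimal sketch that only builds the request URL for an asset listing, so filtering happens on the server rather than the client. Only the glob filter and the idea of a maximal page size come from the points above. The base URL, the path layout, the `page_size` parameter name, and the value 1000 are assumptions to check against the API documentation, and `assets_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL of the DANDI REST API; verify before use.
API_BASE = "https://api.dandiarchive.org/api"


def assets_url(dandiset_id, version="draft", glob=None, page_size=1000):
    """Build an asset-listing URL with a large page size and optional glob.

    Assumptions: the path layout, the ``page_size`` and ``glob`` query
    parameter names, and 1000 as the largest allowed page size.
    """
    path = f"{API_BASE}/dandisets/{dandiset_id}/versions/{version}/assets/"
    params = {"page_size": page_size}
    if glob is not None:
        # Server-side filtering avoids fetching every asset to filter locally.
        params["glob"] = glob
    return f"{path}?{urlencode(params)}"
```

For instance, `assets_url("000027", glob="*.nwb")` requests only NWB files, with the pattern percent-encoded in the query string.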

WDYT?

Improve Developer Guide

  • Add Table of Contents in upper-right corner like all the other sections
  • Add Intro (needs a file or a paragraph at least)
  • Rename the Notes subsection to something more meaningful
  • Reformat long bulleted list as a table
  • Order repositories in some logical way, or even break them into categories if applicable.
  • Needs a reference to the dandi schema README (as done for the api, archive, and cli)

add NWB best practices section

@bendichter - given some of the issues being discussed with the biccn datasets, i realized that there is no information in the handbook on the dandi extension or how to add such info. also slice/tissue info would be relevant to ophys as well and perhaps the term ndx-dandi-icephys may semantically be confusing.

it would be nice to have a clear section in the handbook on pointers to dandi related nwb.

refactor docs for creating an account

We currently have docs for creating an account here within the section for creating and uploading Dandisets. This is not ideal, because a user may want to create an account in order to access DANDI Hub to analyze existing data, and such a user might find it difficult to find this section. I think it would be better to have a separate section on signing up for DANDI that is linked to by the section for creating and uploading Dandisets as well as the section on the DANDI Hub. This section should also explain when a DANDI login is necessary and when it is not (i.e. if all you want to do is download data)
