Handbook for interacting with the DANDI Archive.

Home Page: https://www.dandiarchive.org/handbook/

DANDI Handbook

Handbook for interacting with the DANDI Archive.

DANDI Style Guidelines

Follow the guidelines below when creating and revising text in the DANDI Handbook:

dandi- repositories — hyphenate the names of DANDI GitHub repositories (e.g. dandi-archive); "Dandisets" is an exception because it is a complete word
Dandiset — use a single, unformatted, capitalized word (not lowercase "dandiset", and not styled with code formatting)
file names — use lower case (e.g. development.md)
headings — use Title Capitalization (for 1st and 2nd levels) and follow with an intro sentence
GitHub — use camel case (not github or Github)
instructional language — should be direct, imperative, active, straightforward (e.g. "Install the files in your Python environment", not "Files could be installed in your Python environment")
JupyterHub — use camel case (not Jupyterhub)
license (not licence); in general, prefer American spelling
limited use of "please"
steps should start with 1, not 0
DANDI Archive - capitalize "archive" if it follows DANDI (not DANDI archive)

HOWTO

This handbook uses mkdocs to render a collection of Markdown files into a website. To render it locally, create and configure a Python environment according to the requirements.txt file, e.g. via

python3 -m venv venv && source venv/bin/activate && python3 -m pip install -r requirements.txt

Your current session will then use that virtual Python environment, which you can leave by running the deactivate command. To activate it again later, run source venv/bin/activate.

After that you can either

do a one-time manual build using mkdocs build and find the built website under the site/ folder, or
run mkdocs serve, which builds the website, starts a local web server so you can visit the rendered version at e.g. http://0.0.0.0:8000/, and automatically rebuilds whenever you change a source Markdown or configuration file.
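
The local build steps above can also be scripted. The following is only a sketch: it assumes a POSIX layout for the virtual environment (venv/bin/python), and the helper names docs_build_commands and build_docs are hypothetical, not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def docs_build_commands(venv_dir="venv", requirements="requirements.txt"):
    """Return the commands for a one-time local build, in order."""
    # Assumption: POSIX virtualenv layout; on Windows the interpreter
    # lives under Scripts\ instead of bin/.
    venv_python = str(Path(venv_dir) / "bin" / "python")
    return [
        # Create the virtual environment.
        [sys.executable, "-m", "venv", venv_dir],
        # Install mkdocs and its plugins as pinned in requirements.txt.
        [venv_python, "-m", "pip", "install", "-r", requirements],
        # Build the static site into site/.
        [venv_python, "-m", "mkdocs", "build"],
    ]


def build_docs(venv_dir="venv", requirements="requirements.txt"):
    """Run each step, stopping at the first failure."""
    for command in docs_build_commands(venv_dir, requirements):
        subprocess.run(command, check=True)
```

Calling build_docs() from the repository root performs the same steps as the manual commands; replacing the last command's "build" with "serve" would start the live-reloading server instead.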

handbook's People

Contributors

satra mhhennig arclamp djarecka jeffbaumes melster1010 catalystneuro thechymera yarikoptic kabilar lincbrain asmacdo

handbook's Issues

#117 needs fix up

#117 is needed, but it now fails to build:

raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: .icons/fontawesome/brands/x-twitter.svg

We need to establish CI to build/test it in PRs...

challenges vs solutions

I think the sections The Challenges and Our Solution should match more closely in the intro section:
https://www.dandiarchive.org/handbook/01_introduction/

Edit new subsections of Using DANDI

Apply style guidelines to the 5 new subsections added recently to Using DANDI (View, Download, Upload, Publish) and add them to nav TOC under a new umbrella heading "User Guide" (as a parallel to the "Developer Guide" heading).
Make the subheadings parallel with imperative - Download vs Downloading.
In Debugging section, flatten hierarchy by removing Acquiring Debugging Info subsection, only need one heading

hard to find upload instructions

The new organization of the handbook makes it much harder for a new user to find instructions for uploading data. Can we please add a link to this on the left or right panel?

Add Contributing section

to go along with Developers information (https://github.com/dandi/handbook/pull/11/files) and outline common workflows/agreements (commit messages structure, linting etc).

File view: update screenshot and add paragraph listing/showing external services integrations

ATM we do not show availability of those extra icons and ability to jump to external services, e.g. smth like

Provide short example for developers on how to send authenticated request to API

inspired by dandi/dandi-archive#1790 (comment)
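
A possible starting point for such an example is sketched below. It is not taken from the issue: the base URL, the "token" authorization scheme, and the helper name authenticated_request are assumptions that should be checked against the API documentation before they go into the handbook.

```python
import urllib.request

# Assumption: base URL of the DANDI API; adjust for other instances.
API_BASE = "https://api.dandiarchive.org/api"


def authenticated_request(path, api_key, base=API_BASE):
    """Build (but do not send) a request carrying an API key header."""
    url = f"{base.rstrip('/')}/{path.lstrip('/')}"
    headers = {
        # Assumption: the server expects "token <key>" in this header.
        "Authorization": f"token {api_key}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)
```

The request can then be sent with urllib.request.urlopen(request). Building the request separately from sending it keeps the key handling easy to inspect and test.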

Handbook 1st pass

perform thorough copyedit based on the style sheet guidelines

Add instructions about dandi-cli credentials management

dandi/dandi-cli#267 adds support in dandi-cli for a fallback when a system-wide credentials store is not available: specifying an alternative to the keyring's default, namely an encrypted keyring. It would require a password.

We should have a subsection in https://github.com/dandi/handbook/blob/master/docs/10_using_dandi.md which describes that and also provides a way to do "fully automated" setups, e.g. via an env variable or by establishing a plain text credential store (which would not require a password).
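
For the environment variable route, an unattended script could look the key up along these lines. This is a sketch under assumptions: the variable name DANDI_API_KEY is assumed here and should be confirmed against the dandi-cli documentation, and the helper api_key_from_env is hypothetical.

```python
import os


def api_key_from_env(environ=None, var="DANDI_API_KEY"):
    """Return the API key from the environment, or fail loudly."""
    # Assumption: the key is exported under the name given by `var`.
    environ = os.environ if environ is None else environ
    key = environ.get(var, "").strip()
    if not key:
        # Failing early avoids an interactive password prompt
        # in a job that has no terminal attached.
        raise RuntimeError(f"{var} is not set")
    return key
```

Passing a mapping explicitly, e.g. api_key_from_env({"DANDI_API_KEY": "..."}), makes the lookup testable without touching the real environment.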

Updating the website to include how to stream Dandi datasets

Hello! I really appreciate what you guys have done!

I noticed that the Download Data and Dandisets part of the website only covers how to hard download data, not how to stream it. Thanks to the hackathon I know there is a tutorial made for streaming data instead. Perhaps incorporate that into the main website as well?

Thanks

Refactor the About section

Either make Terms and Policies separate sections and eliminate About, or make an Appendix or rename to About DANDI so that we don't have About and About this Doc, rather confusing.

Refactor the About This Doc section

The Welcome section links people to About This Doc for info on how to contribute to the doc, but it only tells you how to serve the docs locally. Need additional sections prior to that one.

Restructure the Storing Access Credentials subsection

In “Storing Access Credentials” in the Uploading Data section there are paragraphs bulleted and some not, needs to be organized/structured to better indicate what are options, what are notes

Add CI to PRs to build/test

We have a workflow which is triggered only on pushes to master, since it uses an action which pushes to gh pages.

Ideally we need a workflow to just build and, ideally, to preview in PRs. It seems we use mkdocs; https://github.com/bids-standard/bids-specification does too, but with RTD, which provides preview functionality.

So maybe we should integrate with RTD?

Update developer section

The developer section needs to be updated in relation to the new api server.

Data Standards section needs relocation and xref

Data Standards should logically come before the User Guide, not between the UG and Dev Guide, and should be cross-referenced from the appropriate bullet in the Intro.

Add Troubleshooting section?

which would likely just lead to our https://github.com/dandi/helpdesk/discussions .

One common case could be an outdated "system-wide" Python which we no longer support, which users hit when they follow the pip install dandi instruction in our UI.

Establish best practices for code/PRs review

E.g.

OOS (out of scope) for the PR ideas should be filed in separate issues (in the same repo or generic https://github.com/dandi/dandi-infrastructure or https://github.com/dandi/helpdesk as pertinent) and cross-referenced instead of stated verbatim in the comments

related: #15 discussion

Establish best practices for PRs

Relates heavily to best practices on commit messages and PR/feature branches.

I thought to formalize one here, but I think we just need to find an existing one that we agree with or can build upon.

Challenges/solutions table improvements

In the "introduction" there is a challenges and solutions table, which has some formatting errors, and the language within it has grammar (and other) inconsistencies.

Review and edit README files

Edit the READMEs for style consistency/grammar/etc
- dandi-archive (overall readme and web app readme) dandi/dandi-archive#1386
- dandi-cli dandi/dandi-cli#1170
- dandi-hub dandi/dandi-hub#50
- helpdesk dandi/helpdesk#87
- dandi-schema dandi/dandi-schema#152
Determine and implement a common structure where applicable

Add some kind of an "Assisted do-ocracy pledge" toward contributions (from within and outside the team)

I am thinking around these lines

This is a collaborative project and contributions to any component from any developer and outside contributor(s) are welcome
DANDI project encourages submitting PRs to resolve outstanding issues, in particular with severity-important and severity-critical labels
The primary developer team of the corresponding component pledges to provide timely feedback to bring submitted PRs to the acceptable state (and resort to provide an alternative implementation only if significant changes are required)

WDYT @dandi/dandiarchive @dandi/archive-maintainers @dandi/dandi-cli (we need to come up with @dandi/people super-team... will do some time)

Add style sheet to Handbook

Add high-level and technical documentation about DataLad datasets

It would be helpful to add high-level documentation that describes the motivation and use cases for DataLad datasets that mirror the Dandisets, as well as technical documentation that describes how to use these datasets.

Relevant repositories:

Provide a more exhaustive list of methods to download data

Triggered by @TheChymera

By default we just tell people to use dandi CLI. But there are also

datalad datasets
webdav interface
- allows for a very unified access to the entire archive -- people could in principle wget entire archive. For an individual release eg. wget -r -nH --cut-dirs=3 --no-parent --reject "index.html*" https://webdav.dandiarchive.org/dandisets/000027/releases/0.210831.2033/
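
The release URL used in the wget example follows a regular pattern, which can be sketched as a small helper. The function name webdav_release_url is hypothetical, and the zero-padding to six digits is inferred from the single example above rather than from any specification.

```python
# Host taken from the wget example above.
WEBDAV_BASE = "https://webdav.dandiarchive.org"


def webdav_release_url(dandiset_id, version, base=WEBDAV_BASE):
    """Build the WebDAV URL for one published release of a Dandiset."""
    # Assumption: identifiers are zero-padded to six digits,
    # as in the 000027 example.
    padded = f"{int(dandiset_id):06d}"
    return f"{base}/dandisets/{padded}/releases/{version}/"
```

For instance, webdav_release_url(27, "0.210831.2033") reproduces the URL passed to wget in the example.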

update handbook about organize and naming conventions

while the CLI was adjusted no updates were made to the handbook. @yarikoptic - it would be great if someone updated the section on organize and what researchers should be doing.

Include notebooks into handbook?

Motivated by seeing how notebooks are included in https://nasa-impact.github.io/veda-docs/notebooks/quickstarts/open-and-plot.html manual . That should improve their content indexing/findability by google.

WDYT?

Provide explicit guidance/expectations for zarr uploads

Zarr uploads still remain in Draft mode, as they are yet to be versioned.

Our handbook does not provide documentation for "what to expect" for Zarr uploads in the DANDI ecosystem. (e.g. https://github.com/dandi/handbook/blob/master/docs/13_upload.md does not contain any reference)

The goal of this Issue is to capture follow-up work to communicate the Zarr workflow correctly in this handbook, dandi-cli and dandi-archive

Relates to dandi/dandi-archive#1811

usability notes

In Uploading a Dandiset, Setup:

Provide screenshot of where the API key is
Add link to instructions for Python virtualenv
Change to simply “pip install -U dandi”

In Uploading, Data upload/management workflow:

“New Dataset” -> “NEW DANDISET” even better, show a picture of this button inline in the instructions
“Reach out to us for help” should link to github helpdesk chooser in new window
Add instructions for non-NWB files, break it into NWB and non-NWB
- In NWB-specific subsection, make sure you have an updated version of pynwb
- If you are having trouble with validation, make sure conversions were run with the most recent version of PyNWB and MatNWB

In download section,

Downloading a Dandiset.

Downloading a subject.

Downloading a file.

needs to be expanded out

Enhance the Using DANDI section for new users

Lacks a straightforward set of “how to use” instructions, do you really download a dandiset before creating an account? An overview of typical major steps would be helpful.
Perhaps add Quick Start, you don’t need an account to use DANDI, but then if you want your own dandiset and contribute one, you DO need an account-> 6/24/22
Relocate the schematic figure to Developer Guide section, too technical for new users -> 7/11/22

Developer guide: add more about validation

ATM we have quite a lot of "heterogeneity" in how we do validation. We have multiple layers

pynwb
bidsschematools
nwbinspector
dandischema
dandi-cli glueing: uses above + adding more (zarr checking) - relies on content to be available, not just validation of extracted metadata
dandi-archive: invokes validations of dandischema, validates extracted metadata, not relying on content being present

We also have two "dataset layouts": DANDI and BIDS, with DANDI being our "ad-hoc" layout which is instrumented in code in dandi-cli.

NB We might move DANDI layout into dandischema

We have https://www.dandiarchive.org/handbook/135_validation/ which outlines to user only some, very limited, set of validations. We should get some more detailed description of "validation framework" and "DANDI layout" here.

edit: and then worth getting some meeting/presentation for DANDI team to sync our knowledge etc on all these aspects.

add clearer descriptions of the archive

based on the questions below, it would be useful for us to create a page in the handbook that directly addresses this.

questions from priyanka subhash at USC

Data storage - Cloud-based platform; user must upload data through Github
Type of data - The archive accepts cellular neurophysiology data including electrophysiology, optophysiology, and behavioral time-series, and images from immunostaining experiments
Data collection - User must create a DANDI account, create a Python environment, install the DANDI CLI into Python environment, register a dandiset to generate an identifier, convert data to NWB
Data maintenance - Data is maintained on Github platform
Data Upload Process - User must create a DANDI account, create a Python environment, install the DANDI CLI into Python environment, register a dandiset to generate an identifier, convert data to NWB
Data access - A Github account is required to access/download public datasets and to be given access to private datasets through DANDI
Data download - Using the Web application - Each Dandiset has a View Data option. This provides a folder-like view to navigate a Dandiset. Any file in the Dandiset has a download icon next to it. You can click this icon to download a file to your device where you are browsing or right click to get the download URL of the file. You can then use this URL programmatically or in other applications such as the NWB Explorer or in a Jupyter notebook on Dandihub. Using the Python CLI - Install Python client using DANDI code
Accepted Data File Formats - NWB, BIDS, NDM
In-house analytical tools - Dandihub provides a Jupyter environment to interact with the DANDI archive. To use the hub, you will need to register an account using the DANDI Web application. Please note that Dandihub is not intended for significant computation, but provides a place to introspect Dandisets and files.

How to cite DANDI?

We have been mentioned in a new publication (https://www.sciencedirect.com/science/article/abs/pii/S0165027020303952), but there was no citation. I think it would be nice to tell the user how to cite DANDI in the FAQ. Should we use Zenodo to get a DOI for the archive itself?

Fix broken links in Data Standards section

The links are broken for the BIDS Getting Started page and the NWB Conversion Tools user guide.

Remove code element format on "Dandiset"

"Dandiset" is inappropriately styled with code element formatting:

remove from handbook
update stylesheet

Needs search capability

ATM I do not see any way to search besides geeky "go to github, search in the source"

Brain link in handbook should link back to top level

There's a brain image in the handbook page which is just a link to the top level of the handbook; I expected this link to take me back to the dandiarchive home page.

Document specific requirements for upload of NWB files

DANDI requires certain metadata to be present in an NWB file for upload. This is checked using the nwbinspector with the dandi config. However, these requirements are not documented besides looking at:
https://github.com/NeurodataWithoutBorders/nwbinspector/blob/dev/src/nwbinspector/internal_configs/dandi.inspector_config.yaml

Can these requirements be documented in the DANDI handbook?

Code formatting for data upload docs messed up

On https://www.dandiarchive.org/handbook/10_using_dandi/#uploading-a-dandiset section 2.b.iii, the fenced code block does not appear correctly:

According to https://www.mkdocs.org/user-guide/writing-your-docs/, "fenced code blocks can not be indented. Therefore, they cannot be nested inside list items, blockquotes, etc."

README.md should contain instructions for building docs locally

data standards page

Refactor the Welcome section

Fix broken/incorrect links
The 3rd bullet in “How to Use this Documentation” is just an fyi
The link to the project structure is for devs, right? Should be a general bullet that includes this for the Developer Guide
Reorder subsections to be more logical (How to use this Doc first, then how to communicate, Contributing/Feedback, License)

favicon not showing up

The favicon is not working for me

Separate "Developer User Guide" or just "Developer Guide" and rename into "DANDI Developer Guide"?

We have User Guide which is mostly user oriented -- users who aim to access or upload data, not write software to integrate with DANDI.

I wonder if we should make more explicit separation here, and have a section for people who are developing based on DANDI, e.g. using API, Python lower level APIs etc.

E.g. ATM in the light of

dandi/dandi-archive#1891

I was looking for a place to add advice/requests on using it efficiently, with lower impact on our services, e.g.

use max page size if aiming for a full list of assets in dandiset
use glob for /assets/ listing whenever aiming for specific file types (not to fetch all and filter on the client).

and that should possibly be accompanied by examples of how to do it both in API calls and in Python interfaces.
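
On the API side, such an example might start from a URL builder like the sketch below. The endpoint path and the query parameter names glob and page_size are assumptions to verify against the API schema, the helper assets_listing_url is hypothetical, and the maximum page size is left to the caller because it is not stated here.

```python
from urllib.parse import urlencode

# Assumption: base URL of the DANDI API.
API_BASE = "https://api.dandiarchive.org/api"


def assets_listing_url(dandiset_id, version, glob=None, page_size=None,
                       base=API_BASE):
    """Build an assets listing URL that filters and pages server-side."""
    params = {}
    if glob:
        # Let the server filter by path pattern instead of fetching
        # every asset and filtering on the client.
        params["glob"] = glob
    if page_size:
        # Larger pages mean fewer round trips for a full listing.
        params["page_size"] = page_size
    url = f"{base}/dandisets/{dandiset_id}/versions/{version}/assets/"
    return f"{url}?{urlencode(params)}" if params else url
```

Keeping both hints as optional arguments lets the same helper serve the full-listing case (only page_size) and the file-type case (only glob).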

WDYT?

Improve Developer Guide

Add Table of Contents in upper-right corner like all the other sections
Add Intro (needs a file or a paragraph at least)
Rename the Notes subsection to something more meaningful
Reformat long bulleted list as a table
Order repositories in some logical way or even broken into categories if applicable.
Needs a reference to the dandi schema README (as done for the api, archive, and cli)

Add a section on editing Dandiset metadata

There is a small section that describes the Dandiset metadata on the Viewing Dandisets page, but we should create a separate page on how to edit Dandiset metadata with more details.
Include instructions on how to add Funding information.

add page on identifiers in dandi

dandi uses several types of identifiers and users have to figure out what to do. many are not used to ontologies or identifiers. a page should describe the different types of identifiers, uris, and urls recommended for dandi.

from: dandi/helpdesk#49 (comment)

add NWB best practices section

@bendichter - given some of the issues being discussed with the biccn datasets, i realized that there is no information in the handbook on the dandi extension or how to add such info. also slice/tissue info would be relevant to ophys as well and perhaps the term ndx-dandi-icephys may semantically be confusing.

it would be nice to have a clear section in the handbook on pointers to dandi related nwb.

refactor docs for creating an account

We currently have docs for creating an account here within the section for creating and uploading Dandisets. This is not ideal, because a user may want to create an account in order to access DANDI Hub to analyze existing data, and such a user might find it difficult to find this section. I think it would be better to have a separate section on signing up for DANDI that is linked to by the section for creating and uploading Dandisets as well as the section on the DANDI Hub. This section should also explain when a DANDI login is necessary and when it is not (i.e. if all you want to do is download data)