asyml / stave

An extensible framework for building visualization and annotation tools to enable better interaction with NLP and Artificial Intelligence. This is part of the CASL project: http://casl-project.ai/

Home Page: https://asyml.io

License: Apache License 2.0

HTML 0.26% JavaScript 0.82% TypeScript 80.70% CSS 2.33% Python 15.67% Shell 0.23%
annotation nlp petuum visualization casl-project

stave's Issues

Show Pointer objects as links

Is your feature request related to a problem? Please describe.
Forte's new Pointer objects standardize the connection between entries. Pointers can be attributes of entries, but Stave's current main view does not support them.

Describe the solution you'd like
A simple solution is to display Pointer objects as a link when displaying the attribute set, so a user can follow this link to jump to the relevant one. This solution may depend on #59

Describe alternatives you've considered
Showing simply some identifier of the pointing object is possible, but not very informative.

Additional context
N/A

Render complex attributes of entries.

Is your feature request related to a problem? Please describe.
Currently, complex attributes of entries (e.g. List, Dict) are not supported. "N/A" or "Complex Object" is shown in the attribute slot instead.

Describe the solution you'd like
Since the set of complex object types is limited and non-recursive, it should be safe to implement a one-level collapsible view for List and Dict. A user can expand the view to see the contents of the List or Dict.

Describe alternatives you've considered
No

Additional context
No

No selection indicator during "Add Annotation"

Describe the bug
The original annotation-adding interface was introduced in #23; however, the selection indicator has been missing since a change in the layout.

To Reproduce
Steps to reproduce the behavior:
In the current version, try to add an annotation by selecting a span. The arrows and highlight are not shown as expected.

try docker for the dev environment

Currently we have to install and start both the frontend (JS + React) and the backend (Python + Django) to be able to develop. It might be worth trying Docker to simplify the web workflow.

Strategy to send annotations to back end

Annotations are currently sent to the back end by sending over the whole data pack, which incurs a large network overhead. Sending only the diff would be sufficient. There are two possible approaches:

  1. Sending the actual diff between the two packs.
  2. Sending the annotation actions and applying them in the back end.
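
As a rough illustration of the first approach, the sketch below computes a diff between two annotation lists keyed by `_tid`. The flat dictionary layout and the function name `diff_annotations` are assumptions made for this example; they are not Stave's or Forte's actual API.

```python
from typing import Any, Dict, List

Annotation = Dict[str, Any]


def diff_annotations(old: List[Annotation], new: List[Annotation]) -> Dict[str, List[Any]]:
    """Return what changed between two annotation lists, keyed by `_tid`.

    Hypothetical helper: assumes each annotation is a flat dict with a `_tid` key.
    """
    old_by_id = {a["_tid"]: a for a in old}
    new_by_id = {a["_tid"]: a for a in new}
    return {
        # Entries that exist only in the new list.
        "added": [a for tid, a in new_by_id.items() if tid not in old_by_id],
        # Ids that exist only in the old list.
        "removed": [tid for tid in old_by_id if tid not in new_by_id],
        # Entries present in both lists whose content differs.
        "changed": [
            a for tid, a in new_by_id.items()
            if tid in old_by_id and old_by_id[tid] != a
        ],
    }
```

Only the three short lists would need to travel over the network, rather than the whole pack.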

Legend selection improvements

Two possible improvements in legend selection:

  1. Select all/unselect all/reverse-select in legends
  2. Radio selection of the attribute under a particular legend should follow its legend's behavior

Account menu in the top right corner

Is your feature request related to a problem? Please describe.
We should add a menu for account management in the top right corner of the header.

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
N/A

URL for annotated objects

It would be very helpful for communication if we followed RESTful API conventions so that most objects have their own URL. That would make it much easier to share our annotations.

Real upload interface and configurable parameters

Is your feature request related to a problem? Please describe.
The current data upload panel is simply transferring the text to the front end, which has many limitations. The interface requires a real upload tool with basic functionalities.

On the other hand, it would be nice if some of the configurations of the upload tool can be customized. One example includes changing the file upload limit of the tool.

Header or Dashboard

Is your feature request related to a problem? Please describe.
Related to issue #101, "Header should not be visible when logged out."
Currently the Header is in the App component, which is shown on every page.

Describe the solution you'd like
Two solutions are:

  1. Keep it in App.tsx and hide it on some pages such as login, signup
  2. Create a new page called Dashboard which works as a console. Projects and the individual project can be imported to this page so that the user can manage them without jumping to another page.

Short demo

demo_dashboard

Multiple people working on the same project.

Describe the bug
When two users are working on one project simultaneously, synchronizing their changes becomes a problem.

To Reproduce
Steps to reproduce the behavior:

  1. Open two accounts and edit the same table at the same time.

Expected behavior
The system should either allow only one user to edit at a time or handle simultaneous edits:

  1. One user will lock the document when editing the project. This user will be logged out automatically after some timeout.
  2. Support simultaneous edits to the same file.
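
A minimal sketch of the first option (one editor at a time, with the lock expiring after a period of inactivity) could look like the following. The class name `DocumentLocks`, the in-memory storage, and the 15-minute default are assumptions for illustration; a real backend would more likely keep this state in the database.

```python
import time
from typing import Dict, Optional, Tuple


class DocumentLocks:
    """Hypothetical in-memory lock table: one editor per document."""

    def __init__(self, timeout_seconds: float = 900.0):
        self.timeout = timeout_seconds
        # doc_id -> (user holding the lock, time of their last activity)
        self._locks: Dict[str, Tuple[str, float]] = {}

    def acquire(self, doc_id: str, user: str, now: Optional[float] = None) -> bool:
        """Try to take (or refresh) the lock; return False if another user holds it."""
        now = time.time() if now is None else now
        holder = self._locks.get(doc_id)
        if holder is not None:
            owner, last_active = holder
            # Another user's lock still counts until it has been idle for `timeout`.
            if owner != user and now - last_active < self.timeout:
                return False
        self._locks[doc_id] = (user, now)
        return True
```

Calling `acquire` on every edit request both checks the lock and refreshes the holder's activity time, so only an idle editor loses the document.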

Unexpected text auto-wrapping causing highlight brackets to be off by 1 line.

There are several UI glitches that could be caused by similar calculation errors; see the following screenshots and descriptions.

image

This screenshot shows that sentences longer than the screen width are automatically wrapped (see the words "before" and "face"). The wrapping is probably unexpected, and the annotation highlight positions after such wraps appear to be slightly off; see the following image:

image

Note that after the first auto-wrap "was", the highlighting is off by 1 line, and after the second auto-wrap "calmness", the highlight is off by 2 lines.

documents are not listed on the project page

Describe the bug
documents are not listed on the project page

To Reproduce
Steps to reproduce the behavior:

  1. Log in.
  2. Click a project that you have access to.

Expected behavior
When you click a project, you should be directed to the page where all documents of the project are listed. However, none of the documents are listed.

Screenshots
image

Environment (please complete the following information):

  • OS: WSL, Ubuntu 16.04 LTS (backend server and client server)
  • Browser: Chrome (on Win 10)

Additional context
I have examined the back end and found that it fetches the expected documents. The problem seems to lie on the client side.
image
In Project.tsx, I think "indoc" is inconsistent with "single_pack" in the database. However, after making them consistent, the problem still exists.
Moreover, the console message "[hmr] waiting for update signal from wds..." confuses me because I have not encountered it in my previous work.

Scope selector bug

Describe the bug
Sometimes we can get the following error when using the scope selector.

      TypeError: Cannot read property 'span' of undefined
      TextArea
      src/nlpviewer/components/TextArea.tsx:79
        76 | );
        77 | const currScopeAnnotation = scopeAnnotations[selectedScopeIndex];
        78 |
      > 79 | text = text.substring(
           | ^
        80 |   currScopeAnnotation.span.begin,
        81 |   currScopeAnnotation.span.end
        82 | );

This is likely caused by trying to use a scope type that has no annotations. For example, if the data pack contains no Token annotations, selecting Token in the scope selector causes this error.

To Reproduce
Steps to reproduce the behavior:

  1. Prepare the data: add a datapack without entries.
  2. Select one of the scopes in the scope selector, for example "Token" or "Sentence".
  3. You should see the TypeError above.

Expected behavior
Either report that this cannot be used as Scope, or do not show this in the Scope selector from the start.

Screenshots
image

Here is the scope selector

Environment (please complete the following information):

  • OS: [e.g. Mac OS and Chrome]
  • Version: Chrome 84.0
  • Node Versions: v12.16.1

move connect point to global

The connect point is currently rendered at each annotation, so it overlaps with the next annotation when the two are too close. Moving the connect point to a single shared global one should avoid that, and it also means fewer elements to render.

Extra corner annotation container

Describe the bug
There is an extra corner annotation box at the end of some lines.

To Reproduce
Steps to reproduce the behavior:

  1. This can be reproduced using the example DB data: abc0009
  2. Select sentence in legend
  3. Resize the window to appropriate sizes (I can reproduce the error at many different dimensions).

Expected behavior
One continuous annotation in the same line should be rendered as one annotation. However, at the right corner of the text area, sometimes we can see one additional annotation above white spaces.

The following two screenshots show the problem: on the same line there are two annotations for data id 940, and the one at the corner seems to be unnecessary.

Screenshots
Screen Shot 2020-06-17 at 5 13 24 PM


Screen Shot 2020-06-17 at 5 13 34 PM

Environment

  • OS: macOS Catalina 10.15.5
  • Browser: Chrome

Issues on user permissions

Is your feature request related to a problem? Please describe.
Currently there is only one owner/user of the project.
Normal users cannot access projects or documents that belong to other users. Adjustments are needed to better support collaboration.

Describe the solution you'd like
Currently there is only one user group (normal users, besides staff members), and normal users can only operate on their own projects.
A better design is to assign a role (e.g. editor, viewer) to each user for each project, which makes the permission system more flexible and better suited to collaboration. In addition, a project could have a 'public' or 'private' property.
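
To make the proposal concrete, here is a small sketch of a per-project role check. The role names beyond "editor" and "viewer", the `can_access` function, and the rule that public projects are viewable by anyone are all assumptions for illustration, not an agreed design.

```python
from enum import Enum
from typing import Dict


class Role(Enum):
    VIEWER = "viewer"
    EDITOR = "editor"
    OWNER = "owner"  # assumed extra role for the project creator


def can_access(action: str, user: str, members: Dict[str, Role], is_public: bool) -> bool:
    """Decide whether `user` may perform `action` ("view" or "edit") on a project.

    `members` maps user names to their role in this particular project.
    """
    role = members.get(user)
    if action == "view":
        # Assumption: public projects are readable by every user.
        return is_public or role is not None
    if action == "edit":
        return role in (Role.EDITOR, Role.OWNER)
    return False
```

Because the role is stored per project, the same user can be an editor in one project and only a viewer in another.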

Wrong ID generation for annotations

      {
        "py/object": "ft.onto.base_ontology.EntityMention",
        "py/state": {
          "_embedding": [],
          "_span": {
            "begin": 24,
            "end": 31,
            "py/object": "forte.data.span.Span"
          },
          "_tid": 2,
          "ner_type": null
        }
      }

When a new annotation arrives, the _tid generated by the backend should be an int, not a hash string. The int should be "the biggest id in the doc + 1".
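
The rule "biggest id in the doc + 1" can be sketched as below. The function name `next_tid` is hypothetical, and the sketch assumes the serialized pack layout seen in this project's JSON dumps: entries stored under `py/state` in `annotations`, `links`, `groups` and `generics`, each with its own `py/state._tid`.

```python
ENTRY_LISTS = ("annotations", "links", "groups", "generics")


def next_tid(pack: dict) -> int:
    """Return the largest integer `_tid` in the pack plus one (0 for an empty pack)."""
    state = pack.get("py/state", {})
    tids = [
        entry["py/state"]["_tid"]
        for key in ENTRY_LISTS
        for entry in state.get(key, [])
    ]
    # Ignore legacy string ids (e.g. hashes) so they cannot break max().
    int_tids = [t for t in tids if isinstance(t, int) and not isinstance(t, bool)]
    return max(int_tids, default=-1) + 1
```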

UUID int for new entry id.

Is your feature request related to a problem? Please describe.
Currently, ids are assigned in two inconsistent ways: one through next_id and one through uuid (str).

These two implementations should be unified, and we should keep all the ids to be int.

Describe the solution you'd like
The easy solution is to use uuid.int for all id assignment cases. This will also make it easy to do multi-person collaboration.
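
A minimal sketch of this idea with Python's standard `uuid` module is shown below; the helper name `new_entry_id` is made up for illustration.

```python
import uuid


def new_entry_id() -> int:
    """Return a random 128-bit integer id derived from a version-4 UUID."""
    return uuid.uuid4().int
```

One caveat worth keeping in mind: these integers can be as large as 2^128 - 1, far beyond JavaScript's `Number.MAX_SAFE_INTEGER` (2^53 - 1), so the front end would need to handle them as strings or `BigInt` values rather than plain numbers.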

Should have a reasonable login page.

Is your feature request related to a problem? Please describe.
The current login page is a very simple page without any styling. We need a login page with some reasonable styling.

Describe the solution you'd like
Adopt a style for the login page. Notably, center the form and make it clear for users.

Describe alternatives you've considered
N/A

Additional context
N/A

JSON Validation window

Is your feature request related to a problem? Please describe.

This project uses JSON in several key places, including:

  1. The data files (data packs)
  2. The ontology file
  3. The configurations

Currently, these files are uploaded or edited directly via simple text boxes, and no validation is performed. This can introduce bugs in later processing.

Describe the solution you'd like
Introduce a general JSON input, visualization, and validation box, which can take a schema as an argument. Some tools like this may be good: https://www.npmjs.com/package/jsoneditor-react

Separate backend to another view

Some backend controls, such as "All document" and "All users", currently appear at the top of the page. They should be moved to a different view that is accessible only by the admin, or these pages should be visible only to users with admin privileges.

Max Int assigned as "Start", which crashes the annotation interface.

Describe the bug
Some users have encountered an offset-out-of-range problem when adding annotations.

Note that 4294967295 = 2^32 - 1, so it is likely that some part of the system uses Max Int as the start offset while waiting for user input and then exits unexpectedly.

Upon closer investigation, this can be triggered by an Annotation with -1 as the start (or end). The following JSON can be used to reproduce this error:

      {
        "py/object": "forte.data.data_pack.DataPack",
        "py/state": {
          "links": [],
          "groups": [],
          "meta": {
            "py/object": "forte.data.data_pack.Meta",
            "py/state": {
              "pack_name": "71284",
              "_pack_id": 1035,
              "language": "eng",
              "span_unit": "character"
            }
          },
          "_text": "some text",
          "annotations": [
            {
              "py/object": "ft.onto.base_ontology.Token",
              "py/state": {
                "_span": {
                  "begin": -1,
                  "end": -1,
                  "py/object": "forte.data.span.Span"
                },
                "_tid": 268
              }
            }
          ],
          "generics": [],
          "orig_text_len": 9,
          "serialization": {
            "next_id": 269
          }
        }
      }

To Reproduce
Steps to reproduce the behavior:

  1. Copy the content above into a JSON file.
  2. Upload this file into the example project
  3. Open this file and you will see the error.

It is unclear which annotation steps result in a broken annotation like this.

Expected behavior

  1. The program should safeguard the annotation-adding process and exit gracefully when it is interrupted. For example, before adding, check that 0 <= start < end <= text.length.
  2. In case of errors in the annotation offsets (this can happen if users upload files created by third-party programs), the interface should probably report the error (with a bubble), but it should not raise an exception that crashes the page.
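
The offset check from point 1 can be sketched as follows and applied to an uploaded pack before it is rendered. The function names are hypothetical, and the pack layout is assumed to match the JSON used to reproduce this issue. The last line shows one plausible way (an assumption, not confirmed here) that -1 could turn into 4294967295: reinterpreting it as an unsigned 32-bit integer.

```python
from typing import List


def is_valid_span(begin: int, end: int, text_length: int) -> bool:
    """Check 0 <= begin < end <= text_length."""
    return 0 <= begin < end <= text_length


def invalid_annotation_tids(pack: dict) -> List[int]:
    """Return the `_tid` of every annotation whose span fails the check."""
    state = pack["py/state"]
    text_length = len(state["_text"])
    bad = []
    for annotation in state.get("annotations", []):
        entry = annotation["py/state"]
        span = entry["_span"]
        if not is_valid_span(span["begin"], span["end"], text_length):
            bad.append(entry["_tid"])
    return bad


# -1 reinterpreted as an unsigned 32-bit value is 2^32 - 1.
UNSIGNED_MINUS_ONE = -1 & 0xFFFFFFFF  # 4294967295
```

A non-empty result could be reported to the user instead of letting the viewer raise an exception.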

Screenshots
Screen Shot 2020-10-18 at 9 26 38 PM

Add hamburger menu

Is your feature request related to a problem? Please describe.
We need improvement on the current UI.

Describe the solution you'd like
Combine the items into a hamburger menu.

Describe alternatives you've considered
N/A

Additional context
N/A

Occasional annotation highlight mismatching.

With a recent bug fix, the annotation highlighting problem has disappeared in most scenarios, but there are still some odd annotation mismatch problems.

The problem seems harder to reproduce, so I have attached one specific datapack/ontology pair that triggers it. To reproduce, open this document, select the WikiSection box, and resize the browser window; you may observe something like the following screenshot:

Screen Shot 2020-06-12 at 11 46 29 PM

There are two interesting things here:

  1. Some annotations are off by a couple of lines.
  2. There are odd small shapes, as pointed to by the arrows. These should not appear, because the whole green area is the same annotation (WikiSection).

Model interpretation modules

Is your feature request related to a problem? Please describe.
Stave can currently visualize NLP results such as annotations and links. Another valuable type of information for machine learning practitioners is model insight (interpretability).

We can leverage toolkits for model interpretability, here is an example:
The paper: https://www.aclweb.org/anthology/2020.emnlp-demos.15.pdf
The code: https://github.com/pair-code/lit

According to this paper, the interpretable results are light-weight and stateless, so we can also pass them inside the DataPack.

Describe the solution you'd like
Here is a concrete plan of making this happen:

  1. Create a sub-ontology that sets up the data types needed for interpretability.
  2. Run the interpreter models on the data packs (this workflow may start from a Stave click)
  3. Store the interpretation results in the data pack using the sub-ontology.
  4. Create a plugin that wraps around LIT for the visualization.

Using machine learning backends for interactive front end.

Is your feature request related to a problem? Please describe.
Stave should support interactive data processing (e.g. annotation, processing) by communicating with a machine learning backend.

Describe the solution you'd like
One example plugin that can demonstrate this feature.

Describe alternatives you've considered
Building this feature into the main feature, but it is better to try on a plugin first.

Additional context
No

Editable attributes

Currently, attributes appear as read-only in the right sidebar. The ability to edit and save them would be quite nice too.

Standard way to add data into system

Currently, we can paste data from the backend or run the transform manually. A standard way to add input data to the system is required, such as a script for doing so.

Ability to view full name of legends in some form

The full legend name follows a class name convention, so two legends can be different but share the last segment of their names (e.g. company_a.Product vs. company_b.Product). While showing the partial name of a legend is a good approach, maybe we can allow users to hover and see the full name of the legend.
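
The collision is easy to detect programmatically, which could also be used to decide when a tooltip with the full name is most needed. The sketch below groups full legend names by their last segment; the function name is hypothetical and the logic is independent of how the front end actually stores legends.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_short_name(full_names: List[str]) -> Dict[str, List[str]]:
    """Map each short legend name (last dotted segment) to its full names."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for full_name in full_names:
        short = full_name.rsplit(".", 1)[-1]
        groups[short].append(full_name)
    return dict(groups)
```

Any short name that maps to more than one full name is ambiguous when displayed alone.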

Editable GUI layout

There are several panes in this GUI, such as legend pane, group plugin, meta data pane, etc. It would be nice if some of them can be closed. Two suggestions:

  1. The admin can specify which panes to show
  2. The user can adjust the size of the panes (and minimize if needed).

Header should not be visible when logged out.

Describe the bug
Currently, the header is shown when the user is logged out, which contains the logout button itself. This is quite confusing to the users.

To Reproduce
Simply start the project and you can see the header without login.

Expected behavior
The header should be hidden when logged out.

Screenshots
N/A

Some project configs are not working yet

To Reproduce
Steps to reproduce the behavior:

  1. Create a new project
  2. Add the ontology
  3. Now you can change the project configs (see screenshot)
  4. Only is_shown and layout_configs are working. The other configs are not implemented yet.

Screenshots
Screen Shot 2020-11-30 at 6 14 58 PM

New user can have the same name with an existing user

Describe the bug
A new user can be created with the same name as an existing user.

To Reproduce
Steps to reproduce the behavior:

  1. Go to "All Users" page
  2. Try to add another user with the same name.
  3. The new user appears in the list with the same name.

Expected behavior
The server should reject this new user.

Screenshots
image

Environment (please complete the following information):

  • Occurs in all environments

Support of corpus or project

An annotation project mostly involves multiple documents (e.g. a corpus). To support a corpus, the tool should:

  1. Front-end: Navigate the documents in the corpus (a next-document button and a document list to select from)
  2. Back-end: Store the documents together, and the documents should share the same ontology.

Project ontology should support upload files.

Is your feature request related to a problem? Please describe.
Currently, the project ontology is created by typing into a text box, which is not convenient. A better alternative would allow the user to upload the file directly.

Describe the solution you'd like
An upload box, preferably a drag-and-drop box, should be provided for uploading the ontology. It should also check whether the ontology is valid (at least check the MIME type). The user should be able to preview the uploaded ontology.

Describe alternatives you've considered
Upload should be the best option.

Additional context
Currently, ontology is uploaded using the following text box.
image

Log out an idle user after short timeout

Is your feature request related to a problem? Please describe.
If a user keeps a project open and does not log out, it may block other users.

Describe the solution you'd like
Logout the user automatically after a certain time period (15 minutes)
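
If the backend relies on Django's built-in session framework for login (an assumption; this issue does not say how sessions are managed), an idle timeout can be approximated with two standard Django settings. This is a configuration sketch, not a tested change to Stave.

```python
# settings.py (fragment)

# Sessions expire 15 minutes after they were last saved.
SESSION_COOKIE_AGE = 15 * 60  # seconds

# Save the session on every request so the expiry is pushed back while the
# user is active; only a user who stays idle for 15 minutes is logged out.
SESSION_SAVE_EVERY_REQUEST = True
```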

Describe alternatives you've considered
N/A

Additional context
This may be related to #102

Per project config

A config is used to control what/how legends/attributes should be displayed.

  1. When creating a new project, the config should be stored.
  2. Config should be editable from the UI.
  3. Config should be applied to the doc viewer.

Support data import by uploading files

Is your feature request related to a problem? Please describe.
Currently, data is added to the system via pasting the data pack and ontology into text boxes. For actual users, a data import interface should be implemented.

Describe the solution you'd like

  1. Have a page for data upload, accessible by privileged users.
  2. The page lists the projects, the user should first select a project.
  3. There should be a button to browse local data files. The user can select one and upload it.
  4. Should support zip files or individual json files.
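
The last point, accepting either a zip archive or a single JSON file, could be handled on the server roughly as follows. The function name `read_uploaded_packs` and the rule of dispatching on the file extension are assumptions for illustration; only standard-library modules are used.

```python
import io
import json
import zipfile
from typing import Dict


def read_uploaded_packs(filename: str, data: bytes) -> Dict[str, dict]:
    """Parse an upload into {file name: parsed JSON}.

    Raises json.JSONDecodeError for invalid JSON and zipfile.BadZipFile
    for a corrupt archive, so the caller can report the problem.
    """
    if filename.lower().endswith(".zip"):
        packs = {}
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for name in archive.namelist():
                # Skip directories and non-JSON members.
                if name.lower().endswith(".json"):
                    packs[name] = json.loads(archive.read(name).decode("utf-8"))
        return packs
    return {filename: json.loads(data.decode("utf-8"))}
```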

Additional context
This should be done after we have a project view.

Add permission check before creating a job

Is your feature request related to a problem? Please describe.
Currently, we do not check the job creator's permissions when creating a job. Instead, we only check the permissions related to the document and project.

Make backend NLP packages optional

Describe the bug
The current back-end NLP libraries (e.g. Forte, PyTorch) are assumed to be installed, but not all systems need them. They should be optional, and their absence should not raise errors in the backend.

To Reproduce
Steps to reproduce the behavior:

  1. Simply start the backend without installing Forte.

Expected behavior
The system can log or warn that the package is not installed, but it should not throw exceptions.
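
One common way to get this behavior in Python is to wrap the import and fall back to `None`. The helper name `optional_import` is hypothetical; the sketch only uses the standard `importlib` and `logging` modules.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def optional_import(name: str):
    """Import a module if it is installed; otherwise log a warning and return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        logger.warning(
            "Optional package %r is not installed; related features are disabled.",
            name,
        )
        return None
```

Code that needs the package can then check for `None` (for example `forte = optional_import("forte")`) and disable only the affected feature instead of failing at startup.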

Screenshots
NONE

Environment (please complete the following information):
All environments.

Build a model-free chatbot to demonstrate dialogue_box plugin

Is your feature request related to a problem? Please describe.
The current dialogue_box plugin can only be run with Forte models and requires preparing a lot of data (such as indexing). This makes it hard for users to simply try out the plugin.

Describe the solution you'd like
Build a model-free chatbot such as Eliza into Forte, so that we can demonstrate the plugin by calling this model. This chatbot should also be packaged with Forte on PyPI so users don't have to build from source.
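
To give a sense of how little is needed, here is a toy Eliza-style responder based on a few regular-expression rules. The rules and the `reply` function are invented for this sketch and are not part of Forte; a real Eliza has a much larger script and pronoun reflection.

```python
import re

# (pattern, response template) pairs, tried in order.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello. What would you like to talk about?"),
]
DEFAULT_REPLY = "Please tell me more."


def reply(utterance: str) -> str:
    """Return the response of the first matching rule, or a generic prompt."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Strip trailing punctuation from captured text before reusing it.
            captured = [group.rstrip(".!?") for group in match.groups()]
            return template.format(*captured)
    return DEFAULT_REPLY
```

Because it needs no model files or indexes, a responder like this can run as soon as the package is installed.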

Describe alternatives you've considered
None

Additional context
None
