trellis's Introduction

Trellis: Topic Model Aggregation and Visualization

Trellis aims to facilitate text corpus exploration and provide an interactive approach to refining a topic model. The Trellis application allows users to take topic models with large numbers of topics and form hierarchies from these topics. This hierarchy can be used to organize and sort the underlying text corpus, allowing users to find or read documents related to a topic.

Trellis is intended to be run locally. It is impractical to upload thousands or tens of thousands of text files, so we use the shinyFiles package to access files locally. The tool will therefore not be functional if deployed to a remote server.

Installation

Until Trellis is released on CRAN, the easiest way to install is with devtools::install_github("ajbc/trellis", build_vignettes=TRUE).

Use

Use trellis::launchTrellis() to start the application. See vignette("trellis") for details.

Feedback

Trellis is young and under active development. If you have any feedback, please submit an issue to this repository or email Thomas Schaffner at t.f.schaffner [AT] gmail.com.

trellis's People

Contributors

afriendlyrobot, ajbc, gwgundersen

trellis's Issues

resume clustering

Upload an old file, or otherwise resume clustering after shutting down the session.

update cluster titles

As a follow-up from #1, when clusters are updated, the titles should update accordingly. Commit a9e13ae sends the new titles to the HTMLwidget, but the text itself doesn't update yet. @gwgundersen, could I get some help finishing this off?

Test/improve performance

We should find the limits on data size and on the value of K for which the tool runs smoothly, and work on supporting larger data sets and higher values of K.

move clusters

Shift+click for clusters should move the cluster as a whole, instead of creating a new cluster containing the cluster (without the shift, it merges clusters). Alternatively, we could add a different key modifier and keep the current behavior. Regardless, we need a way to move clusters.

@AFriendlyRobot want to tackle this one?
@gwgundersen advice welcome

update which bubble labels display (bubble view)

Currently, zooming does not change which topic titles are displayed in the bubble widget, no matter how deep the hierarchy is. Zooming should allow more titles to be displayed (hierarchically).
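One possible rule, sketched below, is to reveal one more level of titles each time the zoom factor doubles. This is only an illustration: the `depth` and `title` fields, the function names, and the doubling rule itself are assumptions, not the widget's actual data model.

```javascript
// Hypothetical rule: titles at the top level are always shown, and each
// doubling of the zoom factor k reveals one more level of the hierarchy.
function maxLabelDepth(k, baseDepth = 1) {
  let depth = baseDepth;
  for (let z = 2; z <= k; z *= 2) {
    depth += 1;
  }
  return depth;
}

// nodes: array of { title, depth } objects (assumed shape).
// Returns the titles that should be displayed at zoom factor k.
function visibleLabels(nodes, k) {
  const limit = maxLabelDepth(k);
  return nodes.filter((n) => n.depth <= limit).map((n) => n.title);
}
```

With this rule, a zoom factor between 2 and 4 shows titles down to depth 2, and a factor of 4 adds depth 3.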

Separate zooming and window-resizing functionalities

The current system uses a single value (k) to denote a "zooming" ratio. However, no matter how large the window is at initialization, k is initialized to 1. This causes text-resizing issues in particular: starting in a small window and enlarging it gives different results from starting in a large window and shrinking it.
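A minimal sketch of the separation, assuming the window scale is measured against a fixed reference width and that text size should depend on the window scale only. The reference width of 1000 px and all names here are hypothetical.

```javascript
// View state keeps user zoom and window scale as two independent values.
// referenceWidth is an assumed constant, not something Trellis defines.
function createViewState(initialWidth, referenceWidth = 1000) {
  return {
    zoom: 1,
    windowScale: initialWidth / referenceWidth,
    referenceWidth,
  };
}

// Resizing the window changes only windowScale.
function onResize(state, newWidth) {
  return { ...state, windowScale: newWidth / state.referenceWidth };
}

// Zooming changes only zoom.
function onZoom(state, k) {
  return { ...state, zoom: k };
}

// In this sketch, text size follows the window scale and ignores zoom.
function fontSize(state, basePx) {
  return basePx * state.windowScale;
}
```

Because the window scale is derived from the current width rather than from the width at initialization, a window that starts small and grows ends up with the same text size as one that starts large.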

Changing number of clusters after moving nodes causes crash

This seems to happen fairly frequently, but I have not yet identified exact conditions necessary or sufficient to produce the behavior. Reported error:

[Screenshot of the error, taken 2017-09-16; image not available]

Note that the numbers at the end seem to change, and I have been able to produce this error in several trials by moving nodes around and then increasing or decreasing the number of clusters parameter.

Dynamically load documents, sorted per topic (left pane)

Currently, we statically load the top-weighted 100 documents per topic. Ideally we will be able to dynamically load more documents as the user scrolls. This will increase performance while also allowing the user to explore potentially all documents.
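As a rough sketch of the scrolling approach, the pane could request the next slice of a topic's ranked document list whenever the user nears the bottom. The function names, the page size of 100, and the 200 px threshold are assumptions for illustration only.

```javascript
// sortedDocIds: document ids for one topic, already sorted by weight (assumed).
// Returns the next batch to render after `loadedCount` documents are shown.
function nextPage(sortedDocIds, loadedCount, pageSize = 100) {
  return sortedDocIds.slice(loadedCount, loadedCount + pageSize);
}

// True when the pane is scrolled to within thresholdPx of its bottom.
// The three measurements correspond to the DOM properties of the same names.
function shouldLoadMore(scrollTop, clientHeight, scrollHeight, thresholdPx = 200) {
  return scrollTop + clientHeight >= scrollHeight - thresholdPx;
}
```

A scroll handler on the left pane would call `shouldLoadMore` and, when it returns true, append the result of `nextPage` until the slice comes back empty.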

enforce visual consistency between widgets

When changing between the bubble widget and tree widget, some visual inconsistencies can occur. For example, if the user selects node A in the bubble widget, then switches to the tree widget and selects node B, then switches back to the bubble widget, node A will still APPEAR selected. However, the backend will accurately consider node B to be selected (displaying documents related to node B on the sidebar).

One simple option is to clear selected nodes when changing between widgets, but this is not necessarily the best behavior.
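An alternative to clearing the selection is to keep it in one shared store that both widgets read from, so neither widget holds its own copy. The sketch below is hypothetical; `createSelectionStore` and its methods are not part of Trellis.

```javascript
// A single source of truth for the selected node id.
function createSelectionStore() {
  let selected = null;
  const listeners = [];
  return {
    get() {
      return selected;
    },
    // Update the selection and notify every subscribed widget.
    select(id) {
      selected = id;
      listeners.forEach((fn) => fn(selected));
    },
    // Register a widget's redraw callback and sync it immediately.
    subscribe(fn) {
      listeners.push(fn);
      fn(selected);
    },
  };
}
```

Each widget subscribes once and redraws its highlight from the store's value, so switching from the tree widget back to the bubble widget shows node B as selected.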

Create smaller data set

Create a data set smaller than the Wikipedia data set that can be uploaded to the repository.
It should follow the current format of the .RData file, and will allow for faster iteration during development and debugging.

display document metadata

At some point, we may want to allow document metadata to be displayed for each document (if provided).

One solution: In the directory of text files, also include RData files containing whatever data should be displayed.

handle node merging/moving with collapsed nodes (tree view)

If a node is collapsed before a drag-and-drop merge of clusters (in the tree view), the underlying data structure will correctly merge/move. However, visually, the previously collapsed/hidden nodes will remain collapsed/hidden but the newly moved nodes can appear as children of a collapsed node.

Suggested fix: Uncollapse nodes when merging (seems like a reasonable behavior)
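The suggested fix might look roughly like the sketch below. It assumes the common D3 collapsible-tree convention of stashing hidden children in a `_children` field; whether the tree widget stores collapsed nodes this way is an assumption, and the function names are hypothetical.

```javascript
// Restore any hidden children so the node renders expanded.
function expand(node) {
  if (node._children) {
    node.children = (node.children || []).concat(node._children);
    node._children = null;
  }
  return node;
}

// Move `node` from oldParent to newParent, expanding the target first so
// the moved node never appears under a parent that is still collapsed.
function moveNode(node, oldParent, newParent) {
  expand(newParent);
  oldParent.children = (oldParent.children || []).filter((c) => c !== node);
  newParent.children = (newParent.children || []).concat([node]);
}
```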

Include styling for image exporting

Currently exporting an SVG image does not include the associated CSS styling.
Solution: Create a dummy copy of the SVG elements, using Window.getComputedStyle() to copy style explicitly.
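A sketch of that approach follows. The list of copied properties is an assumption (a real export may need more), and `inlineStyles` and `exportSvg` are hypothetical names; `cloneNode`, `getComputedStyle`, `getPropertyValue`, `style.setProperty`, and `XMLSerializer` are standard browser APIs.

```javascript
// Assumed subset of properties worth copying; extend as needed.
const STYLE_PROPS = ["fill", "stroke", "stroke-width", "font-size", "font-family", "opacity"];

// Walk the live element and its clone in parallel, writing the live
// element's computed values onto the clone as inline styles.
function inlineStyles(source, clone, getStyle) {
  const computed = getStyle(source);
  for (const prop of STYLE_PROPS) {
    const value = computed.getPropertyValue(prop);
    if (value) {
      clone.style.setProperty(prop, value);
    }
  }
  const sourceKids = Array.from(source.children);
  const cloneKids = Array.from(clone.children);
  sourceKids.forEach((child, i) => inlineStyles(child, cloneKids[i], getStyle));
}

// Browser-only entry point: returns SVG markup with styles inlined.
function exportSvg(svgElement) {
  const clone = svgElement.cloneNode(true);
  inlineStyles(svgElement, clone, (el) => window.getComputedStyle(el));
  return new XMLSerializer().serializeToString(clone);
}
```

Passing the style lookup in as a function keeps the tree walk independent of the browser, which makes it easier to exercise outside the page.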

Improve file parsing

Desired:

  1. Expand volumes available
  2. Improve file/directory path parsing

Potential solutions

  1. Use shinyFiles getVolumes() augmented with custom code (primarily for Windows)
  2. Use shinyFiles parseFilePath()/parseDirPath() methods

Consistently display edited titles

After moving node C (with an edited title) from parent P1 to parent P2, C's old title displays on the first mouseover of C. The new (edited) title then displays while all visible titles redisplay. Finally, C's title goes blank.

can't unselect a leaf node

When a leaf node is selected, it must either be moved to a new parent or a different leaf node must be selected. We should be able to unselect it.

@gwgundersen not sure if this is a side-effect of the hierarchical changes or not (thought we used to be able to do this).

adjust colors

Per discussion for #1, we want to make sure the users' attention is drawn to the right things.

Also ensure that colors adjust depending on the depth of the hierarchy.

Scrolling with Topic Summary pane open changes mouseover position and resets pane

When hovering over a topic, a Topic Summary panel appears in the left column. However, if the user scrolls in certain ways while this panel is open (see below), the panel hides itself and then reappears, causing the page to expand and contract and the visualization to appear to bounce up and down.

Conditions for this to occur:
It appears that when the user (starting fully scrolled to the top of the page) hovers over a topic bubble and opens the panel, then scrolls down so that the mouse is hovering over the root node only, the topic summary panel will hide (no specific topic with a summary is hovered). However, this shrinks the total height of the page, which effectively pushes the mouse back up. If the positioning of the mouse is "correct", this new position will be over a topic bubble.
The topic summary panel will then expand again. Because the page was scrolled to the bottom previously, Chrome at least (not yet tested on other browsers) will reset the scrolled position of the page to the bottom. Therefore, the mouse is no longer hovering the relevant topic bubble, and the cycle continues.

Likely requires a little more examination to fully identify all necessary and sufficient conditions.

@ajbc Again, I'm not sure if this is necessary to address before Text as Data. However, I don't think any overhaul we've discussed would be likely to address this issue specifically, and this issue is therefore probably independent from much of the potential work for the Overhaul milestone.

color coding of nodes

Allow users to color-code or flag individual nodes/topics/clusters. E.g. a user could change a cluster to green if they think the cluster is well-formed.
