Polyglot: Bio-inspired Visual Analysis of Language Embedding Data

Polyglot is a web application for visualizing language embeddings in 3D space. Language embeddings are typically high-dimensional vector representations of the syntactic and semantic content of words. This application lets the user examine a particular word embedding dataset, reduced to 3D using UMAP. In addition to 3D navigation of the scatter plot, the application lets the user view the exploration results of the Monte-Carlo Physarum Machine (MCPM), a computational model that simulates the self-organizing behavior of slime mold. MCPM has been shown to discover structures in the underlying data that follow the characteristics of optimal transport networks. Lastly, the application can color the dataset by each word's part-of-speech tag.
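As a rough illustration of part-of-speech coloring, the sketch below maps a token to a color. It is not taken from the Polyglot source. It assumes tokens are stored as `word_TAG` (for example `say_VERB`); that format, the palette, and the names `POS_COLORS`, `split_token` and `color_for` are all hypothetical.

```python
# Hypothetical palette: tag names follow Universal POS conventions,
# the colors themselves are arbitrary choices for this sketch.
POS_COLORS = {
    "NOUN": "#1f77b4",
    "VERB": "#d62728",
    "ADJ": "#2ca02c",
    "ADV": "#9467bd",
    "PROPN": "#ff7f0e",
}
DEFAULT_COLOR = "#7f7f7f"  # fallback for untagged tokens or unlisted tags


def split_token(token):
    """Split 'word_TAG' into (word, tag); tag is None when no tag suffix is found."""
    word, sep, tag = token.rpartition("_")
    # Treat the suffix as a tag only if it exists and is written in upper case.
    if not sep or not word or not tag.isupper():
        return token, None
    return word, tag


def color_for(token):
    """Return the display color for a token based on its part-of-speech tag."""
    _, tag = split_token(token)
    return POS_COLORS.get(tag, DEFAULT_COLOR)


if __name__ == "__main__":
    for token in ["say_VERB", "house_NOUN", "hello"]:
        print(token, color_for(token))
```

Splitting from the right keeps multi-word tokens such as `new_york_PROPN` intact, and anything without a recognizable tag falls back to a neutral color instead of raising an error.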

For this application, we use the Gensim Continuous Skipgram embeddings trained on the Wikipedia dump of February 2017 (296,630 words). The same dataset is reduced twice using UMAP with the same parameters, so that the persistence of underlying structures can be examined; switch between the two reductions using Select Dataset.
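One way to quantify how well structure persists between two reductions of the same data is to compare each point's nearest neighbors in both layouts. The sketch below is only an illustration of that idea, not part of Polyglot; the function names and the choice of Jaccard overlap are assumptions made here.

```python
import math


def nearest_neighbors(points, index, k):
    """Return the set of indices of the k points closest to points[index]."""
    origin = points[index]
    others = [i for i in range(len(points)) if i != index]
    others.sort(key=lambda i: math.dist(origin, points[i]))
    return set(others[:k])


def neighbor_overlap(layout_a, layout_b, k=2):
    """Mean Jaccard overlap of k-nearest-neighbor sets across two layouts.

    Both layouts must list the same points in the same order, and each
    must contain at least two points. 1.0 means every point keeps exactly
    the same neighbors; values near 0.0 mean the neighborhoods differ.
    """
    if len(layout_a) != len(layout_b):
        raise ValueError("layouts must describe the same points")
    scores = []
    for i in range(len(layout_a)):
        a = nearest_neighbors(layout_a, i, k)
        b = nearest_neighbors(layout_b, i, k)
        scores.append(len(a & b) / len(a | b))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    first = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5), (5, 6, 5), (6, 5, 5)]
    # Second layout: same shape, scaled and shifted, as a stand-in for a second run.
    second = [(2 * x + 10, 2 * y, 2 * z) for x, y, z in first]
    print(neighbor_overlap(first, second, k=2))
```

This brute-force version compares every pair of points, so it is only practical on a small sample; a dataset of the size used here would need a spatial index or subsampling.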

Hover the mouse over a word point to examine its content. The Show More toggle displays all the word tokens under the mouse pointer, not just the one closest to the screen.

Use Color Mode to switch between the slime exploration result and the part-of-speech distribution. The four sliders (Color Gradient, Lowest Weight, Lowest Connect, Opacity Fading) customize the visualization of the slime results. Lowest Connect is particularly helpful for decluttering the scatter plot view.
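The decluttering effect of threshold sliders can be pictured as a simple filter over the points. The sketch below is a guess at the general idea, not Polyglot's implementation: it assumes each point carries a slime `weight` and a `connect` value, and that points below either threshold are hidden. The `WordPoint` fields and `visible_points` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class WordPoint:
    token: str
    weight: float   # hypothetical: slime trace intensity at this point
    connect: float  # hypothetical: strength of the connection to the anchor


def visible_points(points, lowest_weight=0.0, lowest_connect=0.0):
    """Keep only the points at or above both slider thresholds."""
    return [
        p for p in points
        if p.weight >= lowest_weight and p.connect >= lowest_connect
    ]


if __name__ == "__main__":
    sample = [
        WordPoint("alpha", weight=0.9, connect=0.8),
        WordPoint("beta", weight=0.4, connect=0.1),
        WordPoint("gamma", weight=0.05, connect=0.6),
    ]
    # Raising the connect threshold hides weakly connected points.
    for point in visible_points(sample, lowest_connect=0.5):
        print(point.token)
```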

The slime result is generated by placing MCPM probe agents around a single word point, which we call an anchor point, and allowing them to spread out and follow the trace. Anchor points are marked yellow. Hold Left-Shift to enter anchor point navigation mode. Double click on an anchor point to switch to the slime mode result for that particular point.
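To give a feel for agents spreading out from an anchor point, here is a deliberately simplified toy. It is not MCPM: real MCPM agents sense and follow a continuously updated trace field, whereas this sketch just lets agents hop between data points with a preference for nearby ones and counts the visits. The function name, the `reach` parameter and the exponential distance falloff are all assumptions made for illustration.

```python
import math
import random


def spread_from_anchor(points, anchor, agents=50, steps=20, reach=1.0, seed=0):
    """Toy random walk from an anchor point; returns a visit count per point.

    Each agent starts at points[anchor] and repeatedly hops to another
    point, chosen with probability proportional to exp(-distance / reach).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    trace = [0] * len(points)
    indices = list(range(len(points)))
    for _ in range(agents):
        current = anchor
        trace[current] += 1
        for _ in range(steps):
            weights = [
                0.0 if j == current
                else math.exp(-math.dist(points[current], points[j]) / reach)
                for j in indices
            ]
            if sum(weights) == 0:
                break  # every other point is too far away; the agent stops here
            current = rng.choices(indices, weights=weights)[0]
            trace[current] += 1
    return trace


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (50, 50, 50)]
    print(spread_from_anchor(cloud, anchor=0))
```

Points close to the anchor collect most of the visits while the distant outlier is effectively never reached, which loosely mirrors how a slime result highlights the neighborhood structure around one anchor.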

Web Application

You can use the web application by going to: https://creativecodinglab.github.io/Polyglot/index.html

Quick Reference

Mouse: navigate in 3D
Left-Shift: anchor-focus mode

Double click on anchor points (yellow) to view slime data from that anchor point.

Authors

This web visualization tool was created by a team of researchers at the University of California, Santa Cruz, Dept. of Computational Media.

This work was published as Hongwei Zhou's M.S. thesis.

A version of this work was published at the 2020 IEEE 5th Workshop on Visualization for the Digital Humanities (VIS4DH).

Contributors

normand-1024, angusforbes, oskarelek
