mukund109 / word-mesh

A context-preserving word cloud generator

License: MIT License

Python 100.00%

word-mesh's Introduction

word-mesh

A wordcloud/wordmesh generator that allows users to extract keywords from text, and create a simple and interpretable wordcloud.

Why word-mesh?

Most popular open-source wordcloud generators (word_cloud, d3-cloud, echarts-wordcloud) focus more on the aesthetics of the visualization than on effectively conveying textual features. word-mesh strikes a balance between the two and uses the various statistical, semantic and grammatical features of the text to inform visualization parameters.

Features:

keyword extraction: In addition to 'word frequency' based extraction techniques, word-mesh supports graph based methods like textrank, sgrank and bestcoverage.
word clustering: Words can be grouped together on the canvas based on their semantic similarity, co-occurrence frequency, and other properties.
keyword filtering: Extracted keywords can be filtered based on their pos tags or whether they are named entities.
font colors and font sizes: These can be set based on the following criteria - word frequency, pos-tags, ranking algorithm score.
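
To make the "co-occurrence frequency" criterion concrete, here is a minimal sketch of how such counts could be computed. It is not word-mesh's own implementation: the function name `cooccurrence_counts` and the `max_distance` parameter are hypothetical, and the sketch assumes the text has already been split into tokens.

```python
from collections import Counter


def cooccurrence_counts(tokens, keywords, max_distance=2):
    """Count keyword pairs that appear within `max_distance` tokens of each other.

    Hypothetical helper for illustration only; not part of word-mesh.
    """
    keywords = set(keywords)
    counts = Counter()
    for i, first in enumerate(tokens):
        if first not in keywords:
            continue
        # Look ahead only, so each pair of positions is counted once.
        for j in range(i + 1, min(i + 1 + max_distance, len(tokens))):
            second = tokens[j]
            if second in keywords and second != first:
                counts[tuple(sorted((first, second)))] += 1
    return counts


tokens = "stay hungry stay foolish".split()
pairs = cooccurrence_counts(tokens, {"hungry", "foolish"}, max_distance=2)
print(pairs[("foolish", "hungry")])
```

Pairs with higher counts would then be treated as "closer" when deciding which words to group together on the canvas.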

How does it work?

word-mesh uses spacy's pretrained language models to gather textual features, graph based algorithms to extract keywords, Multidimensional Scaling to place these keywords on the canvas and a force-directed algorithm to optimize inter-word spacing.
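
The spacing step can be illustrated with a much simplified, repulsion-only sketch: overlapping bounding boxes are pushed apart until none overlap. This is a stand-in written for illustration, not the force-directed algorithm word-mesh actually uses; the box representation (dicts with center `x`, `y` and size `w`, `h`) and the function names are assumptions.

```python
def overlap(a, b):
    """Return the overlap of two boxes along x and y (<= 0 means no overlap on that axis)."""
    dx = (a["w"] + b["w"]) / 2 - abs(a["x"] - b["x"])
    dy = (a["h"] + b["h"]) / 2 - abs(a["y"] - b["y"])
    return dx, dy


def separate(boxes, max_iters=200):
    """Push overlapping boxes apart along the axis of least overlap.

    Hypothetical helper: repulsion only, with no attraction term.
    """
    boxes = [dict(b) for b in boxes]  # work on copies
    for _ in range(max_iters):
        moved = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                dx, dy = overlap(a, b)
                if dx <= 0 or dy <= 0:
                    continue  # boxes do not intersect
                moved = True
                axis, amount = ("x", dx) if dx < dy else ("y", dy)
                sign = 1 if a[axis] >= b[axis] else -1
                # Each box moves half of the overlap in opposite directions.
                a[axis] += sign * amount / 2
                b[axis] -= sign * amount / 2
        if not moved:
            break
    return boxes


words = [
    {"x": 0.0, "y": 0.0, "w": 4.0, "h": 2.0},
    {"x": 1.0, "y": 0.0, "w": 4.0, "h": 2.0},
]
print(separate(words))
```

A real force-directed layout would also pull related words toward each other; this sketch only shows the part that removes overlaps.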

Examples

Here's a visualization of the force-directed algorithm. The words are extracted using textrank from a textbook on international law, and are grouped together on the canvas based on their co-occurrence frequency. The colors indicate the POS tags of the words.

This wordmesh was created from Steve Jobs's famous commencement speech at Stanford. The keywords are extracted using textrank and clustered based on their scores. The font colors and font sizes are also a function of the scores. Code

This is from the same text, but the clustering has been done based on co-occurrence frequency of keywords. The colors have been assigned using the same criteria used to cluster them.

This is quite apparent from the positions of the words. You can see the words like 'hungry' and 'foolish' have been grouped together, since they occur close to each other in the text as part of the famous quote "Stay hungry. Stay foolish". Code

This is a wordmesh of all the adjectives used in a 2016 US Presidential Debate between Donald Trump and Hillary Clinton. The words are clustered based on their meaning, with the font size indicating the usage frequency, and the color corresponding to which candidate used them. Code

This example is taken from a news article on the Brazil vs Belgium 2018 Russia WC QF. The colors correspond to the POS tags of the words. The second figure is the same wordmesh clustered based on the words' co-occurrence frequency. Code

Installation

Install the package using pip:

pip install wordmesh

You will also need to download the following language model (size ~ 115MB):

python -m spacy download en_core_web_md

This is required for POS tagging and for accessing word vectors. For more information on the download, or for help with the installation, visit here.

Tutorial

All functionality is contained within the 'Wordmesh' class.

from wordmesh import Wordmesh

# Create a Wordmesh object by passing the constructor the text that you wish to summarize
with open('sample.txt', 'r') as f:
    text = f.read()
wm = Wordmesh(text)

# Save the plot
wm.save_as_html(filename='my-wordmesh.html')
# You can now open it in the browser, and subsequently save it in JPEG format if required

# If you are using a Jupyter notebook, you can plot it inline
wm.plot()

The Wordmesh object offers 3 'set' methods which can be used to set the fontsize, fontcolor and the clustering criteria. Check the inline documentation for details.

wm.set_fontsize(by='scores')
wm.set_fontcolor(by='random')
wm.set_clustering_criteria(by='meaning')

You can access keywords, pos_tags, keyword scores and other important features of the text. These may be used to set custom visualization parameters.

print(wm.keywords, wm.pos_tags, wm.scores)

# Set NOUNs to red and everything else to green
colors = [(200, 0, 0) if tag == 'NOUN' else (0, 200, 0) for tag in wm.pos_tags]

wm.set_fontcolor(custom_colors=colors)
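
The same custom-color mechanism could be driven by keyword scores instead of POS tags. The sketch below is illustrative only: `scores_to_colors` is a hypothetical helper, and it assumes that `wm.scores` is a sequence of numbers aligned with `wm.keywords` and that `custom_colors` accepts RGB tuples as in the example above.

```python
def scores_to_colors(scores, low=(0, 0, 200), high=(200, 0, 0)):
    """Linearly interpolate an RGB color for each score between `low` and `high`.

    Hypothetical helper for illustration; not part of word-mesh.
    """
    scores = list(scores)
    if not scores:
        return []
    smallest, largest = min(scores), max(scores)
    span = (largest - smallest) or 1.0  # avoid dividing by zero when all scores match
    colors = []
    for score in scores:
        t = (score - smallest) / span
        colors.append(tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high)))
    return colors


print(scores_to_colors([0.0, 0.5, 1.0]))

# With a Wordmesh object this might be used as follows (assumed usage):
# wm.set_fontcolor(custom_colors=scores_to_colors(wm.scores))
```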

For more examples check out this notebook.

If you are working with text which is composed of various labelled sections (e.g. a conversation transcript), the LabelledWordmesh class (which inherits from Wordmesh) can be useful if you wish to treat those sections separately. Check out this notebook for an example.

Notes

The code isn't optimized for large chunks of text, so be wary of memory usage when processing text with more than 100,000 characters.
Currently, Plotly is being used as the visualization backend. However, if you wish to use another tool, you can use the positions of the keywords, and the size of their bounding boxes, which are available as Wordmesh object attributes. These can be used to render the words using a tool of your choice.
As of now, POS based filtering, and multi-gram extraction cannot be done when using graph based extraction algorithms. This is due to some problems with underlying libraries which will hopefully be fixed in the future.
Even though you have the option of choosing 'TSNE' as the clustering algorithm, I would advise against it since it still needs to be tested thoroughly.

word-mesh's Issues

ValueError: Invalid property specified for object of type plotly.graph_objs.layout.XAxis: 'autotick'

I am getting this plotly error when running your code on a random text.
Any help is appreciated.

`ValueError Traceback (most recent call last)
in ()
1 wm = Wordmesh(" ".join(t_list))
----> 2 wm.plot()

~/.local/lib/python3.4/site-packages/wordmesh-0.1.0b2-py3.4.egg/wordmesh/static_wordmesh.py in plot(self, force_directed_animation)
545 """
546 self.save_as_html(force_directed_animation=force_directed_animation,
--> 547 notebook_mode=True)
548
549

~/.local/lib/python3.4/site-packages/wordmesh-0.1.0b2-py3.4.egg/wordmesh/static_wordmesh.py in save_as_html(self, filename, force_directed_animation, notebook_mode)
532 else:
533 self._visualizer.save_wordmesh_as_html(self.embeddings, filename,
--> 534 notebook_mode=notebook_mode)
535
536 def plot(self, force_directed_animation=False):

~/.local/lib/python3.4/site-packages/wordmesh-0.1.0b2-py3.4.egg/wordmesh/utils.py in save_wordmesh_as_html(self, coordinates, filename, animate, autozoom, notebook_mode)
188 if notebook_mode:
189 py.init_notebook_mode(connected=True)
--> 190 py.iplot(fig, filename=filename, show_link=False)
191 else:
192 py.plot(fig, filename=filename, auto_open=False, show_link=False)

~/.local/lib/python3.4/site-packages/plotly/offline/offline.py in iplot(figure_or_data, show_link, link_text, validate, image, filename, image_width, image_height, config)
337 config.setdefault('linkText', link_text)
338
--> 339 figure = tools.return_figure_from_figure_or_data(figure_or_data, validate)
340
341 # Though it can add quite a bit to the display-bundle size, we include

~/.local/lib/python3.4/site-packages/plotly/tools.py in return_figure_from_figure_or_data(figure_or_data, validate_figure)
1472
1473 try:
-> 1474 figure = Figure(**figure).to_dict()
1475 except exceptions.PlotlyError as err:
1476 raise exceptions.PlotlyError("Invalid 'figure_or_data' argument. "

~/.local/lib/python3.4/site-packages/plotly/graph_objs/_figure.py in __init__(self, data, layout, frames)
334 respective traces in the data attribute
335 """
--> 336 super(Figure, self).__init__(data, layout, frames)
337
338 def add_area(

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in __init__(self, data, layout_plotly, frames)
155
156 # ### Import Layout ###
--> 157 self._layout_obj = self._layout_validator.validate_coerce(layout)
158
159 # ### Import clone of layout properties ###

~/.local/lib/python3.4/site-packages/_plotly_utils/basevalidators.py in validate_coerce(self, v)
1875
1876 elif isinstance(v, dict):
-> 1877 v = self.data_class(**v)
1878
1879 elif isinstance(v, self.data_class):

~/.local/lib/python3.4/site-packages/plotly/graph_objs/_layout.py in __init__(self, arg, angularaxis, annotations, autosize, bargap, bargroupgap, barmode, barnorm, boxgap, boxgroupgap, boxmode, calendar, colorway, datarevision, direction, dragmode, extendpiecolors, font, geo, grid, height, hiddenlabels, hiddenlabelssrc, hidesources, hoverdistance, hoverlabel, hovermode, images, legend, mapbox, margin, orientation, paper_bgcolor, piecolorway, plot_bgcolor, polar, radialaxis, scene, selectdirection, separators, shapes, showlegend, sliders, spikedistance, template, ternary, title, titlefont, updatemenus, violingap, violingroupgap, violinmode, width, xaxis, yaxis, **kwargs)
3882 self.width = width if width is not None else _v
3883 _v = arg.pop('xaxis', None)
-> 3884 self.xaxis = xaxis if xaxis is not None else _v
3885 _v = arg.pop('yaxis', None)
3886 self.yaxis = yaxis if yaxis is not None else _v

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in __setattr__(self, prop, value)
3601 if match is None:
3602 # Set as ordinary property
-> 3603 super(BaseLayoutHierarchyType, self).__setattr__(prop, value)
3604 else:
3605 # Set as subplotid property

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in __setattr__(self, prop, value)
2702 prop in self._validators):
2703 # Let known properties and private properties through
-> 2704 super(BasePlotlyType, self).__setattr__(prop, value)
2705 else:
2706 # Raise error on unknown public properties

~/.local/lib/python3.4/site-packages/plotly/graph_objs/_layout.py in xaxis(self, val)
2765 @xaxis.setter
2766 def xaxis(self, val):
-> 2767 self['xaxis'] = val
2768
2769 # yaxis

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in __setitem__(self, prop, value)
3587 if match is None:
3588 # Set as ordinary property
-> 3589 super(BaseLayoutHierarchyType, self).__setitem__(prop, value)
3590 else:
3591 # Set as subplotid property

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in __setitem__(self, prop, value)
2665 # ### Handle compound property ###
2666 if isinstance(validator, CompoundValidator):
-> 2667 self._set_compound_prop(prop, value)
2668
2669 # ### Handle compound array property ###

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in _set_compound_prop(self, prop, val)
2963 validator = self._validators.get(prop)
2964 # type: BasePlotlyType
-> 2965 val = validator.validate_coerce(val)
2966
2967 # Save deep copies of current and new states

~/.local/lib/python3.4/site-packages/plotly/graph_objs/layout/_xaxis.py in __init__(self, arg, anchor, automargin, autorange, calendar, categoryarray, categoryarraysrc, categoryorder, color, constrain, constraintoward, domain, dtick, exponentformat, fixedrange, gridcolor, gridwidth, hoverformat, layer, linecolor, linewidth, mirror, nticks, overlaying, position, range, rangemode, rangeselector, rangeslider, scaleanchor, scaleratio, separatethousands, showexponent, showgrid, showline, showspikes, showticklabels, showtickprefix, showticksuffix, side, spikecolor, spikedash, spikemode, spikesnap, spikethickness, tick0, tickangle, tickcolor, tickfont, tickformat, tickformatstops, ticklen, tickmode, tickprefix, ticks, ticksuffix, ticktext, ticktextsrc, tickvals, tickvalssrc, tickwidth, title, titlefont, type, visible, zeroline, zerolinecolor, zerolinewidth, **kwargs)
2936 # Process unknown kwargs
2937 # ----------------------
-> 2938 self._process_kwargs(**dict(arg, **kwargs))

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in _process_kwargs(self, **kwargs)
2253 Process any extra kwargs that are not predefined as constructor params
2254 """
-> 2255 self._raise_on_invalid_property_error(*kwargs.keys())
2256
2257 @property

~/.local/lib/python3.4/site-packages/plotly/basedatatypes.py in _raise_on_invalid_property_error(self, *args)
2838 full_obj_name=full_obj_name,
2839 invalid_str=invalid_str,
-> 2840 prop_descriptions=self._prop_descriptions))
2841
2842 def update(self, dict1=None, **kwargs):

ValueError: Invalid property specified for object of type plotly.graph_objs.layout.XAxis: 'autotick'

Valid properties:
    anchor
        If set to an opposite-letter axis id (e.g. `x2`, `y`),
        this axis is bound to the corresponding opposite-letter
        axis. If set to "free", this axis' position is
        determined by `position`.
    automargin
        Determines whether long tick labels automatically grow
        the figure margins.
    autorange
        Determines whether or not the range of this axis is
        computed in relation to the input data. See `rangemode`
        for more info. If `range` is provided, then `autorange`
        is set to False.
    calendar
        Sets the calendar system to use for `range` and `tick0`
        if this is a date axis. This does not set the calendar
        for interpreting data on this axis, that's specified in
        the trace or via the global `layout.calendar`
    categoryarray
        Sets the order in which categories on this axis appear.
        Only has an effect if `categoryorder` is set to
        "array". Used with `categoryorder`.
    categoryarraysrc
        Sets the source reference on plot.ly for  categoryarray
        .
    categoryorder
        Specifies the ordering logic for the case of
        categorical variables. By default, plotly uses "trace",
        which specifies the order that is present in the data
        supplied. Set `categoryorder` to *category ascending*
        or *category descending* if order should be determined
        by the alphanumerical order of the category names. Set
        `categoryorder` to "array" to derive the ordering from
        the attribute `categoryarray`. If a category is not
        found in the `categoryarray` array, the sorting
        behavior for that attribute will be identical to the
        "trace" mode. The unspecified categories will follow
        the categories in `categoryarray`.
    color
        Sets default for all colors associated with this axis
        all at once: line, font, tick, and grid colors. Grid
        color is lightened by blending this with the plot
        background Individual pieces can override this.
    constrain
        If this axis needs to be compressed (either due to its
        own `scaleanchor` and `scaleratio` or those of the
        other axis), determines how that happens: by increasing
        the "range" (default), or by decreasing the "domain".
    constraintoward
        If this axis needs to be compressed (either due to its
        own `scaleanchor` and `scaleratio` or those of the
        other axis), determines which direction we push the
        originally specified plot area. Options are "left",
        "center" (default), and "right" for x axes, and "top",
        "middle" (default), and "bottom" for y axes.
    domain
        Sets the domain of this axis (in plot fraction).
    dtick
        Sets the step in-between ticks on this axis. Use with
        `tick0`. Must be a positive number, or special strings
        available to "log" and "date" axes. If the axis `type`
        is "log", then ticks are set every 10^(n*dtick) where n
        is the tick number. For example, to set a tick mark at
        1, 10, 100, 1000, ... set dtick to 1. To set tick marks
        at 1, 100, 10000, ... set dtick to 2. To set tick marks
        at 1, 5, 25, 125, 625, 3125, ... set dtick to
        log_10(5), or 0.69897000433. "log" has several special
        values; "L<f>", where `f` is a positive number, gives
        ticks linearly spaced in value (but not position). For
        example `tick0` = 0.1, `dtick` = "L0.5" will put ticks
        at 0.1, 0.6, 1.1, 1.6 etc. To show powers of 10 plus
        small digits between, use "D1" (all digits) or "D2"
        (only 2 and 5). `tick0` is ignored for "D1" and "D2".
        If the axis `type` is "date", then you must convert the
        time to milliseconds. For example, to set the interval
        between ticks to one day, set `dtick` to 86400000.0.
        "date" also has special values "M<n>" gives ticks
        spaced by a number of months. `n` must be a positive
        integer. To set ticks on the 15th of every third month,
        set `tick0` to "2000-01-15" and `dtick` to "M3". To set
        ticks every 4 years, set `dtick` to "M48"
    exponentformat
        Determines a formatting rule for the tick exponents.
        For example, consider the number 1,000,000,000. If
        "none", it appears as 1,000,000,000. If "e", 1e+9. If
        "E", 1E+9. If "power", 1x10^9 (with 9 in a super
        script). If "SI", 1G. If "B", 1B.
    fixedrange
        Determines whether or not this axis is zoom-able. If
        true, then zoom is disabled.
    gridcolor
        Sets the color of the grid lines.
    gridwidth
        Sets the width (in px) of the grid lines.
    hoverformat
        Sets the hover text formatting rule using d3 formatting
        mini-languages which are very similar to those in
        Python. For numbers, see: https://github.com/d3/d3-form
        at/blob/master/README.md#locale_format And for dates
        see: https://github.com/d3/d3-time-
        format/blob/master/README.md#locale_format We add one
        item to d3's date formatter: "%{n}f" for fractional
        seconds with n digits. For example, *2016-10-13
        09:15:23.456* with tickformat "%H~%M~%S.%2f" would
        display "09~15~23.46"
    layer
        Sets the layer on which this axis is displayed. If
        *above traces*, this axis is displayed above all the
        subplot's traces If *below traces*, this axis is
        displayed below all the subplot's traces, but above the
        grid lines. Useful when used together with scatter-like
        traces with `cliponaxis` set to False to show markers
        and/or text nodes above this axis.
    linecolor
        Sets the axis line color.
    linewidth
        Sets the width (in px) of the axis line.
    mirror
        Determines if the axis lines or/and ticks are mirrored
        to the opposite side of the plotting area. If True, the
        axis lines are mirrored. If "ticks", the axis lines and
        ticks are mirrored. If False, mirroring is disable. If
        "all", axis lines are mirrored on all shared-axes
        subplots. If "allticks", axis lines and ticks are
        mirrored on all shared-axes subplots.
    nticks
        Specifies the maximum number of ticks for the
        particular axis. The actual number of ticks will be
        chosen automatically to be less than or equal to
        `nticks`. Has an effect only if `tickmode` is set to
        "auto".
    overlaying
        If set a same-letter axis id, this axis is overlaid on
        top of the corresponding same-letter axis. If False,
        this axis does not overlay any same-letter axes.
    position
        Sets the position of this axis in the plotting space
        (in normalized coordinates). Only has an effect if
        `anchor` is set to "free".
    range
        Sets the range of this axis. If the axis `type` is
        "log", then you must take the log of your desired range
        (e.g. to set the range from 1 to 100, set the range
        from 0 to 2). If the axis `type` is "date", it should
        be date strings, like date data, though Date objects
        and unix milliseconds will be accepted and converted to
        strings. If the axis `type` is "category", it should be
        numbers, using the scale where each category is
        assigned a serial number from zero in the order it
        appears.
    rangemode
        If "normal", the range is computed in relation to the
        extrema of the input data. If *tozero*`, the range
        extends to 0, regardless of the input data If
        "nonnegative", the range is non-negative, regardless of
        the input data.
    rangeselector
        plotly.graph_objs.layout.xaxis.Rangeselector instance
        or dict with compatible properties
    rangeslider
        plotly.graph_objs.layout.xaxis.Rangeslider instance or
        dict with compatible properties
    scaleanchor
        If set to another axis id (e.g. `x2`, `y`), the range
        of this axis changes together with the range of the
        corresponding axis such that the scale of pixels per
        unit is in a constant ratio. Both axes are still
        zoomable, but when you zoom one, the other will zoom
        the same amount, keeping a fixed midpoint. `constrain`
        and `constraintoward` determine how we enforce the
        constraint. You can chain these, ie `yaxis:
        {scaleanchor: *x*}, xaxis2: {scaleanchor: *y*}` but you
        can only link axes of the same `type`. The linked axis
        can have the opposite letter (to constrain the aspect
        ratio) or the same letter (to match scales across
        subplots). Loops (`yaxis: {scaleanchor: *x*}, xaxis:
        {scaleanchor: *y*}` or longer) are redundant and the
        last constraint encountered will be ignored to avoid
        possible inconsistent constraints via `scaleratio`.
    scaleratio
        If this axis is linked to another by `scaleanchor`,
        this determines the pixel to unit scale ratio. For
        example, if this value is 10, then every unit on this
        axis spans 10 times the number of pixels as a unit on
        the linked axis. Use this for example to create an
        elevation profile where the vertical scale is
        exaggerated a fixed amount with respect to the
        horizontal.
    separatethousands
        If "true", even 4-digit integers are separated
    showexponent
        If "all", all exponents are shown besides their
        significands. If "first", only the exponent of the
        first tick is shown. If "last", only the exponent of
        the last tick is shown. If "none", no exponents appear.
    showgrid
        Determines whether or not grid lines are drawn. If
        True, the grid lines are drawn at every tick mark.
    showline
        Determines whether or not a line bounding this axis is
        drawn.
    showspikes
        Determines whether or not spikes (aka droplines) are
        drawn for this axis. Note: This only takes affect when
        hovermode = closest
    showticklabels
        Determines whether or not the tick labels are drawn.
    showtickprefix
        If "all", all tick labels are displayed with a prefix.
        If "first", only the first tick is displayed with a
        prefix. If "last", only the last tick is displayed with
        a suffix. If "none", tick prefixes are hidden.
    showticksuffix
        Same as `showtickprefix` but for tick suffixes.
    side
        Determines whether a x (y) axis is positioned at the
        "bottom" ("left") or "top" ("right") of the plotting
        area.
    spikecolor
        Sets the spike color. If undefined, will use the series
        color
    spikedash
        Sets the dash style of lines. Set to a dash type string
        ("solid", "dot", "dash", "longdash", "dashdot", or
        "longdashdot") or a dash length list in px (eg
        "5px,10px,2px,2px").
    spikemode
        Determines the drawing mode for the spike line If
        "toaxis", the line is drawn from the data point to the
        axis the  series is plotted on. If "across", the line
        is drawn across the entire plot area, and supercedes
        "toaxis". If "marker", then a marker dot is drawn on
        the axis the series is plotted on
    spikesnap
        Determines whether spikelines are stuck to the cursor
        or to the closest datapoints.
    spikethickness
        Sets the width (in px) of the zero line.
    tick0
        Sets the placement of the first tick on this axis. Use
        with `dtick`. If the axis `type` is "log", then you
        must take the log of your starting tick (e.g. to set
        the starting tick to 100, set the `tick0` to 2) except
        when `dtick`=*L<f>* (see `dtick` for more info). If the
        axis `type` is "date", it should be a date string, like
        date data. If the axis `type` is "category", it should
        be a number, using the scale where each category is
        assigned a serial number from zero in the order it
        appears.
    tickangle
        Sets the angle of the tick labels with respect to the
        horizontal. For example, a `tickangle` of -90 draws the
        tick labels vertically.
    tickcolor
        Sets the tick color.
    tickfont
        Sets the tick font.
    tickformat
        Sets the tick label formatting rule using d3 formatting
        mini-languages which are very similar to those in
        Python. For numbers, see: https://github.com/d3/d3-form
        at/blob/master/README.md#locale_format And for dates
        see: https://github.com/d3/d3-time-
        format/blob/master/README.md#locale_format We add one
        item to d3's date formatter: "%{n}f" for fractional
        seconds with n digits. For example, *2016-10-13
        09:15:23.456* with tickformat "%H~%M~%S.%2f" would
        display "09~15~23.46"
    tickformatstops
        plotly.graph_objs.layout.xaxis.Tickformatstop instance
        or dict with compatible properties
    ticklen
        Sets the tick length (in px).
    tickmode
        Sets the tick mode for this axis. If "auto", the number
        of ticks is set via `nticks`. If "linear", the
        placement of the ticks is determined by a starting
        position `tick0` and a tick step `dtick` ("linear" is
        the default value if `tick0` and `dtick` are provided).
        If "array", the placement of the ticks is set via
        `tickvals` and the tick text is `ticktext`. ("array" is
        the default value if `tickvals` is provided).
    tickprefix
        Sets a tick label prefix.
    ticks
        Determines whether ticks are drawn or not. If **, this
        axis' ticks are not drawn. If "outside" ("inside"),
        this axis' are drawn outside (inside) the axis lines.
    ticksuffix
        Sets a tick label suffix.
    ticktext
        Sets the text displayed at the ticks position via
        `tickvals`. Only has an effect if `tickmode` is set to
        "array". Used with `tickvals`.
    ticktextsrc
        Sets the source reference on plot.ly for  ticktext .
    tickvals
        Sets the values at which ticks on this axis appear.
        Only has an effect if `tickmode` is set to "array".
        Used with `ticktext`.
    tickvalssrc
        Sets the source reference on plot.ly for  tickvals .
    tickwidth
        Sets the tick width (in px).
    title
        Sets the title of this axis.
    titlefont
        Sets this axis' title font.
    type
        Sets the axis type. By default, plotly attempts to
        determined the axis type by looking into the data of
        the traces that referenced the axis in question.
    visible
        A single toggle to hide the axis while preserving
        interaction like dragging. Default is true when a
        cheater plot is present on the axis, otherwise false
    zeroline
        Determines whether or not a line is drawn at along the
        0 value of this axis. If True, the zero line is drawn
        on top of the grid lines.
    zerolinecolor
        Sets the line color of the zero line.
    zerolinewidth
        Sets the width (in px) of the zero line.`
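
The traceback shows that this version of plotly no longer accepts the legacy `autotick` axis property, while `tickmode` appears in the list of valid properties. One possible workaround, sketched below, is to translate `autotick` into `tickmode` before the axis dict reaches plotly. The helper name `migrate_axis` is hypothetical, and the mapping (`True` to "auto", `False` to "linear") is an assumption based on how plotly.js treats the legacy attribute; where exactly word-mesh builds its axis dicts would need to be checked in its source.

```python
def migrate_axis(axis):
    """Return a copy of an axis dict with legacy `autotick` replaced by `tickmode`.

    Hypothetical helper; the True -> 'auto', False -> 'linear' mapping is an assumption.
    """
    axis = dict(axis)
    if "autotick" in axis:
        autotick = axis.pop("autotick")
        # Keep an explicit tickmode if one is already present.
        axis.setdefault("tickmode", "auto" if autotick else "linear")
    return axis


legacy_axis = {"autotick": False, "showgrid": False, "zeroline": False}
print(migrate_axis(legacy_axis))
```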

wm.plot() produces an empty plot in a Jupyter notebook, and saving it as HTML produces a black screen in the browser.

In examples.ipynb (the Jobs speech part), the code executes successfully, but plotting it in the notebook gives a completely white plot, and saving it as HTML produces a black screen in the browser.
I know the code executes successfully because I am able to print the similarity matrix, the keywords and their pos_tags.

Any help would be appreciated.
The keywords are present in the HTML source; I have checked manually. I don't know what the problem could be.
I am sharing the HTML file in text format.
my_wordmesh.txt

ValueError: Invalid value of type 'builtins.str' received for the 'textposition' property of scatter - Received value: 'centre'

Apologies up front if this is something on my end.

wm = Wordmesh(text)
wm.plot() or wm.save_as_html(filename='my-wordmesh.html') produces the following error on Windows 10 and Ubuntu 16.04:

ValueError:

Invalid value of type builtins.str received for the textposition property of scatter
Received value: centre

The textposition property is an enumeration that may be specified as:

One of the following enumeration values:
['top left', 'top center', 'top right', 'middle left',
'middle center', 'middle right', 'bottom left', 'bottom
center', 'bottom right']
A tuple, list, or one-dimensional numpy array of the above

Not sure if this is an issue with Plotly==3.0.0. Any assistance is appreciated. And nice work picking spacy/textacy.
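
The error message lists the values plotly accepts for `textposition`, and 'centre' is not among them. A small normalizing step like the one below could be applied before the value is passed to plotly. This is only a sketch: the helper name is hypothetical, and mapping 'centre' to 'middle center' is an assumption about the intended placement.

```python
# Accepted values, copied from the error message above.
VALID_TEXT_POSITIONS = {
    "top left", "top center", "top right",
    "middle left", "middle center", "middle right",
    "bottom left", "bottom center", "bottom right",
}

# Assumed aliases; 'middle center' is a guess at what 'centre' was meant to be.
ALIASES = {"centre": "middle center", "center": "middle center"}


def normalize_textposition(value):
    """Map common aliases to a value plotly accepts, or raise ValueError."""
    value = ALIASES.get(value, value)
    if value not in VALID_TEXT_POSITIONS:
        raise ValueError("unsupported textposition: %r" % (value,))
    return value


print(normalize_textposition("centre"))
```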

Cannot import 'named_entities' from 'textacy.extract'

ImportError: cannot import name 'named_entities' from 'textacy.extract' (c:\users\shen\anaconda3\envs\fyp-test\lib\site-packages\textacy\extract.py)
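
An ImportError like this typically means the installed library version no longer exposes the name under that spelling. A defensive pattern is to try several candidate names and use the first one that exists. The sketch below shows the pattern with a hypothetical helper, `first_available`; the suggestion that newer textacy releases expose the function as `entities` is an assumption that should be checked against the installed version.

```python
import importlib


def first_available(module_name, candidates):
    """Return the first attribute in `candidates` that `module_name` provides.

    Hypothetical helper for illustration.
    """
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "none of %s found in %s" % (", ".join(candidates), module_name)
    )


# Demonstration with the standard library, so the example runs anywhere.
sqrt = first_available("math", ["square_root", "sqrt"])
print(sqrt(9))

# Assumed usage for this issue (the name 'entities' is an assumption):
# extract_entities = first_available("textacy.extract", ["named_entities", "entities"])
```

Pinning textacy to the version word-mesh was developed against would be the other obvious route, though the source does not say which version that is.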

Got an error when plotting the results.

ValueError:
Invalid value of type 'builtins.str' received for the 'textposition' property of scatter
Received value: 'centre'

The 'textposition' property is an enumeration that may be specified as:
  - One of the following enumeration values:
        ['top left', 'top center', 'top right', 'middle left',
        'middle center', 'middle right', 'bottom left', 'bottom
        center', 'bottom right']
  - A tuple, list, or one-dimensional numpy array of the above
