OCTIS version: 1.9.0
Python version: 3.9.7
Operating System:

WECoherencePairwise and WECoherenceCentroid are negatively correlated. (octis issue, closed)

mind-lab commented on July 17, 2024

WECoherencePairwise and WECoherenceCentroid are negatively correlated.

Comments (3)

pietrotrope commented on July 17, 2024

Hi rsimd, thank you for reaching out!
Your advice was very helpful! Reviewing the code, I found errors in both metrics.
To make the code easier to read and to fix the problems, I rewrote both metrics, closely following the formulas in the reference paper:
https://doi.org/10.18653/v1/d18-1096

To test the code, clone the repo and run: python setup.py install
Let me know if it works now!
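
For readers following along, here is a minimal sketch of what the two metrics measure. It is not the OCTIS implementation, and its normalization may differ from the paper's formulas. It assumes the pairwise score is the mean cosine similarity over all pairs of a topic's top-word vectors, and the centroid score is the mean cosine similarity between each word vector and the normalized sum of the vectors. The function names `pairwise_coherence` and `centroid_coherence` are hypothetical.

```python
import math
from itertools import combinations


def _normalize(vec):
    """Scale a vector to unit length; reject the zero vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pairwise_coherence(vectors):
    """Mean cosine similarity over all unordered pairs of word vectors."""
    unit = [_normalize(v) for v in vectors]
    pairs = list(combinations(unit, 2))
    if not pairs:
        raise ValueError("need at least two word vectors")
    return sum(_dot(a, b) for a, b in pairs) / len(pairs)


def centroid_coherence(vectors):
    """Mean cosine similarity between each word vector and the centroid."""
    unit = [_normalize(v) for v in vectors]
    dim = len(unit[0])
    # Centroid: the normalized sum of the unit word vectors.
    centroid = _normalize([sum(v[d] for v in unit) for d in range(dim)])
    return sum(_dot(v, centroid) for v in unit) / len(unit)


# Toy 2-d "embeddings" for one topic (illustrative values, not real data).
topic_vectors = [[1.0, 0.0], [0.0, 1.0]]
print(pairwise_coherence(topic_vectors))
print(centroid_coherence(topic_vectors))
```

In this toy case the two words are orthogonal, so the pairwise score is zero while the centroid score is still positive. With real embeddings, each topic would contribute one such score, and the metric would typically be averaged over topics.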


rsimd commented on July 17, 2024

Thank you for fixing it.
I don't have time to rerun the experiments behind the figure shown above, so instead I will show the values of WECoherencePairwise and WECoherenceCentroid during training of a single model.
In the output shown below, wetc_c means WECoherenceCentroid and wetc_pw means WECoherencePairwise. I think these are roughly the expected scores.

Run RecurrentStickBreakingModel num_topics=10, embed_dim=300
  epoch      td    train_loss       train_ppl    valid_loss      valid_ppl    wetc_c    wetc_pw     dur
-------  ------  ------------  --------------  ------------  -------------  --------  ---------  ------
      1  0.9600       73.9698  320736165.8069       68.5748  75777356.2360    0.6372     0.4447  0.5634
      2  0.9600       63.6994  43644642.7318       52.8878  12319750.3907    0.6233     0.4431  0.5034
      3  0.9700       52.4110  12532657.1333       42.7649  4528677.4107    0.6238     0.4450  0.5192
      4  0.9700       42.5961  4518471.2740       35.0613  1901227.4254    0.6257     0.4432  0.5077
      5  0.9700       35.2624  1956250.7926       29.6098  993814.0761    0.6372     0.4486  0.5189
      6  0.9700       29.7505  971985.6298       25.0100  529332.6143    0.6415     0.4428  0.5199
      7  0.9700       25.6807  527041.6114       21.7965  332640.1859    0.6406     0.4484  0.5142
      8  0.9800       21.7055  300979.1405       17.8883  191409.8175    0.6378     0.4453  0.5131
      9  0.9800       19.0901  189746.0816       16.8559  128860.5507    0.6419     0.4464  0.5159
     10  0.9800       16.9686  117649.4644       13.5188   79297.2655    0.6484     0.4468  0.5327
     11  0.9800       13.6051   76754.6987       10.2547   53539.9720    0.6469     0.4449  0.5236
     12  0.9700       10.4957   51914.8373        7.6733   37806.9141    0.6395     0.4488  0.5249
     13  0.9700        7.8220   36336.7206        5.2881   27193.4501    0.6402     0.4453  0.5001
     14  0.9600        5.7267   26571.4181        3.5589   20158.4579    0.6469     0.4491  0.5000
     15  0.9600        3.7754   19824.1490        1.6373   15315.5823    0.6482     0.4502  0.4965
     16  0.9700        1.8117   15046.2791       -0.4119   11771.6308    0.6714     0.4546  0.6522
     17  0.9800       -0.0840   11651.8547        0.9896   11116.5698    0.6808     0.4586  0.5274
     18  0.9800       -1.9134    9199.4690       -2.5322    7608.3783    0.6812     0.4589  0.5376
     19  0.9800       -3.5271    7390.6316       -5.3993    5911.0744    0.6841     0.4573  0.5124
     20  0.9900       -5.0035    6062.4598       -6.8578    4966.2950    0.6846     0.4595  0.5195
     21  1.0000       -6.1741    5071.2266       -7.8498    4222.3710    0.6839     0.4607  0.5133
     22  1.0000       -7.3060    4278.6677       -9.2016    3589.5463    0.6956     0.4570  0.5238
     23  1.0000       -8.6377    3651.7721      -10.4976    3053.1089    0.6804     0.4590  0.5193
     24  1.0000       -9.9328    3156.5071      -11.4241    2732.4987    0.6938     0.4623  0.5381
     25  1.0000      -10.9590    2768.7093      -12.5331    2381.2803    0.6950     0.4638  0.5272
     26  1.0000      -11.8022    2445.4564      -13.3054    2106.8487    0.6919     0.4634  0.5341
     27  1.0000      -12.6172    2195.8951      -13.7952    1943.4262    0.6913     0.4596  0.5179
     28  1.0000      -13.2920    1984.6121      -14.4775    1773.9326    0.7012     0.4627  0.5317
     29  1.0000      -13.8840    1820.3619      -15.2518    1593.0502    0.6997     0.4611  0.5196
     30  1.0000      -14.5299    1673.6254      -15.6372    1497.4241    0.6988     0.4591  0.5276
     31  1.0000      -15.1335    1556.9596      -16.4473    1381.1824    0.6988     0.4591  0.5255
     32  1.0000      -15.6565    1451.4193      -16.8722    1308.9742    0.6968     0.4615  0.5157
     33  1.0000      -16.0427    1378.1303      -17.2128    1228.9282    0.6992     0.4619  0.5229
     34  1.0000      -16.3261    1305.7176      -17.2658    1193.9426    0.6960     0.4582  0.5010
     35  1.0000      -16.6525    1243.8213      -17.7837    1104.6458    0.6982     0.4548  0.5249
     36  1.0000      -16.8720    1195.2415      -17.7987    1078.8660    0.6994     0.4562  0.5151
     37  1.0000      -17.1660    1147.1929      -17.8148    1083.4223    0.7003     0.4583  0.5334
     38  1.0000      -17.4919    1113.5443      -18.3335    1044.0485    0.7005     0.4584  0.5047
     39  1.0000      -17.7039    1079.5254      -17.9807    1010.1541    0.7005     0.4560  0.5252
     40  1.0000      -17.8705    1050.6049      -18.1552     987.7849    0.7015     0.4571  0.5175
     41  1.0000      -18.0920    1023.6774      -18.9372     947.5119    0.7015     0.4571  0.5163
     42  1.0000      -18.2971    1005.2088      -19.2902     935.5594    0.7142     0.4637  0.5064
     43  1.0000      -18.4749     989.1840      -19.4979     899.5156    0.7150     0.4652  0.5139
     44  1.0000      -18.6154     970.7104      -19.5591     888.7733    0.7144     0.4650  0.5148
     45  1.0000      -18.7052     957.0750      -19.4846     890.6475    0.7112     0.4629  0.5117
     46  1.0000      -18.8325     944.8661      -19.8619     875.3139    0.7107     0.4636  0.5284
     47  1.0000      -18.9852     933.2427      -20.0767     859.9098    0.7124     0.4664  0.5268
     48  1.0000      -19.0465     926.8892      -16.0793    1348.9520    0.6984     0.4589  0.5065
     49  1.0000      -19.0715     925.1578      -19.7939     868.3454    0.6971     0.4616  0.5204
     50  1.0000      -19.1469     914.9127      -19.9731     849.2092    0.6967     0.4614  0.5154

Note that these scores were calculated with glove.42B.300d embeddings (downloaded from https://nlp.stanford.edu/projects/glove/).
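
Since the issue is about how the two metrics correlate, one quick check on the log above is to compute the Pearson correlation between the wetc_c and wetc_pw columns. The sketch below does this for a hand-picked subset of rows (epochs 1, 10, 20, 30 and 43) copied from the log. The choice of Pearson correlation and the helper name `pearson` are assumptions made for illustration, not part of OCTIS.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# (epoch, wetc_c, wetc_pw) rows copied from the training log above.
rows = [
    (1, 0.6372, 0.4447),
    (10, 0.6484, 0.4468),
    (20, 0.6846, 0.4595),
    (30, 0.6988, 0.4591),
    (43, 0.7150, 0.4652),
]

r = pearson([c for _, c, _ in rows], [pw for _, _, pw in rows])
print(f"Pearson r between wetc_c and wetc_pw: {r:.3f}")
```

A value close to +1 means the two scores rise and fall together during training, and a negative value would reproduce the behavior originally reported. Running it on all 50 rows rather than a subset gives a more reliable picture.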


pietrotrope commented on July 17, 2024

Hi, thank you for testing. The computation of the score should be independent of the word representation technique used.
In case you find any problems, please do not hesitate to reach out again!

Pietro

