
power's People

Contributors

tobiasuhmann

power's Issues

Attention Metrics

Run the current attention-based classifier (that concatenates the sentence mixes) and compare its metrics to the baseline.

Baseline

Recently, OWER was compared to zero-rule baselines. In the past, it was also compared to a reasonable baseline, but that baseline no longer exists in the current code.

Re-create a reasonable baseline and test OWER against it on the current dataset.

Shuffle sentences

Shuffle sentences during training so that each linear layer sees all sentences during training.
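A minimal sketch of the shuffling step, assuming sentence embeddings are batched as (batch, num_sentences, dim); shapes and values are illustrative, not the real OWER dimensions:

```python
import torch

torch.manual_seed(0)

# Toy batch of sentence embeddings: (batch, num_sentences, dim).
sents = torch.arange(2 * 3 * 4, dtype=torch.float).reshape(2, 3, 4)

# Shuffle the sentence order independently per entity, so no linear
# layer always sees the same sentence position during training.
shuffled = torch.stack([s[torch.randperm(s.size(0))] for s in sents])
```

Because only the order changes, the multiset of sentences per entity stays the same; each per-position linear layer just sees a different sentence each epoch.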

Einsum

Use torch’s einsum() to implement the multi-linear layer.
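Assuming the multi-linear layer means one weight vector (and bias) per class, each applied to that class's own sentence mix, `torch.einsum` can batch all per-class linear layers into one call. Shapes here are hypothetical stand-ins:

```python
import torch

# Hypothetical shapes: B = batch size, C = classes, D = embedding dim.
B, C, D = 4, 3, 8

# One sentence mix per class for each entity in the batch.
mixes = torch.randn(B, C, D)

# Multi-linear layer: a separate weight vector and bias per class.
weights = torch.randn(C, D)
bias = torch.randn(C)

# einsum applies each class's linear layer to its own sentence mix:
# logits[b, c] = mixes[b, c] . weights[c] + bias[c]
logits = torch.einsum('bcd,cd->bc', mixes, weights) + bias

# Loop-based reference implementation for comparison.
ref = torch.stack([mixes[:, c] @ weights[c] + bias[c] for c in range(C)], dim=1)
```

The einsum version avoids the Python loop over classes and lets autograd and the GPU handle all classes in a single batched contraction.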

More classes

Create datasets with more classes, train on them, and inspect the results on rare classes. Does this affect the performance on frequent classes?

Refactor original classifier

Refactor the original classifier to take whole texts as input (instead of pre-processed texts) so that it can be used for inference.

Notebook classifier

Enhance the notebook to define a classifier and use it in a train/valid loop, as in the code base. Also, plot the loss curve.

No attention all data

The no-attention model performs well on the minimal test data. Run it on the ower-fb-3 dataset and check the loss curve.

Mine Rules

Create FB/CoDEx datasets for AnyBURL and browse the resulting rules. Verify that the rules can be parsed and converted to Cypher.

Extract experiments

The experiments in the thesis-tools repo bloat the repo and are hard to run because the source root has to be changed every time. Also, they simply do not belong in the repo. Extract the experiments into individual repos in the same GitHub group.

Zero Rule Baseline

The multi-linear classifier performs better than the base classifier and the concat classifier, but it's not clear that it performs better than a zero-rule classifier that simply predicts the most common classes in the training data (i.e., predicts 0 for every class in the OWER dataset).

The base classifier and the concat classifier probably perform slightly better than the zero-rule baseline, but this needs to be verified. In the worst case, the zero-rule baseline would even perform better than the multi-linear classifier.
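A zero-rule baseline of this kind could be sketched as follows (toy multi-label matrix, not the real OWER data):

```python
import torch

# Toy multi-label training matrix: rows = entities, columns = classes.
train_labels = torch.tensor([
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
], dtype=torch.float)

# Zero rule: for each class, always predict its majority value in the
# training data (0 if the class is true for fewer than half the entities).
majority = (train_labels.mean(dim=0) >= 0.5).float()

# Apply the constant prediction to every validation entity.
n_valid = 5
zero_rule_preds = majority.expand(n_valid, -1)
```

Any learned classifier should at least beat the metrics of these constant predictions; if it doesn't, it has effectively learned nothing beyond the class priors.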

Visualize sentence attention

Visualize how well the class embeddings attend to words and sentences. The expected result is that the “married” class embedding, for example, attends strongly to words and sentences related to marriage, such as “married”, “husband”, or “wife”.
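The quantities to visualize could be computed like this, assuming dot-product attention between a class embedding and the word embeddings (random vectors and toy tokens stand in for learned weights):

```python
import torch

torch.manual_seed(0)

class_emb = torch.randn(8)     # stand-in for the "married" class embedding
word_embs = torch.randn(5, 8)  # embeddings of 5 words in a sentence
words = ["she", "married", "her", "husband", "today"]  # toy tokens

# Attention weights: one score per word, normalized with softmax.
scores = word_embs @ class_emb
attn = torch.softmax(scores, dim=0)

# Pair each word with its attention weight, highest first, for inspection.
ranked = sorted(zip(words, attn.tolist()), key=lambda p: -p[1])
```

With learned weights, the top-ranked words for the “married” embedding should be the marriage-related ones; with random weights, as here, the ranking is arbitrary.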

Create positive case

Create a notebook that demonstrates a positive case (assume learned weights) of a simple example (entity with short sentences, minimal vocabulary). Verify that the embeddings are not unlearned.

Implement aggregator

Implement the aggregator that combines the predictions from texter and ruler. Check that the predictions for CW valid entities are reasonable.

Streamlit App

Create a Streamlit app that allows

  • browsing a Power dataset
  • making predictions (explaining the model's decisions by listing the applied rules and the sentence prioritization)

New OWER dataset

Build OWER datasets with more sentences. In fact, include all Ryn sentences in the OWER dataset and limit the number of sentences just before training. The same could be done for classes.

Also, include the classes, debug information such as entity names, and any other information from the Ryn dataset that is required later.

Show examples during training

The loss curve indicates that training works, but it is not illustrative. Print some concrete examples in the training and validation loops to see how well training actually performs.

Visualize attentions

Visualize the class-word attentions to see whether the class embeddings are learned as intended. Ideally, show the attentions at the end of each epoch in Tensorboard, something like this:


Build Ruler

Build the ruler that

  • reads the rules created by AnyBURL
  • sorts them by confidence and filters out low-confidence rules
  • predicts facts and stores them for the respective tail entity together with the rule and confidence
  • stores the result

Also, delete unused dev projects along the way.
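Assuming the rules arrive as (rule string, confidence) pairs, the sort-and-filter step could look like this; the rules and the threshold below are toy examples, not real AnyBURL output:

```python
MIN_CONFIDENCE = 0.5  # hypothetical confidence threshold

rules = [
    # (rule string, confidence) -- toy examples, not real AnyBURL rules
    ("married(X, Y) <= spouse(Y, X)", 0.92),
    ("citizen(X, usa) <= born_in(X, usa)", 0.71),
    ("actor(X) <= award(X, oscar)", 0.31),
]

# Sort by confidence (descending) and drop low-confidence rules.
kept = sorted(
    (r for r in rules if r[1] >= MIN_CONFIDENCE),
    key=lambda r: r[1],
    reverse=True,
)
```

Keeping the confidence alongside each surviving rule makes it cheap to attach rule and confidence to every predicted fact later.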

Create new OWER dataset

It has turned out that the validation split of the current ower-fb-3 dataset is not useful, as all of the ground-truth classes are false for the first 300 entities.

The script for building the OWER dataset needs to be examined and corrected. As a rough estimate, the most frequent classes should be true for at least every 100th entity.

Set up graph DBs

Set up Neo4j graphs for Freebase and CoDEx and run some Cypher queries for AnyBURL rules.

Tensorboard

Log train/valid loss and visualize it, e.g. with Tensorboard.
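A minimal sketch of the logging, using PyTorch's bundled `SummaryWriter`; the loss values per epoch are made up:

```python
import tempfile

from torch.utils.tensorboard import SummaryWriter

logdir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=logdir)

for epoch in range(3):
    train_loss = 1.0 / (epoch + 1)  # stand-in for the real train loss
    valid_loss = 1.2 / (epoch + 1)  # stand-in for the real valid loss
    writer.add_scalar("loss/train", train_loss, epoch)
    writer.add_scalar("loss/valid", valid_loss, epoch)

writer.close()
```

Running `tensorboard --logdir <logdir>` then shows both curves; using the shared `loss/` prefix groups them in one section of the dashboard.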

No concat

The current model concatenates the sentence mixes before feeding them into the linear layer. This is wrong as it does not assign a class embedding to a single output class.

Instead, each sentence mix should be passed through the linear layer individually.

Furthermore, it would be interesting to compare the one-linear-layer-for-all approach to a one-linear-layer-per-class approach.

Overfit

Overfit a single sample on a randomly initialized classifier.
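The standard sanity check can be sketched as below; the tiny linear model and the shapes are placeholders for the real classifier, and the loss on the single sample should drop to near zero:

```python
import torch
from torch import nn

torch.manual_seed(0)

# One input and one multi-label target; shapes are illustrative only.
x = torch.randn(1, 16)
y = torch.tensor([[1.0, 0.0, 1.0]])

model = nn.Linear(16, 3)  # stand-in for the real classifier
opt = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

# Repeatedly fit the same sample; the loss should approach zero.
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

final_loss = loss.item()
```

If the loss does not collapse on a single sample, the bug is in the model or the training loop, not in the data.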

Min/Max Pooling

Contrary to expectations, the class embeddings do not fit the embeddings of semantically similar words very well. The reason might be that the words' discriminative features get lost among the many other words in the sentence.

Try min/max pooling for getting the sentence embedding and compare the results with the usual mean pooling.
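The difference between the two pooling variants can be seen on a toy sentence, where one word carries a strong feature that mean pooling dilutes:

```python
import torch

# Toy word embeddings for one sentence: (num_words, dim).
words = torch.tensor([
    [0.1, 2.0],
    [0.2, -1.0],
    [3.0, 0.0],
])

# Mean pooling averages all words, diluting rare discriminative features.
mean_emb = words.mean(dim=0)

# Max pooling keeps, per dimension, the strongest activation of any word,
# so a single discriminative word can dominate the sentence embedding.
max_emb = words.max(dim=0).values
```

Min pooling works the same way with `words.min(dim=0).values`, capturing strongly negative activations instead.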

No attention

For some reason, the valid loss does not decrease significantly on the attention model. Compare it with a baseline that does not include the attention mechanism.

Metrics

Measure and plot precision, recall, and F1.
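For the multi-label setting, micro-averaged metrics can be computed directly from the prediction and label matrices; the matrices below are toy values:

```python
import torch

# Hypothetical multi-label predictions and ground truth (1 = class applies).
preds  = torch.tensor([[1, 0, 1],
                       [0, 1, 0]])
labels = torch.tensor([[1, 1, 1],
                       [0, 1, 1]])

# Micro-averaging: count true/false positives/negatives over all cells.
tp = ((preds == 1) & (labels == 1)).sum().item()
fp = ((preds == 1) & (labels == 0)).sum().item()
fn = ((preds == 0) & (labels == 1)).sum().item()

precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Macro-averaging (computing the metrics per class and then averaging) is the variant to use when rare classes should weigh as much as frequent ones.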

Refactor multi-linear classifier

Remove the two redundant outputs of the linear layers, making the one-hot-encoding obsolete.

Also, update the code base to the multi-linear version.

Pre-trained word embeddings

Use pre-trained word embeddings. Otherwise, the sentence embeddings might not capture the meaning of the sentences.
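Loading pre-trained vectors into the embedding layer could look like this; the 4x3 matrix is a stand-in for real vectors such as GloVe:

```python
import torch
from torch import nn

# Stand-in for a pre-trained embedding matrix (rows = vocabulary entries).
pretrained = torch.tensor([
    [0.0, 0.0, 0.0],   # <pad>
    [0.1, 0.3, -0.2],  # "married"
    [0.2, 0.4, -0.1],  # "husband"
    [1.5, -0.9, 0.7],  # "tensor"
])

# freeze=False lets the vectors be fine-tuned during training.
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False, padding_idx=0)

sentence = torch.tensor([1, 2, 0])  # token ids, padded with id 0
vectors = embedding(sentence)
```

With `freeze=True` the pre-trained vectors stay fixed, which can help when the training set is too small to learn good embeddings from scratch.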
