galeos93 / disaster_tweets
Repository to work with "Natural Language Processing with Disaster Tweets" Kaggle competition
License: MIT License
Now that our dataset is ready, we are able to train a model. For a baseline, nothing too elaborate is needed. Possible ideas:
Try to obtain the same performance as https://www.kaggle.com/pinkaxe/pytorch-with-embeddingbag-layer, which is very similar to your approach. I don't obtain such high accuracy on the validation set.
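As a starting point for such a baseline, here is a minimal sketch of a bag-of-embeddings classifier built on `nn.EmbeddingBag`. It is not taken from the linked notebook: the class name `BaselineClassifier`, the vocabulary size, and the embedding dimension are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class BaselineClassifier(nn.Module):
    """Hypothetical baseline: average the token embeddings, then one linear layer."""

    def __init__(self, vocab_size: int, embed_dim: int, num_classes: int = 2):
        super().__init__()
        # EmbeddingBag looks up and averages embeddings per tweet in one step.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        # tokens: 1-D tensor with all tweets concatenated.
        # offsets: index in `tokens` where each tweet starts.
        return self.fc(self.embedding(tokens, offsets))


# Assumed sizes, only for the example.
model = BaselineClassifier(vocab_size=100, embed_dim=16)
tokens = torch.tensor([1, 2, 3, 4, 5])  # two tweets: [1, 2, 3] and [4, 5]
offsets = torch.tensor([0, 3])
logits = model(tokens, offsets)  # one row of class scores per tweet
```

The model takes a flat token tensor plus offsets rather than a padded 2-D batch, so whatever feeds it has to produce that pair of tensors.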
I think TweetDataset will not work directly if I feed it to DataLoader, because the sentences have different lengths, which makes it impossible to stack them into a batch. I have to find out how to fix it.
For example, here, they use collate_fn=collate_batch to "flatten" the sentences into a giant one-dimensional vector. I will see if there are alternatives to this.
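The flattening idea can be sketched roughly as follows. This is an illustration, not the code from the referenced example: it assumes each dataset item is a `(token_ids, label)` pair, and the sample data is made up.

```python
import torch
from torch.utils.data import DataLoader


def collate_batch(batch):
    """Concatenate variable-length tweets into one 1-D tensor plus start offsets."""
    labels, token_lists, lengths = [], [], [0]
    for token_ids, label in batch:  # assumed item layout: (list of ints, int)
        labels.append(label)
        token_lists.append(torch.tensor(token_ids, dtype=torch.int64))
        lengths.append(len(token_ids))
    tokens = torch.cat(token_lists)
    # The cumulative sum of the preceding lengths gives each tweet's start index.
    offsets = torch.tensor(lengths[:-1]).cumsum(dim=0)
    return tokens, offsets, torch.tensor(labels, dtype=torch.int64)


# Made-up samples with different lengths; a plain list works as a map-style dataset.
samples = [([4, 8, 15], 1), ([16, 23], 0), ([42], 1)]
loader = DataLoader(samples, batch_size=3, collate_fn=collate_batch)
tokens, offsets, labels = next(iter(loader))
```

One alternative worth checking is padding every sentence in the batch to the same length inside the collate function, for instance with `torch.nn.utils.rnn.pad_sequence`, which yields a regular 2-D tensor at the cost of some wasted positions.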
We need to add tox so we can share a common environment easily.
We have to perform an Exploratory Data Analysis on the data. As Andrej Karpathy wisely indicated in this post, we have to become one with the data. Understanding the data is the first thing that has to be done before we start coding.
Naturally, we need to feed our data to our model. Well, how do we do that? Luckily, PyTorch has a comprehensive guide on datasets and data loaders we can follow.
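A map-style dataset in PyTorch only has to implement `__len__` and `__getitem__`. The sketch below shows one possible shape for it; the whitespace tokenizer, the `vocab` dictionary, and the constructor arguments are assumptions made for the example, not the repository's actual implementation.

```python
from torch.utils.data import Dataset


class TweetDataset(Dataset):
    """Hypothetical dataset that returns (token_ids, target) for each tweet."""

    def __init__(self, texts, targets, vocab, unk_index=0):
        self.texts = texts          # list of raw tweet strings
        self.targets = targets      # list of 0/1 labels
        self.vocab = vocab          # dict mapping token -> integer id (assumed)
        self.unk_index = unk_index  # id used for out-of-vocabulary tokens

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Naive lowercase + whitespace tokenization, only for illustration.
        tokens = self.texts[idx].lower().split()
        token_ids = [self.vocab.get(tok, self.unk_index) for tok in tokens]
        return token_ids, self.targets[idx]


vocab = {"<unk>": 0, "forest": 1, "fire": 2}
dataset = TweetDataset(["Forest fire near town", "I love fruit"], [1, 0], vocab)
first_ids, first_target = dataset[0]
```

Each item is a Python list whose length depends on the tweet, so the default batching of `DataLoader` would not be able to stack the items as they are.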
I will need: