Previous Work on the Umbrella Issue
- regex/string matching
- capturing common misspellings of words
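As a rough illustration of how regex matching can also catch common misspellings, the sketch below builds a pattern that tolerates repeated letters and look-alike characters. The `LOOKALIKES` table and the function names are hypothetical, chosen only for this example, and the neutral word "spam" stands in for an entry from the slur list.

```python
import re
from typing import List

# Hypothetical substitution table: characters often typed in place of a letter.
LOOKALIKES = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def variant_pattern(word: str) -> "re.Pattern":
    """Build a regex matching `word` with repeated or substituted letters."""
    parts = []
    for ch in word.lower():
        accepted = LOOKALIKES.get(ch, ch)
        # One character class per letter, allowed to repeat ("spaaam").
        parts.append("[" + re.escape(accepted) + "]+")
    # The lookarounds keep the match from starting or ending inside a longer word.
    return re.compile(r"(?<!\w)" + "".join(parts) + r"(?!\w)", re.IGNORECASE)


def find_variants(word: str, text: str) -> List[str]:
    """Return every span of `text` that looks like a variant of `word`."""
    return [m.group(0) for m in variant_pattern(word).finditer(text)]


hits = find_variants("spam", "no sp@m, no SPAAAM, no $p4m, but spammer is fine")
print(hits)
```

This only covers substitutions and repeated letters. Inflected forms such as "spammer" are deliberately not matched, which is the gap that root-word derivation is meant to close.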
Derive Root Words of the words in the Slur List
From what I broadly understand, what we want to do is lemmatization and stemming.
The root word is called the lemma.
Automated/Semi-Automated Approaches
Lemmatization for English
- https://www.geeksforgeeks.org/python-lemmatization-approaches-with-examples/
- Popular Python libraries: WordNet, WordNet (with POS tag), TextBlob, TextBlob (with POS tag), spaCy, TreeTagger, Pattern, Gensim, Stanford CoreNLP, etc. Comparing NLTK, TextBlob, spaCy, Pattern and Stanford CoreNLP - Link
- https://www.geeksforgeeks.org/python-lemmatization-with-nltk/
- Stemming and types of stemming - https://www.engati.com/glossary/stemming#:~:text=popular%20stemming%20algorithms%3F-,What%20is%20stemming%20in%20NLP%3F,reduced%20to%20a%20standard%20form.
- Porter's algorithm (suffix stripping) - https://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html
- https://stackoverflow.com/questions/32520521/detect-words-with-the-same-root
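To make the suffix-stripping idea concrete, here is a minimal sketch of a crude English stemmer that groups words sharing a stem. It is not Porter's algorithm, which applies several ordered rule phases with extra conditions. The suffix list, `MIN_STEM`, and the function names are assumptions made for this example, and neutral words stand in for slur-list entries.

```python
from collections import defaultdict

# Illustrative suffix list (an assumption), ordered longest first so that
# "ers" is tried before "s".
SUFFIXES = ("ingly", "edly", "ings", "ers", "ing", "est", "ed", "er", "ly", "es", "s")
MIN_STEM = 3  # assumption: never cut a word down to fewer than three letters


def crude_stem(word):
    """Strip the first matching suffix, if enough of the word remains."""
    w = word.lower()
    for suf in SUFFIXES:
        if w.endswith(suf) and len(w) - len(suf) >= MIN_STEM:
            return w[: -len(suf)]
    return w


def group_by_stem(words):
    """Map each stem to the list of input words that reduce to it."""
    groups = defaultdict(list)
    for w in words:
        groups[crude_stem(w)].append(w)
    return dict(groups)


print(group_by_stem(["troll", "trolls", "trolling", "trolled", "bus"]))
```

A stemmer this simple makes mistakes. For example, "running" becomes "runn" because doubled consonants are not handled, which is one reason a dictionary-backed lemmatizer from the libraries listed above may be preferable.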
Lemmatization for Hindi
- A simple stemmer in Java - https://stackoverflow.com/questions/4007558/is-there-is-any-stemmer-available-for-indian-language
- NLTK library
- A Hybrid Approach for Lemmatizer for Hindi Language
- A Rule-based Part-of-Speech and Morphological Tagging Toolkit - Link1, Link2
- Multilingual NLTK for POS Tagging and Lemmatizer
- Taxonomic survey of resources for Hindi Language NLP systems -> See section 4.2.9
- https://paperswithcode.com/task/lemmatization
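The same suffix-stripping idea can be sketched for Hindi. The example below is a lightweight illustration only and is not taken from any of the resources listed above. The suffix list is a small assumed subset of Hindi inflectional endings, and `hindi_stem` and `MIN_STEM` are hypothetical names.

```python
import unicodedata

# Small illustrative subset of Hindi inflectional suffixes (an assumption,
# not a complete list), sorted so longer suffixes are tried first.
SUFFIXES = ["ियों", "ियाँ", "ाओं", "ाएँ", "ों", "ें", "ीं", "ा", "ी", "े", "ो"]
SUFFIXES.sort(key=len, reverse=True)
MIN_STEM = 2  # assumption: keep at least two code points of the word


def hindi_stem(word):
    """Strip the longest matching suffix from a Devanagari word."""
    # Normalize so that visually identical strings compare equal.
    w = unicodedata.normalize("NFC", word)
    for suf in SUFFIXES:
        if w.endswith(suf) and len(w) - len(suf) >= MIN_STEM:
            return w[: -len(suf)]
    return w


# Singular, plural, and oblique plural of "book" reduce to the same stem.
for form in ["किताब", "किताबें", "किताबों"]:
    print(form, "->", hindi_stem(form))
```

The output of a stemmer like this is not always a dictionary word. Recovering a true lemma would need morphological rules or a lexicon, as in the hybrid and rule-based approaches listed above.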
Approaches with Human Intervention
- Rule-based Lemmatization
- Manual Annotations of Root and Not Root
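One way the two human-in-the-loop ideas above could fit together is sketched below. Annotators label each word as Root or Not Root, those labels become a lookup table, and hand-written rules handle words nobody has annotated yet. The `Annotation` record, the rule format, and the function names are all hypothetical, and neutral words stand in for slur-list entries.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Sequence, Tuple


@dataclass
class Annotation:
    word: str
    is_root: bool               # the annotator's "Root" / "Not Root" label
    root: Optional[str] = None  # supplied by the annotator for non-root words


def build_lemma_table(annotations: Iterable[Annotation]) -> Dict[str, Optional[str]]:
    """Turn manual annotations into a word -> root lookup table."""
    return {a.word: (a.word if a.is_root else a.root) for a in annotations}


def lemmatize(word: str,
              table: Dict[str, Optional[str]],
              rules: Sequence[Tuple[str, str]] = ()) -> str:
    """Prefer the human-annotated root, then fall back to suffix rules."""
    if table.get(word) is not None:
        return table[word]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: -len(suffix)] + replacement
    return word  # unknown word: return it unchanged for later review


annotations = [Annotation("troll", True), Annotation("trolls", False, "troll")]
table = build_lemma_table(annotations)
rules = [("ers", "e")]  # hypothetical rule: "flamers" -> "flame"
print(lemmatize("trolls", table, rules), lemmatize("flamers", table, rules))
```

Checking the annotated table first means a human decision always overrides a rule, so rule mistakes can be corrected by adding an annotation rather than editing code.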
from uli.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.