python-implementation-of-lsa's Introduction

Python-Implementation-of-LSA

Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA’s reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word–word and passage–word lexical priming data; and it has been reported to accurately estimate passage coherence, learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
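
To make the idea concrete, here is a minimal sketch of the core LSA computation using NumPy rather than the Gensim code in this repository. The vocabulary, the counts, and the choice of k = 2 are made up for illustration. "cat" and "dog" never appear in the same document, so their raw count rows are orthogonal, but both co-occur with "pet", and a truncated SVD places them close together in the latent space.

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
# All values are illustrative assumptions, not data from this repository.
terms = ["cat", "dog", "pet", "stock", "market"]
X = np.array([
    [2, 0, 0],   # cat:    only in document 0
    [0, 2, 0],   # dog:    only in document 1
    [1, 1, 0],   # pet:    in documents 0 and 1
    [0, 0, 2],   # stock:  only in document 2
    [0, 0, 1],   # market: only in document 2
], dtype=float)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Truncated SVD: keep only the k largest singular values.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
term_vectors = U[:, :k] * s[:k]   # one k-dimensional row per term

raw_cat_dog = cosine(X[0], X[1])                          # raw count space
lsa_cat_dog = cosine(term_vectors[0], term_vectors[1])    # latent space
lsa_cat_stock = cosine(term_vectors[0], term_vectors[3])  # unrelated terms

print("raw cat/dog:  ", round(raw_cat_dog, 3))
print("LSA cat/dog:  ", round(lsa_cat_dog, 3))
print("LSA cat/stock:", round(lsa_cat_stock, 3))
```

On real corpora the matrix is usually weighted (for example with TF-IDF) before the decomposition, and k is much larger; this sketch leaves both out to keep the mechanics visible.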

Prerequisites

Things required

Jupyter Notebook
Python
Gensim

Getting Started

To use this code, download the repository and open it in Jupyter Notebook. The code is ready to use, so what are you waiting for? Start creating something awesome! Good luck!

Built With

Gensim - The main library used
Python - The programming language used
Jupyter Notebook - A web-based coding environment

Contributing

Feel free to submit pull requests to me.

Authors

Muhammad Haseeb - Initial work - Muhammad Haseeb

License

This project is licensed under the MIT License - see the LICENSE file for details.

python-implementation-of-lsa's Issues

Not getting proper results

I ran the code on my own files, reading them and storing them in a list named document[]. Where you define new_doc, I take input from the user with input() and assign it to new_doc. When I run the program, it shows the wrong result.
Your code prints a list of file indices and their weights, but the right one is not at the top, nor even in the top 10.

response time

When we use our own data or documents rather than yours, the output is not valid.
For example, if I take these 5 documents:
documents = ["Computer is a machine", "Dog is animal", "Hamza is human", "I am in Multan" , "How are you"]
and match "dog is animal",
then the output is
[(0,1.0), (1,1.0),(2,1.0),(3,0.0),(4,0.0)]
which is invalid.
Also, if I run the code on text files, it takes a very long time to execute.
For example, if I read files from my PC using glob and store them in documents[], your program takes more than 1 hour to execute.
