Python-Implementation-of-LSA

Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA’s reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word–word and passage–word lexical priming data; and it has been reported to accurately estimate passage coherence, the learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
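
As a rough illustration of the idea above, here is a minimal sketch of LSA-style document ranking using a truncated SVD in NumPy. It is not the code from this repository (which relies on Gensim): the function name `lsa_similarities`, the whitespace tokenizer, the raw count weighting, and the toy corpus are all assumptions made for the example.

```python
import numpy as np


def lsa_similarities(documents, query, num_topics=2):
    """Rank documents against a query in a truncated-SVD (LSA) space.

    Hypothetical helper for illustration; not part of this repository.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocab = sorted({term for doc in tokenized for term in doc})
    row = {term: i for i, term in enumerate(vocab)}

    # Term-document count matrix: one row per term, one column per document.
    counts = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(tokenized):
        for term in doc:
            counts[row[term], j] += 1.0

    # Keep only the top `num_topics` left singular vectors (the latent topics).
    u, _, _ = np.linalg.svd(counts, full_matrices=False)
    topics = u[:, :num_topics]

    # Project the documents and the query into the latent space.
    doc_vecs = (topics.T @ counts).T
    q = np.zeros(len(vocab))
    for term in query.lower().split():
        if term in row:  # words unseen in the corpus are ignored
            q[row[term]] += 1.0
    q_vec = topics.T @ q

    def cosine(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        # Treat (near-)zero vectors as having no similarity.
        return float(a @ b / denom) if denom > 1e-12 else 0.0

    return [(j, cosine(q_vec, d)) for j, d in enumerate(doc_vecs)]


# Toy corpus (an assumption for this example): two documents about
# computers and two about graphs, with no shared vocabulary between groups.
documents = [
    "human machine interface computer",
    "computer system user interface",
    "graph tree minors",
    "tree graph paths",
]
sims = lsa_similarities(documents, "computer interface")
print(sorted(sims, key=lambda pair: -pair[1]))
```

With two topics, the two computer-related documents end up on one latent dimension and the two graph-related documents on the other, so the query scores close to 1 against the first pair and close to 0 against the second. A real corpus would normally need stop-word removal and a weighting scheme such as TF-IDF before the SVD step.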

Prerequisites

Things required

  1. Jupyter Notebook
  2. Python
  3. Gensim

Getting Started

To use this code, download the repository and open it in Jupyter Notebook. The code is ready to use, so what are you waiting for? Start creating something awesome! Good luck!

Built With

Contributing

Feel free to submit pull requests to me.

Authors

License

This project is licensed under the MIT License; see the LICENSE file for details.

python-implementation-of-lsa's Issues

Not getting proper results

I ran the code on my own files, reading them and storing them in a list, document[]. Where you define new_doc, I take input from the user with input() and assign it to new_doc. When I run the program, it gives me the wrong result.
Your code returns a list of file indices and their weights, but the right file is not at the top, or even in the top 10.

response time

When we use our own data or documents rather than yours, the output is not valid.
For example, if I take these 5 documents:
documents = ["Computer is a machine", "Dog is animal", "Hamza is human", "I am in Multan", "How are you"]
and match "dog is animal",
the output is
[(0, 1.0), (1, 1.0), (2, 1.0), (3, 0.0), (4, 0.0)]
which is invalid.
Also, if I run the code on text files, it takes a very long time to execute.
For example, if I read files from my PC using glob and store them in documents[], the program takes more than 1 hour to run.
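
The scores reported for the five-document example above can be reproduced with a plain truncated SVD, which suggests they follow from using very few topics on a tiny corpus rather than from a bug. The sketch below is a NumPy re-derivation, not the repository's Gensim code; the `rank` helper is hypothetical, and the assumption that the notebook uses two topics is not confirmed by the source.

```python
import numpy as np

documents = ["Computer is a machine", "Dog is animal", "Hamza is human",
             "I am in Multan", "How are you"]
query = "dog is animal"


def rank(num_topics):
    """Cosine similarity of `query` to each document in a num_topics-dim LSA space.

    Hypothetical helper for illustration; not part of this repository.
    """
    docs = [d.lower().split() for d in documents]
    vocab = sorted({t for d in docs for t in d})

    def to_vec(words):
        # Raw term counts over the corpus vocabulary.
        return np.array([float(words.count(t)) for t in vocab])

    counts = np.column_stack([to_vec(d) for d in docs])  # terms x documents
    u = np.linalg.svd(counts, full_matrices=False)[0][:, :num_topics]
    q = u.T @ to_vec(query.lower().split())

    scores = []
    for j in range(len(docs)):
        d = u.T @ counts[:, j]
        denom = np.linalg.norm(q) * np.linalg.norm(d)
        # A document with no weight on the kept topics gets a score of 0.
        score = float(q @ d / denom) if denom > 1e-12 else 0.0
        scores.append((j, round(score, 3)))
    return scores


two = rank(2)   # assumed setting: two latent topics
five = rank(5)  # as many topics as there are documents
print(two)
print(five)
```

With two topics, documents 0, 1 and 2 all score 1.0 and documents 3 and 4 score 0.0, the same pattern as in the report: the three sentences that share the word "is" collapse onto a single latent dimension, so the query cannot tell them apart. With five topics, document 1 ranks first. On a corpus this small, removing stop words such as "is" and keeping more topics are the first things to try.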
