kidsearch

A search engine completed by Qingsong Lv, Shulin Cao and Yifan Wang. Our tutors are Qian Yin and Xin Zheng.

This repository is based on a national undergraduate scientific research and innovation project: a Simplified Chinese search engine for kids.

Sorry, this project is not available yet, but it will be available soon (maybe within one year).

introduction

We were sophomores when we built kidsearch. As time went by, we realized there was more we could do to make it valuable, so we decided to create this GitHub repository. This project aims to tidy up the kidsearch code we wrote from 2016 to 2017 and to open-source part of it. We will try our best to turn it into a unified system and to provide as many APIs as we can.

We think the best explanation of an API is the comments in its code, but some tutorials will also be available soon. If you want some background reading now, the related work section may help.

The initial version of our project is based on Java (Lucene), Python (crawler), PHP (frontend), and sockets (communication). We think the most useful part is the socket layer, because we added multi-threading to it. Since Python is so popular at present, we are also building a second, Python version of the project that uses PyLucene in place of Lucene and Django in place of PHP, which simplifies part of the socket communication. Both versions will be open-sourced.
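To illustrate the multi-threaded socket idea described above, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the names `QueryHandler`, `ThreadedSearchServer`, `send_query`, and `demo`, the one-line-per-query protocol, and the placeholder reply are all assumptions made for illustration.

```python
import socket
import socketserver
import threading


class QueryHandler(socketserver.StreamRequestHandler):
    """Handle one client connection; each connection runs in its own thread."""

    def handle(self):
        # Read one newline-terminated query from the client.
        query = self.rfile.readline().decode("utf-8").strip()
        # Placeholder for a real search back end (hypothetical).
        reply = "results for: " + query
        self.wfile.write((reply + "\n").encode("utf-8"))


class ThreadedSearchServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """TCP server that spawns a new thread per connection."""

    daemon_threads = True
    allow_reuse_address = True


def send_query(host, port, query):
    """Open a connection, send one query line, and return the reply line."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((query + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


def demo():
    """Start a server on a free local port, send one query, and shut down."""
    # Port 0 asks the operating system for any free port.
    server = ThreadedSearchServer(("127.0.0.1", 0), QueryHandler)
    host, port = server.server_address
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return send_query(host, port, "恐龙")
    finally:
        server.shutdown()
        server.server_close()
```

Because every connection gets its own thread, a slow query does not block other clients. A frontend written in another language could talk to such a server as long as both sides agree on the line-based protocol assumed here.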

Our project is mainly a search engine for Simplified Chinese. We use English in the documentation and comments because we think the project may also be helpful for other languages.

environment

This project is meant to help with lightweight search engine tasks, so it mainly runs on Windows.

requirement

Python 3.5 or later, Django (django-rest may also be needed), PyLucene, Apache, MySQL.

Some other Python packages are also needed: requests, ...
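As a quick way to check part of the requirements above, the following sketch tests the Python version and looks for importable modules. The import names `django`, `lucene` (the module PyLucene installs), and `requests` are assumptions about how the listed packages are imported. The helper names are hypothetical, and Apache and MySQL are not checked.

```python
import importlib.util
import sys

# Import names assumed for the Python packages listed above.
REQUIRED_MODULES = ["django", "lucene", "requests"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True for Python 3.5 or later."""
    return tuple(version_info[:2]) >= (3, 5)
```

Calling `missing_modules(REQUIRED_MODULES)` returns the subset of those modules that still needs to be installed.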

goal

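Build a convenient Python package for common search engine tasks. Here is an ideal example of the intended interface:

import kidsearch as ks

# Crawl from the seed sites, bounded by max_page and max_depth.
webpages = ks.crawler(['http://www.61tom.com', 'http://www.61baobao.com/'], max_page=1000, max_depth=10)
# Build a search index from the crawled pages.
indexes = ks.make_index(webpages)
# Placeholder query; the original example leaves key_words undefined.
key_words = '恐龙'
results = indexes.search(key_words)
print(ks.show(results))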
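The pipeline in the ideal example (crawl, index, search) can be sketched on toy data as follows. This is not the kidsearch implementation. It assumes that `max_depth` counts link hops from the seed pages and that `max_page` caps the number of pages fetched. It replaces real HTTP requests with a hypothetical in-memory `TOY_WEB`, and it splits text on whitespace, whereas Chinese text would need a word segmenter.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("panda eats bamboo", ["b", "c"]),
    "b": ("panda sleeps", ["d"]),
    "c": ("tiger runs", []),
    "d": ("bamboo grows fast", []),
}


def crawl(seeds, fetch, max_page=1000, max_depth=10):
    """Breadth-first crawl bounded by max_page pages and max_depth link hops."""
    pages = {}
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    while queue and len(pages) < max_page:
        url, depth = queue.popleft()
        text, links = fetch(url)
        pages[url] = text
        # Only follow links while we are still above the depth limit.
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return pages


def make_index(pages):
    """Build an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.split():
            index[word].add(url)
    return index


def search(index, key_words):
    """Return the sorted URLs that contain every key word."""
    hits = [index.get(word, set()) for word in key_words]
    return sorted(set.intersection(*hits)) if hits else []
```

For example, `search(make_index(crawl(["a"], TOY_WEB.__getitem__)), ["panda"])` looks up pages reachable from the seed `"a"` that contain the word `panda`. A production system would delegate indexing and ranking to Lucene or PyLucene rather than a plain dictionary.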

related work
