meerkat's Introduction

Meerkat

The Meerkat library enables general combinator-style context-free path querying.

meerkat's People

Contributors

afroozeh, anastassija, darthorimar, gsvgit, ilya-nozhkin, loskutov, mixnik999, sofysmol

meerkat's Issues

Cycles. SPPFToTrees. Start vertex in the graph.

Several bugs were found while using Meerkat:

  1. toDot.visit cannot handle cycles; a StackOverflowError is thrown
  2. SPPFToTrees does not work
  3. The start vertex of the graph cannot be chosen (it must always be 0)
trait Input[-L] {
  type M >: L
  type Edge = (M, Int)

  def length: Int

  def start: Int = 0

...

Everything was tested with this data: https://gist.github.com/simonvar/5bbc9fba89af239552ba7c1555714468
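
A recursive visitor overflows the stack on cyclic structures because it never notices that a node has already been seen. One possible remedy is an iterative traversal with a visited map, sketched below. The `Node` class and the `toDot` function here are hypothetical stand-ins written for illustration; they are not Meerkat's SPPF classes or its actual `toDot.visit`.

```scala
import scala.collection.mutable

object CycleSafeDot {
  // Hypothetical node type for illustration; Meerkat's SPPF node classes differ.
  final class Node(val label: String) {
    var children: List[Node] = Nil
  }

  /** Renders the graph reachable from `root` as DOT, visiting each node once. */
  def toDot(root: Node): String = {
    // Nodes use reference equality, so the map doubles as the visited set.
    val ids = mutable.LinkedHashMap[Node, Int](root -> 0)
    val edges = mutable.ArrayBuffer.empty[String]
    var work: List[Node] = List(root) // explicit work list instead of recursion

    while (work.nonEmpty) {
      val node = work.head
      work = work.tail
      for (child <- node.children) {
        val seen = ids.contains(child)
        val childId = ids.getOrElseUpdate(child, ids.size)
        edges += s"  n${ids(node)} -> n$childId;"
        if (!seen) work = child :: work // schedule only unseen nodes
      }
    }

    val nodes = ids.toSeq.map { case (n, id) => s"""  n$id [label="${n.label}"];""" }
    (Seq("digraph sppf {") ++ nodes ++ edges ++ Seq("}")).mkString("\n")
  }
}
```

Because each node is scheduled at most once, the loop terminates on cyclic input, and the explicit work list keeps the call stack shallow regardless of graph depth.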

Cases for static code analysis via CFL-reachability

Useful wrappers for SPPF

Provide user-friendly representation of SPPF:

  • Set of triples
  • Lazy collection of trees
  • Lazy collection of paths (a path is the yield of a tree, so implement the previous item first)
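
The relationship between the three representations can be sketched as follows, assuming a lazy collection of trees is already available. The `Tree`, `Leaf`, and `Branch` types are hypothetical and exist only for this illustration, and reading a "triple" as (start position, nonterminal, end position) is an assumption about what the issue means.

```scala
object SppfViews {
  // Hypothetical tree shape for illustration; not Meerkat's actual SPPF classes.
  sealed trait Tree { def from: Int; def to: Int }

  /** A single edge of the input graph, labelled `label`, going from `from` to `to`. */
  final case class Leaf(label: String, from: Int, to: Int) extends Tree

  /** An inner node for nonterminal `symbol`; this sketch does not cover empty derivations. */
  final case class Branch(symbol: String, children: List[Tree]) extends Tree {
    require(children.nonEmpty, "empty derivations are out of scope for this sketch")
    def from: Int = children.head.from
    def to: Int = children.last.to
  }

  /** The yield of a tree: its leaves from left to right, i.e. a path in the input. */
  def yieldOf(tree: Tree): List[Leaf] = tree match {
    case leaf: Leaf          => List(leaf)
    case Branch(_, children) => children.flatMap(yieldOf)
  }

  /** Lazily maps each tree to its path; nothing is computed until the iterator is consumed. */
  def paths(trees: Iterator[Tree]): Iterator[List[Leaf]] = trees.map(yieldOf)

  /** Collects (start, nonterminal, end) for the root of each tree; consumes the iterator. */
  def triples(trees: Iterator[Tree]): Set[(Int, String, Int)] =
    trees.collect { case b: Branch => (b.from, b.symbol, b.to) }.toSet
}
```

Using `Iterator` keeps the path view lazy on both Scala 2.12 and 2.13. Note that `triples` forces the whole collection, so it is only suitable when the set of trees is finite.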

Test. Graph databases: path-finding algorithms

For those who want to work on task #54.

In Scala, create an application that generates a random graph with at least 10000 vertices and 1000000 edges, stores it in Neo4j, and then runs several (at least 3) different path-finding queries against it. You may use any available means of integrating Neo4j and Scala.

Result: a link to a repository with the implementation. The repository must include everything needed to build and run the application.
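
The graph generation step alone could look roughly like the sketch below. It is only a starting point under stated assumptions: the `RandomGraph` object, the label alphabet, and the seed parameter are all illustrative, duplicate edges and self-loops are allowed, and storing the result in Neo4j is deliberately left out because the choice of driver is open.

```scala
import scala.util.Random

object RandomGraph {
  /**
   * Lazily generates `edgeCount` directed edges (source, label, target)
   * over the vertices 0 until vertexCount. Duplicates and self-loops may occur.
   */
  def edges(vertexCount: Int,
            edgeCount: Int,
            labels: IndexedSeq[String],
            seed: Long): Iterator[(Int, String, Int)] = {
    val rnd = new Random(seed) // fixed seed makes the graph reproducible
    Iterator.fill(edgeCount) {
      (rnd.nextInt(vertexCount), labels(rnd.nextInt(labels.length)), rnd.nextInt(vertexCount))
    }
  }
}
```

For the sizes named in the task, a call such as `RandomGraph.edges(10000, 1000000, Vector("a", "b"), seed = 1L)` produces the edges one at a time, so they can be written out in batches without holding the whole graph in memory.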

Semantic calculation for graph parsing

Meerkat supports semantic code specification, but it works only if the result of parsing is an SPPF with a single tree. An SPPF is a compressed representation of a parse forest. In the case of graph parsing, we get an arbitrary SPPF, which can contain an infinite set of trees. One possible solution is lazy tree extraction: semantic code can then be calculated for each tree by using existing methods. The result is a lazy collection of semantic values, one for each tree in the SPPF.

  • Implement BFS-based lazy tree extraction.
  • Implement a user-facing interface for calculating semantic code over an arbitrary parsing result.
  • Implement tests.
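
A minimal sketch of BFS-based lazy extraction is shown below. It assumes a simplified forest in which each symbol node lists its alternative derivations; the `Sym`, `Term`, `Hole`, and `trees` names are hypothetical and do not correspond to Meerkat's SPPF classes. Partial trees are kept in a FIFO queue, and the leftmost unexpanded node of the head is replaced by each of its alternatives in turn.

```scala
import scala.collection.immutable.Queue

object LazyTrees {
  // Simplified forest: a symbol node lists its alternatives (hypothetical model).
  sealed trait Node
  final case class Term(label: String) extends Node
  final class Sym(val name: String) extends Node { var alts: List[List[Node]] = Nil }

  sealed trait Tree
  final case class Leaf(label: String) extends Tree
  final case class Branch(name: String, kids: List[Tree]) extends Tree
  final case class Hole(sym: Sym) extends Tree // a symbol that is not expanded yet

  private def unexpanded(n: Node): Tree = n match {
    case Term(l) => Leaf(l)
    case s: Sym  => Hole(s)
  }

  private def hasHole(t: Tree): Boolean = t match {
    case Leaf(_)         => false
    case Hole(_)         => true
    case Branch(_, kids) => kids.exists(hasHole)
  }

  /** Replaces the leftmost hole by each of its alternatives. */
  private def expand(t: Tree): List[Tree] = t match {
    case Leaf(_) => Nil
    case Hole(s) => s.alts.map(alt => Branch(s.name, alt.map(unexpanded)))
    case Branch(n, kids) =>
      val i = kids.indexWhere(hasHole)
      if (i < 0) Nil else expand(kids(i)).map(k => Branch(n, kids.updated(i, k)))
  }

  /** Leaf labels from left to right; holes contribute nothing. */
  def leaves(t: Tree): List[String] = t match {
    case Leaf(l)         => List(l)
    case Branch(_, kids) => kids.flatMap(leaves)
    case Hole(_)         => Nil
  }

  /** Lazily enumerates complete trees in breadth-first order of expansion steps. */
  def trees(root: Sym): Iterator[Tree] = new Iterator[Tree] {
    private var queue: Queue[Tree] = Queue(Hole(root))

    // Expand partial trees until a complete one reaches the front of the queue.
    private def advance(): Unit =
      while (queue.nonEmpty && hasHole(queue.head)) {
        val (t, rest) = queue.dequeue
        queue = rest ++ expand(t)
      }

    def hasNext: Boolean = { advance(); queue.nonEmpty }
    def next(): Tree = { advance(); val (t, rest) = queue.dequeue; queue = rest; t }
  }
}
```

Semantic values could then be obtained lazily with something like `trees(root).map(evaluate)`, where `evaluate` is a hypothetical user-supplied function. One caveat of this sketch: if the forest contains a cycle from which no finite tree can be completed, `hasNext` may not terminate.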

Automatic build results publishing

Travis provides integration with a wide range of deployment services: https://docs.travis-ci.com/user/deployment/

It would be great to automate the publishing of stable build results.

  • List of builds/jars on a separate GitHub page, updated automatically
  • Automated versioning of CI build artifacts
  • Automated release process: versioning, tag creation, binaries publishing, release notes creation

Distributed version of Meerkat

Implement a distributed version of Meerkat. It is necessary for parsing huge graphs. (Synchronize with this task)

  • Naive graph partitioning (for the first step).
  • Implement a distributed version of GLL, without SPPF as the first step. Descriptors for border vertices must be shared. The main problem is sharing the GSS, because a descriptor contains a reference to a GSS node. It seems that the GSS can be filtered so that it contains only the vertices reachable from the descriptors to be shared.
  • Testing. Create a good test base.
  • Optimal graph partitioning. Synchronize with this issue.
  • Evaluation. Example of big RDF
  • Version with SPPF.
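
The naive partitioning step can be illustrated with a short sketch. It assumes integer vertex identifiers and assigns each vertex to a part by its remainder modulo the number of parts; the `NaivePartition` object and its functions are hypothetical, and treating the target of a crossing edge as a "border vertex" is one possible reading of the term.

```scala
object NaivePartition {
  type Edge = (Int, String, Int) // (source, label, target)

  /** Owner part of a vertex: remainder modulo the number of parts. */
  def partOf(vertex: Int, parts: Int): Int = java.lang.Math.floorMod(vertex, parts)

  /** Groups edges by the part that owns their source vertex. */
  def partition(edges: Seq[Edge], parts: Int): Map[Int, Seq[Edge]] =
    edges.groupBy(e => partOf(e._1, parts))

  /** Targets of edges whose endpoints live in different parts. */
  def borderVertices(edges: Seq[Edge], parts: Int): Set[Int] =
    edges.collect { case (s, _, t) if partOf(s, parts) != partOf(t, parts) => t }.toSet
}
```

Descriptors that reach a vertex in `borderVertices` are the ones that would have to be handed over to another part. This scheme ignores graph structure entirely, which is why the list treats optimal partitioning as a separate step.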

Evaluation

  • Create an image of Neo4j, which contains data from this data set.
  • Implement ability to switch between two modes of query evaluation: with SPPF and without SPPF.
  • Implement queries from the dataset mentioned above.
  • Do performance evaluation.
  • Create examples that demonstrate the features of combinators
    • Composability
    • Type safety
    • Semantic actions specification (look at #5)

Integration with JGraphT

Create an adapter for JGraphT library.

  • Use a JGraphT graph as a data source. Labels of vertices and edges may be of arbitrary type.
  • Provide a user-friendly representation of the query result. Look at #22. Note that paths should be constructed in terms of the input graph.

[Meta] Combinators for graph DB

  • Unified interface for graph-structured data querying: #21
  • Documentation (github.io): #27
  • Automated release process: #30
  • Parallel query processing (single machine, shared memory): #25
  • Distributed query processing: #24
  • Conjunctive languages support: #26

Interface for input

Create an interface for input data.

  • Generalization of linear and graph-structured data
  • Should be as general as possible
    It seems that the following functions are required:
  • Get outgoing tokens by position
  • Convert edge label to token
  • Convert vertex label to token
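
One possible shape for such an interface is sketched below. It is an assumption-laden illustration, not the `Input` trait that exists in Meerkat: the names `outgoing`, `LinearInput`, and `GraphInput` are hypothetical, positions are plain integers, and vertex labels are omitted to keep the example short.

```scala
object InputSketch {
  /** Hypothetical unified input; `T` is the token type seen by the parser. */
  trait Input[T] {
    /** Tokens leaving `position`, each paired with the position it leads to. */
    def outgoing(position: Int): Seq[(T, Int)]
  }

  /** Linear input: position i has exactly one outgoing token, leading to i + 1. */
  final class LinearInput[T](tokens: IndexedSeq[T]) extends Input[T] {
    def outgoing(position: Int): Seq[(T, Int)] =
      if (position >= 0 && position < tokens.length) Seq((tokens(position), position + 1))
      else Seq.empty
  }

  /** Graph input: edge labels of type `E` are converted to tokens on demand. */
  final class GraphInput[E, T](edges: Seq[(Int, E, Int)], edgeToToken: E => T) extends Input[T] {
    private val bySource = edges.groupBy(_._1) // index edges by source vertex

    def outgoing(position: Int): Seq[(T, Int)] =
      bySource.getOrElse(position, Seq.empty).map {
        case (_, label, target) => (edgeToToken(label), target)
      }
  }
}
```

With this shape, a string is just a graph that happens to be a chain, so a parser written against `Input[T]` does not need to distinguish the two cases.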

Parallel GLL

Try to implement a parallel version of GLL. Is it possible without tons of locks on all shared data structures? Can it really be faster than sequential GLL?

Test task

Implement a parser combinator library in Scala. Provide it with tests and examples. Publish it on GitHub. Leave a link to the repository in the comments to this task. Ask questions there as well.
