Comments (8)

cigolpl commented on September 21, 2024

hey @noraj,

ItemsAPI 2.0 is a completely new version that we have been working on since March 2020. We use and improve it in our commercial products, which is why there is no documentation yet. We plan to document all features and the API soon, using a tool like https://github.com/Redocly/redoc

Let me answer your questions here:

  • Until the documentation exists, you can see the available API endpoints and query parameters in the JS client: https://github.com/itemsapi/itemsapi-js.

  • I'll soon update an example to show more of the possible values in the config file. The config file mainly defines which facets and sorting fields you want to index, and the policy for displaying facets (conjunctive or disjunctive). Indexing a field makes searching and sorting on that field very fast.

  • Architecture: the goal is to make it fast and easy to start on a single machine (as RAM and SSDs are getting cheaper). We have a use case with 30M records and 30GB of data where a faceted search takes less than 250ms on a cheap server. It's the fastest faceted search solution on the market. We use C++ for speed in the core features, https://github.com/RoaringBitmap/CRoaring for the bitmap data structure, https://github.com/simdjson/simdjson for fast JSON parsing, and optimized set algorithms for faceted search.

  • Creating an index: you provide a config that defines the faceted and sorting fields, so the system knows what to index when you provide data. Then you provide the data as JSON; the system parses it, tokenizes it, and stores all terms in a bitmap data structure. Once that's done you are ready to run searches. Besides the API there is also a UI that lets you explore the data.
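As a rough illustration of the indexing idea above, here is a minimal JavaScript sketch. It is not the ItemsAPI implementation (which is described as C++ with Roaring bitmaps): plain Sets of item ids stand in for the bitmaps, and the names buildIndex and filterIds, the sample items, and the shape of the filters and conjunctive arguments are all assumptions made for this example.

```javascript
// Hypothetical sketch: a Set of item ids per facet value stands in for a bitmap.
function buildIndex(items, facetFields) {
  const index = {};
  for (const field of facetFields) index[field] = new Map();
  items.forEach((item, id) => {
    for (const field of facetFields) {
      // Accept a single value or an array of values; skip missing fields.
      const values = [].concat(item[field] ?? []);
      for (const value of values) {
        if (!index[field].has(value)) index[field].set(value, new Set());
        index[field].get(value).add(id);
      }
    }
  });
  return index;
}

function intersect(a, b) {
  return new Set([...a].filter((id) => b.has(id)));
}

function union(a, b) {
  return new Set([...a, ...b]);
}

// filters: { field: [values] }, conjunctive: { field: true | false }
// Values inside one field are ANDed (conjunctive) or ORed (disjunctive);
// different fields are always ANDed together.
function filterIds(index, filters, conjunctive, total) {
  let result = new Set(Array.from({ length: total }, (_, id) => id));
  for (const [field, values] of Object.entries(filters)) {
    const sets = values.map((value) => index[field].get(value) ?? new Set());
    if (sets.length === 0) continue;
    const combined = sets.reduce((a, b) =>
      conjunctive[field] ? intersect(a, b) : union(a, b)
    );
    result = intersect(result, combined);
  }
  return result;
}

// Made-up sample data, only for demonstration.
const items = [
  { name: "A", language: ["Ruby", "Python"], price: "Free" },
  { name: "B", language: ["Ruby"], price: "Paid" },
  { name: "C", language: ["Python"], price: "Free" },
];
const index = buildIndex(items, ["language", "price"]);

console.log([...filterIds(index, { language: ["Ruby", "Python"] }, { language: true }, items.length)]);
console.log([...filterIds(index, { language: ["Ruby", "Python"], price: ["Free"] }, { language: false, price: false }, items.length)]);
```

With the conjunctive policy only item A matches both languages; with the disjunctive policy plus the price filter, items A and C remain.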

from itemsapi.

cigolpl commented on September 21, 2024

There is now new documentation for ItemsAPI: https://docs.itemsapi.com/. It's an early version and it will be continuously improved over time. I am closing this thread for now.

noraj commented on September 21, 2024

@cigolpl Thanks for your quick answer.

ItemsAPI 2.0 is a completely new version that we have been working on since March 2020

Can you tell me the main differences from version 1.x? Does 2.0 totally deprecate 1.x?

Maybe the easiest is that I explain my end goal.

I'm the developer of this website, Rawsec's CyberSecurity Inventory. It's 100% static.
The inventory references a lot of tools and resources, and all the data is structured into several JSON files. There are many fields: name, website, source, description, language, price, etc.
I wanted to create a main search page/site that would let users search efficiently through all the data (searching through at least title/name and description, but also through keywords/tags in the future), while also letting them use some filters/dorks like language:ruby or price:free.
Ultimately I wanted to use a 100% static search engine (no server), so either a search done in memory (JS), or a big index/archive built at build time plus a smaller JS doing the search using the pre-built big index.
I compiled a rather long list of potential solutions, but some well-known solutions are complex to use and set up, while other, less-known ones are undocumented and/or unmaintained.
In the end I resigned myself to looking for a dynamic search engine, and the only promising solution I found that was not overwhelming and complex to set up was ItemsAPI. But since my first comment I figured out that you have also made https://github.com/itemsapi/itemsjs, which seems to be the 100% static in-browser solution, so exactly what I'm looking for.
However, itemsjs seems more like a PoC since it's limited to 1000 elements, hasn't been updated for nearly a year, and some points in the issues still seem to be pending.
So do you plan to update ItemsJS for a 2.0 like it's planned for ItemsAPI?
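
To make the language:ruby / price:free idea concrete, here is a small JavaScript sketch of how such a query string could be split into free text and filters before being handed to a search engine. The parseQuery function and the list of known fields are hypothetical and not part of ItemsJS or ItemsAPI.

```javascript
// Hypothetical helper: separates "field:value" tokens from free-text terms.
function parseQuery(input, knownFields) {
  const filters = {};
  const terms = [];
  for (const token of input.trim().split(/\s+/).filter(Boolean)) {
    const sep = token.indexOf(":");
    const field = sep > 0 ? token.slice(0, sep).toLowerCase() : null;
    const value = sep > 0 ? token.slice(sep + 1) : "";
    if (field && value && knownFields.includes(field)) {
      // Known field: collect the value as a filter.
      if (!filters[field]) filters[field] = [];
      filters[field].push(value);
    } else {
      // Anything else (including tokens like URLs) stays a full text term.
      terms.push(token);
    }
  }
  return { query: terms.join(" "), filters };
}

const parsed = parseQuery("port scanner language:ruby price:free", ["language", "price"]);
console.log(parsed);
```

Here the free text would be "port scanner" and the filters would hold ruby for language and free for price. Matching filter values case-insensitively against the data is left out of this sketch.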

cigolpl commented on September 21, 2024

@noraj, version 1.x is a Node.js API layer over Elasticsearch. 1.x and 2.x have a very similar search and indexing API. The engine in ItemsAPI 2.x is completely new, based on our experience and problems with Elasticsearch.

Most of the potential solutions you listed seem good for full text search or for searches like language:ruby or price:free. They are missing faceted search though. ItemsJS has both full text and faceted search.

There is a plan to make ItemsJS 2.x, which will cover feedback from the issues, upgrade its search engine to Lunr 2.0, and improve the faceted search algorithms.

If you don't have a very sophisticated use case, ItemsJS 1.x should still work for you. It has been used by Amazon (https://aws-startup-response.github.io/) as a static website, and by other companies and people. It can work with even 5000 elements, but that depends on how many filters, and how many elements per filter, you have: the more there are, the slower it gets. Unfortunately there is no option to precompute the index at this stage. That could also come in ItemsJS 2.x.

If you prepare all your data like this:

[{
  name: "Dart",
  // used for full text search
  description: "DART is a test documentation tool created by the..",
  // can be a string or an array of strings, depending on whether several licenses are possible
  license: ["Apache 2.0"],
  language: ["Python"],
  price: "Free",
  online: "Yes",
  category: ["Collaboration and Report"],
  tags: ["documentation", "reporting", "test"]
}, {}, {}]

then you could test it out by modifying this code: https://jsfiddle.net/cigol/0ef9qeos/5/

noraj commented on September 21, 2024

Most of the potential solutions you listed seem good for full text search or for searches like language:ruby or price:free. They are missing faceted search though. ItemsJS has both full text and faceted search.

Yeah, ItemsJS was definitely my favorite one, I'll try it out for sure.

There is a plan to make ItemsJS 2.x, which will cover feedback from the issues, upgrade its search engine to Lunr 2.0, and improve the faceted search algorithms.

I subscribed to release notifications, so when 2.0 is out I'll upgrade to it, but I'll try 1.0 in the meantime.

then you could test it out by modifying this code: https://jsfiddle.net/cigol/0ef9qeos/5/

I'll try that. Thanks a lot again.

Update: Is it possible to use nested JSON like this: https://pastebin.com/Z1eAxvTb, or is flat JSON mandatory (so I'll need to prepare the data even more)? The elements from ctf_platforms, operating_systems, resources and tools have different structures (even inside each of these categories); is it a problem if a field is missing from an element? Otherwise, if I need to flatten the data, I would have to add a category and a sub-category field.

cigolpl commented on September 21, 2024

Update: Is it possible to use nested JSON like this: https://pastebin.com/Z1eAxvTb, or is flat JSON mandatory (so I'll need to prepare the data even more)? The elements from ctf_platforms, operating_systems, resources and tools have different structures (even inside each of these categories); is it a problem if a field is missing from an element? Otherwise, if I need to flatten the data, I would have to add a category and a sub-category field.

Only flat JSON is supported right now. Also, at some point we are considering adding multi-level facets / filters like here: https://pypi.org/search/
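
A flattening step like the one discussed could look roughly like this in JavaScript. The nested shape used here (category, then sub-category, then an array of items) is an assumption for illustration and may not match the pastebin data; flattenInventory and the sub_category field name are hypothetical.

```javascript
// Hypothetical sketch: assumes nested data shaped as
// { category: { subCategory: [item, item, ...] } }.
function flattenInventory(nested) {
  const flat = [];
  for (const [category, subCategories] of Object.entries(nested)) {
    for (const [subCategory, entries] of Object.entries(subCategories)) {
      for (const entry of entries) {
        // Each item keeps whatever fields it already had and gains two new ones.
        flat.push({ ...entry, category, sub_category: subCategory });
      }
    }
  }
  return flat;
}

// Made-up sample data, only for demonstration.
const nestedInventory = {
  tools: {
    osint: [{ name: "ExampleTool", language: "Ruby", price: "Free" }],
  },
  resources: {
    blogs: [{ name: "ExampleBlog" }],
  },
};

const flatInventory = flattenInventory(nestedInventory);
console.log(flatInventory);
```

Items that lack a field such as language simply come out without it; this sketch does not fill in defaults.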

noraj commented on September 21, 2024

@cigolpl I'm curious to know which tool you used to generate the docs, and where the sources are.

cigolpl commented on September 21, 2024

@noraj the docs are generated with https://vuepress.vuejs.org/. The source is only stored locally for now. If you'd like to help with the docs, I'll publish the code on GitHub.
