typesense / firestore-typesense-search

Firebase Extension to automatically push Firestore documents to Typesense for full-text search with typo tolerance, faceting, and more

Home Page: https://extensions.dev/extensions/typesense/firestore-typesense-search

License: Apache License 2.0

JavaScript 100.00%
firebase firebase-extensions firebase-firestore firebase-search firestore typesense

firestore-typesense-search's Introduction

Typesense

Typesense is a fast, typo-tolerant search engine for building delightful search experiences.

An Open Source Algolia Alternative &
An Easier-to-Use ElasticSearch Alternative


Website | Documentation | Roadmap | Slack Community | Community Threads | Twitter


Typesense Demo

✨ Here are a couple of live demos that show Typesense in action on large datasets:

🗣️ 🎥 If you prefer watching videos:

Quick Links

Features

  • Typo Tolerance: Handles typographical errors elegantly, out-of-the-box.
  • Simple and Delightful: Simple to set up, integrate with, operate and scale.
  • ⚡ Blazing Fast: Built in C++. Meticulously architected from the ground-up for low-latency (<50ms) instant searches.
  • Tunable Ranking: Easy to tailor your search results to perfection.
  • Sorting: Dynamically sort results based on a particular field at query time (helpful for features like "Sort by Price (asc)").
  • Faceting & Filtering: Drill down and refine results.
  • Grouping & Distinct: Group similar results together to show more variety.
  • Federated Search: Search across multiple collections (indices) in a single HTTP request.
  • Geo Search: Search and sort by results around a latitude/longitude or within a bounding box.
  • Vector Search: Index embeddings from your machine learning models in Typesense and do a nearest-neighbor search. Can be used to build similarity search, semantic search, visual search, recommendations, etc.
  • Semantic / Hybrid Search: Automatically generate embeddings from within Typesense using built-in models like S-BERT, E-5, etc or use OpenAI, PaLM API, etc, for both queries and indexed data. This allows you to send JSON data into Typesense and build an out-of-the-box semantic search + keyword search experience.
  • Conversational Search (Built-in RAG): Send questions to Typesense and have the response be a fully-formed sentence, based on the data you've indexed in Typesense. Think ChatGPT, but over your own data.
  • Image Search: Search through images using text descriptions of their contents, or perform similarity searches, using the CLIP model.
  • Voice Search: Capture and send query via voice recordings - Typesense will transcribe (via Whisper model) and provide search results.
  • Scoped API Keys: Generate API keys that only allow access to certain records, for multi-tenant applications.
  • JOINs: Connect one or more collections via common reference fields and join them during query time. This allows you to model SQL-like relationships elegantly.
  • Synonyms: Define words as equivalents of each other, so searching for a word will also return results for the synonyms defined.
  • Curation & Merchandizing: Boost particular records to a fixed position in the search results, to feature them.
  • Raft-based Clustering: Set up a distributed cluster that is highly available.
  • Seamless Version Upgrades: As new versions of Typesense come out, upgrading is as simple as swapping out the binary and restarting Typesense.
  • No Runtime Dependencies: Typesense is a single binary that you can run locally or in production with a single command.

Don't see a feature on this list? Search our issue tracker if someone has already requested it and add a comment to it explaining your use-case, or open a new issue if not. We prioritize our roadmap based on user feedback, so we'd love to hear from you.

Roadmap

Here's Typesense's public roadmap: https://github.com/orgs/typesense/projects/1.

The first column also explains how we prioritize features, how you can influence prioritization and our release cadence.

Benchmarks

  • A dataset containing 2.2 Million recipes (recipe names and ingredients):
    • Took up about 900MB of RAM when indexed in Typesense
    • Took 3.6mins to index all 2.2M records
    • On a server with 4vCPUs, Typesense was able to handle 104 concurrent search queries per second, with an average search processing time of 11ms.
  • A dataset containing 28 Million books (book titles, authors and categories):
    • Took up about 14GB of RAM when indexed in Typesense
    • Took 78mins to index all 28M records
    • On a server with 4vCPUs, Typesense was able to handle 46 concurrent search queries per second, with an average search processing time of 28ms.
  • With a dataset containing 3 Million products (Amazon product data), Typesense was able to handle a throughput of 250 concurrent search queries per second on an 8-vCPU 3-node Highly Available Typesense cluster.
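To put the indexing figures above on a common scale, here is a small worked calculation that converts them to records per second. It uses only the record counts and durations quoted in the list; the function name is just for illustration.

```python
# Indexing figures quoted in the benchmark list: (records, minutes to index).
benchmarks = {
    "recipes": (2_200_000, 3.6),
    "books": (28_000_000, 78),
}


def records_per_second(records: int, minutes: float) -> float:
    """Average indexing rate implied by a record count and a duration."""
    return records / (minutes * 60)


for name, (records, minutes) in benchmarks.items():
    rate = records_per_second(records, minutes)
    print(f"{name}: ~{rate:,.0f} records/s")
```

This works out to roughly 10,000 records per second for the recipes dataset and roughly 6,000 for the books dataset. The two datasets have different fields, so the rates are not directly comparable.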

We'd love to benchmark with larger datasets, if we can find large ones in the public domain. If you have any suggestions for structured datasets that are open, please let us know by opening an issue. We'd also be delighted if you're able to share benchmarks from your own large datasets. Please send us a PR!

Who's using this?

Typesense is used by a range of users across different domains and verticals.

On Typesense Cloud we serve more than 3 BILLION searches per month. Typesense's Docker images have been downloaded over 12M times.

We've recently started documenting who's using it in our Showcase. If you'd like to be included in the list, please feel free to edit SHOWCASE.md and send us a PR.

You'll also see a list of user logos on the Typesense Cloud home page.

Install

Option 1: You can download the binary packages that we publish for Linux (x86_64 & arm64) and Mac (x86_64).

Option 2: You can also run Typesense from our official Docker image.

Option 3: Spin up a managed cluster with Typesense Cloud:

Deploy with Typesense Cloud

Quick Start

Here's a quick example showcasing how you can create a collection, index a document and search it on Typesense.

Let's begin by starting the Typesense server via Docker:

docker run -p 8108:8108 -v/tmp/data:/data typesense/typesense:26.0 --data-dir /data --api-key=Hu52dwsas2AdxdE

We have API Clients in a couple of languages, but let's use the Python client for this example.

Install the Python client for Typesense:

pip install typesense

We can now initialize the client and create a companies collection:

import typesense

client = typesense.Client({
  'api_key': 'Hu52dwsas2AdxdE',
  'nodes': [{
    'host': 'localhost',
    'port': '8108',
    'protocol': 'http'
  }],
  'connection_timeout_seconds': 2
})

create_response = client.collections.create({
  "name": "companies",
  "fields": [
    {"name": "company_name", "type": "string" },
    {"name": "num_employees", "type": "int32" },
    {"name": "country", "type": "string", "facet": True }
  ],
  "default_sorting_field": "num_employees"
})

Now, let's add a document to the collection we just created:

document = {
 "id": "124",
 "company_name": "Stark Industries",
 "num_employees": 5215,
 "country": "USA"
}

client.collections['companies'].documents.create(document)

Finally, let's search for the document we just indexed:

search_parameters = {
  'q'         : 'stork',
  'query_by'  : 'company_name',
  'filter_by' : 'num_employees:>100',
  'sort_by'   : 'num_employees:desc'
}

client.collections['companies'].documents.search(search_parameters)

Did you notice the typo in the query text? No big deal. Typesense handles typographic errors out-of-the-box!
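To make the typo concrete: "stork" and "stark" differ by a single character substitution. The sketch below computes that edit distance with a textbook Levenshtein implementation. It is only an illustration of the concept, not Typesense's internal matching algorithm.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


print(edit_distance("stork", "stark"))  # 1
```

How many typos a query tolerates is configurable at search time; see the `num_typos` search parameter in the API documentation for the exact behaviour.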

Step-by-step Walk-through

A step-by-step walk-through is available on our website here.

This will guide you through the process of starting up a Typesense server, indexing data in it and querying the data set.

API Documentation

Here's our official API documentation, available on our website: https://typesense.org/api.

If you notice any issues with the documentation or walk-through, please let us know or send us a PR here: https://github.com/typesense/typesense-website.

API Clients

While you can definitely use CURL to interact with Typesense Server directly, we offer official API clients to simplify using Typesense from your language of choice. The API Clients come built-in with a smart retry strategy to ensure that API calls made via them are resilient, especially in an HA setup.

If we don't offer an API client in your language, you can still use any popular HTTP client library to access Typesense's APIs directly.

Here are some community-contributed clients and integrations:

We welcome community contributions to add more official client libraries and integrations. Please reach out to us at [email protected] or open an issue on GitHub to collaborate with us on the architecture. 🙏

Framework Integrations

We also have the following framework integrations:

Postman Collection

We have a community-maintained Postman Collection here: https://github.com/typesense/postman.

Postman is an app that lets you perform HTTP requests by pointing and clicking, instead of having to type them out in the terminal. The Postman Collection above gives you template requests that you can import into Postman, to quickly make API calls to Typesense.

Search UI Components

You can use our InstantSearch.js adapter to quickly build powerful search experiences, complete with filtering, sorting, pagination and more.

Here's how: https://typesense.org/docs/guide/search-ui-components.html

FAQ

How does this differ from Elasticsearch?

Elasticsearch is a large piece of software that takes a non-trivial amount of effort to set up, administer, scale and fine-tune. It offers you a few thousand configuration parameters to get to your ideal configuration. So it's better suited for large teams who have the bandwidth to get it production-ready, regularly monitor it and scale it, especially when they have a need to store billions of documents and petabytes of data (eg: logs).

Typesense is built specifically for decreasing the "time to market" for a delightful search experience. It's a light-weight yet powerful & scalable alternative that focuses on Developer Happiness and Experience with a clean well-documented API, clear semantics and smart defaults so it just works well out-of-the-box, without you having to turn many knobs.

Elasticsearch also runs on the JVM, which by itself can be quite an effort to tune to run optimally. Typesense, on the other hand, is a single light-weight self-contained native binary, so it's simple to set up and operate.

See a side-by-side feature comparison here.

How does this differ from Algolia?

Algolia is a proprietary, hosted, search-as-a-service product that works well when cost is not an issue. In our experience, fast-growing sites and apps quickly run into search & indexing limits, accompanied by expensive plan upgrades as they scale.

Typesense on the other hand is an open-source product that you can run on your own infrastructure or use our managed SaaS offering - Typesense Cloud. The open source version is free to use (besides of course your own infra costs). With Typesense Cloud we don't charge by records or search operations. Instead, you get a dedicated cluster and you can throw as much data and traffic at it as it can handle. You only pay a fixed hourly cost & bandwidth charges for it, depending on the configuration you choose, similar to most modern cloud platforms.

From a product perspective, Typesense is closer in spirit to Algolia than Elasticsearch. However, we've addressed some important limitations with Algolia:

Algolia requires separate indices for each sort order, which counts towards your plan limits. Most of the index settings like fields to search, fields to facet, fields to group by, ranking settings, etc are defined upfront when the index is created vs being able to set them on the fly at query time.

With Typesense, these settings can be configured at search time via query parameters which makes it very flexible and unlocks new use cases. Typesense is also able to give you sorted results with a single index, vs having to create multiple. This helps reduce memory consumption.

Algolia offers the following features that Typesense does not have currently: personalization & server-based search analytics. For analytics, you can still instrument your search on the client-side and send search metrics to your web analytics tool of choice.

We intend to bridge this gap in Typesense, but in the meantime, please let us know if any of these are a show stopper for your use case by creating a feature request in our issue tracker.

See a side-by-side feature comparison here.

Speed is great, but what about the memory footprint?

A fresh Typesense server will consume about 30 MB of memory. As you start indexing documents, the memory use will increase correspondingly. How much it increases depends on the number and type of fields you index.

We've strived to keep the in-memory data structures lean. To give you a rough idea: when 1 million Hacker News titles are indexed along with their points, Typesense consumes 165 MB of memory. The same size of that data on disk in JSON format is 88 MB. If you have any numbers from your own datasets that we can add to this section, please send us a PR!
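As a back-of-the-envelope reading of the Hacker News figures above, the sketch below derives the per-document memory cost and the ratio of RAM to on-disk JSON size. It treats 1 MB as 10^6 bytes and the linear extrapolation helper is a naive assumption: real usage depends on the number and type of fields you index.

```python
# Figures quoted above for 1 million Hacker News titles (with points).
documents = 1_000_000
ram_mb = 165
json_on_disk_mb = 88

bytes_per_document = ram_mb * 1_000_000 / documents  # treating 1 MB as 10^6 bytes
ram_to_json_ratio = ram_mb / json_on_disk_mb


def naive_ram_estimate_mb(json_size_mb: float) -> float:
    """Scale the observed RAM-to-JSON ratio linearly (a rough assumption)."""
    return json_size_mb * ram_to_json_ratio


print(f"~{bytes_per_document:.0f} bytes of RAM per document")
print(f"RAM is ~{ram_to_json_ratio:.2f}x the JSON size on disk")
print(f"A 500 MB JSON dataset might need ~{naive_ram_estimate_mb(500):.0f} MB of RAM")
```

Treat the result as an order-of-magnitude guide for a similar dataset, not a sizing guarantee.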

Why the GPL license?

From our experience companies are generally concerned when libraries they use are GPL licensed, since library code is directly integrated into their code and will lead to derivative work and trigger GPL compliance. However, Typesense Server is server software and we expect users to typically run it as a separate daemon, and not integrate it with their own code. GPL covers and allows for this use case generously (eg: Linux is GPL licensed). Now, AGPL is what makes server software accessed over a network result in derivative work and not GPL. And for that reason we’ve opted to not use AGPL for Typesense.

Now, if someone makes modifications to Typesense server, GPL actually allows you to still keep the modifications to yourself as long as you don't distribute the modified code. So a company can for example modify Typesense server and run the modified code internally and still not have to open source their modifications, as long as they make the modified code available to everyone who has access to the modified software.

Now, if someone makes modifications to Typesense server and distributes the modifications, that's where GPL kicks in. Given that we’ve published our work to the community, we'd like for others' modifications to also be made open to the community in the spirit of open source. We use GPL for this purpose. Other licenses would allow our open source work to be modified, made closed source and distributed, which we want to avoid with Typesense for the project’s long term sustainability.

Here's more background on why GPL, as described by Discourse: https://meta.discourse.org/t/why-gnu-license/2531. Many of the points mentioned there resonate with us.

Now, all of the above only apply to Typesense Server. Our client libraries are indeed meant to be integrated into our users’ code and so they use Apache license.

So in summary, AGPL is what is usually problematic for server software and we’ve opted not to use it. We believe GPL for Typesense Server captures the essence of what we want for this open source project. GPL has a long history of successfully being used by popular open source projects. Our libraries are still Apache licensed.

If you have specifics that prevent you from using Typesense due to a licensing issue, we're happy to explore this topic further with you. Please reach out to us.

Support

👋 🌐 If you have general questions about Typesense, want to say hello or just follow along, we'd like to invite you to join our public Slack Community.

If you run into any problems or issues, please create a GitHub issue and we'll try our best to help.

We strive to provide good support through our issue trackers on GitHub. However, if you'd like to receive private & prioritized support with:

  • Guaranteed SLAs
  • Phone / video calls to discuss your specific use case and get recommendations on best practices
  • Private discussions over Slack
  • Guidance around scaling best practices
  • Prioritized feature requests

We offer Paid Support options described here.

Contributing

We are a lean team on a mission to democratize search and we'll take all the help we can get! If you'd like to get involved, here's information on where we could use your help: Contributing.md

Getting Latest Updates

If you'd like to get updates when we release new versions, click on the "Watch" button on the top and select "Releases only". GitHub will then send you notifications along with a changelog with each new release.

We also post updates to our Twitter account about releases and additional topics related to Typesense. Follow us here: @typesense.

👋 🌐 We'll also post updates on our Slack Community.

Build from source

We use Bazel to build Typesense.

Typesense requires the following dependencies:

  • C++11 compatible compiler (GCC >= 4.9.0, Apple Clang >= 8.0, Clang >= 3.9.0)
  • Snappy
  • zlib
  • OpenSSL (>=1.0.2)
  • curl
  • ICU

Please refer to the CI build steps for the latest set of dependencies.

Once you've installed them, run the following from the root of the repo:

bazel build //:typesense-server

The first build will take some time since other third-party libraries are pulled and built as part of the build process.


© 2016-present Typesense Inc.

firestore-typesense-search's People

Contributors

dackers86, jasonbosco, jperasmus, keisukeyamashita, kylewill, orionreed, root-dir, yamankatby


firestore-typesense-search's Issues

Collections with ids that contain spaces cause errors in the emulated environment.

Description

One of my top level collections that has this extension configured contains a space in its name: "zip codes". Although this appears to work fine in a production environment, it throws errors in the emulated dev environment. From firebase-debug.log and the thrown exception, it appears that the space leads to parsing errors. Here's the relevant excerpt of the logs during emulator initialization.

[debug] [2023-03-27T21:44:49.727Z] addFirestoreTrigger "{\"eventTrigger\":{\"eventType\":\"providers/cloud.firestore/eventTypes/document.write\",\"resource\":\"projects/millies-catering/databases/(default)/documents/zip codes/{documentID}\",\"service\":\"firestore.googleapis.com\"}}"

Any write to this collection throws an error:

[debug] [2023-03-27T22:00:37.772Z] Mar 27, 2023 6:00:37 PM com.google.cloud.datastore.emulator.impl.util.WrappedStreamObserver onError
INFO: operation failed: Invalid pattern. Reason: [59: codes/{documentID}] Unrecognized character(s) at end of path.
java.lang.IllegalArgumentException: Invalid pattern. Reason: [59: codes/{documentID}] Unrecognized character(s) at end of path.
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:151)
	at com.google.firebase.rules.eventflow.client.path.PathPattern.compileInternal(PathPattern.java:79)
	at com.google.firebase.rules.eventflow.client.path.PathPattern.compile(PathPattern.java:51)
	at com.google.firebase.rules.eventflow.client.EventRuleEvaluatorImpl.lambda$new$0(EventRuleEvaluatorImpl.java:56)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:446)
	at com.google.firebase.rules.eventflow.client.EventRuleEvaluatorImpl.<init>(EventRuleEvaluatorImpl.java:50)
	at com.google.firebase.rules.eventflow.client.EventRuleEvaluatorImpl.<init>(EventRuleEvaluatorImpl.java:37)
	at com.google.cloud.datastore.emulator.impl.events.EventManager.reportEvents(EventManager.java:84)
	at com.google.cloud.datastore.emulator.impl.CloudFirestoreV1.commitHelper(CloudFirestoreV1.java:1014)
	at com.google.cloud.datastore.emulator.impl.CloudFirestoreV1.internalCommit(CloudFirestoreV1.java:902)
	at com.google.cloud.datastore.emulator.impl.CloudFirestoreV1.commit(CloudFirestoreV1.java:450)
	at com.google.cloud.datastore.emulator.impl.CloudFirestoreV1Router.commit(CloudFirestoreV1Router.java:152)
	at com.google.cloud.datastore.emulator.firestore.v1.FirestoreV1GrpcAdapter$1.lambda$commit$8(FirestoreV1GrpcAdapter.java:160)
	at com.google.cloud.datastore.emulator.firestore.v1.FirestoreV1GrpcAdapter.unary(FirestoreV1GrpcAdapter.java:70)
	at com.google.cloud.datastore.emulator.firestore.v1.FirestoreV1GrpcAdapter$1.commit(FirestoreV1GrpcAdapter.java:160)
	at com.google.firestore.v1.FirestoreGrpc$MethodHandlers.invoke(FirestoreGrpc.java:1286)
	at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
	at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
	at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
	at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
	at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
	at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
	at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
	at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
	at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:355)
	at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:867)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1589)

Steps to reproduce

Create a collection whose name contains a space. (e.g. zip codes)

Expected Behavior

All valid collection names should be allowed, as per:
https://firebase.google.com/docs/firestore/quotas#limits

Actual Behavior

Spaces result in unexpected segmentation within some parser, preventing their use.
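The offset in the error message can be checked against the trigger resource string from the log. The sketch below finds the first whitespace character in that string; it lands at offset 59, which matches the `[59: codes/{documentID}]` fragment in the exception. This only illustrates where the emulator's parser appears to stop and does not explain why production behaves differently.

```python
# Resource string copied from the addFirestoreTrigger log line above.
resource = (
    "projects/millies-catering/databases/(default)/documents/"
    "zip codes/{documentID}"
)


def first_whitespace_offset(path: str) -> int:
    """Return the 0-based index of the first whitespace character, or -1."""
    for index, char in enumerate(path):
        if char.isspace():
            return index
    return -1


offset = first_whitespace_offset(resource)
print(offset)                    # 59
print(repr(resource[offset:]))   # ' codes/{documentID}'
```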

Metadata

Typesense version: 0.24.0
Extension version: typesense/[email protected]
OS: Windows 11

Firebase cloud function error

Description

When creating or upserting a document, or running a backfill operation, the cloud function fails with the error "EAI_AGAIN getaddrinfo EAI_AGAIN https". The same issue occurs with both cloud functions (ext-firestore-typesense-search-backfillToTypesenseFromFirestore, ext-firestore-typesense-search-indexToTypesenseOnFirestoreWrite).
The following are the logs from the cloud function:

  • Function execution started
  • Creating document {"id":"UMA-NICO-3"}
  • Request #1629896880473: Request to Node 0 failed due to "EAI_AGAIN getaddrinfo EAI_AGAIN https"
  • Request #1629896880473: Sleeping for 120s and then retrying request...
  • Function execution took 120003 ms, finished with status: 'timeout'

I tried running Typesense on a virtual machine and on Typesense Cloud; both cases produce the same errors.
Creating a collection and a document from a JS client against Typesense Cloud worked fine.
The Firebase plan is Blaze.

Steps to reproduce

Project created in Firebase, typesense cloud created, cloud function triggered by a new document (or update).

Expected Behavior

Documents created and updated on typesense cloud.

Actual Behavior

No documents or collections are synced from Firebase to Typesense Cloud.

Metadata

Typesense Version: v0.21.0

OS: Linux Ubuntu (used on the VM)
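One possible reading of "getaddrinfo EAI_AGAIN https" is that the DNS lookup was attempted for the literal name "https", which can happen when the configured host value includes a protocol prefix and a URL is then built around it. This is an assumption, not a confirmed diagnosis of this report. The sketch below shows how such a value gets misparsed and how a bare hostname can be extracted. The hostname `xyz.a1.typesense.net` and the helper `normalize_host` are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_host(value: str) -> str:
    """Return a bare hostname from a value that may carry a scheme, port or path."""
    value = value.strip()
    # urlsplit only recognises the host part when a "//" prefix is present.
    parts = urlsplit(value if "://" in value else f"//{value}")
    return parts.hostname or ""


configured = "https://xyz.a1.typesense.net"  # hypothetical misconfigured host value

# Building a URL around a value that already contains a scheme:
naive_url = f"https://{configured}:443/collections"
print(urlsplit(naive_url).hostname)  # the "host" a resolver would be asked for

print(normalize_host(configured))
```

If the host setting in your installation contains `https://` or a port, trying the bare hostname is a cheap first check.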

Resource_Error

Hi Typesense Team

I get the following message when I try to install this extension with the console and the CLI:

RESOURCE_ERROR at /deployments/firebase-ext-firestore-typesense-search/resources/backfillToTypesenseFromFirestore: {"ResourceType":"gcp-types/cloudfunctions-v1:projects.locations.functions","ResourceErrorCode":"400","ResourceErrorMessage":"Build failed: {"metrics":{},"error":{"buildpackId":"","buildpackVersion":"","errorType":"OK","canonicalCode":"OK","errorId":"","errorMessage":""},"stats":[{"buildpackId":"google.nodejs.runtime","buildpackVersion":"1.0.0","totalDurationMs":1496,"userDurationMs":0},{"buildpackId":"google.utils.archive-source","buildpackVersion":"0.0.1","totalDurationMs":45,"userDurationMs":44},{"buildpackId":"google.nodejs.npm","buildpackVersion":"1.0.0","totalDurationMs":6428,"userDurationMs":6185},{"buildpackId":"google.nodejs.functions-framework","buildpackVersion":"0.9.4","totalDurationMs":3756,"userDurationMs":3541},{"buildpackId":"google.config.entrypoint","buildpackVersion":"0.9.0","totalDurationMs":0,"userDurationMs":0},{"buildpackId":"google.utils.label","buildpackVersion":"0.0.2","totalDurationMs":0,"userDurationMs":0}],"warnings":null,"customImage":false}"}; RESOURCE_ERROR at /deployments/firebase-ext-firestore-typesense-search/resources/indexToTypesenseOnFirestoreWrite: {"ResourceType":"gcp-types/cloudfunctions-v1:projects.locations.functions","ResourceErrorCode":"400","ResourceErrorMessage":"Build failed: 
{"metrics":{},"error":{"buildpackId":"","buildpackVersion":"","errorType":"OK","canonicalCode":"OK","errorId":"","errorMessage":""},"stats":[{"buildpackId":"google.nodejs.runtime","buildpackVersion":"1.0.0","totalDurationMs":1485,"userDurationMs":0},{"buildpackId":"google.utils.archive-source","buildpackVersion":"0.0.1","totalDurationMs":59,"userDurationMs":59},{"buildpackId":"google.nodejs.npm","buildpackVersion":"1.0.0","totalDurationMs":6677,"userDurationMs":6425},{"buildpackId":"google.nodejs.functions-framework","buildpackVersion":"0.9.4","totalDurationMs":3689,"userDurationMs":3477},{"buildpackId":"google.config.entrypoint","buildpackVersion":"0.9.0","totalDurationMs":0,"userDurationMs":0},{"buildpackId":"google.utils.label","buildpackVersion":"0.0.2","totalDurationMs":0,"userDurationMs":0}],"warnings":null,"customImage":false}"}

When I look at the cloud functions, there is a warning that the functions could not be built. Log message from the function:

Build failed: {"metrics":{},"error":{"buildpackId":"","buildpackVersion":"","errorType":"OK","canonicalCode":"OK","errorId":"","errorMessage":""},"stats":[{"buildpackId":"google.nodejs.runtime","buildpackVersion":"1.0.0","totalDurationMs":1437,"userDurationMs":0},{"buildpackId":"google.utils.archive-source","buildpackVersion":"0.0.1","totalDurationMs":42,"userDurationMs":42},{"buildpackId":"google.nodejs.npm","buildpackVersion":"1.0.0","totalDurationMs":6883,"userDurationMs":6640},{"buildpackId":"google.nodejs.functions-framework","buildpackVersion":"0.9.4","totalDurationMs":3529,"userDurationMs":3323},{"buildpackId":"google.config.entrypoint","buildpackVersion":"0.9.0","totalDurationMs":0,"userDurationMs":0},{"buildpackId":"google.utils.label","buildpackVersion":"0.0.2","totalDurationMs":0,"userDurationMs":0}],"warnings":null,"customImage":false}

Thanks
Michael

Firebase extension host

Description

I have set up the Typesense extension in my Firebase project, but when trying to add a document, I get the following error:

Error detected in ext-firestore-typesense-search-indexToTypesenseOnFirestoreWrite {"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.Insight","errorGroup":"CNTlnprCi-zYJw","exceptionInfo":{"type":"Error","message":"timeout of 10000ms exceeded"},"errorEvent":{"eventTime":"2021-07-08T20:47:32.503Z","message":"Error: timeout of 10000ms exceeded\n at createError (/workspace/node_modules/axios/lib/core/createError.js:16:15)\n at RedirectableRequest.handleRequestTimeout (/workspace/node_modules/axios/lib/adapters/http.js:280:16)\n at RedirectableRequest.emit (events.js:376:20)\n at RedirectableRequest.emit (domain.js:532:15)\n at Timeout._onTimeout (/workspace/node_modules/follow-redirects/index.js:166:12)\n at listOnTimeout (internal/timers.js:555:17)\n at processTimers (internal/timers.js:498:7)","serviceContext":{"resourceType":"cloud_function","service":"ext-firestore-typesense-search-indexToTypesenseOnFirestoreWrite"}}}

I am hosting the server on my local machine for the time being, so perhaps the problem is stemming from the host name that I used when setting up the extension. I set up port forwarding in order to get a global IP address, and used that address XX.XX.XX:8108 as the host name. Sending curl requests via the terminal to that host name works, however.

Can't install extension via console

Please check the screenshot. I think this issue is linked to service account permissions, but I can't find which service account is associated with the extension.

Selection_008

Typesense Firebase Extension issue

With the Typesense extension in Firebase, I have just been prompted by the Firebase extension manager to update. However, the update seems to be glitched on the versioning. The version installed from the marketplace is 1.0.1 (which is also what shows as latest here: https://extensions.dev/extensions/typesense/firestore-typesense-search), yet the manager offers to "update" that version (pictured). Choosing to update suggests version 0.4.2, and after "updating" to 0.4.2 the extension shows as "up to date", so we are not sure whether it's a Firebase extension issue or a Typesense one.

Screenshot 2023-05-08 at 12 33 05 pm
Screenshot 2023-05-08 at 12 32 17 pm
Screenshot 2023-05-08 at 12 31 53 pm
Screenshot 2023-05-08 at 12 30 50 pm

Support backfill documents in subcollections

Description

This is a request to support backfilling all subcollections returned by a collection group query. This could be helpful in use cases like backfilling landmarks data from all cities under the path: cities/{cityId}/landmarks.
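A collection group query returns documents from every collection with the same ID, so a backfill along these lines would likely need to keep only documents whose path matches the configured pattern. The sketch below shows one way such a filter could look. It is a hypothetical illustration of the request, not how the extension works; `matches_pattern` and the sample paths are made up.

```python
def matches_pattern(doc_path: str, pattern: str) -> bool:
    """Check a document path against a collection pattern with {wildcard} segments."""
    doc_parts = doc_path.strip("/").split("/")
    pattern_parts = pattern.strip("/").split("/")
    # A document path has exactly one more segment (the document ID)
    # than the collection pattern it belongs to.
    if len(doc_parts) != len(pattern_parts) + 1:
        return False
    return all(
        (p.startswith("{") and p.endswith("}")) or p == d
        for p, d in zip(pattern_parts, doc_parts)
    )


pattern = "cities/{cityId}/landmarks"
paths = [
    "cities/SF/landmarks/golden-gate",
    "cities/TOK/landmarks/tower",
    "countries/JP/landmarks/fuji",  # same collection ID, different parent
]
print([p for p in paths if matches_pattern(p, pattern)])
```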

Return unflattened objects from search

Description

I'm trying to sync unflattened objects into my Typesense collection, however I get the following error:

importResults: [
    {
      code: 404,
      document: '{"title":"Title","address.postalCode":"10785","address.country":"Germany","address.addressLine1":"Neue Nationalgalerie","address.addressLine2":"Potsdamer Straße 50, 10785 Berlin, Germany","address.state":"","address.city":"Berlin"}',
      error: 'Field `address` not found.',
      success: false
    }]

Typesense config

{
  "name": "events",
  "fields": [
   {
      "name": "title",
      "type": "string"
    },
    {
      "name": "address",
      "type": "object"
    }
  ],
  "enable_nested_fields": true
}

If I change the type of the "address" to "auto", the sync works, but the "address" field is flattened:

address.addressLine1 Neue Nationalgalerie
address.addressLine2 Potsdamer Straße 50, 10785 Berlin, Germany
address.city Berlin
address.country Germany
address.postalCode 10785
address.state 

I would like to be able to search by all "address" properties, but have the search return it as an unflattened object. I don't mind it returning both types of results, but I don't want to have to parse flattened objects into JS objects. According to the docs, I can send both flattened and unflattened version of the object, but I haven't been able to achieve this.

To simplify traversing the data in the results, you might want to send both the flattened and unflattened version of the nested fields into Typesense, and only set the flattened keys as indexed in the collection's schema and use them for search/filtering/faceting. At display time when parsing the results, you can then use the nested version. (https://typesense.org/docs/0.24.0/api/collections.html#notes-on-indexing-common-types-of-data)
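A minimal sketch of the approach quoted above, assuming the document is transformed before it is sent to Typesense. The helper name `withFlattenedKeys` is hypothetical and is not part of the extension; it keeps each nested object intact and adds dot-notation keys for the leaf values next to it.

```javascript
// Hypothetical helper, not part of the extension: returns a copy of `doc`
// that keeps the nested objects and also adds flattened dot-notation keys.
function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

function withFlattenedKeys(doc) {
  const out = { ...doc };
  const visit = (value, prefix) => {
    for (const [key, child] of Object.entries(value)) {
      const path = prefix ? `${prefix}.${key}` : key;
      if (isPlainObject(child)) {
        visit(child, path); // descend into nested objects
      } else if (prefix) {
        out[path] = child; // add a flattened key for each nested leaf
      }
    }
  };
  visit(doc, "");
  return out;
}

console.log(
  withFlattenedKeys({
    title: "Title",
    address: { city: "Berlin", postalCode: "10785" },
  })
);
```

Arrays and non-plain objects (dates, references) are left untouched in this sketch; only plain nested objects are expanded.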

As a side note, I have been able to sync and search arrays of objects using this simple configuration:

    {
      "index": true,
      "name": "content",
      "optional": false,
      "type": "object[]"
    },

Expected Behavior

Return objects from search.

Actual Behavior

Objects in search results are flattened.

Metadata

Typesense Version: v0.24.0 (typesense-js v.1.5.2)

Out-of-order function triggers result in data inconsistency between Firestore -> Typesense

Description

As noted in the Firebase functions documentation, background function triggers are not guaranteed to be received in the order the changes occurred.

When rapid changes occur in the database, out-of-order events will result in out-of-order processing, meaning Typesense can end up with stale data in it which is not consistent with the final Firestore resting state after the influx of changes.

From all of the reading I've done, it is a fool's errand to try to fight the system into processing them in order. Nevertheless, without a solution, extensions like this are difficult to use in production environments.

Steps to reproduce

  1. Write a demo script / perform an app action rapidly that writes to the database
  2. Observe that the events are received out of order
  3. Quite often, this will result in Typesense containing the wrong data from an old event

Expected Behavior

Regardless of the order the events are received, only events newer than the last-processed event should be processed.
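The rule stated here can be sketched as a pure comparison. This is only an illustration: the function name `isNewerEvent` and the shape of the stored record (`eventId`, `timestampMs`) are assumptions, and the sketch does not address where that record is stored or how it is locked.

```javascript
// Hypothetical guard: decide whether an incoming event should be processed,
// given the last event that was recorded as processed for the same item.
// `incoming.timestamp` is an RFC 3339 string, as in the function context.
function isNewerEvent(lastProcessed, incoming) {
  if (!lastProcessed) return true; // nothing processed yet
  if (lastProcessed.eventId === incoming.eventId) return false; // duplicate delivery
  return Date.parse(incoming.timestamp) > lastProcessed.timestampMs;
}

const last = { eventId: "a", timestampMs: Date.parse("2022-06-12T10:00:01.000Z") };
console.log(isNewerEvent(last, { eventId: "b", timestamp: "2022-06-12T10:00:00.500Z" }));
console.log(isNewerEvent(last, { eventId: "c", timestamp: "2022-06-12T10:00:02.000Z" }));
```

Events with exactly the same timestamp as the last processed one are discarded in this sketch; whether that is acceptable depends on the timestamp resolution available.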

Actual Behavior

Inconsistent data in Typesense

Other Findings / Info

In trying to solve for this, I've attempted to use a Firestore collection to "lock" the item being updated, send to Typesense, and then unlock for future events after writing event context back to the lock endpoint.

Simplified example (RTDB trigger used here, but Firestore is affected by this as well):

// Assumes `functions` (firebase-functions), `admin` (firebase-admin, already
// initialized), `indexer`, and `updateTypesense` are defined elsewhere.
// The export name below is a placeholder.
const firestore = admin.firestore();

exports.syncTaskToTypesense = functions.database
  .ref(`tasks/{itemId}`)
  .onWrite(async (change, context) => {
    /**
     * We use a Firestore transaction to create a "lock" at a DB location
     * for a given `itemId`
     */
    const { timestamp, eventId } = context;
    const { itemId } = context.params;
    const timestampRef = firestore
      .collection(`typesenseLocks_tasks`)
      .doc(itemId);
    await admin.firestore().runTransaction(async transaction => {
      const dataChangedTimestamp = new Date(timestamp).getTime();
      const lastUpdatedTimestampDoc = await transaction.get(timestampRef);
      const lastUpdatedData = lastUpdatedTimestampDoc.data();

      /**
       * If this is the first time this document was changed (no previous locks),
       * or the last-stored lock timestamp is older than the current event's timestamp,
       * prepare a payload and send to Typesense.
       */
      if (
        (!lastUpdatedData?.timestamp ||
          dataChangedTimestamp > lastUpdatedData.timestamp) &&
        lastUpdatedData?.eventId !== eventId
      ) {
        // Send to Typesense
        await updateTypesense(change, indexer, itemId);

        // Finalize Transaction by writing the event timestamp and eventId back to Firestore for comparison by the next events
        transaction.set(timestampRef, {
          timestamp: dataChangedTimestamp,
          eventId
        });
      } else {
        /**
         * Do nothing, current event is older than last-indexed event already recorded, can be safely discarded
         */
      }
    });
  });

In an attempt to fix this, I've tried moving the await updateTypesense(change, indexer, itemId); line inside and outside of the transaction function block, but in both scenarios stale data ends up inside Typesense.

From logs I've added, the above transaction appears to work, old events are dropped if a newer one has already been processed. But if two events that are extremely close together are processed in the correct order, Typesense occasionally ends up with the first event's payload.

This led me to believe that perhaps Typesense was parallelizing indexing operations, but since they are confirmed as serialized, I'm back at square one.

Metadata

Typesense Version: v0.23
OS: Typesense Cloud
Relevant Slack Thread: https://typesense-community.slack.com/archives/C01P749MET0/p1655047418496499

port and protocol config

Would be nice to be able to configure the server port and protocol.
I have a typesense server on http and port 8108 but can't connect this extension to it, it defaults to https on port 443.

Backfill function attempts to rerun admin.initializeApp on rerun

I ran backfill, had to fix a schema issue (preventing backfill from making updates) and re-ran backfill soon after.

Firebase Functions logs shows this error:

Error: The default Firebase app already exists. This means you called initializeApp() more than once without providing an app name as the second argument. In most cases you only need to call initializeApp() once. But if you do want to initialize multiple apps, pass a second argument to initializeApp() to give each app a unique name.
    at FirebaseAppError.FirebaseError [as constructor] (/workspace/node_modules/firebase-admin/lib/utils/error.js:44:28)
    at FirebaseAppError.PrefixedFirebaseError [as constructor] (/workspace/node_modules/firebase-admin/lib/utils/error.js:90:28)
    at new FirebaseAppError (/workspace/node_modules/firebase-admin/lib/utils/error.js:125:28)
    at FirebaseNamespaceInternals.initializeApp (/workspace/node_modules/firebase-admin/lib/firebase-namespace.js:65:23)
    at FirebaseNamespace.initializeApp (/workspace/node_modules/firebase-admin/lib/firebase-namespace.js:371:30)
    at /workspace/src/backfillToTypesenseFromFirestore.js:29:13
    at cloudFunction (/workspace/node_modules/firebase-functions/lib/cloud-functions.js:134:23)
    at /layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/invoker.js:199:28
    at processTicksAndRejections (internal/process/task_queues.js:95:5) 

Firebase functions can be re-used so admin.initializeApp ends up being called multiple times.

Submitting a PR here: #5
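A common guard for this situation is to initialize the default app only when none exists yet. The sketch below is not necessarily what the PR does; `ensureInitialized` is a hypothetical helper that takes the `firebase-admin` module as an argument and relies on its `apps`, `initializeApp()` and `app()` members.

```javascript
// Hypothetical helper: safe to call on every invocation, including when the
// function instance is re-used and the default app already exists.
function ensureInitialized(admin) {
  if (admin.apps.length === 0) {
    admin.initializeApp();
  }
  return admin.app();
}

// Typical call site (assumes firebase-admin is installed):
//   const admin = require("firebase-admin");
//   ensureInitialized(admin);
```

Calling `initializeApp()` once at module load, outside the function handler, has the same effect.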

Port selection

Description

I've installed Typesense on my own server on port 8108. Specifying "IPADDRESS:8108" as the hostname does not cause the extension to look at port 8108 - it defaults to 443.

Looking at the source code, I see it reading from process.env.TYPESENSE_PORT but not really sure how to set that in an extension environment, short of recreating the functions myself. Thoughts?
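For illustration, a host setting such as "IPADDRESS:8108" could be split into the `host`, `port` and `protocol` values that a Typesense client node entry expects. This is a sketch only: `parseTypesenseNode` is a hypothetical helper, not the extension's code, and it does not handle IPv6 literals.

```javascript
// Hypothetical helper: parse "host", "host:port" or "http(s)://host[:port]"
// into a node entry. Falls back to https on 443 when nothing is specified.
function parseTypesenseNode(input, defaults = { protocol: "https", port: 443 }) {
  const match = /^(?:(https?):\/\/)?([^:\/\s]+)(?::(\d+))?\/?$/i.exec(input.trim());
  if (!match) {
    throw new Error(`Cannot parse Typesense host: ${input}`);
  }
  const [, scheme, host, port] = match;
  const protocol = scheme ? scheme.toLowerCase() : defaults.protocol;
  let resolvedPort = defaults.port;
  if (port) {
    resolvedPort = Number(port); // explicit port wins
  } else if (scheme) {
    resolvedPort = protocol === "http" ? 80 : 443; // scheme default
  }
  return { host, port: resolvedPort, protocol };
}

console.log(parseTypesenseNode("203.0.113.10:8108"));
console.log(parseTypesenseNode("http://203.0.113.10:8108"));
```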

Index a flattened field within an object

Description

Hi, I am trying index some fields which are within an object that I am supposed to flatten in order to index in Typesense. But looking at this,
https://github.com/typesense/firestore-typesense-search/blob/master/functions/src/utils.js#L16-L20 , I think even a flattened field would not match. Is there a workaround for fields within an object using firestore & typesense? Thanks in advance.

Steps to reproduce

Place a flattened field into Firestore Collection Fields.

Expected Behavior

Index a flattened field within an object into typesense.

Actual Behavior

Unable to index a flattened field within an object into typesense.

Metadata

Typesense Version:
typesense/[email protected]

Crash during backfilling collection data

Description

Hi! I want to implement full-text search in my project, which is based on Firebase. Right now I'm trying to get things rolling by backfilling my data to the Typesense collection, using the Firebase Typesense extension from this repo. Everything went smoothly except the backfilling process: the ext-firestore-typesense-search-backfillToTypesenseFromFirestore cloud function fails with the error "Memory limit exceeded". Usually I fix that problem by redeploying the function with more resources, but I'm not sure that is the proper solution here. The test collection is around 60MB and contains 53K documents. I was able to successfully export another collection that is a bit smaller (around 30K documents).

Steps to reproduce

Screenshot 2021-08-03 at 15 46 52

Expected Behavior

The ext-firestore-typesense-search-backfillToTypesenseFromFirestore trigger function shouldn't crash.

Actual Behavior

If the collection is big enough the ext-firestore-typesense-search-backfillToTypesenseFromFirestore trigger function returns an error Function invocation was interrupted. Error: memory limit exceeded.

Metadata

Typesense Version: 0.21.0

Fresh plugin installation fails with 401 - a valid `x-typesense-api-key` header must be sent.

Description

I have just set up a new cluster in Typesense cloud and configured the plugin for a firestore collection.
The process has gone without any issues, it's actually been surprisingly simple.

However, documents are failing to sync due to an authorization issue.
It seems that the GCF is not setting the header, as the TypeSense host rejects the upsert with the following message

RequestUnauthorized: Request failed with HTTP code 401 | Server said: Forbidden - a valid x-typesense-api-key header must be sent

image

The secret with the API key has been created during the setup, as it can be seen on the GCF variables tab.

image

image

I have reviewed the function source and watched for typos for anything related to the typesenseAPIKey, that is configured on the client.
It all seems fine to me.
I was wondering if there might be a little extra to be done when reading secrets from process.env.
But if that were the case this would have failed for many others as well.

Do you have any ideas on why this might be failing?

Steps to reproduce

  • Create a new TS cluster
  • Configure the firestore TS extension via the web wizard to sync a single collection
  • update any document in the collection
  • data will not sync, GCF logs will show the error above

Expected Behavior

I would expect the document to sync to Typesense.

Actual Behavior

The upsert is rejected with a 401 indicating that the x-typesense-api-key has not been provided.

Metadata

Typesense Version: 0.25.2

OS: cloud

Function Names Generated by Typesense Extensions Are Too Long, Causing Setup Issues in Firebase Locally

Description

When setting up Firebase Extensions locally, I've encountered a problem related to the length of function names generated by Typesense Extensions. Specifically, the generated functions are named "indexToTypesenseOnFirestoreWrite" and "backfillToTypesenseFromFirestore," and these names exceed the allowable length, causing issues during the local setup process.

Steps to reproduce

  1. Attempt to set up Firebase Extensions locally using the Typesense Extensions.
  2. Observe that the function names "indexToTypesenseOnFirestoreWrite" and "backfillToTypesenseFromFirestore" exceed the character limit.
Screenshot 2023-11-07 at 3 52 39 PM

Expected Behavior

I suggest that Firebase Extensions generated function names adhere to Firebase's function naming constraints, ensuring they are within the allowable character limit.

I would greatly appreciate it if this issue could be addressed as it's currently impeding the local development and deployment of Firebase Extensions using Typesense.

Thank you for your attention to this matter.

Typesense Version: typesense/[email protected]

how can i trigger backfill manually

Description

Reading the docs, I saw that I have to create a new collection called typesense_sync containing a document named backfill with trigger: true.
But none of my documents are actually filled into the collection.
In the logs I see a few details, but I don't understand them.

Steps to reproduce

Expected Behavior

Is there a way to trigger backfill manually?

Actual Behavior

Metadata

Typesense Version:

OS:
Screenshot 2022-02-16 at 17 36 23

Can we make most of the traffic go through a certain node?

We have 3 servers running Typesense over https on port 443, and we want to set up a cluster: n1.example.com, n2.example.com and n3.example.com. They are 3 different Ubuntu servers in different countries, installed via the DEB packages.

We're using typesense to help search our firebase firestore database, via the firestore typesense plugin. We have a Flutter application and we use the typesense SDK.

n1.example.com is our most powerful server and it's the closest server to our client in terms of geolocation.

  1. Can we make all the traffic go through n1.example.com and only use the others in case of a problem?
  2. In this case, what happens if n1.example.com goes down? The developer told us that the Firebase extension and the client SDK don't support client-side load balancing, so our only option is to use server-side load balancing. Is that correct? If so, how do we set up the cluster?

ideally for writes and updates I want a server side cluster because I don't want to send data from firestore to typesense 3 times, that would increase the firestore cost

I want the reads to happen from the client side, either that or to prioritize n1.example.com

Filter records which are indexed

Description

I have Firestore records with different publishing status (public, private, draft etc) and only want to include the public records in the search results. How can I achieve that today? If not available, it's a very clear feature request

Expected Behavior

Some kind of filter in the Firestore extension I assume?
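To make the request concrete, here is a sketch of the decision such a filter would have to make on each write. Everything in it is hypothetical: the thread does not confirm that the extension offers this, and the `status` field name and the `decideSyncAction` helper are assumptions.

```javascript
// Hypothetical filter logic for a write trigger. `before` and `after` are the
// document data before and after the change (null when the document is absent).
function decideSyncAction(before, after, isIndexable) {
  const wasIndexed = before != null && isIndexable(before);
  const shouldBeIndexed = after != null && isIndexable(after);
  if (shouldBeIndexed) return "upsert"; // new or still public: (re)index it
  if (wasIndexed) return "delete"; // was public, now hidden or deleted
  return "skip"; // never indexed, nothing to do
}

// Assumed predicate: only documents with status "public" are searchable.
const isPublic = (doc) => doc.status === "public";

console.log(decideSyncAction({ status: "public" }, { status: "draft" }, isPublic));
```

The "delete" branch matters: a record that changes from public to private has to be removed from the index, not just skipped.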

OS:
Typesense Cloud :-)

Warning on deleted documents

Description

Warning of "Snapshot has no readTime. Using now()" appears for every document deleted on firebase.

Steps to reproduce

const targetValue = keysToDelete[i];
await productsCollectionRef.doc(targetValue).delete();

Expected Behavior

Complete without warning

Actual Behavior

Warning appears:

{
  "textPayload": "Snapshot has no readTime. Using now()",
  "insertId": "insertID",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "region": "us-central1",
      "project_id": "abc",
      "function_name": "ext-firestore-typesense-search-indexToTypesenseOnFirestoreWrite"
    }
  },
  "timestamp": "2024-01-23T11:46:16.279665Z",
  "severity": "WARNING",
  "labels": {
    "runtime_version": "nodejs16_20240114_16_20_2_RC00",
    "execution_id": "executionID",
    "instance_id": "instanceID"
  },
  "logName": "projects/abc/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/abc/traces/traceID",
  "receiveTimestamp": "2024-01-23T11:46:16.538363409Z"
}

Metadata

Typesense Version: <1.2.0

OS: MacOS 14.2.1 (23C71)

Support multiple collections

Reading through the extension docs I conclude that you can only ever have 1 collection synced to Typesense from Firestore. For example the backfill trigger is clearly global and not aware of the collection. That's a bummer because I need multiple.

Is the limitation coming from Firebase Extensions, that you can only have one instance of any type of extension installed?

Update documentation on how to contribute to this package

Description

For me it is not clear from the section in the readme how to contribute to this project. I think the documentation is outdated and unclear. The firebase ext:dev:* commands no longer exist. It would be nice to know how to test locally, how to use your own fork, and what the conditions are for opening a pull request on this project. Thank you very much :)

Missing Tier 2 Cloud Function Locations?

Description

Looking at the current list of possible Cloud Function Locations, I noticed it only offers Tier 1 Cloud Function Locations:

options:
  - label: Iowa (us-central1)
    value: us-central1
  - label: South Carolina (us-east1)
    value: us-east1
  - label: Northern Virginia (us-east4)
    value: us-east4
  - label: Belgium (europe-west1)
    value: europe-west1
  - label: London (europe-west2)
    value: europe-west2
  - label: Frankfurt (europe-west3)
    value: europe-west3
  - label: Hong Kong (asia-east2)
    value: asia-east2
  - label: Tokyo (asia-northeast1)
    value: asia-northeast1

But Cloud Functions has many more locations (Tier 2), as shown in their docs.

I wonder if this was intentional? Was it due to the pricing difference between the tiers?

Document flattening throws error with deeper nested structures

Description

This extension currently flattens the document objects before writing them to Typesense to support older versions of Typesense that didn't support nested data structures. Since v0.24 this is no longer the case and this flattening causes issues.

Here is a stripped-down example of a schema that throws an error when used with this extension (it works fine if you write directly via the API without flattening the data structure):

{
  "name": "collection-name",
  "fields": [
    {
      "name": ".*",
      "type": "auto",
      "facet": false,
      "index": true,
      "sort": false
    },
    {
      "name": "someObject.field1",
      "type": "string",
      "facet": true,
      "index": true,
      "sort": true,
      "optional": false
    },
    {
      "name": "someObject.someObjectArray",
      "type": "object[]",
      "facet": true,
      "index": true,
      "sort": false,
      "nested": true,
      "nested_array": 1
    },
    {
      "name": "someObject.someObjectArray.field2",
      "type": "string[]",
      "facet": true,
      "index": true,
      "sort": false
    }
  ],
  "enable_nested_fields": true
}

I managed to get around the problem by setting "optional: true" on "someObject.someObjectArray.field2", but it is not ideal because I want to make sure that property always exists.

Steps to reproduce

Create a Typesense collection and Firestore collection that resembles the schema given above and trigger either the backfill or onWrite cloud function. It will fail with an error that the schema expects a field named "someObject.someObjectArray.field2" but it doesn't exist on the document.

Expected Behavior

No errors and it indexes the document in Typesense

Actual Behavior

Throws an error

Metadata

Typesense Version:

v0.24

OS:
n/a

Add support for multiple api keys

Currently, the Firestore extension supports multiple hosts with one API key, which isn't ideal.

Add support for multiple API keys: either store multiple API keys in Secret Manager or accept them comma-separated, just like the hosts. You would just have to ensure that no API key contains a comma.

I was trying to fix it but turns out the typesense js client only accepts one key!

Getting HTTP 400 error when trying to enable the Firebase extension through CLI

Description

I am using the Firebase CLI to try to enable the extension for a Firebase project. Major note is that the last few times I did this it worked fine, so this seems to be a new issue. I think it may be related to this change: 44c5b37

The TYPESENSE_API_KEY parameter appears to be the issue. Previously I had this set as the admin API key, which used to work. Now, when I try to enable the extension, I get an error related to Secret Manager, which seems to be saying I have formatted it wrong. Specifically, it says the value should be formatted like projects/*/secrets/*/versions/* instead of just the API key, but I am not sure how to format my API key like that (I am not sure what each of the asterisks should be).

Steps to reproduce

Set up parameters file
Run Firebase command to enable extension

firebase ext:install typesense/firestore-typesense-search --project=projectId --params=params.env --force

Expected Behavior

The extension will be enabled for my Firebase project

Actual Behavior

I receive a HTTP 400 error:

Error: HTTP Error: 400, Secret ID 123.... is invalid, must be of form projects/*/secrets/*/versions/*
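The pattern in the error message matches the Secret Manager resource name for a secret version: the three asterisks stand for the project (ID or number), the secret's name, and the version. The sketch below only builds such a name; `secretVersionName` is a hypothetical helper, and it assumes the API key itself has already been stored as the secret's value.

```javascript
// Hypothetical helper: build a Secret Manager secret-version resource name.
// The version defaults to the "latest" alias.
function secretVersionName(project, secretId, version = "latest") {
  return `projects/${project}/secrets/${secretId}/versions/${version}`;
}

// Placeholder project and secret names, for illustration only.
console.log(secretVersionName("my-project", "typesense-api-key"));
console.log(secretVersionName("my-project", "typesense-api-key", 3));
```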

Metadata

Typesense Version: v0.23.0.rc27

OS: MacOS

Support Node.js 18

Description

While checking my GCP cloud functions I noticed this
Screenshot 2023-11-01 at 03 05 00

Is supporting v18 coming soon?

Thank you

Firestore functions doesn't push data to typesense schema

Hi all!
I've created a Typesense schema like this:
const myCollection = { 'name': 'products', 'fields': [ {"name": ".*", "type": "auto" } ] }
Then I installed the Firestore extension and linked the "products" schema to a Firestore DB collection, but the functions don't work.

When I tried the backfill, the function threw this error:

image

When I tried to add a document, the function didn't throw an error, but the document doesn't appear in the Typesense schema.
On deleting the same document, the function threw this error:

image

It seems that the Firestore extension cannot communicate with the Typesense schema.
Can you help me?

How to index Firestore TimeStamp?

First of all, I would like to thank you for this nice project.
I'm trying to use a Firebase TimeStamp as the default_sorting_field of my Typesense collection. But it only supports int, float and string. Is it possible to index a Firebase Timestamp? How could this be done? Thanks again
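One way to approach this is to convert the Timestamp to Unix seconds before indexing, so the field can be declared as an integer type in the Typesense schema. The sketch below assumes the conversion happens on the document data before it is sent; both helper names are hypothetical, and it detects a Timestamp by its `seconds` and `nanoseconds` properties rather than by class.

```javascript
// Hypothetical helpers: replace Firestore Timestamp-like values
// ({ seconds, nanoseconds }) with whole Unix seconds.
function timestampToUnixSeconds(value) {
  const looksLikeTimestamp =
    value !== null &&
    typeof value === "object" &&
    typeof value.seconds === "number" &&
    typeof value.nanoseconds === "number";
  return looksLikeTimestamp ? value.seconds : value;
}

// Only top-level fields are converted in this sketch.
function convertTimestamps(doc) {
  return Object.fromEntries(
    Object.entries(doc).map(([key, value]) => [key, timestampToUnixSeconds(value)])
  );
}

console.log(
  convertTimestamps({
    title: "Hello",
    createdAt: { seconds: 1673020497, nanoseconds: 500000000 },
  })
);
```

Sub-second precision is dropped here; if it matters, milliseconds could be stored instead.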

`indexToTypesenseOnFirestoreWrite.js` does not delete documents when the docId has a hash `#`

Description

We use the typesense/[email protected] firebase extension to sync (create, update, delete) documents to our typesense cluster. We have a collection that has document IDs that contain hashes #, e.g. "AAAtest#000"

While document.creates are sync'd with the extension - document.deletes are failing

Steps to reproduce

  1. Create a firebase collection
  2. Add the firestore-typesense-search firebase extension using version 0.4.2
  3. Create a document with a docID with a hash symbol, e.g. "AAAtest#000"
  4. Wait for the extension to sync the document to typesense
  5. Delete the document in firebase (which should trigger the extension to delete the document in typesense)
  6. Observe that the delete sync to Typesense fails and the document still exists and is searchable in Typesense

Expected Behavior

Expected behavior is that upon deleting the document in firebase the document is also deleted in typesense

Actual Behavior

I go to firebase console, find a document with a hash sign (e.g. AAAtest#000), and issue a delete command.

The following message in google cloud logs:

Deleting document AAAtest#000

Error:

Error: Request failed with HTTP code 404 | Server said: Could not find a document with id: AAAtest

at .ApiCall.customErrorForResponse ( /workspace/node_modules/typesense/lib/Typesense/ApiCall.js:226 )
at .ApiCall.performRequest ( /workspace/node_modules/typesense/lib/Typesense/ApiCall.js:118 )
at .processTicksAndRejections ( internal/process/task_queues.js:95 )

Please notice that in the error message the characters after the hash symbol are dropped:

  • DocID: AAAtest#000
  • DocID referenced in error logs: AAAtest

As expected (since the delete fails) the document still exists in our typesense database
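The truncation is consistent with the `#` being read as the start of a URL fragment when the document ID is placed in a request path unencoded. That cause is an assumption, not something the logs confirm. The sketch below uses Node's built-in `URL` class and a hypothetical `documentPath` helper to show the difference percent-encoding makes.

```javascript
// Hypothetical helper: build the request path for a document, with or
// without percent-encoding the ID. The base URL is a placeholder.
function documentPath(collection, docId, encode) {
  const id = encode ? encodeURIComponent(docId) : docId;
  const url = new URL(
    `/collections/${collection}/documents/${id}`,
    "https://typesense.example"
  );
  return url.pathname;
}

// Unencoded: everything from "#" onward becomes the fragment and is lost.
console.log(documentPath("warehouses", "AAAtest#000", false));
// Encoded: "#" is sent as %23 and stays part of the path.
console.log(documentPath("warehouses", "AAAtest#000", true));
```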

Metadata

Typesense Version:
Typesense v0.23.1
typesense/[email protected]

OS:
functions are executed as a Firebase Cloud function
node v14
typescript

Host issues with DNS ip https - EAI_AGAIN and ECONNREFUSED

Hi, I've been trying to set this up for a while now. I never get any errors, but nothing happens when I update or add docs to the collection it is supposed to be indexing. How do I troubleshoot the change in the document? What commands should I use?

The only log I can see is when running tail -f /var/log/typesense/typesense.log

I20220618 02:35:29.949776 28495 raft_server.h:60] Peer refresh succeeded!
I20220618 02:35:39.951146 28490 raft_server.cpp:531] Term: 5, last_index index: 18, committed_index: 18, known_applied_index: 18, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 44

Cloud logs return 0 logs.

My setup:

  • I have a fresh Ubuntu AWS Lightsail running typesense
  • port 8108 is open
  • https is on
  • health check returns ok.
  • Collection was created on the server
  • typesense_sync collection created

Questions:

  • Can we please expand the readme to include how to troubleshoot?
  • Can we please add to the readme examples of the config? I.e. do I need a forward slash for collections? Do I need the port on the server?

image

I am using this to test whether a change in a doc creates any new document, but it always returns 0 documents.

curl -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
     -X GET \
    "https://EXAMPLE.com:8108/collections/service"
{"created_at":1655499258,"default_sorting_field":"","fields":[{"facet":false,"index":true,"infix":false,"locale":"","name":".*","optional":true,"sort":false,"type":"auto"}],"name":"service","num_documents":0,"symbols_to_index":[],"token_separators":[]}

http unsupported (self hosting)

Testing with self hosted instance and entered [IP ADDRESS]:8108 in the Extension configuration.

Firebase Functions logs show the port is updated to 443 and the connection is refused:

image

Matching Schemas and Collections.

Description

Using this schema:

let schema = {
  'name': 'companies',
  'fields': [
    {
      'name': 'company_name',
      'type': 'string',
      'facet': false
    },
    {
      'name': 'num_employees',
      'type': 'int32',
      'facet': false
    },
    {
      'name': 'country',
      'type': 'string',
      'facet': true
    }
  ],
  'default_sorting_field': 'num_employees'
}

firestore collection:

{
"company_name": "abc inc",
"num_employees": "50"
}

Could we have a collection with missing parameters (ie. country) and upload the missing parameters after? For example, uploading the missing parameter (or parameters) after a document is added in firestore (using a trigger after each creation, update, deletion).

Edit:

To be clear, the collection won't have the parameter but the schema will have it. We add the parameter to that record after each modification.

Support for 'ref' type fields

Description

Currently ref type fields return null:

else if (value instanceof admin.firestore.DocumentReference) { return null; }
See here

Expected Behavior

ref is a firebase object that has a property path that returns a string, for me personally that's all I need to be indexed to Typesense so I think working with that would be the easiest solution for now.
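A sketch of the behavior requested here: return the reference's `path` string instead of null. The helper `serializeReference` is hypothetical, and the check for "is this a reference" is passed in so the sketch runs without `firebase-admin`; in the extension it would presumably be the `instanceof admin.firestore.DocumentReference` test quoted above.

```javascript
// Hypothetical helper: index a reference field as its path string.
function serializeReference(value, isDocumentReference) {
  return isDocumentReference(value) ? value.path : value;
}

// Stand-in class for illustration only; the real check would be
//   (v) => v instanceof admin.firestore.DocumentReference
class FakeDocumentReference {
  constructor(path) {
    this.path = path;
  }
}
const isRef = (v) => v instanceof FakeDocumentReference;

console.log(serializeReference(new FakeDocumentReference("users/abc123"), isRef));
console.log(serializeReference("plain string", isRef));
```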

Metadata

Typesense Version:

"typesense": "^1.2.2",
"typesense-instantsearch-adapter": "^2.4.0"

OS:

macOS

Add option to map field names

Description

I would like to have the option to map field names from Firestore to different field names in the Typesense collection. As an example, the field id is called id in both Firestore and Typesense, but the Firestore field ownerId should be called userId in Typesense. To achieve this, the Firestore Collection Fields setting in the Typesense extension configuration could accept the following syntax, with the sync cloud functions updated to handle it.

# Previously 
id, ownerId, date

# Future 
id, ownerId=userId, date 
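The proposed syntax could be parsed along these lines. This is only a sketch of the feature request: `parseFieldMappings` and `applyFieldMappings` are hypothetical names, and nothing here reflects how the extension actually reads its configuration.

```javascript
// Hypothetical parser for "id, ownerId=userId, date":
// each entry is either "field" or "firestoreField=typesenseField".
function parseFieldMappings(spec) {
  const mappings = new Map();
  for (const entry of spec.split(",")) {
    const trimmed = entry.trim();
    if (!trimmed) continue; // ignore empty entries such as a trailing comma
    const [source, target = source] = trimmed.split("=").map((part) => part.trim());
    mappings.set(source, target);
  }
  return mappings;
}

// Copy only the listed fields, renaming them where a mapping is given.
function applyFieldMappings(doc, mappings) {
  const out = {};
  for (const [source, target] of mappings) {
    if (source in doc) out[target] = doc[source];
  }
  return out;
}

const mappings = parseFieldMappings("id, ownerId=userId, date");
console.log(applyFieldMappings({ id: "1", ownerId: "u1", internal: true }, mappings));
```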

Add support for multiple collections to be indexed in typesense

Description

This is not a bug with the extension, but a request to add a feature that would allow support for multiple collections to be indexed into typesense. The configuration steps could allow an array or list of collections to be inputted, and a subarray for fields.

help setting up

Hi all. I'm running self-hosted Typesense on AWS via Docker. I have nginx redirecting 80 and 443 traffic to localhost:8108, with an SSL certificate etc.

if I visit my-domain.com/health, I get {"ok":true} back. I believe this means nginx is correctly redirecting traffic. however, when I try to trigger calls on firestore, I get this error:

{
  "textPayload": "Error: Request failed with HTTP code 404 | Server said: Not Found\n    at ApiCall._customErrorForResponse (/workspace/node_modules/typesense/lib/Typesense/ApiCall.js:378:26)\n    at ApiCall._callee$ (/workspace/node_modules/typesense/lib/Typesense/ApiCall.js:186:70)\n    at tryCatch (/workspace/node_modules/regenerator-runtime/runtime.js:63:40)\n    at Generator.invoke [as _invoke] (/workspace/node_modules/regenerator-runtime/runtime.js:293:22)\n    at Generator.next (/workspace/node_modules/regenerator-runtime/runtime.js:118:21)\n    at asyncGeneratorStep (/workspace/node_modules/@babel/runtime/helpers/asyncToGenerator.js:3:24)\n    at _next (/workspace/node_modules/@babel/runtime/helpers/asyncToGenerator.js:25:9)\n    at processTicksAndRejections (internal/process/task_queues.js:95:5)",
  "insertId": "000000-1be6577f-97bb-4c82-aa28-74f47012e713",
  "resource": {
    "type": "cloud_function",
    "labels": {
      "project_id": "modify-dev",
      "region": "us-central1",
      "function_name": "ext-firestore-typesense-search-indexToTypesenseOnFirestoreWrite"
    }
  },
  "timestamp": "2021-10-21T21:37:53.973Z",
  "severity": "ERROR",
  "labels": {
    "execution_id": "k67afrdv37p2"
  },
  "logName": "projects/modify-dev/logs/cloudfunctions.googleapis.com%2Fcloud-functions",
  "trace": "projects/modify-dev/traces/0f171f3a97592d67f6c842091303f0b5",
  "receiveTimestamp": "2021-10-21T21:38:03.542328289Z"
}

when looking at my nginx access logs, I can see that the extension does hit the server.
any ideas?

How to index Firestore subfields?

Description

I have a Firestore collection with subfields, e.g. { "user": {"username": string, ...} }, and have installed the firestore-typesense-search extension.
I want to create a Typesense collection with indexing by user.username without duplicating this in a flat field like user_username in Firestore.

Is it possible?
I've tried specifying user.username,... in extension configuration, but that doesn't seem to work.
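For illustration, resolving a dotted field name such as user.username against a nested document could look like the sketch below. Both helpers are hypothetical; the thread does not say how the extension matches configured field names.

```javascript
// Hypothetical helpers: look up "user.username" inside a nested document and
// emit it under the same dotted key.
function getByPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

function pickFields(doc, paths) {
  const out = {};
  for (const path of paths) {
    const value = getByPath(doc, path);
    if (value !== undefined) out[path] = value; // skip fields the doc lacks
  }
  return out;
}

console.log(
  pickFields(
    { title: "Post", user: { username: "ada", email: "ada@example.com" } },
    ["title", "user.username"]
  )
);
```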

Is the backfill batching logic fully correct ?

Considering the order of execution of async functions in the Node.js Event Loop,
is the backfill batching logic fully correct?

let currentDocumentsBatch = [];
querySnapshot.forEach(async ....); // this inner async functions might get executed after next line after the forEach()
if (currentDocumentsBatch.length > 0) { // this line might get executed before above inner async functions
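One way to avoid the ordering concern is to build the batches with plain iteration and await each import inside a `for...of` loop, since `forEach` does not wait for async callbacks. The sketch below is not the extension's code: `toBatches` and `importInBatches` are hypothetical, and `importBatch` stands in for whatever call sends a batch to Typesense. With Firestore, `docs` could be the `docs` array of a `QuerySnapshot`.

```javascript
// Hypothetical helper: split any iterable into arrays of at most `batchSize`.
function* toBatches(items, batchSize) {
  let batch = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // final partial batch
}

// Each batch is fully imported before the next one starts, so nothing runs
// "after the loop" by accident.
async function importInBatches(docs, batchSize, importBatch) {
  let imported = 0;
  for (const batch of toBatches(docs, batchSize)) {
    await importBatch(batch);
    imported += batch.length;
  }
  return imported;
}

console.log([...toBatches([1, 2, 3, 4, 5], 2)]);
```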

Typesense v0.24.+ Nested fields support

Description

Nested fields support was added to the Typesense v0.24.
Wondering when you're planning to support it in the extension along with updated docs on how to do it?

Thank you, please let me know if you already have something for this and it's a duplicate.

Firebase integration with self-hosted server

Description

I installed Typesense on my Ubuntu 20.04 Apache server and it is running successfully, but I can't get my Firestore data to sync using the Firebase extension. When I enter the public IP of my server in the Typesense host field, it fails to upsert the data.

Steps to reproduce

firebase extension
image

image

and this is my config file for typesense server

image

I'm guessing that it needs SSL certification but where do I get my SSL certificates to put in the configs?
Can self-signed work? Can certbot work?

I'm new to this so have mercy on me


Expected Behavior

Actual Behavior

Metadata

Typesense Version:

OS:

Strings for Hits and Fields

I created a Typesense Cloud instance and installed this extension. I see [Object] and [Array] for a bunch of the fields that are text in the Typesense documentation. Is there a way to get the string fields back via configuration?

[
  {
    created_at: 1673020497,
    default_sorting_field: '',
    fields: [ [Object], [Object], [Object], [Object], [Object], [Object] ],
    name: 'companies',
    num_documents: 4,
    symbols_to_index: [],
    token_separators: []
  }
]
{
  facet_counts: [],
  found: 1,
  hits: [
    {
      document: [Object],
      highlights: [Array],
      text_match: 72341265420713980
    }
  ],
  out_of: 4,
  page: 1,
  request_params: { collection_name: 'companies', per_page: 10, q: 'stackoverflow' },
  search_cutoff: false,
  search_time_ms: 0
}
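
The [Object] and [Array] placeholders are characteristic of Node.js console output, which by default only expands nested values two levels deep, so the data itself is likely intact. A small sketch of printing the full structure follows; the searchResult value is illustrative, not real API output.

```javascript
const util = require("node:util");

// Illustrative object shaped like the search response above.
const searchResult = {
  found: 1,
  hits: [
    {
      document: { id: "1", name: "stackoverflow" },
      highlights: [{ field: "name", matched_tokens: ["stackoverflow"] }],
    },
  ],
};

// Default inspection depth is 2, so deeper values collapse to [Object] / [Array].
const truncated = util.inspect(searchResult);

// depth: null removes the limit and prints every level.
const full = util.inspect(searchResult, { depth: null });

// JSON.stringify also prints the whole structure.
const asJson = JSON.stringify(searchResult, null, 2);

console.log(full);
console.log(asJson);
```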

Getting started with Cloud offering

Hi,

I was hoping to get some help to get started with the cloud offering. I can't join Slack (it rejects my Google login)

image

and the "schedule a demo" doesn't work in Asia (they are all very early in the morning):

image

I've created a paid cluster esvcdu8rzkpbfgy4p and installed the Firestore extension, then added a "collection" in the Typesense UI matching the collection "items" in Firestore. However, when I add a document to "items", it doesn't appear in the search database.

How do I debug this?
