Hi and thank you for this connector, I have 3 elastic indexes that h

No errors but data not pulled out from Elastic about kafka-connect-elasticsearch-source HOT 9 CLOSED

dariobalinzo commented on July 4, 2024

No errors but data not pulled out from Elastic

from kafka-connect-elasticsearch-source.

Comments (9)

DarioBalinzo commented on July 4, 2024

Hi @tee2015,
sorry for my late response.

Were there any error logs in your experiments?

There is no limit on the number of topics assigned to the connector, and you can also run multiple instances of the connector, each working on different data.

One important config to check is the incrementing field. I see that you are using a timestamp field. Keep in mind that the incrementing field is designed to work with strictly incrementing values: if you have many duplicates in the @timestamp field, the connector may lose data when performing a paginated query. If this is the issue, take a look at the secondary incrementing field feature.
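
To illustrate why duplicate values in the incrementing field can drop documents, here is a minimal sketch in Python. It is not the connector's actual code: it assumes a paginated reader that resumes with "value strictly greater than the last one seen", and the field names `ts` and `id` and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Doc = Dict[str, int]


def fetch_page_primary_only(docs: List[Doc], cursor: Optional[int], page_size: int) -> List[Doc]:
    """Next page using only the primary field: ts strictly greater than the cursor."""
    candidates = [d for d in docs if cursor is None or d["ts"] > cursor]
    candidates.sort(key=lambda d: (d["ts"], d["id"]))
    return candidates[:page_size]


def fetch_page_with_secondary(docs: List[Doc], cursor: Optional[tuple], page_size: int) -> List[Doc]:
    """Next page using (primary, secondary) as a compound cursor to break ties."""
    candidates = [d for d in docs if cursor is None or (d["ts"], d["id"]) > cursor]
    candidates.sort(key=lambda d: (d["ts"], d["id"]))
    return candidates[:page_size]


def drain(fetch: Callable, docs: List[Doc], page_size: int, key: Callable) -> List[int]:
    """Read pages until none are left and return the ids that were seen."""
    cursor = None
    seen: List[int] = []
    while True:
        page = fetch(docs, cursor, page_size)
        if not page:
            return seen
        seen.extend(d["id"] for d in page)
        cursor = key(page[-1])  # resume after the last document of this page


# Five documents share ts=100; one document has ts=101.
docs = [{"ts": 100, "id": i} for i in range(1, 6)] + [{"ts": 101, "id": 6}]

primary_only = drain(fetch_page_primary_only, docs, 2, lambda d: d["ts"])
with_secondary = drain(fetch_page_with_secondary, docs, 2, lambda d: (d["ts"], d["id"]))

print(primary_only)    # ids 3, 4 and 5 are skipped
print(with_secondary)  # every id is read
```

With a page size of 2, the primary-only reader moves its cursor to 100 after the first page and never returns to the remaining documents that share that value. The compound cursor keeps them reachable, which is the idea behind a secondary incrementing field.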

As for Confluent Hub, the release process is manual, not automatic, but I will contact them soon to publish the latest stable version.

Dario

phoenixml commented on July 4, 2024

Hi, I simplified the problem to pulling just one index (test1) from ES into Kafka, but I am still not able to do that. To check that the connector itself works, I used another index (test5) from the same Elastic with a different schema, and I can successfully get the data from ES (test5) into Kafka via the connector.
I also wrote the _doc from the test1 index into Kafka via "kafkacat -P -l test1.json " and the message was stored successfully. I have no clue how to proceed. I added ignore.key to the connector config above, but still no luck. Any ideas, or other options for additional troubleshooting, @DarioBalinzo? That would be great, thank you in advance.

I am able to see the number of messages reported by the connector:
total messages: 39 (com.github.dariobalinzo.task.ElasticSourceTask:171)

phoenixml commented on July 4, 2024

I am using version 1.3 of this connector, installed via
confluent-hub install dariobalinzo/kafka-connect-elasticsearch-source:1.3
which is not the latest release. I am wondering whether a higher version would resolve my issue, and also why Confluent Hub doesn't offer a version higher than 1.3,
because setting up the connector manually is a very tedious process, and Confluent Hub makes it so simple.

phoenixml commented on July 4, 2024

I tested with 1.4.2. When the connector is created it finally pulls the data, but it stops after that and the topic does not receive the latest records.
I downgraded to 1.4.1 and got the same issue.

phoenixml commented on July 4, 2024

Hi @DarioBalinzo,

I appreciate your response. I will focus on the incrementing field setting and add a secondary one. I managed to get 1.4.2 into a container setup, which I will post here:

Dockerfile:

FROM confluentinc/cp-kafka-connect-base:6.0.1
COPY target/dariobalinzo-kafka-connect.zip /tmp/dariobalinzo-kafka-connect1.4.2.zip
RUN confluent-hub install --no-prompt /tmp/dariobalinzo-kafka-connect1.4.2.zip

Build your image:
docker build . -t my-custom-image:1.0.0

Docker Compose (for all other Docker Compose details, check the Kafka Connect "zero to hero" demo):

kafka-connect:
    image: my-custom-image:1.0.0
    container_name: kafka-connect
    etc .........

I hope this will help others until it is available via Confluent Hub.

Tarek

phoenixml commented on July 4, 2024

Hi @DarioBalinzo,
There are no errors in the Docker stdout. However, I can now see that the connector recognises only one message in the index, while I have more than one. Is there another place to look for an error log?

This is what I mean by it recognising just one message:

kafka-connect | [2021-09-13 14:29:21,344] INFO [logs-es-source|task-0] index logs total messages: 1 (com.github.dariobalinzo.task.ElasticSourceTask:193)
kafka-connect | [2021-09-13 14:29:21,345] INFO [logs-es-source|task-0] no data found, sleeping for 5000 ms (com.github.dariobalinzo.task.ElasticSourceTask:197)

I added secondary.incrementing.field.name:"id"

however this didn't solve my issue :(

DarioBalinzo commented on July 4, 2024

Hi @tee2015 ,
does your dataset contain sensitive information? If not, you can share a subset of the documents from that index with me and I will try to investigate.

If that is not possible, can I ask you to run some aggregations on the data that you are using?
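
The thread does not say which aggregations were meant. One that fits the duplicate-timestamp concern is a terms aggregation that lists values of the incrementing field shared by more than one document. The sketch below only builds the search body and summarizes a response; the aggregation name `duplicate_values`, the helper names, and the sample response are hypothetical, and the body would be sent to the index's `_search` endpoint.

```python
import json
from typing import Any, Dict, List, Tuple


def duplicate_values_query(field: str = "@timestamp", max_buckets: int = 20) -> Dict[str, Any]:
    """Search body listing field values that occur in at least two documents."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "duplicate_values": {
                "terms": {"field": field, "min_doc_count": 2, "size": max_buckets}
            }
        },
    }


def summarize_buckets(response: Dict[str, Any]) -> List[Tuple[Any, int]]:
    """Return (value, document count) pairs from the aggregation response."""
    buckets = response["aggregations"]["duplicate_values"]["buckets"]
    return [(b.get("key_as_string", b["key"]), b["doc_count"]) for b in buckets]


query_body = duplicate_values_query()
print(json.dumps(query_body, indent=2))

# Hypothetical response shape, for illustration only (not real data).
sample_response = {
    "aggregations": {
        "duplicate_values": {
            "buckets": [{"key": 0, "key_as_string": "1970-01-01T00:00:00.000Z", "doc_count": 39}]
        }
    }
}
duplicates = summarize_buckets(sample_response)
print(duplicates)
```

An empty bucket list would suggest the incrementing field is effectively unique; large buckets point to the duplicate problem described earlier in the thread.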

phoenixml commented on July 4, 2024

Hi @DarioBalinzo, I found out that I had a script running on a host as a backend process, sending data to ES with the same timestamp :0. After I stopped this script and started sending live data instead, it works as expected 💯.
I am looking for another field to add as a secondary incrementing field in my config. Can you please confirm what the property name would be? secondary.incrementing.field.name:"id"
I appreciate your assistance :)

phoenixml commented on July 4, 2024

Found it: incrementing.secondary.field.name. Closing this issue.
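
For anyone landing here with the same question, the fragment below sketches where that property would sit in a connector config. Only `incrementing.secondary.field.name` and the field `id` come from this thread. The primary property name `incrementing.field.name` is an assumption, the connector name is taken from the log lines earlier in the thread, and connection settings are left out on purpose.

```python
import json

# Partial connector config: a sketch, not a complete or verified configuration.
connector_config = {
    "name": "logs-es-source",  # connector name as it appears in the log lines
    "config": {
        # Assumed name of the primary incrementing property.
        "incrementing.field.name": "@timestamp",
        # Property name confirmed in this thread; "id" is the field discussed.
        "incrementing.secondary.field.name": "id",
        # Connection, index and topic settings omitted.
    },
}

config_json = json.dumps(connector_config, indent=2)
print(config_json)
```

Note that the earlier attempt in this thread used `secondary.incrementing.field.name`, which has the two words in the other order.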
