Issue Description: I am trying to build and run the sparkler from t


Error from server at http://localhost:8983/solr/crawldb: ERROR: [doc=<>] unknown field 'contenthash'

ravindrabajpai commented on June 12, 2024

Error from server at http://localhost:8983/solr/crawldb: ERROR: [doc=<>] unknown field 'contenthash'


Comments (3)

ravindrabajpai commented on June 12, 2024

I tried a workaround by commenting out this line in StatusUpdateSolrTransformer:
//Constants.storage.CONTENTHASH -> ContentHash.fetchHash(data.fetchedData.getContent)

And it works for me for now.

But my hunch is that there is a better solution and maybe I am missing something in the configurations.
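
One configuration-side alternative to removing the line is to make the crawldb schema accept the field. The following is a minimal sketch, assuming the core uses a managed (mutable) schema so that Solr's Schema API can add fields, and assuming JDK 11+ for `java.net.http`. The `string` type and the `indexed`/`stored` flags are guesses, not values taken from Sparkler's configuration, and `AddContentHashField` is a hypothetical name used only for illustration.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object AddContentHashField {
  // Assumption: the core from the error message, with a managed (mutable) schema.
  val schemaUrl = "http://localhost:8983/solr/crawldb/schema"

  // Builds a Schema API "add-field" command. The "string" type and the
  // indexed/stored flags are guesses, not values taken from Sparkler's config.
  def addFieldPayload(name: String, fieldType: String = "string"): String =
    s"""{"add-field":{"name":"$name","type":"$fieldType","indexed":true,"stored":true}}"""

  def main(args: Array[String]): Unit = {
    val request = HttpRequest.newBuilder(URI.create(schemaUrl))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(addFieldPayload("contenthash")))
      .build()
    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    // Solr reports success or the reason for rejection in the JSON body.
    println(s"HTTP ${response.statusCode()}: ${response.body()}")
  }
}
```

If the core was created from an older copy of the schema, recreating it from the configuration shipped with the checked-out Sparkler version may be the cleaner fix; this sketch only shows the field being added by hand.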


lewismc commented on June 12, 2024

Hi @ravindrabajpai, thanks for reporting the bug!

> I see the Content Hash object in the sparkler-core code, but do not see it getting injected in the solr,

The content signature cannot be calculated at the inject phase, as it is based on the webpage content rather than the URL.

> then why it is expected while fetching.

I suspect it is expected 'after' fetching but before indexing.

> But my hunch is that there is a better solution and maybe I am missing something in the configurations.

Can you check that the webpage content was actually fetched?
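
The point above, that a content signature depends on the fetched bytes and not on the URL, can be sketched as follows. This is a hypothetical stand-in, not Sparkler's `ContentHash.fetchHash`; the choice of SHA-256 and the name `ContentHashSketch` are assumptions made for illustration only.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object ContentHashSketch {
  // Hypothetical stand-in for a content signature: hex-encoded SHA-256 of the
  // fetched bytes. The algorithm Sparkler actually uses may differ.
  def sha256Hex(content: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256")
      .digest(content)
      .map(b => "%02x".format(b & 0xff)) // mask to avoid sign extension
      .mkString

  def main(args: Array[String]): Unit = {
    // Two pages with identical bodies get the same signature regardless of URL,
    // which is why the value only exists once the content has been fetched.
    val pageA = "<html>same body</html>".getBytes(StandardCharsets.UTF_8)
    val pageB = "<html>same body</html>".getBytes(StandardCharsets.UTF_8)
    println(sha256Hex(pageA) == sha256Hex(pageB))
  }
}
```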


ravindrabajpai commented on June 12, 2024

Hi @lewismc

Thanks for replying. Yes, I could see that the webpage content was fetched correctly. I injected a total of 2 URLs (additionally: edition.cnn.com), and both were fetched and stored correctly in Solr. There were 300+ docs for both sources (websites).

All the steps I followed: https://github.com/ravindrabajpai/ana/blob/main/ground_zero

Thanks.
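
One way to double-check the fetched document count directly against Solr is a `rows=0` select, which returns only the match count. This is a rough sketch: the field name `status` and the value `FETCHED` are assumptions about the crawldb schema and should be adjusted to whatever the core actually defines, and `FetchedCountQuery` is a hypothetical name.

```scala
import java.net.URLEncoder
import scala.io.Source

object FetchedCountQuery {
  // Assumption: the core named in the error message.
  val core = "http://localhost:8983/solr/crawldb"

  // Builds a select URL that asks only for the number of matching documents.
  def selectUrl(query: String): String = {
    val encoded = URLEncoder.encode(query, "UTF-8")
    s"$core/select?q=$encoded&rows=0&wt=json"
  }

  def main(args: Array[String]): Unit = {
    // Assumption: fetched documents carry status:FETCHED in this schema.
    val source = Source.fromURL(selectUrl("status:FETCHED"), "UTF-8")
    try println(source.mkString) // numFound in the response is the count
    finally source.close()
  }
}
```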

