norconex / committer-sql
Implementation of Norconex Committer for SQL (JDBC) databases.
Home Page: https://opensource.norconex.com/committers/sql/
License: Apache License 2.0
In the createTableSQL tag, I can set the fields that are available for the tagger, i.e. the fields that would go here:
<tagger class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
<fields>title,description,document.reference,google-site-verification</fields>
</tagger>
The fields obtained vary for each document/site. I use DebugTagger to list them.
I create a table like this:
CREATE TABLE ${tableName} (
${targetReferenceField} VARCHAR(32672) NOT NULL,
${targetContentField} CLOB,
title VARCHAR(256),
description VARCHAR(256),
googleSiteVerification VARCHAR(256),
PRIMARY KEY ( ${targetReferenceField} )
)
How can I match the google-site-verification tagger field to the googleSiteVerification column?
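The name transformation being asked about can be illustrated with a minimal Java sketch. It is not part of the committer: the class `ColumnNames` and the method `toCamelCase` are hypothetical names introduced here, and the sketch only shows how a hyphenated field name could be turned into the camelCase column name used in the table definition. In the crawler configuration itself, the rename would presumably be done with a RenameTagger like the one that appears later in this thread.

```java
// Hypothetical helper, not part of Norconex: converts a metadata field name
// such as "google-site-verification" into a camelCase column name.
final class ColumnNames {
    private ColumnNames() {}

    static String toCamelCase(String field) {
        StringBuilder out = new StringBuilder(field.length());
        boolean upperNext = false;
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                out.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            } else {
                // Drop the separator; capitalize the next letter unless
                // nothing has been written yet.
                upperNext = out.length() > 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("google-site-verification"));
    }
}
```

With such a mapping in hand, each source field can be paired with its target column before the table is created.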
I am getting the following error while committing to MySQL.
Test Example: 2018-09-06 20:01:01 INFO - Max queue size reached (10). Committing
Test Example: 2018-09-06 20:01:01 INFO - Committing 10 files
Test Example: 2018-09-06 20:01:01 INFO - Sending 5 commit operations to SQL database.
Test Example: 2018-09-06 20:01:01 INFO - Checking if table "FH_1200" exists...
Test Example: 2018-09-06 20:01:05 INFO - Table "FH_1200" does not exist. Attempting to create it...
Test Example: 2018-09-06 20:01:05 INFO - Table created.
Test Example: 2018-09-06 20:01:05 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:480)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:425)
at com.norconex.committer.core.AbstractCommitter.commitIfReady(AbstractCommitter.java:146)
at com.norconex.committer.core.AbstractCommitter.add(AbstractCommitter.java:97)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:34)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:27)
at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:91)
at com.norconex.collector.http.crawler.HttpCrawler.executeCommitterPipeline(HttpCrawler.java:379)
at com.norconex.collector.core.crawler.AbstractCrawler.processImportResponse(AbstractCrawler.java:595)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextQueuedCrawlData(AbstractCrawler.java:541)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextReference(AbstractCrawler.java:419)
at com.norconex.collector.core.crawler.AbstractCrawler$ProcessReferencesRunnable.run(AbstractCrawler.java:812)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x90\x92\xE2\x9C...' for column 'content' at row 1 Query: INSERT INTO FH_1200(document_reference,Server,keywords,imagepath,description,id,title,Date,content) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
I am sending the configuration by email with the same subject.
Thank you
Dear Mr. Paul,
I am getting the following error while committing to MySQL version 5.7:
Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x91\x89 I...' for column 'content' at row 1 Query: INSERT INTO bangalore_email_test(starturl,document_reference,Server,id,title,email,Date,content)
I raised this issue earlier in #3, and you suggested using ReplaceTagger.
I tried the following config:
<transformer class="com.norconex.importer.handler.transformer.impl.ReplaceTransformer" caseSensitive="false">
<replace>
<fromValue>[\xF0\x9F\x91\x89]{1,}</fromValue>
<toValue>newyork</toValue>
</replace>
</transformer>
And this configuration
<tagger class="com.norconex.importer.handler.tagger.impl.ReplaceTagger">
<replace replaceAll="true">
<fromValue>[\xF0\x9F\x91\x89]{1,}</fromValue>
<toValue>orange</toValue>
</replace>
</tagger>
I don't need this hexadecimal data. Please advise how to use ReplaceTransformer or ReplaceTagger.
This is the special character I am getting from the web page:
👉
Regards
Balu
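The bytes in the error message, \xF0\x9F\x91\x89, are the UTF-8 encoding of a single character, U+1F449. Java regular expressions work on characters rather than on UTF-8 bytes, so a class such as [\xF0\x9F\x91\x89] lists four unrelated characters (U+00F0, U+009F, U+0091, U+0089) and would not be expected to match the emoji. The sketch below is plain Java, not Norconex configuration, and the class and method names are hypothetical. It shows the difference and one way to drop every character outside the Basic Multilingual Plane, which is what MySQL's 3-byte `utf8` character set cannot store. Switching the column to `utf8mb4` would be an alternative that keeps the characters.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Norconex.
final class SupplementaryChars {
    private SupplementaryChars() {}

    // Any code point above U+FFFF (these need 4 bytes in UTF-8).
    private static final Pattern NON_BMP =
            Pattern.compile("[\\x{10000}-\\x{10FFFF}]");

    // The character class from the configs above: four single chars,
    // U+00F0, U+009F, U+0091 and U+0089, not the emoji's bytes.
    private static final Pattern BYTE_CLASS =
            Pattern.compile("[\\xF0\\x9F\\x91\\x89]{1,}");

    static String stripNonBmp(String text) {
        return NON_BMP.matcher(text).replaceAll("");
    }

    static boolean byteClassMatches(String text) {
        return BYTE_CLASS.matcher(text).find();
    }

    public static void main(String[] args) {
        String sample = "\uD83D\uDC49 Hi"; // U+1F449 followed by " Hi"
        System.out.println(byteClassMatches(sample));
        System.out.println(stripNonBmp(sample));
    }
}
```

If the same idea is carried over to a ReplaceTransformer, the fromValue would need to be a character-based expression of this kind; whether that handler accepts this exact syntax is an assumption to verify against its documentation.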
Related to a closed issue:
No matter what I try I keep getting the following exception:
Caused by: java.sql.SQLException: Incorrect table name 'document' Query: ALTER TABLE tablename ADD document.reference VARCHAR(5000)
My latest config to get rid of document.reference:
<tagger class="com.norconex.importer.handler.tagger.impl.RenameTagger">
<rename fromField="document.reference" toField="url" overwrite="true" />
</tagger>
<tagger class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
<fields>title,keywords,description,url</fields>
</tagger>
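In unquoted SQL, `document.reference` reads as column `reference` of a table named `document`, which is consistent with the "Incorrect table name 'document'" message. The following Java sketch shows one way a field name could be made safe for use as a column name. It is an illustration only: `SqlIdentifiers.sanitize` is a hypothetical helper, and it is an assumption, not a confirmed fact, that the committer's fixFieldNames option behaves similarly.

```java
// Hypothetical helper, not part of Norconex.
final class SqlIdentifiers {
    private SqlIdentifiers() {}

    // Replaces every character that is not a letter, digit or underscore
    // with an underscore, and prefixes names that are empty or start with
    // a digit, since many databases reject such identifiers.
    static String sanitize(String field) {
        String cleaned = field.replaceAll("[^A-Za-z0-9_]", "_");
        if (cleaned.isEmpty() || Character.isDigit(cleaned.charAt(0))) {
            return "_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(sanitize("document.reference"));
    }
}
```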
Hi,
Is it possible to use SQL Committer with SQLite?
Do you have any examples of using SQL Committer?
Regards,
Christian
I am trying to store data in a MySQL table using the SQL Committer but am unable to do so because I don't understand where to map the fields that will then become MySQL table columns. I do get the required data in a JSON file when using a JSONFileCommitter. If you could share an example of the config file and the tags I should be using, it would be a great help.
I am getting the following errors while using SQL Committer V2. Can you please help?
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: 100% completed (323 processed/323 total)
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Deleting orphan references (if any)...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Deleted 0 orphan references...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Crawler finishing: committing documents.
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Committing 89 files
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Checking if table "crawler7" exists...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Table "crawler7" does not exist. Attempting to create it...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Table created.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:39 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:39 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:44 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:44 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:49 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:49 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:54 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:54 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Hello Hotel Example: Crawler executed in 8 minutes 32 seconds.
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Hello Hotel Example: Closing sitemap store...
Hello Hotel Example: 2018-08-09 21:59:59 ERROR - Execution failed for job: Hello Hotel Example
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Running Hello Hotel Example: END (Thu Aug 09 21:51:26 IST 2018)
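The first failure in the log above is a ClassCastException raised in runExists. The committer's source at that line is not shown here, so the following is only a guess at the kind of code involved: a scalar query result (for example a row count, which MySQL returns as BIGINT and JDBC typically surfaces as a Long) being cast directly to Integer. The Java sketch below uses hypothetical names and shows why such a cast fails and how going through Number avoids depending on the driver's concrete type.

```java
// Hypothetical illustration, not the committer's actual code.
final class CountResults {
    private CountResults() {}

    // Fails with ClassCastException when the driver returns a Long.
    static int unsafeToInt(Object scalar) {
        return (Integer) scalar;
    }

    // Accepts Integer, Long, BigDecimal, etc. by going through Number.
    static int toInt(Object scalar) {
        if (scalar instanceof Number) {
            return ((Number) scalar).intValue();
        }
        throw new IllegalArgumentException("Not a numeric result: " + scalar);
    }

    public static void main(String[] args) {
        Object fromDriver = Long.valueOf(1L); // stand-in for a COUNT result
        System.out.println(toInt(fromDriver));
    }
}
```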
I'm trying to use the CopyTagger to copy fields from a source system into a standardized field name.
<tagger class="com.norconex.importer.handler.tagger.impl.CopyTagger">
<copy fromField="og_type" toField="category" overwrite="true"/>
</tagger>
I've experimented with placing these in the preParseHandler and the postParseHandler, using both the original field names and the names assigned when "fixFieldNames" is used. Regardless of what I do, the fields do not end up in the SQL endpoint. I've also noticed that these fields are not automatically created by the "createFieldSQL" statement.
Does the SQL Committer not support these features of Norconex? I am using the December release of the HTTP Collector and the 2.0.0 version of the SQL Committer.
Hello,
I have configured SQLFileCommitter.xml with the following content:
<committer class="com.norconex.committer.sql.SQLCommitter">
<fieldMappings>
<mapping fromField="document.reference" toField="document_reference"/>
<mapping fromField="content" toField="content"/>
</fieldMappings>
</committer>
And I have set up the SQLCommitterConfig using these methods:
SQLCommitterConfig sqlCommitterConfig = new SQLCommitterConfig();
sqlCommitterConfig.setDriverClass("com.mysql.cj.jdbc.Driver");
sqlCommitterConfig.setConnectionUrl("jdbc:mysql://localhost:3306/sys");
sqlCommitterConfig.setCredentials(new Credentials().setUsername("root").setPassword("123456"));
sqlCommitterConfig.setTableName("test");
sqlCommitterConfig.setPrimaryKey("id");
sqlCommitterConfig.setFixFieldNames(true);
sqlCommitterConfig.setFixFieldValues(true);
sqlCommitterConfig.setTargetContentField("content");
XML xml = new XML(Path.of("fraud\\src\\main\\resources\\SQLFileCommitter.xml"));
SQLCommitter sqlCommitter = new SQLCommitter(sqlCommitterConfig);
sqlCommitter.loadFromXML(xml);
sqlCommitter.loadCommitterFromXML(xml);
crawlerConfig.setCommitters(sqlCommitter);
I want to store only some of the extracted fields in MySQL, but I got this exception:
Caused by: java.sql.SQLException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-Backend-Response,Server,x-webkit-csp,Content-Location,x-frame-options,x-cdn-pro' at line 1 Query: INSERT INTO test(X-Backend-Response,Server,x-webkit-csp,Content-Location,x-frame-options,x-cdn-provider,Referrer-Policy,X-SecNG-Response,dc:title,Content-Encoding,Set-Cookie,collector.depth,id,surrogate-control,document_reference,google-site-verification,document.contentEncoding,strict-transport-security,pragma,x-xss-protection,x-idc-id,Cache-Control,document.contentType,Content-Language,expires,document.contentFamily,renderer,collector.redirect-trail,force-rendering,description,title,content,x-edge-timing,x-content-security-policy,X-Cache-Lookup,collector.is-crawl-new,collector.http-fetcher,Content-Length,Content-Type,Transfer-Encoding,X-Parsed-By,Connection,Date,X-UA-Compatible,content-security-policy,x-content-type-options,viewport,x-lb-timing,X-NWS-LOG-UUID,Vary) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) Parameters: [0.028, CLOUD ELB 1.0.0, default-src * blob:; img-src * data: blob: resource: t.captcha.qq.com *.dun.163yun.com *.dun.163.com *.126.net *.nosdn.127.net nos.netease.com;
It seems the insert query was too long and the field mappings did not take effect, so the table did not have enough columns to store the data. How can I deal with that?
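The failing statement lists every metadata field of the document, including names with hyphens, dots and colons that are not valid unquoted column names. Restricting the document to the wanted fields before it reaches the committer (the KeepOnlyTagger shown at the top of this thread is the importer-side way to do that) would shorten the statement considerably. The Java sketch below only illustrates the effect of such filtering on the generated INSERT; the class `InsertBuilder` and its methods are hypothetical and are not the committer's code.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of Norconex.
final class InsertBuilder {
    private InsertBuilder() {}

    // Keeps only the entries whose key is in the allowed set,
    // preserving the original iteration order.
    static Map<String, String> keepOnly(Map<String, String> fields,
                                        Set<String> allowed) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (allowed.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    // Builds a parameterized INSERT with one placeholder per column.
    static String insertSql(String table, Collection<String> columns) {
        String cols = String.join(",", columns);
        String marks = String.join(", ",
                Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + "(" + cols + ") VALUES (" + marks + ")";
    }
}
```

Called with only `title` and `content` allowed, the sketch produces a two-column statement instead of the fifty-column one in the error above.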
Hi,
I ran the collector and have the working folders here. The SQL commit failed as the database was down.
Is it possible to just re-run the committer part without rerunning the collector? The collector took quite a bit of time.
Many thanks.
Hi,
I am using the filesystem crawler with SQL committer. It works great!
Today I tried running two crawlers at the same time.
Consider crawl-one.variables, which has an associated crawl-one.xml:
path = \\storage\path_one
workdir = ./crawl-one
And the same for crawl-two:
path = \\storage\path_two
workdir = ./crawl-two
These are configured to commit to different tables in the same database.
Oddly, the database table for crawl-one contains results from crawl-two, and vice versa.
The 32_Crawler.log within the working directories is pure.
I tried naming the fscollector id and crawler id uniquely in each configuration file but still have the problem.
I am running the 2.9.1 Snapshot of the filesystem collector.
Thinking aloud, perhaps the committer-queue folder is common to both of my processes? I will try running two completely separate 2.9.1 snapshot folders. Regardless, I thought this was worth mentioning.
Norconex/committer-core#9 looks related.
Any advice is invited.
MySQL does not support the CLOB data type. In addition, most fields are declared as VARCHAR(32672), which by itself exceeds the limit when the field uses a multi-byte encoding such as UTF-8. The combined limit is 65532 in text length, so no more than 3 fields can be created with other encodings. To remove this limit, the fields should be defined as a TEXT/BLOB type.
I hope a DB/field type option or parameter, or more universal support, can be added.