committer-sql's People

Contributors

dependabot[bot], essiembre

committer-sql's Issues

How to store a tagger field in a SQL table column when the name is different?

In the createTableSQL tag, I can define columns for the fields that the tagger makes available, i.e. the fields that would go here:

<tagger class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
    <fields>title,description,document.reference,google-site-verification</fields>
</tagger>

The fields obtained vary for each document/site. I use DebugTagger to list them.

I create a table like this:

CREATE TABLE ${tableName} (
      ${targetReferenceField} VARCHAR(32672) NOT NULL, 
      ${targetContentField}  CLOB, 
      title VARCHAR(256),
      description VARCHAR(256),
      googleSiteVerification VARCHAR(256),
      PRIMARY KEY ( ${targetReferenceField} )
  )

How can I match the google-site-verification tagger field to the googleSiteVerification column?
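
One way to think about the question is as a name conversion: if the field is renamed before it reaches the committer (for example with a RenameTagger like the one shown in another issue on this page), the field name and the column name can be made identical. The Java sketch below only illustrates the conversion itself. `FieldToColumn` and `toCamelCase` are hypothetical names used for illustration and are not part of the Norconex API.

```java
public class FieldToColumn {

    /** Converts a hyphen-, dot- or underscore-separated field name to camelCase. */
    static String toCamelCase(String fieldName) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (char c : fieldName.toCharArray()) {
            if (c == '-' || c == '.' || c == '_') {
                // Drop the separator; capitalize what follows unless we are at the start.
                upperNext = sb.length() > 0;
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("google-site-verification")); // googleSiteVerification
    }
}
```

The resulting name could then be used as the `toField` of a rename, so that the column declared in the CREATE TABLE statement receives the value.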

Error while Committing to MYSQL-2

I am getting the following error while committing to mysql.

Test Example: 2018-09-06 20:01:01 INFO - Max queue size reached (10). Committing
Test Example: 2018-09-06 20:01:01 INFO - Committing 10 files
Test Example: 2018-09-06 20:01:01 INFO - Sending 5 commit operations to SQL database.
Test Example: 2018-09-06 20:01:01 INFO - Checking if table "FH_1200" exists...
Test Example: 2018-09-06 20:01:05 INFO - Table "FH_1200" does not exist. Attempting to create it...
Test Example: 2018-09-06 20:01:05 INFO - Table created.
Test Example: 2018-09-06 20:01:05 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:480)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:425)
at com.norconex.committer.core.AbstractCommitter.commitIfReady(AbstractCommitter.java:146)
at com.norconex.committer.core.AbstractCommitter.add(AbstractCommitter.java:97)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:34)
at com.norconex.collector.core.pipeline.committer.CommitModuleStage.execute(CommitModuleStage.java:27)
at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:91)
at com.norconex.collector.http.crawler.HttpCrawler.executeCommitterPipeline(HttpCrawler.java:379)
at com.norconex.collector.core.crawler.AbstractCrawler.processImportResponse(AbstractCrawler.java:595)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextQueuedCrawlData(AbstractCrawler.java:541)
at com.norconex.collector.core.crawler.AbstractCrawler.processNextReference(AbstractCrawler.java:419)
at com.norconex.collector.core.crawler.AbstractCrawler$ProcessReferencesRunnable.run(AbstractCrawler.java:812)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x90\x92\xE2\x9C...' for column 'content' at row 1 Query: INSERT INTO FH_1200(document_reference,Server,keywords,imagepath,description,id,title,Date,content) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)

I am sending the configuration by email with the same subject

Thank You
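
The bytes quoted in the error, `\xF0\x9F\x90\x92`, form a single four-byte UTF-8 sequence. A likely cause, stated here as an assumption because the table definition is not shown, is that the `content` column uses MySQL's three-byte `utf8` character set, which cannot store such characters, whereas `utf8mb4` can. The Java sketch below decodes the bytes from the log and reports how many UTF-8 bytes the character needs. `Utf8WidthCheck` is a hypothetical helper, not part of Norconex.

```java
import java.nio.charset.StandardCharsets;

public class Utf8WidthCheck {

    /** Returns the largest number of UTF-8 bytes needed by any single code point. */
    static int maxUtf8BytesPerCodePoint(String text) {
        return text.codePoints()
                .map(cp -> new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8).length)
                .max()
                .orElse(0);
    }

    public static void main(String[] args) {
        // First four bytes copied from the error message in the log.
        byte[] fromError = {(byte) 0xF0, (byte) 0x9F, (byte) 0x90, (byte) 0x92};
        String decoded = new String(fromError, StandardCharsets.UTF_8);
        // Prints: U+1F412 needs 4 bytes
        System.out.printf("U+%X needs %d bytes%n",
                decoded.codePointAt(0), maxUtf8BytesPerCodePoint(decoded));
    }
}
```

A result of 4 for any document would suggest that the target column needs a four-byte-capable character set, or that such characters have to be removed before committing.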

Incorrect String Value while committing to MYSQL

Dear Mr. Paul,
I am getting the following error while committing to MySQL version 5.7:

Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x91\x89 I...' for column 'content' at row 1 Query: INSERT INTO bangalore_email_test(starturl,document_reference,Server,id,title,email,Date,content)

I raised this issue earlier (#3), and you suggested using ReplaceTagger.

I tried the following config


<transformer class="com.norconex.importer.handler.transformer.impl.ReplaceTransformer" caseSensitive="false">
    <replace>
        <fromValue>[\xF0\x9F\x91\x89]{1,}</fromValue>
        <toValue>newyork</toValue>
    </replace>
</transformer>

And this configuration

<tagger class="com.norconex.importer.handler.tagger.impl.ReplaceTagger">
    <replace replaceAll="true">
        <fromValue>[\xF0\x9F\x91\x89]{1,}</fromValue>
        <toValue>orange</toValue>
    </replace>
</tagger>

I don't need this hexadecimal data. Please advise how to use ReplaceTransformer or ReplaceTagger.

This is the special character I am getting from the web page:

👉

Regards
Balu
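
A likely reason the patterns above do not match, assuming the handlers use Java regular expressions: in Java, `\xF0` matches the single character U+00F0, not a raw byte, and by the time text reaches a tagger or transformer it has already been decoded, so the symbol is one code point (U+1F449) rather than four bytes. The sketch below removes every code point outside the Basic Multilingual Plane, which is exactly the set that needs four bytes in UTF-8. `SupplementaryStripper` is a hypothetical class for illustration, not part of Norconex.

```java
public class SupplementaryStripper {

    /** Returns the input without code points above U+FFFF (4-byte UTF-8 characters). */
    static String stripSupplementary(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        input.codePoints()
                .filter(cp -> !Character.isSupplementaryCodePoint(cp))
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDC49" is the surrogate pair for U+1F449, the character from the error.
        String sample = "Click \uD83D\uDC49 here";
        System.out.println(stripSupplementary(sample)); // "Click  here" (two spaces remain)
    }
}
```

The same idea can probably be written as a regular expression such as `[\x{10000}-\x{10FFFF}]` in a `fromValue`, although that has not been verified against the handlers used here.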

document.reference fieldname invalid

Related to a closed issue:

No matter what I try I keep getting the following exception:

Caused by: java.sql.SQLException: Incorrect table name 'document' Query: ALTER TABLE tablename ADD document.reference VARCHAR(5000)

My latest config to get rid of document.reference:

<tagger class="com.norconex.importer.handler.tagger.impl.RenameTagger">
    <rename fromField="document.reference" toField="url" overwrite="true" />
</tagger>

<tagger class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
    <fields>title,keywords,description,url</fields>
</tagger>
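
The error message is consistent with how SQL reads a dot: `document.reference` is most likely parsed as a column named `reference` qualified by a table named `document`. A field name therefore has to be turned into a plain identifier before it is used as a column name. The Java sketch below shows one such conversion. `ColumnNames.toSqlIdentifier` is a hypothetical helper and does not claim to reproduce what the committer's own field-name fixing does.

```java
public class ColumnNames {

    /** Replaces every character that is not a letter, digit or underscore with '_'. */
    static String toSqlIdentifier(String fieldName) {
        String cleaned = fieldName.replaceAll("[^A-Za-z0-9_]", "_");
        // Prefix names that start with a digit so they remain valid unquoted identifiers.
        if (!cleaned.isEmpty() && Character.isDigit(cleaned.charAt(0))) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        System.out.println(toSqlIdentifier("document.reference")); // document_reference
    }
}
```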

SQLCommitter not storing data

I am trying to store data in a MySQL table using the SQLCommitter, but I cannot get it to work because I don't understand where to map the fields that are then converted to MySQL table columns. I do get the required data in a JSON file when using a JSONFileCommitter. If you could share an example of the config file and the tags I should be using, it would be a great help.

Error while committing to MYSQL

I am getting the following errors while using SQL Committer V2. Can you please help?

  1. java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
  2. com.norconex.committer.core.CommitterException: Could not commit batch to database.
    Caused by: java.sql.SQLException: Data source is closed

Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: 100% completed (323 processed/323 total)
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Deleting orphan references (if any)...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Deleted 0 orphan references...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Hello Hotel Example: Crawler finishing: committing documents.
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Committing 89 files
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Checking if table "crawler7" exists...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Table "crawler7" does not exist. Attempting to create it...
Hello Hotel Example: 2018-08-09 21:59:33 INFO - Table created.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Done sending commit operations to database.
Hello Hotel Example: 2018-08-09 21:59:34 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:34 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:39 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:39 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:44 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:44 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:49 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:49 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:54 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:54 ERROR - Could not commit batched operations.
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Sending 10 commit operations to SQL database.
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Hello Hotel Example: Crawler executed in 8 minutes 32 seconds.
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Hello Hotel Example: Closing sitemap store...
Hello Hotel Example: 2018-08-09 21:59:59 ERROR - Execution failed for job: Hello Hotel Example
com.norconex.committer.core.CommitterException: Could not commit batch to database.
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:479)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.cacheOperationAndCommitIfReady(AbstractBatchCommitter.java:208)
at com.norconex.committer.core.AbstractBatchCommitter.commitAddition(AbstractBatchCommitter.java:143)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:222)
at com.norconex.committer.sql.SQLCommitter.commit(SQLCommitter.java:424)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:296)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:168)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:132)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: java.sql.SQLException: Data source is closed
at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2016)
at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319)
at org.apache.commons.dbutils.QueryRunner.query(QueryRunner.java:327)
at com.norconex.committer.sql.SQLCommitter.runExists(SQLCommitter.java:584)
at com.norconex.committer.sql.SQLCommitter.recordExists(SQLCommitter.java:575)
at com.norconex.committer.sql.SQLCommitter.sqlInsertDoc(SQLCommitter.java:527)
at com.norconex.committer.sql.SQLCommitter.addOperation(SQLCommitter.java:508)
at com.norconex.committer.sql.SQLCommitter.commitBatch(SQLCommitter.java:465)
... 15 more
Hello Hotel Example: 2018-08-09 21:59:59 INFO - Running Hello Hotel Example: END (Thu Aug 09 21:51:26 IST 2018)
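
The first trace fails in `runExists` with a cast from `Long` to `Integer`. One plausible explanation, which is an assumption because the committer source is not shown here, is that a scalar query result such as a row count comes back from the MySQL driver as a `Long` and is then cast directly to `Integer`. The Java sketch below reproduces that failure and shows a cast through `Number` that works for either type. `CountResult` is a hypothetical class for illustration.

```java
public class CountResult {

    /** Converts a scalar query result to int, whatever numeric type the driver returned. */
    static int toInt(Object scalar) {
        if (scalar == null) {
            return 0;
        }
        return ((Number) scalar).intValue();
    }

    public static void main(String[] args) {
        Object fromDriver = Long.valueOf(1L); // stand-in for a value returned by a query
        try {
            Integer broken = (Integer) fromDriver; // throws ClassCastException
            System.out.println(broken);
        } catch (ClassCastException e) {
            System.out.println("Direct cast failed");
        }
        System.out.println(toInt(fromDriver)); // 1
    }
}
```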

CopyTagger fields do not end up in SQL

I'm trying to use the CopyTagger to copy fields from a source system into a standardized field name.

<tagger class="com.norconex.importer.handler.tagger.impl.CopyTagger">
    <copy fromField="og_type" toField="category" overwrite="true"/>
</tagger>

I've experimented with placing these in the preParseHandler and the postParseHandler as well as using the original names of the fields as well as the value that is assigned when "fixFieldNames" is used. Regardless of what I do, the fields do not end up in the SQL endpoint. I've also noticed that these fields are not automatically created by the "createFieldSQL" statement.

Does the SQL Committer not support these features of Norconex? I am using the December release of the HTTP Collector and the 2.0.0 version of the SQL Committer.

I got an issue in mapping the fields.

Hello,

I have configured SQLFileCommitter.xml with the following content:

<committer class="com.norconex.committer.sql.SQLCommitter">
    <fieldMappings>
        <mapping fromField="document.reference" toField="document_reference"/>
        <mapping fromField="content" toField="content"/>
    </fieldMappings>
</committer>

And I have set the SQLCommitterConfig by using the methods:

    SQLCommitterConfig sqlCommitterConfig = new SQLCommitterConfig();

    sqlCommitterConfig.setDriverClass("com.mysql.cj.jdbc.Driver");
    sqlCommitterConfig.setConnectionUrl("jdbc:mysql://localhost:3306/sys");
    sqlCommitterConfig.setCredentials(new Credentials().setUsername("root").setPassword("123456"));
    sqlCommitterConfig.setTableName("test");
    sqlCommitterConfig.setPrimaryKey("id");
    sqlCommitterConfig.setFixFieldNames(true);
    sqlCommitterConfig.setFixFieldValues(true);
    sqlCommitterConfig.setTargetContentField("content");

    XML xml = new XML(Path.of("fraud\\src\\main\\resources\\SQLFileCommitter.xml"));
    SQLCommitter sqlCommitter = new SQLCommitter(sqlCommitterConfig);
    sqlCommitter.loadFromXML(xml);
    sqlCommitter.loadCommitterFromXML(xml);
    crawlerConfig.setCommitters(sqlCommitter);

I want to store only some of the extracted fields in MySQL, but I got this exception:
Caused by: java.sql.SQLException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-Backend-Response,Server,x-webkit-csp,Content-Location,x-frame-options,x-cdn-pro' at line 1 Query: INSERT INTO test(X-Backend-Response,Server,x-webkit-csp,Content-Location,x-frame-options,x-cdn-provider,Referrer-Policy,X-SecNG-Response,dc:title,Content-Encoding,Set-Cookie,collector.depth,id,surrogate-control,document_reference,google-site-verification,document.contentEncoding,strict-transport-security,pragma,x-xss-protection,x-idc-id,Cache-Control,document.contentType,Content-Language,expires,document.contentFamily,renderer,collector.redirect-trail,force-rendering,description,title,content,x-edge-timing,x-content-security-policy,X-Cache-Lookup,collector.is-crawl-new,collector.http-fetcher,Content-Length,Content-Type,Transfer-Encoding,X-Parsed-By,Connection,Date,X-UA-Compatible,content-security-policy,x-content-type-options,viewport,x-lb-timing,X-NWS-LOG-UUID,Vary) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) Parameters: [0.028, CLOUD ELB 1.0.0, default-src * blob:; img-src * data: blob: resource: t.captcha.qq.com *.dun.163yun.com *.dun.163.com *.126.net *.nosdn.127.net nos.netease.com;

It seems the insert query was too long and the field mapping didn't work, so I didn't have enough columns to store the data. How can I deal with that?
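
The failing statement lists every extracted field, including names such as `X-Backend-Response` that are not valid as unquoted column names, so reducing the field set before the commit is one way to approach this (the KeepOnlyTagger configurations in other issues on this page do that in the importer). The Java sketch below shows the filtering idea on a plain map. `FieldFilter.keepOnly` is a hypothetical helper, and the field values are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class FieldFilter {

    /** Returns only the entries whose field name is in the allowed set, in original order. */
    static Map<String, String> keepOnly(Map<String, String> fields, Set<String> allowed) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            if (allowed.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("title", "Example page");        // illustrative values only
        extracted.put("X-Backend-Response", "0.028");
        extracted.put("content", "Body text");

        // Only the columns that actually exist in the target table are kept.
        System.out.println(keepOnly(extracted, Set.of("title", "content")).keySet());
    }
}
```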

Running committer without running collector again

Hi,

I ran the collector and have the working folders here. The SQL commit failed as the database was down.

Is it possible to just re-run the committer part without rerunning the collector? The collector took quite a bit of time.

Many thanks.

SQL commits jumbled when running two crawlers at the same time

Hi,

I am using the filesystem crawler with SQL committer. It works great!

Today I tried running two crawlers at the same time.

Consider crawl-one.variables, which has an associated crawl-one.xml:

path = \\storage\path_one 
workdir = ./crawl-one

And the same for crawl-two:

path = \\storage\path_two
workdir = ./crawl-two

These are configured to commit to different tables in the same database.

Oddly the database table for crawl-one contains results from crawl-two, and so forth.

The 32_Crawler.log within the working directories is pure

I tried naming the fscollector id and crawler id uniquely in each configuration file but still have a problem.

I am running the 2.9.1 Snapshot of the collector filesystem.

Thinking aloud, the committer-queue folder is perhaps common to both of my processes? I will try running two completely separate 2.9.1 snapshot folders. Regardless I thought worth mentioning this.

Norconex/committer-core#9 looks related.

Any advice is invited.

Table structure is not suitable for MySQL

MySQL does not support the CLOB data type. Most fields are declared as VARCHAR(32672), but this alone exceeds the limit if a field is encoded in UTF-8 or similar. The combined limit is 65532 in text length, so no more than 3 fields can be created with other encodings. To remove this limit, the fields should be defined as a TEXT/BLOB type.

Hope there is a DB/Field Type option/param or more universal support.
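
The arithmetic behind this can be sketched as follows. MySQL limits a row to 65,535 bytes, and a VARCHAR column counts with its maximum byte length plus length bytes. The Java sketch below is a simplified estimate that ignores NULL flags and any other columns in the table. `RowSizeEstimate` is a hypothetical class for illustration.

```java
public class RowSizeEstimate {

    static final int MAX_ROW_BYTES = 65_535; // MySQL row size limit

    /** Maximum bytes a VARCHAR(chars) may occupy: character data plus 2 length bytes. */
    static long varcharBytes(int chars, int bytesPerChar) {
        return (long) chars * bytesPerChar + 2;
    }

    /** True if the given number of identical VARCHAR columns fits in one row. */
    static boolean fits(int columns, int chars, int bytesPerChar) {
        return columns * varcharBytes(chars, bytesPerChar) <= MAX_ROW_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(varcharBytes(32672, 3)); // 98018, already over the limit
        System.out.println(fits(2, 32672, 1));      // true  (65348 bytes)
        System.out.println(fits(3, 32672, 1));      // false (98022 bytes)
    }
}
```

Under these assumptions a single VARCHAR(32672) column does not fit with a three- or four-byte encoding, which supports the suggestion to use TEXT/BLOB types, since those are stored mostly outside the row.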
