webcuratortool / webcurator-v2-legacy

The Web Curator Tool is a tool for managing the selective web harvesting process (moved from SourceForge). https://webcurator.slack.com https://webcuratortool.readthedocs.io

Home Page: http://dia-nz.github.io/webcurator/

License: Apache License 2.0

Batchfile 0.07% HTML 6.56% Java 81.87% JavaScript 10.63% CSS 0.45% Arc 0.18% Shell 0.08% PLpgSQL 0.17%

webcurator-v2-legacy's People

Contributors

ahamblyn, bitzl, gmlewis, hannakoppelaar, hhazewinkel, kurtlenfesty, obrienben, stefan-it

webcurator-v2-legacy's Issues

D1.6 Harvest Scheduler harvester-type

Harvest scheduling needs to check the harvester-type of a Target Instance. Harvests need to be assigned to a Harvest Agent of the same harvester-type:

  • Check Target Instance harvester-type and assign to correct Harvest Agent
  • Show Harvest Agents of correct harvester-type in 'Harvest Now'
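The matching rule described above can be sketched roughly as follows. This is only an illustration: the `Agent` class, the `agentsFor` method and the type labels "H1" and "H3" are hypothetical stand-ins, not names taken from the WCT codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HarvesterTypeMatch {
    // Hypothetical stand-in for a Harvest Agent record; not a WCT class.
    static final class Agent {
        final String name;
        final String harvesterType;

        Agent(String name, String harvesterType) {
            this.name = name;
            this.harvesterType = harvesterType;
        }
    }

    // Returns only the agents whose harvester-type equals that of the Target Instance.
    // The same filtered list could back both scheduling and the 'Harvest Now' screen.
    static List<Agent> agentsFor(String targetInstanceType, List<Agent> agents) {
        List<Agent> matching = new ArrayList<>();
        for (Agent agent : agents) {
            if (agent.harvesterType.equalsIgnoreCase(targetInstanceType)) {
                matching.add(agent);
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        List<Agent> agents = Arrays.asList(
                new Agent("agent-a", "H1"),
                new Agent("agent-b", "H3"));
        for (Agent agent : agentsFor("H3", agents)) {
            System.out.println(agent.name);
        }
    }
}
```

An empty result would mean no agent of the right type is available, in which case the harvest presumably has to wait rather than fall back to an agent of another type.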

D1.1 Move H3 implementation into its own Harvest Agent

Separate old Heritrix and H3 Harvest Agent code into different modules. The Harvest Agents can be preconfigured to a specific crawler. Abstract out a common library of the Harvest Agent base classes/interfaces for quicker building of new Harvest Agent types.

Subtasks:

  • Duplicate the Harvest Agent Module and revert it back to original.
  • Build base Harvest Agent library for new Harvest Agents to extend.
  • Modify H3 Harvest Agent to use Harvest Agent base library.

JBoss SOAP jars do not support latest Rosetta SDK

There's an issue with the JBoss client jars not working properly with the latest Rosetta SDK. This causes exceptions when the Rosetta SDK is installed.

The fix seems to be to upgrade the entire jaxb/jaxws stack to 2.3.1 and use the jaxws-rt instead of the JBoss jars.

Emails may only have TLDs with at most 4 characters

With new TLDs like .museum, it is possible to have email addresses with a TLD longer than 4 characters.

The TLD is restricted in ValidatorUtil to 4 letters:

public final static String EMAIL_VALIDATION_REGEX = "^[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,4})$";

To fix this, set the last number to something larger, like

public final static String EMAIL_VALIDATION_REGEX = "^[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,10})$";
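A small check of the two patterns quoted above, as a sketch. The class name `EmailTldCheck` and the sample address are made up for illustration; only the two regular expressions come from the issue.

```java
import java.util.regex.Pattern;

class EmailTldCheck {
    // Pattern currently in ValidatorUtil, as quoted in the issue: TLD of 2 to 4 letters.
    static final String OLD_REGEX =
            "^[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,4})$";

    // Proposed pattern from the issue: TLD of 2 to 10 letters.
    static final String NEW_REGEX =
            "^[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,10})$";

    static boolean matches(String regex, String email) {
        return Pattern.matches(regex, email);
    }

    public static void main(String[] args) {
        String email = "curator@example.museum"; // hypothetical address
        System.out.println("old pattern accepts: " + matches(OLD_REGEX, email));
        System.out.println("new pattern accepts: " + matches(NEW_REGEX, email));
    }
}
```

Note that an upper bound of 10 is still a fixed limit; longer TLDs would be rejected in the same way.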

Purge directories not working

Hi all,

I have problems purging the finished and temporary directories. I updated the properties to reduce the time, but it doesn't purge anything from the directories.

Can you explain how it works?

Thank you so much

Consistent naming for SQL files.

The SQL files needed for installing WCT all have filenames that include the name of the database (mysql, etc.). The wct-schema-grants.sql file is an exception: it is used for both Oracle and PostgreSQL.
It would make the install clearer if we always added the database name to the .sql files. So I propose to duplicate the wct-schema-grants.sql file as both wct-schema-grants-oracle.sql and wct-schema-grants-postgresql.sql (schema-grants-mysql.sql already exists).

Can't review harvest - NonUniqueResultException

Occasionally an error occurs with a new harvest and the review screen is blocked by a Hibernate error:

org.hibernate.NonUniqueResultException: query did not return a unique result: 2
        at org.hibernate.impl.AbstractQueryImpl.uniqueElement(AbstractQueryImpl.java:762)
        at org.hibernate.impl.AbstractQueryImpl.uniqueResult(AbstractQueryImpl.java:749)
        at org.webcurator.domain.TargetInstanceDAOImpl$8.doInHibernate(TargetInstanceDAOImpl.java:280)
        at org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:366)
        at org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:334)
        at org.webcurator.domain.TargetInstanceDAOImpl.getHarvestResourceDTO(TargetInstanceDAOImpl.java:275)
        at org.webcurator.ui.tools.controller.QualityReviewToolController.load(QualityReviewToolController.java:132)
        at org.webcurator.ui.tools.controller.QualityReviewToolController.handle(QualityReviewToolController.java:99)

Inside the Harvest_Resource table there will be a duplicate seed URL for this Target Instance. How this occurs is still not known.
A quick fix is to unlink the Harvest_Resource row from its corresponding Harvest_Result, so that it is not returned by the Hibernate query above. The HRC_HARVEST_RESULT_OID column can be set to a value that doesn't conflict with another Harvest Result.

D1.2 H3 Harvest Agent recover H3 jobs on startup

Check for existing H3 jobs on Harvest Agent start-up and recover any. Currently if a HA is restarted while it has running crawls, then those Target Instances in WCT will be orphaned, and the jobs will continue to run in H3.

Subtasks:

  • HA queries H3 instance on start-up for jobs
  • Re-initialize any existing jobs recovered, so they are polled back to Core.

Heartbeats dying from H3 Harvest Agent

It appears the heartbeat trigger is randomly dying and not resuming in the H3 Harvest Agent, only being fixed by a restart.
Other interactions with the Harvest Agent can still occur once this state happens:

  • new harvests can be started, and H3 will run them, but no status updates will come back once the heartbeats have stopped.
  • completed harvests in H3 will be detected and the completed job scheduled task will still run in the Harvest Agent.

D1.3 QA Tree View H3 compatibility

Update the tree view to handle newer WARC versions for H3. It currently uses the old Heritrix 1.14.1 library for reading and writing WARCs.

Incomplete H3 support in Target Instance Summary

In the Profile Overrides section only H1 and imported H3 profiles are fully supported. Non-imported H3 overrides are ignored. For example, if you set the robot policy in an H3 overrides section to 'ignore' (as in the image here), you will find that this setting is not picked up when you start a new harvest using the 'Harvest Now' button. Set a breakpoint in HarvestCoordinatorImpl#getHarvestProfileString to see the overrides not being applied.

screenshot_2018-10-31 web curator tool target instance qa summary

Add tick all for Role permissions

Add a 'tick all' toggle for the Roles permissions screen.
It is very tedious to tick all those boxes manually for administrators.

image

D1.4.6.1 Editor for imported profiles

Create an XML editor for imported profiles (under Management->Harvester Configuration), since the current profile editor can only be used for profiles that were created by WCT.

D1.4.6 H3 profile import functionality

Profile import functionality to handle H3 profiles.
This should also include validation of the H3 profile, using an H3 instance, H3 libraries, or some other means.

Deleting secondary seed in target record post-harvest

If a target has a secondary seed and this seed is deleted post-harvest (but before endorsing, archiving or rejecting), then the secondary seed also disappears from the review tool. The primary seed harvest can still be reviewed after deleting the secondary seed, however the original seed/s should not change.

Heatmap colours don't reflect the numbers they're supposed to represent

The colour coding in the heatmap doesn't tally with the numbers it is supposed to represent. For example, the levels are set in the bandwidth tab in WCT:
Green (between 1-6 scheduled harvests)
Yellow (7-14 harvests)
Red (12+ harvests)

When I look at today’s schedule – I see that:
The 12 January actually has 18 harvests scheduled (so should be red, not green)
The 13th has 20 (should also be red, not yellow)
The 14th has 22 (shows correctly as red)
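The expected mapping can be sketched as a simple threshold check. This is not WCT's actual heatmap code: the `HeatmapBands` class and `colourFor` method are hypothetical, and because the Yellow (7-14) and Red (12+) ranges quoted above overlap, the sketch takes the band limits as parameters rather than assuming how the overlap is resolved.

```java
class HeatmapBands {
    enum Colour { NONE, GREEN, YELLOW, RED }

    // Maps a number of scheduled harvests to a colour band.
    // greenMax and yellowMax are the inclusive upper limits of the green and yellow bands.
    static Colour colourFor(int scheduled, int greenMax, int yellowMax) {
        if (scheduled <= 0) {
            return Colour.NONE;
        }
        if (scheduled <= greenMax) {
            return Colour.GREEN;
        }
        if (scheduled <= yellowMax) {
            return Colour.YELLOW;
        }
        return Colour.RED;
    }

    public static void main(String[] args) {
        // Counts reported in the issue for the 12th, 13th and 14th.
        int[] counts = {18, 20, 22};
        for (int count : counts) {
            System.out.println(count + " -> " + colourFor(count, 6, 14));
        }
    }
}
```

With limits of 6 and 14, all three reported counts fall in the red band, which matches what the reporter expected to see.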

D1.4.3 Add harvester-type elements to Profile management UI

On the Profile Management screen, add harvester type as a filter and as a column in the profile list table. A combo box with the available harvester types is to be added next to the 'Create new' button and will require an option to be selected (possibly via a JavaScript check).

LogReader failing in WCT-Store

This bug was introduced after #20.
Since the aheritrix dependency was removed from wct-store, viewing the crawl logs fails for completed harvests:
org.webcurator.core.exceptions.WCTRuntimeException: Failed to invoke tail on the SOAP service : java.lang.reflect.InvocationTargetException

This is due to wct-store's use of LogReaderImpl in the wct-core dependency, which is why the build of wct-store is still successful. For instance, the tail method in LogReaderImpl invokes org.archive.crawler.util.LogReader.tail() (line 71).

After a quick look, I can't see a replacement method within the webarchive-commons lib for the log reader calls. So I'm not sure whether it is worth the effort (and possible) now to try and shift all the remaining aheritrix dependent calls in wct-core to use webarchive-commons or an equivalent, or just allow the aheritrix dependency back into wct-store and park the issue for now.

Switch to using com.exlibris.dps:dps-sdk-fat-all:5.5.0 jar

Currently the com.exlibris:dps-sdk:5.5.0 jar is checked into the webcurator codebase and installed manually by maven. Unfortunately this jar does not have a pom with dependencies. The github project https://github.com/NLNZDigitalPreservation/rosetta-dps-sdk-projects-maven-lib corrects this issue by providing an uploadable Rosetta dps-sdk jar with a maven pom containing all its first-order dependencies. See that project's README.md for more details.

The fix for #87 involves upgrading the Rosetta SDK to 5.5.0. As part of that fix, the rosetta-dps-sdk-projects-maven-lib pom has been copied and checked into the webcurator project. This is a short-term fix.

Long-term it would be better if webcurator used the rosetta-dps-sdk-projects-maven-lib project directly to provide a Maven pom.

This issue may become moot if newer versions of the Rosetta SDK are provided with a proper maven pom (or are published to something like maven central).

Crawler activity report doesn't always run properly

(Same as SourceForge #152.) New example:
I can’t run the Oct-Dec 2015 crawler report in one go – it just won’t generate. So I ran the October report by itself – which ran fine and displayed the full month on screen. However when I saved it as a CSV file it only saved data up to 8th October and not right through to 30th October. I ran it twice and it did the same thing.
When I tried to run the November stats separately all I got was a blank page.
December ran okay and saved as a CSV file okay.

D2.1 Fix and consolidate database scripts

Fix and test database scripts:

  • all DML/DDL files that are needed to create the 1.7.x WCT database from scratch.
  • all DML/DDL files that are needed to upgrade from 1.6.3 to 1.7.x

Note that there are separate sets of scripts for each of the platforms MySQL, PostgreSQL, Oracle and H2.

Wrong SQL file names in documentation

The System Administrator Guide states that one should run

sql/wct-schema-1_6_1-mysql.sql

while the file is called

sql/wct-schema-1_6-mysql.sql

The same applies to PostgreSQL and Oracle.

D2.2 Update Documentation

Update documentation to reflect changes in release 1.7.x:

  • correct install instructions in System Administrator Guide
  • description of changed functionality in User Manual
  • updated Data Dictionary

D1.4.6.2 Editor for overrides of imported profiles

Under the Profile tab in the Target screen the user can override certain configuration options in the profile. This only works for profiles created by WCT. For imported profiles we will add the option to edit the raw XML, which will then be stored at the target level and completely override the profile that was originally selected for this target.

Ensure all pre 2.0.0 documentation is captured in the new readthedocs documentation

Do a comprehensive review of all existing WCT documentation (usually found in .md, .pdf and .txt files) and ensure that the documentation is captured in the readthedocs documentation (found under the /docs branch).

All pre-2.0.0 documents will be split out into a legacy-docs project, so they shouldn't be deleted from the repository.

Change the root README.md to refer to the readthedocs documentation.

D1.7 Profile Updates to Running H3 Crawls

Add new functionality to Target Instance to allow updating of profile for a running H3 crawl. Functionality would be available via the Profile tab.

Might need to investigate which parts of the profile can be updated once a crawl has started.

Target Instance screen fails to load

With a specific configuration of an H3 Target's profile overrides, the database gets into a bad state, and searches of Target Instances that include a TI for this Target fail and throw this error:

org.springframework.orm.hibernate3.HibernateSystemException: exception setting property value with CGLIB (set hibernate.cglib.use_reflection_optimizer=false for more info) setter of org.webcurator.domain.model.core.ProfileOverrides.setOverrideH3DocumentLimit; nested exception is org.hibernate.PropertyAccessException: exception setting property value with CGLIB (set hibernate.cglib.use_reflection_optimizer=false for more info) setter of org.webcurator.domain.model.core.ProfileOverrides.setOverrideH3DocumentLimit

Drools exceptions in QA module

When the automated QA tests run after a harvest has finished, exceptions appear in the log: apparently the classes compiled by Drools are invalid. This might be a JDK8 compatibility issue.

What to look for:

2018-12-27 16:09:22,873 INFO [http-nio-8080-exec-141] rules.QaRecommendationServiceImpl (QaRecommendationServiceImpl.java:105) - Loading rules file rules.drl
**** COMPILER BUG! REPORT THIS IMMEDIATELY AT http://jira.codehaus.org/browse/mvel2

And:

java.lang.VerifyError: (class: ASMAccessorImpl_15846973001545923363520, method: getKnownEgressType signature: ()Ljava/lang/Class;) Illegal type in constant pool
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.newInstance(Class.java:412)

Invalid XML for harvest notes with URLs

URLs in wct:Note sections are not properly escaped. This is default JSP behavior and leads to an invalid XML file, because a parser interprets an & in the URL parameters as the start of an entity reference.

Example:

<wct:Note>
Seed is a  redirect to http://example.com?from=outside&subset=latin
</wct:Note>

In this case, an XML parser would expect &subset to be terminated with a ;. Valid XML would be

<wct:Note>
Seed is a  redirect to http://example.com?from=outside&amp;subset=latin
</wct:Note>
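The escaping itself is small. The sketch below is a hand-written helper (the `XmlTextEscaper` class is hypothetical, not part of WCT) that covers the three characters that matter in element text. In a JSP, JSTL's `<c:out>` tag or the `fn:escapeXml` function performs the same escaping; whether WCT's templates can use them directly is an assumption the issue does not confirm.

```java
class XmlTextEscaper {
    // Escapes the characters that must not appear literally in XML element text.
    static String escape(String text) {
        StringBuilder escaped = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    escaped.append("&amp;");
                    break;
                case '<':
                    escaped.append("&lt;");
                    break;
                case '>':
                    escaped.append("&gt;");
                    break;
                default:
                    escaped.append(c);
            }
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        // The note text from the example above.
        System.out.println(escape("Seed is a redirect to http://example.com?from=outside&subset=latin"));
    }
}
```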

Ability to patch crawl with H3

Add the ability for the QA Import function to patch crawl using H3.

Also move the patch crawling from Core out to the Harvest Agents. This should help fix a lot of the medium-large size patch crawling that fails.

Problems with a http reverse proxy

At the moment there are problems when you have an HTTP reverse proxy such as nginx in front of WCT and want to use SSL.

The browser always complains about mixed content and it is not even possible to log into the system.

The problem is, that a base path for the relative urls is used in all of the templates, e.g.:
https://github.com/DIA-NZ/webcurator/blob/master/wct-core/src/main/webapp/logon.jsp#L3

It is not possible to define the value, because it is always read from the request.
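One possible direction, sketched under assumptions: let an optional configured base URL take precedence over the value derived from the request. The `BaseUrlResolver` class, its `resolve` method and the idea of a configurable base URL setting are all hypothetical here and are not existing WCT features.

```java
class BaseUrlResolver {
    // Returns the configured base URL when one is set, otherwise builds it
    // from the request's scheme, host, port and context path.
    static String resolve(String configuredBaseUrl, String scheme, String host,
                          int port, String contextPath) {
        String base;
        if (configuredBaseUrl != null && !configuredBaseUrl.trim().isEmpty()) {
            // Behind a TLS-terminating proxy, the externally visible URL is configured here.
            base = configuredBaseUrl.trim();
        } else {
            boolean defaultPort = ("http".equals(scheme) && port == 80)
                    || ("https".equals(scheme) && port == 443);
            base = scheme + "://" + host + (defaultPort ? "" : ":" + port) + contextPath;
        }
        // A base path for relative URLs should end with a slash.
        return base.endsWith("/") ? base : base + "/";
    }

    public static void main(String[] args) {
        // Hypothetical values: the proxy serves https, Tomcat sees plain http on 8080.
        System.out.println(resolve("https://wct.example.org/wct", "http", "localhost", 8080, "/wct"));
        System.out.println(resolve(null, "http", "localhost", 8080, "/wct"));
    }
}
```

With a configured value, the templates would emit https links even though the proxied request arrives over plain http, which is the case that triggers the mixed-content complaint.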

Target Profile Override UI Bug in IE

Viewing the Profile screen for a Target in Internet Explorer shows the profile override fields for H1 and H3. The Javascript used to hide one set of overrides isn't working.
Tested in IE11.

image

Make profile override option for robots match profile management

The UI field for 'Ignore Robots' in the H3 profile overrides for a Target doesn't match the UI in the main profile management.

Profile override
image

Main profile management
image

Change the profile override screen to use the checkbox. The checkbox is simpler and less ambiguous.

D1.5 Profile Overrides for H3 Crawls

Enhance Target profile overrides to work for H3 crawls. This should include the use of the 'harvester type' construct, e.g. the user needs to select the type of harvester the Target will be crawled on.

Remove db-specific config from binary

If you create a binary there will be some redundant (and presently undocumented) config variables in several files, pertaining to the database connection. These variables should be removed.

Cannot review Harvests or view Tree views

When trying to review harvests (any), in Web curator 1.6.2 (similar error appears in TreeView), the following error is written in wct-das.log:

2017-10-06 14:14:44,965 WARN  [http-nio-8080-exec-10] attachments.AttachmentsImpl (AttachmentsImpl.java:558) - Exception:
AxisFault
 faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
 faultSubcode: 
 faultString: java.lang.NullPointerException
 faultActor: 
 faultNode: 
 faultDetail: 
	{http://xml.apache.org/axis/}stackTrace:java.lang.NullPointerException
	at javax.activation.MimetypesFileTypeMap.getContentType(MimetypesFileTypeMap.java:286)
	at javax.activation.FileDataSource.getContentType(FileDataSource.java:126)
	at javax.activation.DataHandler.getContentType(DataHandler.java:205)
	at org.apache.axis.attachments.AttachmentPart.<init>(AttachmentPart.java:82)
	at org.apache.axis.attachments.AttachmentsImpl.createAttachmentPart(AttachmentsImpl.java:272)
	at org.apache.axis.encoding.ser.JAFDataHandlerSerializer.serialize(JAFDataHandlerSerializer.java:66)
	at org.apache.axis.encoding.SerializationContext.serializeActual(SerializationContext.java:1504)
	at org.apache.axis.encoding.SerializationContext.serialize(SerializationContext.java:877)
	at org.apache.axis.encoding.SerializationContext.serialize(SerializationContext.java:801)
	at org.apache.axis.message.RPCParam.serialize(RPCParam.java:208)
	at org.apache.axis.message.RPCElement.outputImpl(RPCElement.java:433)
	at org.apache.axis.message.MessageElement.output(MessageElement.java:1208)
	at org.apache.axis.message.SOAPBody.outputImpl(SOAPBody.java:139)
	at org.apache.axis.message.SOAPEnvelope.outputImpl(SOAPEnvelope.java:478)
	at org.apache.axis.message.MessageElement.output(MessageElement.java:1208)
	at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:315)
	at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:269)
	at org.apache.axis.SOAPPart.saveChanges(SOAPPart.java:530)
	at org.apache.axis.attachments.AttachmentsImpl.getAttachmentCount(AttachmentsImpl.java:554)
	at org.apache.axis.Message.getContentType(Message.java:486)
	at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:775)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:648)
	at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:502)
	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1132)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1533)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1489)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.lang.Thread.run(Thread.java:745)

	{http://xml.apache.org/axis/}hostname:padi1-wct-pro-new

java.lang.NullPointerException
	at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
	at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:317)
	at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:269)
	at org.apache.axis.SOAPPart.saveChanges(SOAPPart.java:530)
	at org.apache.axis.attachments.AttachmentsImpl.getAttachmentCount(AttachmentsImpl.java:554)
	at org.apache.axis.Message.getContentType(Message.java:486)
	at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:775)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:648)
	at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:504)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:502)
	at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1132)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1533)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1489)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at javax.activation.MimetypesFileTypeMap.getContentType(MimetypesFileTypeMap.java:286)
	at javax.activation.FileDataSource.getContentType(FileDataSource.java:126)
	at javax.activation.DataHandler.getContentType(DataHandler.java:205)
	at org.apache.axis.attachments.AttachmentPart.<init>(AttachmentPart.java:82)
	at org.apache.axis.attachments.AttachmentsImpl.createAttachmentPart(AttachmentsImpl.java:272)
	at org.apache.axis.encoding.ser.JAFDataHandlerSerializer.serialize(JAFDataHandlerSerializer.java:66)
	at org.apache.axis.encoding.SerializationContext.serializeActual(SerializationContext.java:1504)
	at org.apache.axis.encoding.SerializationContext.serialize(SerializationContext.java:877)
	at org.apache.axis.encoding.SerializationContext.serialize(SerializationContext.java:801)
	at org.apache.axis.message.RPCParam.serialize(RPCParam.java:208)
	at org.apache.axis.message.RPCElement.outputImpl(RPCElement.java:433)
	at org.apache.axis.message.MessageElement.output(MessageElement.java:1208)
	at org.apache.axis.message.SOAPBody.outputImpl(SOAPBody.java:139)
	at org.apache.axis.message.SOAPEnvelope.outputImpl(SOAPEnvelope.java:478)
	at org.apache.axis.message.MessageElement.output(MessageElement.java:1208)
	at org.apache.axis.SOAPPart.writeTo(SOAPPart.java:315)
	... 29 more

And the web page shows 'Resource not found', 'url cannot be found'.

I've traced the tomcat process to view if these attachments were being created, and they were:

19775 14:14:16 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4192013112433590662.att",  <unfinished ...>
19775 14:14:16 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4192013112433590662.att", O_RDWR|O_CREAT|O_EXCL, 0666 <unfinished ...>
19775 14:14:16 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4192013112433590662.att", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>
19775 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4192013112433590662.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/convenis-20171006115308-00000.arc.gz") = -1 ENOSYS (Function not implemented)
19775 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/convenis-20171006115308-00000.arc.gz", O_RDONLY <unfinished ...>
19775 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/convenis-20171006115308-00000.arc.gz" <unfinished ...>
19768 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2119624094982458618.att",  <unfinished ...>
19768 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2119624094982458618.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19768 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2119624094982458618.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19768 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2119624094982458618.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/uri-errors.log" <unfinished ...>
19768 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/uri-errors.log", O_RDONLY) = -1 ENOSYS (Function not implemented)
19768 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/uri-errors.log" <unfinished ...>
19767 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2567056526070013221.att", 0x7f1589d15b80) = -1 ENOSYS (Function not implemented)
19767 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2567056526070013221.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19767 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2567056526070013221.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19767 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2567056526070013221.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl.log") = -1 ENOSYS (Function not implemented)
19767 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl.log", O_RDONLY) = -1 ENOSYS (Function not implemented)
19767 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl.log") = -1 ENOSYS (Function not implemented)
19774 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis3790561022192250135.att", 0x7f15759d2a20) = -1 ENOSYS (Function not implemented)
19774 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis3790561022192250135.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19774 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis3790561022192250135.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19774 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis3790561022192250135.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/progress-statistics.log") = -1 ENOSYS (Function not implemented)
19774 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/progress-statistics.log", O_RDONLY) = -1 ENOSYS (Function not implemented)
19774 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/progress-statistics.log" <unfinished ...>
19769 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis5328836179780793966.att",  <unfinished ...>
19769 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis5328836179780793966.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19769 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis5328836179780793966.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19769 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis5328836179780793966.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/frontier-report.txt") = -1 ENOSYS (Function not implemented)
19769 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/frontier-report.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19769 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/frontier-report.txt" <unfinished ...>
19766 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis7234161214932753611.att", 0x7f1589e16fc0) = -1 ENOSYS (Function not implemented)
19766 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis7234161214932753611.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19766 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis7234161214932753611.att", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>
19766 14:14:17 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis7234161214932753611.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-report.txt") = -1 ENOSYS (Function not implemented)
19766 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-report.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19766 14:14:17 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-report.txt" <unfinished ...>
19771 14:14:17 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6968345071654791553.att", 0x7f158821f240) = -1 ENOSYS (Function not implemented)
19771 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6968345071654791553.att", O_RDWR|O_CREAT|O_EXCL, 0666 <unfinished ...>
19771 14:14:17 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6968345071654791553.att", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>
19771 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6968345071654791553.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/hosts-report.txt" <unfinished ...>
19771 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/hosts-report.txt", O_RDONLY <unfinished ...>
19771 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/hosts-report.txt" <unfinished ...>
19773 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1105480588160504536.att",  <unfinished ...>
19773 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1105480588160504536.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19773 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1105480588160504536.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19773 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1105480588160504536.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds.txt") = -1 ENOSYS (Function not implemented)
19773 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19773 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds.txt" <unfinished ...>
19772 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis33511973174331419.att", 0x7f158811dec0) = -1 ENOSYS (Function not implemented)
19772 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis33511973174331419.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19772 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis33511973174331419.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19772 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis33511973174331419.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/processors-report.txt" <unfinished ...>
19772 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/processors-report.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19772 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/processors-report.txt" <unfinished ...>
19770 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8075346245249116781.att", 0x7f15883201c0) = -1 ENOSYS (Function not implemented)
19770 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8075346245249116781.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19770 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8075346245249116781.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19770 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8075346245249116781.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/responsecode-report.txt") = -1 ENOSYS (Function not implemented)
19770 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/responsecode-report.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19770 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/responsecode-report.txt" <unfinished ...>
19775 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8142789135174655958.att",  <unfinished ...>
19775 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8142789135174655958.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19775 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8142789135174655958.att", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>
19775 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis8142789135174655958.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/mimetype-report.txt") = -1 ENOSYS (Function not implemented)
19775 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/mimetype-report.txt", O_RDONLY <unfinished ...>
19775 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/mimetype-report.txt" <unfinished ...>
19768 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1708464450127631628.att", 0x7f15885220c0) = -1 ENOSYS (Function not implemented)
19768 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1708464450127631628.att", O_RDWR|O_CREAT|O_EXCL, 0666 <unfinished ...>
19768 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1708464450127631628.att", O_WRONLY|O_CREAT|O_TRUNC, 0666 <unfinished ...>
19768 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis1708464450127631628.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds-report.txt" <unfinished ...>
19768 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds-report.txt", O_RDONLY) = -1 ENOSYS (Function not implemented)
19768 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/seeds-report.txt") = -1 ENOSYS (Function not implemented)
19767 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4077854491931735252.att", 0x7f1589d16040) = -1 ENOSYS (Function not implemented)
19767 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4077854491931735252.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19767 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4077854491931735252.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19767 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4077854491931735252.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-manifest.txt") = -1 ENOSYS (Function not implemented)
19767 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-manifest.txt", O_RDONLY <unfinished ...>
19767 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/crawl-manifest.txt") = -1 ENOSYS (Function not implemented)
19774 14:14:18 stat("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6758100849047985817.att", 0x7f15759d2fc0) = -1 ENOSYS (Function not implemented)
19774 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6758100849047985817.att", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOSYS (Function not implemented)
19774 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6758100849047985817.att", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOSYS (Function not implemented)
19774 14:14:18 rename("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis6758100849047985817.att", "/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/order.xml") = -1 ENOSYS (Function not implemented)
19774 14:14:18 open("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/order.xml", O_RDONLY) = -1 ENOSYS (Function not implemented)
19774 14:14:18 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/order.xml" <unfinished ...>
19701 14:14:20 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis4192013112433590662.att" <unfinished ...>
19701 14:14:20 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2119624094982458618.att" <unfinished ...>
19701 14:14:20 unlink("/projectes/padicat/dades/programari/webcurator_1.6.2/attachments/Axis2567056526070013221.att" <unfinished ...>

What could be the issue?

Any help would be appreciated, thanks.
