Comments (8)

bfabio commented on September 17, 2024

#189 dramatically decreases the frequency of this happening.

Keeping this open though, because the root issue is not resolved.

from publiccode-crawler.

bfabio commented on September 17, 2024

pcvalidator -export expands the URLs only if -remote-base-url is passed. This is documented in the source but not to the user.
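
As a rough illustration of the behaviour described above, the following Go sketch expands a relative reference only when a base URL is supplied. It is not pcvalidator's actual code: `expandURL` is a hypothetical helper, and the `docs/logo.png` path is made up for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// expandURL is a hypothetical helper, not pcvalidator's implementation.
// With an empty remoteBase the reference is returned untouched, mirroring
// the "no -remote-base-url, no expansion" behaviour described in the thread.
func expandURL(remoteBase, ref string) (string, error) {
	if remoteBase == "" {
		return ref, nil
	}
	// A trailing slash keeps the last path segment of the base
	// (for example the branch name) when the reference is resolved.
	if !strings.HasSuffix(remoteBase, "/") {
		remoteBase += "/"
	}
	base, err := url.Parse(remoteBase)
	if err != nil {
		return "", fmt.Errorf("invalid base URL: %w", err)
	}
	rel, err := url.Parse(ref)
	if err != nil {
		return "", fmt.Errorf("invalid reference: %w", err)
	}
	// Absolute references pass through unchanged; relative ones are
	// resolved against the base.
	return base.ResolveReference(rel).String(), nil
}

func main() {
	base := "https://raw.githubusercontent.com/italia/18app/master"
	for _, b := range []string{"", base} {
		out, err := expandURL(b, "docs/logo.png") // hypothetical path
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("base=%q -> %s\n", b, out)
	}
}
```

With an empty base the relative path is left as it is, which is one way unexpanded URLs could end up in the output.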

@sebbalex is there a chance the crawler could run it with no or empty RemoteBaseURL?

sebbalex commented on September 17, 2024

> @sebbalex is there a chance the crawler could run it with no or empty RemoteBaseURL?

RemoteBaseURL is needed to enforce absolute and relative URL validation:

> If left empty, absolute URLs will not be validated and no remote validation of files with relative paths will be performed.

Furthermore, there was no evidence of this in the past, and since no changes were made in our codebase, this confuses me.

bfabio commented on September 17, 2024

> RemoteBaseURL is needed to enforce absolute and relative url validation:

Sorry, I wasn't clear. I was trying to say that the crawler being run with an empty RemoteBaseURL, for some exotic reason, could explain what we are seeing. I'm just as puzzled as you. 🤔

sebbalex commented on September 17, 2024

In the latest run I noticed these timeout problems. I think they are related to the URL expansion issue we have here.

time="2020-09-24T08:35:11Z" level=error msg="Error parsing publiccode.yml: logo: HTTP GET failed for https://raw.githubusercontent.com/AgID/rndt-joomla-template/master/documentation/images/logo-rndt.png: Get https://raw.githubusercontent.com/AgID/rndt-joomla-template/master/documentation/images/logo-rndt.png: dial tcp 151.101.36.133:443: i/o timeout"
time="2020-09-24T08:35:12Z" level=error msg="Error parsing publiccode.yml: logo: HTTP GET failed for https://raw.githubusercontent.com/AgID/rndt-catalogue/master/documentation/images/logo-rndt.png: Get https://raw.githubusercontent.com/AgID/rndt-catalogue/master/documentation/images/logo-rndt.png: dial tcp 151.101.36.133:443: i/o timeout"
time="2020-09-24T08:35:13Z" level=error msg="Error parsing publiccode.yml: logo: HTTP GET failed for https://raw.githubusercontent.com/italia/18app/master/src/Italia.DiciottoApp.iOS/Assets.xcassets/AppIcon.appiconset/Icon120.png: Get https://raw.githubusercontent.com/italia/18app/master/src/Italia.DiciottoApp.iOS/Assets.xcassets/AppIcon.appiconset/Icon120.png: dial tcp 151.101.36.133:443: i/o timeout"
time="2020-09-24T08:35:13Z" level=error msg="Error parsing publiccode.yml: description/it/screenshots: HTTP GET failed for https://raw.githubusercontent.com/consiglionazionaledellericerche/cool-jconon/master/docs/screenshot/responsive_it.png: Get https://raw.githubusercontent.com/consiglionazionaledellericerche/cool-jconon/master/docs/screenshot/responsive_it.png: dial tcp 151.101.36.133:443: i/o timeout\ndescription/en/screenshots: HTTP GET failed for https://raw.githubusercontent.com/consiglionazionaledellericerche/cool-jconon/master/docs/screenshot/home_en.png: Get https://raw.githubusercontent.com/consiglionazionaledellericerche/cool-jconon/master/docs/screenshot/home_en.png: dial tcp 151.101.36.133:443: i/o timeout"
time="2020-09-24T08:35:14Z" level=error msg="Error parsing publiccode.yml: description/it/screenshots: HTTP GET failed for https://raw.githubusercontent.com/vvfosprojects/sovvf/master/doc/images/dashboard.jpg: Get https://raw.githubusercontent.com/vvfosprojects/sovvf/master/doc/images/dashboard.jpg: dial tcp 151.101.36.133:443: i/o timeout"
time="2020-09-24T08:35:14Z" level=error msg="Error parsing publiccode.yml: description/it/screenshots: HTTP GET failed for https://raw.githubusercontent.com/IstitutoCentraleCatalogoUnicoBiblio/Nuovo-Opac-di-Polo-SBN/master/screenshots/nuovo_opac.png: Get https://raw.githubusercontent.com/IstitutoCentraleCatalogoUnicoBiblio/Nuovo-Opac-di-Polo-SBN/master/screenshots/nuovo_opac.png: dial tcp 151.101.36.133:443: i/o timeout"

sebbalex commented on September 17, 2024

> #189 dramatically decreases the frequency of this happening.
>
> Keeping this open though, because the root issue is not resolved.

We could consider the root cause to be the number of concurrent processes and close this, wdyt @bfabio?
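
For readers unfamiliar with the idea, one common way to cap concurrency in Go is a buffered channel used as a semaphore. The sketch below is only an illustration under that assumption; it is not necessarily what #189 changed, and `runLimited` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs every job while allowing at most limit of them to run
// at the same time (limit must be at least 1). It returns the highest
// number of jobs observed running simultaneously.
func runLimited(jobs []func(), limit int) int {
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	var running, peak int32

	for _, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already in flight
		go func(job func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the job is done

			n := atomic.AddInt32(&running, 1)
			// Record a new peak if this job pushed the count higher.
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			job()
			atomic.AddInt32(&running, -1)
		}(job)
	}
	wg.Wait()
	return int(atomic.LoadInt32(&peak))
}

func main() {
	jobs := make([]func(), 20)
	for i := range jobs {
		jobs[i] = func() {} // placeholder for an HTTP check or a git operation
	}
	fmt.Println("peak concurrency:", runLimited(jobs, 3))
}
```

Fewer simultaneous connections to the same host tends to make `i/o timeout` errors like the ones logged above less frequent, but it does not address how a failure is handled once it happens.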

bfabio commented on September 17, 2024

@sebbalex I'm not convinced. There must be something wrong in the code that doesn't handle git failures correctly and still resolves the URL as relative. Most (all?) of the failures were caused by concurrency, but the crawler should have stopped processing the repo as soon as they happened.
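
The "stop as soon as the git step fails" behaviour argued for above could look roughly like the following Go sketch. All names here (`repo`, `cloneRepo`, `processRepo`, `errClone`) are hypothetical and the git step is simulated; this is not the crawler's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errClone stands in for any git failure (clone, fetch, checkout).
var errClone = errors.New("git clone failed")

// repo is a hypothetical stand-in for the crawler's repository type.
type repo struct {
	name    string
	cloneOK bool // simulates whether the git step succeeds
}

// cloneRepo simulates the git step; a real crawler would run git here.
func cloneRepo(r repo) error {
	if !r.cloneOK {
		return fmt.Errorf("%s: %w", r.name, errClone)
	}
	return nil
}

// processRepo returns as soon as the git step fails, so no later step
// (such as URL resolution) runs against a missing or partial checkout.
func processRepo(r repo) (string, error) {
	if err := cloneRepo(r); err != nil {
		return "", fmt.Errorf("skipping repo: %w", err)
	}
	return "processed " + r.name, nil
}

func main() {
	for _, r := range []repo{{"good", true}, {"bad", false}} {
		out, err := processRepo(r)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Propagating the error instead of logging it and carrying on means a transient failure skips the repo for that run rather than producing output based on incomplete data.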

bfabio commented on September 17, 2024

This doesn't apply anymore.

After #302 the crawler doesn't touch the contents of publiccode.yml. API consumers are now in charge of doing the expansion, if they need it.
