
Comments (6)

Arjun-Sharma1 commented on August 22, 2024

Thanks for the update! Works great.

from kijiji-scraper.

mwpenny commented on August 22, 2024

Hi there. Could you please provide a URL for an ad that reproduces this issue? Thanks!


Arjun-Sharma1 commented on August 22, 2024

Sure, this is a random one I tried that produced the issue:

https://www.kijiji.ca/v-mens-shoes/mississauga-peel-region/jordan-12-retro-indigo/1564240827

And here's another:

https://www.kijiji.ca/v-mens-shoes/mississauga-peel-region/nike-air-vapormax-360-grey/1564281977

This is what the correct size attribute should show (I used "api" as the scraper type):
https://gyazo.com/1c1b153315c43f5066479b97c5a70aec


Arjun-Sharma1 commented on August 22, 2024

Apparently, this happens for any category that has the size attribute (Clothing).


mwpenny commented on August 22, 2024

Kijiji sends all ad attributes as strings. The mobile API also sends type information, but unfortunately the HTML site doesn't. So for each ad attribute, the HTML scraper tries to determine the type with various checks and then cast the value accordingly. If all checks fail then the value is treated as a string.
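The check-then-cast approach described above might look roughly like the following. This is a minimal sketch, not kijiji-scraper's actual code: the function name castAttributeValue and the exact order of checks are assumptions made for illustration.

```javascript
// Hypothetical sketch of "guess the type, then cast". Not the library's real code.
function castAttributeValue(value) {
  const trimmed = value.trim();

  // Boolean check: only the literal strings "true" and "false" qualify.
  if (trimmed === "true") return true;
  if (trimmed === "false") return false;

  // Numeric check: reject empty strings, which Number() would turn into 0.
  if (trimmed !== "" && !Number.isNaN(Number(trimmed))) {
    return Number(trimmed);
  }

  // Naive date check: accept whatever the engine's Date parser accepts.
  // For non-ISO strings the result is implementation-defined.
  const date = new Date(trimmed);
  if (!Number.isNaN(date.getTime())) return date;

  // All checks failed: keep the value as a string.
  return value;
}

console.log(castAttributeValue("42"));
console.log(castAttributeValue("2021-05-05T23:36:25.833Z"));
console.log(castAttributeValue("foo"));
```

Because the date check simply defers to the engine's parser, any string the engine happens to accept is cast to a Date, whether or not it was meant as one.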

What we're seeing here is a false positive due to V8's date parsing being very permissive. The Date constructor and Date.parse() appear to ignore characters at the beginning of a date string if they're followed by one of a handful of delimiters (space, ., -, and possibly more) and then a valid date string. For example, new Date("blah-2021-05-05T23:36:25.833Z") returns a valid Date when run in Chrome or Node.js. The same date string is invalid in Firefox (SpiderMonkey). This is in line with the language spec:

In general, the value produced by Date.parse is implementation-defined when given any String value that does not conform to the Date Time String Format (21.4.1.15) [simplified ISO 8601] and that could not be produced in that implementation by the toString or toUTCString method.

Due to this behavior, and since new Date("10") is a valid (implementation-specific) date in V8, new Date("size-10") is also valid and the size string is interpreted as a date by the scraper.
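To see how a given engine treats these strings, a small probe like the one below can help. It is only a sketch, and the helper name isParseableDate is made up for illustration. Only the plain ISO string is guaranteed by the spec to parse; the results for the other samples are implementation-defined and may differ between V8 and SpiderMonkey.

```javascript
// Hypothetical helper: true if this engine's Date parser accepts the text.
function isParseableDate(text) {
  // An invalid Date has a time value of NaN.
  return !Number.isNaN(new Date(text).getTime());
}

const samples = [
  "2021-05-05T23:36:25.833Z",      // ISO format: valid everywhere
  "blah-2021-05-05T23:36:25.833Z", // implementation-defined
  "10",                            // implementation-defined
  "size-10",                       // implementation-defined
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), isParseableDate(sample));
}
```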

From what I've seen, Kijiji always uses ISO date strings. They're not exactly equal to JS-style date strings (they sometimes have more precision in the time portion), so I'll just change the check to require that date attributes begin with a number.
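A check along those lines could be sketched as follows. The function name looksLikeDateAttribute and the regular expression are assumptions made for illustration, not necessarily what was committed to the library.

```javascript
// Hypothetical sketch: only treat a value as a date if it starts with a digit
// AND the engine's parser accepts it.
function looksLikeDateAttribute(value) {
  // Reject anything with a non-numeric prefix, such as "size-10",
  // before the permissive engine parser ever sees it.
  if (!/^\d/.test(value)) return false;

  return !Number.isNaN(new Date(value).getTime());
}

console.log(looksLikeDateAttribute("size-10"));                  // false
console.log(looksLikeDateAttribute("2021-05-05T23:36:25.833Z")); // true
```

The leading-digit test runs first, so the outcome for prefixed strings no longer depends on which JavaScript engine is doing the parsing.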

Good find!


mwpenny commented on August 22, 2024

The latest version on NPM has the fix. Please give it a try.

