
Comments (7)

ziflex commented on September 28, 2024

If it complains that CLICK is not supported, it means you are using the in-memory HTTP driver and need to switch to the CDP driver inside your query.

LET doc = DOCUMENT('my-page', { driver: "cdp" })

CLICK(doc, "#my-button")

RETURN TRUE

from cli.

gonssal commented on September 28, 2024

Yeah, that did the trick. I'm really sorry about wasting your time with these seemingly stupid issues; I'm finding it hard to work productively with ferret.

Things that are simple in virtually any programming language become exceedingly difficult in FQL.


ziflex commented on September 28, 2024

Hey, I'm sorry to hear that you are having difficulties. What could be done to make it better?


gonssal commented on September 28, 2024

I guess most of the issues are due to the declarative (functional?) design you chose for FQL and how it works.

For example, a real estate site I'm crawling has some data on each property with this HTML:

<ul class="props">
	<li>
		<div><span class="icon-wa50-sup"></span> m<sup>2</sup></div> 
		<div>215</div>
	</li>
	<li>
		<div><span class="icon-wa50-bed"></span> Rooms</div> 
		<div>4</div>
	</li>
	<li>
		<div><span class="icon-wa50-bath"></span> Bathrooms</div>
		<div>3</div>
	</li>
	<li>
		<div><span class="icon-wa50-parking"></span> Parking</div>
		<div><i class="icon-wa50-check"></i></div>
	</li>
</ul>

Not every element is present on every property, so to know what I'm getting, I came up with this (some irrelevant code omitted):

LET property = {
	URL: propertyUrl,
	Title: TRIM(INNER_TEXT(propDoc, '.cardSlider > .body h1.titulo')),
	Reference: SUBSTITUTE(TRIM(INNER_TEXT(propDoc, '.cardSlider > .body .ref')), 'Ref.', ''),
	Description: TRIM(INNER_TEXT(propDoc, '.cardSlider > .body #descripcion_larga')),
	Price: SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(TRIM(INNER_TEXT(propDoc, '.cardSlider > .body .precio')), '€', ''), '.', ''), ',', '.'),
	Currency: 'EUR',
	AreaUnit: 'm2',
	Images: images,
	Type: 'buy'
}
LET propertyDataElements = ELEMENTS(propDoc, '.cardSlider > .body > .props:not(.props2) li')
LET propertyDataIndexes = (
	FOR propData IN propertyDataElements
		LET data = (
			ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-sup') ? 'Area' : (
				ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-bed') ? 'Bedrooms' : (
					ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-bath') ? 'Baths' : (
						ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-parking') ? 'Parking' : none
					)
				)
			)
		)
		RETURN data
)
LET propertyDataValues = (
	FOR propData IN propertyDataElements
		LET data = (
			ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-sup') ? 'Area' : (
				ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-bed') ? 'Bedrooms' : (
					ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-bath') ? 'Baths' : (
						ELEMENT_EXISTS(propData, 'div:first-child span.icon-wa50-parking') ? 'Parking' : none
					)
				)
			)
		)
		RETURN (data == 'Parking' ? (ELEMENT(propData, 'div:nth-child(2) i.icon-wa50-check') ? '1' : none) : TRIM(INNER_TEXT(propData, 'div:nth-child(2)')))
)
RETURN MERGE(property, ZIP(propertyDataIndexes, propertyDataValues))

As you can see, the lack of if/else forces me to nest a lot of ternary operators. Also, to add fields to the property object I have to build two separate arrays, one for the keys and one for the values, and then use ZIP(). I was expecting to be able to do something like this instead:

property[data] = (data == 'Parking' ? (ELEMENT(propData, 'div:nth-child(2) i.icon-wa50-check') ? '1' : none) : TRIM(INNER_TEXT(propData, 'div:nth-child(2)')))

This is just one recent example.


ziflex commented on September 28, 2024

Well, certain limitations were introduced intentionally, while others were a result of the source of inspiration.
Regarding the lack of if/else, I do not see how it would simplify your logic; you would still have nested conditions.

And again, most of the time it's the way you solve particular problems and you just need to switch your thought process from imperative to declarative flow.
You can always switch to xpath and do something like this.


gonssal commented on September 28, 2024

> Well, certain limitations were done intentionally while others were a result of the source of inspiration. Regarding the lack of if/else I do not see how it would simplify your logic, you would still have nested conditions.

else if helps avoid nesting. A switch could also be used instead. In my example there are only 4 conditions; imagine the nesting if there were 15 or more.

> And again, most of the time it's the way you solve particular problems and you just need to switch your thought process from imperative to declarative flow. You can always switch to xpath and do something like this.

The problem was not getting the data, but knowing what type of data it is and appending it to the already existing property object. Instead of doing property[data] = value in a single FOR, I had to create two arrays, one with the keys and one with the values, make sure they are the same size, and then ZIP and MERGE. In the GitHub example you linked, imagine having an existing object like this:

LET stargazers = {
    "ziflex": "clock",
    "MontFerret": "organization",
    "Kremlin": "location"
}

And then in the example instead of just showing the icon type, you wanted to iteratively append to the users object with the username as property name. Something like APPEND(stargazers, {"Gusyatnikova": "organization"}) inside a FOR.

I realize it's a design issue, it's the first thing I said, but sometimes I just feel that if I could write the scripts in, for example, JS, I would save a lot of time that I now spend overthinking how to make things work. And please don't get me wrong, I love ferret and I think it's a great piece of software.


ziflex commented on September 28, 2024

No worries, you are sharing the problems you are facing with using Ferret and that's fine!

Yes, I admit that it might be frustrating at times not being able to mutate objects in queries. I will think about how we can mitigate it in future releases.
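
In the meantime, one way to avoid writing the classification ternary twice might be to emit one key/value pair per element in a single FOR and only then split the pairs into the two arrays that ZIP expects. The following is a minimal sketch under assumptions, not a tested recipe: the `pairs` array is hypothetical stand-in data replacing the per-`li` detection from the query above, and the names `field`, `text`, `valid`, `keys` and `values` are made up for illustration.

```fql
// Hypothetical stand-in for the scraped data: one object per <li>,
// already reduced to a detected field name and its text value.
LET property = { Title: 'Example', Currency: 'EUR' }
LET pairs = [
    { field: 'Area', text: '215' },
    { field: 'Bedrooms', text: '4' },
    { field: NONE, text: 'unrecognized' }
]

// Drop entries whose field name could not be determined.
LET valid = (
    FOR p IN pairs
        FILTER p.field != NONE
        RETURN p
)

// Project the same filtered list twice, so both arrays
// are the same length and in the same order by construction.
LET keys = (
    FOR p IN valid
        RETURN p.field
)
LET values = (
    FOR p IN valid
        RETURN p.text
)

RETURN MERGE(property, ZIP(keys, values))
```

In the real query, the body of the first FOR would return `{ field: ..., text: ... }` built from `propData`, so the nested ternary is written only once. This does not remove the need for ZIP and MERGE; it only keeps the keys and values from drifting apart.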

