
Comments (10)

essandess commented on July 23, 2024

It's possible, but requires digging into the specific errors and isolating whether the issue is with these specific rules, or the adblock2privoxy rule parser.

from adblock2privoxy.

wmyrda commented on July 23, 2024

Great. I would very much appreciate you looking into it, as it would bring Privoxy closer to parity with Adblock.


wmyrda commented on July 23, 2024

I am not a programmer, but I took a quick look at the issue, and the patterns seem fine when Adblock is used on its own, without Privoxy. Looking through the code a bit, I also found that the word "div" does not appear anywhere in it, so it is logical that adblock2privoxy errors out on these rules.

Looking at the basic pattern, I found a number of cases where that HTML element is used:

# cat easylist.txt |grep -i "##div"|wc
   1062    2899   72306
# cat easylist.txt |grep -i " div "|wc
     38     260    2106

Among them is, for example, fancystreems.com##body > div > a, which also shows up in other syntaxes, but I failed to find it in the resulting files:

cat /etc/privoxy/ab2p.* |grep fancystreems.com
# ||fancystreems.com/300x2503.php (easylist.txt: 47100)
.fancystreems.com/300x2503\.php
# ||fancystreems.com/300x2503.php (easylist.txt: 47100)
.fancystreems.com/300x2503\.php
# ||fancystreems.com/300x2503.php (easylist.txt: 47100)
.fancystreems.com/300x2503\.php

but that seems to have nothing to do with the rule using div. Therefore it looks like new features would have to be implemented in PatternConverter.hs. I am hoping it would not be too much work...


essandess commented on July 23, 2024
  • Please confirm: adblock2privoxy reports errors when processing these specific rules, but successfully completes on all other rules. I.e. good functionality except for some of the rules. Is this what you see?

  • I don’t have the cycles now to dig into refactoring a Haskell parser, but I can point you to some clues that might help isolate the issue.

  • The rules all involve CSS element hiding. Here are links to the Adblock and basic selector syntax:

I don’t see anything wrong with the basic syntax of those selectors. Also I didn’t write the parser, and don’t know its limitations, e.g. tree depth and the like.

The first clue might be the reference to rules with stuff like DIV:-abp-contains(REKLAMA). Where is this defined? Should adblock2privoxy be able to parse this?


wmyrda commented on July 23, 2024

I did not know a page like fancystreems.com existed, but taking it as an example from https://easylist-downloads.adblockplus.org/easylist.txt, it turns out that adblock2privoxy uses only the first record and simply keeps quiet about the rest. There are no errors in the ab2p.task log, as if it had known it would not be able to parse them.

||fancystreems.com/300x2503.php
fancystreems.com###bannerfloat2
fancystreems.com###floatLayer1
fancystreems.com###y
fancystreems.com###yst1
fancystreems.com##img[width="300"][height="150"]
fancystreems.com##body > div > a
fancystreems.com##img[width="300"][height="250"]

For the sake of easier reading, I removed all the other websites referenced in these examples, which were separated by commas.
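To make the structure of these records concrete, here is a minimal Python sketch that splits an Adblock filter line into its comma-separated domain list and its CSS selector. The names `ElemHideRule` and `split_elemhide_rule` are hypothetical and not part of adblock2privoxy; the sketch only handles the `##` separator seen in the rules above and treats everything else, such as the `||` URL-blocking rule, as not being an element hiding rule.

```python
from typing import List, NamedTuple, Optional


class ElemHideRule(NamedTuple):
    domains: List[str]  # domains the rule applies to (empty list = all sites)
    selector: str       # CSS selector of the elements to hide


def split_elemhide_rule(line: str) -> Optional[ElemHideRule]:
    """Split 'domain1,domain2##selector' into its parts.

    Returns None for comments and for lines without the '##' separator,
    e.g. URL-blocking rules such as '||example.com/ad.php'.
    """
    line = line.strip()
    if line.startswith("!"):  # '!' starts a comment in Adblock lists
        return None
    domains_part, sep, selector = line.partition("##")
    if not sep or not selector:
        return None
    domains = [d for d in domains_part.split(",") if d]
    return ElemHideRule(domains, selector)


if __name__ == "__main__":
    for rule in [
        "||fancystreems.com/300x2503.php",
        "fancystreems.com###bannerfloat2",
        "fancystreems.com##body > div > a",
    ]:
        print(rule, "->", split_elemhide_rule(rule))
```

Note that `fancystreems.com###bannerfloat2` splits at the first `##`, which leaves `#bannerfloat2` (an ID selector) as the selector part.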

So there are no errors there, but records such as those for wp.pl and gadzetomania.pl, defined in the link from the first post, do give errors. That means adblock2privoxy actually tried to parse them with one of the syntaxes it knows and simply failed. I'll try to see which one, but I have a feeling that finding out would not change anything: that syntax was likely meant for something else, and a new parser would be required to import the syntaxes that errored out.

Going back to fancystreems, at least the last 2 rules are legitimate blocking records whose targets still appear in the web page code. Adding a parser for such cases would be a welcome addition.

<a href="http://www.fancystreems.com/tvcat/newstv.php"><img src="https://i.imgur.com/SnQS4Gt.jpg" width="300" height="250"></a>
<a href="http://www.fancystreems.com"><img src="http://www.fancystreems.com/images/dot.gif" alt="fancystreems logo" width="152" height="95" border="0" id="logo_icon" title="fancystreems logo" class="logo"></a>

About the cycles: please take your time. If it gets done by the end of summer I'll be happy :)


wmyrda commented on July 23, 2024

Allow me to correct myself. All those fancystreems.com rules do get created!
Since those elements rely on the CSS functionality, the appropriate file gets created in the CSS directory, in this case /com/fancystreems/ab2p.css.
Just like all the other CSS files, it inherits all element hiding rules, which may not be all that optimized and in a few cases leads to hiding too much. Nevertheless it works, and the rules that error out are the only ones left to worry about.


wmyrda commented on July 23, 2024

Reading https://adblockplus.org/filters#elemhide-emulation, it turns out the -abp-* selectors are Adblock-specific features and would therefore have to be translated into Privoxy's language. I tried converting them manually into a number of different schemes to check whether they would work, but none of them did. I included a few more that I derived from the web page content, but with no luck either.

So what does -abp-contains translate into? One has to know that before trying to write a correct parser. After reading https://www.w3schools.com/cssref/css_selectors.asp I tried many different variants. For DIV + DIV + div:-abp-contains(REKLAMA) + DIV I used, among others:

div + div + div[title~="REKLAMA"] + div,
div + div + div[target="REKLAMA"] + div,
div + div + div[href*="REKLAMA"] + div,
div + div + div[id^="REKLAMA"] + div,
div + div + div[id$="REKLAMA"] + div,
div + div + div[id*="REKLAMA"] + div,

but none of them seems to work. Any ideas?


wmyrda commented on July 23, 2024

Checking the website's code with the Inspector in Firefox, it seems the code in the web page is not all that complicated, as it simply looks like <div>REKLAMA</div>: https://i.imgur.com/hWXfqmt.png
However, searching the web for Privoxy config examples that could take care of this does not yield any results. Proposals for a translation into Privoxy format would be very welcome.
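The attribute selectors tried earlier in the thread cannot match this markup, because REKLAMA is the text content of the element, not the value of an attribute such as id or title. As a rough illustration of what a text-content test involves, here is a Python sketch built on the standard library's html.parser. The names `DivTextFinder` and `divs_containing` are hypothetical; this is not Privoxy or adblock2privoxy code, and it only approximates what `div:-abp-contains(...)` does by checking each div's own and nested text for a plain substring.

```python
from html.parser import HTMLParser
from typing import List


class DivTextFinder(HTMLParser):
    """Collect the text of every <div> whose text content contains `needle`."""

    def __init__(self, needle: str) -> None:
        super().__init__()
        self.needle = needle
        self._open_divs: List[List[str]] = []  # one text buffer per open <div>
        self.matches: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Attributes are deliberately ignored: only text content is tested.
        if tag == "div":
            self._open_divs.append([])

    def handle_data(self, data):
        # Text counts towards every enclosing <div>, nested or not.
        for buf in self._open_divs:
            buf.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._open_divs:
            text = "".join(self._open_divs.pop())
            if self.needle in text:
                self.matches.append(text)


def divs_containing(html: str, needle: str) -> List[str]:
    """Return the text of each <div> in `html` whose text contains `needle`."""
    finder = DivTextFinder(needle)
    finder.feed(html)
    finder.close()
    return finder.matches


if __name__ == "__main__":
    page = '<div id="banner">REKLAMA</div><div id="REKLAMA">news</div>'
    print(divs_containing(page, "REKLAMA"))
```

In the sample page, the second div carries REKLAMA only in its id attribute, which is the case the attribute selectors would catch and a text-content test would not.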


wmyrda commented on July 23, 2024

Bad news. According to filter writers, a CSS file is not enough for all those contains() rules, as contains() is no longer used/allowed by the CSS specification.

Good news. Some code may be borrowed from the Adblock Plus source to create a .js script to hide those elements.
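Until such a script exists, one stopgap would be to separate the rules that need it from the ones plain CSS can express. The Python sketch below does that by looking for the `:-abp-` prefix anywhere in the rule. The helper name `partition_rules` is hypothetical, the sample rule lines are made up for illustration (in particular, pairing the wp.pl domain with that selector is an assumption), and nothing here is existing adblock2privoxy functionality.

```python
from typing import Iterable, List, Tuple

# Prefix shared by the Adblock Plus extended selectors
# (:-abp-contains, :-abp-has, :-abp-properties).
ABP_EXTENSION_MARKER = ":-abp-"


def partition_rules(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (plain_rules, abp_only_rules), preserving input order.

    Blank lines and '!' comments are dropped. A rule is classified as
    ABP-only if the marker appears anywhere in it, case-insensitively.
    """
    plain: List[str] = []
    abp_only: List[str] = []
    for line in lines:
        rule = line.strip()
        if not rule or rule.startswith("!"):
            continue
        target = abp_only if ABP_EXTENSION_MARKER in rule.lower() else plain
        target.append(rule)
    return plain, abp_only


if __name__ == "__main__":
    sample = [
        "! illustrative lines only, not copied from a real list",
        "fancystreems.com##body > div > a",
        "wp.pl##DIV + DIV + div:-abp-contains(REKLAMA) + DIV",
    ]
    plain, abp_only = partition_rules(sample)
    print("plain CSS rules:", plain)
    print("need emulation:", abp_only)
```

The plain list could then be fed to the converter as before, while the second list shows exactly which rules would still need script-based handling.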


essandess commented on July 23, 2024

I do not see a path to incorporate these abp-specific element hiding rules.

