
Comments (7)

HomemadeAdvanced commented on August 24, 2024

Thanks. I will look into it when I have the time. This seems like a valid solution. With the lists from https://codeberg.org/HomemadeAdvanced/PiHole/src/branch/main/PiHoleAdlistsGermany.txt I have 27734794 domains, 17329998 of them unique.

from pihole_adlist_tool.

yubiuser commented on August 24, 2024

I'm not sure I understand what you want to achieve. You can select
"Enable the minimal number of adlists that cover all domains that would have been blocked"

which enables only the smallest set of adlists needed to block everything that would have been blocked. If you select "Enable only adlists with covered unique domains", you might miss some domains that are not unique (e.g. contained in only two adlists).


HomemadeAdvanced commented on August 24, 2024

Currently I have many domains that are covered by several lists. The total number of blocked domains is twice the number of unique domains. I would like to reduce these redundant domains, but without the feature that takes recently visited domains into account.


yubiuser commented on August 24, 2024

So what you want is:

"If an adlist contains only domains that are also part of other adlists, deactivate this adlist"? And check this for all adlists at the same time, so that the maximum number of adlists can be disabled?

(This would all be independent of your browsing behavior.)

This is not possible with the current tool. It is designed around your browsing habits rather than the adlists alone. But I do see some value in your idea. I'll think about it.
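
A minimal sketch of that idea, independent of browsing history, might look like the following. It assumes each adlist has been exported to a file named `*.domains` with one domain per line (the naming used by the loop later in this thread). `find_redundant_lists` is a hypothetical helper, not part of pihole_adlist_tool. It walks the lists greedily, so the result is a safe set to disable but not necessarily the largest possible one.

```sh
# Hypothetical helper: print the adlist files in directory "$1" that could be
# disabled without losing any blocked domain.
# Assumption: one domain per line, files named *.domains.
# The body runs in a subshell so its variables do not leak to the caller.
find_redundant_lists() (
  dir=$1
  work=$(mktemp -d) || exit 1
  # Work on sorted, de-duplicated copies so the originals stay untouched.
  for f in "$dir"/*.domains; do
    [ -e "$f" ] || continue
    sort -u "$f" > "$work/$(basename "$f")"
  done
  for f in "$work"/*.domains; do
    [ -e "$f" ] || continue
    # Move the candidate aside; the remaining *.domains files are the
    # lists that are still enabled.
    mv "$f" "$work/candidate"
    set -- "$work"/*.domains
    # comm -23 prints lines found only in the candidate. No output means
    # the remaining lists already cover every one of its domains.
    if [ -e "$1" ] &&
       [ -z "$(sort -u "$@" | comm -23 "$work/candidate" - | head -n 1)" ]; then
      basename "$f"               # redundant: leave it disabled
    else
      mv "$work/candidate" "$f"   # still needed: put it back
    fi
  done
  rm -rf "$work"
)
```

Calling `find_redundant_lists /path/to/lists` prints one file name per line. Because a list is dropped as soon as it is found redundant, two identical lists are never both reported, so coverage is preserved.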


HomemadeAdvanced commented on August 24, 2024

That's exactly what I would like to use. It would be great if it could be implemented.


thomasmerz commented on August 24, 2024

I think this can never really be made possible, because adlists may change independently: list1 may add domains that are not in list2, or remove domains that are still in list2. Pi-hole already makes all domains "unique" anyway:

[i] Number of gravity domains: 4514301 (4038026 unique domains)
[i] Number of exact blacklisted domains: 24
[i] Number of regex blacklist filters: 25
[i] Number of exact whitelisted domains: 13
[i] Number of regex whitelist filters: 4


But… wait… if you're interested, this might be a brute-force solution: compare every list with every other list. Be careful when you have "many" adlists, because for n adlists it runs n × (n − 1) / 2 comparisons:

# Requires bash for the <(...) process substitution.
for d1 in *.domains; do
  for d2 in *.domains; do
    # Stop at the list itself so each pair is compared only once.
    [ "$d1" = "$d2" ] && break
    echo "$d1 vs. $d2"
    # Count the lines that appear in only one of the two lists.
    comm -3 <(sort "$d1") <(sort "$d2") | wc -l
  done
done

Each result of "0" identifies two lists whose contents are 100% the same.

For testing purposes, I copied one adlist to prove that a "100% the same" duplicate is found this way:

cp list.1.raw.githubusercontent.com.domains zzz.list.1.raw.githubusercontent.com.domains
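
To get a feel for how quickly the number of pairwise comparisons grows, the count can be computed up front. This is only an illustrative sketch; `pair_count` is a hypothetical helper name.

```sh
# Hypothetical helper: number of unordered pairs among n adlists,
# i.e. how many comparisons the nested loop performs.
pair_count() {
  n=$1
  echo $(( n * (n - 1) / 2 ))
}

pair_count 10   # prints 45
pair_count 50   # prints 1225
```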

❓ Does this help you and can this issue be closed?


thomasmerz commented on August 24, 2024

See man page:

-3 suppress column 3 (lines that appear in both files)
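
With `-3` alone, a count of 0 only detects identical lists. As a hedged sketch, suppressing columns 2 and 3 together (`-23`) leaves only the lines unique to the first input, which is enough to test whether one list is fully contained in another. `is_subset` is a hypothetical helper, and the files are assumed to hold one domain per line.

```sh
# Hypothetical helper: is_subset A B
# Succeeds if every line of file A also appears in file B.
is_subset() {
  sorted_b=$(mktemp) || return 2
  sort -u "$2" > "$sorted_b"
  # -2 hides lines unique to B and -3 hides shared lines,
  # so only lines unique to A remain.
  only_in_a=$(sort -u "$1" | comm -23 - "$sorted_b" | head -n 1)
  rm -f "$sorted_b"
  [ -z "$only_in_a" ]
}
```

For example, `is_subset small.domains big.domains && echo "small.domains is redundant"` reports a list whose domains are all present in a single other list.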

