cleanlinks's Introduction

CleanLinks Mozilla Firefox Extension

Install it from addons.mozilla.org

CleanLinks was initially a webextension rewrite and is now a fork of @DiegoCR’s original XUL/XPCOM CleanLinks extension.

What does it do?

CleanLinks protects your privacy by automatically detecting and skipping redirect pages that track you on your way to the link you really wanted. Tracking parameters (e.g. utm_* or fbclid) are also removed.

For maximum privacy, rules are maintained and editable locally (with decent defaults distributed in the add-on). CleanLinks will break some websites, and you will need to manually whitelist the affected URLs for them to work. This can be done easily via the menu from the CleanLinks icon.

You can test the current (master) link cleaning code online, by pasting a link in the text area and clicking the "Clean it!" button.

More details

This add-on protects your privacy by skipping intermediate pages that track you while redirecting you to your destination (1), or that divulge your current page while making requests (2). Any request that contains another URL is considered fishy, and is skipped in favor of the target page (for foreground requests) or dropped (for background requests). A whitelist manages the pages and websites that have legitimate uses (3) for such redirects.

Some illustrative examples are:

  1. Facebook tracks all outgoing links by first sending you to the page https://l.facebook.com/l.php?u=<the actual link here>, which then redirects you to the actual URL.
  2. Analytics report the page you are on; for example, Google Analytics uses https://www.google-analytics.com/collect?dl=<your current page>&<more info on what you do>
  3. Logging in through OpenID requires passing the original URL so that you can return to that page once the login is performed, e.g. https://github.com/login/oauth/authorize?redirect_uri=<URL to the previous page>&<more parameters for the request>

All these embedded links are detected automatically. Links of type 1 should be redirected to their destination, those of type 2 (identified by the fact that they are not “top-level” requests) should be dropped, and those of type 3 allowed through a whitelist.
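
The detection idea can be sketched in a few lines of JavaScript. This is an illustration only, not the actual CleanLinks code: scan a link's query parameters for a value that is itself a URL.

```js
// Illustrative sketch of embedded-URL detection, not the CleanLinks implementation.
function findEmbeddedUrl(link) {
  // searchParams yields values already percent-decoded
  for (const [, value] of new URL(link).searchParams)
    if (/^https?:\/\//.test(value))
      return value;  // an embedded URL of type 1 or 3
  return null;
}

// The Facebook redirect of type 1 from above:
console.log(findEmbeddedUrl('https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com%2F'));
// -> 'https://example.com/'
```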

The whitelist is populated with a few sensible defaults, and must be maintained manually by each user as they encounter pages that break. Quick access to the last requests that were cleaned is available by clicking the add-on icon. In this popup, all recently cleaned links for the tab appear, and these can be added to the whitelist permanently (“Whitelist Embedded URL” button) or allowed just once (“Open Un-cleaned Link” button).

Other tracking data can be added to URLs to follow your behaviour online. These can be, for example, fbclid= or utm_campaign= query parameters, or /ref= in the pathname of the URL on Amazon. These cannot be detected automatically, so CleanLinks has a set of rules (the same set that maintains the embedded-URL whitelist) specifying which data is used for tracking and should be removed from URLs.
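
As a rough illustration of how such rules could be applied (the sample patterns below are taken from the examples above; the real, editable rule set ships with the add-on):

```js
// Sketch only: strip query parameters whose names match tracking patterns.
const TRACKING_PARAMS = /^(?:utm_\w+|fbclid)$/;  // sample patterns, not the shipped rules

function stripTracking(link) {
  const url = new URL(link);
  for (const name of [...url.searchParams.keys()])  // copy the keys before deleting
    if (TRACKING_PARAMS.test(name))
      url.searchParams.delete(name);
  return url.toString();
}

console.log(stripTracking('https://example.com/page?id=42&utm_campaign=x&fbclid=abc'));
// -> 'https://example.com/page?id=42'
```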

How can I help?

Be part of the open-source community that helps each other browse safer and more privately!

Being part of a community means being respectful of everyone and keeping this environment friendly, welcoming, and harassment-free. Abusive behaviour will not be tolerated and can be reported by email at [email protected]; wrongdoers may be permanently banned from interacting with CleanLinks.

You can help by reporting issues!

Any reports are welcome, including suggestions to improve and maintain the default rules that CleanLinks uses.

You can help by contributing to the code!

Maintaining even a small add-on like CleanLinks is in fact very time-consuming, so every little bit helps!

You can help by contributing to translations!

You can improve translations or add a new language on CleanLinks’ POEditor page, from which the strings will be imported directly into the add-on at the next release.

This is the current status of translations:

Chinese, Chinese (TW), French, German, Spanish

Why are the requested permissions required?

The permissions are listed in the manifest file and described in the API documentation. Here is a breakdown of why we need each of the requested permissions:

Permission                                           | Shown (on addons.mozilla.org) as  | Needed for
clipboardWrite                                       | Input data to the clipboard       | Copying cleaned links from the context menu
contextMenus                                         | Not shown                         | Copying cleaned links from the context menu
alarms                                               | Not shown                         | Automatically saving options
webRequest                                           | Access your data for all websites | Cleaning links while they are accessed
webRequestBlocking                                   | Access your data for all websites | Cleaning links while they are accessed
<all_urls>                                           | Access your data for all websites | Cleaning javascript links, highlighting cleaned links
storage                                              | Not shown                         | Storing rules and preferences
https://publicsuffix.org/list/public_suffix_list.dat | Not shown                         | Identifying public suffixes (e.g. .co.uk)
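
These permissions correspond to entries in the extension's manifest. The excerpt below is reconstructed from the table above for illustration; consult the repository's manifest.json for the authoritative list.

```json
{
  "permissions": [
    "clipboardWrite",
    "contextMenus",
    "alarms",
    "webRequest",
    "webRequestBlocking",
    "storage",
    "<all_urls>",
    "https://publicsuffix.org/list/public_suffix_list.dat"
  ]
}
```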

In which other ways can you get it?

Apart from the AMO page https://addons.mozilla.org/addon/clean-links-webext/, you can also get the add-on straight from this repo. This is useful if you want to help with testing, for example.

cleanlinks's People

Contributors

allgreed, cimbali, diegocr, ede123, quelbs, yume-chan

cleanlinks's Issues

shhh! could you be quieter?

please consider adding options to disable the red count badge on the icon and the notifications (they are very nice and system-integrated, just not needed for some).

——

As this is issue n°1, I'd like to say THANK YOU. Glad someone finally picked up this great project.

Domain whitelisting via menu

It would be great if the add-on allowed whitelisting domains via its menu (for simplicity's sake, the domain in the address bar), or alternatively each domain that matches the rewrite rules.

Redundant YouTube embed rewrites

handle `data:image/...` loading

Inline SVG images, e.g. data:image/svg+xml;<svg ...>...</svg> or base64-encoded as data:image/svg+xml;base64, contain legitimate links that are not redirects (typically XML namespaces). We should avoid cleaning those.

Maybe we should even avoid cleaning anything starting with data:?

We probably need to investigate what happens when external resources are loaded from an SVG file with the <use /> tag, as this seems like a type of redirect that might be used from within SVG images.
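
A minimal guard along the lines discussed here (a sketch of the proposal, not a shipped fix) would simply exempt data: URIs from cleaning:

```js
// Sketch: skip cleaning for data: URIs, whose embedded links
// (e.g. XML namespaces in inline SVG) are legitimate.
function shouldClean(link) {
  return !link.trim().toLowerCase().startsWith('data:');
}

console.log(shouldClean('data:image/svg+xml;base64,PHN2ZyAvPg=='));  // false
console.log(shouldClean('https://l.facebook.com/l.php?u=x'));        // true
```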

Support alternate browsers

  • Chrome
  • Vivaldi (built on Chromium & uses the Chrome Web Store, should be no different compared to Chrome)
  • Opera, see #109
  • Firefox on Android, see #30

Doesn't work on ip-check.info

My browser (Firefox ESR 60, 64-bit, on Windows 7) remains stuck on the homepage of the site http://ip-check.info .

Whitelisting this site doesn't help.

How can I make CleanLinks work on this site, so that it sends me to the results page?

Are all the permissions really required?

From the Mozilla add-on webpage:

Permissions

This add-on can:

  • Access your data for all websites
  • Input data to the clipboard
  • Display notifications to you

I'm curious if the last two are really required.

suggestions for the whitelisting pop-up

I don't understand what the whitelisting box is telling me. For one, it scrunches up the left side and right side so much that I don't know what the full URLs are. Additionally, I don't know what the left-hand and right-hand sections actually mean.

Could it maybe show the full URL string on hover, so I can read the whole line item? And could the left side and right side have descriptions above them, so I know what this is telling me?

Huge CPU load

Hello,

With CleanLinks enabled, Firefox uses a lot of CPU and so becomes slow and laggy.
Once I disable CleanLinks, the problem is gone.
Note: I use the default options.

I haven't done a lot of tests yet, like disabling individual options, identifying specific websites, or disabling other, possibly incompatible add-ons.
If I get more detailed information, I'll provide it here.

Really good extension but not usable for me right now.

Thanks

Facebook's dirty links to external web pages not fully cleaned

Facebook seems to have recently taken additional steps to craftily embed tracking data into external links, which CleanLinks 3.2.1 doesn't remove. For example, here's the link Facebook has for a bbc.com web page before any cleaning at all:
https://l.facebook.com/l.php?u=https%3A%2F%2Fbbc.in%2F2D8dMwt%3Ffbclid%3DIwA......... [etc. etc., goes on for about 700 characters]
Obviously a job for CleanLinks, so I right-click and do "Copy clean link," which gives this:
https://bbc.in/2D8dMwt?fbclid=IwAR1zDT_KSlH90LBrbPFKtdbfVVlRKo_pSPysgVVfzwif1WrILbteg-3rj8s
Clearly a big improvement, well done, but there's still tracking data following the question mark that hasn't been removed. I can pretty easily delete it manually, which gives me this:
https://bbc.in/2D8dMwt
It'd be good if the next upgrade of CleanLinks would do that itself, of course, as the extra data is presumably there to track me.
But there's more. If I paste the above (seemingly clean) link into the address bar and hit go, the link turns into this:
https://www.bbc.com/news/world-latin-america-45982501?ocid=socialflow_facebook&ns_source=facebook&ns_mchannel=social&ns_campaign=bbcnews
That's clearly not as clean as it looked. It doesn't seem like much to worry about in terms of invading privacy, and I stand some chance of cleaning it myself by hitting the Escape key immediately after hitting go, then deleting everything from the right up to the question mark, to give this, which is the truly clean link I wanted from the start:
https://www.bbc.com/news/world-latin-america-45982501
After that I can hit go and the page opens fine. Naturally it's a cumbersome and not very reliable workaround. Like I say, the dirt seems fairly harmless anyway, but what bothers me is that Facebook now seems to have the potential to embed something more worrying than that if it ever decides to, and CleanLinks currently won't stop it.
By the way, congratulations on creating an otherwise excellent link cleaner. Best I've found so far :-)
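
What this report suggests is that cleaning needs to be applied repeatedly, since the unwrapped URL can itself carry tracking parameters. A sketch of iterated cleaning, reusing the hypothetical findEmbeddedUrl and stripTracking helpers sketched earlier on this page:

```js
// Sketch: unwrap and strip repeatedly until the URL stops changing.
function cleanFully(link) {
  let previous;
  do {
    previous = link;
    link = findEmbeddedUrl(link) || link;  // unwrap a redirect page, if any
    link = stripTracking(link);            // then drop tracking parameters
  } while (link !== previous);
  return link;
}

console.log(cleanFully('https://l.facebook.com/l.php?u=https%3A%2F%2Fbbc.in%2F2D8dMwt%3Ffbclid%3DIwAR1'));
// -> 'https://bbc.in/2D8dMwt'
```

The last hop in the report (bbc.in expanding server-side to bbc.com with its own ocid/ns_* parameters) cannot be cleaned before navigation; it can only be handled when that redirect response is actually observed, e.g. through the webRequest hooks listed in the permissions section.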

Parse URL properly before cleaning

Cleaning is really applied to only a part of the link: URL.path, URL.search (and maybe URL.hash). Since some rules (e.g. whitelisting) apply per domain, it is useful to start from a properly parsed URL object.

Finding the link is only (partly) tricky in the injected script, since when checking the header we have the full URI. The injected script is useful for visual feedback on cleaned links, but a failure there will be caught later on if request cleaning is enabled.

Once that is done, it will be easy to allow per-domain rules, such as parameters to clean (see discussion #20) or rules that are currently hardcoded (e.g. on google.com/search, don't clean if we find the URL in the parameter q=).
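
A quick illustration of what starting from a parsed URL object buys us, using the standard URL API:

```js
// With a parsed URL, rules can target specific components instead of the raw string.
const url = new URL('https://www.google.com/search?q=https%3A%2F%2Fexample.com');

console.log(url.hostname);               // 'www.google.com' -> select per-domain rules
console.log(url.pathname);               // '/search'        -> path rules like /ref=
console.log(url.searchParams.get('q'));  // 'https://example.com' -> the hardcoded q= exception
```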

Whitelist selection ignored (?)

reproduce:
1. go: https://www.trojmiasto.pl/wiadomosci/Tak-powstaje-druga-kladka-na-Motlawie-n124071.html

2. try to open images from article (could be the first one)

3. this results in an internal, non-existent page

4. go back one page, whitelist latest entry

expected behavior:
clicking the image should zoom it via JavaScript, or go to the picture's HTTP location

what happens:
the whitelist is ignored, or the regex is too complex. No matter how many times it is whitelisted, we are back at the internal non-existent page.

Cleaning of URL tracking for Twitter embeds doesn't seem to work

CL remains unstoppable, whitelisting ignored

  1. go to https://audiosciencereview.com/forum/index.php and select register
  2. notice at the Verification step that the validation box is not there
  3. whitelist the entry (screenshot omitted)
  4. refresh. w3 is no longer in the popup, but the extension icon still displays “1” (not visible on the screenshot, sorry). The whitelisted entry is also missing from the skipped domains in the CleanLinks settings. CleanLinks is unstoppable, so the validation box is still not present on the page (screenshot omitted)
  5. disabling the extension solves the problem

Please extend the notification popup

Currently it only shows

Link cleaned!

URL

I would like it to show:

Page URL: https://blabla.com
Original URL: https://blabla.com/url=https://another.domain
Cleaned URL: https://another.domain

Page URL is required because processing might occur in a background tab/another Firefox window.

Re-evaluate CleanLinks default whitelisted domains and patterns

Skip Links Matching with
\/ServiceLogin|imgres\?|searchbyimage\?|watch%3Fv|auth\?client_id|signup|bing\.com\/widget|oauth|openid\.ns|\.mcstatic\.com|sVidLoc|[Ll]ogout|submit\?url=|magnet:|google\.com\/recaptcha\/


Remove from Links
(?:ref|aff)\\w*|utm_\\w+|(?:merchant|programme|media)ID

I'm not getting this regex to work at all.
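
A possible explanation (an assumption, not verified against the source): the doubled backslashes are string-level escaping, as you would write inside a JSON or JavaScript string. Typed directly into a regex tester, \\w matches a literal backslash followed by w; with one level of escaping removed, the pattern behaves as expected (anchors added here to test whole parameter names):

```js
// As stored, after a string layer consumes one level of escaping:
const re = new RegExp('^(?:(?:ref|aff)\\w*|utm_\\w+|(?:merchant|programme|media)ID)$');
console.log(re.test('utm_campaign'));  // true
console.log(re.test('affiliate'));     // true
console.log(re.test('q'));             // false

// Written as a regex literal, use single backslashes instead:
const literal = /^(?:(?:ref|aff)\w*|utm_\w+|(?:merchant|programme|media)ID)$/;
console.log(literal.test('merchantID'));  // true
```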

I also like ClearURLs' implementation of a mechanism to update the ruleset from GitHub. And reviewing theirs, we're clearly missing a ton.


Skip Domains
accounts.google.com,docs.google.com,translate.google.com,login.live.com,plus.google.com,twitter.com,static.ak.facebook.com,www.linkedin.com,www.virustotal.com,account.live.com,admin.brightcove.com,www.mywot.com,webcache.googleusercontent.com,web.archive.org,accounts.youtube.com,signin.ebay.com

Looking at Diego's commits, these rules are all at least 4 years old. Sites change. I'd rather start with a clean slate and see what is applicable today.

Setting won't stick

As the title says.
Up-to-date Archlinux.
Firefox 60.0.1-1.
New profile with only CleanLinks installed.

Disqus "more comments" link doesn't work

small improvements for the log, popup, and whitelist

just a bunch of ideas:

whitelist:

  • if a domain is already whitelisted, prevent creating duplicates

popup:

  • its size is too big; does it really have to be that big? The top info, if necessary, could be ¼ of its size, leaving more room for the log. No point in such a big font either, I guess.
  • it doesn't scale right. In the attachment you can see that I can scroll the popup itself (by about 5px) (screenshot omitted)
  • signify clearly when link tracking is disabled (e.g. grey out list)
  • clear list button as suggested by jawz101

log:

  • option to view cleaned links for the currently viewed domain/tab only, rather than all the work it has done so far. The switch could be in the settings, or on the opposite side of the “Whitelist Selection” button

Remove "page cleaning" mode

Right now there is a mode in CleanLinks (disabled by default) which preventively cleans all links in a page, as opposed to handling this in the onClick handler. It isn't named anywhere; it's just what happens whenever “event delegation mode” is disabled.
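
For reference, the event-delegation alternative can be pictured as a single document-level click handler that cleans the target link only when it is actually followed. A minimal sketch (assumed behaviour, not the actual CleanLinks source; cleanLink is a hypothetical cleaning function):

```js
// Sketch of event-delegation cleaning: one capture-phase listener for the whole
// document, so links are cleaned only at click time.
document.addEventListener('click', event => {
  const link = event.target.closest('a[href]');
  if (!link)
    return;

  const cleaned = cleanLink(link.href);  // hypothetical cleaning function
  if (cleaned !== link.href)
    link.href = cleaned;                 // also the natural place to notify/count
}, true);  // capture phase: runs before the page's own click handlers
```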

The page-cleaning mode is less efficient for a handful of reasons:

  • It does unnecessary work for links that won't be clicked
  • It doesn't catch all links when trackers are inserted via JavaScript. The "repeat delay" is an ugly, hackish attempt at fixing this, and at catching links in pages loaded in frames or by XHR.
  • Even with the repeat delay, some links slip through the cracks; see any Google search page, for example, where links are clean until onmousedown, when they are replaced with the tracking link.
  • Even if perfect, we'd still need the onClick handler to notify about cleaned links.

I only see one upside:

  • You can visually see all links that were potentially tracking you on the page and their count on the icon.

In light of this I'm thinking of removing this feature to simplify the code. Then we can even use the count on the icon as the total of cleaned links in the current session.

Does anyone use this page-cleaning feature? If so, am I missing something? Does it do something the event-delegation mode can not?

when putting CleanLinks in Firefox's overflow menu it isn't displayed completely

I have a 1280x800 screen, and it isn't enough to display the whole add-on because the overflow menu reduces the add-on's width and has no overflow scroll bars. It might be more of a problem on Mozilla's side, but maybe some of the text could be put in a collapsible container or a tooltip to accommodate those shortcomings.

Redirect loop when visiting Amazon AWS

When visiting console.aws.amazon.com

This addon creates a redirect loop between

  • signin.aws.amazon.com
  • console.aws.amazon.com
  • us-east-1.aws.amazon.com

After a few seconds, Firefox gives up loading the page (screenshot omitted).

I have isolated the issue by doing a clean install of Firefox with only the CleanLinks add-on: AWS works without CleanLinks and fails with CleanLinks.

Whitelisting signin.aws.amazon.com solves the issue

Can't open links in new tab

While using the extension, it's not possible to open links in a new tab using the keyboard shortcut Cmd/Ctrl + click. Reproducible on google.com.

Handle ports in hostnames

Either have host:port separately whitelistable from host, or have them both be allowed by a host whitelist entry.
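
The standard URL API already separates the two notions, which makes the second option straightforward to sketch (illustrative only, not the actual whitelist code):

```js
// URL.host includes the port when present; URL.hostname never does.
function isWhitelisted(link, whitelist) {
  const { host, hostname } = new URL(link);  // e.g. 'example.com:8080' vs 'example.com'
  return whitelist.has(host) || whitelist.has(hostname);
}

// A plain host entry then covers any port on that host:
console.log(isWhitelisted('https://example.com:8080/x', new Set(['example.com'])));  // true
console.log(isWhitelisted('https://example.com/x', new Set(['example.com'])));       // true
```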

all settings fields are now empty and cannot be edited... parsing bug

Adding |?Expires to the end of the default "Skip Links matching with:" field will render all fields unviewable and uneditable.

I'm using Win FF 56.0.2 Nightly.

I added that exception because Clean Links was scrubbing time-limited media links, and this addition should fix the filtering; however, it broke the display.

So, how can this be fixed?
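
For what it's worth, |?Expires is an invalid pattern: after the alternation bar, the ? quantifier has nothing to repeat, so compiling the regex throws. An uncaught SyntaxError while rebuilding the rules would plausibly explain the blank, uneditable fields (an assumption, not confirmed against the source). Escaping the question mark gives the intended behaviour:

```js
try {
  new RegExp('signup|oauth|?Expires');   // SyntaxError: nothing to repeat after '|'
} catch (e) {
  console.log(e.name + ': ' + e.message);
}

new RegExp('signup|oauth|\\?Expires');   // escaped '?' parses fine and matches '?Expires'
```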
