
searx-space's Introduction



Privacy-respecting, hackable metasearch engine

Searx.space lists ready-to-use running instances.

A user, admin and developer handbook is available on the homepage.



Contact

Ask questions or just chat about SearXNG on

IRC
#searxng on libera.chat which is bridged to Matrix.
Matrix
#searxng:matrix.org

Setup

Translations

Help translate SearXNG at Weblate

Contributing

Are you a developer? Have a look at our development quickstart guide; contributing is easy. We also have developer documentation.

Codespaces

You can contribute from your browser using GitHub Codespaces:

  • Fork the repository
  • Click on the <> Code green button
  • Click on the Codespaces tab instead of Local
  • Click on Create codespace on master
  • VSCode is going to start in the browser
  • Wait for git pull && make install to appear and then disappear
  • You have 120 hours per month (see also your list of existing Codespaces)
  • You can start SearXNG using make run in the terminal or by pressing Ctrl+Shift+B

searx-space's Issues

Prioritize privacy or security?

@dalf pushed a commit, dalf@37beff9, that fixed issue #8. While reviewing it, I found that searx-stats2 currently ranks an instance with a better Content Security Policy above an instance that contains no modified scripts (such as tracking or ads), even though the latter is more privacy friendly than an instance with modified scripts.

Which is better for visitors: prioritizing privacy or security by default?

Personally, I think that if #11 gets merged, privacy should be the priority, because modern browsers already have good mechanisms to secure connections between a client and a server.
By contrast, shady scripts built into a Searx instance are more common than external attackers trying to gather visitors' searches by injecting scripts into the HTML page.

CNAME with IPv6 not working ?

Hi,

I have been upgrading our Searx instance at La Quadrature du Net (https://searx.laquadrature.net).
https://searx.space/ says that we have no IPv6.
However, we do have IPv6 and the server is configured to respond over IPv6.

---> host searx.laquadrature.net
searx.laquadrature.net is an alias for tau.lqdn.fr.
tau.lqdn.fr has address 185.34.33.4
tau.lqdn.fr has IPv6 address 2a00:99a0:0:1000::4

And the relevant nginx configuration:

listen 185.34.33.4:443 ssl http2;
listen [2a00:99a0:0:1000::4]:443 ssl http2;
server_name searx.laquadrature.net;

I'm wondering what is happening: is it a bug in the stats code, or is it because the domain is a CNAME?

TLS grade having hard time to update

I recently started hosting my own searx instance, and after adding some custom bot filtering I scored poorly on both the TLS and CSP grades. I fixed this by whitelisting some IPs/user agents. However, days have passed and the TLS grade is still not updated, whereas the CSP grade refreshed within ~1 day.

What leads me to believe that this is an issue:


My instance:
https://searx.monicz.pl/

Instances.json file contains removed engines from searx master

Hi,

The instances.json file contains data about removed engines.

Two examples are:

On the searx.space engines tab, we should either remove the columns of engines that have been removed from searx master, or add a ⚠️ warning sign to the faroo and asksteem rows of searx instances that still list these removed engines:

Screenshot 2020-04-21 at 11 44 55

And maybe for searx instances that are correctly updated and don't list the removed engines, we should indicate this with this kind of icon in the related rows:

minus-in-circle

I don't know which code has to be changed for that.

The best option is to add these two signs and, if some searx instances have not been updated in 90 days or so, to remove them from the list of available searx instances.

SUGGESTION: Contacting the instance's maintainer(s)

This is derived from searx/searx#1972. I personally think that there should be a way to contact the maintainer(s) of a public instance (email for example). It is harder to trust this awesome service if there is no way to contact the maintainer(s).

My suggestion for this change is to tell the maintainer(s) to provide an email address at a minimum.

EDIT: I think I may have applied the labels wrong

searx.space - shows incorrect data

https://searx.space/ does not refresh the results for active instances; for example, nibblehole.com is missing its CSP grade. The search response time often shows incorrect data: suddenly 70% are in red? I think the problem is with your network, maybe too much traffic on the server?

basic.py: use .../config URL to check searx version.

Currently the project checks whether a URL is a searx instance using this regex:
https://github.com/dalf/searx-stats2/blob/634c30527e1c31dd25558780873408898b33ef16/searxstats/fetcher/basic.py#L15

Some people want to customize this string; in that case the instance appears in the "Error" tab (Searx instance not found).

It would be better to check the .../config URL.

Related to https://github.com/dalf/searx-stats2/issues/29#issuecomment-604428693
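
The check suggested above could be sketched as follows. This is a minimal illustration, not the searx-stats2 implementation: the function names `looks_like_searx_config` and `fetch_config` are hypothetical, and the assumption that the `/config` JSON payload contains a `version` string and an `engines` list should be verified against a real instance.

```python
import json
import urllib.request


def looks_like_searx_config(config):
    """Heuristic: the parsed /config payload has a version string and an engine list.

    The key names "version" and "engines" are assumptions for this sketch.
    """
    return (
        isinstance(config, dict)
        and isinstance(config.get("version"), str)
        and isinstance(config.get("engines"), list)
    )


def fetch_config(instance_url, timeout=10.0):
    """Download and parse <instance_url>/config.

    Returns None on network errors or invalid JSON.
    """
    url = instance_url.rstrip("/") + "/config"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        return None
```

A caller would combine the two, for example `looks_like_searx_config(fetch_config(url))`, so that a customized page title no longer moves an instance to the "Error" tab.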

instance status

Is https://searx.space/ still checking the instance status? It looks like the process that checks the condition of instances stopped a few days ago.

About https://searx.space

In response to searx/searx#1853 (comment)

For now, the searx-stats2 project has collaborators: asciimoo and return42 (not sure how to open).

For now, I'm the one who hosts searx.space (Kimsufi host). Currently it hosts some other website for my personal usage.

The searx.space log contains: timestamp, method, url, proto, status, size (no IP, no user agent, no referrer). The log could most probably be removed completely, but I don't think there is a privacy problem here, and it helps to know some information (bandwidth, ratio between the 200 and 304 HTTP status codes).

It runs the master branch of https://github.com/dalf/searx-stats2 (git pull && make docker-build are executed by me).
In /etc/cron.d/searx-stats2:

0 1,4,7,10,13,16,19,22 * * *   root cd /srv/searx-stats2 && make docker-run &> /tmp/searx-stats2.log || exit 0

It could be interesting to display the results of searx-stats2 from different locations, but IMHO the actual fetch should be done only once to avoid hammering the different instances. Related to https://github.com/dalf/searx-stats2/issues/1

I thought about different things:

  • use custom domains and GitHub Pages.
  • run searx-stats2 in a VM, and always run the master branch (or a "searx-space" branch). Not sure if a different root user is a good idea (?).
  • allow searx to download and display instances.json. Additional benefit: at the same time, each instance could ping searx-stats2 to say "hey I'm a public instance".
  • spread instances.json using P2P networks.

It is difficult to combine everything:

  • fetch HTTPS / CSP / HTML grades only once.
  • get response times from different locations.
  • AND avoid a single point of failure / one central location
  • AND #20: how can an ops person know that requests come from searx-stats2 if the check is run multiple times from different locations?

FYI, searx.me seems to be banned in China according to https://viewdns.info/ (it wasn't the case in 2016).

Show Instances that have a Morty or other Proxy

Unfortunately we're up against CloudFlare™️ and that is a problem, especially if we use Tor, so a proxy is needed. Some of the instances offer one, e.g. Morty, but many do not, and instances have been disappearing lately, or at least their Tor Hidden Services. So it would be nice to have a list of instances that offer a proxy.

Originally opened in: searx/searx#1852

Thanks. 🍻

Wrong country report for Oracle cloud and Hurricane electric network

searx.be is on networks from Oracle cloud and Hurricane Electric (tunnelbroker).
The country reported on searx.space is US, but it should be "DE" because the IP belongs to a server located in Germany.

I used this tool for checking the real country location of an IP address: https://www.iplocation.net/ip-lookup. For IPv4 address of searx.be the reported country on this tool is Germany and for the IPv6 it's also Germany but only for DB-IP.

I know this is not simple to solve: I've read the code, and it uses WHOIS to check the country, but Oracle cloud doesn't report the country in the WHOIS.

IP2Location offers a free Geo database which can be used locally and seems at least a bit more accurate than WHOIS, at least for the IPv4 address of searx.be: https://www.ip2location.com/demo

Connection Timeout on some instance

For reference: dalf/cryptcheck-backend#1 (comment)

searx-stats2 has some "Connection Timeout" error despite a reliable searx instance.

I have been able to reproduce the connection timeout with this code ( httpx==0.11.0 )

# Run in IPython, which supports top-level await.
import httpx
async with httpx.AsyncClient() as c:
    r = await c.get('https://searx.be')

TimeoutError                              Traceback (most recent call last)
~/ve/lib/python3.7/site-packages/httpx/backends/asyncio.py in open_tcp_stream(self, hostname, port, ssl_context, timeout)
    198                     asyncio.open_connection(hostname, port, ssl=ssl_context),
--> 199                     timeout.connect_timeout,
    200                 )

/usr/lib/python3.7/asyncio/tasks.py in wait_for(fut, timeout, loop)
    422             await _cancel_and_wait(fut, loop=loop)
--> 423             raise futures.TimeoutError()
    424     finally:

TimeoutError: 

During handling of the above exception, another exception occurred:

ConnectTimeout                            Traceback (most recent call last)
<ipython-input-2-5d9ab2d17d71> in async-def-wrapper()

~/ve/lib/python3.7/site-packages/httpx/client.py in get(self, url, params, headers, cookies, auth, allow_redirects, timeout)
   1235             auth=auth,
   1236             allow_redirects=allow_redirects,
-> 1237             timeout=timeout,
   1238         )
   1239 

Note: the bug doesn't reappear immediately; I have to wait a few minutes before the same error appears again.

The latest httpx version (0.13.3) seems to fix the problem.

searx.space on tor

Searx.space is only available on the clearnet. It has an option to display working Tor Searx instances, so maybe searx.space could also be reachable from an onion address, to fully preserve privacy when searching for a searx instance hosted on Tor?

Searx instances with simple theme as default don't display search response times in searx.space

Instances that otherwise have good TLS grades, observatory ratings and so on are at the bottom of the list because no search response times are displayed for them. Examples are: https://spot.ecloud.global/, https://start.paulgo.io/ and https://searx.feneas.org/. What they all have in common is that they use some variant of the simple theme as default. For my instance it's just the simple theme with some CSS (https://github.com/paulgoio/searx). I also enabled /stats, which did not help either.
I can also see searx.space hitting filtron, but still no search response time:


2021-06-05 09:40:37 | [searx.space] 2021-06-05 07:40:37.752 51.15.252.168 GET start.paulgo.io/search?q=%21google+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
| 2021-06-05 09:40:37 | [searx.space] 2021-06-05 07:40:37.727 51.15.252.168 GET start.paulgo.io/?q=%21google+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:37:56 | [searx.space] 2021-06-05 07:37:56.011 51.15.252.168 GET start.paulgo.io/search?q=%21google+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:37:55 | [searx.space] 2021-06-05 07:37:55.987 51.15.252.168 GET start.paulgo.io/?q=%21google+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:35:26 | [searx.space] 2021-06-05 07:35:26.291 51.15.252.168 GET start.paulgo.io/search?q=%21wp+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:35:26 | [searx.space] 2021-06-05 07:35:26.265 51.15.252.168 GET start.paulgo.io/?q=%21wp+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:32:56 | [searx.space] 2021-06-05 07:32:56.609 51.15.252.168 GET start.paulgo.io/search?q=%21wp+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
 | 2021-06-05 09:32:56 | [searx.space] 2021-06-05 07:32:56.585 51.15.252.168 GET start.paulgo.io/?q=%21wp+time "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"

[Feature Request] SearX.Space: Add A "Eyes" Filter For Country

Eyes Filter

It would be a lot easier to browse https://searx.space/ by country if there were an easy way to show only instances hosted outside of the 5 Eyes, 9 Eyes, and 14 Eyes countries.
So I would suggest a basic dropdown near the country field, where you can click the 5, 9, or 14 Eyes toggles, or select none to only see instances outside of the 14 Eyes.
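
The filter requested above could be sketched as follows. This is only an illustration under stated assumptions: the function name `outside_eyes` and the `country` field holding an ISO 3166-1 alpha-2 code are hypothetical, and the sketch interprets each toggle as "hide instances hosted in that group".

```python
# Commonly cited membership of the three intelligence-sharing groups.
FIVE_EYES = frozenset({"US", "GB", "CA", "AU", "NZ"})
NINE_EYES = FIVE_EYES | {"DK", "FR", "NL", "NO"}
FOURTEEN_EYES = NINE_EYES | {"DE", "BE", "IT", "ES", "SE"}

EYES_GROUPS = {5: FIVE_EYES, 9: NINE_EYES, 14: FOURTEEN_EYES}


def outside_eyes(instances, group=14):
    """Keep only instances whose country code is not in the selected group.

    `instances` is a list of dicts with a hypothetical "country" key.
    """
    excluded = EYES_GROUPS[group]
    return [i for i in instances if i.get("country") not in excluded]


instances = [
    {"url": "https://a.example", "country": "US"},
    {"url": "https://b.example", "country": "FR"},
    {"url": "https://c.example", "country": "DE"},
    {"url": "https://d.example", "country": "CH"},
]
urls_outside_14 = [i["url"] for i in outside_eyes(instances, 14)]
```

Because each larger group is a superset of the smaller one, a single `group` parameter is enough; no combination of toggles needs to be handled.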

HTTP Grade D and E discriminate searx instances with a custom theme

Giving a lower rating (and probably ranking lower) because a searx instance customized the default theme is actually quite bad.
A few examples that customized the default theme and are apart from that perfectly good Searx instances:

And I can think of another case that would get a bad rating: minifying on the fly all the assets that aren't already minified, for example search_on_category_select.js.

Giving a grade based on whether the HTML files match the original Searx source code is a good idea in principle, but it reaches into a field where bad and good intentions are mixed together. In this field, automatically distinguishing the bad from the good is very hard.

warning when instance is behind a proxy/MITM service like Cloudflare

Cloudflare and other MITM services can see the traffic between the visitor and the Searx instance.
This is a huge privacy issue.

Introducing a warning, for example by making the certificate column red for instances with a Cloudflare certificate, is a good way to warn visitors of a potential privacy risk.

Example of a potential modification of the interface:
2019-12-13_21-09

First public version

Which issues block a first public release:

Verify that the default engines are indeed working

This instance, https://search.stinpriza.org/, is sometimes at the top of the list because its default engines sometimes time out instantly, which gives a very low response time:
image
image

Checking whether the default engines are working is probably the best way to avoid having broken instances at the top of the list.
I remember there was an instance like this a few months ago on stats.searx.xyz; it was always at the top of the list because it instantly returned no results.
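
One way to express this idea is a sort key that demotes instances returning no results, however fast they answer. The sketch below is hypothetical: the field names `result_count` and `response_time` and the function `rank_instances` are assumptions for illustration, not part of searx-stats2.

```python
def rank_instances(instances):
    """Sort by response time, but place instances with no results last.

    Each instance is a dict with hypothetical keys:
      - "result_count": number of results returned by the test query
      - "response_time": seconds taken to answer the test query
    """
    def sort_key(instance):
        broken = instance["result_count"] == 0
        # False sorts before True, so working instances come first.
        return (broken, instance["response_time"])

    return sorted(instances, key=sort_key)


checked = [
    {"url": "https://fast-but-empty.example", "result_count": 0, "response_time": 0.01},
    {"url": "https://slow.example", "result_count": 12, "response_time": 1.8},
    {"url": "https://quick.example", "result_count": 10, "response_time": 0.4},
]
ranked_urls = [i["url"] for i in rank_instances(checked)]
```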

TLS grade not displaying correctly

TLS grade isn't displaying for a lot of instances even though most of them would have gotten A+.
screenshot

Cryptcheck still displays the correct rating though.
screenshot2

support tor services

Currently, none of the Searx instances served from tor listed in the wiki are listed on the interface.

An array of table values returning on get request from server

Could you provide an API that returns the searx.space table of online HTTPS instances as an array of arrays or an array of dictionaries?

It would be really helpful for finding online searx instances with suitable characteristics. If you can add querying, all the better.

Thanks.
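
Until such an API exists, a client could flatten the published instances.json itself. The sketch below assumes a layout in which a top-level `instances` object maps each URL to a dictionary of details, and that each entry carries a `network_type` field; both the layout and the function name `instances_as_rows` are assumptions to verify against the real file.

```python
def instances_as_rows(data, network_type=None):
    """Flatten {"instances": {url: details}} into a list of dictionaries.

    If `network_type` is given, keep only entries whose (assumed)
    "network_type" field matches it.
    """
    rows = []
    for url, details in data.get("instances", {}).items():
        if network_type is not None and details.get("network_type") != network_type:
            continue
        row = {"url": url}
        row.update(details)
        rows.append(row)
    return rows


sample = {
    "instances": {
        "https://one.example/": {"network_type": "normal", "version": "0.16.0"},
        "http://two.onion/": {"network_type": "tor", "version": "0.15.0"},
    }
}
clearnet_rows = instances_as_rows(sample, network_type="normal")
```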

tls grade: cryptcheck.fr or ssllabs ?

As the first criterion used to sort the instance list, the TLS grade has to be reliable; otherwise the order will be "blinking". Unfortunately, this is not the case right now using cryptcheck:

  • no result is returned if a server has an IPv6 misconfiguration.
  • some other cases that I can't determine.

It is possible to run cryptcheck locally using docker and / or send pull requests to the git repository.
But there seems to be a consensus around ssllabs, at least in the current instance list.

Here is some information about ssllabs:

Extract :

...
You are not allowed, without our express permission, to:

  • use the API for commercial purposes;
  • use the API on a public web site;
  • publish any information received from us via the APIs without the owner’s express permission;
  • distribute, proxy, or otherwise make the API available for access or use by any person or entity other than your authorized employees, including but not limited to acting as a service bureau or developing a competing product or service offering.

...

Response times are server based and not based on the location of the visitor

Currently, the response times are retrieved from only one server, and therefore from one location in the world.
Thus, some Searx instances could get a bad ranking simply because they are far away from the server that serves the statistics.

One way to fix that would be to test every Searx instance from multiple locations.
The problem is that renting a server running all the time in multiple locations in the world is expensive.

So I think the best way to test each Searx instance without paying anything would be to use the CDN of Zeit. They have quite a lot of servers in different locations of the world: https://zeit.co/docs/v2/network/regions-and-providers/
We would just have to implement the test into a lambda function.

CSP grade: use the http-observatory project instead of https://observatory.mozilla.org

The CSP column (*) uses the external service https://observatory.mozilla.org/ . For some reason, the results are not reliable.

(*) source code: https://github.com/dalf/searx-stats2/blob/master/searxstats/fetcher/mozillaobs.py

Most probably it would be better to embed the python code: https://github.com/mozilla/http-observatory

Moreover it would be possible to check instances not installed at the root domain ( https://.../searx ).

JavaScript free

2020-01-07_07-49
A JavaScript-free version of the website would be great, because in the Searx community it's pretty common for users not to enable JavaScript by default.
Moreover, Searx itself supports searches without JavaScript enabled, and stats.searx.xyz displays its results without JavaScript enabled.

Engine tab doesn't work as expected.

Description

All instances except one display 🟡 or ?.

Current way to have ✔️ and ❌

If your searx setup uses docker, then see

There is no documentation on how to install searx-checker if you don't use docker.
Basically, you need to run the following every day (cron.daily):

python3 checker/checker.py -o <somewhere>/status.json http://127.0.0.1:8888

Then map <your searx instance url>/status to <somewhere>/status.json

Example: https://a.searx.space/status

searx-checker should be embedded into searx

a.searx.space: wrong country, should be FR instead of PL

In https://searx.space, the country shows PL, it should be FR.

$ whois 2001:41d0:8:4fd4::1
...
inet6num:       2001:41d0::/44
netname:        OVH-200141d00000
descr:          OVH
country:        FR
admin-c:        OK217-RIPE
tech-c:         OTC2-RIPE
mnt-by:         OVH-MNT
status:         AGGREGATED-BY-LIR
created:        2015-04-10T21:50:51Z
last-modified:  2015-04-10T21:50:51Z
source:         RIPE
assignment-size:64

Comment https://searx.bar

Are instances in the "Offline & Error" category regularly scanned to see if they're working? If not, can you please verify if https://searx.bar is working and re-add it to the "Online HTTPS" list if so? Cryptcheck should be able to resolve the address now. That was the issue that led to Cryptcheck showing a timeout.

Cryptcheck results

Uptime average of public instances

This is a breakdown of issue #54, which contained too many features at once.


Track the uptime of public instances. We could use uptime robot or another tool for that and then use their API for fetching the overall uptime. That's what Invidious does with their instances list: https://instances.invidio.us/

TLS grade having a hard time to update

This is a follow up to #50.

So once again I am having an issue with my instance's TLS grade as apparently my ciphers are too modern. Talking about searx.monicz.pl here. And here is a result from cryptcheck itself: https://cryptcheck.fr/https/searx.monicz.pl

I believe this is related to me using the x25519 curve for the handshake. Here is an issue posted on cryptcheck's repository: aeris/cryptcheck#30

Here are some extra TLS details from the ssllabs guys: https://www.ssllabs.com/ssltest/analyze.html?d=searx.monicz.pl

A few words from me: x25519 is not an unusual curve to choose. It has been widely supported for a few good years now. From the ssllabs result you may find that my encryption is valid for all modern (and not) browsers like Chrome 69, Firefox 62.

My opinion is that cryptcheck is currently unable to process modern encryption, so an alternative should be found. Fortunately there are a few open-source projects which focus on bringing the ssllabs API to life. Learn more at https://www.ssllabs.com/projects/ssllabs-apis/index.html. Some of them are developed in Python, so I believe that the implementation itself should not be much of a hassle.

Ssllabs has been keeping up with the latest TLS improvements and vulnerabilities. I would say that it is a service of choice when it comes to testing your website's TLS configuration. And it also provides a TLS grading similar to cryptcheck's one.

external_ressources.py: add support for GIT_URL in /config

The PR searx/searx#1900 brings support for searx branding.

The .../config URL adds some new fields:

  • brand
    • GIT_URL
    • DOC_URL

GIT_URL should be used to check the vanilla status.

Internal to do:

  • get_repository: the directory parameter becomes a parent directory. The real directory becomes something like os.path.join(directory, sha256(url)).
  • GIT_URL must be ready when external_ressources.py runs.
  • If the HTML column is neither V nor E, perhaps the instance doesn't comply with the AGPL license (see dalf/searx-instances#23 )
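
The directory scheme from the first to-do item can be sketched like this. The helper name `repository_directory` is hypothetical; only the `os.path.join(directory, sha256(url))` idea comes from the issue. Since `sha256` has to be applied to bytes and rendered as text, the sketch assumes UTF-8 encoding and a hex digest.

```python
import hashlib
import os


def repository_directory(parent_directory, git_url):
    """Return a per-repository directory derived from the Git URL.

    Hashing the URL gives each fork its own stable, filesystem-safe
    directory name under the shared parent directory.
    """
    digest = hashlib.sha256(git_url.encode("utf-8")).hexdigest()
    return os.path.join(parent_directory, digest)


upstream = repository_directory("/tmp/repos", "https://github.com/searx/searx")
fork = repository_directory("/tmp/repos", "https://example.org/fork/searx")
```

The same URL always maps to the same directory, so a clone can be reused across runs, while two different forks never share a checkout.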

Extended version:

  • add a new html page (a new tab) in the output to list different forks.
