Comments (4)
AFAIK clients should fall back to another test helper if one of them is down, but that may not be the case.
This is what ooni-probe does, but MK is lagging behind in this respect.
In particular, my understanding is that (@hellais correct me if I'm wrong) the bouncer returns three different collectors and/or test helpers (if any):
- the onion one
- the https one
- the cloudfronted one
Regarding specifically MK: it cannot use the onion one; it will use the https one; with some more hammering it will also be able to use the cloudfronted one.
Question: how relevant is the fact that a client will retry with another helper in light of the fact that the three returned helpers (collectors) may all point to the same VM implementation?
I mean, let's assume that the bouncer returns:
- A.onion pointing to VM A
- A.https pointing to VM A
- A.cloudfronted pointing to VM A
In such a case, what do we gain by retrying all of them, if the breakage is caused by A being down?
- Is the bouncer smart enough to return helpers (or collectors) belonging to different VMs (e.g. A.onion, B.https, C.cloudfronted) so as to increase diversity?
- Is the bouncer able to return more than one entry for each category?
- Is the bouncer somehow connected to alerts so that it reasonably knows that A is up when it returns A.onion, A.https and A.cloudfronted?
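To make the diversity question concrete, here is a minimal Python sketch of how a client could order its retries if it knew which backend each address points to. The `backend` label, `order_for_diversity`, `first_reachable` and the `is_up` probe are all hypothetical names for illustration; nothing in this thread says the bouncer exposes the backend, and this is not how ooni-probe or MK actually do it.

```python
from collections import OrderedDict
from itertools import zip_longest


def order_for_diversity(helpers):
    """Interleave helpers so consecutive attempts hit different backends.

    `helpers` is a list of (address, backend) pairs. The backend label is an
    assumption: the bouncer is not known to expose it.
    """
    by_backend = OrderedDict()
    for address, backend in helpers:
        by_backend.setdefault(backend, []).append(address)
    ordered = []
    # Take one address per backend per round, preserving the input order.
    for round_ in zip_longest(*by_backend.values()):
        ordered.extend(a for a in round_ if a is not None)
    return ordered


def first_reachable(helpers, is_up):
    """Return the first address for which `is_up(address)` is true, else None."""
    for address in order_for_diversity(helpers):
        if is_up(address):
            return address
    return None
```

With A.onion, A.https and A.cloudfronted all on VM A, every retry fails together when A is down and `first_reachable` returns `None`; retrying only helps once at least one entry lives on a different VM.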
from sysadmin.
Relapse at b.web-connectivity.th.ooni.io, timeline UTC:
25 Sep 00:00 2017-09-25T00:00:40.448510440Z Another twistd server is running, PID 1
25 Sep 00:05 [FIRING:1] InstanceDown
25 Sep 07:36 2017-09-25T07:36:01.047083733Z Removing stale pidfile /oonib.pid
25 Sep 07:40 [RESOLVED] InstanceDown
from sysadmin.
This relapsed again twice recently:
01:11 18th November 2019 UTC+2
[FIRING] mia-wcth.ooni.io:9100 had 1 OOMs in 5m
[FIRING] http://y3zq5fwelrzkkv3s.onion/status endpoint down
[FIRING] http://2lzg3f4r3eapfi6j.onion/status endpoint down
[RESOLVED] mia-wcth.ooni.io:9100 had 1 OOMs in 5m
hellais 11:33 18th November 2019 UTC+2
[RESOLVED] https://mia-wcth.ooni.io/status endpoint down
This was actually down. The issue was:
This could either be a previously started instance of your application or a
different application entirely. To start a new one, either run it in some other
directory, or use the --pidfile and --logfile parameters to avoid clashes.
Another twistd server is running, PID 1
The fix was to delete /srv/web_connectivity/oonib.pid and restart the docker container
and then
00:34 21st November 2019 UTC+2
[FIRING] https://mia-wcth.ooni.io/status endpoint down
10:29 21st November 2019 UTC+2
Damn the mia-wcth went down again due to: https://openobservatory.slack.com/archives/C38EJ0CET/p1574073207161400
10:29
[RESOLVED] https://mia-wcth.ooni.io/status endpoint down
from sysadmin.
Relapsed again:
[FIRING] https://mia-wcth.ooni.io/status endpoint down
docker stop 1ca0dce565e5
rm /srv/web_connectivity/oonib.pid
docker restart 1ca0dce565e5
docker logs 1ca0dce565e5 # should show Starting web_connectivity helper...
tail /srv/web_connectivity/logs/oonibackend.log -f # unstuck
[RESOLVED] mia-run.ooni.nu:9100 is not OK, check `systemctl list-units | grep failed`
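The manual `rm` of the pidfile could perhaps be automated with a small pre-start step. Below is a rough Python sketch, assuming the breakage comes from a leftover pidfile whose recorded PID (PID 1 in the log above) matches the freshly started process or no longer exists. That reading is a guess based on the error message, and `clear_leftover_pidfile` and `pid_is_alive` are made-up helper names, not part of oonib or twistd.

```python
import os


def pid_is_alive(pid):
    """Best-effort check that a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clear_leftover_pidfile(path, own_pid=None):
    """Remove `path` if it cannot belong to another live instance.

    Returns True if the file was removed, False otherwise.
    """
    own_pid = os.getpid() if own_pid is None else own_pid
    try:
        with open(path) as fh:
            recorded = int(fh.read().strip())
    except FileNotFoundError:
        return False
    except ValueError:
        recorded = None  # unreadable content, treat the file as stale
    stale = (
        recorded is None
        or recorded <= 0
        or recorded == own_pid  # we cannot be the previous instance
        or not pid_is_alive(recorded)
    )
    if stale:
        os.remove(path)
    return stale
```

Something like `clear_leftover_pidfile("/srv/web_connectivity/oonib.pid")` would have to run in the container entrypoint before twistd starts; whether that fits the actual deployment is untested.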
from sysadmin.