Comments (4)
I had a quick play with this. 'nserror' codes are errors from the desktop platform - for sync, these are in most cases network errors, i.e. errors where we can't even start an HTTP request. (As phil says, httperror typically means an HTTP error response, as opposed to a failure to actually make the request at all.)
Finding these codes in the tree is tricky, but http://james-ross.co.uk/mozilla/misc/nserror is good enough. The "most popular" ones are:
2152398878 = NS_ERROR_UNKNOWN_HOST
2152398861 = NS_ERROR_CONNECTION_REFUSED
2152398862 = NS_ERROR_NET_TIMEOUT
2147500036 = NS_ERROR_ABORT
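For anyone who wants to sanity-check these without the lookup page, here is a rough Python sketch that splits a decimal code into its hex form, module and detail fields. The bit layout and constants (a 0x45 module offset, network module = 6) are my understanding of how the platform packs nsresult values, so treat them as assumptions rather than something confirmed in this thread. `decode_nserror` and `KNOWN_NAMES` are made-up names for illustration, and the table only holds the four codes listed above.

```python
# Assumed nsresult layout: high bit = failure, bits 16-28 = module + offset,
# low 16 bits = detail code. These constants are assumptions, see lead-in.
SEVERITY_BIT = 0x80000000
MODULE_BASE_OFFSET = 0x45
MODULE_NETWORK = 6

# Only the four codes from the list above; not a complete table.
KNOWN_NAMES = {
    2152398878: "NS_ERROR_UNKNOWN_HOST",
    2152398861: "NS_ERROR_CONNECTION_REFUSED",
    2152398862: "NS_ERROR_NET_TIMEOUT",
    2147500036: "NS_ERROR_ABORT",
}


def decode_nserror(code):
    """Split a decimal nserror code into hex, module and detail fields."""
    field = (code >> 16) & 0x1FFF
    # Generic codes such as NS_ERROR_ABORT do not carry the module offset.
    module = field - MODULE_BASE_OFFSET if field >= MODULE_BASE_OFFSET else None
    return {
        "hex": f"0x{code:08X}",
        "is_failure": bool(code & SEVERITY_BIT),
        "module": module,
        "is_network": module == MODULE_NETWORK,
        "detail": code & 0xFFFF,
        "name": KNOWN_NAMES.get(code, "unknown"),
    }


if __name__ == "__main__":
    for code in KNOWN_NAMES:
        print(code, decode_nserror(code))
```

Under those assumptions, the first three codes decode as network-module errors and NS_ERROR_ABORT decodes as a generic one, which fits the "mostly network errors" reading above.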
I forked that query and changed the SQL to look only at nserror, and I see NS_ERROR_UNKNOWN_HOST. When I tried to publish it, it warned me I was overwriting something else, which surprised me since I had forked it. I was worried I was overwriting something bad and, long story short, I got confused :) So I think https://sql.telemetry.mozilla.org/queries/70710/source#177932 isn't going to work for you, and I've pasted the SQL below.
Long story short, spanner has those 4 error codes in higher numbers, which accounts for most of the difference. I've no idea why that would be the case though.
My hacks to the SQL (only look at nserror, and put the code before the node type):
WITH
d as (
  SELECT
    timestamp_trunc(submission_timestamp,hour) as hour,
    case when s.failure_reason is null then concat('success', '_', case when payload.sync_node_type is null then 'unknown' else payload.sync_node_type end)
      else concat(s.failure_reason.code, '_', case when payload.sync_node_type is null then 'unknown' else payload.sync_node_type end) end as node_type_sync_failure_reason,
    count(*) as syncs_with_name,
    count(distinct payload.uid) as users_with_name
  from `moz-fx-data-shared-prod.telemetry_live.sync_v4`
  CROSS JOIN UNNEST(payload.syncs) s
  where s.failure_reason.name = "nserror"
    AND date(submission_timestamp) >= date(2020,1,9)
    AND normalized_channel in ('nightly', 'beta', 'release')
    AND safe_cast(substr(application.version,1,2) as int64) >= 73
  group by 1,2 order by 1,2
),
totals as (
  SELECT
    timestamp_trunc(submission_timestamp,hour) as hour,
    case when payload.sync_node_type is null then 'unknown' else payload.sync_node_type end as sync_node_type,
    count(*) as total_syncs,
    count(distinct payload.uid) as total_users
  from `moz-fx-data-shared-prod.telemetry_live.sync_v4`
  CROSS JOIN UNNEST(payload.syncs) s
  where date(submission_timestamp) >= date(2020,1,9)
    AND normalized_channel in ('nightly', 'beta', 'release')
    AND safe_cast(substr(application.version,1,2) as int64) >= 73
  group by 1,2
)
SELECT
  d.*,
  t.total_users,
  t.total_syncs,
  t.sync_node_type,
  d.syncs_with_name / t.total_syncs as proportion_of_syncs_with_failure_reason,
  d.users_with_name / t.total_users as proportion_of_users_with_failure_reason
FROM d
INNER JOIN totals t
  ON d.hour = t.hour AND split(d.node_type_sync_failure_reason, '_')[OFFSET(1)] = t.sync_node_type
WHERE t.sync_node_type != 'unknown'
order by 1,2
from services-engineering.
Looking at Overall Sync Failure Name, there appear to be similar differences between mysql_nserror and spanner_nserror (disabling mysql/spanner_success makes this clearer).
There are also similar differences between spanner_httperror and mysql_httperror.
I believe nserror is a generic class of errors, and nserror here indicates a DNS lookup failure. httperror looks like any kind of error response.
from services-engineering.
I'll see if I can dig up the error codes. @pjenvey do you have a hard timeline for this, or just sooner rather than later? I should be able to take a peek by EOW.
from services-engineering.
@irrationalagent Just sooner rather than later, thanks!
from services-engineering.