mozilla / mdv2-prototype

Prototype for the Mozilla Measurement Dashboard Version 2

Home Page: https://mozilla.github.io/mdv2-prototype/

License: Mozilla Public License 2.0

HTML 5.83% CSS 1.34% JavaScript 92.83%

mdv2-prototype's Introduction

mdv2

mdv2 is a prototype Telemetry Measurement Dashboard built to evaluate designs and architectures for suitability as a replacement for the venerable telemetry.mozilla.org Measurement Dashboard.

It is built with sample data, a custom data format, React components, Bootstrap styling, Plotly plots, and the help of many people.

This project owes its design and architecture to the work of Erin Comerford over the course of her Outreachy internship in 2018.

mdv2-prototype's People

Contributors

Stargazers

Watchers

Forkers

openjck chutten ecomerford georgf mozilla-github-standards

mdv2-prototype's Issues

Add some text around the metric selectors

It would be a nice & cheap improvement to add some text around the measure, version and channel selectors, similar to the mockup.

For now, I'm suggesting:

"Measure:" before the metric selector (staying consistent with the page title)
"Viewing Firefox data per user for Firefox <channel> version <version>".

CODE_OF_CONDUCT.md file missing

As of January 1 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:

Required Text - All text under the headings Community Participation Guidelines and How to Report is required and should not be altered.
Optional Text - The Project Specific Etiquette heading provides a space to speak more specifically about ways people can work effectively and inclusively together. Some examples of those can be found on the Firefox Debugger project, and Common Voice. (The optional part is commented out in the raw template file, and will not be visible until you modify and uncomment that part.)

If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].

(Message COC001)

Use the Probe Dictionary for metrics metadata

Show metrics descriptions that are fetched from the probe dictionary (https://telemetry.mozilla.org/probe-dictionary/)

Use exponential scale for x-axis of comparison view

In the current comparison view, both graphs have very long tails, which is not ideal for viewing the main portion of the graph. I'd like to update this so that the x-axis is scaled exponentially, which should "normalize" the graph, as it were, or centralize the main curves.

Give Distribution View's "extraText" a design pass

In #56 we introduced a piece of extra text below boolean plots to help our users answer the question "What %ge/How many users have ever X?"

The wording isn't final (English is hard), the design isn't final (font, whitespace, colour, weight, decoration), and adding this text opens up the possibility of maybe adding more words down there (maybe our users also want to know "What %ge/How many users have ever not X?").

And maybe we want words down there for non-boolean data as well (#56 removed the "Mean: X" display, but we could put it back)

This issue is for discussing these aspects of the design and arriving at a consensus.

Determine what React's best practices are for props and apply them to our components

In #73 @georgf notes that "data" isn't the best name or the best structure for BarPlot's props. After the DataStore refactor maybe calling it metricDataArray or something would do, but this highlights that I don't have an idea what the best practices for props are, so I don't know how we should best apply them here.

This Issue is for researching and documenting any best practices for props passing, and then applying them to our components.

Add server-side caching of data requests

Right now there are 2 possible options we're choosing from for how the client-aggregate data will be queried. We will either use Druid (http://druid.io/) to store the data to be queried or GraphQL (https://graphql.org/) on top of our data in S3/parquet for example.

Both these technologies allow for a single endpoint to be queried with a json blob to define the query. The benefit of this is that we do not need to create a separate service with multiple endpoints (like we do in the current mdv: https://github.com/mozilla/python_mozaggregator/blob/master/mozaggregator/service.py), one for each query type. The queries can be created on the client-side as needed.

That said, we will still need server-side caching of the responses (e.g. in mdv1 it's done here: https://github.com/mozilla/python_mozaggregator/blob/master/mozaggregator/service.py#L260)

Unify user / client / respondent terminology

We're starting to use mixed terminology in the dashboard for "users", "clients" & "respondents".

Let's use a single term here. We defaulted to "user", we can have proposals and collect feedback if we need to change it.

Use categorical labels in ComparisonView

When loading HTTP_SCHEME_UPGRADE_TYPE we just show numerical labels on the x-axis.
The distribution view shows the categorical labels instead, let's do the same in the comparison view.

Give "Comparing X against Y" control a design review

Right now the selector in the Comparison View looks something like

This isn't ideal at least because we can't terminate the sentence without it looking odd, and because the width of the measure name changing means the position of the user-interactive part moves around.

We can do better, probably.

Selecting HTTP_SCHEME_UPGRADE_TYPE breaks the app

Something in the merge of #53 or #55 caused the app to break whenever you select HTTP_SCHEME_UPGRADE_TYPE. This is suboptimal.

Comparison view graph requires more optimal/dynamic dimensions

The full height and width attributes in the current comparison view did not function as desired, so we need to find a better way to optimize the size of the density area graph (metrics-graphics).

Distribution View for Boolean Data shows all hover text at once

Due to the nature of the plotly plot (grouped bar), the hover text for always, sometimes, and never all appear at once:

Ideally it'd look more or less identical to how it does, but with the hover being only for the nearest data point to the cursor.

I'm not sure if plotly gives us that much control over its hover behaviour, so we may have to experiment with using a single trace instead of three (so it can be ungrouped), so that plotly could hover each data point individually. This may lose us the legend and our colours if we're not careful.

This mostly requires familiarizing ourselves with plotly's capabilities and iterating on a design until we're happy.
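
As a starting point for that experimentation, here is a minimal sketch of a layout for the grouped boolean plot. Plotly's layout has a hovermode attribute, and "closest" asks it to show hover text only for the point nearest the cursor. Whether this alone gives the desired behaviour for grouped bars is untested here, and the function name buildBooleanPlotLayout is hypothetical.

```javascript
// Hypothetical helper: builds a Plotly layout object for the boolean
// distribution plot. Only `barmode` and `hovermode` are Plotly layout
// attributes; everything else about this function is an assumption.
function buildBooleanPlotLayout() {
  return {
    barmode: "group",     // keep always/sometimes/never as grouped bars
    hovermode: "closest", // hover only the data point nearest the cursor
    showlegend: true,     // keep the legend for the three traces
  };
}
```

If this does not behave as hoped with three traces, the single-trace approach described above remains the fallback.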

Add evolution view

The evolution view in the current mdv (https://telemetry.mozilla.org/new-pipeline/evo.html) is an important component that still needs to be built into this prototype.

There was no user research done on this, although we have information about its performance being quite poor (https://docs.google.com/document/d/1o6p38pnMwEOfxib6-GxM0Li8iNHZpUAmvS5Sl1D9sxk/edit)

There will need to be a bit of investigation to decide if this should look different in mdv and how it should look.

Remove area chart from ComparisonView

The MetricsGraphics area chart in the ComparisonView doesn't really seem to add anything over the other two charts in it.
Can we just remove it?

Consider adding node_modules to .gitignore and removing directory from repo

It's not a hard and fast rule, but most projects at Mozilla don't keep node_modules in source control. They list node_modules in .gitignore to instruct Git to ignore the directory and provide setup instructions that tell other developers to run npm install before using the project. See fhwr-unflattener as a simple example of that approach.

There are pros and cons to keeping node_modules out of source control, so what you decide to do will depend on your needs. But maybe other front-end devs from around the org, like @darkwing, @spasovski, @hamilton, and @jpetto could offer some of their opinions. 😃

Aggregate statistics broken for scalars

(This originally came up in #34)

Switching to scalars_devtools_onboarding_is_devtools_user results (in the summary view) in the median value of infinity.
This seems to come from getLastBucketUpper() doing a div by 0 as buckets[buckets.length - 2] is 0.
This is not a new issue; it is only visible now because we're actively loading different data files (we didn't before).

Chris mentioned:

getLastBucketUpper has commented out code that requires information about what the type of measurement is. We don't have that information, but scalars_devtools_onboarding_is_devtools_user is a metric that requires the special handling. I'm not worried about that at this state as it will require broader changes to make correct.
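
To illustrate the failure mode, here is a minimal sketch of a guarded getLastBucketUpper. The body is an assumption, not the prototype's actual code: it supposes the upper bound of the last bucket is extrapolated exponentially as last * last / secondLast, which would produce Infinity when the second-to-last bucket is 0. The linear fallback is one possible stopgap and does not replace the type-aware handling Chris describes.

```javascript
// Assumed shape: `buckets` is an ascending array of bucket lower bounds.
function getLastBucketUpper(buckets) {
  if (buckets.length === 0) return NaN;
  const last = buckets[buckets.length - 1];
  if (buckets.length === 1) return last + 1;
  const secondLast = buckets[buckets.length - 2];
  if (secondLast === 0) {
    // Exponential extrapolation would divide by zero here, so fall back
    // to extending the last bucket by the width of the previous one.
    return last + (last - secondLast);
  }
  // Assumed exponential extrapolation: keep the ratio between bounds.
  return (last * last) / secondLast;
}
```

With buckets of [0, 1], as a boolean-like scalar might have, this returns 2 instead of Infinity.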

Audit and support all data types

The current prototype only supports a subset of all the data types that are aggregated. To audit the missing data types we can check the aggregator for all of the types: https://github.com/mozilla/python_mozaggregator/blob/master/mozaggregator/aggregator.py

We then need to decide how each one will be handled and file individual tickets.

Add a server component to be used for auth and caching

This includes all required pieces to get a lightweight server up for development - docker + nginx + flask etc.

To start, this server will serve the existing static json data (https://github.com/mozilla/mdv2/tree/master/src/data) and will later be replaced with actual calls to druid or graphql.

Allow full-text search for metrics

We could potentially do this by searching the probe dictionary descriptions: https://telemetry.mozilla.org/probe-dictionary/

Searchable dropdowns

Right now only 'metric' is searchable. All dropdowns should be searchable.

Use const in code where possible

We can use const much more than we are right now.
We could do a single clean-up pass for this on the project.

Prototype a "change" measure for Summary View for booleans, categoricals

Numerical measures have a notion of their mean changing over time which we posit is a useful piece of information for our users.

We should have some notion of "change since last version" for boolean and categorical probes and clear and concise text to explain it to users.

This may be blocked on #65 or a follow-up where "change since last version" will work for non-numeric statistics.

Add deep-linking based on the dimensions chosen by the user

Right now, choosing an item from a dropdown does not create a custom/new link for it. mdv1 does create such links, and we should do the same here.
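
One way this could work is to mirror the selected dimensions in the URL's query string. The sketch below uses the standard URLSearchParams API; the function names and the parameter names (metric, channel, version) are assumptions for illustration, not mdv1's actual link format.

```javascript
// Hypothetical helper: serialize the current selection into a query string.
function selectionToQuery({ metric, channel, version }) {
  return new URLSearchParams({ metric, channel, version }).toString();
}

// Hypothetical helper: read a selection back from a query string,
// falling back to `defaults` for any dimension that is missing.
function queryToSelection(query, defaults) {
  const params = new URLSearchParams(query);
  return {
    metric: params.get("metric") || defaults.metric,
    channel: params.get("channel") || defaults.channel,
    version: params.get("version") || defaults.version,
  };
}
```

In the browser, the result of selectionToQuery could be written to the address bar whenever a dropdown changes, and queryToSelection(window.location.search, defaults) could seed the initial state on page load.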

Summary data for change needs improvement

On the summary view, for "change", we always compare to the last loaded value, not to the value from the preceding version.
We need to change this to always look at the preceding version's data.
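
A minimal sketch of the intended behaviour, assuming the means are available in an object keyed by numeric version string (the function name and data shape are hypothetical):

```javascript
// Hypothetical helper: compute "change" against the version immediately
// preceding `current`, regardless of which version was loaded last.
function changeFromPrecedingVersion(meanByVersion, current) {
  const versions = Object.keys(meanByVersion).sort(
    (a, b) => Number(a) - Number(b)
  );
  const idx = versions.indexOf(String(current));
  if (idx <= 0) return null; // unknown version, or no preceding version
  return meanByVersion[versions[idx]] - meanByVersion[versions[idx - 1]];
}
```

The comparison depends only on version order, so the result stays the same however the user navigated to the current version.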

Wiki changes

FYI: The following changes were made to this repository's wiki:

defacing spam has been removed
Restricting write access to contributors is strongly encouraged. Please make that change (documentation).

These were made as the result of a recent automated defacement of publicly writable wikis.

What to do with buckets with 0 (or near-0) samples?

In mdv1 we have the option (on by default) to trim histogram buckets that hold fewer than an arbitrary threshold of samples (0.01%).

Given how uint scalars like first_paint are auto-bucketed to "exponential, max: 10000, n_buckets: 100", we're probably going to need a similar mechanism (just look at how first_paint looks in mdv2's distribution view today).

I see a few tolerable options

Do nothing and let users zoom the plotly plots
Trim left and right buckets with 0 counts in them, without option
Set an arbitrary threshold as before and put in an option to turn it off

(the design discussion may happen in a google doc or elsewhere as is helpful. This issue is so I don't forget that this is a problem)
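
To make the third option concrete, here is a minimal sketch of threshold-based trimming. It assumes parallel arrays of bucket bounds and sample counts, and it trims only from the left and right edges so the middle of the distribution stays contiguous. The function name, the data shape, and the edge-only behaviour are assumptions, not a description of how mdv1 does it.

```javascript
// Hypothetical helper: drop leading and trailing buckets whose share of
// all samples is below `threshold` (0.0001 corresponds to 0.01%).
function trimSparseBuckets(buckets, counts, threshold = 0.0001) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  if (total === 0) return { buckets: [], counts: [] };
  const keep = (c) => c / total >= threshold;
  let start = 0;
  while (start < counts.length && !keep(counts[start])) start++;
  let end = counts.length - 1;
  while (end > start && !keep(counts[end])) end--;
  return {
    buckets: buckets.slice(start, end + 1),
    counts: counts.slice(start, end + 1),
  };
}
```

Passing a threshold of 0 keeps every bucket, which corresponds to turning the option off.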

Add server-side authentication

In the current mdv, auth is done client side: https://github.com/mozilla/telemetry-dashboard/blob/gh-pages/new-pipeline/src/auth.js

This is non-standard and more easily spoofable. We should implement auth on the server and follow the Mozilla standards here: https://mana.mozilla.org/wiki/display/SECURITY/SSO+Request+Form

Add missing graphing dimensions

Right now mdv2 only has metric, channel and version as dimensions. There are some missing ones that exist in mdv and should be added here too:

os_family
app_name
submission_date and/or build_id

Need consensus for type of comparison graph

@chutten @wlach, Georg and I have been discussing the type of graph we should use for the comparison view. Possibilities include a histogram with a line graph overlaid, or a double line graph to show the comparisons. I'd love input on which you think would be a better visualization for comparing the distribution of two versions for a probe!

Also @wlach, I don't think I'll be able to complete the graph using pure metrics graphics (correct me if I'm wrong!). Any advice on how to pull in vanilla D3 to add a second graph?

Thanks!

Comparison view requires labels

The comparison view needs labels to show which graph represents which version. Coming soon!

Rearchitect Datastore

To serve the needs of the Summary, Comparison, and eventually the Evolution view we need a higher-level data storage component that can serve data efficiently from multiple combinations of channel, version, and metric name.

For example, the Summary view needs information from the currently-selected metric, and that same metric from that same channel, but from version - 1. The Comparison view needs information from the currently-selected metric, and that same metric from that same channel, and from any other version in history. (The Evolution view will need information from a range of versions within a channel and metric name, but it is presently out of scope).

It is my recommendation that we turn MetricData into a type responsible only for storing and getting summary statistics of a particular channel+version+metric's data. (similar to MetricData's current _active member). And that we break out the loading, caching, and organization of MetricData instances to its own higher-level class, DataStore (where, for instance, the loading logic of loadDataFor will live).

The DataStore and the currently-selected MetricData will be passed down via props to the view components.

This will improve code encapsulation and should be a step in a direction to help the Views deal with multiple dynamic data needs.
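
A minimal sketch of the proposed split, assuming the loader is injected and returns a promise of a MetricData-like object. Only the names DataStore and loadDataFor come from the proposal above; the constructor signature, the cache key format, and the argument order are assumptions.

```javascript
// Sketch of a DataStore that owns loading and caching, leaving the
// per-channel+version+metric statistics to whatever the loader returns.
class DataStore {
  // `loader(channel, version, metric)` is a hypothetical injected function
  // that fetches the data and resolves to a MetricData instance.
  constructor(loader) {
    this._loader = loader;
    this._cache = new Map();
  }

  // Returns a promise for the requested data, loading it at most once.
  loadDataFor(channel, version, metric) {
    const key = `${channel}/${version}/${metric}`;
    if (!this._cache.has(key)) {
      this._cache.set(key, this._loader(channel, version, metric));
    }
    return this._cache.get(key);
  }
}
```

Under this sketch the Summary view could call loadDataFor for the selected version and for version - 1, and the Comparison view for any two versions, with repeated requests served from the cache.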

Fix metric data loading

We currently load data from /data/...:

const response = await fetch(`/data/${metric}_${channel}_${version}.json`);

This breaks e.g. with GitHub Pages, which has the data under /mdv2/data/....
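
One possible fix is to build the URL from a configurable base path rather than hard-coding the site root. This is a minimal sketch: metricDataUrl and its basePath parameter are hypothetical, and where the base path comes from (build configuration, an environment variable, and so on) is left open.

```javascript
// Hypothetical helper: build the data file URL under a configurable base
// path, e.g. "" for local development or "/mdv2" on GitHub Pages.
function metricDataUrl(basePath, metric, channel, version) {
  const base = basePath.replace(/\/+$/, ""); // drop any trailing slashes
  return `${base}/data/${metric}_${channel}_${version}.json`;
}
```

The existing fetch call would then become fetch(metricDataUrl(basePath, metric, channel, version)).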

Text in metrics input box should not be greyed out

When loading the page, the text in the metrics input box is greyed out.
Once selecting a metric, it has black text.

I think we should just always have it as black text if possible.

Non-release data should be public

To start, we will have auth over the whole dashboard. But once it's ready, we will make it public for non-release data, just like mdv1.

Note: this is blocked on having it fully tested internally and completing all user research.

Comparison view full height does not display properly

Currently, the comparison view graph has a full height attribute, but this doesn't correspond to a taller graph than when the height was designated as a number.

The container of this graph must be limiting the height. Needs some investigation and an update to correct this.

Move filter option text to App component

#30 added text around the metric filter options.
I think it would be more logical to have the text one level up.
E.g. do something like this in App.js:
Viewing Firefox data per user for Firefox <ChannelSelector .../> version <VersionSelector .../>.

I made tests noisy

In adding the second datastore to Comparison View I made the tests noisy.

Specifically, the test environment doesn't have a hostname. So when the comparison view tries to load its default data, fetch complains that it doesn't have an absolute URL, and we log that via console.warn.

It's benign, but it's loud and so it should be fixed. It's possible this'll be taken care of as part of the broader DataStore refactor, but this is here to make certain of it (and to document my mea culpa)

Implement "Get Shortlink"

This link exists in the prototype but does nothing right now. It should mimic what's done in mdv1 and provide a short, shareable link.

Add histogram display to comparison view

To thoroughly test the comparison view, we're going to add an alternative to the double line graph that is currently available. This will use plotly.js to display overlaid histogram areas, which will more resemble the distribution view that we currently have.

Add an "Export to Iodide" Feature

This would be a button that exports all data visible in mdv to Iodide, so that it both replicates the same graphs and makes the data available for further manipulation.

Make graphs use the full component width

In the distribution & evolution view, the graph seems to be using a fixed size.
It would be great to have it grow with the available page width, while using the current size as a minimum size.