
compassql's Introduction

Vega: A Visualization Grammar

Vega Examples

Vega is a visualization grammar, a declarative format for creating, saving, and sharing interactive visualization designs. With Vega you can describe data visualizations in a JSON format, and generate interactive views using either HTML5 Canvas or SVG.

For documentation, tutorials, and examples, see the Vega website. For a description of changes between Vega 2 and later versions, please refer to the Vega Porting Guide.

Build Instructions

For a basic setup allowing you to build Vega and run examples:

  • Clone https://github.com/vega/vega.
  • Run yarn to install dependencies for all packages. If you don't have yarn installed, see https://yarnpkg.com/en/docs/install. We use Yarn workspaces to manage multiple packages within this monorepo.
  • Once installation is complete, run yarn test to run test cases, or run yarn build to build output files for all packages.
  • After running either yarn test or yarn build, run yarn serve to launch a local web server — your default browser will open and you can browse to the "test" folder to view test specifications.

This repository includes the Vega website and documentation in the docs folder. To launch the website locally, first run bundle install in the docs folder to install the necessary Jekyll libraries. Afterwards, use yarn docs to build the documentation and launch a local webserver. After launching, you can open http://127.0.0.1:4000/vega/ to see the website.

Internet Explorer Support

For backwards compatibility, Vega includes a babel-ified, IE-compatible version of the code in the packages/vega/build-es5 directory. Older browsers also require several polyfill libraries:

<script src="https://cdnjs.cloudflare.com/ajax/libs/babel-polyfill/7.4.4/polyfill.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/runtime.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/fetch.umd.min.js"></script>

Contributions, Development, and Support

Interested in contributing to Vega? Please see our contribution and development guidelines, subject to our code of conduct.

Looking for support, or interested in sharing examples and tips? Post to the Vega discussion forum or join the Vega slack organization! We also have examples available as Observable notebooks.

If you're curious about system performance, see some in-browser benchmarks. Read about future plans in our roadmap.

compassql's People

Contributors

akshatsh, dependabot-preview[bot], dependabot[bot], domoritz, donghaoren, espressoroaster, felixcodes, haldenl, jstcki, kanitw, leibatt, light-and-salt, mattwchun, oigewan, p42-ai[bot], peter-gy, rileychang, ssharif6, tafsiri, vlandham, yhoonkim


compassql's Issues

Refactor / Additional Test

  • Extract and test hasRequiredPropertyAsEnumSpec in satisfy of EncodingConstraintModel and SpecConstraintModel
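The extracted helper could look roughly like this — a minimal sketch, assuming an "enum spec" (wildcard) is either the `'?'` shorthand or an object with an `enum` array; the signature is illustrative, not CompassQL's actual API:

```typescript
// Sketch only: an "enum spec" (wildcard) is either the shorthand '?' or an
// object carrying an `enum` array of candidate values.
type EnumSpec<T> = { enum: T[] };

function isEnumSpec(x: unknown): boolean {
  return x === '?' || (typeof x === 'object' && x !== null && 'enum' in x);
}

// True if any required property on the query is still an (un-enumerated) enum spec.
function hasRequiredPropertyAsEnumSpec(
  query: Record<string, unknown>,
  requiredProps: string[]
): boolean {
  return requiredProps.some((p) => isEnumSpec(query[p]));
}
```

Once extracted, the same helper can be exercised directly in unit tests instead of only indirectly through `satisfy`.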

Replicating Compass

Gen

  • aggregate.test.ts
  • encodings.test.ts

Run npm run cover and see coverage report -- add more tests for uncovered constraints

Enumerate Scale Properties

Scale

Background

  • Look at the description and changes of #27 to see the infrastructure for adding a nested property (bin.maxbins) -- note that I might have missed something in the description; if so, you'll notice problems as you debug.

1st step: Scale.type

  • add scale.type (one PR)
    • understand what scale.type means from Vega-Lite docs
    • Add stuff like in #27
    • spec constraints (add to spec.ts)
      • omitBarAreaForLogScale -- don't use bar and area mark for log scale.
    • encoding constraints
      • dataTypeMatchesScaleType -- look at
      • omitBinForLogScale (originally vega/compass#151)
    • Add Example Query to examples/
    • add tests for enumerate
    • add tests for generate
    • add tests for all new constraints
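For reference, the omitBarAreaForLogScale check described above could be sketched as follows (simplified shapes and assumed names, not the actual constraint interface):

```typescript
// Sketch: a spec constraint is satisfied (returns true) unless a bar or area
// mark is combined with a log scale on some encoding.
interface EncQ { channel: string; scale?: { type?: string } }
interface SpecQ { mark: string; encodings: EncQ[] }

function omitBarAreaForLogScale(spec: SpecQ): boolean {
  if (spec.mark !== 'bar' && spec.mark !== 'area') return true;
  return !spec.encodings.some((e) => e.scale !== undefined && e.scale.type === 'log');
}
```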

Scale.*

Repeat the process for other scale properties (one PR for each)

  • add ones that are required by other tasks
    • type
      • clamp: Q, T
      • exponent: pow
      • round: Q, T
        • accept types of values depending on scale type
    • zero --> zero doesn't play well with [ ScaleType.ORDINAL, LOG, TIME, UTC]. I don't think I'm missing anything else...
    • bandSize
      • #93
      • bandSize must be at least 0
    • range
      • #101
      • range must contain two or more values.
    • domain
    • round
    • clamp
      • must have a continuous domain (quantitative and time types only)
    • nice
      • similar to clamp: quantitative and time types only.
    • exponent
    • useRawDomain

--- LATER ---

  • padding
    • works with channel.x, channel.y --> uses pixels
    • ??? padding (0, 1) for rangeBands ??? -- LATER
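The property/scale-type compatibility rules listed above could be encoded roughly like this (a sketch following the list; names, the scale-type list, and any omissions are my assumptions):

```typescript
// Continuous scale types (quantitative and time); discrete scales such as
// ordinal are excluded.
const CONTINUOUS_SCALE_TYPES = ['linear', 'log', 'pow', 'sqrt', 'time', 'utc'];

function scaleSupportsProperty(scaleType: string, prop: string): boolean {
  switch (prop) {
    case 'clamp':
    case 'nice':
    case 'round':
      // continuous domain only (quantitative and time types)
      return CONTINUOUS_SCALE_TYPES.indexOf(scaleType) >= 0;
    case 'exponent':
      return scaleType === 'pow';
    case 'zero':
      // zero doesn't play well with ordinal, log, time, and utc scales
      return ['ordinal', 'log', 'time', 'utc'].indexOf(scaleType) < 0;
    default:
      return true;
  }
}
```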

Improve Ranking

  • Channel, Cardinality
  • Penalize over encoding

Test

  • TxT
  • TxQ
  • QxT > Q

Data-driven occlusion test

Right now we just say aggregate has no occlusion, while raw has occlusion -- that's not always correct.

Distinguish high-cardinality strings from nominal fields

Fields with too high cardinality take up a lot of space and can be slow to render.

  • add a flag isKeyLike (or some better name) to schema

We might want to consider a few options:

  • distinguish between categories (low cardinality) and text (high cardinality), as they serve different purposes in data analysis anyway.
    • Check if the cardinality is above X% (50%?) of the overall data count and above minimum threshold (e.g., 40)

Maybe check whether "the cardinality is above ~80% of the overall data count" or use some similar criterion.

  • Add a constraint that excludes fields with too high cardinality from being added automatically.
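The proposed flag could be sketched like this — the 50% ratio and the minimum threshold of 40 come from the discussion above and are floated values, not settled ones:

```typescript
// Sketch of the proposed isKeyLike flag: a field is key-like when its
// cardinality exceeds both a minimum threshold and a fraction of the data count.
function isKeyLike(
  cardinality: number,
  dataCount: number,
  ratio: number = 0.5,
  minCardinality: number = 40
): boolean {
  return cardinality >= minCardinality && cardinality > ratio * dataCount;
}
```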

This spec generates duplicated output

{
  "mark": {
    "mode": "pick/enum",
    "values": [""]
  },
  "encodings": [
    {
      "channel": "x",
      "field": "Cylinders",
      "type": "quantitative"
    },{
      "autoCount": true
    }
  ]
}
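Independent of the root cause, one way to guard the answer set against duplicates is to key each generated spec by a canonical string and keep only the first occurrence (`dedupeByKey` is a hypothetical helper, not CompassQL's actual implementation):

```typescript
// Hypothetical helper: collapse duplicate answers by a canonical key,
// keeping the first occurrence of each.
function dedupeByKey<T>(items: T[], keyFn: (item: T) => string): T[] {
  const seen = new Set<string>();
  const out: T[] = [];
  for (const item of items) {
    const k = keyFn(item);
    if (!seen.has(k)) {
      seen.add(k);
      out.push(item);
    }
  }
  return out;
}
```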

Cardinality Based Constraints

  • determine input format for cardinality in the schema
  • maxCardinalityForFacets
  • maxCardinalityForColor
  • maxCardinalityForShape
  • minCardinalityForBin
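A sketch of how the channel-specific caps might fit together — the channel names follow the constraints listed above, but the numeric limits are placeholders, not decided values:

```typescript
// Placeholder per-channel cardinality caps (illustrative values only).
const MAX_CARDINALITY: { [channel: string]: number } = {
  row: 20,    // maxCardinalityForFacets
  column: 20, // maxCardinalityForFacets
  color: 20,  // maxCardinalityForColor
  shape: 6    // maxCardinalityForShape
};

function satisfiesCardinalityConstraint(channel: string, cardinality: number): boolean {
  const max = MAX_CARDINALITY[channel];
  return max === undefined || cardinality <= max;
}
```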

Revise old compass constraints

Not sure if we should add the following

  • maxCardinalityForAutoAddOrdinal #70
  • alwaysAddHistogram
  • consistentAutoQ -- if aggregate for all Q are "*" -- give all of them same level of aggregation. (already have omitRawContinuousFieldForAggregatePlot)

Refactor Bin to Support Bin Parameter

Currently in EncodingQuery, it's

bin?: boolean | EnumSpec<boolean> | ShortEnumSpec;

However, bin can have parameters too, and I don't want to mix up boolean and object here.

So I'm thinking

bin?: BinQuery

with the following interface

interface BinQuery {
  enable: boolean | EnumSpec<boolean> | ShortEnumSpec;
  maxbins: number | EnumSpec<number> | ShortEnumSpec;
  ... // other params
}

Any thoughts? @domoritz
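One way this migration could stay backward compatible is to normalize the old boolean shorthand into the proposed object form (`normalizeBin` is a hypothetical helper I'm adding for illustration, not part of the proposal):

```typescript
type EnumSpec<T> = { enum: T[] };
type ShortEnumSpec = '?';

interface BinQuery {
  enable: boolean | EnumSpec<boolean> | ShortEnumSpec;
  maxbins?: number | EnumSpec<number> | ShortEnumSpec;
}

// Hypothetical helper: accept the old `bin` shorthand and lift it into a BinQuery.
function normalizeBin(
  bin: boolean | EnumSpec<boolean> | ShortEnumSpec | BinQuery
): BinQuery {
  if (typeof bin === 'boolean' || bin === '?') {
    return { enable: bin };
  }
  if ('enum' in bin) {
    return { enable: bin as EnumSpec<boolean> };
  }
  return bin;
}
```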

Add JSON schema

  • Generate JSON Schema for CompassQL schema

Look at this line in Vega-Lite
https://github.com/vega/vega-lite/blob/master/package.json#L35

Do the same for Query.

  • Add Tests to validate all examples

In Vega-Lite, we have a test that validates all example specs so that both their input and output validate against the JSON schema.

  • Validate the input CompassQL query (each example JSON file)
  • For each example query, run the query method in query.ts and check the output. For each SpecQueryModel in the output, convert it into a Vega-Lite spec (call .toSpec()) and validate the Vega-Lite output.

Make sure that the example test is excluded from test coverage.
(See Vega-Lite's package.json)

MVP for Enumerate

  • enumerate answers based on input CompassQL query
    • check if the constraint is enabled (in the option)
    • generate fields -- read from schema
  • support two types of constraints
    • encoding constraint (constraint for one encoding mappings)
    • spec constraint (constraint that involves multiple encoding mappings or involves relationship between mark and encoding)
  • determine order in a way that automatically adding count still works
    • noRepeatedField --> '*'
  • Remember which field we assign for later reference
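The noRepeatedField idea above could be sketched like this (simplified shapes — a concrete field may be assigned to at most one encoding, while the '*' field used by count is exempt):

```typescript
// Sketch: reject assignments that reuse a concrete field; '*' (count) may repeat.
function noRepeatedField(encodings: { field?: string }[]): boolean {
  const seen = new Set<string>();
  for (const enc of encodings) {
    const f = enc.field;
    if (f === undefined || f === '*') continue;
    if (seen.has(f)) return false;
    seen.add(f);
  }
  return true;
}
```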

Missing Constraints

  • channelsSupportRoles
  • omitShapeWithBin (channel supports role?)
  • omitShapeWithTimeDimension (channel supports role?)
  • omitBarWithSize
  • omitRawBar/Area

Constraint propertyPrecedence

  1. Prevent duplicate output if autoCount comes after channel in propertyPrecedence

Basically, whenever autoCount is false, we shouldn't even assign it to a channel.

We have to either add logic to prevent autoCount from coming after channel in the propertyPrecedence, or make answerSet in generate a real set to prevent duplication.

  2. Prevent nested property output from coming before its parent

Add missing core tests

enumerator.test.ts

For each of these properties:

  • aggregate
  • timeUnit
  • field
  • type

  1. Write a test that enumerates all valid values

hint: set config.verbose = true

  2. Write a test that enumerates both valid and invalid values (and test that the output contains only valid values)
  • aggregate
    • To see relevant constraints, look at constraints/{spec|encoding}.ts
      • look at properties of each constraint
      • look at a few ones that contain Property.AGGREGATE

(LATER)

  1. Write a test that enumerates all valid values
  • bin -- bin is the most complicated -- ping me and I'll explain it
  2. Write a test that enumerates both valid and invalid values (and test that the output contains only valid values)
  • To see relevant constraints, look at constraints/{spec|encoding}.ts
  • timeUnit
  • field
  • type

Other Files

Run npm run cover and see coverage report -- add more tests for uncovered constraints

Enumerate Stack

  • Stack
  • Stack constraint (don't enumerate non-summing aggregate for stack)
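The "non-summing aggregate" rule could be sketched as follows — the list of summative operations below is my assumption of what "summing" means here, not a settled choice:

```typescript
// Only aggregates whose values can be meaningfully added should be stacked.
const SUMMATIVE_AGGREGATES = ['count', 'sum'];

function aggregateIsStackable(aggregate: string): boolean {
  return SUMMATIVE_AGGREGATES.indexOf(aggregate) >= 0;
}
```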

Refactor

  • Consistent Variable Name
    • encodingQ => encQ
    • property => prop
  • EnumSpecIndex.timeunit => timeUnit

cc: @ZeningQu

Aggregate plot where facet is the only group-by should be rated worse

e.g.,

{
  "data": {
    "url": "data/cars.json"
  },
  "mark": "point",
  "encoding": {
    "row": {
      "field": "Cylinders",
      "type": "nominal"
    },
    "x": {
      "aggregate": "mean",
      "bin": false,
      "field": "Horsepower",
      "type": "quantitative"
    },
    "y": {
      "aggregate": "mean",
      "bin": false,
      "field": "Acceleration",
      "type": "quantitative"
    }
  }
}

Split generate.ts into two files

Right now the enumerator code is in generate.ts.
However, this makes generate.test.ts unduly long.

Therefore, we should extract enumerator.ts from generate.ts.

Syntax for nested grouping

Nested grouping is very important for understanding structure / debugging output results.
(I'm currently flooded by transposes of the visualizations.)

Therefore we need a good syntax for nested grouping.

Suppose I want hierarchical grouping that first groups by dataQueryKey, then by encodingKey.

  • For each subgroup (by encodingKey), I want to order the subgroup's items by rankingFn1.
  • For each group (by dataQueryKey), I want to order the group's items (which are subgroups based on encodingKey) by rankingFn2.
  • Finally, I want to order groups by rankingFn3.

For example, rankingFn1 = rankingFn2 = "effectiveness", and rankingFn3 can be some data enumeration order. The ranking function will rank groups by calculating the score for the top item in each list.
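That top-item ranking rule could be sketched like this (shapes are illustrative; how groups and scores are actually represented is still open):

```typescript
// Sketch: score each group by its first (top) item, then order groups by that
// score in descending order, leaving the input untouched.
function orderGroupsByTopItem<T>(
  groups: T[][],
  score: (item: T) => number
): T[][] {
  return groups.slice().sort((a, b) => score(b[0]) - score(a[0]));
}
```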

Suppose

spec = {
    "data": {"url": "data/cars.json"},
    "mark": "?",
    "encodings": [
      {
        "channel": "?",
        "field": "Cylinders",
        "type": "ordinal"
      },{
        "channel": "?",
        "bin": "?",
        "aggregate": "?",
        "field": "Horsepower",
        "type": "quantitative"
      }
    ]
  }

Here are a few alternative queries:

a) Nested version

{
  spec: spec,
  group/groupings: {
    // In this case, definitely start with the top-level grouping key.
    by: 'dataKey',
    // if we want one output for each group, we can replace this orderItemBy with chooseBy
    orderItemBy: 'rankingFn2',
    subgroup/subgroupings: {
      by: 'encodingKey',
      orderItemBy: 'rankingFn1'
    }
  },
  orderBy: 'rankingFn3'
}

b) Array-based

{
  spec: spec,
  // should the first one be the top-level one or the subgroup one? -- currently it's the subgroup one
  group/groupings: [{
     groupBy: 'encodingKey',
     // if we want one output for each group, we can replace this orderItemBy with chooseBy
     orderItemBy: 'rankingFn1'
  },{
     groupBy: 'dataKey',
     orderItemBy: 'rankingFn2'
  }],
  orderBy: 'rankingFn3'   // or orderGroupBy?
}

@jheer @domoritz any preference for a. or b. (or other options) / minor wordings?

I am not married to any of these yet. Other ideas are welcome.
I'm leaning toward the nested version because it seems clearer which one is the top-level grouping.

Add statistical profiling

  • 1D
  • 2D
  • Need to think what to add

Refactor constraints

Specs

  • hasAppropriateGraphicTypeForMark
  • omitRawBarLineArea
  • omitRawTable

Don't bin a Q-field or add autoCount if there is already a dimension in the spec

For example,

{
  "spec": {
    "data": {"url": "data/cars.json"},
    "mark": "?",
    "encodings": [
      {
        "channel": "?",
        "field": "Cylinders",
        "type": "nominal"
      },{
        "channel": "?",
        "field": "Origin",
        "type": "ordinal"
      },{
        "channel": "?",
        "bin": "?",
        "aggregate": "?",
        "field": "Acceleration",
        "type": "quantitative"
      }
    ]
  },
  "groupBy": "data",
  "config": {
    "autoAddCount": true
  }
}

produces this group: Cylinders,n|Origin,o|bin(Acceleration,q)|count(*,q), which contains a visualization like this one:


{
  "data": {
    "url": "data/cars.json"
  },
  "mark": "point",
  "encoding": {
    "y": {
      "field": "Cylinders",
      "type": "nominal"
    },
    "x": {
      "field": "Origin",
      "type": "ordinal"
    },
    "row": {
      "bin": true,
      "field": "Acceleration",
      "type": "quantitative"
    },
    "size": {
      "aggregate": "count",
      "field": "*",
      "type": "quantitative"
    }
  }
}
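The proposed rule hinges on detecting whether a spec already has a dimension. A minimal sketch (simplified shapes; here I assume a dimension is a nominal/ordinal field, or a binned field, or a field with a timeUnit):

```typescript
// Sketch: a spec "has a dimension" if some encoding is nominal/ordinal,
// binned, or carries a timeUnit. If so, skip enumerating bin on Q fields
// and skip autoCount.
interface EncodingLike {
  type?: string;
  bin?: unknown;
  timeUnit?: string;
}

function hasDimension(encodings: EncodingLike[]): boolean {
  return encodings.some(
    (e) => e.type === 'nominal' || e.type === 'ordinal' || !!e.bin || !!e.timeUnit
  );
}
```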

Deal with text table.

In the older Compass, we added a few hacks for recommending text tables.

With the new label and tile, we need to revise how we deal with this.
