
Chronicle

find everything you've ever found

Build Status: Travis

Installation

Large Tools

Chronicle is built using Node.js, Elasticsearch, PostgreSQL, and Redis, so you'll want to install the current stable version of each.

If you are using Mac OS and have Homebrew installed, this incantation should work:

$ brew install nodejs elasticsearch postgresql redis

Code

The server-side code dependencies are managed with npm and require that Grunt is installed globally (npm install -g grunt-cli). The front-end dependencies are managed with Bower; you can install it via npm install -g bower if you don't have it on your system.

To fetch dependencies and get cooking:

  1. Run npm install, and ensure redis, elasticsearch, and postgres are all running.
  2. As part of the npm install process, the postinstall script will install the Bower dependencies for you.
  3. Copy config/local.json.example to config/local.json, and put your local info in there.
  4. Run ./bin/create_db.sh to create the database.
  • this script currently hard-codes the db user, password, and dbname to 'chronicle' (issue #112)
  5. Run ./bin/migrate.js to run all the migrations that create the database tables and indexes. (This script also reindexes elasticsearch, but on the first pass, you don't have data in postgres to copy over.)
  6. Run ./bin/create_test_data.js to create a test user and test data.
  • the test user is defined in the config file
  • the test data is a set of visits created using the URLs in config/test-urls.js. Over time we'll experiment with different test data sets, and might wind up with a test-urls directory instead.
  7. Run npm start.
  8. You're up and running! Surf to http://localhost:8080 🏄
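The steps above boil down to the following transcript (assuming redis, elasticsearch, and postgres are already running locally):

```shell
npm install                         # postinstall also fetches Bower deps
cp config/local.json.example config/local.json
./bin/create_db.sh                  # db user/password/dbname hard-coded to 'chronicle'
./bin/migrate.js                    # create tables and indexes
./bin/create_test_data.js           # test user + visits from config/test-urls.js
npm start                           # then surf to http://localhost:8080
```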

Tests

Right now the test suite consists entirely of functional tests that require Selenium Server 2.44.0.

Prerequisites

  • Selenium Server 2.44.0 (the standalone jar used below)

Run the tests

Run the following in separate terminal windows/tabs:

  • java -jar path/to/selenium-server-standalone-2.44.0.jar
  • grunt test

Available Grunt Tasks

Name Description
autoprefixer Adds vendor prefixes to CSS files based on http://caniuse.com statistics.
build Build front-end assets and copy them to dist.
changelog Generate a changelog from git metadata.
clean Deletes files and folders.
contributors Generates a list of contributors from your project's git history.
copy Copies files and folders.
copyright Checks for MPL copyright headers in source files.
css Alias for "sass", "autoprefixer" tasks.
hapi Starts the hapi server.
jscs JavaScript Code Style checker.
jshint Validates files with JSHint.
jsonlint Validates JSON files.
lint Alias for "jshint", "jscs", "jsonlint", "copyright" tasks.
sass Compiles Sass files to vanilla CSS.
serve Alias for "hapi", "build", and "watch" tasks.
validate-shrinkwrap Submits your npm-shrinkwrap.json file to https://nodesecurity.io for validation.
watch Runs predefined tasks whenever watched files change.

npm Scripts

Name Description
authors Alias for grunt contributors Grunt task.
lint Alias for grunt lint Grunt task. This task gets run during the precommit Git hook.
outdated Alias for npm outdated --depth 0 to list top-level outdated modules in your package.json file. For more information, see https://docs.npmjs.com/cli/outdated.
postinstall Runs after the package is installed, and automatically installs/updates the Bower dependencies.
shrinkwrap Alias for npm shrinkwrap --dev and npm run validate to generate and validate npm-shrinkwrap.json file (including devDependencies).
start Runs grunt serve.
test Runs unit and functional tests.
validate Alias for grunt validate-shrinkwrap task (ignoring any errors which may be reported).

Creating Dummy Data

If you just want to test something quickly with a small, known test data set:

  1. Run ./bin/create_db.sh to drop and re-create the local Postgres database.
  2. Run ./bin/migrate.js to apply any Postgres migrations in the server/db/migrations/ directory.
  3. To enable test data, ensure the testUser.enabled config option is set in config/local.json.
  • You can use the default id and email, or set your own via config values or env vars; see server/config.js for the defaults and the exact names to use.
  4. Run ./bin/create_test_data.js to create a dummy user and a few dummy visits.
  • The dummy visits that will be created are listed in the config/test-urls.js file.

Learn More


Contributors

jaredhirsch, johngruen, nchapman, pdehaan, vladikoff


chronicle's Issues

add scraper service (extract interesting data from URLs)

Thoughts

  • new URLs should be added to scraper queue
  • scraper output should be structured JSON sent to elasticsearch; it'll also need to be transformed and inserted into MySQL
  • scraper-worker can just poll the scraper's endpoint (or wait for a callback, whatever), then chuck it in elasticsearch when ready
  • scraper might be third party, or we might own it and build it ourselves
  • scraper will need to canonicalize URL, generate summary, find a suitable image/media blob, and possibly also generate keywords
  • let's have a provider-agnostic interface/contract that lets us use the same API whether it's embedly, our own scraper, or some combination of the two
  • also: be sure to insert canonical URL into visits.url and add visits fields for other new fields
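A minimal sketch of the provider-agnostic contract floated above, assuming a hypothetical `normalizeScrape` helper and field names that are illustrations, not the real schema: every provider (embedly, an in-house scraper, or a combination) gets mapped to one common shape before anything downstream sees it.

```javascript
// Hypothetical normalizer: whatever the provider returns, callers only ever
// see this shape. Field names here are assumptions for illustration.
function normalizeScrape(raw) {
  return {
    url: raw.canonicalUrl,      // canonicalized URL, destined for visits.url
    summary: raw.summary,       // generated page summary
    image: raw.image,           // a suitable image/media blob URL
    keywords: raw.keywords || [] // optional; not every provider supplies these
  };
}
```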

Tasks

  • Create schema for user pages to store scraped data
  • Create scraper worker (embedly)
  • Index scraped data
  • Expose scraped data in visits API
  • Expose scraped data in search API
  • Show scraped data on the front-end in the visits index

allow visit creation via PUT /visits/:visitId

Right now, clients can only create visits via POST to /v1/visits, but they can optionally specify the visitId.

RESTfully speaking, the only reason to POST to an endpoint is because you don't know what the representation will be. If the client knows the visitId, then the client could just PUT the visit to its URL, /v1/visits/visitId.

Not a big deal, but would be nice to eventually fix.
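For illustration, a hapi-style route table entry for the proposed endpoint might look like this; it's a hypothetical sketch, not the actual Chronicle code, and the handler body is a stub standing in for the real visits model call.

```javascript
// Hypothetical route entry: PUT a visit directly to its URL when the client
// already knows the visitId (idempotent create-or-replace).
var putVisitRoute = {
  method: 'PUT',
  path: '/v1/visits/{visitId}',
  handler: function (request, reply) {
    // Upsert semantics: create the visit if it's new, replace it otherwise.
    // (A real implementation would call into the visits model here.)
    reply({ id: request.params.visitId });
  }
};
```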

api server: build out visits API

  • provisional API definition: https://etherpad.mozilla.org/chronicle-api
  • since we have no real hapi directory layout standard, maybe models, views, controllers?
    • views transform the model output into JSON (do we need this?)
    • controller includes route handler, coordinates model/view
    • model includes biz logic, touches the sequelize ORM "models" (kinda confusing, hmm), emits via views
  • after writing it out, I'm unconvinced this is a useful abstraction. we'll see how implementation goes.
  • thinking of something like
/server
  /controllers
  /routes
  /views
  /models
  /db
    /models
    /migrations
  • if this turns out to be overkill, just write it out using the hapi route handler and get on with life :-)

Server Error: Unauthorized

I get this error in the log every time I load the root path (http://localhost:8080/). I assume this is a temporary problem until we finish setting up auth.

Debug: auth, unauthenticated, error, session
    Error: Unauthorized
    at Object.exports.create (/Users/nchapman/Code/chronicle/node_modules/hapi-auth-cookie/node_modules/boom/lib/index.js:21:17)
    at Object.exports.unauthorized (/Users/nchapman/Code/chronicle/node_modules/hapi-auth-cookie/node_modules/boom/lib/index.js:85:23)
    at validate (/Users/nchapman/Code/chronicle/node_modules/hapi-auth-cookie/lib/index.js:114:49)
    at Object.scheme.authenticate (/Users/nchapman/Code/chronicle/node_modules/hapi-auth-cookie/lib/index.js:179:13)
    at /Users/nchapman/Code/chronicle/node_modules/hapi/lib/auth.js:214:30
    at internals.Protect.run (/Users/nchapman/Code/chronicle/node_modules/hapi/lib/protect.js:56:5)
    at authenticate (/Users/nchapman/Code/chronicle/node_modules/hapi/lib/auth.js:205:26)
    at internals.Auth._authenticate (/Users/nchapman/Code/chronicle/node_modules/hapi/lib/auth.js:328:5)
    at internals.Auth.authenticate (/Users/nchapman/Code/chronicle/node_modules/hapi/lib/auth.js:164:17)
    at /Users/nchapman/Code/chronicle/node_modules/hapi/lib/request.js:321:13

Get auth sorted out, working with visits api

The current auth code basically duplicates the hapi bell plugin inside the /auth/complete route handler. Decide if it's even worth bothering with bell; if it's simpler to just make it a fully custom auth strategy, do that.

Document visits api

Put the docs inside /docs/API.md, follow fxa formatting conventions where those make sense

convict error when trying to clone and run from GitHub

Steps to repro

$ git clone [email protected]:mozilla/chronicle.git
$ cd chronicle
$ npm install
$ cp config/local.json.example config/local.json
$ npm start

Actual results

> [email protected] start /Users/pdehaan/dev/tmp/chronicle
> node server/index.js


/Users/pdehaan/dev/tmp/chronicle/node_modules/convict/lib/convict.js:393
        throw new Error(errBuf);
              ^
Error: server.session.duration: must be a positive integer: value was "7 days"
    at Object.rv.validate (/Users/pdehaan/dev/tmp/chronicle/node_modules/convict/lib/convict.js:393:15)
    at Object.<anonymous> (/Users/pdehaan/dev/tmp/chronicle/server/config.js:183:6)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)
    at require (module.js:380:17)
    at Object.<anonymous> (/Users/pdehaan/dev/tmp/chronicle/server/index.js:8:14)
    at Module._compile (module.js:456:26)

npm ERR! [email protected] start: `node server/index.js`
npm ERR! Exit status 8
npm ERR!
npm ERR! Failed at the [email protected] start script.
npm ERR! This is most likely a problem with the mozilla-chronicle package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     node server/index.js
npm ERR! You can get their info via:
npm ERR!     npm owner ls mozilla-chronicle
npm ERR! There is likely additional logging output above.
npm ERR! System Darwin 12.5.0
npm ERR! command "node" "/usr/local/bin/npm" "start"
npm ERR! cwd /Users/pdehaan/dev/tmp/chronicle
npm ERR! node -v v0.10.33
npm ERR! npm -v 1.4.28
npm ERR! code ELIFECYCLE
npm ERR!
npm ERR! Additional logging details can be found in:
npm ERR!     /Users/pdehaan/dev/tmp/chronicle/npm-debug.log
npm ERR! not ok code 0

Expected results

No errors.

Need to dig in a bit more, but I think the problem w/ convict is this in the config/local.json.example file:

    "session": {
      "password": "Wh4t3ver.",
      "isSecure": false,
      "duration": "7 days"
    },

I think convict may want that duration in milliseconds here in the JSON file (but oddly accepts it as a valid default in server/config.js).
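If that hunch is right, a workaround in config/local.json would be to spell the duration out in milliseconds (7 days = 604800000 ms):

```json
    "session": {
      "password": "Wh4t3ver.",
      "isSecure": false,
      "duration": 604800000
    },
```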

'grunt server' should set process timezone to UTC

We want timestamps on the server to always be in UTC.

This won't be a problem in real servers, but when developing locally, there are a couple ways this can fail:

  1. The MySQL server needs to have its timezone set to UTC. Otherwise it'll store the TIMESTAMP fields as UTC, but convert them to the local timezone in query results.
    • This can be easily fixed by sending the query SET time_zone = '+00:00'. I'm adding this to the logic that grabs a connection from the connection pool.
  2. Node needs to have its timezone set to UTC, via process.env.TZ, before starting the server.
    • Trying to set this on a running process produces non-deterministic results; see this bug for details.
    • When node-mysql gets a timestamp from the database, it converts it into a Date object, which is formatted to the system local time.
    • Other workarounds for node-mysql returning results in PST aren't attractive:
      • use the deprecated node-mysql typeCast function to write a custom type mapping for dates (ughhh) (example)
      • manually handle this in the db layer and hope we don't miss a spot

The simplest workaround is to ensure that the env var 'TZ' has been set before the server starts. I'm not sure how to map this request to grunt's declarative syntax. @pdehaan any thoughts on how best to handle this?

Track which device created a visit

In order to give users a more "seamful" experience, we should know which device they were browsing from so that we can offer them better context.

Create base worker code / work queue

This is not really user visible, but does provide a pluggable infrastructure for every additional service to hook into, while keeping them all decoupled.

add workers to send URLs to embed.ly, ES, MySQL

Assume hapi tosses a new URL into the redis work queue, we'll want a few workers doing various things with that data:

  • drop it into mysql (directly or via ORM? feels awkward)
  • send it off to embed.ly
  • after it returns from embed.ly, throw (some subset of) the response into elasticsearch
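The flow above can be sketched end to end; here an in-memory array stands in for the redis queue, and the scraper/store/index steps are injected as callbacks, so none of the real services are assumed. All names are hypothetical.

```javascript
// In-memory stand-in for the redis work queue.
var queue = [];

function enqueue(url) { queue.push(url); }

// Pull one URL off the queue and run it through the three steps:
// store the raw visit, send the URL to the scraper, index the response.
function processNext(scrape, store, index) {
  var url = queue.shift();
  if (!url) { return null; }        // nothing to do
  store(url);                       // drop it into the SQL store
  var data = scrape(url);           // send it off to embed.ly (or similar)
  index(data);                      // throw (a subset of) the response into ES
  return data;
}
```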

A couple of JSHint warnings

STR:

  1. Set up JSHint.
  2. Run it.

Actual results:

$ npm run lint

> [email protected] lint /Users/pdehaan/dev/github/chronicle
> grunt lint

Running "jshint:app" (jshint) task

app/scripts/views/base.js
  line 72  col 31  'text' is defined but never used.

app/scripts/views/visits/index.js
  line 12  col 34  Extra comma. (it breaks older versions of IE)

  ⚠  2 warnings

Warning: Task "jshint:app" failed. Use --force to continue.

Aborted due to warnings.

72:        context.l = function (text) {
73:          return function (text, render) {
74:            return render(self.localize(text));
75:          };
76:        };

Warning 1: app/scripts/views/base.js:72: the text parameter is defined twice (lines 72 and 73); the outer one is shadowed by the inner one, which is what line 74 actually uses.


11:  var VisitsIndexView = BaseView.extend({
12:    template: VisitsIndexTemplate,
13:  });

Warning 2: app/scripts/views/visits/index.js:12: the trailing comma after template breaks older versions of IE.
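One way to silence the first warning is to drop the shadowed outer text parameter so only the inner callback declares it. Sketch below, with self.localize stubbed out for illustration (the stub is not the real view code):

```javascript
// Stub standing in for the view's localize method.
var self = { localize: function (s) { return s; } };
var context = {};

// Only the inner function declares `text` now, so nothing is shadowed.
context.l = function () {
  return function (text, render) {
    return render(self.localize(text));
  };
};
```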

strip out trailing slashes

hapi is fairly stupid about mapping /v1/visits/ to /v1/visits. Build some middleware to handle this before routing.
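A sketch of the path rewrite such middleware would perform; the assumption is that this would hang off a hapi onRequest extension point so it runs before routing, and the function name is made up here.

```javascript
// Collapse any run of trailing slashes, but leave the bare root path alone.
function stripTrailingSlash(path) {
  return path.length > 1 ? (path.replace(/\/+$/, '') || '/') : path;
}
```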

Create /app/images directory and copy over the files into dist/ during grunt build

I think the rest of the pieces are in place, we'll just need to create an app/images/ directory and then modify grunttasks/copy.js to copy everything from "images/*.{gif,jpeg,jpg,png}" into dist/ as well.

THEN we should be able to use the handy image-url('logo.png'); to correctly resolve image paths to /assets/images/logo.png using our Sass helpers.

Set up grunt watch task to restart Hapi server on server changes?

I tried hacking on grunt-hapi task to see if we could come up with a nice system of restarting the server on client/server changes, but it seems to be choking on "Error: listen EADDRINUSE" when the server restarts.

Not sure if there is a mis-config in my minimal Grunt tasks, or something deeper down the rabbit hole.

Rough prototype at https://github.com/pdehaan/grunt-examples/tree/master/grunt-hapi-example

$ grunt server
Running "hapi:async" (hapi) task

Running "watch" task
Waiting...
>> File "server/index.js" changed.
Running "hapi:async" (hapi) task

Done, without errors.

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: listen EADDRINUSE
    at errnoException (net.js:904:11)
    at Server._listen2 (net.js:1042:14)
    at listen (net.js:1064:10)
    at net.js:1146:9
    at asyncCallback (dns.js:68:16)
    at Object.onanswer [as oncomplete] (dns.js:121:9)
Completed in 0.702s at Thu Dec 18 2014 17:59:28 GMT-0800 (PST) - Waiting...

roll our own image proxy service

  • we want a richer view of history than just URLs, titles
  • we expect the scraper's analysis of the page will return 1 or a few image URLs
  • grab that image, resize/optimize it, host it somewhere (https and http accessible)
  • expose final proxied link, so we can store it in the DB (or something, TBD)
  • do we want this service to check referrers and disallow non-chronicle traffic?
