flight-scrappper

A web scraper built with Node.js and selenium-webdriver that gathers flight data and stores it in a MongoDB database.

const FlightScrappper = require('flight-scrappper');

// run() returns a promise: log the result, or report the failure.
FlightScrappper.run().then((flights) => {
    console.log(flights);
}).catch((err) => {
    console.error('Found an error! ' + err);
});

Requirements

Installation

$ npm install --save flight-scrappper

Options

Options are passed to FlightScrappper.run() as a single object, for example {option1: 'abc', option2: 'abc', ...}.

Any option that is not defined falls back to its default value.

These are the default values:

const defaultDateFormat = 'DD-MM-YYYY';
let defaultOptions = {
    periods: 1,
    interval: 48,
    currency: 'EUR',
    directFlight: false,
    dateFormat: defaultDateFormat,
    targetDate: Utils.getDefaultDateString(defaultDateFormat),
    database: 'localhost:27017/flight-scrappper',
    collection: 'flight-data',
    timeout: 80000,
    browser: 'chrome',
    chromedriverArgs: [],
    maximize: false,
    retries: 1,
    routes: [{
        from: 'LIS',
        to: 'PAR'
    }]
};
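The fallback to default values can be pictured as a shallow merge. The sketch below is only an illustration: resolveOptions is a hypothetical helper, the defaults are a shortened copy of the list above, and the library's actual merge logic is not shown in this README.

```javascript
// Shortened copy of the defaults listed above (illustration only).
const defaultOptions = {
    periods: 1,
    interval: 48,
    currency: 'EUR',
    routes: [{ from: 'LIS', to: 'PAR' }]
};

// Hypothetical helper: the real library may merge options differently.
function resolveOptions(userOptions = {}) {
    // Later sources win, so user-supplied keys override the defaults.
    // The merge is shallow: a user-supplied routes array replaces the default one.
    return Object.assign({}, defaultOptions, userOptions);
}

const options = resolveOptions({ currency: 'USD', periods: 3 });
console.log(options.currency); // 'USD' (overridden)
console.log(options.interval); // 48 (default kept)
```

Merging into a fresh object leaves defaultOptions untouched, so repeated calls always start from the same defaults.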

The queried dates are generated by starting at targetDate and repeatedly adding options.interval until options.periods dates have been produced.

Example: Setting periods to 2, interval to 24 and targetDate to 05/01/2000 will generate an array such as ['05/01/2000','07/01/2000'].
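One way to read the date formula is sketched below. buildQueryDates, parseDate and formatDate are hypothetical names, not part of the library. The sketch assumes that interval is measured in hours, which this README does not confirm, and that dates use the default 'DD-MM-YYYY' format.

```javascript
// Hypothetical helpers. Assumption: interval is in hours, format is DD-MM-YYYY.
function parseDate(str) {
    const [day, month, year] = str.split('-').map(Number);
    // Work in UTC so daylight-saving changes cannot shift the result.
    return Date.UTC(year, month - 1, day);
}

function formatDate(ms) {
    const d = new Date(ms);
    const pad = (n) => String(n).padStart(2, '0');
    return `${pad(d.getUTCDate())}-${pad(d.getUTCMonth() + 1)}-${d.getUTCFullYear()}`;
}

function buildQueryDates(targetDate, periods, intervalHours) {
    const start = parseDate(targetDate);
    const dates = [];
    // Period i queries targetDate + i * interval, starting at targetDate itself.
    for (let i = 0; i < periods; i++) {
        dates.push(formatDate(start + i * intervalHours * 3600 * 1000));
    }
    return dates;
}

console.log(buildQueryDates('05-01-2000', 2, 48));
// [ '05-01-2000', '07-01-2000' ]
```

Under the hours assumption, a two-day step between queried dates corresponds to an interval of 48, which is also the default value.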

Running

First, start your MongoDB database. You can use npm run mongo-linux/win/mac to start it easily, or start it manually. To see what these commands do, read the scripts object in the package.json file.

To scrape flights without storing the data, set database to 'none'.

To start flight-scrappper with the default values, run $ npm start.

To run with different options, add arguments as specified in Options.

For feedback in the console, see Debugging or Verbose.

Output

FlightScrappper.run() returns a promise that resolves to the number of inserted documents or rejects with an error.

The output data will look like this:

{
    "search" : {
        "from" : "LIS",
        "to" : "AKL",
        "source" : "momondo",
        "queried" : ISODate("2016-10-23T12:09:21.566Z")
    },
    "data" : {
        "duration" : 2080,
        "stops" : 2,
        "flightClass" : 0,
        "airline" : ["TAP","Ryanair"],
        "price" : {
            "amount" : 778,
            "currency" : "EUR"
        },
        "departure" : {
            "time" : {
                "minute" : 15,
                "hour" : 14,
                "day" : 25,
                "month" : 10,
                "year" : 2016
            },
            "airport" : "LIS"
        },
        "arrival" : {
            "time" : {
                "minute" : 55,
                "hour" : 12,
                "day" : 27,
                "month" : 10,
                "year" : 2016
            },
            "airport" : "AKL"
        }
    }
}
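As a rough illustration of how a stored document might be consumed, the sketch below turns one into a one-line summary. summarizeFlight and formatDuration are hypothetical helpers, not part of the library. The sketch assumes duration is expressed in minutes, which seems consistent with the sample's local departure and arrival times but is not stated in this README.

```javascript
// Trimmed copy of the sample document above (illustration only).
const sample = {
    search: { from: 'LIS', to: 'AKL', source: 'momondo' },
    data: {
        duration: 2080,
        stops: 2,
        airline: ['TAP', 'Ryanair'],
        price: { amount: 778, currency: 'EUR' }
    }
};

// Assumption: duration is a whole number of minutes.
function formatDuration(totalMinutes) {
    const hours = Math.floor(totalMinutes / 60);
    const minutes = totalMinutes % 60;
    return `${hours}h ${minutes}m`;
}

// Hypothetical helper that builds a readable summary from one document.
function summarizeFlight(doc) {
    const { search, data } = doc;
    return `${search.from} -> ${search.to}: ${formatDuration(data.duration)}, ` +
        `${data.stops} stop(s), ${data.price.amount} ${data.price.currency} ` +
        `(${data.airline.join(', ')})`;
}

console.log(summarizeFlight(sample));
// LIS -> AKL: 34h 40m, 2 stop(s), 778 EUR (TAP, Ryanair)
```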

Tests

To run the test suite, first install the dependencies, then run npm test:

$ npm install
$ npm test

Debugging or Verbose

Run $ npm run debug to get console output.

Contributing

Contributions, requests or pull requests are welcome & appreciated!

Please send your pull requests to the developing branch!

Send me an email if you have questions regarding possible contributions.

License

MIT
