Log-Analysis-Udacity-Project

An internal reporting tool that uses the data in a web server's large database to draw business conclusions. (Project from the Full Stack Web Development Nanodegree.)

Introduction

This is a Python module that uses the data in a web server's large database to draw business conclusions. The database contains newspaper articles as well as the web server log for the site. The log has one database row for each time a reader loaded a web page. The database includes three tables:

  • The authors table includes information about the authors of articles.
  • The articles table includes the articles themselves.
  • The log table includes one entry for each time a user has accessed the site.
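
To make the relationship between the three tables concrete, here is a toy sketch in plain Python. The column names (title, slug, author, id, name, path) follow the view definitions further down in this README; the rows themselves and the two function names are invented for illustration and do not come from the real database or from log.py.

```python
from collections import Counter

# Invented sample rows; only the column names come from the README's queries.
authors = [{"id": 1, "name": "Author One"}, {"id": 2, "name": "Author Two"}]
articles = [
    {"title": "First Story", "slug": "first-story", "author": 1},
    {"title": "Second Story", "slug": "second-story", "author": 2},
]
log = [
    {"path": "/article/first-story"},
    {"path": "/article/second-story"},
    {"path": "/article/first-story"},
    {"path": "/"},  # not an article page, so it matches no article
]


def views_per_article(articles, log):
    """Count log rows whose path equals '/article/' + slug, per title."""
    title_by_path = {"/article/" + a["slug"]: a["title"] for a in articles}
    counts = Counter(
        title_by_path[row["path"]] for row in log if row["path"] in title_by_path
    )
    return counts.most_common()  # most viewed first


def views_per_author(articles, authors, log):
    """Same count, but credited to the author of each matched article."""
    name_by_id = {a["id"]: a["name"] for a in authors}
    author_by_path = {
        "/article/" + a["slug"]: name_by_id[a["author"]] for a in articles
    }
    counts = Counter(
        author_by_path[row["path"]] for row in log if row["path"] in author_by_path
    )
    return counts.most_common()


print(views_per_article(articles, log))
print(views_per_author(articles, authors, log))
```

In the project itself this matching is done in SQL rather than in Python; the sketch only shows which columns link the tables.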

The project draws the following conclusions:

  • The three most popular articles of all time.
  • The most popular article authors of all time.
  • The days on which more than 1% of requests led to errors.

Functions in log.py:

  • connect(): Connects to the PostgreSQL database and returns a database connection.
  • popular_article(): Prints the three most popular articles of all time.
  • popular_authors(): Prints the most popular article authors of all time.
  • log_status(): Prints the days on which more than 1% of requests led to errors.
  • view_popular_articles(): Creates the popular_articles view, which backs the first conclusion.
  • view_popular_authors(): Creates the popular_authors view, which backs the second conclusion.
  • view_log_status(): Creates the log_status view, which backs the third conclusion.
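
As a rough illustration of how a reporting function such as popular_article() might be structured, the sketch below reads the top rows from the popular_articles view through any DB-API 2.0 connection and formats them for printing. The helper names top_articles and format_report and the output format are hypothetical; the actual code in log.py is not reproduced here. The use of psycopg2 as the PostgreSQL driver is also an assumption.

```python
def top_articles(conn, limit=3):
    """Return the `limit` most viewed (title, views) rows.

    `conn` is any DB-API 2.0 connection on which the popular_articles
    view exists, for example one opened with psycopg2 (an assumption).
    """
    # int() keeps the LIMIT value numeric, so it is safe to interpolate.
    query = (
        "select title, views from popular_articles "
        "order by views desc limit {}".format(int(limit))
    )
    cur = conn.cursor()
    try:
        cur.execute(query)
        return cur.fetchall()
    finally:
        cur.close()


def format_report(rows):
    """Turn (title, views) rows into printable report lines."""
    return ['"{}" - {} views'.format(title, views) for title, views in rows]
```

With a live database, this could be used as `for line in format_report(top_articles(connect())): print(line)`, where connect() is the function listed above.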

Views Made:

  • popular_articles

create or replace view popular_articles as
select title, count(title) as views from articles,log
where log.path = concat('/article/',articles.slug)
group by title order by views desc;
  • popular_authors

create or replace view popular_authors as
select authors.name, count(articles.author) as views from articles, log, authors
where log.path = concat('/article/',articles.slug) and articles.author = authors.id
group by authors.name order by views desc;
  • log_status

create or replace view log_status as
select Date,Total,Error, (Error::float*100)/Total::float as Percent from
(select time::timestamp::date as Date, count(status) as Total,
sum(case when status = '404 NOT FOUND' then 1 else 0 end) as Error from log
group by time::timestamp::date) as result
where (Error::float*100)/Total::float > 1.0 order by Percent desc;
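
The filter in log_status comes down to simple arithmetic per day: errors × 100 / total, kept only when the result exceeds 1.0. Here is a minimal Python sketch of that calculation. The function name error_days is hypothetical, and the dates and counts are made up for illustration, not taken from the real database.

```python
def error_days(daily_counts, threshold=1.0):
    """Return (day, percent) pairs where errors exceed `threshold` percent.

    `daily_counts` maps a day to a (total_requests, error_requests) tuple.
    """
    result = []
    for day, (total, errors) in daily_counts.items():
        if total == 0:
            continue  # no requests that day, so no meaningful error rate
        percent = errors * 100.0 / total
        if percent > threshold:
            result.append((day, percent))
    # Highest error rate first, like "order by Percent desc" in the view.
    return sorted(result, key=lambda item: item[1], reverse=True)


# Invented sample counts: day -> (total requests, error requests)
sample = {
    "2000-01-01": (1000, 5),   # 0.5%, below the threshold
    "2000-01-02": (2000, 50),  # 2.5%
    "2000-01-03": (500, 10),   # 2.0%
}

for day, percent in error_days(sample):
    print("{} - {:.2f}% errors".format(day, percent))
```

A day with exactly 1% errors is excluded, matching the strict `> 1.0` comparison in the view.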

Instructions

  • Install Vagrant and VirtualBox.

  • Clone the repository to your local machine:

    git clone https://github.com/visheshbanga/Log-Analysis-Udacity-Project
  • Start the virtual machine

    From your terminal, inside the project directory, run the command `vagrant up`. This will cause Vagrant to download the Linux operating system and install it. When vagrant up is finished running, you will get your shell prompt back. At this point, you can run `vagrant ssh` to log in to your newly installed Linux VM!
  • Download the data

    Download the data archive and unzip it. The file inside is called newsdata.sql. Put this file into the vagrant directory, which is shared with your virtual machine.
  • Set up the database

    To load the database, use the following command:
    psql -d news -f newsdata.sql
  • Make Views

    Create the views either by running the queries listed under "Views Made" on the command line or by uncommenting the view-creation code in the Python module.
  • Run Module

    python log.py

Output:

Screenshot.jpg
