data_engineer_roadmap's Introduction

Data Engineer Study Roadmap

Roadmap overview

Mind map Data Engineer

Online courses, bootcamps and articles

| Title | Content | Platform | Language | Link |
|---|---|---|---|---|
| The Ultimate Hands-On Hadoop - Tame your Big Data! | General | Udemy | en_US | Go to Course |

Capstone project

Processing profitability of investment portfolios

You have just been hired by a startup to work as Principal Data Engineer. The startup's business is to provide an investment platform where customers have at their disposal a wide range of financial products that can be considered in their investment strategies, such as:

  • Bonds
  • CDs (Certificates of Deposit)
  • Stocks
  • Investment Funds
  • Retirement
  • REITs
  • Etc.

Beyond the quantity and range of products available, and the user experience created for these clients, the startup wishes to provide daily information about the performance of each customer's investment portfolio and, in the future, to provide personalized recommendations to improve their investment performance.

Your challenge as Principal Data Engineer is to architect, design and implement a brand new data architecture on an on-premise infrastructure for this startup, in order to provide your customers with data about the performance of their investments.

User stories

  • As an investor, I want to receive a monthly overview of my portfolio, with the percentage allocation and profitability by portfolio, asset class and asset, so I can better understand how my portfolio is evolving;
  • As an investor, I want to see a daily overview of my portfolio, with the percentage allocation and profitability by portfolio, asset class and asset, so I can follow its evolution and decide whether to rebalance it;
  • As an investor, I want to see the difference between my portfolio profile and my investor profile, so I can check whether I am out of step with my suitability;
  • As a broker, I want to see which clients are underperforming compared with an index, so I can offer them consulting, mentorship or training to help them perform better;
  • As a broker, I want to see which customers are out of line with their suitability, so I can help them bring their investment strategy closer to their investor profile or gain insight into those who have changed their investment behaviour;
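The allocation and profitability figures mentioned in the investor stories can be sketched as follows. This is only an illustration under stated assumptions: the field names (`asset_class`, `invested`, `market_value`) and the sample positions are hypothetical, since the portfolio assets feed is not specified here, and profitability is taken as a simple return over the invested amount.

```python
from collections import defaultdict

# Hypothetical position records; field names and values are assumptions
# made for illustration, not part of any feed specification.
positions = [
    {"asset": "BOND-A", "asset_class": "Bonds", "invested": 4000.0, "market_value": 4200.0},
    {"asset": "STOCK-X", "asset_class": "Stocks", "invested": 3000.0, "market_value": 3300.0},
    {"asset": "STOCK-Y", "asset_class": "Stocks", "invested": 2000.0, "market_value": 1900.0},
    {"asset": "REIT-Z", "asset_class": "REITs", "invested": 1000.0, "market_value": 1100.0},
]


def summarize_by_class(positions):
    """Return percentage allocation and simple return per asset class."""
    totals = defaultdict(lambda: {"invested": 0.0, "market_value": 0.0})
    for p in positions:
        t = totals[p["asset_class"]]
        t["invested"] += p["invested"]
        t["market_value"] += p["market_value"]

    portfolio_value = sum(t["market_value"] for t in totals.values())
    summary = {}
    for asset_class, t in totals.items():
        # Guard against empty portfolios or classes with nothing invested.
        allocation = 100.0 * t["market_value"] / portfolio_value if portfolio_value else 0.0
        simple_return = (
            100.0 * (t["market_value"] - t["invested"]) / t["invested"]
            if t["invested"]
            else 0.0
        )
        summary[asset_class] = {"allocation_pct": allocation, "return_pct": simple_return}
    return summary


summary = summarize_by_class(positions)
for asset_class, row in summary.items():
    print(f"{asset_class}: {row['allocation_pct']:.1f}% of portfolio, {row['return_pct']:+.1f}% return")
```

The same grouping could be applied per asset or per portfolio by changing the grouping key. On a Hadoop cluster this aggregation would more likely run as a distributed job, but the arithmetic stays the same.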

Challenges (Optional)

  • Configure the whole infrastructure (cluster) for data ingestion and data processing using the Hadoop ecosystem from the ground up;
  • Architect, design and implement a transition from on-premise to cloud computing using Amazon Web Services or Google Cloud big data solutions;

DATA FEEDS

CLIENT DATA FEED

This file contains information about the active clients, including each client's ID, address, etc.

| Column | Datatype | Length | Description |
|---|---|---|---|
| Firstname | TEXT | 15 | The client's first name |
| Lastname | TEXT | 15 | The client's last name |
| Middlename | TEXT | 20 | The client's middle name |
| Document number | TEXT | 20 | The client's document number |
| Document type | TEXT | 10 | The client's document type |
| Street Address 1 | TEXT | 60 | The client's street address line 1 |
| Street Address 2 | TEXT | 60 | The client's street address line 2 |
| City | TEXT | 20 | The client's city |
| County | TEXT | 20 | The client's county |
| State | TEXT | 20 | The client's state |
| Zipcode | NUMBER | 9 | The client's zipcode |
| Country | TEXT | 10 | The client's country |
| Contract number | NUMBER | 10 | The client's contract number. Left-padded with zeros |
| Investor profile | NUMBER | 10 | The client's investor profile. Could be: CONSERVATIVE, MODERATE, BALANCED, GROWTH |
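As a rough sketch of how a record from this feed might be checked against the lengths in the table above, consider the following. The file format is not specified, so the example assumes each record has already been parsed into a dictionary; the snake_case keys and the `normalize_client` helper are hypothetical names. It also assumes the investor profile arrives as one of the four listed names, even though the column is typed as NUMBER.

```python
# Maximum lengths taken from the client feed table; the snake_case keys are
# hypothetical names chosen for this sketch.
CLIENT_TEXT_FIELDS = {
    "firstname": 15,
    "lastname": 15,
    "middlename": 20,
    "document_number": 20,
    "document_type": 10,
    "street_address_1": 60,
    "street_address_2": 60,
    "city": 20,
    "county": 20,
    "state": 20,
    "country": 10,
}
INVESTOR_PROFILES = {"CONSERVATIVE", "MODERATE", "BALANCED", "GROWTH"}


def normalize_client(record):
    """Return (clean_record, errors) for one already-parsed client record."""
    errors = []
    clean = dict(record)

    # Text columns must fit within the declared length.
    for field, max_len in CLIENT_TEXT_FIELDS.items():
        value = str(record.get(field, ""))
        if len(value) > max_len:
            errors.append(f"{field}: longer than {max_len} characters")

    # Zipcode: digits only, at most 9 characters.
    zipcode = str(record.get("zipcode", ""))
    if not (zipcode.isdigit() and len(zipcode) <= 9):
        errors.append("zipcode: expected up to 9 digits")

    # Contract number: digits only, left-padded with zeros to 10 characters.
    contract = str(record.get("contract_number", ""))
    if contract.isdigit() and len(contract) <= 10:
        clean["contract_number"] = contract.zfill(10)
    else:
        errors.append("contract_number: expected up to 10 digits")

    # Investor profile: assumed to be one of the four listed names.
    profile = str(record.get("investor_profile", "")).upper()
    if profile in INVESTOR_PROFILES:
        clean["investor_profile"] = profile
    else:
        errors.append("investor_profile: unknown value")

    return clean, errors


sample = {
    "firstname": "Ana",
    "lastname": "Silva",
    "zipcode": "12345678",
    "contract_number": "12345",
    "investor_profile": "moderate",
}
clean, errors = normalize_client(sample)
print(clean["contract_number"], clean["investor_profile"], errors)
```

Returning the errors alongside the cleaned record, rather than raising on the first problem, makes it easier to route bad rows to a rejection file during ingestion.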

PORTFOLIOS ASSETS DATA FEED

FUNDS INFO DATA FEED

STOCKS DATA FEED

BENCHMARK INDEX DATA FEED
