Giter VIP home page Giter VIP logo

Laurence Geng's Projects

apache-hudi-core-conceptions icon apache-hudi-core-conceptions

A set of notebooks to explore and explain core conceptions of Apache Hudi, such as file layouts, file sizing, compaction, clustering and so on.

aws-cli icon aws-cli

Universal Command Line Interface for Amazon Web Services

aws-cli-plus icon aws-cli-plus

This command line tool is a useful complement to aws-cli. It offers a suite of utilities that manages and operates ec2, emr and other aws services.

aws-glue-data-catalog-client-for-apache-hive-metastore icon aws-glue-data-catalog-client-for-apache-hive-metastore

The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions

aws-serverless-data-lake-workshop icon aws-serverless-data-lake-workshop

This workshop is meant to give customers a hands-on experience with mentioned AWS services.  Serverless Data Lake workshop helps customers build a cloud-native and future-proof serverless data lake architecture. It allows hands-on time with AWS big data and analytics services including Amazon Kinesis Services for streaming data ingestion and analytics, AWS Glue for ETL and Data Catalogue Management, Amazon Athena to query data lake.

bash-ini-multisection icon bash-ini-multisection

This is a bash library for reading ini style config files, which allows multiple [section] entries.

bdp icon bdp

A prototype project of big data platform, the source codes of the book Big Data Platform Architecture and Prototype

bdp-platform icon bdp-platform

大数据生态解决方案数据平台:基于大数据、数据平台、微服务、机器学习、商城、自动化运维、DevOps、容器部署平台、数据平台采集、数据平台存储、数据平台计算、数据平台开发、数据平台应用搭建的大数据解决方案。

bigdata-growth icon bigdata-growth

大数据知识仓库涉及到数据仓库建模、实时计算、大数据、数据中台、系统设计、Java、算法等。

demo-lib icon demo-lib

Routines & sample codes to demonstrate framework usage and architecture thought.

emr-edgenode-maker icon emr-edgenode-maker

This tool can easily make / build an emr cluster edge node / client node / gateway node

emr-serverless-utils icon emr-serverless-utils

A utilities library for Amazon EMR Serverless, i.e. a generic job class for executing sql files, and so on.

flink-recommendsystem-demo icon flink-recommendsystem-demo

:helicopter::rocket:基于Flink实现的商品实时推荐系统。flink统计商品热度,放入redis缓存,分析日志信息,将画像标签和实时记录放入Hbase。在用户发起推荐请求后,根据用户画像重排序热度榜,并结合协同过滤和标签两个推荐模块为新生成的榜单的每一个产品添加关联产品,最后返回新的用户列表。

flink-sql-cdc icon flink-sql-cdc

Self-contained demo using Flink SQL and Debezium to build a CDC-based analytics pipeline. All you need is Docker! :whale:

flying-diamond icon flying-diamond

This project is developed in 2011, I wrote it for learning MVC pattern and Java Swing library.

handson-ml2 icon handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.