Name: Kyle Bendickson
Type: User
Company: @tabular-io
Bio: I work on distributed systems, mostly big data as an open source dev working on Apache Iceberg and friends. But mostly, I walk my dog a lot.
Location: Los Angeles, CA
Kyle Bendickson's Projects
Repro case for pants.buildgen issue with generated sources
Scala macros for generating Parquet schema projections and filter predicates
Example: Convert Protobuf to Parquet using parquet-avro and avro-protobuf
Parquet Command-line Tools
Apache Parquet
VM based deployment for prototyping Big Data tools on Amazon Web Services
Guide to setting up a development environment on Apple Silicon M1 computers using Homebrew, Python, Pyenv, Poetry, NumPy, and TensorFlow.
MITM data analysis utility for Pokemon GO
The official home of the Presto distributed SQL query engine for big data
Python bindings to the Zstandard (zstd) compression library
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
An implementation of https://raytracing.github.io/books/RayTracingInOneWeekend.html
RayDP: Distributed data processing library that provides simple APIs for running Spark on Ray and integrating Spark with distributed deep learning and machine learning frameworks.
A library that provides an embeddable, persistent key-value store for fast storage optimized for AWS
Server Implementations of the rosbridge v2 Protocol
The Standard ROS JavaScript Library
Rust wrapper for RocksDB
The Scala programming language
Scaling Python Machine Learning
A Scala API for Apache Beam and Google Cloud Dataflow.
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Apache Spark
DataStax Spark Cassandra Connector
Spark ClickHouse Connector built on the DataSourceV2 API and gRPC protocol.
Flowchart for debugging Spark applications
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Kafka offset committer for structured streaming query