Declaration
Abstract
Table of Contents
Introduction (4 pages)
- Problem Statement - in similar fashion to the intro of the expose
- Motivation - Sketches; why sketches
- Outline - of the whole work / what comes in the coming chapters
Background (8 pages)
- Stream Processing; Distributed Query Processing
- Sketches - history; details about theory; advantages to other solutions; different types of sketches
- Algorithms - AGMS and FastAGMS explained + visual/pseudo code; runtime and space complexity
- Vectorization
- Related Work - Papers that have tested out count-min or AGMS/FAGMS in some setting are good examples. I could stick to some more general information if I don't have enough direct examples, but should also try to keep it brief.
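For the Algorithms item above, a minimal C++ sketch of the two update procedures may help frame the pseudo code. It is an illustration under stated assumptions, not the thesis implementation: `mix64`, `xi_sign`, `AgmsSketch` and `FastAgmsSketch` are hypothetical names, the splitmix64-style mixer stands in for the 4-wise independent hash families the theory assumes, and the rows x buckets layout for AGMS is assumed from the "rows / buckets" wording used in these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 64-bit mixer (splitmix64 finalizer). It is only a stand-in
// for the 4-wise independent hash families assumed by the AGMS analysis.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Pseudo-random sign in {-1, +1} for a (key, seed) pair.
inline std::int64_t xi_sign(std::uint64_t key, std::uint64_t seed) {
    return (mix64(key ^ mix64(seed)) & 1) ? 1 : -1;
}

// AGMS: every update touches all rows * buckets counters,
// so the update cost grows with the sketch size.
struct AgmsSketch {
    std::size_t rows, buckets;
    std::vector<std::int64_t> counters;

    AgmsSketch(std::size_t r, std::size_t b)
        : rows(r), buckets(b), counters(r * b, 0) {
        assert(r > 0 && b > 0);
    }

    void update(std::uint64_t key, std::int64_t weight = 1) {
        for (std::size_t i = 0; i < counters.size(); ++i)
            counters[i] += weight * xi_sign(key, i);
    }
};

// Fast-AGMS: each update hashes the key to one bucket per row,
// so only `rows` counters are touched regardless of `buckets`.
struct FastAgmsSketch {
    std::size_t rows, buckets;
    std::vector<std::int64_t> counters;

    FastAgmsSketch(std::size_t r, std::size_t b)
        : rows(r), buckets(b), counters(r * b, 0) {
        assert(r > 0 && b > 0);
    }

    void update(std::uint64_t key, std::int64_t weight = 1) {
        for (std::size_t r = 0; r < rows; ++r) {
            // Bucket hash uses a different seed than the sign hash.
            std::size_t bucket =
                mix64(key ^ mix64(~static_cast<std::uint64_t>(r))) % buckets;
            counters[r * buckets + bucket] += weight * xi_sign(key, r);
        }
    }
};
```

The contrast in per-update work (all counters vs. one counter per row) is the property the runtime-complexity discussion would build on.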
Approach (11-14 pages)
- Optimization Possibilities - Traditional single-instruction CPU vs. SIMD + AVX-512; Cache vs. RAM vs. Disk
- Optimization Implementation - AVX-512 or another SIMD extension for memory-sensitive cases?; provide code snippets of the original and changed versions; link to the open-source repo of the original code
- Discussion?
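As a placeholder for the "original vs. changed" snippets, the following sketch shows one possible shape of such a pair. It assumes the per-counter sign bits have already been computed, and it deliberately uses no intrinsics: the second function is a branch-free loop over contiguous arrays that a compiler can usually auto-vectorize, and an explicit AVX-512 version would replace that inner loop. The function names and the `std::int32_t` counter type are assumptions for illustration, not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar baseline: one data-dependent branch per counter.
inline void update_branchy(std::vector<std::int32_t>& counters,
                           const std::vector<std::uint8_t>& sign_bits,
                           std::int32_t weight) {
    assert(counters.size() == sign_bits.size());
    for (std::size_t i = 0; i < counters.size(); ++i) {
        if (sign_bits[i]) {
            counters[i] -= weight;  // bit 1 encodes sign -1
        } else {
            counters[i] += weight;  // bit 0 encodes sign +1
        }
    }
}

// Branch-free variant: the sign is computed arithmetically as 1 - 2 * bit,
// so the loop body is straight-line code over contiguous memory.
inline void update_branch_free(std::vector<std::int32_t>& counters,
                               const std::vector<std::uint8_t>& sign_bits,
                               std::int32_t weight) {
    assert(counters.size() == sign_bits.size());
    std::int32_t* c = counters.data();
    const std::uint8_t* s = sign_bits.data();
    const std::size_t n = counters.size();
    for (std::size_t i = 0; i < n; ++i) {
        c[i] += weight * (1 - 2 * static_cast<std::int32_t>(s[i]));
    }
}
```

Both functions compute the same result; whether the second one actually vectorizes depends on the compiler and flags, which is something the profiling step would have to confirm.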
Evaluation (15-17 pages; would contain multiple diagrams)
- Test Environment - Intel Server-Grade CPU; Linux; C++; Data generator
- Analysis Tools - VTune; Bash: report the process and tools, no in-depth explanations of how they work.
- Setting the baseline - but which baseline? Algorithm execution times? (could run a file with n samples and record the average execution time); VTune hotspots analysis (+ code snippets); VTune performance analysis
- Results - VTune hotspot/performance; algorithm execution speed-up (compared to baseline); results from different data distributions; results on machines with different cache sizes?
- Cache size as a factor?
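One way to pin down the baseline question above is a small timing harness that records the average execution time over n runs and derives throughput and speed-up from it. This is a hedged sketch: `BenchResult`, `benchmark` and `speed_up` are hypothetical names, and the callable passed in is assumed to process a fixed number of tuples per run.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Average wall-clock time per run and the resulting throughput.
struct BenchResult {
    double avg_seconds;
    double tuples_per_second;
};

// Runs `run_once` `runs` times and averages the elapsed time.
// `tuples_per_run` is the number of stream tuples each run processes.
template <typename Fn>
BenchResult benchmark(Fn&& run_once, std::size_t tuples_per_run,
                      std::size_t runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (std::size_t r = 0; r < runs; ++r) {
        const auto start = clock::now();
        run_once();
        total += std::chrono::duration<double>(clock::now() - start).count();
    }
    const double avg = total / static_cast<double>(runs);
    const double throughput =
        avg > 0.0 ? static_cast<double>(tuples_per_run) / avg : 0.0;
    return {avg, throughput};
}

// Speed-up of the optimized version relative to the baseline.
inline double speed_up(const BenchResult& baseline,
                       const BenchResult& optimized) {
    assert(optimized.avg_seconds > 0.0);
    return baseline.avg_seconds / optimized.avg_seconds;
}
```

The same harness could be applied unchanged to the baseline and the optimized build, so that the reported speed-up numbers come from identical measurement code.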
Conclusion (2 pages)
- Optimization Conclusion AGMS - compute, memory or both
- Optimization Conclusion FastAGMS - compute, memory or both
- Comparison Conclusion AGMS vs. FastAGMS: based on something Martin mentioned; based on data distribution; what else?
List of Tables
List of Figures
Listings
Bibliography
from thesis.
- Read a few papers that use SIMD and such to optimize to get inspired on structure and visualization
Message to convey:
- Varying degrees of throughput speed-up can be observed depending on the number of rows / buckets. AGMS tends to get a bigger overall speed-up, but F-AGMS has higher throughput in the scenarios with more rows / buckets (cases where high accuracy is the first priority).
- AGMS can be useful in cases where accuracy can be sacrificed for speed, as it has a significant edge in scenarios with fewer buckets / rows.
- F-AGMS can sustain a good accuracy/throughput ratio across all scenarios, with even 8 buckets / 8 rows being far more precise than AGMS. Maybe provide an 8 rows / 163840 buckets case like in your example to show that accuracy can be kept very high (probably in the high 98%s to low 99%s) with practically no performance loss and a small space cost.
- Display the overall speed-up over the current implementation across all tested cases for both algorithms and compare the numbers. It should be around 10x for AGMS and 7x for F-AGMS.
- There is a small performance hit when using data samples other than Zipf. Normal has a slightly lower throughput, and uniform is lower still. These differences are tied to the microarchitecture utilization observed via VTune.
- Probably show more memory dependence after implementing SIMD (reaching the peak of compute headroom), but I'll have to see what comes out of profiling the optimized files.
- Discussion of future optimization: parallelization, memory-bound optimization?, GPU implementation, further CPU SIMD optimization.
1. Experimental Setup / Baseline
1.1 VTune
- Overall runtime vs microarchitecture utilization based on data distribution. Discuss correlation
1.1.1 Hotspot Analysis
- Per-function runtime ratio and microarchitecture utilization
- Heatmap of overall microarchitecture utilization (serves as overview)
1.1.2 Micro-architecture Analysis
- discuss memory-bounds results
1.2 AGMS
- raw data points
- averaged data plus curve (tendency)
1.3 Fast-AGMS
- raw data points
- averaged data plus curve (tendency)
1.4 AGMS vs Fast-AGMS
- Comparison line-plot, serves as overview
- Averaged throughput across all cases, single number for each algorithm, easy to compare average speed(-up)
2. Optimization Benchmarked
Repeat more or less the same steps (VTune only partially):
- One graph showing the new data
- One additional graph comparing between old and new (when necessary)
Approach
server hw specs
VTune command line; VTune version?
Three sample data types of different sizes. Code snippets for each distribution.
Using SIMD: Sketch implementation in C++ - some snippets
Seaborn for graphing; maybe one code snippet example
Explain how the different cases are benchmarked and how many runs each gets, with a fresh set of random variables generated for every run
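For the three sample data types listed above, a minimal generator sketch could look like the following. It is an assumption-laden illustration, not the thesis data generator: the function names, the key type, the normal distribution's mean and standard deviation (domain/2 and domain/8), and the table-based Zipf sampler are all hypothetical choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Keys = std::vector<std::uint32_t>;

// Uniform keys in [0, domain).
Keys uniform_keys(std::size_t n, std::uint32_t domain, std::mt19937_64& rng) {
    assert(domain > 0);
    std::uniform_int_distribution<std::uint32_t> dist(0, domain - 1);
    Keys out(n);
    for (auto& k : out) k = dist(rng);
    return out;
}

// Normally distributed keys, rounded and clamped to [0, domain).
// Mean domain/2 and standard deviation domain/8 are arbitrary assumptions.
Keys normal_keys(std::size_t n, std::uint32_t domain, std::mt19937_64& rng) {
    assert(domain > 0);
    std::normal_distribution<double> dist(domain / 2.0, domain / 8.0);
    Keys out(n);
    for (auto& k : out) {
        const double v = std::round(dist(rng));
        k = static_cast<std::uint32_t>(
            std::min(std::max(v, 0.0), domain - 1.0));
    }
    return out;
}

// Zipf keys via inverse-CDF lookup: key i has weight 1 / (i + 1)^skew.
Keys zipf_keys(std::size_t n, std::uint32_t domain, double skew,
               std::mt19937_64& rng) {
    assert(domain > 0);
    std::vector<double> cdf(domain);
    double sum = 0.0;
    for (std::uint32_t i = 0; i < domain; ++i) {
        sum += 1.0 / std::pow(i + 1.0, skew);
        cdf[i] = sum;  // unnormalized cumulative weights
    }
    std::uniform_real_distribution<double> dist(0.0, sum);
    Keys out(n);
    for (auto& k : out) {
        const auto it = std::lower_bound(cdf.begin(), cdf.end(), dist(rng));
        const std::size_t idx = static_cast<std::size_t>(it - cdf.begin());
        // Guard against rounding placing the sample past the last entry.
        k = static_cast<std::uint32_t>(
            std::min<std::size_t>(idx, domain - 1));
    }
    return out;
}
```

Re-seeding or advancing the same `std::mt19937_64` between runs gives each benchmark run a freshly generated sample.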