
Comments (3)

beepy0 commented on September 25, 2024

Declaration
Abstract
Table of Contents
Introduction (4 pages)

  • Problem Statement - similar in style to the introduction of the exposé
  • Motivation - Sketches; why sketches
  • Outline - of the whole work / what comes in the following chapters

Background (8 pages)

  • Stream Processing; Distributed Query Processing
  • Sketches - history; details about theory; advantages over other solutions; different types of sketches
  • Algorithms - AGMS and Fast-AGMS explained + visual/pseudo code; runtime and space complexity
  • Vectorization
  • Related Work - Papers that have tested count-min or AGMS/Fast-AGMS in some setting are good examples. If there are not enough direct examples, I could fall back on more general material, but should keep it brief.
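As a starting point for the "visual/pseudo code" item above, the difference in update cost between the two algorithms can be sketched as follows. This is a minimal illustration, not the thesis code: the names `AgmsSketch`, `FastAgmsSketch`, `mix`, `sign` and `nonzero` are made up here, and the mixing function is a stand-in for the 4-wise independent ±1 families that the theory assumes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative 64-bit mixer (splitmix64-style finalizer). Assumption: a
// stand-in only; the AGMS guarantees rely on 4-wise independent +/-1 variables.
inline std::uint64_t mix(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// +1 or -1, derived from the key and a per-counter (or per-row) seed.
inline int sign(std::uint64_t key, std::uint64_t seed) {
    return (mix(key ^ mix(seed)) & 1u) ? 1 : -1;
}

struct AgmsSketch {
    std::size_t rows, buckets;
    std::vector<std::int64_t> counters;  // rows * buckets counters, row-major

    AgmsSketch(std::size_t r, std::size_t b) : rows(r), buckets(b), counters(r * b, 0) {
        assert(r > 0 && b > 0);
    }

    // AGMS: every counter is touched for every tuple, O(rows * buckets).
    void update(std::uint64_t key, std::int64_t weight = 1) {
        for (std::size_t i = 0; i < counters.size(); ++i)
            counters[i] += weight * sign(key, i);
    }
};

struct FastAgmsSketch {
    std::size_t rows, buckets;
    std::vector<std::int64_t> counters;  // rows * buckets counters, row-major

    FastAgmsSketch(std::size_t r, std::size_t b) : rows(r), buckets(b), counters(r * b, 0) {
        assert(r > 0 && b > 0);
    }

    // Fast-AGMS: a hash picks one bucket per row, O(rows) per tuple.
    void update(std::uint64_t key, std::int64_t weight = 1) {
        for (std::size_t r = 0; r < rows; ++r) {
            std::size_t b = static_cast<std::size_t>(mix(key ^ mix(2 * r + 1)) % buckets);
            counters[r * buckets + b] += weight * sign(key, 2 * r);
        }
    }
};

// Helper for inspection: how many counters a sequence of updates has touched.
inline std::size_t nonzero(const std::vector<std::int64_t>& v) {
    return static_cast<std::size_t>(
        std::count_if(v.begin(), v.end(), [](std::int64_t x) { return x != 0; }));
}
```

After a single update, every AGMS counter is non-zero, while Fast-AGMS has touched exactly one counter per row. That gap is what the runtime complexity discussion needs to explain.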

Approach (11-14 pages)

  • Optimization Possibilities - Traditional single-instruction CPU vs. SIMD + AVX-512; Cache vs. RAM vs. Disk
  • Optimization Implementation - AVX-512 or another SIMD-like approach for memory-sensitive cases?; provide code snippets of original and changed versions; link to open-source repo of original code
  • Discussion?
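For the "original vs. changed version" snippets, something along these lines could illustrate the idea. It is only a sketch under assumptions: the `Seeds` layout and the multiply-add sign hash are hypothetical and not taken from the original repository, and the AVX-512 path is compiled only when the compiler defines `__AVX512F__` (for example with `-mavx512f`); otherwise the function falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical per-counter seeds for an illustrative sign hash.
struct Seeds {
    std::vector<std::uint32_t> a, b;
};

// Sign is +1 or -1, taken from the top bit of a * key + b (mod 2^32).
inline std::int32_t sign_scalar(std::uint32_t a, std::uint32_t b, std::uint32_t key) {
    std::uint32_t bit = (a * key + b) >> 31;
    return 1 - 2 * static_cast<std::int32_t>(bit);
}

// Original-style update: one counter per loop iteration.
void update_scalar(std::vector<std::int32_t>& c, const Seeds& s, std::uint32_t key) {
    assert(s.a.size() == c.size() && s.b.size() == c.size());
    for (std::size_t i = 0; i < c.size(); ++i)
        c[i] += sign_scalar(s.a[i], s.b[i], key);
}

// Vectorized update: 16 counters per iteration where AVX-512F is available.
void update_simd(std::vector<std::int32_t>& c, const Seeds& s, std::uint32_t key) {
    assert(s.a.size() == c.size() && s.b.size() == c.size());
    std::size_t i = 0;
#if defined(__AVX512F__)
    const __m512i vkey = _mm512_set1_epi32(static_cast<int>(key));
    const __m512i one = _mm512_set1_epi32(1);
    for (; i + 16 <= c.size(); i += 16) {
        __m512i va = _mm512_loadu_si512(s.a.data() + i);
        __m512i vb = _mm512_loadu_si512(s.b.data() + i);
        __m512i h = _mm512_add_epi32(_mm512_mullo_epi32(va, vkey), vb);
        __m512i bit = _mm512_srli_epi32(h, 31);                          // 0 or 1
        __m512i sgn = _mm512_sub_epi32(one, _mm512_slli_epi32(bit, 1));  // 1 - 2 * bit
        __m512i vc = _mm512_loadu_si512(c.data() + i);
        _mm512_storeu_si512(c.data() + i, _mm512_add_epi32(vc, sgn));
    }
#endif
    for (; i < c.size(); ++i)  // scalar tail, or the whole loop without AVX-512
        c[i] += sign_scalar(s.a[i], s.b[i], key);
}
```

Both functions should produce identical counters for the same seeds and keys, which makes the scalar version a convenient correctness check for the vectorized one.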

Evaluation (would contain multiple diagrams 15-17 pages)

  • Test Environment - Intel server-grade CPU; Linux; C++; Data generator
  • Analysis Tools - VTune; Bash: report the process and tools, without in-depth explanations of how they work.
  • Setting the baseline - but which baseline? Algorithm execution times (could run a file with n samples and record the average execution time); VTune hotspot analysis (+ code snippets); VTune performance analysis
  • Results - VTune hotspot/performance; algorithm execution speed-up (compared to baseline); results from different data distributions; results on machines with different cache sizes?
  • Cache size as a factor?
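One possible way to record the execution-time baseline mentioned above is a small timing helper around `std::chrono::steady_clock`. The function names (`average_seconds`, `throughput`, `speedup`) are placeholders chosen for this sketch, and speed-up is assumed to mean baseline time divided by optimized time.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs `work` `runs` times and returns the mean wall-clock time in seconds.
template <typename F>
double average_seconds(F&& work, std::size_t runs) {
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (std::size_t r = 0; r < runs; ++r) {
        const auto start = clock::now();
        work();
        total += std::chrono::duration<double>(clock::now() - start).count();
    }
    return runs ? total / static_cast<double>(runs) : 0.0;
}

// Tuples processed per second.
inline double throughput(std::size_t tuples, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(tuples) / seconds;
}

// Speed-up of the optimized version relative to the baseline (assumed definition).
inline double speedup(double baseline_seconds, double optimized_seconds) {
    assert(optimized_seconds > 0.0);
    return baseline_seconds / optimized_seconds;
}
```

A call such as `average_seconds([&] { for (auto k : keys) sketch.update(k); }, 10)` would give the per-run average for one rows / buckets configuration; `keys` and `sketch` are whatever the benchmark provides.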

Conclusion (2 pages)

  • Optimization Conclusion AGMS - compute, memory or both
  • Optimization Conclusion FastAGMS - compute, memory or both
  • Comparison Conclusion AGMS vs. Fast-AGMS: based on something Martin mentioned; based on data distribution; what else?

List of Tables
List of Figures
Listings
Bibliography


beepy0 commented on September 25, 2024
  • Read a few papers that use SIMD or similar techniques for optimization, to get ideas for structure and visualization


beepy0 commented on September 25, 2024

Message to convey:

  • Varying degrees of throughput speed-up can be observed depending on the number of rows / buckets. AGMS tends to get a bigger overall speed-up, but F-AGMS has higher throughput in the scenarios with many rows / buckets (where high accuracy is the first priority).

  • AGMS can be useful in cases where accuracy can be sacrificed for speed, as it has a significant edge in scenarios with few buckets / rows.

  • F-AGMS can sustain a good accuracy/throughput ratio across all scenarios, with even 8 buckets / 8 rows being far more precise than AGMS. Maybe provide an 8 rows / 163840 buckets case like in your example to show that accuracy can be kept very high (probably in the high 98% to low 99% range) with practically no performance loss and a small space loss.

  • Display overall speed-up from the current implementation across all tested cases for both algorithms and compare numbers. It should be around 10x for AGMS and 7x for F-AGMS.

  • There is a small performance hit for data samples other than Zipf. Normal has slightly lower throughput, and uniform is lower still. These differences are tied to the microarchitecture utilization observed via VTune.

  • The optimized code will probably show more memory dependence after implementing SIMD (reaching the limit of the compute headroom), but I'll have to see what comes out of profiling the optimized files.

  • Discussion of future optimization: parallelization, memory-bound optimization?, GPU implementation, further CPU SIMD optimization.
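To make the accuracy claims above concrete, the self-join size estimate of a Fast-AGMS sketch (the median over rows of the sum of squared counters) and its relative error could be computed as in the sketch below. Assumptions: `FastAgms`, `mix64` and `relative_error` are illustrative names, the hash is a simple stand-in, and "accuracy" is read here as 1 minus the relative error.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative mixer; a stand-in for the hash families used in practice.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

struct FastAgms {
    std::size_t rows, buckets;
    std::vector<std::int64_t> counters;  // rows * buckets counters, row-major

    FastAgms(std::size_t r, std::size_t b) : rows(r), buckets(b), counters(r * b, 0) {
        assert(r > 0 && b > 0);
    }

    void update(std::uint64_t key) {
        for (std::size_t r = 0; r < rows; ++r) {
            const std::uint64_t h = mix64(key ^ mix64(r));
            const std::size_t b = static_cast<std::size_t>((h >> 1) % buckets);
            counters[r * buckets + b] += (h & 1u) ? 1 : -1;  // low bit gives the sign
        }
    }

    // Median over rows of the per-row sum of squared counters.
    double self_join_estimate() const {
        std::vector<double> per_row(rows, 0.0);
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t b = 0; b < buckets; ++b) {
                const double c = static_cast<double>(counters[r * buckets + b]);
                per_row[r] += c * c;
            }
        std::sort(per_row.begin(), per_row.end());
        const std::size_t mid = rows / 2;
        return (rows % 2) ? per_row[mid] : 0.5 * (per_row[mid - 1] + per_row[mid]);
    }
};

inline double relative_error(double estimate, double truth) {
    assert(truth != 0.0);
    return std::fabs(estimate - truth) / std::fabs(truth);
}
```

For a stream containing a single key five times, every row holds one counter of magnitude 5, so the estimate equals the true self-join size of 25. With many distinct keys, collisions introduce error, and that error is what the rows / buckets comparison measures.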


1. Experimental Setup / Baseline

1.1 VTune

- Overall runtime vs microarchitecture utilization based on data distribution. Discuss correlation

1.1.1 Hotspot Analysis

  - Per-function runtime ratio and microarchitecture utilization

  - Heatmap of overall microarchitecture utilization (serves as overview)

1.1.2 Micro-architecture Analysis

  - Discuss memory-bound results

1.2 AGMS

- raw data points

- averaged data plus curve (tendency)

1.3 Fast-AGMS

- raw data points

- averaged data plus curve (tendency)

1.4 AGMS vs Fast-AGMS

- Comparison line-plot, serves as overview

- Averaged throughput across all cases, single number for each algorithm, easy to compare average speed(-up)

2. Optimization Benchmarked

Repeat more or less the same steps (VTune only partially):

  • One graph showing the new data
  • One additional graph comparing between old and new (when necessary)

Approach

Server hardware specs

VTune command line; VTune version?

Three sample data types of different sizes. Code snippets for each distribution.
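A possible shape for the per-distribution snippets, using only the C++ standard library. This is a hedged sketch, not the actual data generator: the function names, the key domain and the normal-distribution parameters (mean at `domain / 2`, standard deviation `domain / 8`) are assumptions, and the Zipf sampler is built from `std::discrete_distribution` with weights proportional to 1 / rank^skew, since the standard library has no Zipf distribution.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Keys = std::vector<std::uint32_t>;

// Uniform keys in [0, domain).
Keys uniform_keys(std::size_t n, std::uint32_t domain, std::mt19937_64& rng) {
    assert(domain > 0);
    std::uniform_int_distribution<std::uint32_t> dist(0, domain - 1);
    Keys out(n);
    for (auto& k : out) k = dist(rng);
    return out;
}

// Normally distributed keys, rounded and clamped to [0, domain).
Keys normal_keys(std::size_t n, std::uint32_t domain, std::mt19937_64& rng) {
    assert(domain > 0);
    std::normal_distribution<double> dist(domain / 2.0, domain / 8.0);
    Keys out(n);
    for (auto& k : out) {
        double v = std::round(dist(rng));
        v = std::min(std::max(v, 0.0), static_cast<double>(domain - 1));
        k = static_cast<std::uint32_t>(v);
    }
    return out;
}

// Zipf-like keys: key i is drawn with probability proportional to 1 / (i + 1)^skew.
Keys zipf_keys(std::size_t n, std::uint32_t domain, double skew, std::mt19937_64& rng) {
    assert(domain > 0);
    std::vector<double> weights(domain);
    for (std::uint32_t i = 0; i < domain; ++i)
        weights[i] = 1.0 / std::pow(static_cast<double>(i) + 1.0, skew);
    std::discrete_distribution<std::uint32_t> dist(weights.begin(), weights.end());
    Keys out(n);
    for (auto& k : out) k = dist(rng);
    return out;
}
```

Re-seeding `std::mt19937_64` for every run would give a fresh sample each time, which matches the plan to regenerate the random variables per benchmark run.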

Using SIMD: Sketch implementation in C++ - some snippets

Seaborn for graphing; maybe one code snippet example

Explain how the different cases are benchmarked over different numbers of runs, each with freshly generated random variables
