beepy0 / thesis
AVX-512 SIMD Optimization and Benchmarking of AGMS and Fast-AGMS Sketch Algorithms
Note speed-up based on SIMD, memory optimization or anything else that was done to improve throughput of the algorithms.
for each row i from 0 to max_row - 1
    masked_row[i] = key_bit[i] & row[i]
tmp_row = masked_row[0]
for each row i from 1 to max_row - 1
    tmp_row ^= masked_row[i]
return tmp_row
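The pseudocode above reads like an H3-style hash (an H3 implementation is mentioned elsewhere in these notes). Assuming that reading, a minimal C++ sketch could look as follows. The function name h3_hash, the rows parameter and the 32-bit key width are assumptions made for illustration and are not taken from the thesis code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of an H3-style hash; the names are assumptions,
// not identifiers from the thesis code.
// `rows` holds one random 32-bit word per key bit (32 rows for a 32-bit key).
inline std::uint32_t h3_hash(std::uint32_t key,
                             const std::vector<std::uint32_t>& rows) {
    assert(rows.size() == 32 && "one row per key bit");
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        // Expand bit i of the key into an all-ones or all-zeros mask.
        const std::uint32_t mask = 0u - ((key >> i) & 1u);
        // "key_i_bit & row", XOR-accumulated into the running result.
        result ^= mask & rows[i];
    }
    return result;
}
```

The mask replaces a per-bit branch, so the loop body is the same for every row. That shape tends to be easier to vectorize, although whether it pays off here would have to be measured.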
3. In your opinion, what could be a good outline of the problem statement for my thesis? (Do I talk about the need for more throughput as data streams keep growing, do I talk about how vectorization works and what needs to be considered (e.g. compute bounds) before implementing it, or do I talk about something else?)
4. Do I provide pseudo-code for the join/self_join functions for both algorithms in the description section of those algorithms, or just for the update functions?
5. In which part does it make most sense to discuss the baseline benchmark results (with graphs etc.)? I was initially planning to discuss them after I introduce the test conditions under Approach, e.g. after talking about implementation and tools, but now I'm thinking that all results may have to go under Evaluation. Please let me know.
Create an initial outline plan and post it here.
It will be updated and added to this ticket as time goes on.
In the Microarchitecture group.
This ticket is to log and document the process of benchmarking, gathering of data, and interpretation of that data for the baseline case of the Fast-AGMS algorithm.
The goals are as follows:
VTune - In the Microarchitecture group
Put here all the related information for this algorithm. Papers, experiments, videos, etc.
revise Problem Statement (comments in email)
3 side notes in Feedback 2 regarding Background
sketch update only, or also join size, self-join size?
bins_no and rows_no and all bins_no * rows_no get updated for each new value) whereas Fast-AGMS isn't. Should I include a diagram showing AGMS's comparative disadvantage in that regard, as part of a discussion subsection for example?
buckets_no? The other bucket hash functions in the project used a modulo at the end for the return value, and currently the H3 implementation does as well.
+, -, &, ^ etc. But what about math functions like ceil(), log2() etc.? The same issue arises when I have to pass the values of a vector of size n to n simultaneous update functions: what does that look like? Do I rewrite the functions to take SIMD vectors instead of single variables?
Advanced Hotspots, Microarchitecture Exploration, Memory Access
Get guides and tutorials on how to do the different types of analysis using VTune.
Intel employee presents VTune: https://youtu.be/SqXmMeigZSo?list=PLjX5iDdaL94bEF1xBiRii0t9CFKcwNgWJ
The recommended compiler setting is any normal optimization level with debug symbols enabled, meaning the flags must include "-g" (function symbols, function names and line numbers). See https://stackoverflow.com/questions/89603/how-does-the-debugging-option-g-change-the-binary-executable and https://www.rapidtables.com/code/linux/gcc/gcc-o.html
-g
-collect advanced-hotspots
-collect general-exploration
-collect memory-access
This milestone concerns understanding how the provided open source C++ code for AGMS and Fast-AGMS works, how to use it, and how to work with the additionally provided data generators. The goals are as follows:
A list of questions I need to ask Martin the next time we meet.
Helpful papers that give insight into what I did in this paper can be briefly mentioned.
Note for related work: Fast-AGMS is AGMS + Count-Min, so check papers that worked on Count-Min and maybe include some of those insights in the related work.
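To make the relation between the two sketches concrete, here is a minimal C++ sketch of the two update routines. It follows the description in these notes: AGMS touches all bins_no * rows_no counters for each new value, while Fast-AGMS uses a Count-Min-style bucket hash to pick one counter per row. The functions mix, sign, agms_update and fast_agms_update are hypothetical and not taken from the project. The mixer is only a stand-in and does not provide the independence guarantees that real sketch hash families need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the project's hash families: a seeded 64-bit mixer.
// It is NOT 4-wise independent; it only serves to illustrate the update pattern.
inline std::uint64_t mix(std::uint64_t key, std::uint64_t seed) {
    std::uint64_t x = key + seed * 0x9E3779B97F4A7C15ULL;
    x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27; x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

// +1 / -1 value derived from the lowest bit of the mixed key.
inline int sign(std::uint64_t key, std::uint64_t seed) {
    return (mix(key, seed) & 1u) ? 1 : -1;
}

// AGMS-style update: every counter in the sketch is touched.
// Returns the number of counters written.
inline std::size_t agms_update(std::vector<long>& sketch, std::uint64_t key) {
    for (std::size_t i = 0; i < sketch.size(); ++i) {
        sketch[i] += sign(key, i);
    }
    return sketch.size();
}

// Fast-AGMS-style update: a bucket hash selects one counter per row.
// Returns the number of counters written.
inline std::size_t fast_agms_update(std::vector<long>& sketch,
                                    std::size_t rows_no, std::size_t bins_no,
                                    std::uint64_t key) {
    assert(sketch.size() == rows_no * bins_no);
    for (std::size_t r = 0; r < rows_no; ++r) {
        // Modulo at the end maps the hash value to a bucket index.
        const std::size_t bin = mix(key, 2 * r + 1) % bins_no;
        sketch[r * bins_no + bin] += sign(key, 2 * r);
    }
    return rows_no;
}
```

With rows_no = 4 and bins_no = 8, one AGMS update writes 32 counters while one Fast-AGMS update writes 4. That difference is the per-update cost the notes refer to.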
TODO:
This ticket is to log and document the process of benchmarking, gathering of data, and interpretation of that data for the baseline case of the AGMS algorithm.
/opt/intel/vtune_amplifier_2018.0.2.525261/bin64/
/opt/intel/vtune_amplifier_2019.3.0.590814/bin64/
out files that can be executed to start are either:
/opt/intel/system_studio_2019/vtune_amplifier_2019.3.0.590814/bin64/amplxe-cl -collect hotspots -knob sampling-mode=hw -finalization-mode=full -app-working-dir /home/morty/sketch_profiling/ -- /home/morty/sketch_profiling/FILENAME
/opt/intel/system_studio_2019/vtune_amplifier_2019.3.0.590814/bin64/amplxe-cl -collect uarch-exploration -knob collect-memory-bandwidth=true -target-duration-type=veryshort -finalization-mode=full -app-working-dir /home/morty/sketch_profiling/ -- /home/morty/sketch_profiling/FILENAME
rsync -chavzP --stats /home/meggamorty/CLionProjects/thesis/optimization/cmake-build-debug/optimization [email protected]:/home/morty/FOLDER
rsync -chavzP --stats [email protected]:/home/morty/sketch_profiling/COPYDIR /home/meggamorty/vtune-reports/PASTEDIR
Missing debug information: https://software.intel.com/en-us/vtune-amplifier-help-error-cannot-locate-debug-info
Get access from Martin
Run the OpenMP dummy tests on the local Intel machine
Run VTune Analysis on the local machine
Run VTune Analysis on the remote machine via CLI
Transfer the file to the local machine https://stackoverflow.com/questions/9090817/copying-files-using-rsync-from-remote-server-to-local-machine
The first milestone consists mainly of researching the theory and technology that I'm going to use for this project. The plan currently consists of the following goals:
Folder name Related_Work
Folder name Papers_using_FastAGMS
Folder name Papers_using_SIMD