numap's Issues
Sample read and write events at the same time
Can I sample read and write events at the same time?
I read https://stackoverflow.com/questions/42088515/perf-event-open-how-to-monitoring-multiple-events and tried to put write_sample and read_sample in one group, like this: perf_event_open(&pe, measure->tids[thread], cpu, measure->fd_per_tid[thread], 0)
But the call fails with "error when calling perf_event_open: Invalid argument". Is there a way to sample read events and write events at the same time?
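For reference, here is a minimal sketch of how two sampling events are usually grouped with perf_event_open, following the convention in the man page: the leader is opened with group_fd = -1 and starts disabled, and the second event passes the leader's file descriptor as group_fd. This is not numap code. The function names, the raw_config and period parameters, and the precise_ip setting are assumptions made for illustration. EINVAL can have several other causes (for example a group_fd that does not refer to an already opened leader on the same thread and CPU), so this only shows the call pattern.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: glibc does not provide a perf_event_open() function. */
static int perf_open(struct perf_event_attr *attr, pid_t tid, int cpu,
                     int group_fd) {
    return (int)syscall(SYS_perf_event_open, attr, tid, cpu, group_fd, 0UL);
}

/* Fill an attr for a raw sampling event. raw_config and period are
 * caller-supplied placeholders (hypothetical, not numap's values). */
static void init_sampling_attr(struct perf_event_attr *attr,
                               uint64_t raw_config, uint64_t period,
                               int is_leader) {
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;
    attr->config = raw_config;
    attr->sample_period = period;
    attr->sample_type = PERF_SAMPLE_ADDR | PERF_SAMPLE_DATA_SRC;
    attr->precise_ip = 2;     /* assumption: precise (PEBS) sampling */
    attr->exclude_kernel = 1;
    /* Only the leader starts disabled; members follow the leader. */
    attr->disabled = is_leader ? 1 : 0;
}

/* Open a read event as group leader and a write event as group member.
 * Returns 0 on success, -1 on failure (errno is set by the syscall). */
static int open_read_write_group(pid_t tid, int cpu, uint64_t read_config,
                                 uint64_t write_config, uint64_t period,
                                 int fds[2]) {
    struct perf_event_attr attr;

    init_sampling_attr(&attr, read_config, period, 1);
    fds[0] = perf_open(&attr, tid, cpu, -1);      /* leader: group_fd = -1 */
    if (fds[0] == -1)
        return -1;

    init_sampling_attr(&attr, write_config, period, 0);
    fds[1] = perf_open(&attr, tid, cpu, fds[0]);  /* member: leader's fd */
    if (fds[1] == -1) {
        close(fds[0]);
        fds[0] = -1;
        return -1;
    }
    return 0;
}

/* Enable the leader and every member of its group in one call. */
static int start_group(int leader_fd) {
    return ioctl(leader_fd, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
}
```

If grouping is not strictly needed, opening the two events independently (both with group_fd = -1) should also let them sample concurrently. Each event then needs its own mmap'ed ring buffer.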
No support for memory-only NUMA domains
Currently the code crashes when dealing with systems that also contain memory-only NUMA domains.
This code is really problematic:
nb_numa_nodes = numa_num_configured_nodes();
int nb_cpus = numa_num_configured_cpus();
for (node = 0; node < nb_numa_nodes; node++) {
    struct bitmask *mask = numa_allocate_cpumask();
    numa_node_to_cpus(node, mask);
    numa_node_to_cpu[node] = -1;
    for (cpu = 0; cpu < nb_cpus; cpu++) {
        if (*(mask->maskp) & (1 << cpu)) {
            numa_node_to_cpu[node] = cpu;
            break;
        }
    }
    numa_bitmask_free(mask);
    if (numa_node_to_cpu[node] == -1) {
        nb_numa_nodes = -1; // to be handled properly
    }
}
In particular, nb_numa_nodes is changed inside the loop that still uses it as its bound. Further, nb_numa_nodes is unsigned, so setting it to -1 results in unexpected behavior.
Even if I solve this issue numap still does not work on e.g. the example
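One possible way to handle memory-only nodes is to leave their entry at -1 and keep the node count untouched, instead of overwriting the loop bound. The sketch below is dependency-free so it can be compiled without libnuma: the mask is passed as an array of unsigned long words, which mirrors the maskp field of libnuma's struct bitmask. The function names are hypothetical. Note also that the quoted test `*(mask->maskp) & (1 << cpu)` only inspects the first word of the mask and shifts an int, so it cannot work for CPU numbers beyond the first word. With libnuma, numa_bitmask_isbitset(mask, cpu) performs the per-bit test.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the lowest set bit among the first nbits bits of a
 * multi-word mask, or -1 if none is set. */
static int first_cpu_in_mask(const unsigned long *words, size_t nbits) {
    for (size_t bit = 0; bit < nbits; bit++) {
        if (words[bit / BITS_PER_WORD] & (1UL << (bit % BITS_PER_WORD)))
            return (int)bit;
    }
    return -1;
}

/* Fill node_to_cpu[] with one CPU per node. Memory-only nodes keep -1 and
 * the loop bound nb_nodes is never modified. Returns the number of nodes
 * that own at least one CPU. */
static int map_nodes_to_cpus(const unsigned long *const *node_masks,
                             size_t nb_nodes, size_t nb_cpus,
                             int *node_to_cpu) {
    int nodes_with_cpus = 0;
    for (size_t node = 0; node < nb_nodes; node++) {
        node_to_cpu[node] = first_cpu_in_mask(node_masks[node], nb_cpus);
        if (node_to_cpu[node] != -1)
            nodes_with_cpus++;
    }
    return nodes_with_cpus;
}
```

Callers would then have to skip nodes whose entry is -1 when opening per-node events.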
data_src implementation
I found this in the numap source code:
"
int is_served_by_local_cache2(union perf_mem_data_src data_src) {
    if (data_src.mem_lvl & PERF_MEM_LVL_HIT) {
        if (data_src.mem_lvl & PERF_MEM_LVL_L2) {
            return 1;
        }
    }
    return 0;
}
"
But when I read the perf_event_open man page (http://man7.org/linux/man-pages/man2/perf_event_open.2.html), it says:
"
mem_lvl
Memory hierarchy level hit or miss, a bitwise combination of the following, shifted left by PERF_MEM_LVL_SHIFT:
PERF_MEM_LVL_NA Not available
PERF_MEM_LVL_HIT Hit
"
So I think the code should look like this:
"
int is_served_by_local_cache2(union perf_mem_data_src data_src) {
    if ((data_src.mem_lvl >> PERF_MEM_LVL_SHIFT) & PERF_MEM_LVL_HIT) {
        if ((data_src.mem_lvl >> PERF_MEM_LVL_SHIFT) & PERF_MEM_LVL_L2) {
            return 1;
        }
    }
    return 0;
}
"
However, numap seems to run correctly. Did I misread the documentation?
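A small check that may help here, based on how union perf_mem_data_src is declared in <linux/perf_event.h>: mem_lvl is a bit-field member of the union, so reading it already extracts the field, and the shift by PERF_MEM_LVL_SHIFT only applies when working on the raw 64-bit val member. Under that reading, the unshifted test on data_src.mem_lvl and a shifted test on data_src.val should agree. The two function names below are hypothetical and the code is a sketch, not numap's implementation.

```c
#include <linux/perf_event.h>
#include <stdint.h>

/* Test through the bit-field member: no shift needed, the compiler
 * extracts the mem_lvl bits from the union. */
static int l2_hit_via_bitfield(union perf_mem_data_src src) {
    return (src.mem_lvl & PERF_MEM_LVL_HIT) &&
           (src.mem_lvl & PERF_MEM_LVL_L2);
}

/* Test through the raw 64-bit value: here the field has to be shifted
 * down by PERF_MEM_LVL_SHIFT before masking. */
static int l2_hit_via_raw(uint64_t val) {
    uint64_t lvl = val >> PERF_MEM_LVL_SHIFT;
    return (lvl & PERF_MEM_LVL_HIT) && (lvl & PERF_MEM_LVL_L2);
}
```

Applying the shift to data_src.mem_lvl itself, as in the proposed change, would shift the already extracted field a second time.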
bug
perf_event_open fails on Linux kernel < 4.1
When setting the parameters for perf_event_open, numap sets the use_clockid and clockid fields which were introduced in Linux kernel 4.1.
On older kernels, this makes the call to perf_event_open fail with the error code "Invalid argument".
We should detect this problem at compile time and, in case of an unsupported kernel:
- make the compilation fail with an explicit message (e.g. "kernel too old")
- or print a warning and don't use this feature, although this may break a few things
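A rough sketch of the second option, assuming the check is done with LINUX_VERSION_CODE from <linux/version.h>. The macro HAVE_PERF_CLOCKID and the function set_sample_clock are hypothetical names. Keep in mind that LINUX_VERSION_CODE describes the kernel headers used at build time, not the kernel that is running, so a binary built on a new system can still hit the error on an old one.

```c
#include <linux/perf_event.h>
#include <linux/version.h>

/* use_clockid and clockid were added to struct perf_event_attr in 4.1. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 1, 0)
#define HAVE_PERF_CLOCKID 1
#else
#define HAVE_PERF_CLOCKID 0
/* First option from the list above: uncomment to fail the build instead.
 * #error "kernel too old: perf_event_attr.use_clockid needs Linux >= 4.1"
 */
#endif

/* Request a specific clock for sample timestamps when the headers support
 * it. Returns 1 if the clock was configured, 0 if the feature was skipped. */
static int set_sample_clock(struct perf_event_attr *attr, int clockid) {
#if HAVE_PERF_CLOCKID
    attr->use_clockid = 1;
    attr->clockid = clockid;
    return 1;
#else
    (void)attr;
    (void)clockid;
    return 0;
#endif
}
```

The return value lets the caller print the warning once and decide whether timestamps can still be trusted.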
Numap does not support AMD processors
AMD processors provide Instruction-Based Sampling (IBS), which samples the instructions executed by the CPU. It could be used to collect memory access information in numap.
If anyone is willing to port numap to AMD, I can give advice on numap. My main problem is lack of time :)
Can't run example
I tried to run the example on VMware with kernel version (Linux version 4.15.0-38-generic (buildd@lcy01-amd64-023) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018) and CPU (Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz)
and I get this error:
Starting memory read sampling -> numap_sampling_start error : perf_event ==> Operation not supported
Then I tried to run the example on a server with kernel version (Linux version 3.10.0-327.36.3.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Oct 24 16:09:20 UTC 2016) and CPU (Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz)
and I get this error:
Segmentation fault
Using PERF_EVENT_IOC_REFRESH
It is possible to use PERF_EVENT_IOC_REFRESH so that when the sample buffer is full, a signal is delivered. It would be very useful if numap had an option to enable this.
For instance, a callback could be passed to numap_sampling_init_measure. If a callback is passed, then numap enables PERF_EVENT_IOC_REFRESH and calls the callback each time the buffer is full.
Another solution, which would not break the API, would be to add a new function (e.g. numap_sampling_add_callback) to enable this feature.
Get samples' latency
I read section 18.9.4.2 of the Intel manual (https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html). It seems that the Sandy Bridge microarchitecture supports latency monitoring for reads and writes, but I didn't see it implemented in numap. Is it possible to get the latency of samples on this architecture? Thank you!
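On the perf_event side, the per-sample cost reported by the hardware is exposed through PERF_SAMPLE_WEIGHT, which for load latency events is generally the measured latency. The sketch below shows how a sample could be decoded under the assumption that sample_type is exactly PERF_SAMPLE_ADDR | PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC. This is not necessarily what numap requests, and struct latency_sample and both functions are hypothetical. Whether stores report a meaningful weight depends on the CPU.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Body of a PERF_RECORD_SAMPLE when sample_type is exactly
 * ADDR | WEIGHT | DATA_SRC (fields appear in this order). */
struct latency_sample {
    uint64_t addr;                     /* sampled data address */
    uint64_t weight;                   /* hardware-provided cost (latency) */
    union perf_mem_data_src data_src;  /* where the access was served from */
};

/* Request the three fields decoded by decode_latency_sample(). */
static void request_latency(struct perf_event_attr *attr) {
    attr->sample_type =
        PERF_SAMPLE_ADDR | PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC;
}

/* Copy one sample record out of the ring buffer.
 * Returns 0 on success, -1 if the record is not a complete sample. */
static int decode_latency_sample(const struct perf_event_header *hdr,
                                 struct latency_sample *out) {
    if (hdr->type != PERF_RECORD_SAMPLE)
        return -1;
    if (hdr->size < sizeof(*hdr) + sizeof(*out))
        return -1;
    memcpy(out, (const char *)hdr + sizeof(*hdr), sizeof(*out));
    return 0;
}
```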
cpu support
I'm not familiar with CPU PMUs, so could you please add support for this CPU architecture? Here is the CPU information:
Architecture: x86_64
Byte Order: Little Endian
Vendor ID: GenuineIntel
CPU family: 6
Model: 87
Model name: Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz
The Intel manual mentions it in section 18.14 (https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html)
Here is another document: https://software.intel.com/en-us/articles/intel-xeon-phi-x200-family-processor-performance-monitoring-reference-manual
Based on Table 2-4 in https://software.intel.com/sites/default/files/managed/6e/3d/Intel%C2%AE%20Xeon%20Phi%E2%84%A2%20Processor%20Performance%20Monitoring%20Reference%20Manual_Vol2_Mar2017.pdf I think the events to use are MEM_UOPS_RETIRED:ALL_LOADS and MEM_UOPS_RETIRED:ALL_STORES
Thank you very much!