Comments (7)
@hnez Please look into this, as it's from PR #4026.
@acmiyaguchi I'm still getting (valid looking) metrics from the Prometheus write plugin, but it's now logging complaints like this:
uc_update: Value too old: name = collectd_gpu_sysman_temperature_celsius{dev_file="card2",location="global-max",pci_bdf="0000:93:00.0",pci_dev="0x56c0"}; value time = 0.000; last cache update = 0.000;
uc_update: uc_update_metric failed: Error #-1; Additionally, strerror_r failed.
I compile/enable only one read + one write plugin, which may explain why it works for me.
For the write_http plugin crash, I would suggest running collectd under the Valgrind memcheck tool (apt/dnf install valgrind & valgrind collectd <options>), see:
And for the Prometheus write plugin missing data, I would suggest trying the Valgrind threading error tools:
(I think the issue could relate to these plugins doing their own threading in addition to the threading now done by collectd, and that adding races when accessing the metrics.)
from collectd.
Hi @eero-t,
I've asked my team lead for some time to look into the "one thread per writer" fallout and got a slot later this week.
I hope I can fix the uc_ caching errors then and look into this issue.
Hi,
I've finally had time to look into these issues and you are right, both of them are on me. Oopsie.
write_http Segfault
The write_http plugin segfaulting was due to a wrong assumption on my side about who (plugin or daemon) is responsible for keeping the user_data_t around once it is passed to plugin_register_write.
I was under the assumption that it is the plugin's responsibility to keep the user_data_t around and that the daemon could just store a reference to it.
This is not the case.
The write_http plugin, for example, allocates the user_data_t on the stack and passes it to register_write, and right afterwards the reference is no longer valid. The plugins I've tested for #4026 did not do it this way.
The behavior we observe before the segfault is the spooky action at a distance that comes from using stale references to a region on the stack.
The bug should be fixed by #4102.
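To make the lifetime problem concrete, here is a minimal sketch of a registration function that copies the struct instead of keeping a pointer to it, so the caller may safely allocate it on the stack. Every name here (sketch_user_data_t, sketch_register_write and so on) is hypothetical and only stands in for the real collectd types and functions. This is an illustration of the ownership question, not the actual change made in #4102.

```c
#include <stddef.h>

/* Hypothetical stand-in for collectd's user_data_t (assumption, not the
 * real definition). */
typedef struct {
  void *data;
  void (*free_func)(void *);
} sketch_user_data_t;

/* Hypothetical writer slot kept on the "daemon" side. */
typedef struct {
  sketch_user_data_t ud; /* stored by value: the daemon owns its own copy */
  int in_use;
} sketch_writer_t;

static sketch_writer_t sketch_writer;

/* Copies *ud into the slot, so the caller's struct may live on the stack.
 * Storing the pointer `ud` itself would leave a dangling reference as soon
 * as the caller returns. */
int sketch_register_write(const sketch_user_data_t *ud) {
  if (ud == NULL || sketch_writer.in_use)
    return -1;
  sketch_writer.ud = *ud; /* struct copy, not a stored pointer */
  sketch_writer.in_use = 1;
  return 0;
}

/* Plugin side: `ud` goes out of scope when this function returns. */
int sketch_plugin_init(void *callback_state) {
  sketch_user_data_t ud = {.data = callback_state, .free_func = NULL};
  return sketch_register_write(&ud);
}

/* Returns the registered callback state, or NULL if nothing is registered. */
void *sketch_registered_data(void) {
  return sketch_writer.in_use ? sketch_writer.ud.data : NULL;
}
```

Only the small wrapper struct is copied. Whatever data points to still has to outlive the registration, which is the part a free_func would normally take care of.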
Another write_http Segfault
I've also noticed that write_http segfaults on teardown due to a use-after-free caused by the user_data_t once again.
This should be fixed by #4104.
write_prometheus does not register metrics
This was caused by missing time and interval setup before calling uc_update, which is also what @eero-t observed.
This should be fixed by #4103.
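The "Value too old ... value time = 0.000; last cache update = 0.000" message above fits a cache that rejects any value whose timestamp is not newer than the cached one. Below is a rough sketch of that interaction, assuming a zero timestamp means "not set". The types and functions (sketch_metric_t, sketch_cache_update, sketch_fill_defaults) are made up for illustration and are not the collectd API or the code from #4103.

```c
/* Hypothetical metric with optional time and interval (0 means "not set"). */
typedef struct {
  double time;     /* seconds since the epoch */
  double interval; /* seconds */
  double value;
} sketch_metric_t;

/* Hypothetical cache entry remembering the last accepted update. */
typedef struct {
  double last_time;
  double last_value;
} sketch_cache_entry_t;

/* Rejects a value that is not strictly newer than the cached one. */
int sketch_cache_update(sketch_cache_entry_t *ce, const sketch_metric_t *m) {
  if (ce->last_time >= m->time)
    return -1; /* "value too old" */
  ce->last_time = m->time;
  ce->last_value = m->value;
  return 0;
}

/* Fills in unset fields before the metric is handed to the cache. */
void sketch_fill_defaults(sketch_metric_t *m, double now,
                          double default_interval) {
  if (m->time == 0.0)
    m->time = now;
  if (m->interval == 0.0)
    m->interval = default_interval;
}
```

With time left at zero, the very first update against an empty entry already fails the comparison, so nothing ever reaches the cache. Filling in the fields first lets the update through.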
Results
With all three patches applied my test script shows all three plugins working.
It would be great if you could test the changes as well and comment here / in the respective PRs.
Best regards
Leonard
Verified that the PR for the "write_prometheus" plugin issue fixes it (and the PR looks otherwise OK).
I'm not using the "write_http" plugin, so somebody else needs to check those PRs, but it's interesting that the plugin itself also made an unsafe assumption that needed to be fixed. So there may be other write plugins with similar assumptions that got broken.
@hnez in #4026 you mention testing only the write_throttle and "logfile" (write_log?) plugins. Maybe you could also check some other, simpler write plugin(s)?
$ ls src/write*.c
src/write_graphite.c src/write_log.c src/write_riemann.c src/write_syslog.c
src/write_http.c src/write_mongodb.c src/write_riemann_threshold.c src/write_tsdb.c
src/write_influxdb_udp.c src/write_prometheus.c src/write_sensu.c
src/write_kafka.c src/write_redis.c src/write_stackdriver.c
Note: the riemann write plugin is already buggy in the main branch, see #4050.
The other two fixes are merged, but this one is still pending:
I've also noticed that write_http segfaults on teardown due to a use-after-free caused by the user_data_t once again. This should be fixed by #4104.
#4104 and #4117 have been superseded by #4176
I think this has been fixed. Please re-open if you're still experiencing problems.