
mqtt-topic-logger's Introduction

Simple Python MQTT Data Logger by Topic

This software monitors a group of topics and creates a log file for each topic to which this MQTT client has subscribed.

You can specify the root log directory when starting; it defaults to tlogs.

The default log size is 5MB.

You need to provide the script with:

List of topics to monitor
Broker name and port
Username and password, if needed
Base log directory and number of logs (both have defaults)

Valid command line options:

--help
-h
-b
-p
-t
-q
-v
-d logging debug
-n
-u Username
-P Password
-s
-l
-T test mode, when used with the data logger tester
-r record size in bytes, default=10000
-c log in CSV format
-f filename of header file, default is data.csv

Example Usage:

You will always need to specify the broker name or IP address and the topics to log

Note: you may not need to use the python prefix or may need to use python3 mqtt-topic-logger.py (Linux)

Specify broker and topics

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/#

Specify broker and multiple topics

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -t  home/#
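The topic arguments above use the MQTT multi-level wildcard `#`. As a rough illustration of which topics a filter such as sensors/# would cover, here is a minimal sketch of MQTT filter matching. The function name `topic_matches` is made up for this example and is not part of the script, and the sketch ignores the special handling of topics that start with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a subscription filter.

    '+' matches exactly one level; '#' matches the parent level and
    everything below it, and must be the last level of the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the final filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level does not match
    # Without '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


filters = ["sensors/#", "home/#"]
wanted = any(topic_matches(f, "home/lights/1") for f in filters)
```

With the two filters from the example above, a message on home/lights/1 or sensors/kitchen/temp would be logged, while one on garden/pump would not.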

Log All Data:

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -s 

Specify the client name used by the logger

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -n data-logger

Specify the log directory

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -l mylogs

Log in CSV format

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -c

Log in CSV format and use data.csv header file

python mqtt-topic-logger.py -h 192.168.1.157 -t sensors/# -c -f data.csv

Logger Class

The class is implemented in a module called tlogger.py (topic logger).

To create an instance you can supply two parameters:

The log directory- defaults to tlogs
Max Log Size defaults to 5MB

log=tlogger.T_logger(log_dir)

The logger creates the log files in the directory, using the topic names for the directory names, with log files starting at log000.txt. When a file reaches 5MB it is rotated.

Log data is stored in JSON format, with a timestamp added to each message.

The logger will return True if successful and False if not.

To prevent loss of data in the case of computer failure, the logs are continuously flushed to disk.
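The behaviour described above (one directory per topic, files numbered from log000.txt, rotation at a size limit, a timestamp added to each JSON record, a flush after every write, and a True/False result) could be sketched roughly as follows. This is not the code in tlogger.py: the class name `TopicLogSketch`, the method name `log` and the record keys `time` and `message` are all assumptions made for illustration.

```python
import json
import os
import time


class TopicLogSketch:
    """Hypothetical stand-in for T_logger; not the real implementation."""

    def __init__(self, log_dir="tlogs", max_size=5_000_000):
        self.log_dir = log_dir
        self.max_size = max_size
        self._files = {}   # topic -> open file object
        self._counts = {}  # topic -> index of the current log file

    def _open(self, topic):
        # Topic levels become nested directories, e.g. tlogs/sensors/kitchen/
        directory = os.path.join(self.log_dir, *topic.split("/"))
        os.makedirs(directory, exist_ok=True)
        name = "log{:03d}.txt".format(self._counts.setdefault(topic, 0))
        self._files[topic] = open(os.path.join(directory, name), "ab")
        return self._files[topic]

    def log(self, topic, payload):
        """Append one message; return True on success, False on failure."""
        try:
            f = self._files.get(topic) or self._open(topic)
            record = {"time": time.time(), "message": payload}  # assumed keys
            f.write((json.dumps(record) + "\n").encode("utf-8"))
            f.flush()                      # push Python's buffer to the OS
            os.fsync(f.fileno())           # ask the OS to write it to disk
            if f.tell() >= self.max_size:  # rotate once the file is full
                f.close()
                self._counts[topic] += 1
                self._open(topic)
            return True
        except (OSError, TypeError, ValueError):
            return False

    def close(self):
        for f in self._files.values():
            f.close()
        self._files.clear()
```

Calling `os.fsync` after every message is the safest choice but also the slowest; a real logger might flush less often and accept losing the last few records.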

The logger will not clear log files when you start it, so you should ensure the log directory is empty.

When logging to a CSV file you can change the default header order using a header file. Each topic requires its own header entry. Below is an example header file:

test/sensor1,time_ms,time,ms,Urms,Umin,Umax
test/sensor2,time_ms,time,ms,Urms,Umin,Umax
test/sensor3,time_ms,time,sensor,count,status
test/sensor4,time_ms,time,ms,Urms,Umin,Umax,count

You can see that topics sensor1 and sensor2 use the same header, whereas sensor3 and sensor4 have different headers. Because topics can have different JSON formats, the better option is to let the script build the header file rather than supplying one.
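To make the header file format concrete, here is a small sketch of how such a file could be read into a per-topic column order and then used to lay out one decoded JSON message as a CSV row. The function names `load_headers` and `to_csv_row` are hypothetical, and leaving missing keys as empty cells is an assumption; the script may handle this differently.

```python
import csv
import io


def load_headers(lines):
    """Map each topic to its column order, one header-file line per topic."""
    headers = {}
    for row in csv.reader(lines):
        if row:  # skip blank lines
            headers[row[0].strip()] = [col.strip() for col in row[1:]]
    return headers


def to_csv_row(columns, message):
    """Order a decoded JSON message (a dict) by the header columns.

    Keys missing from the message become empty cells (an assumption).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([message.get(col, "") for col in columns])
    return out.getvalue()


header_text = """test/sensor1,time_ms,time,ms,Urms,Umin,Umax
test/sensor3,time_ms,time,sensor,count,status
"""
headers = load_headers(header_text.splitlines())
row = to_csv_row(headers["test/sensor3"],
                 {"count": 7, "sensor": "s3", "status": "ok"})
```

Here the message keys arrive in a different order than the header, and the row is still written in header order.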

mqtt-topic-logger's People

Contributors

stevecope


mqtt-topic-logger's Issues

Strategic Vision (Duplicate from Data Logger)

Strategic Vision

There is a big problem in MQTT data logging. If you store all of your data by topic, the disk heads jump around on writing. It does not scale well.

On the other hand if you store all of your data in one file, and then try to read the data by topic, the disk heads jump around on reading. It does not scale well.

I should also point out that there are two application areas. One is for IoT data logging. The other is for chat servers. I am interested in trees of chat servers, where the user wants the recent messages to load really fast. The needs for IoT may be different.

What is one to do? I have been scratching my head on this problem for a year now. I am the guy who created the first data logger repository for Steve in March of 2018. I think I finally figured out the answer.

Of course you can use the Kafka message broker. That is for big data, lots of servers, lots of redundancy. They do not want to lose a single byte of data. Me, I just want a simple small solution for one server. Plus maybe Kafka does not know about hierarchy.

And I do not want to write lots of software. As far as possible I want to reuse solid stable existing packages. Like this one!

My key insight is that there are both hard drive based file systems, and RAM based file systems. Often the /tmp directory in Linux is a RAM based file system. Plus /tmp now swaps when needed. Perfect.

So what should we do?
One could write all of the messages to one file on the hard drive, and also write each message to a separate file by topic in the RAM-based file system.

If the server crashes, no problem, just read the data from the hard drive, repopulate the tmp drives files, and all is good.
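A minimal sketch of this idea, assuming one JSON record per line: every message is appended both to a single bulk file on the hard drive and to a per-topic file in a RAM-backed directory, and the per-topic files can be rebuilt from the bulk file after a crash. The function names, the record keys and the flat `topic.replace("/", "_")` file naming are all assumptions for illustration, not part of the existing loggers.

```python
import json
import os


def topic_path(ram_dir, topic):
    """Per-topic file name. Replacing '/' keeps the layout flat, at the
    cost that 'a/b' and 'a_b' would share a file in this sketch."""
    return os.path.join(ram_dir, topic.replace("/", "_") + ".log")


def log_message(bulk_path, ram_dir, topic, payload):
    """Append the message to the bulk log (disk) and its topic file (RAM)."""
    line = json.dumps({"topic": topic, "message": payload}) + "\n"
    os.makedirs(ram_dir, exist_ok=True)
    with open(bulk_path, "a") as bulk:
        bulk.write(line)
    with open(topic_path(ram_dir, topic), "a") as per_topic:
        per_topic.write(line)


def rebuild_topic_files(bulk_path, ram_dir):
    """After a crash, recreate every topic file from the bulk log."""
    os.makedirs(ram_dir, exist_ok=True)
    for name in os.listdir(ram_dir):  # drop stale or partial topic files
        if name.endswith(".log"):
            os.remove(os.path.join(ram_dir, name))
    with open(bulk_path) as bulk:
        for line in bulk:
            topic = json.loads(line)["topic"]
            with open(topic_path(ram_dir, topic), "a") as per_topic:
                per_topic.write(line)
```

Reopening the files for every message keeps the sketch short; a real logger would keep them open. The bulk file is the source of truth here, so the RAM directory can be lost without losing data.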

What happens when your data files get too large? Well the current software does log rotation. You can extend the concept. When you rotate the hard drive log, at the same time rotate the ram logs.
And then save the rotated ram logs to the hard drive.

What if your server crashes just when that is happening?
That would be a very rare event. You can always recreate the topic log files from the bulk log file.

I just do not think this is a big problem. I am not in the corporate Kafka space where every piece of data is sacred. It does not take long to write out the temporary files. Also one could first write the ram logs, then rotate and rewrite them. The problem will not happen often. I don’t think my ISP servers have ever crashed. Not much data would be lost. I am okay with losing some chat data. And if you really care, one can always recreate the topic logs.

So what does this mean for this topic logger? We really should have one logger. Currently we have one for logging to one file, and one for logging by topic.

The topic logger should be able to both log to a single hard drive file, and to simultaneously log to multiple topic-based RAM-based files.

On rotation, it should rotate both the hard-drive file, and the topic files, and then save the topic ram logs to the hard drive. Better yet, on rotation, to protect against a crash, it could first save the topic logs, rotate, and save them again.

What if some topic logs are very small? Maybe, on rotation, it could merge an old topic log with a newer topic log.

After a crash, it should be able to read the hard drive file and recreate the RAM files.

For testing, it should be able to run without MQTT, generate random messages and store and rotate them.

I think it might even be nice if it could read the log file and send those messages to the MQTT broker.

To summarize. I would love a logger that is able to read from MQTT, from a random generator, and from the hard drive. It should be able to write to the hard drive and to the ram log files, maybe even to MQTT. It should do rotation as described above.

I know that I am asking for a lot. I do appreciate this free software. Eventually I may write this. But maybe someone has a more urgent need than I do.

How does that sound? Did I miss anything? What do the IoT applications need? Your feedback would be most appreciated.

What if you need a quick fix? In the short run, there is a quick fix I know of. Talk to me if you are interested. But in the long run, this is the solution I want.

This works perfectly for me.

Since there are only 2 of us watching this, I thought it okay to post here.

So much of the world is focussed on the large scale corporate applications. HBase or Cassandra for large distributed networks of chat servers. Think Facebook or Discord.

But what if you just need a small server? MQTT, this software and a RAM drive are perfect. Every minute you can back up changed files from your RAM disk to the hard drive. I think rdist will do that.
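If a dedicated sync tool is not at hand, the periodic backup could also be sketched in a few lines of Python. This is only an illustration of copying changed files based on modification time; the function name `backup_changed` is made up, and any directories passed to it are the caller's choice.

```python
import os
import shutil


def backup_changed(src_dir, dst_dir):
    """Copy files that are new or modified since the last backup.

    Returns the list of relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, rel)
            # copy2 preserves the modification time, so a file that has
            # not changed since the last run is skipped.
            if (not os.path.exists(dst)
                    or os.path.getmtime(src) > os.path.getmtime(dst)):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)
                copied.append(rel)
    return copied
```

Calling this once a minute, for example from a loop with `time.sleep(60)` or from cron, would give the behaviour described above. A log file that is still growing is simply copied again on the next pass.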

My previous post was way way too complicated. I can just start using this. Easy Peazy.

Next up: time to look through the code.

And why is there so little traffic here? I do not get it.

Next question is which MQTT server should I use?
