
Loghub

Loghub maintains a collection of system logs that are freely accessible for research purposes. Some of the logs are production data released from previous studies, while others were collected from real systems in our lab environment. Wherever possible, the logs are NOT sanitized, anonymized or modified in any way. Together, the logs amount to over 87GB, so we host only a small sample (2k lines) of each dataset on GitHub.

If you use our loghub datasets in your research for publication, please kindly cite the following paper.

How to get the data?

If you are interested in these datasets, please request the raw logs at Zenodo. Note that a data request must include, at a minimum, your affiliation information.

Logs currently available (still in beta release):

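| Category | Software System | Time Span | #Messages | Size | Compressed (gzip) |
|---|---|---|---|---|---|
| Distributed systems | HDFS | 38.7 hours | 11,175,629 | 1.54GB | 152.01MB |
| Distributed systems | HDFS | N.A. | 71,118,073 | 16.84GB | 877.38MB |
| Distributed systems | Hadoop | N.A. | 394,308 | 49.78MB | 2.50MB |
| Distributed systems | Spark | N.A. | 33,236,604 | 2.88GB | 179.18MB |
| Distributed systems | Zookeeper | 26.7 days | 74,380 | 10.18MB | 452KB |
| Distributed systems | OpenStack | N.A. | 207,820 | 60.02MB | 5.27MB |
| Operating systems | Windows | 226.7 days | 114,608,388 | 27.36GB | 1.63GB |
| Operating systems | Linux | 263.9 days | 25,567 | 2.30MB | 228KB |
| Operating systems | Mac | 7.0 days | 117,283 | 16.48MB | 1.46MB |
| Server applications | Apache Web server | 263.9 days | 56,481 | 5.02MB | 260KB |
| Server applications | OpenSSH | 28.4 days | 655,146 | 71.70MB | 4.49MB |
| Mobile systems | Android | N.A. | 63,042,037 | 7.00GB | 825.57MB |
| Mobile systems | HealthApp | 10.5 days | 253,395 | 22.98MB | 2.24MB |
| Supercomputers | Blue Gene/L | 214.7 days | 4,747,963 | 725.77MB | 61.46MB |
| Supercomputers | HPC | N.A. | 433,489 | 32.77MB | 3.21MB |
| Supercomputers | Thunderbird | 244 days | 211,212,192 | 31.04GB | 1.97GB |
| Standalone software | Proxifier | N.A. | 21,329 | 2.48MB | 172KB |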
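
To compare how well the datasets compress, the raw and gzip sizes listed above can be turned into compression ratios. The following is a minimal sketch, not part of loghub itself: the helper names `parse_size` and `compression_ratio` are hypothetical, and it assumes decimal units (1KB = 1000 bytes), which the table does not specify.

```python
import re

# Multipliers for the size suffixes used in the table.
# Assumption: decimal units; the table does not say whether sizes are binary.
UNITS = {"KB": 1e3, "MB": 1e6, "GB": 1e9}


def parse_size(text: str) -> float:
    """Convert a size string such as '1.54GB' or '452KB' to bytes."""
    match = re.fullmatch(r"([\d.]+)\s*(KB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * UNITS[unit]


def compression_ratio(raw: str, compressed: str) -> float:
    """Return raw size divided by gzip-compressed size."""
    return parse_size(raw) / parse_size(compressed)


# A few rows copied from the table: (system, raw size, gzip size).
ROWS = [
    ("Zookeeper", "10.18MB", "452KB"),
    ("Linux", "2.30MB", "228KB"),
    ("Thunderbird", "31.04GB", "1.97GB"),
]

if __name__ == "__main__":
    for name, raw, packed in ROWS:
        print(f"{name}: {compression_ratio(raw, packed):.1f}x")
```

Because the ratio divides one size by another, the choice between decimal and binary units only matters for rows whose raw and compressed sizes use different suffixes.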

Publications using these datasets

Organizations that request these datasets

We proudly announce that the loghub datasets have been requested by more than 110 organizations from both industry and academia.

Feedback

For any questions or feedback, please post to our issue page.

License

The log datasets are freely available for research purposes.

Contributors

zhujiem, shilinhe
