Giter VIP home page Giter VIP logo

linux-troubleshooting's Introduction

Linux Troubleshooting Guide

Sluggish system

Determine system load average using uptime.

$ uptime
13:35:03 up 103 days, 8 min, 5 users, load average: 2.03, 20.17, 15.09

Diagnose load problems with top.

top - 14:08:25 up 38 days,  8:02,  1 user,  load average: 1.70, 1.77, 1.68
Tasks: 107 total,   3 running, 104 sleeping,   0 stopped,   0 zombie
Cpu(s): 11.4%us, 29.6%sy, 0.0%ni, 58.3%id,  .7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:   1024176k total,   997408k used,    26768k free,    85520k buffers
Swap:  1004052k total,     4360k used,   999692k free,   286040k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM      TIME+  COMMAND
 9463 mysql     16   0  686m 111m 3328 S   53  5.5  569:17.64  mysqld
18749 nagios    16   0  140m 134m 1868 S   12  6.6    1345:01  nagios2db_status
24636 nagios    17   0 34660  10m  712 S    8  0.5    1195:15  nagios
22442 nagios    24   0  6048 2024 1452 S    8  0.1    0:00.04  check_time.pl

Focus on the CPU metrics (see appendix for explanation of each metric):

Cpu(s): 11.4%us,  29.6%sy,  0.0%ni,  58.3%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st

The first metric you need look at is wa (I/O wait). It measures the time that processes spend waiting for file I/O (read/write calls to files in mounted filesystem, swapping to disk due to low memory etc.). During this time, they are theoretically able to run but cannot do so because the data required is not there yet. Such processes usually show up in D state and contribute to the load average of the system.

Note that I/O wait in this case does not count the time spent waiting for network I/O, but confusingly, it will include time spent waiting for network file systems (NFS).

A high I/O wait means there is either an I/O problem or a RAM problem. If the I/O wait is low, you can look at the second metric - CPU idle time (id). If the idle time is low, it means there is a problem with CPU resources. If the idle time is high, then you have to troubleshoot elsewhere. This might mean diagnosing network problems, or in the case of web server, looking at slow queries to a database, for instance.

Appendix

CPU metrics (top)

us: user CPU time

This is the percentage of CPU time spent running users’ processes that aren’t niced. (Nicing is a process that allows you to change its priority in relation to other processes.)

sy: system CPU time

This is the percentage of CPU time spent running the kernel and kernel processes.

ni: nice CPU time

If you have user processes that have been niced, this metric will tell you the percentage of CPU time spent running them.

id: CPU idle time

This is one of the metrics that you want to be high. It represents the percentage of CPU time that is spent idle. If you have a sluggish system but this number is high, you know the cause isn’t high CPU load.

wa: I/O wait

This number represents the percentage of CPU time that is spent waiting for I/O. It is a particularly valuable metric when you are tracking down the cause of a sluggish system, because if this value is low, you can pretty safely rule out disk or network I/O as the cause.

hi: hardware interrupts

This is the percentage of CPU time spent servicing hardware interrupts.

si: software interrupts

This is the percentage of CPU time spent servicing software interrupts.

st: steal time

If you are running virtual machines, this metric will tell you the percentage of CPU time that was stolen from you for other tasks.

linux-troubleshooting's People

Contributors

uohzxela avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.