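dstat is a Linux monitoring tool. You choose which metrics it collects and how often it samples, and it can print to the terminal as well as export to CSV. By default it writes one sample per second. Options that start with top record the process that ranks highest for a given metric. For example, --top-cpu records the command using the most CPU, and --top-cpu-adv also records details such as the PID of that process. The time format used by --time has to be defined through the environment variable DSTAT_TIMEFMT.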

Source code: https://github.com/dstat-real/dstat

Example commands:

# Set the time format
export DSTAT_TIMEFMT='%Y-%m-%d %H:%M:%S'

# Run the monitor and export a CSV file
dstat --time --cpu --mem --disk --io --net --sys --top-cpu-adv --top-mem --top-bio-adv --top-io-adv --output /opt/dstat_log/dstat_$(date +%Y%m%d).csv
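
The format string exported above uses strftime-style directives, so the same string can be reused to read the time column back when post-processing the CSV. A minimal sketch, assuming the time column is written exactly in that format; the helper names parse_dstat_time and format_dstat_time are made up for illustration and are not part of dstat:

```python
from datetime import datetime

# Same value as the DSTAT_TIMEFMT export in the shell example.
DSTAT_TIMEFMT = "%Y-%m-%d %H:%M:%S"


def parse_dstat_time(value: str) -> datetime:
    """Turn one value of the time column back into a datetime (hypothetical helper)."""
    return datetime.strptime(value, DSTAT_TIMEFMT)


def format_dstat_time(moment: datetime) -> str:
    """Render a datetime the way the time column is assumed to be written."""
    return moment.strftime(DSTAT_TIMEFMT)


if __name__ == "__main__":
    stamp = parse_dstat_time("2024-01-02 03:04:05")
    print(stamp.isoformat())
```

Keeping the format in one constant means the shell export and any later analysis script cannot drift apart.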

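In practice it pairs well with tmux: the samples are printed to the terminal as they are collected, so you can attach and check them at any time. The exported CSV file has to be downloaded to a local machine and charted with a third-party tool.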
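
Before a CSV export can be charted it has to be loaded into something a plotting tool understands. The sketch below is one possible way to do that with the standard library only. It rests on assumptions about the file layout that should be checked against a real export: a few preamble rows (the count is a parameter, 4 is only a guess), then one row of group titles, then one row of column names, then the samples. The function name parse_dstat_csv and the sample text are invented for illustration.

```python
import csv
import io


def parse_dstat_csv(text, preamble_rows=4):
    """Return one dict per sample, keyed by 'group.column' (hypothetical helper).

    Assumed layout: preamble rows, a group-title row, a column-name row, samples.
    """
    # csv.reader yields an empty list for a blank line; drop those.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    groups, columns, *samples = rows[preamble_rows:]

    # A group title is assumed to sit above its first column only,
    # so carry the last non-empty title to the right.
    keys = []
    current = ""
    for index, column in enumerate(columns):
        if index < len(groups) and groups[index]:
            current = groups[index]
        keys.append(f"{current}.{column}")

    # Values stay as strings; convert them where the chart needs numbers.
    return [dict(zip(keys, sample)) for sample in samples]


# Invented sample that follows the assumed layout.
SAMPLE = (
    '"Dstat CSV output"\n'
    '"Author:","(omitted)"\n'
    '"Host:","example"\n'
    '"Cmdline:","dstat --time --cpu"\n'
    '\n'
    '"system","total cpu usage",,\n'
    '"time","usr","sys","idl"\n'
    '"2024-01-02 03:04:05",1.5,0.5,98.0\n'
    '"2024-01-02 03:04:06",2.0,1.0,97.0\n'
)

if __name__ == "__main__":
    for sample in parse_dstat_csv(SAMPLE):
        print(sample["system.time"], sample["total cpu usage.usr"])
```

Prefixing each column with its group keeps names unique when several groups reuse short names such as read and writ.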

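Among the many monitoring options, dstat is not an outstanding one, and all it does is collect data. If the terminal is too narrow, the on-screen output is trimmed and each sample line is not shown in full. Its only export format is CSV, which is hard to extend: the data recorded by --top-cpu-adv, for example, is not well suited to machine parsing. What follows are some notes from using it.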

1 Bugs in the release versions

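I have installed both 0.7.3 and 0.7.4 and run them under Python 3, and both have the two bugs below. Fortunately dstat is written in Python, so it can be patched in place. It is installed at /usr/bin/dstat.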

a) When run with Python 3 on Debian 10 or later, the following error appears:

/usr/bin/dstat:2619: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
  import imp
Terminal width too small, trimming output.
Traceback (most recent call last):
  File "/usr/bin/dstat", line 2847, in <module>
    main()
  File "/usr/bin/dstat", line 2687, in main
    scheduler.run()
  File "/usr/lib/python3.10/sched.py", line 151, in run
    action(*argument, **kwargs)
  File "/usr/bin/dstat", line 2806, in perform
    oline = oline + o.showcsv() + o.showcsvend(totlist, vislist)
  File "/usr/bin/dstat", line 547, in showcsv
    if isinstance(self.val[name], types.ListType) or isinstance(self.val[name], types.TupleType):
NameError: name 'types' is not defined. Did you mean: 'type'?

For the fix, see the following bug report:

dstat --output is broken
https://bugs.launchpad.net/ubuntu/+source/dstat/+bug/1905665

In short, two lines of code need to change:

# Line 547, change to:
if isinstance(self.val[name], (tuple, list)):

# Line 552, change to:
elif isinstance(self.val[name], str):
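
The patched lines replace Python 2 type aliases such as types.ListType and types.TupleType, which no longer exist in Python 3, with the built-in types. The standalone sketch below mirrors just those two checks so they can be tried outside dstat. The function classify and its return labels are made up for illustration and do not reproduce what showcsv does with each value.

```python
import types


def classify(value):
    """Mirror the two patched checks: sequence first, then string (hypothetical helper)."""
    if isinstance(value, (tuple, list)):
        return "sequence"
    elif isinstance(value, str):
        return "string"
    return "scalar"


# Python 3 removed the Python 2 aliases that the original line 547 relied on.
PY2_ALIASES_PRESENT = hasattr(types, "ListType") or hasattr(types, "TupleType")

if __name__ == "__main__":
    print(classify([1, 2]), classify("sda"), classify(3.5))
    print("Python 2 aliases present:", PY2_ALIASES_PRESENT)
```

Passing a tuple of types to isinstance covers both list and tuple in a single call, which is why one line can replace the two chained checks.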

b) The --top-mem option reports wrong statistics

Reference:

Invalid parsing of /proc//stat
https://github.com/dstat-real/dstat/issues/120

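Change the function proc_splitline(filename, sep=None) to:

def proc_splitline(filename, sep=None):
    # Per-process stat files put the command name in parentheses as the second
    # field, and that name may contain spaces, so a plain split breaks it apart.
    if filename.startswith("/proc/") and filename.endswith("/stat") and filename != "/proc/stat":
        tmp = linecache.getline(filename, 1).split(sep)
        # Find the last token that closes the parenthesised command name.
        it = [i for i,c in enumerate(tmp) if c.endswith(')')]
        it = 2 if not it else it[-1]+1
        # Rejoin the pieces of the command name into a single field.
        return tmp[0:1] + [' '.join(tmp[1:it])] + tmp[it:]
    else:
        return linecache.getline(filename, 1).split(sep)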
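
To see why the rejoin step matters, the same logic can be applied to a plain string instead of a file. This is only a sketch: split_stat_line is a made-up name, and the sample line is an invented, shortened stand-in for a real per-process stat line whose command name contains a space.

```python
def split_stat_line(line, sep=None):
    """Apply the patched splitting logic to a string (hypothetical helper)."""
    tmp = line.split(sep)
    # Index of every token that ends the parenthesised command name.
    closing = [i for i, token in enumerate(tmp) if token.endswith(")")]
    end = 2 if not closing else closing[-1] + 1
    # Keep the PID, rejoin the command name, then append the remaining fields.
    return tmp[0:1] + [" ".join(tmp[1:end])] + tmp[end:]


# Invented sample: PID, command name with a space, state, then a few numbers.
SAMPLE = "1234 (Web Content) S 1 1234 1234 0 -1"

if __name__ == "__main__":
    print(SAMPLE.split()[1:3])          # naive split breaks the name in two
    print(split_stat_line(SAMPLE)[1:3])  # patched logic keeps it whole
```

With the naive split every field after the command name shifts by one position, so any statistic read by index from such a process comes from the wrong column.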

2 Use cases

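My impression is that dstat fits a single machine best, or cases where you only need to collect a specific set of system metrics. It is not suited to large-scale deployment on production machines. If you really have to use dstat there, consider a combination such as dstat + Fluentd + InfluxDB + Grafana.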