How To Configure Collectd to Gather System Metrics for Graphite on Ubuntu 14.04

Introduction

Collecting and visualizing data is an important way to make informed decisions about your servers and projects.

In a previous guide, we discussed how to install and configure Graphite to visualize data on our servers. However, we didn't have a good way of collecting or even passing data into Graphite.

In this guide, we'll discuss the installation and use of collectd, a system statistics gatherer that can collect and organize metrics about your server and running services.

We will show you how to install and configure collectd to pass data into Graphite to render. We will assume that you have Graphite up and running on an Ubuntu 14.04 server as we showed you in the last guide.

Install Collectd

The first thing we are going to do is install collectd. We can get this from the default repositories.

Refresh the local package index and then install by typing:

sudo apt-get update
sudo apt-get install collectd collectd-utils

This will install the daemon and a helper control interface. We still need to configure it so that it knows to pass the data it collects to Graphite.

Configure Collectd

Begin by opening the collectd configuration file in your editor with root privileges:

sudo nano /etc/collectd/collectd.conf

The first thing that we should set is the hostname of the machine that we are on. Collectd can be used to send information to a remote Graphite server, but we are using this on the same machine for this guide. You can choose whatever name you'd like:

Hostname "graph_host"

If you have a real domain name configured, you can skip this and leave FQDNLookup enabled so that the server uses DNS to determine the proper domain name.

You may notice the "Interval" parameter, which sets how long collectd waits between queries for data on the host. This defaults to 10 seconds. If you followed along in the Graphite article, you will recognize this as the shortest interval that we configured Graphite to track. These two values must match for data to be recorded reliably.

Next, we get right into the services that Collectd will gather information about. Collectd does this through the use of plugins. Most of the plugins are used to read information from the system, but plugins are also used to define where to send information. Graphite is one of these write plugins.

For this guide, we are going to ensure that the following plugins are enabled. You can comment out any other plugins, or you can work on configuring them correctly if you want to try them out on your host:

LoadPlugin apache
LoadPlugin cpu
LoadPlugin df
LoadPlugin entropy
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin processes
LoadPlugin rrdtool
LoadPlugin users
LoadPlugin write_graphite

Some of these need configuration, and some of them will work fine out-of-the-box.

Continuing on down the file, we get to the configuration section of each plugin. Plugins are configured by defining a "block" for each configuration section. This is somewhat similar to how Apache compartmentalizes directives within blocks. We will only look at a few of these, since most of our plugins will work fine the way they are.

We enabled the Apache plugin because we have Apache installed to serve Graphite. We can configure the Apache plugin with a simple section that looks like this:

<Plugin apache>
    <Instance "Graphite">
        URL "http://domain_name_or_IP/server-status?auto"
        Server "apache"
    </Instance>
</Plugin>

In a production environment, you may wish to keep the server stats protected behind an authentication layer. You can look at the commented code in this section of the file to see how that would work. For simplicity's sake, we are going to demonstrate an open setup that is not authenticated.

We will create the Apache server-status page, which provides the details this plugin needs, a little later in this guide.

For the df plugin, which tells us how full our disks are, we can add a simple configuration that looks like this:

<Plugin df>
    Device "/dev/vda"
    MountPoint "/"
    FSType "ext3"
</Plugin>

You should point the device to the device name of the drive on your system, and set the mount point and filesystem type to match it. You can find the device name and mount point by running df in the terminal:

df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/vda        61796348 1766820  56867416   4% /
none                   4       0         4   0% /sys/fs/cgroup
udev             2013364      12   2013352   1% /dev
tmpfs             404836     340    404496   1% /run
none                5120       0      5120   0% /run/lock
none             2024168       0   2024168   0% /run/shm
none              102400       0    102400   0% /run/user
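The plain df output above does not show the filesystem type that the FSType line needs. As a rough sketch, assuming GNU df (whose -T option adds a type column), the three values can be pulled out and printed in the same shape as the plugin block. The function name df_plugin_block is made up for this illustration:

```shell
# Read `df -PT <path>` output on stdin and print a matching df plugin block.
# Column 1 is the device, column 2 the filesystem type, column 7 the mount point.
# df_plugin_block is a hypothetical helper name, not part of collectd.
df_plugin_block() (
  awk 'NR == 2 {
    print "<Plugin df>"
    printf "    Device \"%s\"\n", $1
    printf "    MountPoint \"%s\"\n", $7
    printf "    FSType \"%s\"\n", $2
    print "</Plugin>"
  }'
)
```

You could then run df -PT / | df_plugin_block and compare the result with the block in your configuration file.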

Choose the networking interface you wish to monitor:

<Plugin interface>
    Interface "eth0"
    IgnoreSelected false
</Plugin>

Finally, we come to the Graphite plugin. This will tell collectd how to connect to our Graphite instance. Make the section look something like this:

<Plugin write_graphite>
    <Node "graphing">
        Host "localhost"
        Port "2003"
        Protocol "tcp"
        LogSendErrors true
        Prefix "collectd."
        StoreRates true
        AlwaysAppendDS false
        EscapeCharacter "_"
    </Node>
</Plugin>

This tells our daemon how to connect to Carbon in order to pass off its data. We specify that it should look to the local computer on port 2003, which Carbon uses to listen for TCP connections.

Next, we tell it to use TCP so that the data is handed off to Carbon reliably. We tell it to log errors about the hand off and then set the prefix for the data. Since we end this value with a dot, all of the collectd stats for this host will be stored in a "collectd" directory.

StoreRates determines whether counter values are converted to rates (gauges) before being passed. AlwaysAppendDS, if enabled, would always append the data source name to our metric names; when it is disabled, this only happens for metrics that have more than one data source. EscapeCharacter determines which character replaces the dots in values such as host names, which prevents Carbon from splitting them into directories.
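To make the prefix and escape character more concrete, here is a minimal sketch of the kind of line that ends up on Carbon's plaintext port, which expects a metric path, a value, and a Unix timestamp separated by spaces. The helper names escape_host and graphite_line are invented for this example, and the metric suffix load.load.shortterm is only an illustration of how collectd names a value:

```shell
# Replace dots and spaces in a host name with an underscore, roughly what
# EscapeCharacter "_" asks collectd to do. Hypothetical helper name.
escape_host() (
  printf '%s\n' "$1" | tr '. ' '__'
)

# Build one Carbon plaintext-protocol line: "<path> <value> <timestamp>".
# Arguments: prefix, host name, metric suffix, value, Unix timestamp.
graphite_line() (
  printf '%s%s.%s %s %s\n' "$1" "$(escape_host "$2")" "$3" "$4" "$5"
)
```

For example, graphite_line collectd. web01.example.com load.load.shortterm 0.25 1400000000 prints collectd.web01_example_com.load.load.shortterm 0.25 1400000000. If your netcat supports the -q option, you could pipe such a line to nc -q0 localhost 2003 to hand a test value to Carbon yourself.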

Save and close the file when you are finished.

Configure Apache to Report Stats

In our configuration file, we enabled Apache stats tracking. We still need to configure Apache to allow this though.

In the Apache virtual hosts file that we have enabled for Graphite, we can add a simple location block that will tell Apache to report stats.

Open the file in your text editor:

sudo nano /etc/apache2/sites-available/apache2-graphite.conf

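Below the "content" location block, we are going to add another block so that Apache will serve statistics at the /server-status page. Add the "/server-status" location block shown in the excerpt below. The lines around it are already in the file and are shown for context:

    Alias /content/ /usr/share/graphite-web/static/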
    <Location "/content/">
        SetHandler None
    </Location>

    <Location "/server-status">
        SetHandler server-status
        Require all granted
    </Location>

    ErrorLog ${APACHE_LOG_DIR}/graphite-web_error.log

Save and close the file when you are finished.

Now, we can reload Apache to get access to the new statistics:

sudo service apache2 reload

We can check to make sure everything is working correctly by visiting the page in our web browser. We just need to go to our domain, followed by /server-status:

http://domain_name_or_IP/server-status

You should see a page that looks something like this:

[Image: Apache server-status page]
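The collectd Apache plugin does not read this HTML page. It requests the machine-readable variant at /server-status?auto, which is the URL we put in collectd.conf. That variant is a list of "Key: value" lines, so a single field can be picked out with a short helper. This is only a sketch: status_field is a made-up name, and the field names you see depend on your Apache version:

```shell
# Read "?auto" status output on stdin and print the value of one field.
# Argument: the field name, for example BusyWorkers.
# status_field is a hypothetical helper name.
status_field() (
  awk -F': ' -v key="$1" '$1 == key { print $2 }'
)
```

If curl is installed, you could run curl -s "http://domain_name_or_IP/server-status?auto" | status_field BusyWorkers to confirm that Apache is reporting values before collectd starts polling it.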

Setting the Storage Schema and Aggregation

Now that we have collectd configured to gather statistics about your services, we need to adjust Graphite to handle the data it receives correctly.

Let's start by creating a storage schema definition. Open up the storage schema configuration file:

sudo nano /etc/carbon/storage-schemas.conf

Inside, we need to add a definition that will dictate how long the information is kept, and how detailed the data should be at various levels.

We will tell Graphite to store collectd information at intervals of ten seconds for one day, at one minute for seven days, and intervals of ten minutes for one year.
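To get a feel for how much data that policy keeps, the arithmetic can be sketched in shell using the precision:duration notation that the configuration file uses. The helper names below are invented for this illustration and are not part of Graphite, and the unit table assumes a 365-day year:

```shell
# Convert a duration such as 10s, 1m, 7d or 1y to seconds.
to_seconds() (
  n=${1%?}          # numeric part, e.g. "10"
  unit=${1#"$n"}    # trailing unit letter, e.g. "s"
  case $unit in
    s) echo "$n" ;;
    m) echo $((n * 60)) ;;
    h) echo $((n * 3600)) ;;
    d) echo $((n * 86400)) ;;
    w) echo $((n * 604800)) ;;
    y) echo $((n * 31536000)) ;;
    *) echo "unknown unit: $unit" >&2; exit 1 ;;
  esac
)

# Print how many data points one "precision:duration" archive holds.
archive_points() (
  precision=$(to_seconds "${1%%:*}") || exit 1
  duration=$(to_seconds "${1##*:}") || exit 1
  echo $((duration / precision))
)

# Print every archive of a comma-separated policy with its point count.
retention_summary() (
  printf '%s\n' "$1" | tr ',' '\n' | while read -r archive; do
    printf '%s %s\n' "$archive" "$(archive_points "$archive")"
  done
)
```

Running retention_summary 10s:1d,1m:7d,10m:1y reports 8640, 10080 and 52560 points for the three archives. The 10-second precision of the first archive is the value that has to match collectd's Interval setting.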

This will give us a good balance between detailed information for recent activity and general trends over the long term. Collectd passes its metrics starting with the string collectd, so we will match that pattern.

We can implement this policy by adding the following lines. Remember to add them above the default policy, or else they will never be applied:

[collectd]
pattern = ^collectd.*
retentions = 10s:1d,1m:7d,10m:1y
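The position matters because Carbon reads the sections from top to bottom and uses the first pattern that matches a metric name. The following sketch imitates that lookup, with grep -E standing in for Carbon's own regular expression matching. The function first_schema and its two-column input format are made up for this example, and default_1min_for_1day is assumed to be the name of the catch-all section in your file:

```shell
# Read "section-name pattern" pairs on stdin, in file order, and print the
# name of the first section whose pattern matches the given metric name.
# first_schema is a hypothetical helper, not a Graphite tool.
first_schema() (
  metric=$1
  while read -r name pattern; do
    if printf '%s\n' "$metric" | grep -Eq "$pattern"; then
      echo "$name"
      exit 0
    fi
  done
  exit 1
)
```

With the collectd section listed first, a metric such as collectd.graph_host.load.load.shortterm resolves to collectd. If the catch-all .* section came first, the same metric would resolve to that section and the new retentions would be ignored.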

Save and close the file when you are finished.

Reload the Services

Now that collectd is configured and Graphite knows how to handle its data, we can reload the services.

First, restart the Carbon service. It is a good idea to use the "stop" and then "start" command with a few seconds in between instead of the "restart" command. This makes sure that the data is completely flushed prior to the restart:

sudo service carbon-cache stop          ## wait a few seconds here
sudo service carbon-cache start

After the Carbon service is up and running again, we can do the same thing with collectd. The service may not be running yet, but stopping and then starting it ensures that it comes up with the new configuration:

sudo service collectd stop
sudo service collectd start

After this, you can visit your domain again, and you should see a new tree with your collectd information:

[Image: collectd tree in the Graphite interface]
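You can also check for incoming data from the command line through Graphite's render API. The helper below only builds the URL; render_url is a hypothetical name, and the metric path passed to it is an example that depends on the Hostname and plugins you configured:

```shell
# Build a render API URL that returns the last ten minutes of one metric
# as JSON. Arguments: Graphite host name or IP, full metric path.
render_url() (
  printf 'http://%s/render?target=%s&from=-10min&format=json\n' "$1" "$2"
)
```

If curl is installed, curl -s "$(render_url domain_name_or_IP collectd.graph_host.load.load.shortterm)" should return a JSON list of data points once collectd has been running for a minute or so.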

Conclusion

Our collectd configuration is complete and our stats are already being recorded! Now, we have a daemon configured to track our server and services.

We can configure or write additional plugins for collectd as the need arises. Additional servers with collectd can also send data to our Graphite server. Collectd is mainly used for collecting statistics about common services and your machines as a whole.

In the next article, we'll set up StatsD, a service that can cache data before flushing it to Graphite. This will allow us to work around the problem of data loss when sending stats too quickly that we described in the previous article. It will also give us an interface to track statistics within our own programs and projects.
