OUCE 2014 – Review

We have finished our 6th OpenNMS User Conference Europe, and it was the first time we held the conference outside of Germany. We started with the first conference in 2009, when I worked at NETHINKS and began with OpenNMS professional services. For the 3rd and 4th conference we decided to move the venue closer to the people organizing the conference, and the IT center in Fulda was a really good place. In 2013 NETHINKS and the OpenNMS Foundation Europe worked together to move the organization of the conference to the community at the University of Applied Sciences in Fulda.

In 2014 we had our first conference outside of Germany, at the University of Southampton – United Kingdom, Great Britain, England, the Empire … whatever. We had attendees from 12 countries. The OpenNMS Group, Inc. showed the latest developments sponsored to the project by commercial companies, like the new Semantic Topology User Interface and the Operators Panel.

OpenNMS and OCS Inventory

Additionally, we had a visit from three nice guys from OCS Inventory, who gave a talk about OCS Inventory and a practical workshop.

The crowd around OpenNMS is really amazing, and we finished with an open discussion about the OpenNMS project from a community perspective. We also came around to documentation, where we really have some issues. So I decided to give a short overview of how other projects handle documentation. At the last few conferences I've sat in "How to contribute" talks from OpenStack and Elasticsearch, and they do a really great job when it comes to community.

Sorry if you missed this great opportunity to talk with a lot of great people. We have all the slides and they are available in our conference system. As soon as we receive the videos, we will upload them to our OpenNMS YouTube channel.

Thanks to all attendees for the great time

RRDtool graph improvement

If you have OpenNMS running with RRDtool, you can improve the graph rendering a little bit just by replacing the command.prefix in the following files:

  • snmp-graph.properties
  • snmp-adhoc-graph.properties
  • response-graph.properties
  • response-adhoc-graph.properties
command.prefix=/usr/bin/rrdtool graph - --imgformat PNG --font DEFAULT:7 --font TITLE:10 --start {startTime} --end {endTime} -E --width=1000 --height=225

The command will also override the width and height for all graphs, so they all have the same size. I don't like the stamp-sized graphs of the OpenNMS default, so I changed it for all of them. It could be problematic if you have grouped graph reports (KSC reports), so if you run into trouble, you can just remove the --width and --height parameters.
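
In that case the prefix is the same command as above, just without the two size parameters:

command.prefix=/usr/bin/rrdtool graph - --imgformat PNG --font DEFAULT:7 --font TITLE:10 --start {startTime} --end {endTime} -E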

Default graphing

Graphing with better font rendering and anti-aliasing

Chemnitz Linuxtage 2014

The Chemnitz Linux Days 2014 are over and here comes a short personal recap. The conference was great and I met a lot of nice people. I have some non-German friends, so I switched the article to my bad school English – sorry about that. The first talk for me was "Amazon Linux – The operating system for the cloud". It was quite interesting to see somebody providing a slim, RPM-based, RHEL binary-compatible distribution optimized for virtual environments, with more up-to-date packages. The main issue: it is only – and really only – available for EC2. I asked if they plan to free their distribution for other cloud platforms like OpenStack or CloudStack – result: not on the road map and not even planned :/ If you start investing time and energy in building or moving to a cloud infrastructure, you don't want to be locked in. In my opinion the real killer feature of all the cloud hype is the beginning of standardized APIs for a software-defined, vendor-independent infrastructure. It allows me to move or scale my services and data seamlessly, whenever I want to (or whenever external factors like electricity costs, availability, or compute/network/storage prices are good for me), no matter whether the provider is Amazon, Microsoft, Rackspace or my privately owned infrastructure.

The next talk was about NeDI, with the topic "Network Discovery that Really Works", from Dr. Michael Schwartzkopff. I know the speaker is familiar with OpenNMS, so I was curious how he would present the NeDI key features. With our new geographical map, linkd and the new topology map in OpenNMS-SNAPSHOT, we cover some similar use cases. A really cool feature in NeDI is the possibility to trace a route through a really complex topology in real time and to filter the big topology on different criteria, e.g. vendors, locations or branch offices. They have even started to monitor services and to collect massive amounts of data for network interfaces. The approach is similar to OpenNMS: using SNMP with CDP and LLDP for discovering the network topology. In OpenNMS the approach for working with the topology is a little bit different and covers more monitoring-related workflows, which you can compare for yourself here:

Besides my work with computer networks and monitoring stuff, I'm also interested in social and political topics around free software and network neutrality. My friend Sven showed me an interesting project called Freifunk, a non-profit foundation about building a citizen-driven meshed wireless network. I've talked with a few of the guys at FrOSCon and at the Chaos Communication Congress 30C3 in Hamburg. What I really like about them: they have become really creative, they develop new protocols as free software and they solve really big, special network problems. So it was cool to see a talk from Freifunk in Chemnitz about their specially developed Wi-Fi meshing protocol B.A.T.M.A.N. They run a really large free community network with 180 Wi-Fi spots and 7000 active sessions. What's even better, they were able to get support from the city administration, which gives them the possibility to place Wi-Fi antennas on public buildings. It would be interesting to know if this is just an East German phenomenon. Maybe I like monitoring because I love the network domain, distributed systems, free software and working with smart, good-looking people – which is what brought me to the OpenNMS project.

The second day had a very special topic on my agenda – PostgreSQL: Killing NoSQL. It was a talk from a PostgreSQL pro with 15 years of experience and it was really refreshing. He told some nice stories about customers starting a phone call with something like "We want to scale like Facebook", and he explained all the NoSQL hype on the level I prefer – ACID vs. BASE and Brewer's CAP theorem. He talked about what happens if you try to use the wrong concepts for the wrong use case and how to avoid ending up in a real mess, also regarding structured vs. unstructured data. He also gave an overview of interesting technologies added to the latest PostgreSQL, like the asynchronous processing facility PGQ and Postgres-XC, which adds read/write scalability to Postgres. This talk had a lot of cool content and he showed – in my point of view a little bit scary as an example – how flexible Postgres can be, with e.g. mongres.

The last talk I saw was about how to migrate 500 servers in 3 days with 150 people and what they learned. They moved 500 servers from Xen to VMware and described the planning and the problems they had. It was not as interesting as it sounds, so I left the talk after 15 minutes.

They had a big book shop and it was really cool to see a book you wrote lying on the table. I can't deny that during my 4 hour journey back home I was thinking about writing an improved second edition for OpenNMS 1.14. I know there is so much stuff missing in the current book, so why not just go the next step and improve it … It was a really cool conference and I'm thinking about giving a talk or a workshop about OpenNMS next year. I like the atmosphere, and the university in Chemnitz is a really fancy, modern location – considering an entrance fee of 8 € for two days, it is amazing. Thank you to the CLT team for the cool conference, and I hope to see you again next year.

Polishing ask.opennms.eu

I've spent some time improving the question and answer board ask.opennms.eu, which we provide through the OpenNMS Foundation. I was not really happy with the default color scheme of Pixel n Grain Light, but it provides a nice basic layout which allows you to sprinkle some simple CSS magic over it. Marcel Fuhrmann asked me an interesting question: is it possible to add additional fields when you ask a question? It would be cool if you always had to fill in the OpenNMS version you are talking about. I found a plugin called extra-question-field.

If you ask a question now, you are required to fill in the OpenNMS version number, and I added optional fields for the Java version, PostgreSQL version and operating system. I think this information depends on the type of question and should not be required.

Many thanks to Question2Answer for building this tool and also to the guy who built the extra-question-field plugin.

So here is the transition from old to new, in the hope that it is not worse than before.

Have fun answering and asking questions

Default theme ask.opennms.eu

OpenNMS theme

FOSDEM 2014

Markus, Sven and I spent last weekend in Brussels at FOSDEM. I enjoyed the conference; having no talk and nothing to prepare or organize is relaxing, so I just picked the talks I was interested in. The conference itself is completely free, you just go in and enjoy. It takes place across the whole campus of the Université Libre de Bruxelles. The schedule was amazing: 512 lectures and over 5000 attendees. (Un)fortunately, the evening before the first conference day we met a few guys from FrOSCon at the Delirium Café. These guys, and a few others from Brussels we met, kept us awake until 4:30 am. We had some real issues getting to the opening event. I think every speaker with a 10:00 am slot had a hard time with the audience that weekend ;)

I spent a few talks on OpenStack; besides all the technology stuff, they also had one talk explaining why and how to contribute to the project. It is interesting for me to see how they deal with their over 90 git projects and over 14,000 people in the community. In the end, they do a really great job in communication, documentation and organization, and they automate as much as they can – kudos.

Unfortunately, Sven and I had a 10:00 am Elasticsearch talk on our agenda for Sunday. So we were not good company for Markus, and we lost him in the Delirium Café with the ScummVM guys. I bet they drank one beer for every platform they can run on – and they support more platforms than the JavaVM. But wait for it … Markus made it to the 10:00 am talk – chapeau! They gave a short overview of the new features for their 1.0 release. The key features are the really nice snapshot and restore and new functions for bucketing and aggregation. We had a chat at their booth, where they had a quite cool demo running with Logstash, Kibana and Elasticsearch. We asked if some of the Elasticsearch guys would be interested in giving a talk at our little conference in Southampton this year. They were quite interested and the ball is rolling, so fingers crossed.

For me personally, Cassandra, Elasticsearch and Storm are the open source projects that I think will change a lot in the next years.

30c3

For me there are three main events each year, starting with the OpenNMS User Conference Europe (OUCE), which is organized by the OpenNMS Foundation. Then there is DevJam, the OpenNMS developers conference, which is sponsored by The OpenNMS Group to provide a big get-together for commercial developers and community members – and last but not least the Chaos Communication Congress, which is the conference to get unconventional input, socialize, discuss, think and hack. In short … the geek playground you want to have ;) I got infected by visiting Dustin and Sven last year at 29c3, so this year was a must ;)

Much more interesting: Sven got a slot for a lightning talk to announce their new ddserver project. It came out of dealing with more and more stupid limitations of the DynDNS services – it is exactly the example of getting people addicted by providing a free service for many years, and then someone changes his mind and tries to convert the free service into a pay-for service. Please don't get me wrong, I don't have anything against pay-for services. It is in my opinion a good example of how an external service provider can force you to pay money for a service you've used for many years for free.

Sven and Dustin decided to start a free software project which allows you to run a dynamic DNS infrastructure on your own. It's a little bit like ownCloud, but for your free dynamic DNS service. Funny enough, just two weeks earlier a similar project was born, called nsupdate, which does the same thing. If you have such a coincidence, it is a good sign that the project idea is good and useful. I've spent some time creating a Vagrant machine which installs and deploys ddserver, so you have a quick starting ground to play with the tool. Check out the vagrant-ddserver project on GitHub. The application is written in Python and the WebUI is built with Twitter Bootstrap.
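
Getting a test instance up is basically this – assuming Vagrant and VirtualBox are installed; the clone URL is just a placeholder for the actual vagrant-ddserver repository:

git clone https://github.com/<your-account>/vagrant-ddserver.git
cd vagrant-ddserver
vagrant up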

Monitoring SSL certificates

Running SSL for websites and mail services is important, and taking care of the certificates is a tricky job. The expiration date of a certificate always comes around in the most inconvenient way. It is always too close, so you're in a hurry to renew the certificates and have to try to remember all the strange openssl commands. On the command line you can test your certificate with a command like this:

openssl s_client -showcerts -connect mail.example.com:993
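
If you only care about the expiration date, you can pipe the certificate straight into openssl x509 – just a small convenience on top of the command above:

openssl s_client -connect mail.example.com:993 < /dev/null 2>/dev/null | openssl x509 -noout -enddate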

Since OpenNMS 1.12.2 – kudos to Ronald Roskens for this contribution – you can use the SSLCertMonitor to test the expiration date. If the expiration date is, for example, less than 30 days away, you get a service down and, if you want, a notification about it. This hopefully gives you a less stressful certificate renewal workflow. In my example I have a mail server with IMAPS and SMTPS and a web server on HTTPS where I want to check the certificates. In my opinion, testing the expiration time every two hours is fair enough. The service goes down if the certificate expiration date runs below 30 days. I have also created a separate polling package with a customized polling interval in my poller-configuration.xml:

<package name="Long-Running-Scripts">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254"/>
  <include-range begin="::1" end="ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"/>
  <rrd step="7200">
    <rra>RRA:AVERAGE:0.5:1:2016</rra>
    <rra>RRA:AVERAGE:0.5:12:1488</rra>
    <rra>RRA:AVERAGE:0.5:288:366</rra>
    <rra>RRA:MAX:0.5:288:366</rra>
    <rra>RRA:MIN:0.5:288:366</rra>
  </rrd>
  <service name="SSL-Cert-IMAPS-993" interval="7200000" user-defined="false" status="on">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="2000"/>
    <parameter key="port" value="993"/>
    <parameter key="days" value="30"/>
  </service>
  <service name="SSL-Cert-SMTPS-465" interval="7200000" user-defined="false" status="on">
    <parameter key="retry" value="2"/>
    <parameter key="timeout" value="2000"/>
    <parameter key="port" value="465"/>
    <parameter key="days" value="30"/>
  </service>
  <service name="SSL-Cert-HTTPS-443" interval="7200000" user-defined="false" status="off">
    <parameter key="retry" value="1"/>
    <parameter key="timeout" value="3000"/>
    <parameter key="port" value="443"/>
    <parameter key="days" value="30"/>
  </service>
  <downtime interval="7200000" begin="0" end="86400000"/>
  <!-- 2h, 0, 1d -->
  <downtime interval="14400000" begin="86400000" end="43200000"/>
  <!-- 4h, 1d, 5d -->
  <downtime interval="28800000" begin="43200000" end="432000000"/>
  <!-- 8h, 5d, 5d -->
  <downtime begin="432000000" />
  <!-- anything after 5 days delete -->
</package>
.
.
<monitor service="SSL-Cert-IMAPS-993" class-name="org.opennms.netmgt.poller.monitors.SSLCertMonitor" />
<monitor service="SSL-Cert-SMTPS-465" class-name="org.opennms.netmgt.poller.monitors.SSLCertMonitor" />
<monitor service="SSL-Cert-HTTPS-443" class-name="org.opennms.netmgt.poller.monitors.SSLCertMonitor" />

To enable the new monitors it is necessary to restart OpenNMS. I assign the services manually through Provisiond or with the provision.pl script via the ReST API, roughly like this:
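
A hypothetical example – the requisition name, foreign ID and IP address are just placeholders from my setup, and the exact provision.pl syntax may differ slightly between versions:

/opt/opennms/bin/provision.pl service add Servers mail-server 192.0.2.10 SSL-Cert-IMAPS-993
/opt/opennms/bin/provision.pl service add Servers mail-server 192.0.2.10 SSL-Cert-SMTPS-465
/opt/opennms/bin/provision.pl requisition import Servers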

To test the service, I've set the days parameter to 1000 to see if the service goes down:

SSL Certificate monitor down for IMAPS on port SSL.


After testing a down event, I set the configuration back to 30 days. Are there any drawbacks to this solution? It is currently not possible to check multiple SSL certificates in HTTPS vhost environments, because the SSL certificate monitor establishes the connection directly to the IP address. Sorry, I know offering SMTPS on port 465 is deprecated (RFC 6409), but I haven't had the time to play with my little private Postfix environment again, and currently it just works ;)

So long and thanks for all the fish …

Mail server DNSRBL monitoring

Running a mail server sometimes sounds easier than it is. There is one particularly nasty thing which can happen from time to time to any mail server administrator: your mail server is rejected when delivering mail because it got onto a DNS-based Real-time Blackhole List (DNSRBL). One goal of these blackhole lists was to create a directory of open-relay mail servers which are known to send spam. If your mail server is registered on such a blackhole list, there is a good chance you have a degraded mail service.

As the name DNS-based says, the list is built as a DNS zone and is maintained by the DNSRBL provider. For example, when your mail server receives a connection, you can first check whether the client's address is registered on a DNSRBL. Based on this information you can decide to accept or reject the connection.

Say you get a connection to your mail server from a client with the IP address 23.4.2.55. The mail server first checks, for example, the blackhole list from SpamCop: it builds a specific DNS query with the reversed IP address and appends the domain of the DNSRBL provider. The DNS query will look like the following:

~ % host -a 55.2.4.23.bl.spamcop.net
Trying "55.2.4.23.bl.spamcop.net"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51340
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;55.2.4.23.bl.spamcop.net.  IN      ANY

;; ANSWER SECTION:
55.2.4.23.bl.spamcop.net. 2100 IN   A       127.0.0.2
55.2.4.23.bl.spamcop.net. 2100 IN   TXT     "Blocked - see http://www.spamcop.net/bl.shtml?23.4.2.55"

Received 134 bytes from 8.8.8.8#53 in 16 ms

If this DNS lookup returns an A record with 127.0.0.2, the mail server is listed on the blackhole list; if the query returns nothing, the server is good to go. Some providers also have a TXT record with a URL which tells you why the server is registered on the list.

Being responsible for a mail server, it is helpful to know if it is registered on such blackhole lists. For this reason there are tons of websites where you can check if an IP address is registered, e.g. http://dnsbl.info. If you have many mail servers, you may want a more sophisticated way. The test itself is a very simple DNS lookup and can easily be done by a script. I've created a Groovy script which runs inside the Bean Scripting Framework Monitor (BSFMonitor) of OpenNMS. To keep the script execution time short, I've used a thread pool to run the DNS queries in parallel (a simplified sketch of the idea follows below). This quite effectively allows testing over 50 DNSRBL providers. A few of them have very slow response times, so you need to tweak the timeout parameter for the script. To save resources I run the test every two hours. The service goes down as soon as one of the DNSRBL providers has listed the IP address.

Service down for SPAM-Blacklist-Monitor

The service down event gives a list of all DNSRBL providers which have the IP address in their blackhole list, so you can start to get in contact with them and deal with their policies.
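
The core idea of the script looks roughly like this – a simplified sketch with a handful of example providers, not the full script from the repository:

// Check one IP address against a few DNSRBL providers in parallel
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

def ipAddress = "23.4.2.55"   // address to check, e.g. the mail server's public IP
def providers = ["bl.spamcop.net", "zen.spamhaus.org", "dnsbl.sorbs.net"]

// build the reversed-IP part of the query name, e.g. 55.2.4.23
def reversed = ipAddress.tokenize('.').reverse().join('.')

def pool = Executors.newFixedThreadPool(providers.size())
def futures = providers.collect { provider ->
    pool.submit({
        try {
            // a successful lookup (usually 127.0.0.x) means the address is listed
            InetAddress.getByName("${reversed}.${provider}")
            return provider
        } catch (UnknownHostException e) {
            return null   // not listed on this provider
        }
    } as Callable)
}

def listedOn = futures.collect { it.get() }.findAll { it != null }
pool.shutdown()
pool.awaitTermination(60, TimeUnit.SECONDS)

// the real BSFMonitor script reports the result back to the poller; here we just print it
println(listedOn.isEmpty() ? "not listed" : "listed on: ${listedOn}")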

The configuration is not very complicated. The Groovy script is copied to

/etc/opennms/scripts/SpamBlackListMonitor.groovy

The monitor configuration in poller-configuration.xml has to look like this

<package name="Long-Running-Scripts">
  <filter>IPADDR != '0.0.0.0'</filter>
  <include-range begin="1.1.1.1" end="254.254.254.254"/>
  <include-range begin="::1" end="ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"/>
  <rrd step="7200">
    <rra>RRA:AVERAGE:0.5:1:2016</rra>
    <rra>RRA:AVERAGE:0.5:12:1488</rra>
    <rra>RRA:AVERAGE:0.5:288:366</rra>
    <rra>RRA:MAX:0.5:288:366</rra>
    <rra>RRA:MIN:0.5:288:366</rra>
  </rrd>
  <service name="SPAM-Blacklist-Monitor" interval="7200000" user-defined="true" status="on">
    <parameter key="file-name" value="/etc/opennms/scripts/SpamBlackListMonitor.groovy"/>
    <parameter key="lang-class" value="groovy"/>
    <parameter key="bsf-engine" value="org.codehaus.groovy.bsf.GroovyEngine"/>
    <parameter key="run-type" value="exec" />
    <parameter key="retry" value="1" />
    <parameter key="timeout" value="60000" />
    <parameter key="file-extensions" value="groovy,gy"/>
    <parameter key="rrd-repository" value="/opt/opennms/share/rrd/response"/>
    <parameter key="rrd-base-name" value="dnsrbl"/>
  </service>
  <downtime interval="30000" begin="0" end="300000"/>
  <!-- 30s, 0, 5m -->
  <downtime interval="300000" begin="300000" end="43200000"/>
  <!-- 5m, 5m, 12h -->
  <downtime interval="600000" begin="43200000" end="432000000"/>
  <!-- 10m, 12h, 5d -->
  <downtime begin="432000000" delete="true"/>
  <!-- anything after 5 days delete -->
</package>
.
.
<monitor service="SPAM-Blacklist-Monitor" class-name="org.opennms.netmgt.poller.monitors.BSFMonitor"/>

I've created a specific poller package with a different RRD configuration which fits my polling interval of 2 hours. I haven't set any special IP filters; I assign the monitor myself, so it will only run on the IP interfaces which have the service assigned. After restarting OpenNMS to load the new monitor, I assign the service SPAM-Blacklist-Monitor to the IP address of the mail server I want to check. I've set the timeout to 60 seconds and repeat the test every 2 hours, which was fast enough in my case. The response time is recorded as the total run time of the script, so you can see if your timeout configuration is good enough. So feel free to play. You can find the Groovy script on GitHub in the opennms-bsf-scripts repository. If you have any improvements or feedback, you're welcome.

Are there any drawbacks to the solution of using a Groovy script with the BSFMonitor? Yes, there are: the BSFMonitor has to read the script from the hard disk every time the monitor is executed, and before the script is executed the Groovy script also has to be compiled to byte code, which in this specific case costs an additional ~1.3 seconds of CPU time. If you run this test thousands of times, it would be more efficient to implement this monitor in native Java. The good thing is, you can change the script to add or remove DNSRBL providers without restarting OpenNMS. Running the script inside the JVM also allows a more effective use of threads, instead of shell scripts using fork(), and I have direct access to the OpenNMS logging framework and data structures.

For the name lookup I've used the native Java method InetAddress.getByName(). In the Java Virtual Machine (JVM) you may have to deal with the JVM DNS caching policies, which can be modified with the following security properties:

networkaddress.cache.ttl (default: -1)
Indicates the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup. A value of -1 indicates "cache forever".

networkaddress.cache.negative.ttl (default: 10)
Indicates the caching policy for unsuccessful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the failure for unsuccessful lookups.

A value of 0 indicates "never cache". A value of -1 indicates "cache forever".
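
If the default caching gets in the way, the properties can also be changed at runtime, for example at the top of the script – just a sketch with example values; alternatively you can set them globally in the JVM's java.security file:

import java.security.Security

// cache successful lookups for 30 seconds instead of forever,
// and negative answers for 5 seconds
Security.setProperty("networkaddress.cache.ttl", "30")
Security.setProperty("networkaddress.cache.negative.ttl", "5")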

So long and thanks for all the fish ...