The NDRC is located in the Digital Hub in Dublin, which is also part of the Guinness Brewery, an auspicious location for a meetup, truth be told. We were scheduled to kick off at 6:30pm, but the room was still fairly empty and I was starting to think the hot weather might have swayed people towards other pursuits. Ten minutes later, though, we had our audience of fifty.
Sarah O’Farrell of the NDRC kicked us off with an overview of the NDRC launchpad programme. They are currently accepting applications, so if you have a good idea for a startup, get an application in.
The presentations started with a five-minute brief on Cassandra given by Patrick McFadin, Solution Architect with DataStax and all-round knowledgeable dude when it comes to Cassandra and databases. Patrick’s brief was excellent. I can’t quote it word for word, but here is a drier version:
“Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database that bases its distribution design on Amazon’s Dynamo and its data model on Google’s Bigtable. Created at Facebook, it is now used at some of the most popular sites on the Web.” (From Cassandra: The Definitive Guide by Eben Hewitt, published by O’Reilly.)
Bill De Hora, CTO of Cayova and a Cassandra veteran, was up next. Bill brings a lot of domain knowledge to the table, both in Cassandra and in distributed systems. His presentation was detailed and coherent. The audience got a flavour of how Cassandra serves as the main source of truth at Cayova for complex social media data, e.g. timelines, posts and file metadata. The best thing about the presentation was that I found myself greedily anticipating seeing the slides again, as they were full of great nuggets from the development and operations perspective. For the complete deck, see here.
Patrick took the stage again and explained how he went from being an Oracle DBA to working with Cassandra, and why he was so excited by the technology. He also emphasized that working with NoSQL is a paradigm shift from working with a traditional RDBMS. Moreover, many of the issues companies have with using Cassandra at scale arise from not reading the great documentation available. Very sound advice, delivered in a way that really resonated with the audience. Personally, I really enjoyed the detail he went into regarding disk I/O and OS recommendations. It is always good to get a refresher on why you really need to pay attention to disk configuration, especially when using SSDs, which, as he rightly pointed out, have come down enough in price that they are now a feasible option. DataStax are now recommending them as the default storage of choice in high-performance clusters with mixed read/write requirements, see here.
We wrapped up quickly as security were at our heels, but it was a very informative two hours and the audience stayed till the bitter end, which, given the temperature in the room, was a testament to how engaging the speakers were. You can find Patrick’s presentation here.
Roll on the next one!
David Long is the CEO of DigBigData and a long-time Cassandra fan.