The Dulin Report

Big Data is not all about Hadoop

May 30, 2015

Image: Punchcard. Photo credit: Jan Andersen.

Big Data is not Hadoop, and Hadoop is not Big Data.

A lot of people are surprised that Big Data adoption is growing while Hadoop is struggling. There is some speculation as to why, and I have a more pragmatic explanation: Hadoop is not SQL.

Not all developers are created equal. Not all developers can pick up new skills and enjoy doing so. The vast majority of enterprise developers are business analysts who know how to configure business software like Salesforce or SAP. Many know SQL, which is effectively a well-established business language in its own right. Some may also know a programming language or two, such as Java, JavaScript, C# or even Python, but programming is not their primary job function or even their interest. The mere concept of MapReduce might as well be a foreign language to this group of people.
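To make the gap concrete, here is a rough sketch of the same aggregation written two ways: first as explicit map, shuffle, and reduce steps, then as one SQL statement. This is plain in-process Python using the standard library's `sqlite3` module, not Hadoop's actual API. The `SALES` data, the `sales` table, and all function names are made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, sale amount) pairs.
SALES = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 10)]


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for region, amount in records:
        yield region, amount


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def totals_map_reduce(records):
    """Chain the three phases, as a MapReduce-style job would."""
    return reduce_phase(shuffle_phase(map_phase(records)))


def totals_sql(records):
    """The same aggregation as a single declarative SQL statement."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
        rows = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(rows.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_map_reduce(SALES))
    print(totals_sql(SALES))
```

Both functions return the same totals per region. The difference is that the analyst who knows SQL only has to write the `SELECT ... GROUP BY` line, while the MapReduce version asks them to think in terms of emitting, grouping, and reducing key-value pairs.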

Most IT departments don't understand the implications of adopting distributed storage tools like Hadoop or Cassandra. Expansion and scalability happen by adding new nodes, which increases IT maintenance costs. The reality is that the vast majority of businesses do not need Hadoop. Dramatic improvements in storage technology, especially SSDs, declining costs of multi-core servers, and seamless support for replicas offered by environments like AWS mean that traditional, well-established data processing and reporting systems (i.e., SQL) can actually be better at “Big Data” than Hadoop.