The Dulin Report

Big Data is not all about Hadoop

May 30, 2015

[caption id="attachment_216" align="aligncenter" width="300"]Punchcard. Photo credit: Jan Andersen[/caption]

Big Data is not Hadoop, and Hadoop is not Big Data.

Many people are surprised that Big Data adoption is growing while Hadoop is struggling. There is some speculation as to why, but I have a more pragmatic explanation: Hadoop is not SQL.

Not all developers are created equal. Not all developers can pick up new skills and enjoy doing so. The vast majority of enterprise developers are business analysts who know how to configure business software like Salesforce or SAP. Many know SQL, which is effectively a well-established business language. Some may also know a programming language or two, such as Java, JavaScript, C# or even Python, but programming is not their primary job function or even their interest. The mere concept of MapReduce might as well be a foreign language to this group of people.
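To make the gap concrete, here is a minimal sketch of the same question ("what are total sales per region?") answered twice: once as a single SQL statement, and once spelled out as map, shuffle, and reduce steps. The `SALES` data, the `sales` table, and both function names are hypothetical, made up for illustration. Everything runs on one machine with Python's built-in `sqlite3` module; it imitates the MapReduce programming model and is not Hadoop code.

```python
import sqlite3
from collections import defaultdict
from functools import reduce

# Hypothetical sample data: (region, amount) sales records.
SALES = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 75)]


def totals_with_sql(rows):
    """Total sales per region, expressed as one declarative GROUP BY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def totals_with_map_reduce(rows):
    """The same totals, written as explicit map, shuffle, and reduce steps."""
    # Map: emit a (key, value) pair for every input record.
    mapped = [(region, amount) for region, amount in rows]

    # Shuffle: group values by key. A real framework does this across nodes.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)

    # Reduce: fold each key's values into a single total.
    return {
        key: reduce(lambda a, b: a + b, values)
        for key, values in grouped.items()
    }


if __name__ == "__main__":
    print(totals_with_sql(SALES))
    print(totals_with_map_reduce(SALES))
```

Both functions return the same totals. The difference is in what the author has to think about: the SQL version states the desired result, while the map-reduce version requires decomposing the problem into keyed pairs and a fold, which is the step many SQL-fluent analysts find unfamiliar.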

Most IT departments don't understand the implications of adopting distributed storage tools like Hadoop or Cassandra. Expansion and scalability happen by adding new nodes, which increases IT maintenance costs. The reality is that the vast majority of businesses do not need Hadoop. Dramatic improvements in storage technology (especially SSDs), the declining cost of multi-core servers, and the seamless support for replicas offered by environments like AWS mean that traditional, well-established data processing and reporting systems (i.e., SQL) can actually be better at “Big Data” than Hadoop.