The Dulin Report

Big Data is not all about Hadoop

May 30, 2015

[Image: Punchcard. Photo credit: Jan Andersen]

Big Data is not Hadoop, and Hadoop is not Big Data.

Many people are surprised that Big Data adoption is growing while Hadoop is struggling. There is some speculation as to why, but I have a much more pragmatic explanation: Hadoop is not SQL.

Not all developers are created equal. Not all developers can pick up new skills, and not all of them enjoy doing so. The vast majority of enterprise developers are business analysts who know how to configure business software like Salesforce or SAP. Many know SQL, which is effectively a well-established business language in its own right. Some may also know a programming language or two, such as Java, JavaScript, C#, or even Python, but programming is not their primary job function or even their interest. The mere concept of Map-Reduce might as well be a foreign language to this group of people.
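To make the gap concrete, here is a minimal sketch in Python that computes the same per-region total twice: once with a single SQL `GROUP BY` (using the standard-library `sqlite3` module as a stand-in for any SQL database) and once with hand-written map, shuffle, and reduce steps. The `SALES` data and all function names are hypothetical, made up for illustration. This is not Hadoop's actual API, only the shape of the programming model it asks developers to think in.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) sales records.
SALES = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 75)]


def totals_with_sql(rows):
    """Total sales per region, stated declaratively in one SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def map_phase(record):
    """Map step: emit one (key, value) pair per input record."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def totals_with_map_reduce(rows):
    """The same totals, spelled out as explicit map, shuffle, and reduce steps."""
    mapped = (pair for record in rows for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(totals_with_sql(SALES))
    print(totals_with_map_reduce(SALES))
```

Both functions return the same dictionary of totals. The SQL version is one statement that a business analyst can read; the map-reduce version requires thinking in terms of key-value pairs and separate processing phases, and that is before any cluster configuration enters the picture.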

Most IT departments don't understand the implications of adopting distributed storage tools like Hadoop or Cassandra. Expansion and scalability happen by adding new nodes, which increases IT maintenance costs. The reality is that the vast majority of businesses do not need Hadoop. Dramatic improvements in storage technology (especially SSDs), declining costs of multi-core servers, and seamless support for replicas offered by environments like AWS mean that traditional, well-established data processing and reporting systems (i.e., SQL) can actually be better at “Big Data” than Hadoop.