The Dulin Report

Apple is (or was) the Biggest User of Apache Cassandra

April 23, 2015

One thing I did not realize about Cassandra is that Apple is (or was) one of the biggest Cassandra users out there:
Word in Goldmacher's circles is that Apple will be “replacing” its huge Cassandra noSQL implementation with FoundationDB. Apple uses Cassandra for “iMessage, iTunes passwords, a bunch of stuff,” he says.

In fact, Apple is touted as having one of the largest production deployments of Cassandra of all, with over 75,000 nodes storing over 10 petabytes of data. Cassandra is a free and open source database with a commercial version offered by DataStax.

The article further states that FoundationDB can run faster, on cheaper hardware, and with fewer nodes. It states that Apple could reduce its cluster size by 5-10%.

5-10% off a cluster that size is nothing to sneeze at. With over 75,000 nodes, we are talking about roughly 3,750 to 7,500 servers, millions of dollars in hardware savings, and even more in devops costs.
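The arithmetic behind that estimate can be sketched in a few lines of Python. The node count and the 5-10% range come from the quoted article; the per-server cost is a purely hypothetical placeholder, so the dollar figures only illustrate the order of magnitude.

```python
# Back-of-the-envelope estimate of what a 5-10% cluster reduction means.
CLUSTER_NODES = 75_000            # node count quoted in the article
REDUCTION_RANGE = (0.05, 0.10)    # 5-10% reduction quoted in the article
HYPOTHETICAL_COST_PER_SERVER_USD = 5_000  # assumption, NOT from the article


def nodes_saved(total_nodes: int, fraction: float) -> int:
    """Return how many nodes a given fractional reduction removes."""
    return round(total_nodes * fraction)


def hardware_savings(total_nodes: int, fraction: float, cost_per_server: int) -> int:
    """Return the hardware cost avoided, given an assumed cost per server."""
    return nodes_saved(total_nodes, fraction) * cost_per_server


if __name__ == "__main__":
    for fraction in REDUCTION_RANGE:
        nodes = nodes_saved(CLUSTER_NODES, fraction)
        dollars = hardware_savings(
            CLUSTER_NODES, fraction, HYPOTHETICAL_COST_PER_SERVER_USD
        )
        print(f"{fraction:.0%} reduction: {nodes:,} servers, ~${dollars:,} in hardware")
```

Swap in your own per-server cost to see how quickly the savings add up at this scale.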

Since RAM is the new disk and disk is the new tape, an in-memory data store backed by disk is going to support reads that are orders of magnitude faster than a data store like Cassandra that uses disk as its primary storage mechanism. For example, Redis has a data model that is in some ways similar to Cassandra's, but it keeps the entire data set in memory.
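To put "orders of magnitude" in perspective, here is a minimal sketch using rough, commonly cited device latencies. These numbers are assumptions for illustration, not measurements of Cassandra, Redis, or FoundationDB, and a real database read adds network and software overhead on top (and may be served from a cache).

```python
import math

# Rough order-of-magnitude read latencies in nanoseconds.
# These are illustrative assumptions, not benchmark results.
APPROX_READ_LATENCY_NS = {
    "ram": 100,           # main-memory reference
    "ssd": 100_000,       # random read from an SSD
    "hdd": 10_000_000,    # seek on a spinning disk
}


def orders_of_magnitude_slower(medium: str, baseline: str = "ram") -> float:
    """Return log10 of how much slower `medium` is than `baseline`."""
    ratio = APPROX_READ_LATENCY_NS[medium] / APPROX_READ_LATENCY_NS[baseline]
    return math.log10(ratio)


if __name__ == "__main__":
    for medium in ("ssd", "hdd"):
        magnitude = orders_of_magnitude_slower(medium)
        print(f"{medium} is roughly 10^{magnitude:.0f} times slower than RAM")
```

Even with generous assumptions for disk, the gap at the device level is several powers of ten, which is the intuition behind preferring an in-memory store for read-heavy, latency-sensitive workloads.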

Of course, it all depends on your requirements. If you need to accumulate massive amounts of information that is queried infrequently or in off-peak batches, then Cassandra is a good fit. But if you require consistent performance for both reads and writes, you should look elsewhere.