The Dulin Report


Trying to Replace Cassandra with DynamoDB ? Not so fast

February 2, 2015

In November last year I pointed out how tempted I was to replace Cassandra with DynamoDB. Since then I have done some research and things are not as straightforward as they may seem at first.



I'd like to revisit my post and clarify a few things. On the elasticity of Cassandra, I said the following:



Scaling a Cassandra cluster involves adding new nodes. Each additional node requires hours of babysitting. The process of adding a node takes a few minutes, but bootstrapping can take hours. If you are using tokens you are in a bigger pickle, since you have to compute just the right balance, move tokens around, and clean up (* we are using tokens since this is a legacy production cluster, and there is no safe and easy way to migrate to vnodes). Once you have added a node, it becomes a fixed cost plus extra network charges. If you ever want to scale down you have to work backwards and decommission extra nodes, which takes hours, and then you have to rebalance your cluster again if you're still using tokens.

Going back to DynamoDB, the only thing I need to care about is IOPS. What is my minimum? What is my maximum? How much am I willing to pay? Period. End of story.



Not so fast. The story doesn't actually end there. As it turns out, there is a very important factor that I had not considered: Cassandra's burst performance. Allow me to explain.



Suppose your application experiences extended periods of low traffic, with significant bursts of activity every few hours. For example, overnight there are batch processes that update the data and then come morning thousands of mobile devices wake up and download the data.



For the sake of the conversation, let's say the number of devices is 1000. As per the SLA, users expect to get their data in seconds. Let's also say that, as per the SLA, you have to guarantee that up to 250 concurrent requests return in under 10 seconds. What that means is that if overnight you ran a job for, say, 10 hours that updated 1000 records per device, then when those devices wake up you will need to read 250*1000/10=25000 read units of data per second.
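The arithmetic above can be written out as a small sketch. It assumes, as the paragraph implicitly does, that reading one record costs one read unit; the function name is made up for illustration.

```python
def required_reads_per_second(concurrent_requests, records_per_device, sla_seconds):
    """Read units per second needed to serve the burst within the SLA.

    Assumption: one record read costs exactly one read unit.
    """
    return concurrent_requests * records_per_device / sla_seconds


# Numbers from the example: 250 concurrent requests,
# 1000 records per device, 10-second SLA.
burst_rate = required_reads_per_second(250, 1000, 10)
print(burst_rate)  # 25000.0
```

Larger records or a tighter SLA push this number up proportionally, so the provisioned capacity has to be sized for the morning burst, not for the overnight average.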



Now, Cassandra sitting on a c3.xlarge AWS instance and using SSDs for storage will be more than happy to oblige. DynamoDB, on the other hand, is a bit more intricate.



If you are willing to pay for 25000 read capacity units, you don't really have any problems. However, a DynamoDB table with that much provisioned capacity is actually orders of magnitude more expensive than a manually configured Cassandra cluster capable of this performance.



On the other hand, it may seem that you could use DynamoDB auto-scaling. The problem, however, is that it can take hours to go from 100 capacity units to 25000 units (at least per my benchmarks). Your users won't understand your excuses for not complying with the SLA.
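To see why a slow ramp-up breaks the SLA, here is a rough back-of-the-envelope sketch. It assumes one record per read unit and ignores throttling retries and any burst allowance the service may grant, so treat the result as an illustration, not a benchmark.

```python
def seconds_to_serve(total_reads, units_per_second):
    """Time to serve a burst at a fixed provisioned read rate.

    Assumption: one record read costs one read unit, with no retries.
    """
    return total_reads / units_per_second


# The burst from the example: 250 concurrent requests x 1000 records each.
burst_reads = 250 * 1000

at_low_capacity = seconds_to_serve(burst_reads, 100)     # table still at 100 units
at_full_capacity = seconds_to_serve(burst_reads, 25000)  # table already scaled up

print(at_low_capacity)   # 2500.0
print(at_full_capacity)  # 10.0
```

At 100 units the same burst takes roughly 42 minutes instead of 10 seconds, which is why scaling that lags the traffic by hours does not help here.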



As it turns out, DynamoDB makes a heck of a lot of sense if you have steady-stream write and read workloads. You may be able to write into DynamoDB via SQS so you can deal with bursts of activity. In fact, a comparison with an electric utility is the best analogy I could come up with. Imagine if your electric company took hours to ramp up capacity every morning when people wake up and turn the lights on. Likewise, what I would like to see from Amazon is a DynamoDB pricing model that works like this:




  1. You provision a maximum “fuse” capacity you are willing to pay for. There is a one-time fee to buy the “fuse.” Continuing with the electrical utility analogy, this is like paying to get connected to the grid and purchasing a meter and a fuse panel.

  2. You are charged exclusively for utilization. Once the “fuse” is in place, you only pay for the capacity you actually use. If you go for an hour without accessing your table at all, you pay zero for that hour. If you use 12367 read units per second for 25 minutes, you pay for that. If you reach the capacity of your “fuse” you get an exception and you have to deal with it in your application.
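The two rules above can be sketched as a tiny metering model. This is purely a wish-list illustration, not anything DynamoDB offers: the class names, the fee, and the per-unit price are all hypothetical, and the sketch trips the fuse only when demand goes above the provisioned maximum.

```python
class FuseExceeded(Exception):
    """Raised when demand goes past the provisioned fuse capacity."""


class FuseMeter:
    """Hypothetical pay-for-what-you-use meter with a hard capacity cap."""

    def __init__(self, fuse_capacity, one_time_fee, price_per_unit):
        self.fuse_capacity = fuse_capacity    # max read units per second
        self.one_time_fee = one_time_fee      # paid once to install the fuse
        self.price_per_unit = price_per_unit  # made-up price per read unit
        self.units_used = 0

    def record(self, units_per_second, seconds):
        """Meter a period of usage, or trip the fuse if it is too high."""
        if units_per_second > self.fuse_capacity:
            raise FuseExceeded(
                f"{units_per_second} units/s exceeds fuse of {self.fuse_capacity}"
            )
        self.units_used += units_per_second * seconds

    def bill(self):
        """One-time fee plus a charge for the units actually consumed."""
        return self.one_time_fee + self.units_used * self.price_per_unit


# Prices are in arbitrary, invented units.
meter = FuseMeter(fuse_capacity=25000, one_time_fee=100, price_per_unit=1)
meter.record(0, 60 * 60)      # an idle hour adds nothing to the bill
meter.record(12367, 25 * 60)  # 12367 read units per second for 25 minutes

try:
    meter.record(30000, 1)    # above the fuse: the application must handle it
    tripped = False
except FuseExceeded:
    tripped = True
```

The point of the design is that the fuse bounds the worst-case bill, while idle periods cost nothing beyond the one-time fee.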



I am keeping an eye on changes to the DynamoDB pricing model and I look forward to Amazon improving the platform. Until then, I guess I am stuck with Cassandra.