Over the last two decades, thin provisioning and thin infrastructure have become common across the industry; hence the growth of AWS managed services.
It started with thin provisioning of storage and later with thin provisioning of virtual machines and virtual machines’ storage, and finally, with the ‘Pay As You Go’ model in the cloud, where you can consume resources on an on-demand basis and pay only for what you use. This brings us to the modern era with additional cloud services like managed databases and more.
These days, cloud providers offer great flexibility. But with large consumption come large bills. Since its inception, the promise of the cloud has been that you don't need to invest in Capex (capital expenditure). This proposition is ideal for startups, whose goals involve rapid release cycles despite employing fewer engineers. Eventually, though, these costs catch up with them as they scale.
With a potential recession looming, it's increasingly important for businesses to investigate the hidden costs of the services they've invested in.
In addition, it's important to understand what happens when your lean or serverless architecture reaches a point where it is no longer cost-efficient or can no longer hold your workloads.
Here are some of the most popular fully managed services that cloud vendors offer today:
- Managed databases like Aurora Serverless and DynamoDB.
- Serverless compute (Function as a Service, API Gateway, etc.).
- Managed analytics solutions (MSK, EMR, etc.).
- Managed infrastructure (like AWS Fargate and AWS Batch).
In this post, I’ll cover several popular AWS managed services and will discuss their typical use cases as well as their hidden costs for growing and mature organizations. I will explain how customers who work with them on a large scale may require prior forethought to prepare their infrastructure, as well as an understanding that the total cost of the solution may rise as you grow.
Amazon DynamoDB
DynamoDB (DDB) is a fully managed, proprietary NoSQL database service that supports key-value and document data structures and is offered as part of the AWS portfolio.
In most circumstances, deploying and scaling a non-relational database requires you to manage and configure its core hardware and software components. DynamoDB removes this barrier by automatically scaling throughput to meet the needs of your workloads and partitioning data as your needs expand. Because it replicates data across three Availability Zones in an AWS Region, you also get increased availability and durability.
Thousands of companies use DDB as their core database, including Amazon.com. DDB was one of the first fully managed NoSQL database services built in the cloud to serve cloud-native applications while providing single-digit-millisecond performance. DDB also provides global tables, which let you replicate a table to multiple regions and serve customers from the region closest to them.
Though DDB is a managed NoSQL database, it has some limitations and features that should be understood in advance in order to ensure cost efficiency over time. Here are some tips and best practices to keep in mind:
Use S3 and hold a pointer in DDB for items larger than 400 KB
Item size in DDB is limited to 400 KB, so you'll need to offload larger payloads to S3 and store only a pointer (the object key) in DDB.
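The routing logic for this pattern can be sketched as below. This is a minimal illustration, not a production client: `ddb_put` and `s3_put` are injected callables standing in for boto3's `put_item` and `put_object`, and the bucket layout and key scheme are assumptions.

```python
import json

DDB_ITEM_LIMIT = 400 * 1024  # DynamoDB's 400 KB item size limit

def store_document(doc_id: str, document: dict, ddb_put, s3_put) -> dict:
    """Store small documents directly in DDB; offload large ones to S3.

    ddb_put and s3_put are injected storage callables (e.g. thin wrappers
    around boto3's put_item and put_object) so the routing is testable.
    """
    body = json.dumps(document).encode("utf-8")
    if len(body) < DDB_ITEM_LIMIT:
        item = {"id": doc_id, "payload": body}
    else:
        # Hypothetical key scheme; the DDB item holds only a pointer.
        s3_key = f"documents/{doc_id}.json"
        s3_put(s3_key, body)
        item = {"id": doc_id, "s3_pointer": s3_key}
    ddb_put(item)
    return item
```

On reads, the application checks for the `s3_pointer` attribute and fetches the object from S3 when present.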
Understand the cost implications of using GSI and LSI
Due to the replication of data in Global Secondary Indexes (GSI) and Local Secondary Indexes (LSI), using these features can get expensive. Every time you write to a table, each secondary index that projects the affected attributes is also updated and consumes its own write capacity. As a result, you carry the additional cost of duplicating the data.
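The write amplification is easy to see with a back-of-the-envelope calculation. The on-demand rate below is an assumption for illustration (roughly the historical us-east-1 price; check current AWS pricing):

```python
# Each GSI that projects a written attribute consumes its own write
# units, so writes are amplified by (1 + number_of_gsis).
WRITE_PRICE_PER_MILLION = 1.25  # USD per million on-demand writes (assumed)

def monthly_write_cost(writes_per_month: int, gsi_count: int) -> float:
    amplified = writes_per_month * (1 + gsi_count)
    return amplified / 1_000_000 * WRITE_PRICE_PER_MILLION

base = monthly_write_cost(100_000_000, 0)      # table only
with_two = monthly_write_cost(100_000_000, 2)  # table + 2 GSIs: 3x the cost
```

Projecting only the attributes an index actually needs (instead of `ALL`) reduces how many writes touch each index.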
DDB commitments can't be applied to Global Table replication
Global Tables build on the global DynamoDB footprint to provide a fully managed, multi-region, multi-active database that delivers fast local read and write performance for massively scaled, global applications, and the feature is ideal for disaster recovery. Unfortunately, Global Table replication cannot be covered by DDB reserved-capacity commitments, so no discounts can be applied to the replicated table.
Make sure you understand your workload needs before using DDB
Knowing how to predict your workloads (or testing them for a few days to understand your needs) can save you up to 50% on read and write capacity units (RCUs and WCUs) as you scale with DDB. These savings stem from how DDB provisions capacity. DDB offers two capacity modes: on-demand, and provisioned with auto-scaling. On-demand absorbs any traffic pattern without tuning, but this flexibility comes with a higher price per request. Provisioned capacity with auto-scaling is cheaper, but the catch is that capacity decreases are limited (historically to roughly four per day), which can leave you paying for capacity you're not using. In addition, auto-scaling is slow to respond to sudden peaks in demand, which is problematic for dynamic workloads. With a better understanding of your workload, however, you can take advantage of provisioned capacity's discount over on-demand.
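As a rough illustration of the gap between the two modes, consider a steady write workload. The rates below are assumptions (approximate historical us-east-1 prices; verify against current AWS pricing):

```python
# Assumed rates for illustration only:
ON_DEMAND_PER_MILLION_WRITES = 1.25   # USD per million write requests
PROVISIONED_PER_WCU_HOUR = 0.00065    # USD per WCU-hour
HOURS_PER_MONTH = 730

def on_demand_cost(writes_per_month: int) -> float:
    return writes_per_month / 1_000_000 * ON_DEMAND_PER_MILLION_WRITES

def provisioned_cost(peak_wcu: int) -> float:
    # Provisioning for peak capacity all month long.
    return peak_wcu * PROVISIONED_PER_WCU_HOUR * HOURS_PER_MONTH

# A steady 100 writes/sec workload (~263M writes/month):
writes = 100 * 3600 * HOURS_PER_MONTH
steady_on_demand = on_demand_cost(writes)   # ~ $328/month
steady_provisioned = provisioned_cost(100)  # ~ $47/month
```

For steady, predictable traffic the provisioned mode is several times cheaper; the picture reverses for very spiky traffic, where provisioning for peak means paying for idle capacity most of the time.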
Use DAX to increase DynamoDB performance
DynamoDB users can take advantage of DAX (DynamoDB Accelerator), a fully managed in-memory cache for DDB, to speed up reads. But as always, you should add a cache only after you've optimized your queries; otherwise, it may cost significantly more than you bargained for. One option AWS recommends is parallelism, which entails breaking large scans into smaller segments that run concurrently.
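DynamoDB's Scan API supports this natively via the `Segment` and `TotalSegments` parameters. A minimal sketch of the fan-out, with `scan_fn` standing in for a boto3 `table.scan(Segment=..., TotalSegments=...)` call (and pagination omitted for brevity):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_scan(scan_fn, total_segments: int = 4) -> list:
    """Fan a DynamoDB-style scan out across segments.

    scan_fn(segment, total_segments) stands in for a boto3
    table.scan(Segment=..., TotalSegments=...) call and returns
    the items found in that segment.
    """
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [pool.submit(scan_fn, seg, total_segments)
                   for seg in range(total_segments)]
        items = []
        for f in futures:
            items.extend(f.result())
    return items
```

Note that a parallel scan is faster but consumes the same total read capacity; it spreads the work, not the cost.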
Amazon Aurora and Aurora Serverless
Amazon Aurora (Aurora) is a fully managed relational database engine that's compatible with MySQL and PostgreSQL. Amazon built the Aurora engine seven years ago to tackle one of the greatest bottlenecks in relational databases: storage performance. Aurora separates the compute and storage layers, and the storage is fully managed by AWS for both IOPS and capacity, which is a very useful feature. Unfortunately, it also makes it very hard to predict the future cost of your database cluster.
Because Aurora's architecture differs greatly from a database running on ordinary EBS volumes, one read in Aurora is not equivalent to one read on EBS. As a result, once you start using Aurora you cannot predict in advance what your read I/O will look like, making the overall price difficult to estimate. The best way to estimate IOPS is to run a POC on Aurora and measure the number of I/O requests it actually consumes; this way, you'll have a better understanding of your potential costs.
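Once the POC has run, projecting a monthly I/O bill is simple arithmetic. The per-million rate below is an assumption (roughly the historical us-east-1 price for Aurora standard storage; verify against current pricing):

```python
IO_PRICE_PER_MILLION = 0.20  # USD per million I/O requests (assumed)
HOURS_PER_MONTH = 730

def projected_monthly_io_cost(measured_ios: int, poc_hours: float) -> float:
    """Extrapolate a POC's measured I/O consumption to a monthly cost."""
    ios_per_hour = measured_ios / poc_hours
    return ios_per_hour * HOURS_PER_MONTH / 1_000_000 * IO_PRICE_PER_MILLION

# e.g. a 48-hour POC that consumed 60M I/O requests:
cost = projected_monthly_io_cost(60_000_000, 48)  # ~ $182/month
```

The POC should replay production-like query patterns; a synthetic load with a different cache-hit profile will skew the I/O count badly.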
Amazon Aurora also has a serverless offering (now in Version 2) that offers fully managed PostgreSQL and MySQL Relational databases. The premise of Aurora Serverless is to get both compute and storage managed by AWS. This makes sense if you wish to start with a thin footprint and grow as your needs increase, while focusing on building your application and not on infrastructure. Though using Aurora has lots of benefits, both Aurora and Aurora Serverless have some key drawbacks that should be taken into consideration:
- Using Aurora Serverless is expensive unless you plan to use it for workloads that can be paused, for example, development databases stopped on weekends. If you plan to run your databases 24/7, regular Aurora or Amazon RDS (Relational Database Service) will make more sense.
- No root access is available. This is true for both RDS and Aurora. In some scenarios you'll need root access on the database host, but this cannot be achieved with Aurora clusters, which can impact root cause analysis and visibility.
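The first caveat can be sanity-checked with a rough break-even calculation. All rates and the ACU-equivalence below are assumptions for illustration (approximate us-east-1 figures; verify against current AWS pricing):

```python
# Assumed rates for illustration only:
ACU_HOUR_PRICE = 0.12          # Aurora Serverless v2, USD per ACU-hour
PROVISIONED_HOUR_PRICE = 0.26  # a comparable provisioned instance, USD/hour
EQUIVALENT_ACUS = 8            # ACUs roughly matching that instance
HOURS_PER_MONTH = 730

def serverless_monthly(acus: float, active_hours: float) -> float:
    return acus * active_hours * ACU_HOUR_PRICE

def provisioned_monthly() -> float:
    return PROVISIONED_HOUR_PRICE * HOURS_PER_MONTH  # runs 24/7

# A dev database active ~8h on weekdays (~176 h/month) vs. always-on:
part_time = serverless_monthly(EQUIVALENT_ACUS, 176)            # ~ $169
always_on = serverless_monthly(EQUIVALENT_ACUS, HOURS_PER_MONTH)  # ~ $701
fixed = provisioned_monthly()                                   # ~ $190
```

Under these assumptions, Serverless wins only while the database is idle most of the month; run it around the clock and it costs several times the provisioned equivalent.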
AWS Fargate
AWS Fargate is a serverless, pay-as-you-go compute engine for containers that lets you focus on building applications without managing servers. It is compatible with both Amazon ECS (Elastic Container Service) and Amazon EKS (Elastic Kubernetes Service), deploys and manages containers without exposing any of the underlying infrastructure, and makes it easy to scale your applications by launching tens to tens of thousands of containers in seconds.
Although this sounds great, Fargate has a few limitations that you need to be aware of:
- When running On-Demand, you may find Fargate to be very expensive. To address this, you can purchase compute Savings Plans to cover your Fargate workloads.
- With Fargate, you cannot choose the instance family or any other specification of the underlying infrastructure; you can only request vCPU and memory. As a result, performance-sensitive workloads often do better on EC2, where you can pick specialized instance types. Some customers will see this limitation as a blocker to adopting Fargate.
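To put a number on the first point, here is a rough on-demand comparison between Fargate and a similarly sized EC2 instance. The rates are assumptions (approximate us-east-1 prices; verify against current AWS pricing):

```python
# Assumed on-demand rates for illustration only:
FARGATE_VCPU_HOUR = 0.04048  # USD per vCPU-hour
FARGATE_GB_HOUR = 0.004445   # USD per GB-hour of memory
M5_LARGE_HOUR = 0.096        # m5.large (2 vCPU, 8 GiB), USD/hour

def fargate_hourly(vcpus: float, memory_gb: float) -> float:
    return vcpus * FARGATE_VCPU_HOUR + memory_gb * FARGATE_GB_HOUR

fargate = fargate_hourly(2, 8)  # ~ $0.117/hour
ec2 = M5_LARGE_HOUR             # $0.096/hour
premium = fargate / ec2 - 1     # ~20% premium for an always-on workload
```

The premium buys you zero host management and per-task billing, which is why it pays off for bursty workloads but not for fleets that run flat-out 24/7, and why Compute Savings Plans matter so much for the latter.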
Troubleshooting Fargate is also quite complex. Since you don't have full visibility into the infrastructure, you may find it difficult to pinpoint the root cause of performance issues, which means you may need to contact AWS Support should they arise. In addition, since you don't have access to the underlying hosts, make sure you export all the telemetry from the deployed environment (i.e., metrics, logs, etc.) so you can filter, view, and debug it in CloudWatch.
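For log export on ECS/Fargate, the standard mechanism is the `awslogs` log driver in the task definition. A minimal container-definition fragment (shown as the Python dict you would pass to boto3's `register_task_definition`; all names, the account ID, and the region are placeholders):

```python
# Minimal ECS container definition shipping stdout/stderr to CloudWatch
# Logs via the awslogs driver. All names below are placeholders.
container_definition = {
    "name": "my-app",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
    "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": "/ecs/my-app",        # log group to write to
            "awslogs-region": "us-east-1",
            "awslogs-stream-prefix": "ecs",        # one stream per task
        },
    },
}
```

The task execution role also needs permission to create and write to the log group (`logs:CreateLogStream`, `logs:PutLogEvents`).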
Using fully managed services has many pros for a growing company that lacks the resources of a mature organization, but there are caveats to take into account when you reach a certain level of scale. In many cases, these services become very expensive or hard to maintain, and you'll find yourself needing to re-architect your infrastructure, which is a real pain once everything is already in place.
However, deploying your own infrastructure requires many engineering hours for deployment, updating, patching, security monitoring, and more. This will cost your organization both time and engineering salaries. Managed services eliminate, or at the very least greatly reduce, these burdens and let engineers spend more time on their core responsibilities.
By planning ahead and taking all of these issues into account before you implement managed services, you will be better equipped to take advantage of these technologies and reduce the number of menial tasks for your engineers for years to come.
Whether you choose to use AWS-managed services or not, you can definitely benefit from more automation. Learn how Renovisor can help you automate cloud management so you can save up to 60% on EC2 and EBS costs.