
Cloud This and Cloud That... Determining an Effective, Sustainable Cloud Data Warehouse Strategy

Posted by Colleen Balda on Mar 26, 2018 12:00:00 PM

How many times have organizations reached a “pivot point” concerning technologies? Trying to stay relevant is making businesses dizzy. Don’t get me wrong – pivots are often necessary for reaching the end goal, but spinning in circles doesn’t get an organization anywhere. 

It helps to have a game plan to remain on course, especially if the customer is a data-driven organization trying to achieve a competitive edge. Many organizations are re-evaluating the traditional pillars of analytics environments and looking to pivot towards a new playing field. This new playing field involves moving away from onsite enterprise data warehouses attempting to house “one version of the truth” to a platform that accommodates greater speed, agility and cost-efficiency.

Even though it’s not a brand-new concept, Cloud Data Warehousing, or Data Warehousing as a Service (DWaaS), is nonetheless the playing field for the future of data management. Major players have already engaged and taken an early lead; Amazon Redshift appeared on the scene in 2012. Since then, a host of other offerings has entered the market. Traditional vendors like IBM, SAP, Microsoft and Oracle jumped into the race. Cloud-born offerings like Snowflake and Google’s BigQuery have also emerged. Let’s take a closer look at the ins and outs of navigating cloud data warehouses.

What's the Need?

Moving a data warehouse to the cloud has advantages, but they are only gained if the organization weighs the right considerations and puts a strategy in place. The evaluation criteria for a data warehouse environment in the cloud are very different from those for traditional on-premise applications, because cloud platforms differ from traditional tools in speed, operability and real-time impact.

There are a few key considerations an organization must evaluate:

  • Size: With an on-premise solution, sizing a data warehouse is critical. Organizations need to be as precise as possible in projecting growth and usage estimates to ensure they don’t over- or under-purchase.
  • Scale: With Cloud Data Warehouse solutions, an organization can start with today’s demands and grow or shrink as future requirements evolve. It’s important to note that some offerings like Amazon Redshift scale solely based on cluster size, while other products like Snowflake or Microsoft’s Azure SQL Data Warehouse offer increased flexibility to scale compute and storage separately. This flexibility allows businesses to scale up for a large monthly, quarterly or annual review, a planning process or even nightly Extract-Transform-Load (ETL) jobs without incurring additional cost during off-peak periods.
  • As a Service: One of the largest benefits organizations see in moving to a Cloud Data Warehouse is the ability to make purchases in a SaaS model. The calculations vary across vendors, but cost is generally determined monthly from a combination of usage and resource allocation. As more solutions scale compute and storage independently, flexibility increases, allowing organizations to pay only for the usage they incur.

  • IT Footprint: Many cloud services run mission-critical software and the demand for availability has increased. Unloading the responsibility of uptime to a third party has its pros and cons, but a data warehouse is often a place where periods of downtime won’t cause critical business failure. Significant cost savings can be gained by not having to buy and maintain the platform internally.
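
To make the Scale and As a Service points more concrete, here is a minimal Python sketch of the arithmetic behind "size for peak" versus "scale up only when needed." Everything in it is an assumption made for illustration: the function names, the 730-hour month, the capacity units and the rate are hypothetical and do not reflect any vendor's actual pricing.

```python
# Hypothetical illustration only: rates, units and hours are invented.
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def fixed_capacity_cost(peak_units: int, rate_per_unit_hour: float) -> float:
    """Monthly cost when capacity is sized for peak demand and left running."""
    return peak_units * rate_per_unit_hour * HOURS_PER_MONTH


def elastic_cost(baseline_units: int, peak_units: int,
                 peak_hours: int, rate_per_unit_hour: float) -> float:
    """Monthly cost when capacity runs at baseline and scales up only for peak hours."""
    if not 0 <= peak_hours <= HOURS_PER_MONTH:
        raise ValueError("peak_hours must fall within one month")
    off_peak_hours = HOURS_PER_MONTH - peak_hours
    unit_hours = baseline_units * off_peak_hours + peak_units * peak_hours
    return unit_hours * rate_per_unit_hour


if __name__ == "__main__":
    # Assumed scenario: 2 units most of the time, 8 units for 60 hours of nightly ETL.
    fixed = fixed_capacity_cost(peak_units=8, rate_per_unit_hour=2.0)
    elastic = elastic_cost(baseline_units=2, peak_units=8,
                           peak_hours=60, rate_per_unit_hour=2.0)
    print(f"Sized for peak all month: {fixed:,.2f}")
    print(f"Scaled up only for peaks: {elastic:,.2f}")
```

The gap between the two figures depends entirely on how short the peak windows are, so it is worth plugging in an organization's own workload estimates before drawing conclusions.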

Key Differentiators

  • Ingestion and Presentation of Data – Different vendors have inherent capabilities to handle different types of data.
    1. Some vendors can handle semi-structured JSON files natively. For example, Snowflake uses an extended SQL syntax to access the data directly and present it to BI or analysis tools for use.
    2. Other vendor tools rely on transformation processes, big data or data lakes for consumption and storage of the raw data at scale. These vendors then use an ETL process to transform the data into a finalized reporting structure.
  • Unified Platforms – If an organization intends to expand into big data or already manages big data, it makes the most sense to maintain a unified platform.
    1. Microsoft, AWS, IBM, etc. are diligent about making sure platforms play well together and can be used in tandem for analytics and reporting.
  • Consumption and Pricing Models – The largest differentiator for defining an organization’s cloud data warehouse strategy is the consumption and pricing model desired.
    1. Amazon Redshift pricing is based on a straight calculation of storage and compute capacity per hour.
    2. Microsoft’s Azure SQL Data Warehouse and Snowflake allow scaling of compute and storage independently. Snowflake also automatically manages the compute cluster to prevent charges when not in use, which offers savings for sporadic workloads.
    3. Google’s BigQuery charges based on the number and size of queries run against it.
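
The three consumption models above can be roughly sketched in Python as follows. This is a simplified comparison, not a reproduction of any vendor's price list: the function names, parameters and every rate in the example are hypothetical, and real bills include details (minimums, tiers, discounts) that are left out here.

```python
from typing import Dict

# Hypothetical, simplified cost models; no real vendor rates are used.


def provisioned_cluster_cost(nodes: int, rate_per_node_hour: float,
                             hours: float) -> float:
    """Capacity billed per hour for the whole cluster, busy or idle."""
    return nodes * rate_per_node_hour * hours


def separated_cost(storage_tb: float, storage_rate_per_tb: float,
                   active_compute_hours: float,
                   compute_rate_per_hour: float) -> float:
    """Storage billed by volume; compute billed only for the hours it runs."""
    storage = storage_tb * storage_rate_per_tb
    compute = active_compute_hours * compute_rate_per_hour
    return storage + compute


def per_query_cost(queries: int, tb_scanned_per_query: float,
                   rate_per_tb_scanned: float) -> float:
    """Cost driven by the number of queries and the data each one scans."""
    return queries * tb_scanned_per_query * rate_per_tb_scanned


def cheapest(costs: Dict[str, float]) -> str:
    """Return the label of the lowest-cost model."""
    return min(costs, key=costs.get)


if __name__ == "__main__":
    # Assumed sporadic workload: 10 TB stored, 100 busy hours, 1,000 queries a month.
    monthly = {
        "provisioned cluster": provisioned_cluster_cost(4, 0.5, 730),
        "separate compute and storage": separated_cost(10, 20.0, 100, 4.0),
        "per query": per_query_cost(1000, 0.5, 5.0),
    }
    for model, cost in monthly.items():
        print(f"{model}: {cost:,.2f}")
    print("Lowest for this workload:", cheapest(monthly))
```

Changing the workload assumptions (more busy hours, heavier queries) can flip the ranking, which is why the consumption pattern should drive the choice of pricing model.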

Let's Not Overlook the Challenges

Cloud or On-Prem – Organizations shouldn’t stress about the location of their data warehouse environments. Every DWaaS available offers tools to load the warehouse, and ETL jobs remain important. As complexity increases, governance, data quality and security are also considerations no matter the location of the warehouse. Managing data in motion remains an important task regardless of the chosen platform.

Advertised “Ease of Use” – Many vendors tout the ease of use of their solution. Organizations must remember there’s no magic wand: no solution is truly plug and play. Optimization tasks like choosing clustering keys, making distribution decisions and setting compression levels don’t go away. Historically, popular solutions like Netezza and Greenplum offered amazing performance boosts, but tweaking and configuration were necessary to reach top performance.

A Reliable Navigation Tool

Tech Data offers its customers a unique Proof of Value (POV) concept service engagement called 30 Days to Cloud. Customers can white label this offering and present it to end users to assist them in building a cloud strategy. The team at Tech Data provides the services and thought leadership necessary to help customers feel confident in choosing the right option.

With Tech Data’s assistance, you’ll be able to showcase cloud integration tools, DWaaS or cloud business intelligence tools designed to solve a specific business problem. This offering guides an end user through the complex landscape of cloud vendors to solve a business need. In just 30 days, end users can confidently prove out a proposed long-term strategy to move to the cloud.

Tags: Analytics, DWaaS, Data Management, Cloud Data Warehouse, Big Data Solutions