An Ongoing Challenge for AI/ML in the Traditional Enterprise: Getting Data to the Cloud

Artificial Intelligence (AI) and Machine Learning (ML) are among the “buzziest” of the buzzwords in the technology sector, especially in the cloud market. The hyperscalers and the ISVs that surround them (Immuta included) lean on these terms heavily in their marketing material and campaigns. For the hyperscalers in particular, so-called “advanced services” represent the front lines of competition, and giants including Amazon Web Services (AWS), Microsoft, and Google are investing aggressively to win mind share and, hopefully, market share for these services. AWS, for example, has taken measures in the past to drive adoption of key services that it sees as differentiators, namely the KRADL services (Kinesis, Redshift, Aurora, DynamoDB, Lambda). Similarly, one of the key themes at the most recent AWS re:Invent was machine learning, with a number of highly promoted announcements relating to Amazon SageMaker and to simplified ways to train machine learning systems and adopt new technology. More recently, AWS has evolved its acronyms, with DAIMLA surfacing as shorthand for “Data, AI/ML, Analytics”.

The rate of innovation in the AI/ML space is astonishing.  Add to that major trends around serverless and microservices, as well as more widespread acknowledgement that “hybrid” is a real thing, and we’ve got a market with endless possibilities and growth potential along multiple vectors.  

However, there’s a key inhibitor to the growth of cloud-based AI adoption… data. Data is the fuel that drives advanced analytics, and it is ultimately what allows sophisticated algorithms, and the data scientists who build them, to create business value. Today’s enterprises are betting on this business value to gain a competitive advantage over organizations that are less aggressive about putting analytics at the core of their business, regardless of industry. According to Gartner, the primary business value that AI will drive in the near term is customer experience improvement, followed by efficiency improvements. As AI use becomes more prevalent, it is expected to be a key driver of new revenue streams.

This brings us to the primary point of this post. While AI should be at the top of every business’s priority list, it is fundamentally ineffective unless the data exists and is accessible under the right circumstances. Immuta enables businesses to overcome the challenges of compliant access to data, such as those associated with HIPAA or GDPR, and to manage the risk of data misuse. However, in the context of cloud, if the data isn’t available there, then cloud-based analytics and many of the new services that the cloud providers have been introducing may be ineffective. Immuta runs anywhere, so data availability will never slow us down!

To simplify market segmentation, we can look at organizations as falling into one of two categories:

  1. Cloud-native – meaning that they were established in the recent cloud economy, and run most if not all of their IT systems on a public cloud platform.
  2. Traditional enterprise – typically a more mature business that was established prior to the rise of the public cloud providers and that now faces cloud adoption and migration challenges.

Gartner states that by 2023, 75% of all databases will be on a cloud platform. However, 2023 is still a long way out, and much of the mission-critical data that powers business remains on-premises today. Traditional enterprises face several issues here: getting data to the cloud, organizing that data once it is there, providing compliant access to the right data per organizational policies, and then exposing that data to the analytical tools best suited for the job.

Getting data to the cloud is often the longest pole in the proverbial “tent” since data is often messy and large. Another critical issue that we see is that data was traditionally protected by homegrown on-premises systems that will not exist in the cloud. Organizations sometimes realize too late that once they migrate their data to the cloud, they can’t protect it the way they used to, and they need to consider different, more flexible strategies to enable true data analytics. The data lake has become a popular way to land data in cloud-based databases and storage.

Depending on the size and scope of the data sets to be moved, it can make sense for organizations to enlist a cloud professional services organization that can offer a comprehensive move plan covering all aspects of the migration, including organizational change, cost, ROI, business impact, and ultimately the physical movement of the appropriate data sets. Note that moving analytical processes to the cloud for the sake of moving them is not a sound strategy. There should be a business justification for doing so, such as improved agility, lower costs, or increased security. Businesses need to evaluate which of these common cloud benefits make the most sense for them on a workload-by-workload basis.
