Organizations are shifting from on-premises data storage to the cloud, with the goal of profiting from the cloud’s flexibility, scalability, and affordability as quickly as possible.
In fact, Gartner predicts that “90% of data management tools and platforms that fail to support multi-cloud and hybrid capabilities will be set for decommissioning” by 2026. What’s more, 81% of respondents to our 2022 State of Data Engineering Survey projected that their company would be primarily cloud-based within 12-24 months. Contemporary cloud use is ubiquitous, and will only become more standardized moving forward.
The move to the cloud requires many careful considerations, most notably around data privacy. When data resides in the cloud, proper governance and data access controls are essential to data security. It is possible to set up and maintain these access controls on your own, but it may take more time and effort than you’d expect, particularly if your number of cloud data platforms, data sources, or data users is on track to grow.
Using our practitioner-and-partner-validated Cost Estimator, let’s take a look at what this do-it-yourself (DIY) approach would entail.
Cloud-Based Access Control by the Numbers
As you start to build your DIY access control model, there are a few important numbers that you’ll need to determine:
The first is the total number of data consumers involved in your network. This term might sound broad, but it’s for good reason. A data consumer is any user that makes use of the data involved in your storage ecosystem. These users are likely accessing data for data science or business intelligence (BI) purposes.
The next important factor is the number of protected tables in your data ecosystem. A protected table is defined as any table that has a form of restricted access applied to it. Whether the table includes personally identifiable information (PII), protected health information (PHI), or any other form of sensitive data, organizations have a responsibility to ensure the right mechanisms are in place to protect it from unauthorized access. Does your marketing team need to see customers’ health metrics or credit card information? Probably not.
Data Privacy Rules
The third integral factor to consider is the number of rules that your data is subject to. This is another purposefully broad category. A rule is any sort of government regulation, internal rule, data sharing contract, or beyond, that places some sort of restriction on who can do what with which data or at what granularity they can see the data. In today’s climate, these rules are created and expanded at an extremely high rate, with data increasingly becoming the subject of privacy-minded measures. Our State of Data Engineering Survey noted that 88% of respondents worked for an organization that was subject to one or more of these regulations.
For this example, let’s attach the following standard numbers to each of these fields:
- 50 Data Users
- 10 Protected Tables
- 10 Relevant Rules
This will help us understand what a typical organization should expect when creating cloud-based data access control functions from scratch.
Guaranteeing Compliance: Data Laws, Rules, and Regulations
The more standardized data use becomes in business and government practices, the more it will continue to be regulated. While these data protection measures are created and applied for good reason, they can very easily become difficult to keep up with.
One prevalent category of rules is compliance laws and regulations. Enforced by governments around the world, these laws take an overarching approach to protecting consumer, employee, citizen, business, and agency data. Some recognizable examples of compliance laws are the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and the Health Insurance Portability and Accountability Act (HIPAA). While these are three of the more well-known compliance laws, the full list is considerably longer.
While these laws are enforced somewhat broadly, relevant rules can also come on a more granular level. For instance, your organization may have its own internal rules regarding data use. Whether these rules pertain to user access, data storage, or beyond, they must be taken into account when building a data access control framework. Similar restrictions can arise from contractual agreements or data sharing measures between organizations. As data is continually created and shared or sold among groups, agreements are often put in place to guarantee its usage is safe and legitimate.
At the end of the day, there are many ways in which rules and regulations come into play with cloud data access controls. To continue this example, let’s assume that the following rules are relevant to your organization:
- Regulatory Measures (GDPR, CCPA, HIPAA)
- Contractual Legal Rules
- Data Sharing or Use Agreements
- By Business Line (Internal Sharing)
- Geography-Specific Rules
Cloud Framework Complexity
The last questions we’ll want to consider here are around the complexity of your cloud framework.
When subject to regulations like GDPR or contractual agreements, data access control rules need to be created and maintained specifically to achieve compliance. We’ll assume the rules we create in this cloud storage system are of average complexity, meaning secure tables are mapped to user roles where only certain rows, columns, or cells are visible to certain users, and no dynamic data masking or privacy controls are required.
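To make "average complexity" concrete, here is a minimal Python sketch of mapping a protected table to user roles so that each role sees only certain rows and columns. The table contents, role names, and the `RolePolicy` and `read_table` helpers are hypothetical and exist only for illustration; a real deployment would enforce this inside the data platform rather than in application code.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass(frozen=True)
class RolePolicy:
    """Hypothetical policy: the columns a role may see and the rows it may read."""
    visible_columns: frozenset[str]
    row_filter: Callable[[Row], bool]


# Illustrative protected table (made-up data).
PATIENTS: list[Row] = [
    {"id": 1, "region": "EU", "diagnosis": "asthma", "card_number": "4111-xxxx"},
    {"id": 2, "region": "US", "diagnosis": "flu", "card_number": "5500-xxxx"},
]

# Static role-to-table mapping: no dynamic masking, just fixed visibility rules.
POLICIES: dict[str, RolePolicy] = {
    "marketing": RolePolicy(
        visible_columns=frozenset({"id", "region"}),
        row_filter=lambda row: True,
    ),
    "eu_clinician": RolePolicy(
        visible_columns=frozenset({"id", "region", "diagnosis"}),
        row_filter=lambda row: row["region"] == "EU",
    ),
}


def read_table(rows: list[Row], role: str) -> list[Row]:
    """Return only the rows and columns the given role is allowed to see."""
    policy = POLICIES.get(role)
    if policy is None:
        return []  # deny by default when the role has no policy
    return [
        {col: val for col, val in row.items() if col in policy.visible_columns}
        for row in rows
        if policy.row_filter(row)
    ]


print(read_table(PATIENTS, "marketing"))
print(read_table(PATIENTS, "eu_clinician"))
```

In this sketch the marketing role never sees diagnoses or card numbers, which mirrors the earlier point about marketing teams not needing health or payment data. Every new role or table means another hand-written entry in `POLICIES`, which is where the DIY maintenance effort comes from.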
We must also consider how often these rules change annually. Depending on the access control model being used, data access policies may not have the capacity to automatically adapt. Many organizations currently employ an outdated form of role-based access control (RBAC) that does not possess the same dynamic nature and scalability as attribute-based access control (ABAC). This means that a change in company structure, new hires, or other similar factors could require policies and roles to be edited or completely reworked. Again, we’ll assume that the rules will change at an average rate of 5-10 times per year.
[Tip] Read more about access control’s necessary evolution towards ABAC here.
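The difference between static RBAC and ABAC can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product implements either model: the `User` and `Table` attributes, the `RBAC_GRANTS` mapping, and both check functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    department: str
    region: str


@dataclass(frozen=True)
class Table:
    name: str
    owner_department: str
    region: str


# Static RBAC (illustrative): each grant is listed by hand and must be
# edited whenever someone is hired, moves teams, or a table is added.
RBAC_GRANTS: dict[str, set[str]] = {
    "alice": {"eu_sales"},
}


def rbac_allows(user: User, table: Table) -> bool:
    """Allow access only if the user was explicitly granted the table."""
    return table.name in RBAC_GRANTS.get(user.name, set())


def abac_allows(user: User, table: Table) -> bool:
    """Allow access when user attributes match table attributes (one rule for everyone)."""
    return user.department == table.owner_department and user.region == table.region


alice = User("alice", "sales", "EU")
bob = User("bob", "sales", "EU")  # new hire: nobody has updated RBAC_GRANTS yet
eu_sales = Table("eu_sales", "sales", "EU")

print(rbac_allows(bob, eu_sales))  # False until the grant list is edited
print(abac_allows(bob, eu_sales))  # True, derived from attributes alone
```

The new hire is covered by the attribute rule with no policy edit, while the static grant list needs a manual change. That manual change is the kind of rework the 5-10 rule changes per year would multiply.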
Our final question is how often you’ll need to audit this cloud system for compliance. Data use compliance refers to any standards and regulations that govern how companies and government organizations keep data secure, private, and safe from breaches or damage. Audits are performed to examine whether organizations’ data use practices are compliant, and can take a fair amount of time and manual effort from data and legal teams. We’ll assume this organization will audit at an average rate of 3-5 times per year.
DIY Cloud-Based Access Control Results
If we take these standard example values and compute the costs assuming an average 40-hour work week and data platform owner salary of $60/hour, the yearly cost for DIY cloud-based access control would equal $933,540.
Yes, that’s correct. For one year of DIY access control creation and maintenance, it could cost your organization close to a million dollars. And this is just the monetary cost. When you factor in the hours it would take for manual updates and changes to be made to the framework whenever a new rule or organizational change arises, the hits to productivity are immense.
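To put the $933,540 figure in perspective, the short Python calculation below backs out the labor it implies. It assumes the entire estimate is labor billed at the stated $60/hour, and that a working year is 52 weeks; neither assumption comes from the Cost Estimator itself, whose internal formula is not described here.

```python
YEARLY_COST_USD = 933_540   # figure quoted in the article
HOURLY_RATE_USD = 60        # data platform owner rate quoted in the article
HOURS_PER_WEEK = 40         # work week quoted in the article
WEEKS_PER_YEAR = 52         # assumption: no allowance for holidays or leave

# Assumption: the whole cost is labor at the quoted hourly rate.
implied_hours = YEARLY_COST_USD / HOURLY_RATE_USD
implied_weeks = implied_hours / HOURS_PER_WEEK
implied_full_time_staff = implied_weeks / WEEKS_PER_YEAR

print(f"Implied hours per year: {implied_hours:,.0f}")
print(f"Implied person-weeks:   {implied_weeks:,.1f}")
print(f"Implied full-time staff: {implied_full_time_staff:.2f}")
```

Under those assumptions, the estimate corresponds to roughly seven and a half people working full time on access control for the year.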
This cost also assumes a cloud system that is static. Whether through cloud platform expansion, company growth, or the introduction of more data sources, the numbers we used are bound to rise. As growth occurs in any of these forms, this cost will also increase. While your DIY model could be workable at first, its potential to scale is limited by budget, staffing, and inevitable growth. This is not to mention the additional fines and penalties associated with issues of non-compliance that can arise if your framework is not properly configured.
Automated Cloud-Based Access Control
Through this example, we’ve seen the potential monetary and performance costs that DIY cloud-based access control would require of a modern organization. While this approach is certainly possible, it is ultimately very expensive and time-consuming, especially for organizations at the start of their cloud migration journey. It’s important to explore other options that can meet compliance needs and scale, without skyrocketing costs or massive hits to productivity.
This is where automated cloud data access control can become an essential part of an organization’s cloud ecosystem. A tool with universal cloud compatibility can apply across your cloud network, allowing for flexibility and scalability as you expand data sources and analytics tools. By using dynamic attribute-based access control (ABAC), policies can be built simply based on compliance needs, and changed or scaled as necessary. Universal data policy enforcement capabilities can also facilitate audit logs on every query, creating a system that can be checked for compliance at any point.
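The idea of an audit log on every query can be sketched as a thin wrapper that records each access decision before the query runs. This is a generic Python illustration, not Immuta's implementation: `AuditRecord`, `AUDIT_LOG`, and `audited_query` are hypothetical names, and the policy check and query runner are passed in as plain functions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class AuditRecord:
    """One log entry per query attempt, whether allowed or denied."""
    timestamp: datetime
    user: str
    table: str
    allowed: bool


# In-memory log for illustration only; a real system would persist this.
AUDIT_LOG: list[AuditRecord] = []


def audited_query(
    user: str,
    table: str,
    is_allowed: Callable[[str, str], bool],
    run_query: Callable[[str], list],
) -> list:
    """Check the policy, record the decision, then run the query if permitted."""
    allowed = is_allowed(user, table)
    AUDIT_LOG.append(AuditRecord(datetime.now(timezone.utc), user, table, allowed))
    if not allowed:
        raise PermissionError(f"{user} may not read {table}")
    return run_query(table)


# Hypothetical policy and data source used only to exercise the wrapper.
def demo_policy(user: str, table: str) -> bool:
    return user == "analyst" and table == "orders"


def demo_runner(table: str) -> list:
    return [f"row from {table}"]


rows = audited_query("analyst", "orders", demo_policy, demo_runner)
try:
    audited_query("intern", "orders", demo_policy, demo_runner)
except PermissionError:
    pass  # the denied attempt is still recorded in AUDIT_LOG

print(len(AUDIT_LOG))
```

Because denied attempts are logged as well as successful ones, the log can be reviewed at any point rather than reconstructed during each of the 3-5 yearly audits.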
An automated cloud-based access control tool can essentially eliminate the burdens associated with a DIY approach. In a head-to-head comparison with Apache Ranger, Immuta’s automated ABAC approach reduced necessary policy changes by 75x, resulting in over $300,000 in savings. With automation that removes the time-consuming manual upkeep of a homegrown model and scalability to meet changing demands, automated data access control can keep your data safe and your teams productive.
To apply your organization’s specific details to this DIY approach, you can try our Cost Estimator here. Interested in how a tool with these capabilities could become a part of your cloud ecosystem? Look no further. Learn how Immuta can facilitate secure data access without any unnecessary excess by trying a self-guided demo.