Tackling the Big Data Challenge
Alliance Seeks to Help Business, Government Adopt Best Practices

The Cloud Security Alliance has formed a big data working group to address privacy and security challenges among organizations. What are those challenges and how does the group aim to tackle them?

J.R. Santos, global research director for CSA, along with CSA Chief Operating Officer John Howie and Arnab Roy, research member of Fujitsu Laboratories of America, spoke with Information Security Media Group about the working group, detailing the challenges present in big data today.

"One of the things is that big data is gathered from diverse endpoints," Roy says in an interview with GovInfoSecurity's Eric Chabrow [transcript below]. "Data aggregation and dissemination have to remain secure and inside the context of a formal understandable framework."

Availability of the data is also a key issue, Roy explains.

Another concern, says Howie, is the various regulatory compliance obligations organizations have. "How do various privacy regimes and legislation impact your ability to collect and process that data and then disseminate the results?" he says.

The working group will focus on six specific themes around big data:

  • Big data-scale crypto;
  • Cloud infrastructure;
  • Data analytics for security;
  • Framework and taxonomy;
  • Policy and governance;
  • Privacy.

In developing its short-term goals, Santos says that the working group wants to get a baseline set of best practices prepared, "and then take a look at some of the research proposals that are out there and see if there are any opportunities to collaborate with government or industry to execute on research initiatives."

CSA is a not-for-profit organization with a mission to promote the use of best practices for providing security assurance within cloud computing, and to provide education on the uses of cloud computing to help secure all other forms of computing. The alliance is led by a broad coalition of industry practitioners, corporations, associations and other stakeholders.

The working group is chaired by Sreeranga Rajan of Fujitsu Laboratories of America and co-chaired by Neel Sundaresan of eBay and Wilco Van Ginkel of Verizon.

Big Data Security, Privacy Challenges

ERIC CHABROW: What are the security and privacy challenges organizations face because of big data?

ARNAB ROY: One of the things is that big data is gathered from diverse endpoints. There are more types of active vendors, providers and consumers, as well as data owners, for example, mobile users, social network users and so on. One of the things that has to be taken care of is that data aggregation and dissemination have to remain secure and inside the context of a formal understandable framework. This should be part of the contract that has to be provided to data owners.

Availability of data to data consumers is an important aspect in big data. The SLA needs to address this. The searching and filtering of data is also important, since not all of the massive amount of data may be accessed. What capabilities does the provider offer in this respect? That needs to be answered. One of the most important points is that the balance between privacy and utility needs to be fully analyzed. Big data is most useful when it can be analyzed for information.

One more important point is that since there's a separation between data owners, providers and data consumers, the integrity of data coming from endpoints has to be ensured. In other words, data poisoning has to be ruled out.

Lastly, a key big data question is how quickly cloud providers can migrate the customer to another site when security's compromised.

J.R. SANTOS: The one thing to add to that is we have some issues around computing on encrypted data in these environments, detection and so forth, and I think Arnab touched on analytics as well. But the big thing right now is just trying to understand things such as key management and ownership of data. Those are some of the issues that I wanted to add to that.

JOHN HOWIE: One of the other concerns that any user of big data will have is the various regulatory compliance obligations they have. We're all very familiar with some of the particular privacy requirements in Europe, and big data might actually accrue from multiple endpoints around the globe. How do various privacy regimes and legislation impact your ability to collect that data, process that data and then disseminate the results? That would have to be factored in as well, with the appropriate controls designed and considered to meet your various compliance obligations.

Big Data Working Group: 6 Themes

CHABROW: In announcing the formation of the group, CSA identified six specific themes: big data-scale crypto, cloud infrastructure, data analytics for security, framework and taxonomy, policy and governance, and privacy. First, what do you mean by big data-scale crypto?

ROY: There are a number of topics in cryptography and the Internet infrastructure that need to be addressed. Many of these are research questions, but for communication protocols, data-centric security, privacy of big data and management of keys, you need to look at data integrity and poisoning concerns. How do you automatically detect those? You need to search and filter through encrypted data because there's a massive amount of data and you cannot look at everything at the same time. You have to collect and aggregate data in a secure manner. You have to ensure that users can collaborate in a secure way.

Suppose your cloud holds some data and you sometimes need to be assured that it indeed holds the data that it's supposed to hold. You need to prove that. How do you do that in a cryptographically secure way? How do you outsource computation in a secure way using cryptographic technology? Broadly, there are three new directions that these topics would take. First, you need to look at the large volume of data, which requires changing the designs and taking new directions on the topics. Second, you need to look at streaming data, data that's coming in fast. You need to adapt your technology to that. Thirdly, the data is pretty diverse. How do you take care of that?
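To make Roy's question about proving that a cloud "indeed holds the data" concrete, here is a minimal sketch of one simple challenge-response check. It is an illustration only, not a method described by the working group: the function names (make_challenges, respond, verify) are hypothetical, and the scheme assumes the data owner can precompute a limited number of one-time challenges before uploading the data.

```python
import hashlib
import hmac
import secrets
from typing import List, Tuple


def make_challenges(data: bytes, count: int) -> List[Tuple[bytes, bytes]]:
    """Owner side: before uploading, precompute (nonce, expected_digest) pairs.

    Each pair is meant to be used once; the owner keeps these small pairs
    and can then discard the local copy of the data.
    """
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)  # fresh random value per challenge
        expected = hashlib.sha256(nonce + data).digest()
        challenges.append((nonce, expected))
    return challenges


def respond(stored_data: bytes, nonce: bytes) -> bytes:
    """Provider side: hash the nonce together with the data actually stored."""
    return hashlib.sha256(nonce + stored_data).digest()


def verify(expected: bytes, response: bytes) -> bool:
    """Owner side: compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    original = b"customer records " * 1000
    nonce, expected = make_challenges(original, count=3)[0]

    # A provider holding the exact data passes the check.
    print(verify(expected, respond(original, nonce)))

    # A provider holding altered data fails it.
    tampered = original[:-1] + b"?"
    print(verify(expected, respond(tampered, nonce)))
```

Because the nonce is unpredictable, the provider cannot answer from a cached digest and has to read the full data at challenge time. The sketch does not scale to the volumes Roy describes, since every challenge requires hashing the entire data set, which is part of why he frames this as an open research direction.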

Cloud Infrastructure

CHABROW: Next on the list was cloud infrastructure.

HOWIE: With cloud infrastructure, some of the key objectives are these. First, we want to identify the scope of the cloud infrastructure platform, its various entities and the attack surface they comprise. We also want to identify techniques for attack surface modeling, analysis and reduction, and determine new issues that could arise in the cloud. The third thing is to identify new problems and techniques that can make attack surface analysis an effective methodology for presenting the security picture.


CHABROW: Rather than going through all six, why don't we just focus on one other at the moment, privacy?

HOWIE: There's this issue whereby big data is inevitably going to pull from a number of endpoints, as we heard earlier. The question is whether or not you can actually move that data into an analytics system legally. What are the privacy frameworks and requirements? What are the contracts between the data subjects and the data owner? What are the contracts in place between - to use a European term - a data controller and a data processor to permit the movement of data around in order to process it? And then it's not just protecting it from disclosure or misuse at the point of collection, but also at the point of processing, both intermediate processing and at the very end when the analytics are delivered. Those are very serious questions. At the end of the day, as legislation will evolve and change over time, you need to make sure that you design a series of sustainable, scalable controls which can be deployed within any system to ensure that a privacy regime can be maintained and is flexible enough to accommodate changes in privacy legislation almost anywhere in the world at any point in time.

CHABROW: Is that a big problem, because the view of privacy here in the United States may be different than the view of privacy in Europe?

HOWIE: Yes, and also it's bigger than that. It's not just differing views on privacy and what privacy means in the U.S. versus the EU. Even within the European Union, under the current system where you have a directive which is implemented in national law, there are actually differences in what you can do with data as enacted in national legislation in each of the 27 countries, so what might be legal in one might not be legal in another.

Another example within the U.S. is the concept of breach notification, where there's a state-based patchwork of legislation. There's not a federal breach notification law for most types of data, and so as a result you actually find that what you do in California in terms of breach notification might be different in Washington, and that might actually impact where and how you collect and process that data to try and sidestep a breach notification requirement. That may actually be an issue as well.

Then lastly, in the Asia Pacific region we're seeing a number of different privacy regimes growing there. You've got some countries which are following the European model. You've got some which are developing their own model, and some which actually candidly have absolutely no privacy protection whatsoever that's worth really that much to consumers. Any company that's working globally and analyzing its global customer base or its global supply chain base has to actually accommodate a very wide and varied number of privacy regulations worldwide.

Goals and Objectives

CHABROW: What are your objectives? What are your short-term goals? What are your long-term goals for the big data working group?

SANTOS: Some of our short-term goals are to really try to get a baseline set of best practices out there and identify the gaps that can't be solved with current technologies, and then take a look at some of the research proposals that are out there and see if there are any opportunities to collaborate with government or industry to execute on research initiatives. That's kind of our short-term goal. In 2013-2014, we're going to start looking at creating privacy test beds and participating in work related to standards development organizations (SDOs), making sure that we're linking with major players and developing liaison relationships with some of the major SDOs to standardize big data security and privacy best practices. We'll take some of the things that we established early on and look at how that impacts international standards, and really try to execute on any of the research funding initiatives that we may run into as we look for those opportunities in the upcoming months.

In a nutshell, that's what we plan to do. The goal of this group is really to help industry and government adopt these best practices.

CHABROW: How are you going to disseminate the information and the findings of your group?

SANTOS: It could be through a number of different channels. One of the things I mentioned earlier was the relationships that we have with SDOs, looking at developing standards around these practices that we develop. In addition to that, we also work closely with our network of corporate members, which consists of not only cloud users but also, mostly, cloud providers. A lot of it is going to be through a number of different tools that we've developed. John mentioned things like our Cloud Controls Matrix, a popular tool being used within the industry currently, in addition to things like our CAIQ questionnaire, which is used in most cases as an assessment tool for the end-user. Then there are things like CSA STAR, where providers will be posting their answers to CAIQ and our controls matrix. There are going to be a number of ways we deliver it through current tools and mechanisms that CSA has to offer, in addition to general involvement with folks in the industry and through governments and standards development organizations.

CHABROW: How would you judge whether what you do is successful or not?

SANTOS: That's a good question. Success to me - and this is my opinion only - is really just adoption. If people are utilizing the tools and the best practices that we put out there, then I think that's success in my eyes. We're helping solve a big problem and it's a collaborative effort between folks in the industry and government that are coming together to solve this. For me to see that is success in my eyes.
