It is no longer new to say that artificial intelligence (AI) needs the free flow of data. Data feeds AI systems, enabling them to learn patterns, solve new problems, and keep evolving. How far this process can be optimized depends largely on a system's ability to obtain huge volumes of diversified and up-to-date data. This is how the free flow of data helps AI. In addition, as explained below, the free flow of data lowers operational costs for businesses and researchers, which is particularly important for small and medium-sized AI firms, and enables more efficient data integration and analysis, further invigorating the AI innovation ecosystem.
Data localization blocks the free flow of data by imposing legal requirements that force organizations to store data locally. While the impact of data localization policies varies among countries depending on the level of restriction, in general data localization hampers AI development in four ways: it limits the data available, raises business costs through duplicative IT services and other compliance activities, creates barriers in a digital world that needs to be seamless to work at its best, and erodes the advantages of digital centralization.
China has been widely criticized for its strict data localization rules. Over the past decade, China has mandated data localization through a number of sectoral statutes, including those relating to finance, online ride-hailing services, insurance, credit reporting, and population health data. Its Cybersecurity Law, effective June 1, 2017, appears to escalate China’s data localization regime to a new level: Article 37 requires all operators of Critical Information Infrastructure (CII) to store within the territory of China the personal information and important data generated from their operations in China. If the data must be transferred abroad, the law requires CII operators to pass a security assessment under government rules. As of this writing, China has not yet clarified the legal definitions of key concepts such as CII and important data, which could be interpreted broadly to cover many categories of information generated by telecommunications, broadcasting, big data, cloud computing, IoT, industrial control systems, and government systems. It is worth noting that a draft regulation released earlier, in April of the same year, the Measures on Security Assessment of Cross-border Data Transfer of Personal Information and Important Data, seeks to extend the data localization requirement to all network operators, not just CII operators. Compared with many other countries, China’s data localization approach seems to sit at the most restrictive and sweeping end of the spectrum, with a focus on governmental control.
How May Data Localization Hurt China’s Own AI Development?
Data localization may hinder AI development in several ways. The most immediate is that it directly limits the global supply of data needed to train AI systems, and thus undercuts the building blocks on which AI is built. To be clear, such a limitation is not necessarily a bad thing, depending on the legislative motivation and how it is implemented. For example, it can be good if the focus is on privacy protection, since we probably do not want a blind pursuit of AI development based on the unethical exploitation of people’s personal and sensitive data. This is the approach taken by the European Union’s General Data Protection Regulation. Still, from a technical point of view, AI would learn and evolve better if it fed on data from a hundred countries around the world rather than from ten countries in, say, Europe alone, because the data sources would be more plentiful and diversified. Maximizing this benefit requires an open and mutually trusted global ecosystem that encourages the international sharing of data and innovations. Many policy makers in the West have already started to negotiate bilateral or multilateral trade agreements to facilitate cross-border data transfer. If China seeks to become a global leader in the AI industry, it needs to participate actively in this global conversation and earn the trust of other countries by sharing its data, not closing itself off.
The second problem is that data localization may impede the global expansion of Chinese companies and AI researchers. For example, Alibaba DAMO Academy, a global research and development initiative funded by China’s leading tech company Alibaba, has set up research sites in Beijing, Hangzhou, Singapore, Tel Aviv, Seattle, Sunnyvale and New York, as well as joint labs and research programs worldwide. It cannot enjoy the full benefit of this global network if China’s data localization requirements prevent its data from being freely transferred among these international locations and processed and analyzed wherever is most convenient.
Lastly, data localization may diminish the competitiveness of China’s AI-related market. Despite the vagueness of China’s laws and regulations, many foreign companies, including Apple and Amazon’s cloud service AWS, now use local Chinese data centers to store the data generated by their China businesses. They also have to maintain a separate system and portal specifically for the China market. For example, AWS now provides two types of user accounts: one for Chinese users and a general account for the rest of the world. The general account cannot be used to access resources in the AWS China Regions, and users who wish to use the AWS China Regions must sign up for a separate set of account credentials unique to AWS China services. Chinese cloud service providers, such as Alibaba Cloud, can be expected to face the same kind of complication if they want to enter the global market.
Cloud computing and data centers are critical infrastructure for the AI industry, enabling researchers and developers to use computational power more efficiently to develop and run AI models. As the AWS case shows, data localization rules that create barriers in computing and data systems compromise the scalability benefits of cloud computing and data centers, increase operational costs for service providers, and accordingly diminish market vitality and competitiveness in China. It is local users, including AI developers and researchers, who ultimately bear the cost. In Brazil, data localization has been found to raise the cost of using cloud services by 54 percent when a user must use a local cloud service provider. Nor does it do local AI businesses any good: growing up in a “protected” market where their foreign competitors are denied a level playing field, they would lack the competitiveness needed to thrive in the global market.
Where May China’s Data Localization Policy Go?
Recent discussions in China reveal a growing domestic awareness of the drawbacks of data localization, with calls for more flexible and relaxed data policies to foster economic development and deepen engagement with the international market. For example, China held an international forum in May to explore legal issues relating to AI and cross-border data transfer. Quite a few government officials and industry leaders made public speeches appealing for a more flexible regime on cross-border data transfer to foster the AI industry. At the end of August, a major internet research center affiliated with the Shanghai Academy of Social Sciences, a think tank with a government background, published an in-depth report pointing out the positive effects of cross-border data transfer in promoting economic growth, national innovation and globalization, and arguing that China’s current policy on cross-border data transfer is too conservative and incompatible with the status of China’s digital economy. In particular, it stated that personal information and important data should be subject to separate regulations, because they implicate two distinct types of protected interests and security risks. The report proposed that the cross-border transfer of personal information be subject to a self-regulatory mechanism under government supervision to protect privacy, and that important sensitive data be regulated by a layered administrative assessment mechanism to protect national interests.
Zhou Hanhua, a leading Chinese scholar, also argued in a 2018 paper that existing policies in China misapply the national security standard to the area of personal information protection. Zhou added that personal information law should focus on protecting individual rights rather than national security interests. Otherwise, Zhou said, it would cause serious unintended consequences, hamper big data and international trade businesses, and fail to actually protect national cybersecurity.
The Chinese government seems to have started to echo these voices. As mentioned, China published a draft rule in 2017 proposing a security review mechanism for the outbound transfer of personal information and important data. With no progress on that draft since then, in June 2019 China released another draft rule, the Measures on Security Assessment of the Cross-border Transfer of Personal Information, which would apply only to personal information, not important data, and which centers more on safeguarding individual rights. Meanwhile, the outbound transfer of important data was briefly addressed in a separate draft rule released shortly before, at the end of May, the Measures on Data Security Management, with a focus on security risk assessment. These two documents reveal the government’s intention to distinguish the regulation of personal data from that of important data. Furthermore, in August, the State Council released the government plan for a pilot free trade zone in Shanghai, which states that an administrative system for data security would be established and tested for cross-border data transfer in support of industries such as integrated circuits, artificial intelligence, biomedicine and other critical areas.
***
In a notice released in July 2017 issuing the Development Plan on the New Generation of Artificial Intelligence, the State Council identified AI as a major driving force for China’s economy and set the aim of leading the world in AI technology by 2030. As this essay has discussed, strict data localization rules can backfire and hurt a country’s own AI industry. Chinese policy makers seem to have recognized the problem and started to explore a controllable way to facilitate cross-border data transfer. More rules can be expected in the near future to set out the details of implementation.