European AI with Indian Data: The illusion of low-hanging fruit

The study “The Role of Indian Data for European AI” investigates the potential benefits of closer collaboration on data exchange between India and Europe. It explores the Indian AI landscape, Germany’s need for access to larger data pools, and the requirements for making cooperation possible. While no quick wins are to be expected from closer cooperation, the long-term prospects are promising, provided that several important obstacles are cleared first. India’s new regulation on data privacy and security, due to be passed soon, appears to be a make-or-break issue.

After a period of sluggish progress in the Euro-Indian strategic partnership, the EU-India summit held in July 2020 reaffirmed that both sides are keen to deepen their cooperation in security, the environment, innovation and public health. The two large democratic blocs, together accounting for roughly a quarter of the world’s population and GDP, see huge potential in closer cooperation. One area that seems particularly promising is innovation and artificial intelligence (AI). The leaders of India and Germany also placed AI cooperation prominently on their agenda, as reflected in the joint statement issued after their last intergovernmental consultations in November 2019.

This push also builds on the assumption that data from India might be valuable for promoting AI development in Europe. As India is a vast and diverse country, it produces a lot of data that might complement data available in Europe, and vice versa. But is this assumption valid?

Acknowledging the strategic importance of close cooperation between the EU and India, this study, prepared for the Bertelsmann Stiftung by CPC Analytics, an Indo-German AI consultancy, investigates whether cooperation on data between India and Germany/the EU can boost the German and European AI ecosystem. It also asks whether a realistic case can be made for AI cooperation, and especially for cross-border data exchange between India and Germany/the EU.

The study focuses on the situation in India to understand the potential for data collaboration, based on the actor and regulatory landscape in the country. As a first step, it sketches the AI landscape in India. This overview is then complemented by interviews with relevant actors in the field, which shed light on the practitioners’ perspective on using cross-border data to build AI. Because a regulatory environment that allows data exchange is a necessary condition for such exchange to happen on a large scale, the study also describes the current regulatory situation and planned changes.

The analyses of India’s digitalization efforts and of the private sector landscape point towards a rapidly evolving environment for AI. India’s government has launched a series of initiatives to support the country’s digital development. Useful data is increasingly being collected, and the country certainly has the talent pool and industry to make use of this data. These factors make cooperation on data for AI between Germany and India promising. Yet major obstacles remain, including regulatory and data availability issues. These obstacles are discussed in the study, which also takes a detailed look at two sectors, health and e-commerce, finding that collaboration possibilities vary greatly between them.

The potential for cooperation is considerable but, with regard to large-scale data exchanges (i.e., across a wide variety of sectors and companies), remains theoretical for now. Quick implementation, let alone quick economic benefits, does not seem realistic at the moment. This situation is unlikely to change before there is clarity on the effects of India’s Personal Data Protection (PDP) bill, which is not expected until 2022.