It is often argued that we lack data on international migration. Although there are large gaps in countries’ knowledge about migration, it is also true that there is an abundance of relevant data being produced every day, every time we use a mobile device or internet services. This article explores how big data can be useful for migration policymakers, and discusses both the potential and the limitations of using these new data sources.
Migration has risen to the top of the political agenda in most countries around the world. The 2030 Agenda for Sustainable Development, which sets the global development objectives to be achieved by 2030, explicitly recognises that migration and development are inextricably linked and that migration can be a force for good, if it takes place in a safe, regular and orderly fashion through the implementation of appropriate, fact-based policies. Better data and evidence on migration are essential to design forward-looking policies that can allow countries to realise the opportunities offered by migration and effectively tackle its challenges. This is also emphasised in the final draft of the Global Compact for Safe, Orderly and Regular Migration (GCM), to be formally adopted in December 2018, which states that enhancing the evidence base on migration will be essential “to guide coherent policymaking and well-informed public discourse” and to monitor and evaluate the implementation of the GCM commitments once the document is adopted.
Paucity of migration data
The paucity of migration data has long been recognised, and traditional statistical systems are often not sufficiently equipped to meet the growing needs for evidence for implementation and monitoring of the Sustainable Development Goals and, soon enough, of the GCM commitments. There are still wide gaps in the quantity and quality of migration data globally, particularly in countries in the Global South, where resources for statistical activities may be very limited. Many countries do not include key questions in relation to migration in their national population censuses. Household surveys are costly and hardly ever include migrants, particularly those who are undocumented. Administrative sources, such as residence and work permits, are not regularly analysed and disseminated in many countries.
The gaps in migration data may seem paradoxical at a time of unprecedented abundance of data globally. In fact, there are massive amounts of data generated by users of digital devices and internet-based platforms, and collected in real time, at very little cost, by private companies. A growing number of studies and applications are showing how these non-traditional data sources could provide valuable insights on migration and human mobility. However, significant challenges, such as access to and continuity of the data, confidentiality and security risks, as well as technical and methodological issues, are preventing more systematic uses of big data sources for migration analysis.
Social media platforms
Several studies and experiments based on the combination of traditional and new sources or methods are demonstrating the value of data innovation in the field of migration and mobility. One of the most recent examples shows how data obtained from social media advertising platforms, such as the one offered by Facebook, can provide almost real-time information on the number of users Facebook classifies as ‘expats’ – individuals living in a country other than their (self-reported) ‘home country’ – in a specific country or on a global scale, at a certain point in time. This means that Facebook can be used to ‘nowcast’ changing trends in migration and mobility, almost as if it were a ‘real-time’ census. For instance, the increasing trend in the number of Venezuelan migrants in Spain between 2017 and early 2018, reported by the Spanish National Statistical Office, is clearly reflected in Facebook data on Venezuelan ‘expats’ living in Spain. Also, although only about 30% of the world’s population uses Facebook, the total number of ‘expats’ counted by Facebook in February 2018 was 273 million – remarkably close to the UN official estimate of 258 million international migrants globally in mid-2017. Apart from changing trends, Facebook data may also help to identify the skills, educational background or interests of recently arrived ‘expats’ (based on self-reported information), which are generally not available from official statistics.
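To make the comparison above concrete, the short Python sketch below computes how far the Facebook ‘expat’ count lies from the UN estimate, using only the two figures quoted in the text (273 million and 258 million). The function name `relative_gap` is an illustrative choice, not part of any cited study, and the calculation says nothing about whether the two sources count the same people.

```python
# Figures quoted in the text, both in millions of people.
facebook_expats_feb_2018 = 273
un_migrant_stock_mid_2017 = 258


def relative_gap(estimate: float, benchmark: float) -> float:
    """Return the gap between estimate and benchmark as a share of the benchmark."""
    return (estimate - benchmark) / benchmark


gap = relative_gap(facebook_expats_feb_2018, un_migrant_stock_mid_2017)
print(f"Facebook count exceeds the UN estimate by {gap:.1%}")  # 5.8%
```

A gap of this size is small in aggregate terms, but the two figures refer to different dates and different definitions, so agreement at the global level does not guarantee agreement for individual countries.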
Geo-tagged social media activity can also be used to estimate international migration flow patterns globally, of which our knowledge is currently very limited, given that only about 45 countries in the world are able to report international migration flow statistics to the UN Statistics Division. For example, data from the professional social media platform LinkedIn can be used to produce a digital mapping of the workforce and individuals’ occupational profiles – including those of migrants – in different countries and regions, particularly in locations or sectors where use of this platform is relatively high. Some scholars have analysed Twitter data to compare internal and international migration patterns, disaggregated by age and sex based on user self-reported information. Content shared by users on social media can also help analyse public opinion on migration or even aspects of migrant inclusion. Instagram and Google Trends data can also potentially be used as an early-warning system of migratory movements. Interesting work by the SoBigData team has also analysed the consumption patterns of migrants in grocery stores to study migrant integration.
Combining different data sources
Migration-relevant insights can also emerge from the combination of different data sources. At a Big Data for Migration workshop organised by IOM’s Global Migration Data Analysis Centre and the European Commission’s Knowledge Centre on Migration and Demography in November 2017, some researchers showed how call detail records from mobile phones and satellite imagery can be combined to map movements of individuals across countries, at the subnational level. Mobile phone data on international calls made by individuals in a certain community and traditional statistics from population censuses can also be combined to analyse patterns of migrant integration and urban segregation. Mobile phone call detail records have also been used to identify patterns of transnationalism, that is, individuals living and working in more than one country. Big data sources can therefore be helpful to measure more fluid migration and mobility patterns, such as transnationalism and circular migration, going beyond the definition of an international migrant based on change in the country of usual residence.
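As a rough illustration of how call detail records might reveal people who are active in more than one country, the Python sketch below flags users observed on several distinct days in at least two countries. Everything in it is hypothetical: the records, the user identifiers, the country codes and the `min_days` threshold are invented for illustration and do not reproduce the method of any study cited here.

```python
from collections import defaultdict

# Hypothetical, anonymised call detail records: (user_id, country_code, day).
records = [
    ("u1", "EE", 1), ("u1", "EE", 2), ("u1", "FI", 3), ("u1", "FI", 4),
    ("u2", "EE", 1), ("u2", "EE", 2), ("u2", "EE", 3),
    ("u3", "EE", 1), ("u3", "FI", 2),
]


def multi_country_users(records, min_days=2):
    """Return users seen on at least min_days distinct days in two or more countries."""
    days_seen = defaultdict(set)  # (user, country) -> distinct days observed
    for user, country, day in records:
        days_seen[(user, country)].add(day)

    countries = defaultdict(set)  # user -> countries with sustained presence
    for (user, country), days in days_seen.items():
        if len(days) >= min_days:
            countries[user].add(country)

    return sorted(user for user, seen in countries.items() if len(seen) >= 2)


print(multi_country_users(records))  # ['u1']
```

The threshold matters: with `min_days=1`, a single day abroad would be enough to flag a user, which would mix short trips with the sustained multi-country presence the text describes.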
To summarise, new data sources and innovative methods offer opportunities to fill some of the gaps in migration data from traditional sources. This is because they have some unique characteristics, linked to their size, the velocity at which they are generated, and the richness of information that can be extracted from such complex data sets. Big data have wide coverage because they relate to all users of mobile devices and internet-based platforms, including individuals who may be hard to reach through traditional survey methods. They are generated and collected in real time, and can be frequently updated, so their analysis can provide timely insights for policymakers. They can be obtained at a relatively low cost – provided the companies collecting and storing them are willing to share them – and they can offer insights that cannot be generated through the analysis of traditional data.
However, the opportunities presented by data innovation for migration are mirrored by significant challenges:
- There are legal and ethical issues to be addressed, such as data confidentiality and protection of fundamental rights. Mobile phone and social media data are generated automatically by users of these devices or platforms, so guarantees about how their data will be processed and used are necessary. Use of artificial intelligence and machine learning for public policy decisions can also pose risks to fundamental rights, as described in a recent focus paper by the EU Agency for Fundamental Rights.
- Extracting meaningful insights from new data sources can be technically and methodologically difficult. These are large volumes of data, often complex and ‘noisy,’ requiring advanced analytical methods and capacities. Furthermore, these data are not collected according to rigorous statistical standards and only reflect the behaviour of users of mobile devices and platforms, so may not be fully representative of the entire population.
- Access to the data collected by the private sector, and continuity of the data are issues preventing a more systematic use of such data sources for policymaking.
- Big data sources rarely make it possible to distinguish international migrants, as per the UN definition, from other ‘mobile’ populations.
Big Data for Migration Alliance
In an effort to systematically tackle these challenges, raise awareness about uses of new data sources and methods in the field of migration, and encourage more research and experimentation in this area, IOM’s Global Migration Data Analysis Centre and the European Commission’s Knowledge Centre on Migration and Demography convened a Big Data for Migration Alliance (BD4M). The Alliance was launched on 25 June 2018 in Brussels as a cross-sectoral network of organisations and individuals from the private sector and the research, statistical and policymaking community, who all share an interest in realising the value of using new data sources in migration analysis and decision-making, and concretely addressing the challenges presented by these new methods.
There is currently no dedicated unit tasked with investigating the potential of big data and new data sources in the area of migration. Global and national initiatives on the topic seem scattered, and the new bodies or mechanisms created to harness the data revolution for sustainable development at UN- and EU-level do not specifically focus on realising the potential of big data for measurement of human mobility. The BD4M aims to fill these gaps by:
- facilitating the creation of new forms of partnerships between the private and the public sectors to address data access issues;
- demonstrating the potential of new data sources to respond to specific policy needs, particularly when data from traditional sources may not be available;
- establishing a dialogue between policymakers, scientists, data providers and regulators to tackle privacy and ethical issues.
The BD4M would also encourage the creation of a network of ‘data stewards’ or ‘data focal points’ within different organisations. This idea, inspired by the GovLab at the New York University Tandon School of Engineering (also a member of the Alliance), is intended to build the trust required for private-public collaborations to flourish and to realise the potential of big data to improve evidence on migration and mobility, and informed policymaking, around the world.
To conclude, many years have passed since the term ‘Big Data’ first caught the attention of those who saw enormous advantages in new technologies for managing huge amounts of data. The private sector has understood the numerous opportunities connected to big data, and many companies are now aware that their market leadership may depend on it.
Real-time analysis is now also a top priority for governments that want to modernise internal migratory data management and processes, as well as to develop evidence-based policies with an empirical approach. Exploring opportunities for better data-driven decision-making processes will guide governments to move rapidly towards a more agile and cost-effective e-Governance. However, the process of fully revolutionising traditional migration data management systems still seems to have far to go before achieving its full potential.
Handling the abundance of data
The first real challenge that governments must face is technical and relates to their capacity to collect, store, manage and analyse the sheer volume of big data available. The amount of data available today is staggering: smartphones, credit cards, visas, airline tickets, frequent flyer information, social media, web interactions and geo-localisation data are only some of the many possible sources of migratory data. These data, if collected and linked together accurately, and then analysed and interpreted with advanced analysis models, can provide, for example, reliable forecasts on future trends and events.
Mapping data at different levels
To get the most out of big data, countries must first map all forms of migratory data available at national and international level. Furthermore, they must develop rapid adaptation mechanisms in consideration of the fast-changing nature of all types of migratory movements, identify all possible new sources available, assess their reliability, and finally define clear policies to ensure that data are protected, and analyses are shared according to pre-set objectives and legal safeguards.
There are many questions that States must ask themselves to assess the challenges and consider the opportunities related to better use of big data. The technological progress of the last decade has already made life easier for data analysts, who can now draw on advanced technologies. But further developments are still to come. For example, how can big data expedite and improve decision-making processes using predictive models? How can these models improve the efficiency and effectiveness of border management and immigration responses based on a more precise risk analysis of the various multifaceted migratory flows?
To explore these questions and related issues, leading experts from governments, international organisations and the identity solutions ecosystem will brainstorm together in the 5th edition of the Border Management & Identity Conference (BMIC) to be held in Bangkok, Thailand from 11‑13 December 2018. Organised by the International Organization for Migration (IOM) and the Asia Pacific Smart Card Association (APSCA), and supported by the Ministry of Foreign Affairs of Thailand, this is the largest gathering in Asia of national government authorities including immigration, identity, border control, civil registration, customs, population management and other agencies with responsibilities in the area of border and identity management.
The objective of the 5th BMIC is to improve border and identity management in the Asian region through closer consultation and cooperation between national authorities responsible for border control, national identity and their key interlocutors at international level. The conference will include different workshops to discuss experiences, challenges and new developments in the following four areas:
- Managing Trusted Identities in a Decentralised World;
- New Ways to Collect, Manage and Use Data;
- Ensuring End-to-End Trust in Identities & Credentials;
- New Approaches Using Mobile Solutions.
1 United Nations. Transforming our world: the 2030 agenda for sustainable development. [Accessed 26 July 2018].
2 Global Compact for Migration (2018). Global Compact for Safe, Orderly and Regular Migration. [Accessed 26 July 2018].
3 Migration data sources. [Accessed 26 July 2018].
4 Zagheni, E., Weber, I. and Gummadi, K. (2017). Leveraging Facebook’s Advertising Platform to Monitor Stock of Migrants. Population and Development Review, Vol. 43 (4), pp. 721-734.
5 Spyratos, S., Vespe, M., Natale, F., Weber, I., Zagheni, E. and Rango, M. (2018). JRC Technical Reports: Migration Data using Social Media: a European Perspective. [Accessed 26 July 2018].
6 International Organization for Migration (2018). GCM Data Bulletin on Big Data and Migration. [Accessed 26 July 2018].
7 State, B., Rodriguez, M., Helbing, D. and Zagheni, E. (2014). Migration of Professionals to the U.S.: Evidence from LinkedIn data. [Accessed 26 July 2018].
8 Rango, M. and Vespe, M. (2017). Big Data and Alternative Data Sources on Migration: from Case Studies to Policy Support. [Accessed 26 July 2018].
9 Zagheni, E., et al. (2014). Inferring International and Internal Migration Patterns from Twitter Data. [Accessed 28 July 2018].
10 Connor, P. (2017). Can Google Trends Forecast Forced Migration Flows? [Accessed 28 July 2018].
12 So Big Data: Exploratories. [Accessed 28 July 2018].
13 Ahas, R., Silm, S. and Tiru, M. (2017). Tracking Transnationalism with Mobile Telephone Data. [Accessed 28 July 2018].
14 European Union Agency for Fundamental Rights (2018). Big Data: Discrimination in data-supported decision making. [Accessed 28 July 2018].
15 European Commission (2018). Big Data for Migration Alliance – BD4M. [Accessed 28 July 2018].
16 Verhulst, S. (2018). Data Stewards: Data Leadership to Address 21st Century Challenges. [Accessed 28 July 2018].
17 Further information about the Border Management & Identity Conference [Accessed 28 July 2018].
Frank Laczko is the Director of IOM’s recently established Global Migration Data Analysis Centre. He was previously based in Geneva, where he led IOM’s Migration Research Division. He is the co-chair of the Data and Research Group of the Global Migration Group, editor of IOM/Springer Global Migration Issues book series and co-editor of Migration Policy Practice, a journal for migration policymakers and practitioners.