Transforming Naturally Occurring Text Data into Economic Statistics

text analysis
economic statistics

Turrell, Arthur, Bradley Speigner, Jyldyz Djumalieva, David Copple, and James Thurgood. “6. Transforming Naturally Occurring Text Data into Economic Statistics.” In Big Data for Twenty-First-Century Economic Statistics, pp. 173-208. University of Chicago Press, 2022. doi: 10.7208/chicago/9780226801391-008


Bank of England

Bradley Speigner

Bank of England

David Copple

Bank of England

Jyldyz Djumalieva

Data Science Campus


January 2022



Using a dataset of 15 million UK job adverts from a recruitment website, we construct new economic statistics measuring labour market demand. These data are ‘naturally occurring’, having originally been posted online by firms. They offer information on two dimensions of vacancies (region and occupation) that firm-based surveys do not usually, and cannot easily, collect. These data do not come with official classification labels, so we develop an algorithm that maps the free-form text of job descriptions into standard occupational classification codes. The resulting vacancy statistics give a plausible, granular picture of UK labour demand and permit the analysis of Beveridge curves and mismatch unemployment at the occupational level.
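To give a flavour of what mapping free-form job text to occupation codes can look like, here is a minimal sketch that scores an advert against short occupation descriptions using TF-IDF weights and cosine similarity. It is an illustration under stated assumptions, not the algorithm from the chapter: the codes `A1` to `A3`, the description text, and the function names are all hypothetical placeholders, and a real classifier would work from the official classification's own titles and descriptions.

```python
import math
import re
from collections import Counter

# Hypothetical occupation codes and descriptions (placeholders,
# not an extract from any official classification).
OCCUPATIONS = {
    "A1": "software developer programmer engineer",
    "A2": "nurse nursing care hospital",
    "A3": "chef cook kitchen restaurant",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency for every token in docs."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; tokens outside the vocabulary are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def classify(advert, occupations=OCCUPATIONS):
    """Return (code, score) for the best match, or (None, 0.0) if none."""
    docs = {code: tokenize(text) for code, text in occupations.items()}
    idf = build_idf(list(docs.values()))
    query = vectorize(tokenize(advert), idf)
    scores = {
        code: cosine(query, vectorize(tokens, idf))
        for code, tokens in docs.items()
    }
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, 0.0)
```

For example, `classify("Senior software engineer wanted")` matches the placeholder code `A1`, while an advert that shares no vocabulary with any description returns `(None, 0.0)` rather than a forced label.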



@incollection{turrell2022transforming,
  title={6. Transforming Naturally Occurring Text Data into Economic Statistics},
  author={Turrell, Arthur and Speigner, Bradley and Djumalieva, Jyldyz and Copple, David and Thurgood, James},
  booktitle={Big Data for Twenty-First-Century Economic Statistics},
  pages={173--208},
  year={2022},
  publisher={University of Chicago Press},
  doi={10.7208/chicago/9780226801391-008}
}