Category Archives: News

Named entity linking on unstructured texts using spaCyOpenTapioca

Because unstructured textual data comes in large volumes, many researchers in business and economic studies want to process it automatically. A common use case is data scraped from the Internet. Researchers can process such data with a technique called named entity linking, which finds concepts in texts (e.g., organisations, persons and locations) and links them to entities in a knowledge base.

UB Mannheim developed spaCyOpenTapioca, a pipeline for named entity linking in spaCy using OpenTapioca. It has low computational requirements and links concepts to entities in Wikidata. The open source code is available on GitHub, together with a Jupyter notebook and a reproducible Binder environment.

Let’s apply spaCyOpenTapioca to the sentence “Christian Drosten works in Charité, Germany.” It correctly identifies Christian Drosten as a person with Wikidata ID Q1079331, Charité as an organisation with Q162684 and Germany as a location with Q183. The results can also be visualised.
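The example can be tried with a short script along the following lines. This is only a minimal sketch: it assumes the spacyopentapioca package is installed and that the machine can reach the OpenTapioca web service. The helper names summarise_entities and link_entities are made up for illustration and are not part of the library.

```python
from typing import Iterable, List, Tuple


def summarise_entities(ents: Iterable) -> List[Tuple[str, str, str]]:
    """Return (surface text, knowledge-base ID, label) for each entity span."""
    return [(ent.text, ent.kb_id_, ent.label_) for ent in ents]


def link_entities(text: str) -> List[Tuple[str, str, str]]:
    """Run a blank English spaCy pipeline with the 'opentapioca' component.

    Hypothetical helper: spaCy is imported lazily so that the module can be
    loaded even where spaCy / spacyopentapioca are not installed.
    """
    import spacy  # requires: pip install spacyopentapioca

    nlp = spacy.blank("en")
    # The component sends the text to the OpenTapioca web API (needs network).
    nlp.add_pipe("opentapioca")
    return summarise_entities(nlp(text).ents)


if __name__ == "__main__":
    for row in link_entities("Christian Drosten works in Charité, Germany."):
        print(row)
```

Each printed row holds the matched text, its Wikidata ID and the entity label, which can be compared with the IDs quoted above.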

INCONECSS Community Meetings Focussing on Research Data

Every three months, ZBW – Leibniz Information Centre for Economics invites you to its INCONECSS Community Meetings, where you can delve into current topics in the field of economics and business information and share your experience with others.

The upcoming INCONECSS Community Meetings will take place on:

Sep 27, 2021: Research Data: Representations, Analytics and Visualizations
Dec 13, 2021: Trainings and Games Related to Research Data

Sign up and further information:

An Integrated Data Framework Guides Policy in Times of Dynamic Economic Shocks

Sudden and unforeseen shocks can cause incalculable and fast-changing economic dynamics to which policy makers need to respond quickly – as we have experienced during the COVID-19 pandemic. At the same time, data from traditional statistics that usually guide policy decisions are only available with non-negligible time delays. This leaves policy makers uncertain about how to most effectively manage their economic countermeasures to support businesses.

Given this information deficit, our colleagues from ZEW – Leibniz Centre for European Economic Research propose a framework that guides policy makers through all stages of an unforeseen economic shock by providing timely and reliable data as a basis for informed decisions. The framework combines early-stage ‘ad hoc’ web analyses, ‘follow-up’ business surveys, and ‘retrospective’ analyses of firm outcomes.

Learn more about the proposed integrated data framework in the recently published discussion paper.

The Mannheim Web Panel introduces novel semi-structured webpage data at company level

Company websites are an important source of economic data and can be used for various scientific approaches, such as predicting firm innovativeness or examining market entry strategies. However, the content of these websites changes over time, so continuous monitoring is required to capture the changes.

For this reason, the ZEW – Leibniz Centre for European Economic Research has been scraping the content of corporate websites since 2018 in a panel format, with updates every three to six months.

Find out more about the data and how you can access it on our new Mannheim Web Panel page!

Geographic pattern of product innovator firm prediction - Mannheim Web Panel results

Does your data project comply with privacy protection regulations? iVA helps you to find out

Working with data requires attention to data privacy in order to protect individuals. For researchers, however, it can be very demanding to identify which privacy protection regulation is binding and under which conditions it applies to their own work, as legal issues are usually not central to their area of expertise.

To offer researchers and other people working with data an entry point for understanding these important privacy law issues, the BERD@BW team developed an interactive Virtual Assistant (iVA). iVA leads you through the regulations with a series of questions and provides a result based on your answers.

BERD@BW: interactive Virtual Assistant (iVA) helps researchers to understand data privacy regulations

The first part of iVA, which helps you examine whether privacy protection regulations apply to your data project, was recently updated and can be accessed here: (in German)

While iVA currently addresses whether privacy protection regulations apply to you, we are already working on an extension that covers the issue in more depth. We look forward to presenting a second part of iVA, which will let you check whether you are allowed to process personal data and what requirements you have to keep in mind.

RaiseWikibase is presented at ESWC 2021

The European Semantic Web Conference (ESWC) is a major venue for semantic technologies. ESWC 2021 took place online on 6-10 June and had many interesting contributions. BERD was happy to attend ESWC 2021 and to present our new tool RaiseWikibase.

Our poster “RaiseWikibase: Fast inserts into the BERD instance” was presented by Renat Shigapov. RaiseWikibase is a Python tool for speeding up knowledge graph construction and data integration using Wikibase. In our paper, we carried out a performance analysis and showed an example of knowledge graph construction with a few million German companies. Take a look at our open source code, one-minute video and preprint.