Democratizing Data: AI tools for discovery by Julia Lane

The call for better data and evidence for decision-making has become very real in the US, as evidenced by the passage of both the Foundations for Evidence-Based Policymaking Act (Evidence Act) and the CHIPS+ Act, which establishes a National Secure Data Service. The challenge is to find out not just what data are produced but how they are used, in essence to build an Amazon.com for data, so that both governments and researchers can quickly find the data and evidence they need.

Julia Lane will provide an overview of a massive effort over the past five years to find out how data are being used, what questions they are used to answer, and who the experts are, by mining information hidden in plain sight: the text of scientific publications, government reports and public documents.

Just as with Amazon, the results are enormously powerful. The pilot, which is sponsored by agencies such as NSF’s National Center for Science and Engineering Statistics (NCSES) and the Department of Education’s National Center for Education Statistics (NCES), has generated a prototype API and a dashboard. With these, for example, agencies can document dataset use for Congress and the public, program managers can rapidly identify investment opportunities, and researchers can more easily build on existing knowledge rather than redoing things from scratch. To paraphrase Lew Platt’s aphorism about HP: “If government knew what government knows, it would be three times more productive.”

In this workshop we will not just present the results but also offer hands-on tutorials on the resulting API, dashboard, Jupyter Notebooks and visualization tools. The goal is to inspire BERD researchers to collaborate, or to develop new approaches to building an Amazon.com for European scientific and public data.

free, no prerequisites

In-Person Workshop, Mannheim University Library

29th September 2022

3 p.m. – 6 p.m.