Democratizing Data: AI tools for discovery

With Julia Lane
Mannheim, September 29th 2022

The call for better data and evidence for decision-making has become very real in the US, as evidenced by the passage of both the Foundations for Evidence-Based Policymaking Act (Evidence Act) and the CHIPS+ Act, establishing a National Secure Data Service. The challenge to be addressed is finding out not just what data are produced but how they are used – in essence, to build an Amazon for data – so that both governments and researchers can quickly find the data and evidence they need.

Julia Lane will provide an overview of a massive effort over the past five years to find out how data are being used, which questions they are used to answer, and who the experts are, by mining information that is hidden in plain sight – in the text of scientific publications, government reports and public documents.

Just as with Amazon, the results are enormously powerful. The pilot, which is sponsored by agencies such as NSF’s National Center for Science and Engineering Statistics (NCSES) and the Department of Education’s National Center for Education Statistics (NCES), has generated a prototype API and a dashboard. With these, for example, agencies can document dataset use for Congress and the public, program managers can identify investment opportunities rapidly, and researchers can more easily build on existing knowledge rather than redoing things from scratch. To paraphrase Lew Platt’s aphorism about HP: “If government knew what government knows, it would be three times more productive”.

In this workshop we will not just present the results, but also offer hands-on tutorials for working with the resulting API, dashboard, Jupyter Notebooks and visualization tools. The goal is to inspire BERD researchers to collaborate – or to develop new approaches to building an Amazon for European scientific and public data.

About the Instructor

Julia Lane is a Professor at the NYU Wagner Graduate School of Public Service. She was a senior advisor in the Office of the Federal CIO at the White House, supporting the implementation of the Federal Data Strategy. She cofounded the Coleridge Initiative, whose goal is to use data to transform the way governments access and use data for the social good through training programs, research projects and a secure data facility.