OpenRefine

Candice McGowan

Do you have a lot of messy text data? Does it have a lot of spelling mistakes or variations? OpenRefine, a free, open-source tool, can help you clean your data. In this introductory workshop, we will cover what OpenRefine is, when it’s useful (as well as when it’s challenging to use), and how to do some basic data cleaning with OpenRefine on a demo dataset. Bring your laptop so that you can follow along!

Link to dataset.

Link to OpenRefine.

Facilitator(s): Mark Christensen, Susan Atkey, Larissa Ringham, Milena Constanda

ICSTI Text and Data Mining

Jeremy Frey, Professor of Physical Chemistry, Head of Computational Systems Chemistry, University of Southampton, UK

Audrey McCulloch, Chief Executive, Association of Learned and Professional Society Publishers (ALPSP) and Director of the Publishers Licensing Society

Ellen Finnie, Head, Scholarly Communications & Collections Strategy, MIT Libraries

Michael Levine-Clark, Dean and Director of Libraries, University of Denver

Text and Data Mining (TDM) facilitates the discovery, selection, structuring, and analysis of large numbers of documents and data sets, enabling results to be visualized in new ways that support innovation and the development of new knowledge. In both academic and commercial contexts, TDM is increasingly recognized as a means to extract, re-use, and leverage additional value from published information by linking concepts, addressing specific questions, and creating efficiencies. But TDM in practice is not straightforward: its methodology and use are changing fast, and the development of enabling policies has not yet kept pace.

This webinar provides a review of where we are today with TDM, as seen from the perspective of the researcher, library, and licensing-publisher communities.

Link to the webinar.
