Introduction


  • OpenRefine is a powerful, free, and open source tool that can be used for data cleaning.
  • OpenRefine automatically tracks every step you take, so you can backtrack as needed and keep a record of all work done.

Working with OpenRefine


  • OpenRefine can import a variety of file types.
  • OpenRefine can be used to explore data using filters.
  • Clustering in OpenRefine can help to identify different values that might mean the same thing.
  • OpenRefine can transform the values of a column.
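
To make the clustering idea concrete, here is a minimal Python sketch of a key-collision approach similar in spirit to OpenRefine's "fingerprint" method. It is a simplified illustration, not OpenRefine's actual implementation: the function names `fingerprint` and `cluster` and the sample values are assumptions made for this example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a value to a normalized key (simplified; not OpenRefine's exact rules)."""
    text = value.strip().lower()
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, drop duplicate tokens, and sort them.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group distinct values that share a key; keep only groups with more than one value."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical sample data: three spellings of one name plus an unrelated value.
clusters = cluster(["Ruaha", "ruaha ", "Ruaha.", "Chirodzo"])
print(clusters)
```

Values that differ only in case, surrounding whitespace, punctuation, or word order end up with the same key, which is why they are proposed as candidates for merging. As in OpenRefine, a person should still review each cluster before merging it.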

Filtering and Sorting with OpenRefine


  • OpenRefine provides a way to sort and filter data without affecting the raw data.

Examining Numbers in OpenRefine


  • OpenRefine also provides ways to examine and clean numerical data.

Using scripts


  • OpenRefine tracks all changes, and this record can be exported as a script to reproduce an analysis or to apply the same steps in future analyses.
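
As a rough illustration of how the tracked steps can be reused, the Python sketch below reads an operation history in the JSON form that OpenRefine exports and prints a short summary of each step. The JSON shown is a hand-written, abbreviated example: the column names `village` and `settlement` are hypothetical, and the exact fields in a real export may differ between OpenRefine versions.

```python
import json

# Hand-written, abbreviated example of an exported operation history (assumed shape).
history_json = """
[
  {
    "op": "core/text-transform",
    "columnName": "village",
    "expression": "value.trim()",
    "description": "Text transform on cells in column village using expression value.trim()"
  },
  {
    "op": "core/column-rename",
    "oldColumnName": "village",
    "newColumnName": "settlement",
    "description": "Rename column village to settlement"
  }
]
"""


def summarize_operations(raw: str) -> list[str]:
    """Return one numbered summary line per recorded operation."""
    operations = json.loads(raw)
    return [
        f"{i}. {op['op']}: {op.get('description', '(no description)')}"
        for i, op in enumerate(operations, start=1)
    ]


for line in summarize_operations(history_json):
    print(line)
```

Keeping the exported JSON alongside the data gives collaborators a readable list of the cleaning steps, and the same JSON can be applied to another project in OpenRefine to repeat them.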

Exporting and Saving Data from OpenRefine


  • Cleaned data or entire projects can be exported from OpenRefine.
  • Projects can be shared with collaborators, enabling them to see, reproduce and check all data cleaning steps you performed.

Other Resources in OpenRefine


  • Other online examples and resources are useful for learning more about OpenRefine.