IATI Consultations Archive

Live discussions and consultations can be found at discuss.iatistandard.org.

Working with Organisation XML files in Google Refine

The latest versions of Google Refine are able to open XML files. 

Whilst we currently have a CSV export option for IATI Activity files, no equivalent mapping from Organisation files to CSV is provided. However, the simple format of the Organisation files makes Google Refine an ideal way to open them.

This screencast steps through the process: http://screencast-o-matic.com/watch/clneD74id

The steps involved

1) Install version 2.5 or later of Google Refine from: http://code.google.com/p/google-refine

Refine is a free download. When you launch it, it opens an interface in your web browser. However, the software itself runs on your computer, and the data you work with in Refine is stored locally.

2) Create a new project, and choose 'Get data from Web Addresses (URLs)'

3) Enter the address of an IATI Organisation XML file here. You can get the addresses from the 'Download' links on the IATI Registry. Click Next> to get Refine to fetch the file.

This page should list just the IATI Organisation files: http://iatiregistry.org/dataset?filetype=organisation

4) Refine will display the XML structure. Hover your mouse over blocks such as '<total-budget>' or '<document-link>' to choose which element each row of your table will represent.

You need the yellow highlight box to cover all the elements you want in a row. You may need to scroll down to find an example of document-link or other elements you are looking for.

If you are interested in more than one sort of element from a file then you will need to run through this process multiple times to generate the flattened tables you need.

5) Once you have selected the elements you want, Refine will display a preview of the resulting table. If it looks correct, choose 'Create Project' from the top of the screen. Otherwise press the 'Pick Record Elements' button to try a different selection.

6) With your Google Refine table open you can manipulate the data directly, or use the 'Export' menu at the top-right of the screen to export your newly generated table to Excel or some other spreadsheet via the CSV (Comma-separated values) format. 
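
For readers who prefer a script, the flattening that Refine performs in steps 4 to 6 can be approximated in Python. This is only a minimal sketch, not what Refine does internally: the sample XML is a made-up fragment loosely modelled on an Organisation file (not an authoritative schema), and the `flatten` and `to_csv` helpers and the column-naming scheme are assumptions made for illustration. It produces one row per chosen element, such as '<total-budget>', with a column for each attribute and text value found inside it.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely modelled on an IATI Organisation file.
# Element and attribute names here are illustrative assumptions.
SAMPLE_XML = """<iati-organisations>
  <iati-organisation>
    <iati-identifier>XX-EXAMPLE-1</iati-identifier>
    <total-budget>
      <period-start iso-date="2012-01-01"/>
      <period-end iso-date="2012-12-31"/>
      <value currency="USD">1000</value>
    </total-budget>
    <total-budget>
      <period-start iso-date="2013-01-01"/>
      <period-end iso-date="2013-12-31"/>
      <value currency="USD">2000</value>
    </total-budget>
    <document-link url="http://example.org/report.pdf" format="application/pdf">
      <title>Annual report</title>
    </document-link>
  </iati-organisation>
</iati-organisations>"""


def flatten(xml_text, record_tag):
    """Return one dict per <record_tag> element, with a column for each
    attribute and each non-empty text value found inside it."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        row = {}
        # record.iter() yields the record itself, then all its descendants.
        for element in record.iter():
            # Column names use the record tag, then the descendant's tag.
            if element is record:
                prefix = record_tag
            else:
                prefix = f"{record_tag}/{element.tag}"
            text = (element.text or "").strip()
            if text:
                row[prefix] = text
            for name, value in element.attrib.items():
                row[f"{prefix}/@{name}"] = value
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialise rows as CSV text, using the union of all column names
    in the order they were first seen."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One flattened table per element type, as in the steps above.
budget_rows = flatten(SAMPLE_XML, "total-budget")
budget_csv = to_csv(budget_rows)
print(budget_csv)
```

As with Refine, each kind of element needs its own pass: calling `flatten(SAMPLE_XML, "document-link")` would give a separate table for the document links.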

 

Did this process work for you? Add questions, clarifications or comments below...

2 Comments

  • Michael Roberts

    Excellent.  Looks great Tim.

  • Tim Davies

    Via @tfmorris (http://bit.ly/wJxK2F): No need to pick one field at a time.  Tell Google Refine top level XML element & it will unpack all children into columns.
