Currently the File Format codelist is a subset of available mime types.
We can see no reason why this should not be the full set of available mime types. However, our knowledge on this is sketchy, so we would appreciate community help in deciding whether this is correct and where we would find a definitive list. We would then see this as an 'external' list that is maintained by someone other than IATI.
This may be a useful start: http://en.wikipedia.org/wiki/Internet_media_type
The authoritative list is maintained by IANA at http://www.iana.org/assignments/media-types
I'm not sure if a structured dataset of this is maintained, or whether we would need to regularly scrape these pages to keep an IATI code list in sync.
We propose that IATI recognises the full list of two-part codes as maintained by IANA at http://www.iana.org/assignments/media-types.
This task was not completed in the 1.03 process, although it had been marked as planned, so it has been bumped into 1.04.
I don't think it's practical or useful to scrape IANA's list. However, I do agree that it's important that IATI recognises the full IANA list. Therefore, I've not imported any more IANA codes, but have added a note to our codelist to say that it is only a partial list, and that any mimetype registered with IANA is also acceptable.
The information that this is a partial codelist is also available through the complete="0" attribute in the XML - https://github.com/IATI/IATI-Codelists/blob/version-1.04/xml/FileFormat.xml
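As a rough illustration of how a validator might use that attribute, here is a minimal sketch. The sample XML is a hypothetical, trimmed-down codelist: only the complete="0" attribute comes from the discussion above, while the element layout (codelist-items, codelist-item, code), the function name check_code and the three result labels are assumptions made for the example. The pattern only checks that a value looks like a two-part type/subtype code; it does not confirm that the code is registered with IANA.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down codelist. Only complete="0" is taken from the
# thread; the element names and nesting are assumptions for illustration.
SAMPLE_CODELIST = """\
<codelist name="FileFormat" complete="0">
  <codelist-items>
    <codelist-item><code>application/pdf</code></codelist-item>
    <codelist-item><code>text/csv</code></codelist-item>
  </codelist-items>
</codelist>
"""

# Loose shape check for a two-part "type/subtype" code.
# It does NOT check that the code is actually registered with IANA.
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
TWO_PART = re.compile(rf"^{_NAME}/{_NAME}$")


def check_code(codelist_xml, value):
    """Return 'listed', 'unlisted' or 'invalid' for a file format value."""
    root = ET.fromstring(codelist_xml)
    codes = {el.text.strip() for el in root.iter("code") if el.text}
    if value in codes:
        return "listed"
    # A partial codelist cannot rule a value out, so only its shape is checked.
    if root.get("complete") == "0" and TWO_PART.match(value):
        return "unlisted"
    return "invalid"
```

With a partial list, the best a validator can report for an unlisted value is "well-formed but not confirmed".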
Surely, for anyone validating data, having a complete codelist of valid values is important?
From a quick look at the source of the XML serialisation of the lists at https://www.iana.org/assignments/media-types/media-types.xml (the front view is cleverly styled with XSLT), there is nice structured data there that should be easy to pull in with XPath/XQuery, with no scraping required.
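A minimal sketch of what that extraction could look like, using the XPath subset in Python's standard library. The sample document is a hand-written stand-in for media-types.xml, not a copy of it: the namespace URI, the nested registry/record/name layout and the function name media_types are assumptions that should be checked against the real file before relying on this.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the IANA registry XML; verify against the real file.
NS = {"iana": "http://www.iana.org/assignments"}

# Hand-written stand-in for media-types.xml (structure is an assumption).
SAMPLE_REGISTRY = """\
<registry xmlns="http://www.iana.org/assignments" id="media-types">
  <registry id="application">
    <record><name>pdf</name></record>
    <record><name>json</name></record>
  </registry>
  <registry id="text">
    <record><name>csv</name></record>
  </registry>
</registry>
"""


def media_types(registry_xml):
    """Return two-part codes built from each sub-registry id and record name."""
    root = ET.fromstring(registry_xml)
    found = []
    # Each direct child registry is assumed to hold one top-level type.
    for group in root.findall("iana:registry", NS):
        top_level = group.get("id")
        for name in group.findall("iana:record/iana:name", NS):
            if name.text:
                found.append(f"{top_level}/{name.text.strip()}")
    return found
```

Fetching the live file and writing the codes out in the IATI codelist format are left out here; names in the real registry may also need cleaning before use.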
Thanks for pointing that out Tim. It looked more like http://web.archive.org/web/20130920193129/http://www.iana.org/assignments/media-types last time I was considering pulling the list in.
Given that we have this in a nice structured format, I will look into pulling this in for 1.04 - https://github.com/IATI/IATI-Codelists-NonEmbedded/issues/7