There is a small but steadily growing consensus across a number of standards bodies to adopt a shared methodology for identifying organisations.
This methodology splits identifiers into two parts: a registration agency and a registration identifier. IATI maintains (and continues to build) a code list of registration agencies. The registration identifier is issued by the agency.
This methodology is well established within IATI for the identification of NGOs and private sector organisations. The exception to this is the way in which we have utilised OECD DAC identifiers for those institutions recognised by the DAC. To standardise these identifiers it is proposed that:
- The OECD DAC is recognised as a registration agency with code XM-DAC.
(All agencies are prefixed with ISO country codes: the XM indicates a multilateral/international agency)
- DAC donor codes should be adopted
At the outset IATI used ISO country codes in place of DAC donor codes. This should be corrected.
The government of the United Kingdom (as a donor) is currently represented as "GB". This should now change to XM-DAC-12.
- DAC agency codes should be modified to be consistent with donor codes
UK DFID is currently referenced in IATI as "GB-1". This should now change to XM-DAC-12-1.
- DAC delivery channel codes should be prefixed with the DAC registration agency code (XM-DAC)
UNDP is currently referenced in IATI as "41114". This should now change to XM-DAC-41114.
This is a major change to how most big donors and implementing agencies are identified. For publishers of data this is a one-off change which we believe is in the long-term interests of the standard. For users of data this will result in disruption in the short-term, and IATI will put in place a cross-referencing utility to ease the pain for users of data.
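As a rough illustration of what a cross-referencing utility could look like, here is a minimal Python sketch. The table name `LEGACY_TO_XM_DAC` and the function `upgrade_ref` are hypothetical, and the table holds only the three examples quoted above; a real utility would need the full DAC code lists.

```python
# Hypothetical cross-reference table: legacy IATI ref -> proposed XM-DAC ref.
# Only the three examples from the proposal are included.
LEGACY_TO_XM_DAC = {
    "GB": "XM-DAC-12",        # UK government as a donor
    "GB-1": "XM-DAC-12-1",    # UK DFID
    "41114": "XM-DAC-41114",  # UNDP
}


def upgrade_ref(ref: str) -> str:
    """Return the new-style identifier for a legacy ref.

    Refs that are not in the table are returned unchanged (whitespace
    stripped), so identifiers that already follow the methodology pass through.
    """
    cleaned = ref.strip()
    return LEGACY_TO_XM_DAC.get(cleaned, cleaned)


if __name__ == "__main__":
    for legacy in ("GB", "GB-1", "41114", "XM-DAC-12"):
        print(legacy, "->", upgrade_ref(legacy))
```

Passing unknown refs through unchanged is a design choice for this sketch; a stricter utility might instead report them for manual review.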
The DAC codelists also allow the use of non-identifying codes: 998 - Developing countries, unspecified; 20000 - NON-GOVERNMENTAL ORGANISATIONS (NGOs) AND CIVIL SOCIETY; etc. These are currently not included in the IATI codelist (but do appear in data sets). There should be a policy on this, I think.
Related: if a publisher does not want to disclose an organisation, they could use a code like 20000, to indicate "an NGO, somewhere". Especially with the proposal to make either an id or a text for an organisation mandatory, there should be a common approach to dealing with such non-disclosed partners.
I support this proposal.
It would require an update to the Organisational ID guidance to introduce 'XM' for multilateral registries, and at the same time we may want to introduce another X- namespace in place of the current 'MISC'.
The other issue though to consider here is replacement patterns for IDs, which I don't think have ever been very neatly agreed upon.
If we did not permit '-' within an ID and instead replaced it with '_' (e.g. 12-1 would become '12_1') then consuming applications can more easily split Organisation and Activity IDs back into their component parts...
Tim, something along these lines is definitely worth including now, given we are proposing changes with quite far-reaching ramifications for publishers and users alike.
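A minimal Python sketch of this replacement pattern, to show why it helps a consuming application. The function names are hypothetical, and it assumes that every registration agency code has exactly two hyphen-separated segments (as in XM-DAC), which the guidance may or may not guarantee.

```python
def normalise_local_id(local_id: str) -> str:
    """Replace '-' inside the agency-issued identifier with '_'."""
    return local_id.replace("-", "_")


def build_org_id(agency: str, local_id: str) -> str:
    """Join an agency code and a normalised identifier with a single '-'."""
    return f"{agency}-{normalise_local_id(local_id)}"


def split_org_id(org_id: str) -> tuple[str, str]:
    """Split an organisation ID back into (agency code, identifier).

    Assumption: the agency code is exactly two segments, so a well-formed
    ID has exactly three '-'-separated parts once '-' is banned elsewhere.
    """
    parts = org_id.split("-")
    if len(parts) != 3:
        raise ValueError(f"cannot split unambiguously: {org_id!r}")
    return "-".join(parts[:2]), parts[2]


if __name__ == "__main__":
    org_id = build_org_id("XM-DAC", "12-1")
    print(org_id)                # XM-DAC-12_1
    print(split_org_id(org_id))  # ('XM-DAC', '12_1')
```

With the un-normalised form XM-DAC-12-1 the same split yields four parts, and the function cannot tell where the agency code ends without extra knowledge.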
The objective is to get a unique (or at least a unique first occurrence) delimiter between the organisation identifier prefix and the suffix that specifies the activity.
Wondering whether the less painful option is to use "_" as the org/activity delimiter.
But either way, I agree.
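For comparison, the "_" as org/activity delimiter option could be sketched as follows. The function name and the sample activity identifier are made up for illustration, and the sketch assumes that "_" never appears inside the organisation identifier itself, which would need checking against existing identifiers.

```python
def split_activity_id(activity_id: str) -> tuple[str, str]:
    """Split on the FIRST '_' into (organisation ID, activity suffix).

    Only the first occurrence is treated as the delimiter, so the
    activity suffix may itself contain further '_' characters.
    """
    org_id, sep, suffix = activity_id.partition("_")
    if not sep or not org_id or not suffix:
        raise ValueError(f"no org/activity delimiter in {activity_id!r}")
    return org_id, suffix


if __name__ == "__main__":
    # Hypothetical activity identifier, for illustration only.
    print(split_activity_id("XM-DAC-12-1_PROJ-001"))
    print(split_activity_id("XM-DAC-12-1_PROJ_001"))
```

Under this option the existing '-' characters inside organisation identifiers stay as they are, which is why it may be the less disruptive of the two.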
Another approach might be to introduce a "vocabulary" attribute, similar to how sector works, and put this inside the participating-org tag. It could allow inclusion of multiple ids (and other information) to be added, to provide more data for reconciliation services. It would also clearly separate the id (and its formatting conventions) from the registrar.
Rolf, I disagree with splitting the two parts of the organisation identifier, as by the same logic the activity identifier would be split into three components. This increases the complexity of finding an activity from its identifier (essential for traceability). It also makes the essential concept of a globally unique activity identifier more complicated.
Bill, I'd say: if we're working on rules for formatting the id string to enable validation of part of that string, we've proven the case for splitting it at the conceptual level in the standard.
Indeed, by the same logic it could make sense to split the activity identifier. But the situation for organisation ids is different:
- The id part is a code list maintained by an external entity (outside IATI). The code list for registration agencies (registrars) is IATI-specific. Splitting the components makes validating the registrar code against the code list "business as usual" rather than a special case with its own logic.
- There will be various registration agencies in the combined IATI data, as well as many organisations with missing ids. Pointing to someone else's activity based on the id they choose and use in their own IATI data is a lot simpler than pointing to another organisation based on no definitive "single source of truth".
For both the organisation id and the activity id, you would know that when two sets of registrar-code and registry-id are the same, they point to the same thing. It makes it possible and trivial to concatenate them in your datastore if you like.
I'll happily try to make the case for splitting activity ids too :-) but for organisation ids (this topic), I think there are compelling reasons to do this.
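To make the point about split components concrete, here is a small Python sketch. The names `OrgRef`, `KNOWN_REGISTRARS` and `registrar_is_known` are hypothetical, and the one-entry set merely stands in for the IATI registration agency code list.

```python
from typing import NamedTuple


class OrgRef(NamedTuple):
    """An organisation reference held as two separate components."""
    registrar: str    # code from the IATI registration agency code list
    registry_id: str  # identifier issued by that agency

    def concatenated(self) -> str:
        """Join the components for datastores that want a single string."""
        return f"{self.registrar}-{self.registry_id}"


# Stand-in for the IATI registration agency code list (assumption).
KNOWN_REGISTRARS = {"XM-DAC"}


def registrar_is_known(ref: OrgRef) -> bool:
    """Validate the registrar against the code list with a plain lookup."""
    return ref.registrar in KNOWN_REGISTRARS


if __name__ == "__main__":
    a = OrgRef("XM-DAC", "12-1")
    b = OrgRef("XM-DAC", "12-1")
    print(a == b)                 # same pair, same organisation
    print(a.concatenated())
    print(registrar_is_known(a))
```

No string parsing is needed here: the registry id may contain any characters, and the registrar check is an ordinary code list lookup.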
There is a significant problem in IATI and other standards around organisation identifiers for public bodies. We need to address this problem. However, although it has some merits, we do not support this proposal in its present form.
We need to develop a fully fleshed-out proposal on organisation identifiers, particularly for public bodies. Otherwise, we risk asking for significantly breaking and disruptive adjustments to be made now and then potentially again in the future. We would suggest setting up a working group on this and developing a full paper for consideration by the Steering Committee. We would be happy to be involved in such a working group along with others.
Splitting the "ref" into two parts, as Rolf suggests, may be worth considering (Bill's counterargument is also well made). Attempting to parse refs by splitting them based on particular characters ("-", "_", etc.) and thereby prohibiting whichever character you choose from the rest of the string may be problematic if an existing identifier uses one of those characters.
We will be going ahead with this proposal, but will flag up concerns raised.
Over the past two years the secretariat has done extensive work on this issue, including commissioning an external study. The approach to date has been premised on the discovery of a global system or methodology which IATI could join. This work has proved fruitless. There was sympathy in the TAG Montreal session on Joined Up Data for IATI to make the running in the absence of other standards. As a result, a proposal is being prepared for IATI, Open Contracting and, hopefully, a broader consortium of the willing to adopt a methodology that can be implemented now.
We cannot afford to procrastinate any longer, for two reasons. Firstly, adopting a consistent methodology requires re-engineering for publishers and users of data alike. This is already a big job, and in 12-18 months' time, when we get to the next integer upgrade, it will be reaching unworkable proportions. Secondly, the quality of data is already suffering immensely without a means of identifying public entities. Once we have a standard methodology that applies to all organisations we can start to think about enforcing meaningful compliance.
Bill, it would be great to see the external study if you have a link - I couldn't find it on the IATI website. Does it deal specifically with public bodies?