Please see the motivation for this revised proposal in Bill Anderson's comment below.
- IATI organisation identifiers should:
  - be globally unique
  - be constructed from a consistent methodology
  - be compatible with other data standards
- IATI activity identifiers
  - must be globally unique
  - must be persistent
  - must be prefixed with an organisation identifier
- IATI organisation identifiers CAN NOT be persistent. Therefore when a reporting organisation's identifier changes:
  - The previous reporting organisation identifier(s) should be reported in addition to the new one
  - Previously reported activities should maintain their original identifier. The activity identifier should be prefixed
- Use of organisation identifiers [added 18-09-2014]
  - ALL publishers of IATI data MUST have a valid organisation identifier reported in reporting-org/@ref
  - When the @ref attribute is used in participating-org/@ref, transaction/provider-org/@ref or transaction/receiver-org/@ref, it must contain a valid organisation identifier. If you do not have one, then the narrative element may be used to describe the organisation. [amended 29-09-2014]
- The reporting-org element is MANDATORY.
- ALL the following rules must apply to the organisation-identifier in reporting-org/@ref
  - It is mandatory
  - The agency prefix MUST be a valid code in the IATI OrganisationRegistrationAgency code list
  - The identifier MUST be the same as that recorded by the publisher on the IATI Registry
  - The identifier MUST only contain alphanumeric characters and hyphen, underscore, colon or period
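The character rule above could be checked with a short pattern match. This is only a sketch: the function name `has_only_allowed_characters` is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits, which the proposal does not state explicitly.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
# Allowed punctuation, per the rule above: hyphen, underscore, colon, period.
_ALLOWED = re.compile(r"[A-Za-z0-9\-_:.]+")


def has_only_allowed_characters(identifier: str) -> bool:
    """Return True if the identifier is non-empty and uses only allowed characters."""
    return _ALLOWED.fullmatch(identifier) is not None
```

A check like this covers only the character rule; it does not confirm that the agency prefix is on the OrganisationRegistrationAgency code list or that the identifier matches the one on the IATI Registry.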
IATI activity identifier
- The iati-identifier is MANDATORY
- It MUST be globally unique among all activities published through the IATI Registry
- Once an activity has been reported to IATI its identifier MUST NOT be changed in subsequent updates.
- It MUST be prefixed with
  - EITHER the organisation-identifier found in reporting-org/@ref
  - OR a previous reporting-org identifier reported in other-identifier
- The identifier MUST only contain alphanumeric characters and hyphen, underscore, colon or period
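As an illustration of the prefix rule above, a data user might test an activity identifier against the current and previous reporting-org identifiers. The function name, its parameters and the activity suffix `PROJ01` are hypothetical; the sketch checks only that the identifier starts with one of the organisation identifiers and makes no assumption about the separator that follows.

```python
from typing import Iterable


def has_valid_prefix(
    iati_identifier: str,
    reporting_org_ref: str,
    previous_org_refs: Iterable[str] = (),
) -> bool:
    """Return True if the activity identifier starts with the current
    reporting-org identifier or with any previously reported one."""
    candidates = [reporting_org_ref, *previous_org_refs]
    # Empty references are skipped so that "" never counts as a prefix.
    return any(bool(ref) and iati_identifier.startswith(ref) for ref in candidates)
```

For example, `has_valid_prefix("XM-DAC-12-1-PROJ01", "GB-1", ["XM-DAC-12-1"])` passes because the old organisation identifier is supplied as a previous reporting-org identifier.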
The following modifications should be made to the other-identifier element
- The definition of the element should be changed to allow both organisation and activity identifiers to be reported.
- other-identifier/@type is MANDATORY when other-identifier is present
- other-identifier/owner-org/@ref is NOT MANDATORY but when used MUST contain a valid organisation identifier
- An OtherIdentifierType codelist should be added for use by other-identifier/@type. Values are:
  - A1 - Reporting Organisation's internal activity identifier
  - A2 - CRS Activity identifier
  - A3 - Previous Activity Identifier
  - A9 - Other Activity Identifier
  - B1 - Previous Reporting Organisation Identifier
  - B9 - Other Organisation Identifier
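For anyone consuming the data, the proposed codelist could be held as a simple enumeration so that other-identifier/@type values can be checked. This is a sketch of the values listed above, not a published IATI artefact; the helper `is_known_type` is a hypothetical name.

```python
from enum import Enum


class OtherIdentifierType(Enum):
    """Proposed OtherIdentifierType codes, copied from the list above."""

    A1 = "Reporting Organisation's internal activity identifier"
    A2 = "CRS Activity identifier"
    A3 = "Previous Activity Identifier"
    A9 = "Other Activity Identifier"
    B1 = "Previous Reporting Organisation Identifier"
    B9 = "Other Organisation Identifier"


def is_known_type(code: str) -> bool:
    """Return True if the code appears in the proposed codelist."""
    return code in OtherIdentifierType.__members__
```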
IATI Organisation Codelist
- IATI will establish its own registration agency with a published list of registered identifiers
- The prefix for this agency will be ??-IATI. (NB it has been pointed out that the "XM" prefix has been 'defined' as "multilateral" and is therefore not appropriate for IATI. [XI - for international organisations?])
- This list will initially be populated with ALL currently valid identifiers that are in use and that do not have a valid prefix.
- While many of these identifiers have been derived from DAC codes, this 'meaning' is not carried forward. i.e. IATI generated identifiers have no intrinsic meaning.
- Any publisher may request IATI to generate a code for it. This will be a manual process with a quick turnaround.
Discussions to date
For technical details about implementing this proposal go to: https://github.com/IATI/IATI-Schemas/issues/231
Yohanna, you are right. You are in a special case having already traveled this path, but it may well apply to others who have absolutely no option but to change activity identifiers.
I agree we should add "Previous Activity Identifier" to the codelist.
For implementation see: https://github.com/IATI/IATI-Codelists/issues/68
Bill, thanks for your work on developing this proposal, I know you’ve spent a lot of time working on this and we recognise its importance not just for IATI but for wider joined-up data goals.
A few suggestions (mostly process issues) about how to both ensure that this change achieves its maximum benefit and minimises disruption:
a) Maximise benefit – start making some significant progress on providing organisation IDs to public bodies (point (1) below)
b) Minimise disruption – keep the impact on users of the data as low as possible (points (2)-(4) below):
The specific suggestions are:
(1) Ask the Steering Committee for a mandate to set up a Working Group with a remit to report back in six months on how to use this structure/methodology to provide codes for all public bodies
(2) Provide a reconciliation service (preferably integrated with the datastore) that would help relate GB-1 and XM-DAC-12-1 type codes to each other
(3) Provide a translation utility between v1 and v2 data
(4) Only forbid forward slashes from the iati-identifier and org id
I don't think it's clear in the proposal whether or not other-identifier/@type would be mandatory. I think it would have to be, in order to make sense of the data supplied.
John (or others involved in the discussion), I have a question of clarification regarding forbidden characters.
"We also agreed that the jurisdiction and registration body should both forbid the “-” character. Organisation IDs could thus be read from left to right, and decomposition remains possible."
If I understand this correctly, it would make something like "CRA-ARC" an invalid ID for a registration body. Is that correct?
If that is indeed the case, we will have a problem since CRA-ARC is our registration body for CSOs and private sector organisations. We must include both CRA and ARC in the name (English and French acronyms) to comply with Canadian legislation. The other way would be CRA/ARC, but / is also forbidden (again, if I understand correctly).
We would have the same problem if, for instance, we started using numbers issued by a Canadian federal body to identify other departments (fictitious example: TBS-SCT-05 where TBS-SCT would be the registration body and 05 might be the Office of the Comptroller General as implementing partner for a project).
I'd be grateful for clarifications if I misunderstood, or guidance on how we would comply with this new rule.
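To make the concern concrete, here is a sketch of the left-to-right decomposition implied by the quoted rule, assuming an organisation ID has the shape jurisdiction, hyphen, registration body, hyphen, local identifier. The function name `decompose` and the local number `123` are hypothetical.

```python
from typing import Tuple


def decompose(org_id: str) -> Tuple[str, str, str]:
    """Split an organisation ID at its first two hyphens.

    Assumes neither the jurisdiction nor the registration body contains
    a hyphen, as the quoted rule requires.
    """
    parts = org_id.split("-", 2)
    if len(parts) != 3:
        raise ValueError(f"expected at least two hyphens in {org_id!r}")
    jurisdiction, registration_body, local_id = parts
    return jurisdiction, registration_body, local_id
```

With an underscore in the registration body, `decompose("CA-CRA_ARC-123")` gives `("CA", "CRA_ARC", "123")`. With a hyphen, `decompose("CA-CRA-ARC-123")` gives `("CA", "CRA", "ARC-123")`, which is the misreading described above.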
I think in most of what I've seen so far, other-identifier has more often been used for internal project IDs rather than for linking to CRS reporting (though I may be wrong on this, it's just a vague impression)
If that's the case, I would suggest reversing A1 and A2, and to answer David's question, setting a default that if not specified, other-identifier would be assumed to be A1, internal project ID. If we have a default then perhaps we can avoid making the attribute mandatory.
I think it's also cleaner to put internal project ID as A1 and the default, as all organisations should have some internal project ID whereas only some organisations will have CRS identifiers.
I'd prefer to see @type as required - it tells you what the data is without having to look up another rule that says go look on the codelist for A1. (why not just say A1 in the data?)
If we make it mandatory now, and people disagree in the future we can relax the condition in a decimal upgrade under our current upgrade policy, but we would not be able to harden a non-mandatory attribute later on.
It strikes me as though data publishers know what type their identifier is, and by telling data users, it makes their life easier.
I only became aware of these proposed changes today through the very helpful summary document distributed by the Secretariat.
As part of the fusion of CIDA and DFAIT, we are in the process of merging our management systems. This will likely result in all x-CIDA projects getting new project numbers - hence new activity-identifiers.
While such situations will hopefully not happen very often, other publishers may find themselves in a similar situation. What would the solution be? Should we include in the codelist "Previous Activity Identifier"?
Yes, "CA-CRA-ARC" is invalid. The OrganisationRegistrationAgency Codelist has CA-CRA_ARC. Is this okay?
This looks like a clear and balanced proposal.
I support the identifier as a single object, as this is much easier to work with in existing systems, and in flat file structures, as well as being more portable and easier to maintain compatibility with other standards (e.g. Open Contracting)
The one area I might suggest some clarification would be:
(a) whether identifiers, appended to the prefix, can contain hyphens, given the hyphen is currently used as a separator for the prefix;
(b) what non-valid characters in an original identifier should be replaced with.
E.g. for (a), if registration agency US-ABC hands out identifiers of the form '1235-123-23', should this be reported as 'US-ABC-1235-123-23' or something like 'US-ABC-1235_123_23'?
And for (b), unless we specify a set replacement approach, an original ID such as '1235/123/23' could end up variously rendered as 'US-ABC-1235_123_23' or 'US-ABC-1235.123.23' or 'US-ABC-123512323'.
My general preference would be to replace all non-valid characters with underscore, as this is rarely used as a separator in IDs. Assuming IDs generally only contain one sort of invalid character, it would even be possible for the Registration Agency codelist to store (should it be required) what _ should be replaced with to reconstitute the original identifier.
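The underscore approach could be sketched as below. This illustrates the suggestion only and is not an agreed rule; the function names are hypothetical, and the set of valid characters is assumed to be ASCII letters and digits plus hyphen, underscore, colon and period.

```python
import re

# Anything outside the assumed valid set is treated as non-valid.
_NON_VALID = re.compile(r"[^A-Za-z0-9\-_:.]")


def normalise_local_id(original: str) -> str:
    """Replace every non-valid character in an agency-issued ID with '_'."""
    return _NON_VALID.sub("_", original)


def build_identifier(prefix: str, original: str) -> str:
    """Join a registration agency prefix and a normalised local ID with a hyphen."""
    return f"{prefix}-{normalise_local_id(original)}"
```

Under these assumptions `build_identifier("US-ABC", "1235/123/23")` yields `US-ABC-1235_123_23`. Hyphens in an original such as '1235-123-23' are left untouched, so question (a) remains open.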
Thanks Bill. We had considered both ways (CRA-ARC and CRA_ARC), looks like we chose the right one!
So CRA_ARC will remain valid, and "_" is what we should use when facing similar situations? Is there a risk that "_" would become invalid later on?
Just want to query:
A9 - Other Activity Identifier
-- Why is this A9, when A1 and A2 are the only other two?
-- What link would this have to related-activity element, if at all? Is it the same, or more specifically about a previous identifier that activity had?
The IATI Technical Team met this morning to finalise schema issues relating to the other-identifier element. There are two additional changes that we wish to make. We realise that this is very late in the day but we think it is important to be consistent and get this right. The first change is to make the new type attribute mandatory when the element is used. The second is to replace the "owner-name" attribute with a sub-element that is consistent with other organisation elements. The effect of adding a sub-element also requires the replacing of other-identifier/text() with other-identifier/@ref. So:
[added 2014-09-29: for implementation see: https://github.com/IATI/IATI-Schemas/issues/228 ]
Unlike the rest of the 2.01 Iteration 3 proposal, consensus is still to be reached on this issue. Over 50 comments have been posted in this debate. Thanks to patient and sustained interest from Herman van Loon, John Adams, David Megginson, Owen Scott, Mark Brough, Dan Mihaila and Tim Davies we continue to make progress and I believe we are close to consensus now.
Mark Brough, Ben Webb, Tim Davies and John Adams had a follow-on discussion on IATI organisation identifiers and illegal characters on 2/10/14 to seek consensus on the discussion above. This is a summary of our discussion:
Illegal characters in organisation ID and iati-identifier
Importance of validation before use
We can’t prevent publishers publishing invalid characters and even invalid XML to the IATI Registry.
We can encourage data users (including data brokers such as data stores) to carry out validation before using the data, and to report any invalid data through the ticketing system. The Datastore may be able to help with its automatic validation.
We should also add guidance to the IATI Standard website to warn developers that:
B2 should be B9
As far as I understand there is absolutely no risk of "_" becoming invalid.
Mark, I agree with your suggestions. Namely: