IATI Consultations Archive

Live discussions and consultations can be found at discuss.iatistandard.org.

Version 2.01 - Iteration 3 - 9. Organisation and Activity Identifiers

Please see the motivation for this revised proposal in Bill Anderson's comment below.

Principles

  • IATI organisation identifiers should:
    • be globally unique
    • be constructed from a consistent methodology
    • be compatible with other data standards
  • IATI activity identifiers:
    • must be globally unique
    • must be persistent
    • must be prefixed with an organisation identifier
  • IATI organisation identifiers CANNOT be guaranteed to be persistent. Therefore when a reporting organisation's identifier changes:
    • The previous reporting organisation identifier(s) should be reported in addition to the new one
    • Previously reported activities should maintain their original identifier. The activity identifier should be prefixed 
  • Use of organisation identifiers [added 18-09-2014]
    • ALL publishers of IATI data MUST have a valid organisation identifier reported in reporting-org/@ref
    • When the @ref attribute is used in participating-org/@ref, transaction/provider-org/@ref or transaction/receiver-org/@ref, it must contain a valid organisation identifier. If the organisation does not have one, the narrative element may be used to describe it instead. [amended 29-09-2014]

 Reporting organisation

  • The reporting-org element is MANDATORY.
  • ALL the following rules must apply to the organisation-identifier in reporting-org/@ref
    • It is mandatory
    • The agency prefix MUST be a valid code in the IATI OrganisationRegistrationAgency code list
    • The identifier MUST be the same as that recorded by the publisher on the IATI Registry
    • The identifier MUST only contain alphanumeric characters and hyphen, underscore, colon or period

IATI activity identifier

  • The iati-identifier is MANDATORY
  • It MUST be globally unique among all activities published through the IATI Registry
  • Once an activity has been reported to IATI its identifier MUST NOT be changed in subsequent updates.
  • It MUST be prefixed with
    • EITHER the organisation-identifier found in reporting-org/@ref 
    • OR a previous reporting-org identifier reported in other-identifier
  • The identifier MUST only contain alphanumeric characters and hyphen, underscore, colon or period
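
As a rough illustration of the character and prefix rules above, the following Python sketch checks an activity identifier against a reporting organisation identifier. The function names and sample identifiers are invented for illustration, and reading "alphanumeric" as ASCII letters and digits is an assumption; none of this is part of the proposal itself.

```python
import re

# Allowed characters per the proposal: alphanumerics, hyphen, underscore,
# colon and period. Assumption: "alphanumeric" means ASCII letters and digits.
ALLOWED = re.compile(r"[A-Za-z0-9_:.\-]+")


def has_valid_characters(identifier):
    """Return True if the identifier uses only the permitted characters."""
    return ALLOWED.fullmatch(identifier) is not None


def has_valid_prefix(activity_id, current_org_id, previous_org_ids=()):
    """Return True if the activity identifier starts with the current
    reporting-org identifier or with a previously reported one."""
    candidates = (current_org_id, *previous_org_ids)
    return any(activity_id.startswith(org_id) for org_id in candidates if org_id)


# Hypothetical identifiers, for illustration only.
print(has_valid_characters("XX-ABC-12345-PROJ_01"))   # True
print(has_valid_characters("XX-ABC-12345/PROJ 01"))   # False
print(has_valid_prefix("XX-OLD-999-PROJ_01", "XX-ABC-12345", ["XX-OLD-999"]))  # True
```

The prefix test is deliberately loose (a plain string prefix); a stricter check would also need to know how the prefix is separated from the rest of the identifier, which the rules above do not specify.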

Other-identifier

The following modifications should be made to the other-identifier element

  • The definition of the element should be changed to allow both organisation and activity identifiers to be reported.
  • Removed
    • other-identifier/text()
    • other-identifier/@owner-name
  • Created 
    • other-identifier/@type
    • other-identifier/@ref
    • other-identifier/owner-org
    • other-identifier/owner-org/@ref
    • other-identifier/owner-org/narrative
    • other-identifier/owner-org/narrative/@xml:lang
    • other-identifier/owner-org/narrative/text()
  • Rules
    • other-identifier/@type is MANDATORY when other-identifier is present
    • other-identifier/owner-org/@ref is NOT MANDATORY but when used MUST contain a valid organisation identifier
  • An OtherIdentifierType codelist should be added for use by other-identifier/@type. Values are:
    • A1 - Reporting Organisation's internal activity identifier
    • A2 - CRS Activity identifier
    • A3 - Previous Activity Identifier
    • A9 - Other Activity Identifier
    • B1 - Previous Reporting Organisation Identifier
    • B9 - Other Organisation Identifier
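
To make the proposed structure concrete, here is a hedged Python sketch that parses a small activity fragment and checks the @type rule against the codelist above. The XML fragment, its identifier values and the function name are invented for illustration; the check of what counts as a "valid organisation identifier" in owner-org/@ref is left out because it depends on the registration agency codelist.

```python
import xml.etree.ElementTree as ET

# Codes from the proposed OtherIdentifierType codelist.
OTHER_IDENTIFIER_TYPES = {"A1", "A2", "A3", "A9", "B1", "B9"}

# Hypothetical activity fragment; all identifier values are invented.
SAMPLE = """
<iati-activity>
  <other-identifier ref="PROJ-001" type="A1">
    <owner-org ref="XX-ABC-12345">
      <narrative xml:lang="en">Example Agency</narrative>
    </owner-org>
  </other-identifier>
  <other-identifier ref="XX-OLD-999" type="B1"/>
  <other-identifier ref="NO-TYPE-GIVEN"/>
</iati-activity>
"""


def check_other_identifiers(activity):
    """Return a list of problems found in the other-identifier elements."""
    problems = []
    for element in activity.findall("other-identifier"):
        ref = element.get("ref", "")
        type_code = element.get("type")
        if type_code is None:
            problems.append(f"{ref}: @type is missing")
        elif type_code not in OTHER_IDENTIFIER_TYPES:
            problems.append(f"{ref}: @type '{type_code}' is not on the codelist")
    return problems


problems = check_other_identifiers(ET.fromstring(SAMPLE))
print(problems)  # ['NO-TYPE-GIVEN: @type is missing']
```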

IATI Organisation Codelist

[added 18-09-2014]

  • IATI will establish its own registration agency with a published list of registered identifiers
  • The prefix for this agency will be ??-IATI. (NB: it has been pointed out that the "XM" prefix has been 'defined' as "multilateral" and is therefore not appropriate for IATI. [XI - for international organisations?])
  • This list will initially be populated with ALL currently valid identifiers that are in use and that do not have a valid prefix.
  • While many of these identifiers have been derived from DAC codes, this 'meaning' is not carried forward: IATI-generated identifiers have no intrinsic meaning.
  • Any publisher may request IATI to generate a code for it. This will be a manual process with a quick turnaround.

 

Discussions to date


 

For technical details about implementing this proposal go to: https://github.com/IATI/IATI-Schemas/issues/231 

17 Comments

  • Bill Anderson

    Yohanna, you are right. You are in a special case having already traveled this path, but it may well apply to others who have absolutely no option but to change activity identifiers.

    I agree we should add "Previous Activity Identifier" to the codelist.

    For implementation see: https://github.com/IATI/IATI-Codelists/issues/68

  • Mark Brough

    Bill, thanks for your work on developing this proposal. I know you’ve spent a lot of time working on this and we recognise its importance not just for IATI but for wider joined-up data goals.

    A few suggestions (mostly process issues) about how to both ensure that this change achieves its maximum benefit and minimises disruption:

    a) Maximise benefit – start making some significant progress on providing organisation IDs to public bodies (point (1) below)

    b) Minimise disruption – keep the impact on users of the data as low as possible (points (2)-(4) below)

    The specific suggestions are:

    (1) Ask the Steering Committee for a mandate to set up a Working Group with a remit to report back in six months on how to use this structure/methodology to provide codes for all public bodies

        1. recognise it will probably take longer than that to fully solve these problems
        2. ask for volunteers from among the Steering Committee to take part in this group, both to ensure widespread support and develop consensus, as well as providing data and increasing likelihood of effective implementation in the near future;

    (2) Provide a reconciliation service (preferably integrated with the datastore) that would help relate GB-1 and XM-DAC-12-1 type codes to each other

        1. this could start with just a slightly reformatted version of the spreadsheet previously provided, but preferably would be provided as a more tightly integrated service with the datastore;
        2. then you could ask for anything funded by GB-1, and optionally also return anything funded by XM-DAC-12-1.

    (3) Provide a translation utility between v1 and v2 data

        1. so users can request the same data in either v1 or v2 format
        2. this is more general a thing than organisation IDs but it would basically mean a user would not require two different import routines for v1 and v2 data (or vX…);

    (4) Only forbid forward slashes from the iati-identifier and org id

        1. allow all other characters
        2. the issue is to do with URLs getting confused if IATI identifiers contain forward slashes. No other characters create such problems (even forward slashes shouldn’t be a problem as they could be percent-encoded, but in practice they often aren’t)
        3. this way we can keep iati-identifiers as close to the internal project codes as possible, thus increasing chances of globally unique IDs
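
The reconciliation idea in suggestion (2) could look roughly like the Python sketch below. It is only an illustration: the GB-1 / XM-DAC-12-1 pairing comes from the suggestion itself, while the activity records, the other organisation codes and the function names are invented, and a real service would draw on a maintained reconciliation list rather than a hard-coded table.

```python
# Equivalent codes for the same organisation. The GB-1 / XM-DAC-12-1 pairing
# is the example from the suggestion above; everything else is hypothetical.
EQUIVALENTS = [
    {"GB-1", "XM-DAC-12-1"},
]

# Hypothetical activity records: (iati-identifier, funding organisation ref).
ACTIVITIES = [
    ("GB-1-100", "GB-1"),
    ("XX-ABC-1-200", "XM-DAC-12-1"),
    ("XX-ABC-1-300", "XX-OTHER-9"),
]


def equivalent_codes(org_id):
    """Return the org_id together with any codes reconciled to it."""
    codes = {org_id}
    for group in EQUIVALENTS:
        if org_id in group:
            codes |= group
    return codes


def funded_by(org_id, include_equivalents=False):
    """List activity identifiers funded by org_id, optionally widening the
    search to reconciled codes."""
    wanted = equivalent_codes(org_id) if include_equivalents else {org_id}
    return [activity for activity, funder in ACTIVITIES if funder in wanted]


print(funded_by("GB-1"))                            # ['GB-1-100']
print(funded_by("GB-1", include_equivalents=True))  # ['GB-1-100', 'XX-ABC-1-200']
```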

  • David Carpenter

    I don't think it's clear in the proposal whether or not other-identifier/@type would be mandatory. I think it would have to be to be able to make sense of the data supplied.

  • Yohanna Loucheur

    John (or others involved in the discussion), I have a question of clarification regarding forbidden characters. 

    "We also agreed that the jurisdiction and registration body should both forbid the “-” character. Organisation IDs could thus be read from left to right, and decomposition remains possible."

    If I understand this correctly, it would make something like "CRA-ARC" an invalid ID for a registration body. Is that correct?

    If that is indeed the case, we will have a problem, since CRA-ARC is our registration body for CSOs and private sector organisations. We must include both CRA and ARC in the name (English and French acronyms) to comply with Canadian legislation. The other way would be CRA/ARC, but / is also forbidden (again, if I understand correctly).

    We would have the same problem if, for instance, we started using numbers issued by a Canadian federal body to identify other departments (fictitious example: TBS-SCT-05, where TBS-SCT would be the registration body and 05 might be the Office of the Comptroller General as implementing partner for a project).

    I'd be grateful for clarifications if I misunderstood, or guidance on how we would comply with this new rule. 

  • Mark Brough

    I think in most of what I've seen so far, other-identifier has more often been used for internal project IDs rather than for linking to CRS reporting (though I may be wrong on this; it's just a vague impression).

    If that's the case, I would suggest reversing A1 and A2, and to answer David's question, setting a default that if not specified, other-identifier would be assumed to be A1, internal project ID. If we have a default then perhaps we can avoid making the attribute mandatory.

    I think it's also cleaner to put internal project ID as A1 and the default, as all organisations should have some internal project ID whereas only some organisations will have CRS identifiers.

  • David Carpenter

    I'd prefer to see @type as required - it tells you what the data is without having to look up another rule that says go look on the codelist for A1. (why not just say A1 in the data?)

    If we make it mandatory now, and people disagree in the future we can relax the condition in a decimal upgrade under our current upgrade policy, but we would not be able to harden a non-mandatory attribute later on.

    It strikes me that data publishers know what type their identifier is, and telling data users makes their lives easier.

  • Yohanna Loucheur

    I only became aware of these proposed changes today through the very helpful summary document distributed by the Secretariat.
    As part of the fusion of CIDA and DFAIT, we are in the process of merging our management systems. This will likely result in all x-CIDA projects getting new project numbers - hence new activity-identifiers.

    While such situations will hopefully not happen very often, other publishers may find themselves in a similar situation. What would the solution be?  Should we include in the codelist "Previous Activity Identifier"?

  • Bill Anderson

    Hi Yohanna

    Yes, "CA-CRA-ARC" is invalid. The OrganisationRegistrationAgency Codelist has CA-CRA_ARC. Is this okay?

  • Tim Davies

    This looks like a clear and balanced proposal.

    I support the identifier as a single object, as this is much easier to work with in existing systems and in flat file structures, as well as being more portable and easier to keep compatible with other standards (e.g. Open Contracting).

    The one area I might suggest some clarification would be:

    (a) whether identifiers, appended to the prefix, can contain hyphens, given the hyphen is currently used as a separator for the prefix;

    (b) what non-valid characters in an original identifier should be replaced with. 

    E.g. for (a), if registration agency US-ABC hands out identifiers of the form '1235-123-23', should this be reported as 'US-ABC-1235-123-23' or something like 'US-ABC-1235_123_23'?

    And for (b), unless we specify a set replacement approach, an original ID such as '1235/123/23' could end up variously rendered as 'US-ABC-1235_123_23' or 'US-ABC-1235.123.23' or 'US-ABC-123512323'.

    My general preference would be to replace all non-valid characters with an underscore, as this is rarely used as a separator in IDs. Assuming IDs generally contain only one sort of invalid character, the Registration Agency codelist could even (should it be required) store what "_" should be replaced with to reconstitute the original identifier.
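
A minimal sketch of the underscore replacement approach, assuming the valid character set is the one given in the proposal (alphanumerics, hyphen, underscore, colon and period) and that prefix and identifier are joined with a hyphen as in the examples above. US-ABC is the made-up agency from those examples, and the function name is hypothetical.

```python
import re

# Characters outside the set permitted by the proposal
# (alphanumerics, hyphen, underscore, colon, period).
NOT_ALLOWED = re.compile(r"[^A-Za-z0-9_:.\-]")


def build_identifier(agency_prefix, original_id):
    """Join prefix and original identifier, replacing each non-valid
    character in the original with an underscore."""
    return f"{agency_prefix}-{NOT_ALLOWED.sub('_', original_id)}"


print(build_identifier("US-ABC", "1235/123/23"))  # US-ABC-1235_123_23
print(build_identifier("US-ABC", "1235-123-23"))  # US-ABC-1235-123-23
```

Note that the replacement is lossy: '1235/123/23' and '1235 123 23' would both map to the same result, so the original identifier cannot be recovered from the output alone.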

  • Yohanna Loucheur

    Thanks Bill.  We had considered both ways (CRA-ARC and CRA_ARC), looks like we chose the right one! 

    So CRA_ARC will remain valid, and "_" is what we should use when facing similar situations? Is there a risk that "_" would become invalid later on?

  • Steven Flower

    Just want to query:

    A9 - Other Activity Identifier

    -- Why is this A9, when A1 and A2 are the only other two?

    -- What link would this have to related-activity element, if at all?  Is it the same, or more specifically about a previous identifier that activity had?

  • IATI Tech Team

    The IATI Technical Team met this morning to finalise schema issues relating to the other-identifier element. There are two additional changes that we wish to make. We realise that this is very late in the day, but we think it is important to be consistent and get this right. The first change is to make the new type attribute mandatory when the element is used. The second is to replace the "owner-name" attribute with a sub-element that is consistent with other organisation elements. Adding a sub-element also requires replacing other-identifier/text() with other-identifier/@ref. So:

     

    • other-identifier/text() is removed
    • other-identifier/@owner-name is removed

     

    • other-identifier/@ref is created
    • other-identifier/owner-org is created as an optional element (0..1)
    • other-identifier/owner-org/@ref is created
    • other-identifier/owner-org/narrative is created (0..*)
    • other-identifier/owner-org/narrative/@xml:lang is created
    • other-identifier/owner-org/narrative/text() is created

     

    • other-identifier/@type is MANDATORY when other-identifier is present
    • other-identifier/owner-org/@ref is NOT MANDATORY but when used MUST contain a valid organisation identifier
    • If other-identifier/owner-org is present then either other-identifier/owner-org/@ref or other-identifier/owner-org/narrative/text() MUST be present
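
As an illustration of what these changes mean for existing data, the Python sketch below converts an old-style other-identifier element (text content plus @owner-name) into the new shape (@ref, @type and an owner-org/narrative sub-element). The function name and sample values are invented, and the type code has to be supplied by whoever runs the conversion, because the old element carries no equivalent information.

```python
import xml.etree.ElementTree as ET


def upgrade_other_identifier(old, type_code):
    """Build a new-style other-identifier element from an old-style one.

    The type code is passed in by the caller because the old element
    has nothing it could be derived from.
    """
    new = ET.Element("other-identifier", {
        "ref": (old.text or "").strip(),
        "type": type_code,
    })
    owner_name = old.get("owner-name")
    if owner_name:
        # owner-org is optional; it is only created when there is a name,
        # so a present owner-org always has narrative text.
        owner_org = ET.SubElement(new, "owner-org")
        narrative = ET.SubElement(owner_org, "narrative")
        narrative.text = owner_name
    return new


# Hypothetical old-style element; the values are invented.
old = ET.fromstring(
    '<other-identifier owner-name="Example Agency">PROJ-001</other-identifier>'
)
new = upgrade_other_identifier(old, "A1")
print(ET.tostring(new, encoding="unicode"))
```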

     

    [added 2014-09-29: for implementation see: https://github.com/IATI/IATI-Schemas/issues/228 ]

  • Bill Anderson

    Unlike the rest of the 2.01 Iteration 3 proposal, consensus is still to be reached on this issue. Over 50 comments have been posted in this debate. Thanks to patient and sustained interest from Herman van Loon, John Adams, David Megginson, Owen Scott, Mark Brough, Dan Mihaila and Tim Davies, we continue to make progress and I believe we are close to consensus now.

    • One month ago the discussion appeared to have reached stalemate, but with a consensus to decouple the current reporting organisation identifier from the prefix of the activity identifier, and to make the activity identifier persistent we achieved our first breakthrough. We have consensus on this.
    • The next problem was how to ensure that the activity identifier was in fact globally unique and could in some way be validated against a reporting organisation. I have introduced into the proposal above a working-out of a solution proposed by John Adams: to report previous organisation identifiers along with the current one. It is late in the day to introduce a new item into the mix, but adding an optional attribute to an existing element is not rocking the boat unduly. We need consensus on this or an alternate proposal.
    • Another issue not yet put to rest is the proposal that organisation identifiers should consist of two separate parts: with the code (or URI) for the Registration Agency (or vocabulary) reported separately from the identifier (registration number) itself. I have not included this in the proposal as my sense is that the consensus is swinging towards the identifier being a single object. We need consensus on this or an alternate proposal.
    • There have been differing opinions on what disruption is acceptable, on what we should do now and what we should put aside for a working group to consider. The above proposal is an attempt to simplify the issue down to its bare bones in order to build a solid foundation for the development of a coherent, consistent, interoperable standard. If we cannot get full agreement on this, we need, at least, assurances that it will not be vetoed.

  • John Adams

    Mark Brough, Ben Webb, Tim Davies and John Adams had a follow-on discussion on IATI organisation identifiers and illegal characters on 2/10/14 to seek consensus on the discussion above. This is a summary of our discussion:

    We agreed:

    Illegal characters in organisation ID and iati-identifier

    • Some characters as part of organisation and activity IDs need to be invalid/illegal to prevent breaking URLs.
    • Those characters should be defined in a blacklist, which should be based on existing RFC standards. At a minimum, the list contains the following characters: / & | ?
    • An alternative is to use a whitelist, but that could cause issues with non-Latin characters for some publishers, and it risks failing to anticipate all the characters that publishers may have in their internal project IDs.
    • Where the internal activity ID contains illegal characters and therefore the iati-identifier is different from the internal activity ID, the internal activity ID should be recorded in the other-identifier element.

    Separation characters

    • A suggestion was made to replace all hyphens as separation characters, except where these are part of the organisation ID. This would enable data users to decompose the organisation and activity IDs into their constituent parts (country, registration org, organisation, activity etc.).
    • Others were of the view that the IATI registration agency codelist and the organisation ID field in each activity file provided sufficient information for data users to make those lookups, and that it would be better to have the iati-identifier more closely aligned to the internal activity ID. After discussion a consensus was reached and we agreed on this option. 
    • We also agreed that the jurisdiction and registration body should both forbid the “-” character. Organisation IDs could thus be read from left to right, and decomposition remains possible.
    • There is also a need to consider fully how to publish bureaux or departments within an organisation (e.g. USAID Africa Bureau), but this may be specific to each publisher and not therefore a matter for the standard to consider.
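
The left-to-right reading agreed above can be sketched in Python as follows. This assumes an organisation identifier of the form jurisdiction, registration body and issued identifier, joined by hyphens, with no hyphen inside the first two parts. CA-CRA_ARC is the agency code mentioned elsewhere in this thread; the number after it and the function name are invented.

```python
def split_organisation_id(org_id):
    """Split an organisation identifier into jurisdiction, registration
    body and the identifier issued by that body.

    Only the first two hyphens are treated as separators, so the final
    part may itself contain hyphens.
    """
    parts = org_id.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"cannot decompose {org_id!r}")
    jurisdiction, registration_body, local_id = parts
    return jurisdiction, registration_body, local_id


# The number after the agency code is invented for illustration.
print(split_organisation_id("CA-CRA_ARC-1234-56"))  # ('CA', 'CRA_ARC', '1234-56')
```

Identifiers that do not follow this three-part shape, such as GB-1, cannot be decomposed this way and raise an error in the sketch.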

    Importance of validation before use

    We can’t prevent publishers publishing invalid characters and even invalid XML to the IATI Registry.

    We can encourage data users (including data brokers such as data stores) to carry out validation before using the data, and to report any invalid data through the ticketing system. The Datastore may be able to help with its automatic validation.

    We should also add guidance to the IATI Standard website to warn developers that:

    • they should ensure that IATI-identifiers are percent-encoded
    • IATI-identifiers should not contain any of the illegal characters, but in practice they may do so, so developers should still take precautions
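
A short Python sketch of the precautions described above, using the standard library's urllib.parse.quote to percent-encode an identifier before it is placed in a URL path. The base URL, the sample identifier and the function names are invented for illustration; the blacklist holds only the four characters named in this summary.

```python
from urllib.parse import quote, unquote

# Characters named in the discussion as the minimum blacklist.
ILLEGAL_CHARACTERS = set("/&|?")

# Hypothetical base URL, for illustration only.
BASE_URL = "https://example.org/activity/"


def illegal_characters_in(identifier):
    """Return the blacklisted characters that occur in the identifier."""
    return sorted(ILLEGAL_CHARACTERS & set(identifier))


def activity_url(identifier):
    """Build a URL with the identifier percent-encoded as one path segment."""
    return BASE_URL + quote(identifier, safe="")


identifier = "XX-ABC-1/2014?draft"  # invented identifier with illegal characters
print(illegal_characters_in(identifier))  # ['/', '?']
print(activity_url(identifier))  # https://example.org/activity/XX-ABC-1%2F2014%3Fdraft
print(unquote("XX-ABC-1%2F2014%3Fdraft") == identifier)  # True
```

Passing safe="" matters here: by default quote leaves "/" unencoded, which is exactly the character that causes URLs to be misread.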

  • Steven Flower

    B2 should be B9

  • Bill Anderson

    As far as I understand there is absolutely no risk of "_" becoming invalid.

  • Bill Anderson

    Mark, I agree with your suggestions. Namely:

    • A1 = Internal project Id
    • @type is not mandatory
    • If @type is not present "A1" is assumed

Article is closed for comments.