IATI Consultations Archive

Live discussions and consultations can be found at discuss.iatistandard.org.

Add an indicator schema to the IATI standard

In order for the results data reported here to be truly useful, we need to know what the indicator being reported on means.

The current activity/result schema includes a few attributes for describing an indicator:

/iati-activities/iati-activity/result/@type (count vs percentage)
/iati-activities/iati-activity/result/@aggregation-status (suitable for aggregation vs not)
/iati-activities/iati-activity/result/indicator/title
/iati-activities/iati-activity/result/indicator/description
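
As a rough illustration of how little a consumer can learn from these four fields, the sketch below reads them with Python's standard library. The sample XML and its values are hypothetical, and the layout simply follows the paths listed above (a real IATI file may nest text differently).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; values are illustrative, not taken from a real IATI file.
SAMPLE = """
<iati-activities>
  <iati-activity>
    <result type="1" aggregation-status="1">
      <indicator>
        <title>Teachers trained</title>
        <description>Number of teachers completing the course</description>
      </indicator>
    </result>
  </iati-activity>
</iati-activities>
"""

def describe_indicators(xml_text):
    """Return one dict per indicator holding the four descriptive fields."""
    root = ET.fromstring(xml_text)  # root element is <iati-activities>
    rows = []
    for result in root.findall("./iati-activity/result"):
        for indicator in result.findall("indicator"):
            rows.append({
                "type": result.get("type"),
                "aggregation_status": result.get("aggregation-status"),
                "title": indicator.findtext("title"),
                "description": indicator.findtext("description"),
            })
    return rows

rows = describe_indicators(SAMPLE)
```

Everything a reader knows about the indicator is in those four values; nothing identifies it unambiguously or says how it is measured.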

In practice much more information is needed to fully describe an indicator and how it is used. For example, the standard USAID performance indicator reference sheet includes:

  • Indicator title
  • Precise definition
  • Unit of measure (e.g. individuals, hours, dollars, km, etc.)
  • How the indicator is to be disaggregated
  • Rationale or justification
  • Data source
  • Method of data collection and construction
  • Reporting frequency
  • Where the indicator fits within the organization's results framework

Other vocabularies have additional types of data and narrative associated with an indicator:

  • Short version of title
  • Direction of improvement (whether higher or lower values are desired)
  • Numerator definition
  • Denominator definition
  • Formula (for indicators that are computed from other indicators)
  • Display format (whole number, decimal number, percentage, ratio, X per thousand/million, etc.)
  • Indicator type (e.g. input vs output vs impact)
  • Sector tags
  • Other keywords
  • Reference URLs
  • Geographical reporting level
  • Status (e.g. active vs deprecated)
  • Data quality notes
  • Strengths, limitations
  • Data review process
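
To make the scope of such a definition concrete, here is a minimal sketch of a record holding a subset of the fields listed above. All field names, the `is_improvement` helper, and the example values are assumptions made for illustration; none of them come from an existing IATI schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class IndicatorDefinition:
    # Hypothetical field names covering part of the lists above.
    vocabulary: str
    code: str
    title: str
    definition: str
    unit_of_measure: str
    direction_of_improvement: Optional[str] = None  # "higher" or "lower"
    display_format: Optional[str] = None
    indicator_type: Optional[str] = None            # e.g. input, output, impact
    status: str = "active"                          # e.g. active, deprecated
    disaggregations: List[str] = field(default_factory=list)
    sector_tags: List[str] = field(default_factory=list)
    reference_urls: List[str] = field(default_factory=list)

    def is_improvement(self, old: float, new: float) -> Optional[bool]:
        """Say whether moving from old to new is desirable, if a direction is set."""
        if self.direction_of_improvement == "higher":
            return new > old
        if self.direction_of_improvement == "lower":
            return new < old
        return None  # direction unknown, so no judgment is possible

# Illustrative values only.
example = IndicatorDefinition(
    vocabulary="EX",
    code="EX.1",
    title="Under-five mortality rate",
    definition="Deaths of children under five per 1,000 live births",
    unit_of_measure="deaths per 1,000 live births",
    direction_of_improvement="lower",
    display_format="X per thousand",
    indicator_type="impact",
)
```

With a direction of improvement recorded, a consumer can tell that a drop from 50 to 40 is good news for this indicator, which the bare number cannot convey.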

In addition, other indicators (in the same vocabulary, or in other vocabularies) might be related to this indicator in various specific ways.

  • A is subset of B
  • A is superset of B
  • A is computed from B (most commonly, B is the numerator or denominator of A)
  • A is referenced by B (likewise)
  • A is identical to B
  • A is similar to B (e.g. measures the same thing in a different way)
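
These relationships could be recorded as simple triples. The sketch below is one possible encoding, not a proposed standard: the relation names are hypothetical, indicators are identified by an assumed (vocabulary, code) pair, and treating "referenced by" as the inverse of "computed from" is an interpretation of the list above.

```python
from enum import Enum

class Relation(Enum):
    SUBSET_OF = "subset_of"
    SUPERSET_OF = "superset_of"
    COMPUTED_FROM = "computed_from"
    REFERENCED_BY = "referenced_by"
    IDENTICAL_TO = "identical_to"
    SIMILAR_TO = "similar_to"

# Assumed pairing: each relation read from the other indicator's side.
INVERSE = {
    Relation.SUBSET_OF: Relation.SUPERSET_OF,
    Relation.SUPERSET_OF: Relation.SUBSET_OF,
    Relation.COMPUTED_FROM: Relation.REFERENCED_BY,
    Relation.REFERENCED_BY: Relation.COMPUTED_FROM,
    Relation.IDENTICAL_TO: Relation.IDENTICAL_TO,
    Relation.SIMILAR_TO: Relation.SIMILAR_TO,
}

def with_inverses(links):
    """Expand (a, relation, b) triples so each link is also stated from b's side."""
    expanded = set(links)
    for a, relation, b in links:
        expanded.add((b, INVERSE[relation], a))
    return expanded

# Hypothetical identifiers: (vocabulary, code).
rate = ("EX", "RATE.1")
numerator = ("EX", "COUNT.1")
links = {(rate, Relation.COMPUTED_FROM, numerator)}
expanded = with_inverses(links)
```

Storing only one direction and deriving the other keeps the published vocabulary small while still letting a consumer start from either indicator.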

Disaggregation requirements also need to be defined precisely.

  • Ideally the disaggregation definitions include canonical vocabularies for the names of the disaggregation factors (e.g. sex, age, crop) and the names of the acceptable subsets (e.g. male, female), along with accepted synonyms (sex = gender, male = m = boy = man = men) and translations of these terms into other languages.
  • It may be necessary to indicate which disaggregations are required by the organization being reported to.
  • In cases of multiple parallel disaggregations, it may be necessary to designate one of the disaggregations as disabled.
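
A canonical vocabulary with synonyms might be applied as in the sketch below. The "sex" and "male" synonyms are the ones given above; the "female" synonyms, the table layout, and the function name are assumptions added for illustration.

```python
# Canonical factor name -> accepted synonyms.
FACTOR_SYNONYMS = {"sex": ["gender"]}

# Canonical factor -> canonical subset -> accepted synonyms.
# The "female" synonyms are assumed by analogy with the "male" ones.
SUBSET_SYNONYMS = {
    "sex": {
        "male": ["m", "boy", "man", "men"],
        "female": ["f", "girl", "woman", "women"],
    },
}

def _lookup(table):
    """Map every canonical name and synonym to its canonical name."""
    mapping = {}
    for canonical, synonyms in table.items():
        mapping[canonical] = canonical
        for synonym in synonyms:
            mapping[synonym] = canonical
    return mapping

def normalize_disaggregation(factor, subset):
    """Return the canonical (factor, subset) pair, or raise ValueError if unknown."""
    factors = _lookup(FACTOR_SYNONYMS)
    factor_key = factor.strip().lower()
    if factor_key not in factors:
        raise ValueError(f"unknown disaggregation factor: {factor!r}")
    canonical_factor = factors[factor_key]
    subsets = _lookup(SUBSET_SYNONYMS[canonical_factor])
    subset_key = subset.strip().lower()
    if subset_key not in subsets:
        raise ValueError(f"unknown subset for {canonical_factor}: {subset!r}")
    return canonical_factor, subsets[subset_key]

pair = normalize_disaggregation("Gender", " Men ")
```

Translations into other languages could be handled the same way, by adding the translated terms as further synonyms of the canonical names.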

That's all a lot of information. It doesn't make sense to embed the full definition of the indicator along with each data point. Instead, we recommend delegating the indicator definitions to the various vocabularies (indicator repositories). The four fields described above could then be deprecated.

The challenge then becomes one of defining a common standard for indicator vocabularies to use. We propose creating a separate IATI Indicator Standard at the same level as the IATI Activity Standard and Organization Standard.

 


4 Comments

  • Herb Caudill

    I've sketched out what this standard might look like here:

    http://blog.devresults.com/proposed-indicator-definition-schema/

  • IATI Tech Team

    This item has been moved to the '3.01 Integer Upgrade Proposals' forum for inclusion as part of an integer upgrade.

  • Herman van Loon
    Herb, can you please explain why this would need to be a separate schema? Since activities are the means of intervention and thus of achieving results, non-aggregated results are inseparable from activities in my opinion. You achieve results by executing activities. Should not the publication of non-aggregated results therefore be exclusively part of the activity schema?
  • Herb Caudill

    Hi, Herman. The results are included in the activity schema. But the indicators that those results represent are independent of the activity. So you might have this:

    <iati-activity>
      ...
      <iati-identifier>ABC</iati-identifier>
        <indicator>
          <reference vocabulary="4" code="AA.1" />
          ...
              <actual value="5000" />
          ...
        </indicator>
        <indicator>
          <reference vocabulary="4" code="AA.2" />
          ...
              <actual value="800" />
          ...
        </indicator>
        ...
    </iati-activity>

     

    (Note that the vocabulary and code attributes on the reference element don't currently exist and are proposed here http://support.iatistandard.org/entries/79784435-Results-Require-unambiguous-indicator-reference .)

    In this situation, you've now specified what the number 5000 represents and what the number 800 represents, by linking each to a known indicator (AA.1 and AA.2 respectively) in a known repository (referenced in a codelist as 4). That's a big improvement over the current situation, where there's no way to say what either number is counting.
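
As a hedged sketch of what this linking buys a consumer, the Python below pulls each value out together with the (vocabulary, code) pair it refers to. The fragment is a simplified, well-formed version of the example above with the elided parts left out; the reference element is the proposed one, not part of the current standard, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified version of the example; <reference> is proposed, not current IATI.
FRAGMENT = """
<iati-activity>
  <iati-identifier>ABC</iati-identifier>
  <indicator>
    <reference vocabulary="4" code="AA.1" />
    <actual value="5000" />
  </indicator>
  <indicator>
    <reference vocabulary="4" code="AA.2" />
    <actual value="800" />
  </indicator>
</iati-activity>
"""

def referenced_values(xml_text):
    """Return ((vocabulary, code), value) for each indicator with a reference."""
    root = ET.fromstring(xml_text)
    found = []
    for indicator in root.iter("indicator"):
        reference = indicator.find("reference")
        actual = indicator.find(".//actual")  # tolerate intermediate nesting
        if reference is None or actual is None:
            continue  # skip indicators missing either piece
        key = (reference.get("vocabulary"), reference.get("code"))
        found.append((key, float(actual.get("value"))))
    return found

values = referenced_values(FRAGMENT)
```

Each number now arrives with a key that can be looked up in the referenced repository, rather than with only a free-text title.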

    But if you're a machine, how do you then find out what that AA.1 represents? If it turns out that USAID's indicator AA.1 is the same as the World Bank's indicator ABC.123, how would you know that?

    The indicator repositories that exist in the wild (see https://raw.githubusercontent.com/HerbCaudill/IATI-Codelists-NonEmbedded/10f4f3be8bd56e9b0ed797b797541847af5ca5c6/xml/IndicatorVocabulary.xml for a starter list) are currently published in wildly different ways. Some are Excel documents or PDFs; some are searchable online PDFs; a handful offer their own APIs in idiosyncratic formats.

    Compare for example: 

    It's great that these definitions are public and online, but they're not machine-readable and therefore nearly useless for any sort of systematic, repeatable analysis. The proposed schema offers a structured way of describing indicators, with the long-term goal of enabling comparisons of results data from disparate sources. 

     

     

Article is closed for comments.