Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

The Adjusted Rand Index (ARI) measures the similarity between two clusterings of the same data. It is a chance-corrected version of the Rand Index, a basic pair-counting measure of similarity whose value is inflated by agreement that arises purely by chance. The ARI subtracts the agreement expected between random clusterings from the Rand Index and rescales the result so that identical clusterings score 1. It is calculated as follows:

1. Let N be the number of samples in the data set.

2. Let C1 and C2 be two different clusterings of the data set.

3. Let a be the number of pairs of samples that are in the same cluster in both C1 and C2.

4. Let b be the number of pairs of samples that are in different clusters in both C1 and C2.

5. Calculate the Rand Index RI as RI = (a + b) / (N choose 2), where (N choose 2) is the number of possible pairs of samples.

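6. Calculate the expected value E of the Rand Index for random clusterings with the same cluster sizes, given by: E = 1 - (A + B) / (N choose 2) + 2 * A * B / (N choose 2)^2, where A = sum_i (n_i choose 2), B = sum_j (n_j choose 2), n_i is the number of samples in cluster i of C1 and n_j is the number of samples in cluster j of C2. (The simpler quantity A * B / (N choose 2) is the expected value of a alone, not of RI.)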

7. Calculate the Adjusted Rand Index ARI as ARI = (RI - E) / (max(RI) - E), where max(RI) = 1.
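The steps above can be sketched in Python as follows. This is a minimal illustration that uses only the standard library, not a reference implementation: the function name adjusted_rand_index and its arguments are assumptions made for this example, and the two clusterings are assumed to be given as equal-length sequences of cluster labels. The expected value E is computed for random clusterings with fixed cluster sizes, and degenerate inputs (both clusterings trivial) return 1.0 by convention.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_1, labels_2):
    """Adjusted Rand Index of two labelings of the same samples (illustrative sketch)."""
    if len(labels_1) != len(labels_2):
        raise ValueError("both clusterings must label the same samples")

    total_pairs = comb(len(labels_1), 2)  # (N choose 2)
    if total_pairs == 0:
        return 1.0  # convention chosen here for fewer than two samples

    # Pairs placed together by C1, by C2, and by both clusterings.
    same_1 = sum(comb(c, 2) for c in Counter(labels_1).values())
    same_2 = sum(comb(c, 2) for c in Counter(labels_2).values())
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_1, labels_2)).values())  # a

    # Pairs separated by both clusterings (inclusion-exclusion).
    diff_both = total_pairs - same_1 - same_2 + same_both  # b

    # Degenerate case: both clusterings are one big cluster, or both are all singletons.
    if (same_1 + same_2) * total_pairs == 2 * same_1 * same_2:
        return 1.0

    rand_index = (same_both + diff_both) / total_pairs
    expected_ri = (
        1
        - (same_1 + same_2) / total_pairs
        + 2 * same_1 * same_2 / total_pairs**2
    )
    return (rand_index - expected_ri) / (1.0 - expected_ri)


score = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(round(score, 4))
```

For the two labelings in the usage line, a = 1, b = 4 and N choose 2 = 6, so RI = 5/6, E = 11/18 and the ARI is 4/7, roughly 0.57.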

The higher the ARI value, the closer the two clusterings are to each other. The index is at most 1, a value reached when the two clusterings are identical up to a relabeling of the clusters. A value near 0 indicates agreement no better than expected by chance, and negative values indicate agreement worse than chance; under the usual fixed-cluster-size model the ARI is bounded below by -0.5, not -1. The ARI is widely used in machine learning, data mining, and pattern recognition, especially for the evaluation of clustering algorithms.
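A small experiment can make these reference values concrete. The sketch below is an illustration under stated assumptions: the helper name ari and the toy label lists are made up for this example, and the helper uses the pair-counting form (a - E[a]) / ((A + B) / 2 - E[a]), which is algebraically equivalent to (RI - E) / (1 - E). It does not handle degenerate inputs such as a single cluster in both clusterings.

```python
from collections import Counter
from math import comb


def ari(labels_1, labels_2):
    """Pair-counting form of the ARI (illustrative; no degenerate-case handling)."""
    pairs = comb(len(labels_1), 2)
    # a: pairs that share a cluster in both clusterings.
    a = sum(comb(c, 2) for c in Counter(zip(labels_1, labels_2)).values())
    # Pairs that share a cluster within each clustering separately.
    s1 = sum(comb(c, 2) for c in Counter(labels_1).values())
    s2 = sum(comb(c, 2) for c in Counter(labels_2).values())
    expected_a = s1 * s2 / pairs  # expected value of a under chance
    return (a - expected_a) / ((s1 + s2) / 2 - expected_a)


reference = [0, 0, 1, 1]
same_partition = ari(reference, [1, 1, 0, 0])  # same grouping, different label names
partial_match = ari(reference, [0, 0, 1, 2])   # one reference cluster split in two
crossed = ari(reference, [0, 1, 0, 1])         # every reference cluster split evenly

print(same_partition, partial_match, crossed)
```

The relabeled copy scores 1, the partial match scores about 0.57, and the crossed clustering scores about -0.5, which shows that renaming clusters does not change the index and that strongly discordant clusterings give negative values.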

Related use cases:

Uploaded on Mar 27, 2023
The similarity among samples and the discrepancy between clusters are two crucial aspects of image clustering. However, current deep clustering methods suffer from the inaccurate e...

Uploaded on Mar 27, 2023
Image clustering is a particularly challenging computer vision task, which aims to generate annotations without human supervision. Recent advances focus on the use of self-supervis...



Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.