These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.
The Adjusted Rand Index (ARI) measures the similarity between two clusterings of the same data set. It is a chance-corrected version of the Rand Index, a basic pair-counting measure of similarity between two clusterings. The drawback of the Rand Index is that it does not account for agreement that arises by chance: its expected value for two random clusterings is not a fixed baseline such as 0. The Adjusted Rand Index subtracts the agreement expected by chance and rescales the result so that chance-level agreement scores close to 0. It is calculated as follows:
1. Let N be the number of samples in the data set.
2. Let C1 and C2 be two different clusterings of the data set.
3. Let a be the number of pairs of samples that are in the same cluster in both C1 and C2.
4. Let b be the number of pairs of samples that are in different clusters in both C1 and C2.
5. Calculate the Rand Index RI as RI = (a + b) / (N choose 2), where (N choose 2) is the number of possible pairs of samples.
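6. Calculate the expected value E of the Rand Index for random clusterings that have the same cluster sizes as C1 and C2: E = 1 + 2 * P * Q / (N choose 2)^2 - (P + Q) / (N choose 2), where P = sum(n_i choose 2) over the clusters i of C1, Q = sum(n_j choose 2) over the clusters j of C2, n_i is the number of samples in cluster i of C1 and n_j is the number of samples in cluster j of C2. Note that P * Q / (N choose 2) on its own is the expected value of a (the number of pairs placed together in both clusterings), not of RI.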
7. Calculate the Adjusted Rand Index ARI as ARI = (RI - E) / (max(RI) - E), where max(RI) = 1.
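The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `adjusted_rand_index` and its variable names are chosen here for readability, and the expected value is taken to be the expected Rand Index for random clusterings with fixed cluster sizes, E = 1 + 2 * P * Q / (N choose 2)^2 - (P + Q) / (N choose 2), with P and Q the numbers of same-cluster pairs in C1 and C2.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_1, labels_2):
    """Illustrative ARI following the seven steps; inputs are cluster labels per sample."""
    if len(labels_1) != len(labels_2):
        raise ValueError("both clusterings must label the same samples")
    n = len(labels_1)                      # step 1: N
    if n < 2:
        raise ValueError("at least two samples are needed to form a pair")
    total_pairs = comb(n, 2)               # (N choose 2)

    # Pairs that share a cluster in C1 (P), in C2 (Q), and in both (a).
    p = sum(comb(c, 2) for c in Counter(labels_1).values())
    q = sum(comb(c, 2) for c in Counter(labels_2).values())
    a = sum(comb(c, 2) for c in Counter(zip(labels_1, labels_2)).values())

    # Pairs separated in both clusterings, by inclusion-exclusion (step 4).
    b = total_pairs - p - q + a

    rand_index = (a + b) / total_pairs     # step 5

    # Step 6: expected Rand Index for random clusterings with these cluster sizes
    # (assumption stated in the lead-in).
    expected = 1 + 2 * p * q / total_pairs**2 - (p + q) / total_pairs

    if expected == 1:
        # Degenerate case: both clusterings are all singletons or one big cluster.
        return 1.0
    return (rand_index - expected) / (1 - expected)   # step 7, max(RI) = 1


ari = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(round(ari, 4))
```

Counting pairs through cluster sizes avoids looping over all (N choose 2) pairs explicitly, which keeps the sketch linear in the number of samples.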
The higher the ARI value, the closer the two clusterings are to each other. A value of 1 indicates perfect agreement between the two clusterings (the same partition, regardless of how the clusters are labelled), and values close to 0 indicate agreement no better than expected by chance. Negative values indicate less agreement than expected by chance; the index is bounded below by -0.5, not -1. The ARI is widely used in machine learning, data mining, and pattern recognition, especially for the evaluation of clustering algorithms.
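The interpretation of the score can be checked on tiny examples. The sketch below uses the equivalent pair-counting form of the index, (index - expected) / (max - expected), which gives the same value as the Rand Index form; the helper name `ari_pair_counts` and the toy label lists are illustrative assumptions, not part of the metric's definition.

```python
from collections import Counter
from math import comb


def ari_pair_counts(u, v):
    """Illustrative ARI in pair-counting form; assumes len(u) == len(v) >= 2."""
    pairs = comb(len(u), 2)
    # Pairs placed together in both clusterings.
    index = sum(comb(c, 2) for c in Counter(zip(u, v)).values())
    # Pairs placed together within each clustering separately.
    p = sum(comb(c, 2) for c in Counter(u).values())
    q = sum(comb(c, 2) for c in Counter(v).values())
    expected = p * q / pairs      # chance-level value of `index`
    maximum = (p + q) / 2         # largest value `index` can reach
    if maximum == expected:
        return 1.0                # degenerate: all singletons or one cluster
    return (index - expected) / (maximum - expected)


reference = [0, 0, 1, 1]

# Same partition with swapped label names: perfect agreement.
relabelled = ari_pair_counts(reference, [1, 1, 0, 0])

# One cluster split in two: partial agreement, between 0 and 1.
partial = ari_pair_counts(reference, [0, 0, 1, 2])

# Every pair grouped in one clustering is separated in the other: negative score.
discordant = ari_pair_counts(reference, [0, 1, 0, 1])

print(relabelled, partial, discordant)
```

The first case shows that the score depends only on which samples are grouped together, not on the label values, and the last case shows that strongly discordant clusterings score below 0.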
Related use cases:
SPICE: Semantic Pseudo-labeling for Image Clustering
Uploaded on Mar 27, 2023
Information Maximization Clustering via Multi-View Self-Labelling
Uploaded on Mar 27, 2023