Catalogue of Tools & Metrics for Trustworthy AI

These tools and metrics are designed to help AI actors develop and use trustworthy AI systems and applications that respect human rights and are fair, transparent, explainable, robust, secure and safe.

stock-analysis-engine




Optimize and deploy 🤗 Hugging Face Transformer models in production with a single command line.

=> Up to 10X faster inference! <=

Why this tool?

At Lefebvre Dalloz we run semantic search engines in production in the legal domain. In non-marketing language, each one is a re-ranker, and we based ours on Transformer models.
In this setup, latency is key to providing a good user experience, and relevance inference is done online for hundreds of snippets per user query.
We have tested many solutions, and below is what we found:

PyTorch + FastAPI = 🐢
Most tutorials on deploying Transformer models in production are built on PyTorch and FastAPI. Both are great tools, but they are not very performant at inference (actual measurements below).

Microsoft ONNX Runtime + Nvidia Triton inference server = 🏃💨
Then, if you spend some time, you can build something on top of ONNX Runtime and Triton inference server. You will usually get 2X to 4X faster inference compared to vanilla PyTorch. It's cool!

Nvidia TensorRT + Nvidia Triton inference server = ⚡️🏃💨💨
However, if you want best-in-class performance on GPU, there is only one possible combination: Nvidia TensorRT and Triton. You will usually get 5X faster inference compared to vanilla PyTorch.
Sometimes it can reach 10X faster inference.
Buuuuttt... TensorRT takes some effort to master: it requires tricks that are not easy to come up with, so we implemented them for you!
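
The figures above (2X to 4X, 5X, up to 10X) are ratios of inference latency against vanilla PyTorch. As an illustration only, here is a minimal sketch of how such a ratio could be measured. It is not the benchmark used by this tool. The two backend functions are hypothetical CPU-bound stand-ins, not real model calls, and the helper names `measure_latency` and `speedup` are assumptions made for this example. On a GPU, execution is asynchronous, so a real measurement would also need to synchronize the device before stopping the timer.

```python
import statistics
import time
from typing import Callable, List


def measure_latency(fn: Callable[[], object], warmup: int = 5, runs: int = 30) -> List[float]:
    """Return per-call wall-clock latencies in seconds, after untimed warmup calls."""
    for _ in range(warmup):
        fn()  # warmup: lets caches and lazy initialisation settle before timing
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def speedup(baseline: List[float], candidate: List[float]) -> float:
    """Ratio of median latencies: 2.0 means the candidate is 2X faster."""
    return statistics.median(baseline) / statistics.median(candidate)


# Hypothetical stand-ins for two inference backends (assumption: real code
# would run the same model input through each backend here).
def slow_backend() -> int:
    return sum(i * i for i in range(200_000))


def fast_backend() -> int:
    return sum(i * i for i in range(20_000))


if __name__ == "__main__":
    base = measure_latency(slow_backend)
    cand = measure_latency(fast_backend)
    print(f"speedup: {speedup(base, cand):.1f}X")
```

The median is used rather than the mean so that a few slow outlier calls do not dominate the ratio. For an online re-ranker, tail latencies (for example the 95th percentile) would be worth reporting as well.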

Detailed tool comparison table

Use Cases

There are no use cases for this tool yet.


Disclaimer: The tools and metrics featured herein are solely those of the originating authors and are not vetted or endorsed by the OECD or its member countries. The Organisation cannot be held responsible for possible issues resulting from the posting of links to third parties' tools and metrics on this catalogue. More on the methodology can be found at https://oecd.ai/catalogue/faq.