How was the evaluation developed?
It is the result of participatory work initiated in mid-2019 and carried out by the Labelia Labs association. The approach is described in this blog article, which we recommend reading!
But let's review here some of the contextual elements described in the article. First of all, there is a growing tension between the potential and appeal of AI techniques on the one hand, and the difficulty of trusting these techniques or their implementation on the other, whether by private actors (e.g. Apple with the Apple Card, Tesla in this astonishing example) or by public actors (e.g. COMPAS for parole decisions in the USA, the controversies each year around Parcoursup in France, unemployment benefits in the Netherlands, and many others). In this context, it is becoming increasingly difficult for an organisation to implement data science approaches in its products and services and to stand behind them publicly.
Obviously this tension is not new, and certain risks are very real; there seems to be a broad consensus that structuring and reassuring frameworks are needed. One only has to type "AI and ethics" or "responsible AI" into a search engine to see the significant number of initiatives in this field. However, many of them are lists of cardinal principles and do not offer a concrete, operational hook. How can an organisation position itself? How can it evaluate its practices? What should it work on to comply with these principles?
It is based on these observations that we wanted to develop a tool intended for practitioners: useful and actionable as quickly as possible. Give it a try and tell us what you think!
Who is this evaluation intended for?
How is the evaluation structured?
Is the assessment framework 'finished' or will it keep evolving?
The synthetic score is expressed out of a theoretical maximum of 100 points for a full assessment. It provides an indication of the organization's maturity level regarding responsible and trustworthy data science practices. As of the end of 2020, a score above the 50/100 threshold can be considered a very advanced maturity level.
The mechanism for calculating the score is relatively simple:
There is, however, a subtlety in cases where an organisation is not concerned by certain evaluation elements and the risk universes corresponding to them. It would be illogical to deprive the organisation of points associated with evaluation elements that do not concern it, while other organisations that are concerned by this risk can obtain them. Conversely, it would be illogical to grant all the corresponding points automatically, as an organisation doing very little could otherwise obtain a very high score. The mechanism for dealing with this point is as follows:
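To make the two constraints above concrete, here is a minimal illustrative sketch in Python of a score renormalised over applicable elements only. This is an assumption for illustration purposes, not the exact mechanism implemented by the platform: the function name and the tuple-based representation of evaluation elements are hypothetical.

```python
# Illustrative sketch (hypothetical helper, not the platform's exact mechanism):
# non-applicable evaluation elements neither penalise nor reward the
# organisation, because they are excluded from both the points earned
# and the maximum attainable points before scaling to 100.

def synthetic_score(elements):
    """elements: list of (earned_points, max_points, applicable) tuples.

    Non-applicable elements are excluded from both the numerator and the
    denominator, so the score is renormalised to a 100-point total over
    the elements that actually concern the organisation.
    """
    applicable = [(earned, maximum) for earned, maximum, ok in elements if ok]
    max_total = sum(maximum for _, maximum in applicable)
    if max_total == 0:
        return 0.0
    earned_total = sum(earned for earned, _ in applicable)
    return round(100 * earned_total / max_total, 1)

# Example: three elements of 10 points each, the last one not applicable
# to this organisation. 10 points earned out of 20 applicable -> 50.0.
score = synthetic_score([(4, 10, True), (6, 10, True), (0, 10, False)])
```

This design satisfies both constraints from the paragraph above: the organisation is not deprived of points it could never earn, yet it does not automatically receive them either, since the non-applicable element simply disappears from the calculation.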
Finally, here is some additional information in the form of answers to frequently asked questions:
Since 2019, Labelia Labs has been bringing together data science practitioners through a meetup group of over 300 members to explore, in a concrete and operational way, the best practices, resources and tools that foster a positive practice of Artificial Intelligence while limiting its risks and negative externalities.
Thanks to this community, a digital commons has been created: the Responsible and Trustworthy Data Science Assessment. Updated twice a year and overseen by an independent committee, this commons identifies assessment points, best practices, resources and technical tools for responsible AI.
In order to help practitioners in a concrete way, Labelia Labs has set up an evaluation and rating platform allowing any organization to assess its level of maturity regarding its practices.
Since October 2021, Labelia Labs has offered the most mature organizations the opportunity to obtain the Labelia label and join a community of companies applying high standards in their data science practices.