This work was inspired by @timnitGebru's famous paper "Datasheets for Datasets", which discusses the importance of documenting datasets with information around how they were created, potential biases, and recommended use cases. arxiv.org/abs/1803.09010
This is our motivation: "To be successful, AI algorithms need to be trained and tested on data that represent clinical scenarios encountered in real-world settings. Therefore, a clear understanding of data set characteristics is critical."
What we found, however, was sobering. Unsurprisingly, most of the dermatology data used for AI development was siloed in institutions and not shared publicly. You may note here that public datasets such as ISIC (labeled 1) help generate a lot of research and are really valuable.
Previously, @AdeAdamson has expressed concerns about bias in AI algorithms from the lack of diverse skin tones represented. We wanted to quantify biases with regard to skin tone diversity in the datasets used for AI development. jamanetwork.com/journals/jamad…
However, we were unable to do this because very few papers even reported the skin tones or ethnicities represented in the data. I highly suspect that these datasets are not diverse, but there is no way to know.
One of the other major concerns we had was label noise in the datasets. While there are not established gold standards for every disease, we believe that skin cancer should be confirmed by histopathology rather than consensus from looking at an image.
Many papers, including one group that recently announced they were going to release their algorithm directly to patients, did not have histopathological confirmation of malignancies.
We understand that there are issues of patient privacy around releasing datasets. However, adequate descriptions of the datasets are absolutely needed. It's like reading about a clinical trial without knowing any information on patient demographics.
Another way to be more transparent is to share models. This is something @whria78 has done with Model Derm. Most people do not share their code or provide an API for testing their models.