Deep neural networks used in industry applications usually work best when they are trained with supervised learning, given that:
Large amounts of data are available on and off the Internet, but in raw form they are not useful for building machine learning solutions.
The Cortex dataset only references the URLs of the original images. Images are scraped from the Common Crawl database, held temporarily in RAM, and discarded once labeling is done. Anyone using the dataset must download the images they are interested in themselves; the suggested tool for this is img2dataset. URLs inserted through the /upload endpoint are not exposed through the /get-labeled-data endpoint if they cannot be scraped from the Common Crawl database.
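Because the dataset ships URLs rather than image files, a first step before downloading is usually to collect those URLs into a plain list. The following is a minimal sketch of that step, not the project's own tooling. The record layout (a "url" field next to a "label" field), the helper name write_url_list, and the file name cortex_urls.txt are all assumptions made for illustration; the actual response format of /get-labeled-data is not described here.

```python
from pathlib import Path
from typing import Iterable, Mapping


def write_url_list(records: Iterable[Mapping[str, str]], path: str) -> int:
    """Write the unique image URLs found in `records` to `path`, one per line.

    Returns the number of URLs written. The "url" key is an assumed field
    name, not a documented part of the dataset.
    """
    seen = set()
    urls = []
    for record in records:
        url = record.get("url", "").strip()
        # Skip records without a URL and drop duplicates, keeping first-seen order.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    text = "\n".join(urls) + ("\n" if urls else "")
    Path(path).write_text(text, encoding="utf-8")
    return len(urls)


# Hypothetical records; the real labeled-data format may differ.
records = [
    {"url": "https://example.com/a.jpg", "label": "cat"},
    {"url": "https://example.com/b.jpg", "label": "dog"},
    {"url": "https://example.com/a.jpg", "label": "cat"},  # duplicate URL
]

count = write_url_list(records, "cortex_urls.txt")
print(f"wrote {count} unique URLs")
```

The resulting text file has one URL per line, which is the plain-text input that img2dataset accepts. A download could then be started with a command along the lines of img2dataset --url_list=cortex_urls.txt --output_folder=images --image_size=256, with the options adjusted according to the tool's own documentation.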