For this project, we created the UPMC Food-101 dataset. It contains 101 food categories. For each category, we gathered roughly 800 to 950 images from a Google Images search on the category title.
UPMC Food-101 is a large multimodal dataset containing about 100,000 food recipe items classified into 101 categories. It was crawled from the web, and each item consists of an image and the HTML webpage on which it was found.
This dataset can be considered a “twin dataset” of ETHZ Food-101: the two share the same 101 categories and are approximately the same size.
The categories of both UPMC Food-101 and ETHZ Food-101 are the 101 most popular categories on the food picture sharing website foodspotting.com. The two datasets differ in where their images come from: the images in ETHZ Food-101 are also taken from this website, whereas the images in UPMC Food-101 were crawled from Google Images searches for the category name followed by “recipe”.
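As a rough illustration of the query scheme described above (category name followed by “recipe”), the sketch below builds the search strings in Python. The function name `build_query`, the underscore-separated category identifiers, and the three sample categories are assumptions made for the example; the actual crawling code is not described here.

```python
def build_query(category: str) -> str:
    """Build a search query: the category name followed by 'recipe'.

    Assumes (hypothetically) that categories are stored as
    underscore-separated identifiers such as 'apple_pie'.
    """
    name = category.replace("_", " ").strip()
    return f"{name} recipe"


# A few sample category identifiers, used here only for illustration.
example_categories = ["apple_pie", "baklava", "chicken_curry"]

queries = [build_query(c) for c in example_categories]
for q in queries:
    print(q)
```

Each resulting string would then be submitted to the image search, and every result kept together with the HTML page on which the image was found.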