Dataset
Flickr-3.5M
Flickr-3.5M is a collection of 3.5 million socially tagged images randomly collected from Flickr. The dataset was used in our work on tag relevance learning.
Data
- Metadata of Flickr-3.5M, including user tags, titles, descriptions, geo information, etc. You can use the id.farm.server.secret.txt file to obtain image source URLs and download the original images (see get-images/get_images.py).
- Ground-truth
- Social20: A ground-truth set for tag-based social image retrieval.
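As a rough illustration of how the id.farm.server.secret.txt file could be turned into image URLs, here is a minimal sketch. It is not the bundled get-images/get_images.py script, which remains the reference tool. It assumes that each line holds four whitespace-separated fields in the order suggested by the file name (id, farm, server, secret) and that images are still reachable through Flickr's legacy farm-based static URL pattern. The function names `flickr_url` and `iter_urls` are hypothetical, and some photos may since have been deleted or made private by their owners.

```python
from typing import Iterable, Iterator


def flickr_url(photo_id: str, farm: str, server: str, secret: str) -> str:
    """Build a static image URL using Flickr's legacy farm-based pattern."""
    return f"https://farm{farm}.staticflickr.com/{server}/{photo_id}_{secret}.jpg"


def iter_urls(lines: Iterable[str]) -> Iterator[str]:
    """Yield one URL per well-formed line; skip blank or malformed lines.

    Assumption: fields are whitespace-separated, ordered id, farm, server, secret.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # not the assumed four-field layout
        photo_id, farm, server, secret = fields
        yield flickr_url(photo_id, farm, server, secret)


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = ["12345 3 2001 abcdef0123"]
    for url in iter_urls(sample):
        print(url)
```

To process the real file, pass an open file handle instead of the sample list, for example `iter_urls(open("id.farm.server.secret.txt"))`, after checking that the actual delimiter and field order match the assumptions above.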
Statistics
| No. of images | ~3,500,000 |
| No. of unique tags | ~570,000 |
| No. of unique user-ids | ~270,000 |
| Proportion of images with faces detected by OpenCV | ~18% |

Further reading
- Xirong Li, Cees G.M. Snoek, and Marcel Worring, Learning Social Tag Relevance by Neighbor Voting, IEEE Transactions on Multimedia, volume 11, issue 7, pages 1310-1322, 2009
- Xirong Li, Cees G.M. Snoek, and Marcel Worring, Unsupervised Multi-Feature Tag Relevance Learning for Social Image Retrieval, in Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR), Xi'an, China, July 2010 (Best Paper Award)
