Go to Google Images and search for keywords such as "apple", "bank", "cloud", etc.
Download the returned images (more than 1,000 per keyword); the total will be 100,000 images.
Extract color histogram or SIFT features using
[login to view URL]
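
As a rough illustration of the color histogram option, here is a minimal sketch that uses only NumPy rather than the linked tool. It assumes each image has already been decoded into an H x W x 3 uint8 RGB array; the function name `color_histogram` and the choice of 8 bins per channel are illustrative assumptions, not requirements of the task.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Return a normalized joint RGB histogram as a flat feature vector.

    `image` is assumed to be an H x W x 3 uint8 array (hypothetical input format).
    """
    # One row per pixel, one column per color channel.
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Count pixels in a bins x bins x bins grid over the RGB cube.
    hist, _ = np.histogramdd(
        pixels, bins=(bins, bins, bins), range=((0, 256), (0, 256), (0, 256))
    )
    hist = hist.ravel()
    # Normalize so that images of different sizes are comparable.
    return hist / hist.sum()
```

With 8 bins per channel, each image is reduced to a 512-dimensional vector that sums to 1, and those vectors can be stacked into one feature matrix for the clustering step.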
Apply affinity propagation (AP) clustering to get an optimal partition of the returned images, using
[login to view URL]~mdehoon/software/cluster/[login to view URL]
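
For reference, the affinity propagation step could look roughly like the sketch below. It is written in plain NumPy and does not use the linked library. The similarity measure (negative squared Euclidean distance), the median preference, the damping value, and the fixed iteration count are all assumptions chosen for illustration; the function name `affinity_propagation` is hypothetical.

```python
import numpy as np

def affinity_propagation(X, damping=0.9, max_iter=300, preference=None):
    """Cluster the rows of X; return (exemplar indices, label per row)."""
    n = X.shape[0]
    # Assumed similarity: negative squared Euclidean distance between rows.
    S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    if preference is None:
        preference = np.median(S)  # assumed default preference
    diag = np.diag_indices(n)
    S[diag] = preference
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    rows = np.arange(n)
    for _ in range(max_iter):
        # Responsibility update: r(i,k) = s(i,k) - max_{k' != k} (a(i,k') + s(i,k')).
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # Availability update from positive responsibilities (self term kept as is).
        Rp = np.maximum(R, 0)
        Rp[diag] = R[diag]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[diag].copy()
        A_new = np.minimum(A_new, 0)
        A_new[diag] = self_avail
        A = damping * A + (1 - damping) * A_new
    # Points whose self-responsibility plus self-availability is positive are exemplars.
    exemplars = np.flatnonzero((A + R)[diag] > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Assign every point to its most similar exemplar.
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels
```

This dense version stores several n x n matrices, so it is only practical for small batches. At 100,000 images, a single float64 matrix of that size would need about 80 GB, so the full set would call for a sparse or per-keyword variant.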