Cao Y, Jiang T, Girke T (2010). “Accelerated similarity searching and clustering of large compound sets by geometric embedding and locality sensitive hashing.” Bioinformatics, 26(7), 953–959. ISSN 1367-4803. doi:10.1093/bioinformatics/btq067. http://dx.doi.org/10.1093/bioinformatics/btq067.