Clustering is an analytics tool that uses the extracted text index to group conceptually similar documents. Clustering runs automatically whenever you import new documents, and you can re-run it manually by clicking the Rebuild Clusters button.

Clustering divides all of the documents in your workspace into 10 groups, based on their conceptual similarity. Each of these groups is then subdivided into 10 smaller groups of similar documents. This process repeats until there are four levels of clusters or until there are no more documents to group. Each cluster is named after the most common terms it contains (although note that these terms are not the sole determining factor in why a document was placed in a cluster).
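To make the repeated subdivision concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the product's actual implementation: the function names (kmeans, build_clusters, leaf_docs, tree_depth), the use of plain k-means on small numeric vectors, and the stopping rule for a group that cannot be split are all assumptions made for the example. The real engine works on concept vectors derived from the extracted text index.

```python
import random

BRANCHING = 10   # groups created at each level, as described above
MAX_LEVELS = 4   # maximum number of cluster levels, as described above


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (an assumption for this sketch).

    Returns lists of indexes into points, one list per non-empty group.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k), key=lambda c: squared_distance(p, centroids[c])
            )
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    groups = [[i for i, a in enumerate(assignment) if a == c] for c in range(k)]
    return [g for g in groups if g]


def build_clusters(doc_ids, vectors, level=1):
    """Recursively split doc_ids into a tree of at most MAX_LEVELS cluster levels."""
    node = {"docs": list(doc_ids), "children": []}
    # Stop once four levels exist or there is nothing left to group.
    if level > MAX_LEVELS or len(doc_ids) < 2:
        return node
    k = min(BRANCHING, len(doc_ids))
    groups = kmeans([vectors[d] for d in doc_ids], k)
    if len(groups) < 2:
        return node  # the documents could not be separated any further
    for group in groups:
        child_ids = [doc_ids[i] for i in group]
        node["children"].append(build_clusters(child_ids, vectors, level + 1))
    return node


def leaf_docs(node):
    """Collect the documents held by the lowest-level clusters."""
    if not node["children"]:
        return list(node["docs"])
    return [d for child in node["children"] for d in leaf_docs(child)]


def tree_depth(node):
    """Number of cluster levels below this node."""
    if not node["children"]:
        return 0
    return 1 + max(tree_depth(child) for child in node["children"])


# Hypothetical workspace: 200 documents with made-up two-dimensional vectors.
rng = random.Random(1)
vectors = {"doc%03d" % i: [rng.random(), rng.random()] for i in range(200)}
tree = build_clusters(sorted(vectors), vectors)
print(len(tree["children"]), "top-level clusters,", tree_depth(tree), "levels")
```

In this sketch, every document ends up in exactly one cluster at each level, and a group stops subdividing as soon as it holds a single document or its documents cannot be told apart.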

Clustering’s analytics engine is backed by Apple’s Latent Semantic Mapping technology, and uses k-means seeding.