Clustering groups data so that items in the same group (cluster) share meaningful similarities, such as specific features or properties. By identifying patterns in the data, clustering gives the data meaning and supports informed decision-making.
For example, imagine a dataset of customer feedback. One cluster could contain all feedback about 'delays', another could contain all feedback on whether the 'support line was/wasn't helpful', and so on.
Why can clustering data be beneficial?
Because items in the same cluster share meaningful similarities, clustering is a useful tool for uncovering hidden patterns in the data.
One-to-Many Clustering is useful when responses consist of multiple sentences. Consider the response below: it contains four sentences and can be assigned to more than one cluster.
My kids love the playground. But the parking lot is always packed. The food is great. However, you need to hire more staff to shorten the wait time.
One-to-Many clustering in Relevance AI assigns a response to more than one cluster if its sentences are about different topics.
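To make the one-to-many idea concrete, here is a minimal Python sketch that splits a response into sentences and collects every topic the sentences touch. It is only an illustration: the `TOPIC_KEYWORDS` table, the function names, and the keyword matching are all assumptions made for this example, and they are not how Relevance AI assigns clusters (the workflow relies on vectorizing the text rather than on hand-written keywords).

```python
import re

# Hypothetical, hand-picked topics for this illustration only.
# A real clustering workflow discovers groups from the data itself.
TOPIC_KEYWORDS = {
    "playground": ["playground", "play ground"],
    "parking": ["parking"],
    "food": ["food"],
    "staffing / wait time": ["staff", "wait"],
}


def split_sentences(response):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", response)
    return [p.strip() for p in parts if p.strip()]


def assign_clusters(response):
    """Return every topic mentioned by at least one sentence of the response."""
    clusters = []
    for sentence in split_sentences(response):
        lowered = sentence.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            if topic not in clusters and any(k in lowered for k in keywords):
                clusters.append(topic)
    return clusters


response = (
    "My kids love the playground. But the parking lot is always packed. "
    "The food is great. However, you need to hire more staff to shorten the wait time."
)
print(assign_clusters(response))
```

Because each sentence is matched on its own, the single response above ends up in four clusters instead of being forced into one.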
Relevance AI provides a no-code workflow to cluster your data in a few clicks. Once you have uploaded your data, select your dataset, click on Cluster Text (One-To-Many) under Workflows, and follow the setup wizard.
On this page, you will be asked to:
- Select the text field you wish to cluster your data on
- Enter the number of categories you wish to see
Note 1: This number determines how many groups your data will be broken into.
Note 2: Smaller numbers result in a high-level overview of the data, whereas larger numbers break the data into more, finer-grained groups.
Note 3: If you already know roughly how many categories the data contains (e.g. you know there are about 45 categories in a customer feedback dataset), enter that number. Otherwise, we recommend 5% of the size of your dataset (i.e. the number of entries in the dataset). Read our guide on How to select the number of clusters.
- Select Yes if you wish to receive an email notification when the workflow completes
- Execute the workflow
Note: This workflow takes care of "vectorizing", so you do not need to vectorize your data before running this clustering workflow.
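The rule of thumb in Note 3 can be sketched as a small Python helper. The function name, its parameters, the rounding to the nearest whole number, and the minimum of one cluster are assumptions made for this illustration; the wizard itself simply asks you to type in a number.

```python
def suggested_cluster_count(n_entries, known_categories=None, percent=5):
    """Suggest a value for the "number of categories" setting.

    If you already know roughly how many categories exist, use that number.
    Otherwise fall back to 5% of the dataset size, as recommended in Note 3.
    Rounding and the floor of 1 are assumptions of this sketch.
    """
    if known_categories is not None:
        return known_categories
    return max(1, round(n_entries * percent / 100))


# A dataset with 2,000 entries and no prior knowledge of its categories
print(suggested_cluster_count(2000))

# The same dataset when you know there are roughly 45 categories
print(suggested_cluster_count(2000, known_categories=45))
```

For very small datasets the 5% rule would suggest fewer than one cluster, which is why this sketch never returns less than 1.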
Once the workflow has finished, the clustering results are automatically added to your dataset under a new field. You can check this under Datasets.