I am building and operating a ChatGPT like enteprise AI-Assistant for a year. We log all the user queries in a database for future analysis and building personalized features. We have seen its usage grow over time and it is becoming difficult for our small team(4) to use manual quality analysis methods like eye balling and vibe checks to understand system accuracy and usage patterns.
In this quick post I will cover how we can use BERTopic and OpenAI gpt-4o-mini model to cluster user queries into labelled groups. We will run this analysis on Chatbot Arena dataset.
This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp.
BERTopic is an open-source project offers a novel approach to topic modelling. Topic modelling is an unsupervised and exploratory approach to make sense of bunch of documents.
BERTopic leveraging the power of BERT, a state-of-the-art language model, and c-TF-IDF ( a variation of the traditional TF-IDF (Term Frequency-Inverse Document Frequency) algorithm designed to work with multiple classes or clusters of documents), BERTopic helps uncover hidden thematic structures within your text data. This approach assumes that documents grouped by semantic similarity can effectively represent a topic, where each cluster reflects a major theme and the combined clusters paint a broader picture.
Continue reading “Leveraging BERTopic to Understand AI Assistant Usage Patterns”