Perhaps you successfully created a small prototype chatbot, but when you tried to productize it, you found it was confused by some of your test questions.
Perhaps you released a chatbot live and your customer satisfaction isn’t as great as you hoped because your chatbot struggles to understand your users.
And the challenge with NLP training data is that when we humans look at it, it seems to make sense.
So how do we improve something that seems good? The trick is to look at the training data the way your NLP provider (not a human) looks at it.
QBox analyzes and benchmarks your chatbot training data, showing you where it does and doesn’t perform, and why (for your chosen NLP provider).
With this insight, you can make informed decisions about how to improve the performance of your chatbot.
QBox builds upon industry-standard cross‑validation techniques to provide a multidimensional view of how your training data performs and which intents need improving. With your training data in QBox, you can choose between two types of test:
Standard test (Default option) -
QBox uses custom cross‑validation techniques incorporating K‑fold and leave‑one‑out (LOOCV) methodologies. Our experience shows that standard techniques do not perform well for natural language classifiers because of the small number of samples per intent. We’ve modified them considerably to be better applied in the context of natural language processing.
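To make the idea of cross‑validation on intent training data concrete, here is a minimal sketch of plain K‑fold scoring per intent. It is not QBox's modified method, which is not described here. The toy word-overlap classifier, the sample utterances, and the names train, predict and k_fold_per_intent are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(examples):
    """Toy stand-in for an NLP provider: word counts per intent."""
    model = defaultdict(Counter)
    for utterance, intent in examples:
        model[intent].update(tokenize(utterance))
    return model


def predict(model, utterance):
    """Pick the intent whose training words overlap most with the utterance."""
    words = tokenize(utterance)
    scores = {
        intent: sum(counts[word] for word in words)
        for intent, counts in model.items()
    }
    # sorted() makes ties resolve the same way on every run.
    return max(sorted(scores), key=lambda intent: scores[intent])


def k_fold_per_intent(examples, k):
    """Return {intent: fraction of held-out utterances classified correctly}."""
    folds = [examples[i::k] for i in range(k)]
    correct, total = Counter(), Counter()
    for i, held_out in enumerate(folds):
        # Train on every fold except the one being held out.
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train(training)
        for utterance, intent in held_out:
            total[intent] += 1
            correct[intent] += predict(model, utterance) == intent
    return {intent: correct[intent] / total[intent] for intent in total}


# Hypothetical training data: (utterance, intent) pairs.
EXAMPLES = [
    ("what is my account balance", "check_balance"),
    ("reset my password please", "reset_password"),
    ("show me my balance", "check_balance"),
    ("i forgot my password", "reset_password"),
    ("how much money is in my account", "check_balance"),
    ("i cannot log in with my password", "reset_password"),
]

three_fold_scores = k_fold_per_intent(EXAMPLES, k=3)
# Setting k to the number of examples gives leave-one-out (LOOCV).
loocv_scores = k_fold_per_intent(EXAMPLES, k=len(EXAMPLES))
```

With only a handful of utterances per intent, each held-out utterance removes a large share of what the model knows about that intent, which is the small-sample problem mentioned above.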
BYOTS (Bring your own test set) -
Also known as “Blind Data” or “Test data” - QBox uses utterances that you, the tester, have labeled with the expected intent.
We recommend using the default test type, the K‑fold‑based standard test, until your intents have improved enough to score above 80.
Once you reach this milestone, you can optionally run some tests with blind data to see how your model responds to a wider range of user interactions.
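A blind test set is simply a list of utterances, each paired with the intent you expect. The sketch below shows one way such a set could be scored. The function evaluate_blind_set, the keyword_classifier stand-in for your NLP provider, and the sample utterances are hypothetical and are not part of QBox.

```python
from collections import Counter


def evaluate_blind_set(classify, test_set):
    """Score a classifier on labeled utterances it was never trained on."""
    correct, total, misses = Counter(), Counter(), []
    for utterance, expected in test_set:
        predicted = classify(utterance)
        total[expected] += 1
        if predicted == expected:
            correct[expected] += 1
        else:
            misses.append((utterance, expected, predicted))
    accuracy = {intent: correct[intent] / total[intent] for intent in total}
    return accuracy, misses


def keyword_classifier(utterance):
    # Hypothetical stand-in for a call to your NLP provider.
    if "password" in utterance.lower():
        return "reset_password"
    return "check_balance"


# Hypothetical blind data: (utterance, expected intent) pairs.
BLIND_SET = [
    ("my password stopped working", "reset_password"),
    ("i am locked out of my login", "reset_password"),
    ("how much do i have in savings", "check_balance"),
]

accuracy, misses = evaluate_blind_set(keyword_classifier, BLIND_SET)
```

Here the second utterance is a miss: it never mentions a password, so the stand-in classifier labels it check_balance. Misses like this show how real users phrase things differently from your training data.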
So what does this mean for you?
Measures understanding - It’s not easy to know when your chatbot has had sufficient training, or even to quantify the quality of that training.
With the QBox scoring system, you can measure understanding, and recognize when your chatbot has reached the maturity level to serve your customers efficiently and effectively.
Use QBox to regain control of your chatbot testing, and to track and analyze how your AI chatbot’s understanding progresses as you train it.
Increases knowledge - At the beginning, your chatbot will inevitably cover a smaller subject area than your long-term plan calls for. But what happens when you expand the domain that your chatbot covers, and increase its knowledge?
QBox allows you to validate your chatbot’s knowledge as you expand it. This helps you rapidly identify whether you’re causing regressions in knowledge areas that were previously performing well.
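Spotting a regression comes down to comparing per-intent scores from before and after a change. A minimal sketch, assuming scores are stored as a mapping from intent name to a number (find_regressions, the intent names and the score values are made up for illustration):

```python
def find_regressions(before, after, tolerance=0):
    """Return intents whose score dropped by more than `tolerance`.

    Intents that exist in only one of the two runs are ignored, so newly
    added intents do not show up as regressions.
    """
    return {
        intent: (before[intent], after[intent])
        for intent in before
        if intent in after and before[intent] - after[intent] > tolerance
    }


# Hypothetical scores before and after adding a new "open_account" intent.
scores_before = {"check_balance": 85, "reset_password": 90}
scores_after = {"check_balance": 70, "reset_password": 92, "open_account": 60}

regressions = find_regressions(scores_before, scores_after)
```

In this made-up run, check_balance fell from 85 to 70 after the new intent was added, so it is the one to look at first. The tolerance parameter lets you ignore small fluctuations.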
Monitors live interactions - Your customers are unpredictable, which is why your chatbot’s knowledge will need updating.
To make sure your chatbot stays relevant, you need to review live interactions between your chatbot and your customers. This can take up a lot of time, time you could be spending on more important areas of your AI chatbot development.
With intelligent sampling, QBox automatically monitors user–bot interactions so you only have to review negative interactions.
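How QBox selects interactions is not described here. As a rough illustration of the general idea only, the sketch below keeps logged interactions that look like misunderstandings, using two assumed signals: a low confidence value and a fallback intent. The log format, field names and thresholds are all hypothetical.

```python
def flag_for_review(interactions, min_confidence=0.5, fallback_intent="fallback"):
    """Keep only interactions where the bot may have misunderstood the user."""
    return [
        item
        for item in interactions
        if item["confidence"] < min_confidence or item["intent"] == fallback_intent
    ]


# Hypothetical interaction log entries.
LOGS = [
    {"text": "what is my balance", "intent": "check_balance", "confidence": 0.93},
    {"text": "cant get in", "intent": "reset_password", "confidence": 0.31},
    {"text": "do you sell pizza", "intent": "fallback", "confidence": 0.88},
]

to_review = flag_for_review(LOGS)
```

Only the second and third entries are kept, so a reviewer would skip the interaction the bot handled confidently.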
Easy to use - Your subject-matter experts are best placed to train your chatbot because they know their subjects better than anyone else.
QBox is easy to use, whether or not your subject-matter experts have a background in computer or data science. Anyone can monitor and train AI chatbots.