🤖 Auto QA Setup: Instructions – Kaizo

Writing an Auto QA instruction (also called Auto QA prompting) is like teaching someone exactly what to do. In this case, we are teaching Samurai. The language of explaining is not the same as the language of understanding, and there are a million ways in which an explanation can go wrong. This article helps you understand how to instruct our AI and get the most out of it for Auto QA.

🌟 Start Simple

Crafting instructions is an iterative process: expect a considerable amount of trial and error before you reach optimal results.

Start your instructions (prompt) with something straightforward, then gradually add details and context as you aim for better results. Refining your instructions as you go is crucial. Throughout this guide, you'll see many cases where precision, simplicity, and brevity lead to better outcomes.

When a complex task consists of multiple subtasks, break it down into more manageable components and build it up progressively as your results improve. This keeps you from overloading the prompt with complexity at the start.

💡 Example

Let’s imagine that we want to Auto QA a criterion called “Welcome/sign-off greeting not used”.

We start with the following instructions:

“We consider this criterion to be violated if a welcome greeting or a sign-off phrase was not used.”

This instruction is simple yet imperfect as it is missing a lot of details:

- What types of greetings and sign-offs do we consider to be correct?
- When do we expect the welcome greeting and sign-off to be used? Once at the beginning of the conversation and once at the end of it?
- Do we have different considerations for chat vs. email conversations?
- Is it okay to have either the greeting OR the sign-off, or do we need both?

And so on... You get the point 😉

🧐 Specificity

Be very specific about the instruction and the task you want the AI to perform. The more descriptive and detailed the instructions are, the better the results. No particular keywords lead to better results; a good format and a descriptive instruction matter more. Providing examples in the prompt is also very effective in getting the desired output.

There is a limit to how much specificity and detail can help, though. Including many unnecessary details is not a good approach: every detail should be relevant and contribute to the task at hand. Finding the right balance takes a lot of experimentation and iteration, which we encourage.

💡 Given this knowledge, let’s modify our example instruction:

"The welcome structure should consist of:
- The customer being greeted by name
- Thanking them for their contact

The sign-off structure should consist of:
- Advising the customer to reach out if they require further assistance
- Asking if they have any questions or feedback (except when a reply is expected from the customer. In this case, we inform them we are looking forward to their reply)
- A sign-off sentence (e.g. "Kind regards,")"

Now it looks much better, as we specified the expected welcome and sign-off structures.

💪 Avoid Impreciseness

When instructions are imprecise, Samurai may struggle to understand the rater’s intent, leading to confusion and a lower match rate with the historical QA ratings.

💡 Example

Let’s imagine that we want to Auto QA a criterion called “Making grammatical mistakes”. Here are our initial instructions:

"We consider grammatical mistakes to have been made if the agent makes grammatical mistakes"

In this case, we did not specify the number of mistakes. A better approach would be:

"We consider this criterion to be violated if the agent made 1 or more grammar mistakes.

Perform the following to check this criterion:
1 - Collect all the potential grammar mistakes
2 - Count the mistakes
3 - If the mistake count is greater than or equal to 1, flag this criterion as violated"

With the new instructions, we explicitly specified how many mistakes trigger a violation and added step-by-step instructions to help the AI apply the criterion the way we expect.
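
The three steps in the improved instruction amount to a simple threshold rule. Purely as an illustration (this is not a description of how Samurai works internally), the rule can be sketched in Python. The function name and the `threshold` parameter are hypothetical, and the sketch assumes the potential mistakes have already been collected into a list:

```python
def is_criterion_violated(mistakes, threshold=1):
    """Return True when the number of collected mistakes reaches the threshold.

    Hypothetical helper for illustration; `mistakes` is assumed to be the
    list produced by step 1 (collecting all potential grammar mistakes).
    """
    # Step 2: count the mistakes.
    mistake_count = len(mistakes)
    # Step 3: flag the criterion as violated at `threshold` or more mistakes.
    return mistake_count >= threshold


print(is_criterion_violated([]))                   # no mistakes collected
print(is_criterion_violated(["their -> there"]))   # one mistake collected
```

Writing the threshold down as a number, as the instruction does with "1 or more", is what removes the ambiguity: a rater who tolerates up to two mistakes would state "3 or more" instead.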

🤩 More Examples

And here are some additional examples of successful criterion instructions.

- Criterion Name: Security Verification

- Criterion Instructions:

"Security Verification is about understanding if the contacting customer is the one who placed the order or knows all the details of the order.

When is it required? Security verification is required only for inquiries about orders, deliveries, and troubleshooting purchased products. Subscription-related inquiries do not require a security verification.

The agent needs to collect at least three order-related pieces of information: at least one Primary piece of information and at least 2 Secondary pieces of information. Primary and Secondary order information are defined below.

Primary order information:

- Email on the order
- Phone Number on the order

Secondary order information:

- Order number
- Full name on the order
- Shipping address on the order

Security Verification is considered incomplete if the agent fails to collect at least one piece of Primary order information and at least 2 pieces of Secondary order information.

We consider Security Verification to have been performed by the agent even if the bot performed it."
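
The Security Verification instruction works well because its pass/fail rule is unambiguous enough to be written as a small check. The sketch below is only an illustration of that rule, not part of the product: the inquiry-type labels, field names, and function names are all hypothetical, and `collected` is assumed to contain every piece of information gathered in the conversation, whether by the agent or by the bot.

```python
# Hypothetical labels for the information defined in the instruction.
PRIMARY = {"email", "phone_number"}
SECONDARY = {"order_number", "full_name", "shipping_address"}

# Inquiry types that require verification; subscription inquiries do not.
REQUIRES_VERIFICATION = {"order", "delivery", "troubleshooting"}


def is_verification_required(inquiry_type):
    """Verification is required only for order, delivery, and troubleshooting inquiries."""
    return inquiry_type in REQUIRES_VERIFICATION


def is_verification_complete(collected):
    """Complete when at least 1 Primary and at least 2 Secondary pieces were collected."""
    collected = set(collected)
    primary_count = len(collected & PRIMARY)
    secondary_count = len(collected & SECONDARY)
    return primary_count >= 1 and secondary_count >= 2


print(is_verification_required("subscription"))
print(is_verification_complete({"email", "order_number", "full_name"}))
print(is_verification_complete({"email", "phone_number", "order_number"}))
```

If you can restate your own criterion as a check this small, the instruction is probably precise enough for the AI to follow consistently.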

- Criterion Name: Refund Process Followed

- Criterion Instructions:

"The refund process should be applied in the following cases:

- The customer has not received the package within 7 days of receiving the delivery confirmation email
- The customer received a faulty product and refused the solution provided by the agent, which includes sending another unit of the same product
- The customer renewed their subscription by mistake

Refrain from triggering a refund if:

- The delivery was delayed
- The product was reported by the customer to be faulty while the warranty was void

Exceptions:

- In very special cases, refund requests can be escalated to management, and it is then up to management to decide whether or not a refund should be issued.
- Cases that warrant escalation to management are ones where the client is being unreasonable and is threatening to give a bad review.
- There are also cases where no refund is needed at all because the inquiry is about returning or exchanging a product, or is simply a question."

- Criterion Name: Misusing the client's name

- Criterion Instructions:

"A person's (customer's) name is, to that person, the sweetest and most important sound in any language.

Please do not overdo it; using the client's name at the beginning, middle, and end of the chat conversation is sufficient.

We consider the client's name to have been misused if the client's name was not mentioned at the beginning, middle, or end of the whole conversation."

- Criterion Name: Subject line not updated or relevant

- Criterion Instructions:

"This criterion checks if the subject line format conforms to our company's standards.

The subject line should have the following format:

1 - The brand name (either Kaizo or Kaizo Support)
2 - The reason for the customer's contact OR a two-word summary of the customer's inquiry
3 - We must refrain from adding any negative connotation to the subject line

We consider this criterion to be violated if at least one of the previous conditions was not met.

Some examples of subject lines that are considered to be compliant:

- Your Feedback for Kaizo
- Kaizo - Your request
- Your inquiry for Kaizo
- Your delivery with Kaizo"
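
The subject line instruction follows an "all conditions must hold" pattern: the criterion is violated as soon as one condition fails. A minimal Python sketch of that pattern is shown below, under stated assumptions. The function name and the `NEGATIVE_WORDS` list are hypothetical examples, and condition 2 (whether the subject states the reason for contact) is passed in as a flag because it needs judgment rather than a simple text match.

```python
import re

# Condition 1: the brand name, either "Kaizo" or "Kaizo Support".
BRAND_PATTERN = re.compile(r"\bKaizo( Support)?\b")

# Condition 3: hypothetical examples of words with a negative connotation.
NEGATIVE_WORDS = {"complaint", "problem", "failure"}


def is_subject_line_violated(subject, states_contact_reason):
    """Return True if at least one of the three conditions is not met."""
    has_brand = BRAND_PATTERN.search(subject) is not None
    words = {word.lower() for word in re.findall(r"[A-Za-z']+", subject)}
    is_negative_free = words.isdisjoint(NEGATIVE_WORDS)
    conditions = [has_brand, states_contact_reason, is_negative_free]
    return not all(conditions)


print(is_subject_line_violated("Kaizo - Your request", True))
print(is_subject_line_violated("Your complaint about Kaizo", True))
```

Listing the conditions one by one in the instruction, as the example does, makes it clear to the AI that every condition is checked separately and that a single failure is enough.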

🚨 Still do not know where to start?

Assign this task to Samurai: click the "Auto-generate Instructions" button and let the AI propose a first draft of the Instructions. The draft is built from the existing information about the criterion, and you can and should customize it further.
