Boosting risk detection by 64%

Role: I worked as a Sr. Product Designer on the Behavox compliance team. This project lasted ~2 months in 2021, and I collaborated with a Product Manager, a Front-end Engineer, and a UX Researcher.

Deliverables: User interviews, User flow mapping, Workflow mapping, Wireframes, Mockups, Prototypes, Usability testing, and Component creation.

PROBLEM

Users struggled to fine-tune machine learning models.

Behavox offers an AI-powered SaaS platform for enterprise risk management that relies on human feedback to improve its AI models. Yet giving that feedback required code changes, which led to low usage, user dissatisfaction, and inaccurate risk predictions.

GOAL

Improve the tool's usability and ML forecast accuracy.

We sought to strengthen reinforcement learning from human feedback. Our goal was to boost prediction precision and improve user satisfaction.

IMPACT

64%

Boost in risk alert accuracy

~30%

Rise in overall user satisfaction

Discovery of a new user type

  • In accordance with my NDA obligations, I have excluded sensitive data and modified visual assets. All information in this case study is my own and does not represent the views of Behavox.

DISCOVERY

I started by mapping the model-refining flow to better understand its complexity.

I shared these flows on Slack to validate my understanding with data engineers.

User interviews revealed significant issues and user needs associated with the tool.

I helped scope 6 remote user interviews. Some notable discoveries included:

  1. Users required a more efficient way to mark alerts.

  2. Minimizing manual code updates was imperative.

  3. Adding new alert signals (phrases) was essential to improve accuracy.

  4. The discovery of a new user archetype highlighted the diversity in company workflows.

DEFINITION

Armed with our discoveries, I started exploring solutions.

Sketches, wireframes, and mockups became my tools as I explored ideas and engaged in collaborative discussions with the team.

To facilitate compliance officers' tasks, I explored ways of providing feedback and adding new risk content to ML models.

I then designed for compliance managers' monitoring needs, focusing on real-time updates on model performance and feedback contributions.

TESTING

I created an easy, efficient way to provide feedback on flagged content.

We chose to test a thumbs-up/down feedback pattern. I hypothesized it would be the easiest way for users to give feedback on risk signals.
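To make the pattern concrete, here is a minimal sketch of the kind of payload a thumbs-up/down control could send back for model retraining. This is my own illustration, not Behavox's implementation; the RiskFeedback shape, submitFeedback helper, and /api/feedback endpoint are all hypothetical.

```ts
// Hypothetical illustration only — not Behavox's actual API.
// A thumbs-up/down vote on a flagged risk signal, sent back so the
// model can learn from human feedback.
interface RiskFeedback {
  alertId: string;      // the alert being reviewed
  signalPhrase: string; // the flagged phrase the user is judging
  accurate: boolean;    // true = thumbs up, false = thumbs down
  reviewerId: string;   // who gave the feedback
  timestamp: string;    // ISO 8601
}

// Assumed endpoint name; the real service would differ.
async function submitFeedback(feedback: RiskFeedback): Promise<void> {
  const res = await fetch("/api/feedback", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(feedback),
  });
  if (!res.ok) throw new Error(`Feedback failed: ${res.status}`);
}
```

A single boolean keeps the interaction binary and cheap for reviewers, which was the core of the hypothesis.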

We broadened our strategy to include adding new risk-alert content.

Since highlighting words was an existing pattern in the product, I proposed using the same logic to add a new risk signal.
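Below is a minimal sketch of how a highlighted phrase could be captured as a new risk signal, using the standard browser window.getSelection() API. It is illustrative only; the NewRiskSignal shape and captureHighlightedSignal helper are hypothetical, not Behavox code.

```ts
// Hypothetical sketch — turning a text highlight into a new risk signal.
interface NewRiskSignal {
  phrase: string;        // the highlighted words
  sourceAlertId: string; // the alert the phrase came from
  addedBy: string;       // the user who added it
}

function captureHighlightedSignal(
  alertId: string,
  userId: string
): NewRiskSignal | null {
  // window.getSelection() returns the user's current text highlight.
  const selection = window.getSelection()?.toString().trim();
  if (!selection) return null; // nothing highlighted
  return { phrase: selection, sourceAlertId: alertId, addedBy: userId };
}
```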

I created a dashboard focusing on monitoring ML performance and feedback rounds.

For Cameron's workflow, the new user archetype we had discovered, I focused on displaying essential information: feedback contributors and model performance.

Setback! Backend restrictions threatened to pause the project.

As I completed the prototypes for testing, the engineers found an issue: adding the backend logic to fetch dashboard data would take longer than initially expected.

I partnered with the Product Manager to prevent the project from being halted.

Management wanted us to pause the project. To avoid this, we negotiated a two-week window to test half of the flow with 7 users.

ITERATION

Testing revealed two main areas that needed improvement.

  1. Simplifying alert justifications

— "Justifications clarify flagged content for me."

It was crucial for users to read risk alert justifications before marking flagged content as accurate or inaccurate. Yet this section was text-heavy, adding visual clutter that hurt readability.

I enhanced clarity by optimizing spacing and layout. Next, I introduced expandable sections for secondary information. Finally, I prioritized quick access to regulatory details.

  2. Disproving my feedback pattern hypothesis

— "I couldn’t find how to report flagged content."

Contrary to my expectations, users didn't intuitively understand the inline feedback pattern. The majority didn't click the flagged sentence to give feedback.

Click-through data showed that users went to the justification section first as part of their workflow. Based on this, I proposed adding the feedback pattern there.

HANDOFF

After a final meeting with my team, I made a few small adjustments and handed the designs over to the developers.

During this time, we were implementing a new design system. As part of that effort, I was tasked with updating the project's designs for the next iteration.

KEY LEARNING

Focusing on making progress rather than striving for perfection.

In collaboration with the PM, I executed a partial solution that prevented the project from stopping. By prioritizing key insights, we delivered significant results to users and the company. This underscored the value of proactive action over waiting for perfection.

“Communication with Yanick was always plain and easy. His general UI ingenuity had a significant impact on Behavox UI.”

Artsiom Mezin, Engineering Manager @Behavox
