Netflix’s Two Thumbs Up Feature, Explained – Protocol

For nearly five years, Netflix has offered simple thumbs-up and thumbs-down icons that let viewers express their preferences and help its algorithms provide better recommendations. In surveys, however, people have frequently said that this kind of binary voting doesn’t really do justice to their tastes.

What if they were really, really in love with a show?

Tasked with finding a better way to express such levels of adoration, the streaming service recently explored the idea of adding a heart icon to the Netflix app. The heart seemed like an obvious choice: it is a universal sign of love and widely used in apps like Instagram and Twitter.

But Netflix wouldn’t be Netflix if the company didn’t put such features through rigorous testing; in this case, it took almost a year. Along the way, the company discovered that hearts weren’t actually the best-performing option after all, and instead opted for a new two-thumbs-up option, which is being made available to its subscribers worldwide this week.

Here’s how that change of heart happened.

Finding a universal symbol for love

Netflix rolled out its new two-thumbs-up feature to its mobile and smart TV apps as well as its website on Monday. Subscribers are told that this type of feedback directly affects future recommendations. A thumbs down means that a title will no longer be suggested; a thumbs up will cause Netflix to recommend similar content. Two thumbs up means “we know you’re a real fan,” as the Netflix mobile app says.

The company started work on the feature about a year and a half ago, based on feedback it was getting from its subscribers in surveys and research interviews. “We were hearing members say that ‘like’ and ‘dislike’ wasn’t enough,” said Christine Doig-Cardet, who leads the Custom UI Product Innovation team at the company. “There were shows that they really, really, really enjoyed. It was important to differentiate between what they loved and what they liked.”

Once the decision was made to address this issue, Netflix launched a series of design sprints to come up with visuals for this level of fandom. Early ideas included the heart, an applause icon, shooting stars and others. The designers also consulted with the company’s globalization team to find a truly universal icon. “The design team and the globalization team really [homed] in on symbols that evoke love,” said Ratna Desai, Director of Product Design at Netflix. “We wanted it to be very precise, very concise, because we wanted it to be a very quick interaction.”

Picture: Netflix

Netflix tested a number of different reactions that could reflect a viewer’s interest in a show.

At the same time, Netflix continued to poll its subscribers, who had another suggestion. “We had a lot of interviews and surveys, [and] the heart wasn’t really resonating,” Doig-Cardet said. “The idea that came from the members was, why don’t you just do two thumbs up?”

At that point, two favorites emerged. The heart seemed like an obvious choice, but two thumbs up also seemed to work well with Netflix’s existing iconography. Plus, as anyone who’s ever read a review by the late Roger Ebert will know, it has long meant a vote of confidence for great entertainment.

Going with what subscribers wanted seemed like a good idea, which counted in favor of two thumbs up. But what if these subscribers were wrong?

“Some people can be very loud,” Doig-Cardet said. “But when you look at the big picture, talk to a lot of different members and see how they interact with the different features, it doesn’t always [match] the first strong voices.”

Proving the loudest voices wrong

Netflix has long tried to figure out how best to collect member-based content ratings, and dealing with these loud voices has been difficult. In its early days, Netflix offered a five-star rating system, similar to how people rate their Uber drivers.

At the time, Netflix displayed an average of these ratings on its website to indicate how popular a title was with subscribers. This resulted in some titles having 4.5 stars, or other fractions, leading people to wonder why they couldn’t also rate in half-star increments.

Thousands of people told the company in surveys that they wanted this level of granularity, but Netflix employees weren’t sure those opinions reflected how people actually used the service. To make sure it wasn’t simply following the opinion of a vocal minority, Netflix turned to something that has become a key part of its product development toolkit over the years: an A/B test.

In the case of the half-star test, the results were clear: ratings dropped significantly when people were asked to provide feedback with this level of granularity. In other words, A/B testing proved the loudest voices wrong.

Netflix repeated this type of test when it completely replaced five-star ratings with thumbs in 2017. In A/B testing prior to this change, the company saw rating activity increase by 200% with thumbs-up and thumbs-down icons. Part of the reason was that these icons were simply easier to use, but closer examination of the data also revealed that they tended to be more accurate: people would give five stars to titles they deemed worthy of that status, including award-winning documentaries that would then linger unwatched in their queues for months. At the same time, they would frequently binge on reality TV shows that they themselves had only rated three stars.

The Moment of Truth: Heart or Thumb?

Now, Netflix is ready to add a little more complexity to those ratings again. That’s partly because media consumption habits and app interfaces have changed across the board. “People use Netflix in the context of their overall lives,” Desai said. “They interact with Instagram, with various social networks, with ridesharing apps.” Some of the interaction models of these apps and experiences weren’t easily applicable to Netflix, which is primarily used on TVs and focuses much more on lean-back entertainment than, say, Instagram. “But there are a few levers that our members are asking for now that they didn’t have in the past,” she said.

Still, there were unresolved questions, including which would work better: hearts or thumbs? And would either of them actually have a lasting impact beyond satisfying the strong voices heard in surveys and other forms of qualitative research?

“We’ve been in situations where we can hear very strong insights in a qualitative framework that run counter to what we find out in A/B testing,” Desai said. “This is where the fun begins.”

Last summer, Netflix launched a series of A/B tests for the new rating feature. Picture: Netflix

Last summer, Netflix launched a series of A/B tests for the new rating feature, testing both the heart and the two-thumbs-up option. At the same time, the company continued to survey subscribers, including those enrolled in the tests, to see if the new features actually provided value.

Testing of the feature extended well into the fall, as the teams working on it wanted to make sure they got it right. “We don’t rush a test,” Doig-Cardet said. “Sometimes there’s this impulse to launch early and break things and stuff. It’s not [our] approach.” One of the reasons for running A/B tests over weeks or even months is to allow people to get used to a feature and see whether engagement remains high, or whether people are merely attracted by the novelty of a feature and then grow tired of it.

In the end, the numbers were clear: providing additional feedback worked. “We saw a really big increase in engagement because people had a new way of talking to us,” Desai said. This lift was much bigger with two thumbs up than with the heart, which was a surprise, since the folks at Netflix were expecting the heart to win.
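To make the idea of “lift” concrete, here is a minimal Python sketch of how the engagement of two test cells could be compared against a control group. Netflix did not share its numbers or its methodology, so the counts below are invented purely for illustration, and the function names `relative_lift` and `two_proportion_z` are hypothetical. The significance check is a standard two-proportion z-test, offered as one common approach rather than as the company’s actual analysis.

```python
from math import erf, sqrt


def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Relative change of the variant's rate over the control's rate."""
    return (variant_rate - control_rate) / control_rate


def two_proportion_z(x_a: int, n_a: int, x_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided two-proportion z-test; returns (z statistic, p-value)."""
    p_a = x_a / n_a
    p_b = x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical data: members in each cell, and how many of them rated a title.
cells = {
    "control": (10_000, 1_000),
    "heart": (10_000, 1_150),
    "two_thumbs": (10_000, 1_400),
}

n_control, x_control = cells["control"]
for name in ("heart", "two_thumbs"):
    n, x = cells[name]
    lift = relative_lift(x_control / n_control, x / n)
    z, p = two_proportion_z(x_control, n_control, x, n)
    print(f"{name}: lift={lift:.1%}, z={z:.2f}, p={p:.4f}")
```

With these made-up counts, the two-thumbs cell shows a larger lift than the heart cell. Because a novelty effect can inflate early numbers, the same comparison would need to be repeated over later weeks of the test before drawing a conclusion.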

These kinds of unexpected results are what make A/B testing so valuable, Doig-Cardet said. “If we weren’t surprised, we would be doing something wrong,” she said. “We would validate our own assumptions, rather than letting the numbers determine what is a better experience.”

Constant testing, though it may spoil the big reveal

Netflix’s extensive use of A/B testing has been well documented over the years, including by its own data science team. The company is constantly testing a number of different features with subsets of its audience. Basically, if you’re a Netflix subscriber, there’s a good chance you’re enrolled in some sort of test right now.

Some of these tests are for obvious interface tweaks, and some are related to codec or infrastructure changes under the hood. In fact, Netflix runs so many tests that members can be enrolled in more than one at the same time, which is why the company has developed a comprehensive experimentation platform that helps its data science team avoid test conflicts and make sense of all the data collected. (Netflix offers members the ability to opt out of testing through their account settings.)
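The article does not describe how Netflix’s platform avoids conflicts between tests. One widely used approach, sketched below in Python under that stated assumption, is to hash each member into a fixed number of buckets per “layer” and give tests that could interfere with each other non-overlapping bucket ranges within the same layer. The helper names `bucket` and `assign_in_layer`, the layer name and the test names are all hypothetical.

```python
import hashlib
from typing import Optional


def bucket(member_id: str, salt: str, buckets: int = 100) -> int:
    """Map a member to a stable bucket in [0, buckets) for a given salt."""
    digest = hashlib.sha256(f"{salt}:{member_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def assign_in_layer(
    member_id: str, layer: str, tests: list[tuple[str, int, int]]
) -> Optional[str]:
    """Return the single test in this layer that the member falls into.

    `tests` holds (test_name, start_bucket, end_bucket) entries whose
    half-open ranges must not overlap, so a member can land in at most
    one test per layer.
    """
    b = bucket(member_id, layer)
    for name, start, end in tests:
        if start <= b < end:
            return name
    return None  # The member is in no test for this layer.


# Hypothetical layer: two rating-UI tests that must never overlap.
rating_ui_layer = [("heart_icon", 0, 10), ("two_thumbs_up", 10, 20)]

for member in ("member-1", "member-2", "member-3"):
    print(member, assign_in_layer(member, "rating-ui", rating_ui_layer))
```

Because the bucket depends only on the member ID and the layer name, a member sees the same experience on every visit. Tests in different layers use different salts, so a member can still take part in, say, an interface test and an unrelated infrastructure test at once.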

However, the development of the new two-thumbs-up feature also shows that A/B testing alone is not enough. Without also talking directly to subscribers, the company would have prioritized the development of the heart icon and not given two thumbs up a chance to prove itself in A/B testing. “We take this multi-pronged approach of looking at many different inputs,” Doig-Cardet said. “We collect information from our customer service, surveys, interviews that we conduct, and use all of this to inform [what] we should invest in and test.”

Both surveys and A/B testing carry the risk of exposing future features to the public eye. Subscribers frequently post about new things they have spotted in the app, and journalists tend to jump on those stories to shine a light on the company’s roadmap. For Netflix, it’s just a cost of doing business. “We’re comfortable making this tradeoff of providing early visibility because we want to make sure it works for our members,” Doig-Cardet said.

“In the places where I’ve worked, there’s this incredible unveiling of the feature, with the campaign and all that,” Desai added. Instead, Netflix operates a bit more in the open, which includes testing new, unannounced features with tens of thousands of members.

“It’s our bread and butter,” Desai said. “It’s our secret sauce to how we innovate.”