By Ryan Fauber
“I am so clever that sometimes I don’t understand a single word of what I am saying.” – Oscar Wilde.
Have you ever returned to your writing only to realize it was a bit less coherent than you thought? It shouldn’t be a surprise, then, that machine learning models struggle to dissect human language. After all, as cognitive science scholar Douglas Hofstadter writes, “the recursive structure of language mirrors the recursive structure of cognition, and vice versa.”
Thankfully, the authors and organizers at the Association for Computational Linguistics (“ACL”) have been pushing the limits of natural language understanding for many years, which is why we’re so excited to attend this year’s conference in Toronto.
Of the many compelling topics at the conference, we’re particularly excited to explore AI’s intersection with e-commerce. We see this use case as a microcosm of the broader field. We view online sales channels and backend systems as data-rich, but retailers’ inability to leverage that data keeps them insight-poor. We’re not surprised by this, since it is no small feat to parse unstructured and multi-modal data, react to ambiguous searches and reviews, and predict buying behavior in the age of the micro-trend. However, we believe new research presented at ACL can address these fundamental challenges.
Below are some of the most compelling topics that we reviewed:
Classification: When is a Shoe Not a Shoe?
What you call a shoe, I might call a sneaker. And what a model classified as a small, grey shoe may in fact be a medium, off-white boot. In extensive SKU databases, product classifications like color, size, and subcategory can have a major impact on downstream tasks like search. Having the wrong classification is the difference between the shoe showing up in your search results or not – and if you can’t find it, you can’t buy it. To tackle this concern, authors of a recent ACL paper aim to improve traditional classification models using generative AI. Other accepted papers dealt with topics such as multi-modal attributes on product pages and unified vision-language image captioning. A related paper presented at this month’s Computer Vision and Pattern Recognition conference (“CVPR”) reconciles inconsistent labeling between image datasets to improve classification. We’re excited about the way these approaches work together to refine product and image categorization, ultimately offering the potential to improve downstream, customer-facing models.
Search and Discovery: The Customer (Search) is (Not) Always Right
The business of search and discovery is in many ways the business of bringing the offline shopping experience online. Think about how a physical store is structured: as a buyer, you are likely to be guided through many aisles, where you might make unplanned purchases (or go ‘treasure hunting’). Additionally, you might stop and speak to store workers, who can answer questions and recommend related products. Now, consider your latest online purchase. If you didn’t know exactly what you wanted, you may have struggled to navigate a site’s massive inventory. Search results were probably inconsistent, and recommendations may not have been relevant. After all, to the retailer you’re just an anonymous visitor who’s entered a few vague searches!
We believe brands are eager for ways to better understand user preferences based on online buyer behavior. However, recent changes in Apple’s App Tracking Transparency policy and EU regulations on cookies have left many businesses with only session-level behaviors such as searches to rely on.
Several papers at this year’s conference touch on different ways to address these challenges – including inferring intent from a customer’s history of searches, correcting spelling mistakes (which affect 32% of searches), and even greatly improving speech recognition for voice-powered search.
Forecasting and Demand Planning: Trend Tracking in the Age of Fast Fashion
Did you know that for a brief moment in 2022, bright pink dipping sauce dominated “FoodTok”? Or that there are over 20,000 mentions of luxury fashion a day on Twitter? Social media ecosystems can reveal quite a bit about consumer preferences – if retailers can keep up.
While this subject may not have received prominent coverage in this year’s ACL papers, it will be discussed in a joint ACL workshop at the Knowledge Discovery and Data Mining conference (“KDD”) in August. We view forecasting and planning as both crucial challenges and enormous opportunities, ones that multi-modal models are especially suited to address. The sheer volume of unstructured social media data is daunting, and the signals can be ambiguous and extremely fast-moving. However, we believe that improvements in interpreting these signals have the potential to dramatically reduce overstocking, saving the industry billions and reducing its environmental impact.
If you’re excited about these use cases, or any other breakthroughs and applications in ACL work, shoot us a note or find us in Toronto. We may not be as clever as Oscar Wilde, but perhaps we can understand a few words between us.