A Wave Of Billion-Dollar Language AI Startups Is Coming

Language is at the heart of human intelligence. It therefore is and must be at the heart of our efforts to build artificial intelligence. No sophisticated AI can exist without mastery of language.

The field of language AI—also referred to as natural language processing, or NLP—has undergone breathtaking, unprecedented advances over the past few years. Two related technology breakthroughs have driven this remarkable recent progress: self-supervised learning and a powerful new deep learning architecture known as the transformer.

We now stand at an exhilarating inflection point. Next-generation language AI is poised to make the leap from academic research to widespread real-world adoption, generating many billions of dollars of value and transforming entire industries in the years ahead.

A nascent ecosystem of startups is at the vanguard of this technology revolution. These companies have begun to apply cutting-edge NLP across sectors with a wide range of different product visions and business models. Given language’s foundational importance throughout society and the economy, few areas of technology will have a more far-reaching impact in the years ahead.

Horizontal Technology Providers

Building a state-of-the-art NLP model today is incredibly resource-intensive and technically challenging. As a result, very few companies or researchers actually build their own NLP models from scratch. Instead, virtually all advanced NLP in use today, no matter the industry or setting, is based on one of a small handful of massive pretrained language models. Stanford researchers recently dubbed these pretrained models “foundation models” in recognition of their outsize influence.

Most often, foundation models are built and open-sourced by the publicly traded technology giants—e.g., BERT from Google, RoBERTa from Facebook.

OpenAI is another important source of state-of-the-art NLP technology. Its large language model GPT-3 is perhaps the best-known and most widely used foundation model today. GPT-3 is a generative model (the G in its name stands for generative): it generates original text in response to prompts from human users. OpenAI has made GPT-3 commercially available via API for use across applications, charging per token (a token is a word fragment, roughly three-quarters of an English word on average).

Given Microsoft’s massive investments in and deep alliance with the organization, OpenAI can almost be considered an arm of the tech giant.

But there is also tremendous opportunity in this category for younger startups.

Cohere is a fast-growing startup based in Toronto that, like OpenAI, develops cutting-edge NLP technology and makes it commercially available via API for use across industries. Cohere’s founding team is highly pedigreed: CEO Aidan Gomez is one of the co-inventors of the transformer; CTO Nick Frosst is a Geoff Hinton protégé. The company recently announced a large Series B fundraise from Tiger Global less than a year after emerging from stealth.

While Cohere does produce generative models along the lines of GPT-3, the company is increasingly focused on models that analyze existing text rather than generate novel text. These classification models have myriad commercial use cases: from customer support to content moderation, from market analysis to search.

“Language generation has seemingly monopolized the attention of those interested in NLP, but the most significant opportunity for developers interested in building NLP into their systems actually rests in language representation models like BERT,” said Gomez. “While slightly less ‘miraculous’, these models form the backbone of some of the most sophisticated NLP systems in the world.”

Another leading horizontal NLP startup is Hugging Face. Hugging Face is a wildly popular community-based repository for open-source NLP technology. Unlike OpenAI or Cohere, Hugging Face does not build its own NLP models. Rather, it is a platform that stores, serves and manages the latest and greatest in open-source NLP models, including enabling customers to fine-tune these models and deploy them at scale.

Hugging Face’s secret sauce is its community: it has become a go-to destination for companies and researchers in the world of NLP to collaborate. In this respect it can be loosely analogized to GitHub, but for machine learning rather than traditional software engineering.

Other horizontal NLP providers of note include AI21 Labs and Primer.

Based in Israel, AI21 has a two-pronged business model: it offers proprietary large language models via API to power customers’ applications (its current state-of-the-art model, named Jurassic-1, is roughly the same size as GPT-3), and it also builds and commercializes its own applications on top of those models. Its current application suite focuses on tools to augment reading and writing.

Primer is an older competitor in this space, founded two years before the invention of the transformer. The company primarily serves clients in government and defense.

There is one last wild card worth mentioning in this category. Little is known yet about Inflection AI, which launched less than a month ago, beyond its eye-catching founding team: Reid Hoffman, DeepMind cofounder Mustafa Suleyman, and decorated DeepMind researcher Karen Simonyan. The company is being incubated at Greylock, where Hoffman is a general partner. Its stated mission is to “fundamentally redefine human-machine interaction” by enabling humans to “relay our thoughts and ideas to computers using the same natural, conversational language we use to communicate with people.”

Given the caliber of the company’s founders and backers, expect Inflection AI to make waves in the world of language AI before long.

Search

The most basic way that humans use natural language to interface with machines is through search. It is the primary means by which we access and navigate digital information; it lies at the heart of the modern internet experience.

Search has been dominated by a single player for so long (Google) that it is often seen as an unpromising or even irrelevant category for startups. But this is far from true.

Last month a blog post titled “Google Search Is Dying” made the rounds and sparked widespread discussion. The post hit home with a simple point: an opportunity exists for an upstart to improve and disrupt the Google search experience.

The new entrant taking on Google most directly is You.com. Founded by Richard Socher, former Chief Scientist at Salesforce and one of the world’s most widely cited NLP researchers, You.com is reconceptualizing the search engine from the ground up. Its product vision includes a horizontal layout, an emphasis on content summarization, and above all, a commitment to user data privacy.

Challenging Google directly will, to state the obvious, be an uphill battle. There is also significant opportunity for startups in search beyond the consumer internet search market with which Google has become synonymous.

ZIR AI is a young startup building a new search platform for enterprise. Leveraging the latest transformer-based techniques, ZIR is seeking to develop search technology with true semantic comprehension (as opposed to keyword-based matching) and more sophisticated multilingual capabilities. Like You.com, ZIR has a pedigreed founding team that includes former Cloudera CTO/cofounder Amr Awadallah.
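To make the distinction between keyword matching and semantic search concrete, here is a minimal sketch in Python. It is not ZIR's method: the three-dimensional embedding vectors are hand-written, hypothetical stand-ins for what a transformer model would produce, and the function names are illustrative assumptions.

```python
import math

def keyword_score(query: str, doc: str) -> int:
    """Count how many query words appear verbatim in the document."""
    doc_words = set(doc.lower().split())
    return sum(1 for word in query.lower().split() if word in doc_words)

def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical embeddings, written by hand for illustration only.
# A real system would obtain these from a trained language model.
EMBEDDINGS = {
    "how do I reset my password": [0.9, 0.1, 0.0],
    "steps to recover account login credentials": [0.85, 0.2, 0.05],
    "quarterly password policy for my team": [0.2, 0.9, 0.1],
}

query = "how do I reset my password"
docs = [
    "steps to recover account login credentials",
    "quarterly password policy for my team",
]

# Keyword matching favors the document that shares surface words.
best_keyword = max(docs, key=lambda d: keyword_score(query, d))
# Embedding similarity favors the document that is closest in meaning.
best_semantic = max(docs, key=lambda d: cosine(EMBEDDINGS[query], EMBEDDINGS[d]))

print("keyword match: ", best_keyword)
print("semantic match:", best_semantic)
```

In this toy setup the keyword scorer picks the policy document because it shares the words "my" and "password" with the query, while the embedding comparison picks the account-recovery document, which shares no words with the query but is closer in meaning.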

Algolia is a more well-established player in enterprise search; the company has raised over $300 million in venture funding since graduating from Y Combinator in 2014. Algolia offers an API that enables its customers—from tech companies like Slack to media businesses like the Financial Times—to embed search experiences in their websites and applications. Constructor.io is another fast-growing competitor in this space that focuses specifically on ecommerce search and discovery.

One final enterprise search startup worth keeping an eye on is Hebbia, which is building an AI research platform to enable companies to extract insights from their private unstructured data.

In the words of Hebbia founder/CEO George Sivulka: “Google has only indexed 4% of the world’s online data. We’re unleashing the other 96%.”

All of the companies mentioned above (including Google) focus on text search. But thanks to recent breakthroughs in AI, opportunities now exist for startups to build search tools for data modalities beyond text—and no new modality represents a bigger opportunity than video.

Video has become the dominant medium for our digital lives. A whopping 80% of the data on the internet today is video. Yet remarkably, there is no effective way to search through all this video content—to find, say, a particular moment, concept or discussion. The range of potential commercial use cases for video search is basically endless: from social media to streaming content, from digital asset management to workplace productivity, from content moderation to cloud storage.

One exciting startup building next-generation video search capabilities is Twelve Labs, which announced its seed financing earlier this month. Twelve Labs fuses cutting-edge NLP and computer vision to enable precise semantic search within videos. “Multimodal AI” like this—that is, AI that ingests and synthesizes data from multiple informational modalities at once, like image and audio—will play a central role in AI’s future.

“Large language models are accomplishing incredible things today. We think large multimodal neural networks for video are the obvious next step,” said Twelve Labs cofounder/CEO Jae Lee. “Video embeddings generated by these networks will supercharge current and future video-driven applications with an intelligence that we’ve never seen before.”

Writing Assistants

In today’s information-based economy, perhaps no skill matters more than effective writing.

Yet as anyone who has experienced writer’s block can attest, writing can be a frustrating experience. The act of translating inchoate thoughts into well-crafted language—of finding the right words—can be time-consuming and unsystematic.

Next-generation NLP promises to transform how humans write, reconceptualizing one of civilization’s most basic and vital activities.

Large language models like OpenAI’s GPT-3 can be thought of as auto-complete on (incredibly powerful) steroids. Given some text prompt from a human, these generative models can automatically produce novel sentences, paragraphs or even entire memos that are strikingly coherent, insightful, creative—almost magically so. Of course, their output remains far from perfect: they can also sometimes be nonsensical or harmfully biased.
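The auto-complete analogy can be illustrated with a deliberately tiny sketch. The Python below is a toy bigram model that predicts each next word from counts in a three-sentence corpus; it is an assumption-laden illustration of next-word prediction in general, not how GPT-3 is built (GPT-3 is a transformer trained on vastly more text). The corpus and function names are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(text: str):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, prompt: str, max_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a follower
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical toy corpus; periods are treated as ordinary tokens.
corpus = "the model writes text . the model writes text . the user edits code ."
counts = train_bigrams(corpus)
suggestion = complete(counts, "the", max_words=3)
print(suggestion)
```

The same loop of "look at what came before, pick a likely continuation, repeat" underlies large generative models; the difference lies in how much context they consider and how the probabilities are learned.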

This technology will transform writing from an act of solo creation to a collaboration between human and machine: one in which the human provides some initial language, the AI suggests edits or follow-up sentences, the human iterates based on the AI’s feedback, and so forth. The skillset required for good writing may accordingly expand to include an understanding of how to get the most out of the AI—how to best guide and coax it into producing the desired language.

This novel paradigm for AI-augmented writing is already starting to become a reality, driven forward by a handful of interesting startups.

The most established player in this category is Grammarly. Founded in 2009, Grammarly has admirably remained abreast of the latest NLP technologies over the years. The company raised funding late last year at a whopping $13 billion valuation. Grammarly’s product provides automated recommendations for improved spelling, grammar, diction and phrasing in real-time as users write.

Textio, LitLingo, and Writer are three newer entrants using next-generation language AI to build advanced Grammarly-like solutions for more targeted use cases. Textio focuses on hiring and recruiting, LitLingo on business compliance and risk management, and Writer on company-wide style and brand consistency.

Trained on millions of writing samples, Textio’s AI can give users nuanced insights about their job postings and other hiring-related content: for instance, that a certain phrase will resonate more with male than with female candidates, that a given word suggests a fixed mindset over a growth mindset, that a particular metaphor may come across as exclusionary to applicants. LitLingo, meanwhile, uses real-time NLP to monitor employees’ digital messages and proactively prevent communications that could trigger litigation or unwanted public attention—say, related to antitrust, workplace discrimination, securities violations or employment law.

All four of the companies mentioned so far use AI primarily to provide recommendations and insights on existing text that humans have already written. Today’s NLP, though, allows us to go one step further. The next frontier in AI-augmented writing will be for the AI to generate novel written content itself based on guidance from the human user.

CopyAI is a Tennessee-based startup backed by Sequoia, Tiger Global and Wing VC that auto-generates customized marketing copy. The way it works is simple. Users enter basic information about their company and select a content format: say, a blog title, a website blurb, a Facebook ad, even an Instagram hashtag. CopyAI’s NLP engine, which is powered by GPT-3, then spits out ten samples of text at a time for the user to use, adapt, or take inspiration from. According to the company, over half a million content marketers are using its technology today, including at organizations like Nestle and Microsoft.

To temper expectations, we should not expect that today’s NLP will immediately take over all writing from humans. Some forms of writing—brief formulaic content like marketing copy or social media posts—will yield more naturally to these new AI tools than will others. Original, analytical, creative work—say, op-eds, thought pieces or investigative journalism—will resist automation for the time being.

But make no mistake: in the years ahead, whether we like it or not, NLP will fundamentally change how humans produce the written word. Ten years from now, writing one’s own content from scratch may well be considered an artisanal craft, with the vast majority of the world’s written text produced or at least augmented by AI.

Language Translation

Language barriers are a fundamental impediment to international business and travel, costing untold billions in lost productivity every year.

More profoundly, the inability of people around the world to understand one another inhibits the advancement of grand global goals and species-level harmony. But in a polyglot world like ours (over 7,000 languages are spoken in the world today), language barriers have always been an unavoidable reality.

The Babel fish from Douglas Adams’ science fiction classic The Hitchhiker’s Guide to the Galaxy—which goes in someone’s ear and automatically enables them to hear any spoken language in their native tongue—is an enchanting but purely fictional concept.

Until now.

Machine translation has been a central goal of artificial intelligence researchers dating back to the very beginnings of the field of AI in the 1950s. Automated language translation products have been available since the dawn of the commercial internet in the 1990s. Yet machine translation has proven to be a devilishly difficult challenge. AI-based translation tools have historically been deeply flawed (as anyone who remembers using AltaVista’s Babel Fish service in their younger days can attest).

But thanks to the remarkable advances underway in language AI, reliable and high-quality machine translation is fast becoming a reality.

The most widely used AI-powered language translation service in the world is Google Translate. Unsurprisingly, given that it is the birthplace of the transformer and the most advanced AI organization in the world, Google has incorporated the latest NLP technologies to vastly upgrade its Translate service in recent years.

But significant opportunities also exist for startups in the fast-changing world of language translation.

BLANC offers AI-powered translations for video. Its AI platform takes a video with spoken dialogue in one language and applies AI to quickly reproduce that video with the dialogue in another language, doing so in a way that the speakers’ lip movements continue to look natural. Think of it as sophisticated dubbing, except that it can be carried out automatically and at scale.

KUDO is a more established competitor that also offers video translation services. Today, KUDO’s platform relies on human interpreters to stream translations over the internet in real-time. But the company envisions a future in which its platform is increasingly powered by AI. In this sense KUDO represents an interesting archetype: a mature non-AI-first business looking to inject more AI into its product offering by leveraging its massively valuable proprietary datasets.

Lilt is a notable growth-stage player working on machine translation. The company was founded by two NLP researchers at Google Translate who came to appreciate that an AI solution like Google Translate could not, on its own, be relied upon to deliver automated language translation with the robustness demanded by enterprise and government organizations.

Thus, Lilt offers a hybrid model that combines cutting-edge AI with “humans in the loop” to translate written content for global organizations, from marketing to mobile apps to technical documentation. This partially automated approach enables Lilt to provide translation that is cheaper than using human translators and at the same time more accurate than using AI alone.
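One common way to structure a "humans in the loop" pipeline is confidence-based routing: machine output that the model is confident about ships automatically, and the rest is queued for a human. The Python sketch below illustrates that general pattern only. It is not a description of Lilt's system, and the `Draft` type, the confidence scores, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    source: str
    translation: str
    confidence: float  # assumed model-reported score between 0 and 1

def route(drafts, threshold: float = 0.9):
    """Split drafts into auto-approved and human-review queues."""
    auto_approved, needs_review = [], []
    for draft in drafts:
        if draft.confidence >= threshold:
            auto_approved.append(draft)
        else:
            needs_review.append(draft)
    return auto_approved, needs_review

# Hypothetical machine-translated segments with made-up scores.
drafts = [
    Draft("Add to cart", "Ajouter au panier", 0.97),
    Draft("Terms are subject to change", "Les conditions peuvent changer", 0.72),
]

auto_approved, needs_review = route(drafts)
print(len(auto_approved), "auto-approved;", len(needs_review), "sent to a human")
```

Raising the threshold sends more segments to human translators, which raises cost and accuracy together; lowering it does the opposite. The question of phasing humans out amounts to asking how low that threshold can safely go.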

The interesting question—for Lilt and for the entire industry—is whether and how quickly the humans in the loop can be phased out in the years ahead.

One last startup worth mentioning in this category is NeuralSpace. NeuralSpace was founded on a simple but powerful insight: the vast majority of cutting-edge research in NLP is conducted in English, yet 95% of the world does not speak English. NeuralSpace provides a no-code NLP platform that enables users around the world to build NLP models in “low-resource languages”, from Armenian to Punjabi to Zulu.

“Our vision at NeuralSpace is to break down the language barrier in AI for millions of low-resource language speakers,” said NeuralSpace cofounder/CEO Felix Laumann. “We give software developers the ability to train and deploy state-of-the-art large transformer-based language models and easily integrate them into their products, no matter where in the world they are or what language their audience speaks.”

Sales Intelligence

Sales is more of an art than a science. Yet certain repeatable principles and tactics do exist that, if systematized, can meaningfully improve a sales team’s performance.

Is a rep spending the right amount of time on the right topics in sales calls, from product to pricing to small talk? Is she letting the customer ask enough questions? Has she engaged the right senior stakeholders at the customer organization at the right times over the course of the sales process? Is she following up with prospects on the right cadence?

By ingesting vast troves of unstructured data from video calls, phone calls, email exchanges, CRMs and other communication channels, today’s language AI can extract actionable insights about how salespeople are performing and what they can do to improve.
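As a rough illustration of the kind of signal such tools can extract, the Python sketch below computes two simple metrics from a call transcript: each speaker's share of the words spoken and the number of questions each speaker asked. The transcript, speaker labels, and function name are hypothetical, and commercial products rely on far richer models than word and question-mark counts.

```python
from collections import Counter

def call_metrics(transcript):
    """Return (talk share per speaker, question count per speaker)."""
    words = Counter()
    questions = Counter()
    for speaker, utterance in transcript:
        words[speaker] += len(utterance.split())
        questions[speaker] += utterance.count("?")
    total = sum(words.values())
    talk_share = {s: n / total for s, n in words.items()} if total else {}
    return talk_share, dict(questions)

# Hypothetical transcript as (speaker, utterance) pairs.
transcript = [
    ("rep", "Thanks for joining. Let me walk you through pricing."),
    ("customer", "Sure. Does it integrate with our CRM?"),
    ("rep", "Yes it does."),
    ("customer", "Great. What does onboarding look like?"),
]

talk_share, questions = call_metrics(transcript)
for speaker, share in talk_share.items():
    print(f"{speaker}: {share:.0%} of words, {questions[speaker]} question(s)")
```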

There are few applications of language AI that can more directly affect a company’s top line. Not surprisingly, therefore, the market for sales intelligence AI is booming.

The runaway leader in this category is Gong, which has raised close to $600 million in venture funding. According to the company, its technology boosts average revenue per sales rep by 27%, translating into massive ROI for its customers.

Gong’s closest competitor Chorus.ai exited to ZoomInfo last year in a $575 million sale, further solidifying Gong’s status as the category leader.

Gong is an impressive business, with incredible revenue growth and a long list of blue-chip customers. The company seems destined to debut on public markets before long. Yet by most accounts, the core NLP in Gong’s product offering is not particularly advanced.

This raises an interesting question: might an opportunity exist for an upstart to build a more cutting-edge version of Gong, powered by the latest transformer-based advances in language AI, and take market share from the category leader by offering a more intelligent product?

A handful of young startups have popped up that are nipping at Gong’s heels, though none have yet broken out.

Aircover, which raised a seed round last year, and Wingman, which came out of Y Combinator in 2019, are two examples. Unlike Gong, which provides analytics only after sales calls are finished, both of these startups provide real-time in-call coaching for sales reps. And while Gong has had major success selling to large enterprises, Wingman instead targets small- and medium-sized businesses.

Chatbot Tools and Infrastructure

We all experience it in our daily lives: when we communicate digitally with companies and brands these days—via text message, web chat, social media, and so forth—these interactions are increasingly fielded by automated agents rather than humans.

These AI-powered conversational interfaces are commonly known as chatbots—though some startups today prefer to avoid that terminology and its mixed connotations, given a premature hype cycle for chatbot technology about five years ago.

Notwithstanding earlier false starts, chatbots today have begun to gain real market adoption, thanks to improvements in the underlying NLP as well as in companies’ understanding of how to best productize and deploy these bots.

Companies are now using chatbots to engage with customers in real-time wherever those customer interactions occur—for instance, fielding questions on their websites, automating routine customer support requests, giving customers updates on their orders, or supporting sales efforts.

Most organizations interested in using conversational AI interfaces to interact with their customers—say, a bank, a hotel chain, an airline—lack the requisite technical resources to navigate the latest NLP technologies and build their own chatbot platforms from scratch.

And a lot goes into building an enterprise-grade conversational AI interface: handling data privacy and security requirements, integrating with third-party applications, building the infrastructure to support deployment at scale, providing a graceful “fallback mechanism” when the bot is stumped and human intervention is necessary.
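The fallback mechanism mentioned above can be sketched in a few lines of Python. This is a toy under stated assumptions: the intents, keywords, canned responses, and the 0.25 threshold are all hypothetical, and a production system would use a trained intent classifier rather than keyword overlap.

```python
# Hypothetical intents and the keywords that signal them.
INTENT_KEYWORDS = {
    "order_status": {"order", "shipping", "delivery", "track"},
    "refund": {"refund", "return", "money"},
}

# Hypothetical canned responses, one per intent.
RESPONSES = {
    "order_status": "Your order is on its way.",
    "refund": "I can start a refund for you.",
}

def classify(message: str):
    """Toy scorer: fraction of an intent's keywords found in the message."""
    words = set(message.lower().replace("?", "").split())
    best_intent, best_score = None, 0.0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords) / len(keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent, best_score

def reply(message: str, threshold: float = 0.25):
    """Answer automatically when confident; otherwise hand off to a human."""
    intent, confidence = classify(message)
    if intent is None or confidence < threshold:
        return "handoff", "Let me connect you with a human agent."
    return "bot", RESPONSES[intent]

print(reply("Where is my order?"))
print(reply("I have a question about my lease"))
```

The first message matches a known intent and is answered by the bot; the second matches nothing, so the conversation is routed to a person rather than answered badly.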

A promising group of startups has emerged to provide the technology and infrastructure for companies across industries to create and operationalize chatbots.

The best-funded of these competitors is Ada Support, a Toronto-based startup that has raised close to $200 million from blue-chip venture capitalists. Ada powers automated interactions for enterprises in customer support and sales across text-based channels including web chat, SMS, and social media, intelligently looping in a human agent when needed. The company claims its technology can reduce customer wait times by 98%. With a long list of marquee clients including Zoom, Shopify, Verizon and Facebook, Ada powers over one billion customer interactions annually.

Another leading player in this category is Rasa. A close Ada competitor, Rasa caters to more technically savvy users, with a greater focus on chatbot configurability. Rasa’s AI stack is open-sourced, with over 600 contributors and over 10 million downloads. This open-source strategy gives Rasa’s customers greater transparency and control over the conversational AI interfaces that they build and deploy.

Other noteworthy startups in this space include Forethought, a well-capitalized competitor that boasts NLP luminary Chris Manning as an adviser; Clinc, a conversational AI platform built specifically for banks; and Thankful, which focuses on e-commerce.
