An innovative deep learning model empirically demonstrates the visionary advantage that small firms have over their larger competitors

Vision is a word that can, with equal ease, generate hours of discussion in C-suites and academic departments. Executives worry about finding it. Academics labor to define it. Consultants claim to have it. Yet to date, finding a clear definition of the term that can be used consistently in empirical research has been an elusive goal. For some researchers, vision is being “able to see the future.” For others, it is understanding what products and services future markets will value. For yet another group, vision is about perceiving currents of change that lie beneath the surface of markets and even society. Vision is so elusive a concept to define that in business writing its meaning is often either taken as self-evident or redefined for the author’s purposes. 

Perhaps because of the definitional challenges noted above, research on which kinds of leaders and companies are “more visionary” has also produced mixed results. Some researchers believe that the most visionary ideas emerge from big companies with large teams and the financial resources needed to run large research and development (R&D) efforts. Others believe that small companies unbeholden to legacy thinking are best able to innovate new products and services.

Given the complex history of this term, it is refreshing to see recent research from Paul Vicinanza (Stanford), Amir Goldberg (Stanford), and Sameer B. Srivastava (Berkeley), who not only present a novel conceptualization of what vision is but also put it to an innovative test of validity.

This team breaks from the typical research trajectory on this topic in two fundamental ways. First, they define vision not as technical innovation per se. Rather, vision “inheres in the difficult-to-observe ideas that originate within groups and that fundamentally restructure how a field will operate in the future.” In other words, vision is the ability to “rethink the contextual assumptions that predominate a given field.” By contextual assumptions, they mean the beliefs that are central to the work a company does and therefore guide its choices.  

Put simply, a company is visionary when it challenges the way the world works in the present, and its new ideas become the way the world works in the future. This outcome can occur for several reasons, from seeing sudden environmental shifts that others do not to simply being better at convincing customers and even competitors to follow its lead.

With this novel definition in mind, the authors make their second break from past research. While other studies typically focus on the tangible outputs of innovation—e.g., patents and products—they chose to focus on one important intangible input of innovation: language.

The Study

Patents and products have been the focus of innovation research for decades because they are (generally) straightforward to recognize and measure. Indeed, I recently featured a good example of that approach in a study of science-related patents. Using language for the same purpose is another matter, of course, and it presents a different set of challenges. For their research, the authors adopt a recent Google innovation known as Bidirectional Encoder Representations from Transformers, better known as “BERT.”

BERT is built on what are known as transformer models, which process each word in relation to all the other words in a sentence rather than one by one in order. BERT models can better understand the full context of a word because they look at the text that both precedes and follows it. For example, Figure 1 below illustrates the non-BERT and BERT results for someone querying “2019 brazil traveler to usa need a visa.” Before BERT, Google would have treated the query as coming from a U.S. citizen going to Brazil, when the opposite is the case.

Figure 1: Google search results pre and post-BERT (Source: Google)
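To make the idea of two-sided context concrete, here is a toy sketch in Python. It is not BERT and has nothing to do with how Google ranks results; the function name visible_context is hypothetical and exists only for this illustration. It simply lists which words of the example query a model could draw on when interpreting the word “to,” first reading left to right only and then reading in both directions.

```python
def visible_context(tokens, index, bidirectional):
    """Return the (left, right) words available when interpreting tokens[index].

    Hypothetical helper for illustration only: a left-to-right reader sees
    just the preceding words, while a bidirectional reader sees both sides.
    """
    left = tokens[:index]
    right = tokens[index + 1:] if bidirectional else []
    return left, right


query = "2019 brazil traveler to usa need a visa".split()
target = query.index("to")  # the small word that determines the direction of travel

left_only = visible_context(query, target, bidirectional=False)
both_sides = visible_context(query, target, bidirectional=True)

print("left-to-right:", left_only)
print("bidirectional:", both_sides)
```

Only the bidirectional reader sees “usa need a visa” alongside “brazil traveler,” which is the information needed to tell who is traveling where.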

For the BERT models used in their study, the authors defined two key metrics. The first is perplexity, which measures how unexpected a specific word is in a passage. The more likely the word is in its context, the lower the perplexity score. The second, higher-order metric is prescience, which refers to the difference between a term’s perplexity score at the time it is used and its score at some point in the future. By way of illustration, consider a company report from the 1950s that used the term “mobile telephone” in describing a small device that would allow someone to make calls on the go. The term mobile telephone would have a high perplexity score, of course, since the device did not yet exist. The text in the report would have a high prescience rating as well because it was using a term (crucially, in the right context) that would become common in the future, long before everyone else was doing so. Thus, the authors’ basic premise is that people who correctly use language today to describe what is to become common in the future are the true visionaries.
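The relationship between the two metrics can be sketched in a few lines of Python. This is a simplified illustration, not the authors’ code. It assumes perplexity is the exponential of the average negative log-probability a language model assigns to the words, and it assumes prescience is the relative drop in perplexity between a model trained on present-day text and one trained on later text; the paper’s exact formula may differ. The probabilities are invented for the “mobile telephone” example, and the function names are hypothetical.

```python
import math


def perplexity(token_probs):
    """Exponential of the average negative log-probability of the tokens."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def prescience(probs_now, probs_future):
    """Relative drop in perplexity from the present-day model to the future model.

    Assumed form for illustration; positive values mean the text reads as
    more ordinary to the future model than to the present-day one.
    """
    now = perplexity(probs_now)
    future = perplexity(probs_future)
    return (now - future) / now


# Invented probabilities for the two words of "mobile telephone":
probs_1950s = [0.01, 0.01]  # very unlikely according to a 1950s-era model
probs_later = [0.10, 0.10]  # far more likely according to a later model

print(round(perplexity(probs_1950s), 2))
print(round(perplexity(probs_later), 2))
print(round(prescience(probs_1950s, probs_later), 2))
```

With these made-up numbers, perplexity falls from about 100 to about 10, so the phrase scores a prescience of about 0.9. A phrase that is equally surprising to both models would score zero.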

As illustrated in Figure 2 below, the authors applied their BERT models to over 100,000 quarterly earnings calls (QECs) of publicly traded firms. On these calls, senior executives comment on their performance and strategy to financial analysts, who then make recommendations on whether investors should buy, sell, or hold the company’s shares. The authors opted to begin their analyses in 2011 to “circumvent the potentially confounding effects of the 2008 financial crisis,” and they fine-tuned BERT for each year between 2011 and 2016.

Figure 2: Example prescience calculation for a highly prescient sentence from a sample quarterly earnings call (Source: Authors)

Indeed, as they were validating the BERT models across all the QECs, the authors found that among the least-novel terms were CD, disk, and PC, as well as words connected to major events in the news, such as stimulus, tsunami, Greece, deficit, and Iraq. Interestingly, the most prescient word in the analysis was onboard. Originally used in the context of adding new employees, in the early 2000s onboarding started to refer to users and customers, thus signifying the arrival of the Software-as-a-Service business model that would become common a decade later. Another example from the same period is the term compassionate use pre-approval, which is the precursor of the term COVID-19 made famous: emergency use authorization.


The first finding confirms the belief that breakthroughs come from the edge of business. The authors found that smaller firms, and those outside the mainstream, tend to be the most visionary. Indeed, a 1 standard deviation (SD) decrease in firm size corresponds with a 0.27 SD increase in the use of visionary language.

Perhaps as expected, the authors find the most visionary firms in knowledge-intensive sectors such as medical devices (e.g., SurModics), information technology (e.g., Openwave Systems), clean energy (e.g., Xcel Energy), fiber optics (e.g., EMCORE), and pharmaceutical drug development (e.g., Neurocrine Biosciences). 

In contrast to the visionaries, the authors note, the least prescient firm, “Oil-Dri Corporation, has been in existence for over 75 years and makes products from sorbent minerals (e.g., cat litter).” Other low prescience firms include “beverage packaging (Crown Holdings) and energy firms (CONSOL Energy) that were founded in the 19th century, as well as Yingli Green Energy that announced in 2018 that it would be delisted from the New York Stock Exchange.” 

Perhaps the most surprising ranking in the study is the one given to Netflix, and the authors consider it illustrative of the insight generated by the approach: 

In 2011, Netflix announced it would split its mail-order DVD service into a separate entity. Seen as backward-looking by investors, the move resulted in an almost 80% drop in Netflix’s stock price, leading it to backtrack. Though two years later Netflix would pioneer an online content production model, in 2011 it struggled to break from its past and develop a coherent strategy.

With scoring complete, the authors further validated the idea that prescient firms yield better financial returns, especially for firms with very high prescience scores. The team notes that firms in the top 5 percent have exceptionally high market returns: “50 percentage point higher stock returns than average just 3 years after 2011.” Moreover, a “1 SD increase in prescience corresponds to approximately a 5% increase in the likelihood a firm experiences above-average growth.”

The final analysis concerns prescience in specific industries. In this analysis, the authors find wide variations. For example, firms in mining, utilities, and construction are least prescient, while those in the professional services sector (e.g., finance and insurance, information technology, management consulting) are most prescient. As noted above, “prescient firms appear to be concentrated in knowledge-intensive industries where the traditional artifacts of innovation (e.g., patents) are not consistently relevant.” Interestingly, the authors find no correlation between prescience and past performance. In other words, “highly prescient firms do not: spend more on R&D, disproportionately come from high-tech industries, patent at higher rates, or produce higher impact or disruptive patents.” Indeed, given the disconnect between prescient firms and outputs such as patents, the authors suggest that measuring prescience in earnings calls may be the first way to quantify the non-technical side of business innovation.


We know that visionary innovations come from both large and small companies. This interesting study, however, reaffirms that smaller, less influential firms are consistently the most visionary. Moreover, scale and past success do not really predict which companies have the best sense of what is to come. As the authors conclude, “consistent with popular intuitions, and contrary to theories of incumbent advantage, we demonstrate that smaller and less established firms are more likely to be visionaries than their larger and more entrenched peers.”

In preparing this post, I had a chance to discuss the authors’ approach with a scientific researcher, who noted that the concept of prescience would be instantly recognizable in medical research. In that world, it is quite common that the person who conceives a visionary breakthrough is not the one who eventually puts it into use clinically. Indeed, many technical fields, such as physics and mathematics, often separate the measurement of the impact of theoretical vision from success in field applications. In contrast, business innovation research has tended to focus only on the “clinical” side of the equation, hindering its impact and reach. It is a welcome development to see a team attempting this new, more nuanced approach to such an important subject.

In closing, the authors speculate that their new way of thinking about what signals a visionary thinker could be extended to other areas, from “identifying political vision in speeches and debates to tracing visionary arguments in legal proceedings to detecting visionary knowledge in scholarly publications.” This is a challenge other researchers should pursue. For such common terms, vision and visionaries are not well understood. It would be useful for researchers and business leaders alike to broaden our discussion of these terms to a wider understanding of the (sometimes unknown) leaders and companies whose ideas and language “ultimately change the world.”

The Research

Paul Vicinanza, Amir Goldberg, and Sameer B. Srivastava. “Who Sees the Future? A Deep Learning Language Model Demonstrates the Vision Advantage of Being Small.” Stanford Graduate School of Business, Working Paper No. 3869, May 2020.

Posted by: Carlos Alvarenga