Embracing Risk

This article explores the importance of strategic risk-taking in turning publicly funded research into economic impact. It makes an evidence-based case for the "invest-in-everything" model, grounded in the idea that Canada’s underperformance is the result of a self-reinforcing feedback loop caused by risk aversion, and that the only way to address Canada’s productivity challenge is to embrace risk as a key element of strategy, rather than something to be minimized.

The value of emerging technologies follows a power law distribution in which a tiny minority of technologies deliver the vast majority of value. Successfully investing in assets that follow this kind of distribution requires a deep understanding of the dynamics at play. Venture capitalists (VCs) learned how to navigate power law dynamics a long time ago, and the basic structure of the approach persists to this day. The model has its origins in early 19th-century whaling expeditions. Agents (analogous to modern VCs) would raise money from corporations and high net worth individuals (LPs) to fund ships’ captains (founders) to go whale hunting (ventures). Even the distribution of payout (the 2 and 20 rule) arose during this time. As with modern investments in emerging technologies, the vast majority of whaling expeditions ended in failure, while the top 2% returned so much value that they more than paid for the rest. Without any real way to predict which ones would return profits when they set sail, the only winning move was to fund many expeditions, enough to ensure (statistically speaking) that at least one of the investments would be in that 2%. The nature of the startups in which we invest has changed since then, but the underlying math has not.

Power law math

Simulation modelling of power law dynamics has been done well by a number of people: Matt Lerner over at Medium developed a very simple yet enlightening toy model that simulates random investment with power-law distributed returns, while Mike Arpaia from Moonfire developed a more sophisticated model as a means to test variations on early-stage investment theses.

In a nutshell, Matt’s Blind Squirrel model does a Monte Carlo simulation of portfolio values over random investments made in prospects with a power law distribution of returns. Mike Arpaia builds a framework that allows comparison of different fund structures and allocations, but the core of the model is the same: an underlying distribution of investment outcomes is sampled according to a set of rules that collectively define an investment thesis, and the hypothetical returns are compared.

In both cases, it quickly becomes clear that even if all you do is blindly throw darts, you’re all but guaranteed a positive return. The only requirements are that you must build a sufficiently large portfolio to have a reasonable certainty of hitting a few top performers, and have the patience to see it through to the end. The Blind Squirrel approach achieves this simply by making a lot of bets and letting the law of averages sort it out. The Moonfire model advocates deploying most of your capital in the first stage (with minimal reserve for follow-on rounds) for the simple reason that this approach allows more bets to be made.

While the Blind Squirrel approach underperforms average VC performance, it is still net positive. The implication: you do not need to be able to pick winners to have reasonable certainty of return when playing a power law, as long as you make enough bets to cut through the noise.

The numbers-game nature of early-stage investment comes across most clearly when considering the impact of the top performer in a given portfolio. In the Blind Squirrel model, if you remove the top company from each portfolio, the returns drop dramatically, with the top company providing as much as 50% of the profit for small portfolios. This dependence on the top performer gets less and less important as the number of investments made by your fund increases.
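The dependence on the top performer can be illustrated with a minimal sketch. It is not a reproduction of the Blind Squirrel model: the Pareto distribution, the tail exponent `alpha = 1.2`, and the function name `median_top_share` are all assumptions chosen for illustration, and the sketch measures the top company's share of total returns rather than of profit.

```python
import random
import statistics

def median_top_share(portfolio_size, trials=300, alpha=1.2, seed=0):
    """Median share of a portfolio's total return contributed by its best company.

    Returns are drawn from a Pareto distribution with an assumed tail
    exponent `alpha`; every company receives an equal stake.
    """
    rng = random.Random(seed)
    shares = []
    for _ in range(trials):
        returns = [rng.paretovariate(alpha) for _ in range(portfolio_size)]
        shares.append(max(returns) / sum(returns))
    return statistics.median(shares)

# The top company's share shrinks as the portfolio grows.
for size in (10, 100, 1000):
    print(size, round(median_top_share(size), 2))
```

Under these assumptions the single best company carries a large part of a ten-company portfolio and a much smaller part of a thousand-company one, which is the qualitative pattern described above.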

In other words, if you are running a small fund and you pass on the one big opportunity, it’s the difference between wild success and complete failure. As a Blind Squirrel, the only winning move is to invest in a much larger pot of companies, and let statistics take care of picking winners. This is not a particularly controversial or surprising conclusion. This is basically just restating the hypothesis that portfolio diversification is important. Anyone investing in a broad market ETF is effectively a Blind Squirrel investor, albeit using a different asset class.

When considering emerging technologies, the power law gets even more extreme, the timelines get extended, and it gets even harder to pick winners given that there is an entirely new class of risk involved. Not only do we need a team that can deliver and a willing market to buy, in many cases we also need the laws of physics to cooperate with the research efforts. With the added filter, it takes a lot more sampling to cut through the noise. If the top fraction f of companies or IP portfolios are responsible for most of the value creation, a Blind Squirrel needs to build a portfolio of 5/f companies to ensure a 99% chance of hitting at least one big winner (that’s 250 investments, assuming 2% of companies are the profit-creating outliers). The earlier an investment is made, the smaller f becomes.
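The 5/f rule of thumb can be checked directly. Assuming each investment independently has probability f of being a big winner, the chance of holding at least one winner in n investments is 1 - (1 - f)^n, which for n = 5/f is roughly 1 - e^-5, or a little over 99%. The sketch below uses hypothetical function names and that independence assumption.

```python
import math

def hit_probability(n: int, f: float) -> float:
    """Chance that at least one of n independent picks is in the top fraction f."""
    return 1.0 - (1.0 - f) ** n

def portfolio_size_for(confidence: float, f: float) -> int:
    """Smallest n for which hit_probability(n, f) reaches `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - f))

f = 0.02  # 2% of companies are the profit-creating outliers
print(hit_probability(250, f))       # the 5/f rule of thumb: a little over 0.99
print(portfolio_size_for(0.99, f))   # exact threshold under independence
```

With f = 2%, the exact threshold for a 99% chance comes out slightly below 250, so 5/f is a mildly conservative round number.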

The conclusion is simple: successful incubation of emerging technologies requires a support structure that is both willing and able to make a very large number of bets, understanding and accepting that the majority of them will fail. All that matters from a profit perspective is aggregate performance, which is dictated almost entirely by a small minority of investments. In many ways optimal investment strategy resembles a midwit meme, with VC as the midwit picking winners while both very early and very late stage investors are best served by just buying (almost) everything.

Under-sampling creates an illusion of relative underperformance

Canadian startups on average raise less money and return smaller multiples for investors than American startups. The natural but probably wrong conclusion is that Canadian startups are less capable, in some sense, than their American counterparts.

The problem with the simplistic conclusion relates to sample size. Given that there are fewer Canadian startups than there are American ones, and given further that startup performance is dominated by outliers, apparent Canadian startup underperformance relative to American startups may simply be a sample size effect. The more you sample, the higher the chance that you will hit an outlier.

In other words, it can be true both that American startups and Canadian startups are equally performant (in the sense of being drawn from the same underlying value distribution), and that a Canadian startup will probably never be top-ranked by valuation at any given time when considering both ecosystems together. The resolution of the apparent contradiction lies in the fact that there are just more American startups than there are Canadian ones.
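A small simulation makes the sample size effect concrete. If two ecosystems draw from the same continuous distribution, the single most valuable company belongs to the smaller ecosystem with probability n_small / (n_small + n_large), purely by symmetry. The sketch below checks this numerically; the ecosystem sizes (20 and 200), the Pareto exponent, and the function name are illustrative assumptions, not measured data.

```python
import random

def small_ecosystem_top_rate(n_small=20, n_large=200, trials=2000,
                             alpha=1.2, seed=0):
    """Fraction of trials in which the overall top company is in the small ecosystem.

    Both ecosystems draw company values from the SAME Pareto distribution
    (assumed tail exponent `alpha`); only the number of companies differs.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        best_small = max(rng.paretovariate(alpha) for _ in range(n_small))
        best_large = max(rng.paretovariate(alpha) for _ in range(n_large))
        wins += best_small > best_large
    return wins / trials

print(small_ecosystem_top_rate())   # simulated rate
print(20 / (20 + 200))              # symmetry argument: about 0.09
```

In this toy setup the smaller ecosystem produces the top-ranked company only about one time in eleven, even though its companies are statistically identical to those in the larger one.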

No comparison that we have ever seen of the US and Canadian ecosystems has attempted to take this into account. Various explanations of this have floated around for a while, but we have not yet come across any attempt to rule out the most obvious issue: that the sample sizes are vastly different.

This is a real problem, because perceived under-performance is a self-fulfilling prophecy. Investors considering allocating their assets care only about measurable performance, meaning that an investor with the option to invest on either side of the border, all else being equal, will invest preferentially in American companies. This increases the availability of capital and increases the chances of success in that ecosystem - effects which will drive real differences in the underlying distributions. In other words, under-sampling creates a self-reinforcing feedback loop in which the expectation of performance differences makes them real. This prophecy is played out in several ways, including difficulty raising money in Canada, a shallower and more risk-averse capital pool, and a tendency for highly successful Canadian founders to have companies that are based in the US.

Under-sampling power laws causes real underperformance

When considering return on investment for startups there is another, much less intuitive complication not present in the case of chess ratings that needs to be considered, which arises from the asymmetric nature of the power law. Most of our intuition about how statistics work is based on the normal distribution, which is symmetric, but the power law that dictates startup value is about as asymmetric as it is possible for a distribution to be. This matters, because when we draw a sample population from a symmetric distribution, the distribution of the population average is itself symmetric, and its median and mean are therefore independent of sample size. Sampling more will get you closer to the true mean, but the error is not directionally biased. We tend to implicitly assume that averages are independent of population size, while variability goes down as it gets larger, but this is only true for symmetric distributions (or for very large sample sizes).

For power laws and other heavy-tail distributions with smaller sample sizes, or any situation in which it is the outliers that we care about, both the sample average and sample standard deviations are strongly sample-size dependent until the sample size is very large, and smaller populations actually have a distribution of population averages that is skewed toward smaller numbers. The smaller your sample size, the worse the average performance of that population, and vice versa.

(For the stats nerds out there: the central limit theorem guarantees that for a sufficiently large sample size, the distribution of the average will converge to a symmetric Gaussian distribution, but different underlying distributions dictate very different values of “sufficiently”, which matters here.)
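The contrast between symmetric and heavy-tailed distributions can be sketched numerically. The example compares a normal distribution and a Pareto distribution that share the same true mean of 3; the exponent `ALPHA = 1.5`, the standard deviation, and the helper name are assumptions picked for illustration only.

```python
import random
import statistics

def median_of_sample_means(draw, sample_size, trials=600):
    """Median, over many repeated samples, of the sample average."""
    means = [
        sum(draw() for _ in range(sample_size)) / sample_size
        for _ in range(trials)
    ]
    return statistics.median(means)

rng = random.Random(0)
ALPHA = 1.5  # assumed tail exponent; the true mean is ALPHA / (ALPHA - 1) = 3

def symmetric():
    return rng.gauss(3.0, 1.0)        # symmetric distribution, mean 3

def heavy_tail():
    return rng.paretovariate(ALPHA)   # power law, also mean 3

# Columns: sample size, median average (normal), median average (power law)
for n in (5, 50, 500):
    print(n,
          round(median_of_sample_means(symmetric, n), 2),
          round(median_of_sample_means(heavy_tail, n), 2))
```

For the normal draws the median sample average sits near 3 at every sample size, while for the power-law draws it starts well below 3 and creeps upward as the sample grows.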

Restating this in investment terms: if you are investing in power-law distributed assets like startups, up to a point, you can increase the median performance of your portfolio simply by investing the same total amount of money in a larger number of companies. The converse is also true: by making fewer investments, you are actually driving a real, absolute reduction in return on investment. This is not just an illusion of relative underperformance caused by under-sampling, it is real underperformance driven entirely by sample size.

The following plot shows the impact clearly. Using the same distribution parameters as presented in the Moonfire article for the multiplier on ROI, we draw 10,000 examples of populations of the size given on the x axis, and for each population size we plot the boxplot of the distribution of the sample average of these “portfolios” (you can interpret the y axis as the average multiplier returned by each company in a portfolio of N companies, with N given on the x axis, with the boxplots showing the quartiles of the resulting distribution of portfolio averages).

A value of 1 on the left y axis is the point at which an investment portfolio breaks even, on average, but the absolute numbers are not all that important (they are drawn from a distribution with realistic but arbitrary parameters that reflect the underlying distribution qualitatively, not necessarily quantitatively, and we are obviously ignoring the impact of time in this simplistic analysis). What is important is that the median of the distribution of portfolio averages (orange line) is an increasing function of portfolio size until we reach portfolio sizes that are orders of magnitude larger than typical VC portfolios.

The secondary y axis on the right shows the fraction of population samples that have at least one portfolio company that returns a multiple that exceeds 100, which we use as a proxy for the fraction of portfolios that hit at least one homerun. You can see that for small sample sizes you are all but guaranteed to miss, but as this probability goes up, the median performance approaches saturation, which is just another way of seeing that all that matters when making investments is not missing the homeruns.
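A rough sketch of the kind of simulation behind such a plot follows. It does not use the Moonfire parameters: the Pareto exponent `alpha = 1.2`, the `scale` of 0.25, the trial count, and the function name are placeholder assumptions, so the numbers it prints will differ from the figure. Only the 100x homerun threshold is taken from the text.

```python
import random
import statistics

def portfolio_stats(size, trials=500, alpha=1.2, scale=0.25,
                    homerun=100.0, seed=0):
    """Quartiles of the average multiple per company, plus the homerun hit rate.

    Each company's return multiple is scale * Pareto(alpha); both parameters
    are illustrative assumptions. A portfolio "hits" if any company returns
    more than `homerun` times the money invested.
    """
    rng = random.Random(seed)
    averages, hits = [], 0
    for _ in range(trials):
        multiples = [scale * rng.paretovariate(alpha) for _ in range(size)]
        averages.append(sum(multiples) / size)
        hits += max(multiples) > homerun
    q1, median, q3 = statistics.quantiles(averages, n=4)
    return q1, median, q3, hits / trials

# Columns: portfolio size, lower quartile, median, upper quartile, hit rate
for size in (10, 100, 1000):
    q1, median, q3, hit_rate = portfolio_stats(size)
    print(size, round(q1, 2), round(median, 2), round(q3, 2), round(hit_rate, 3))
```

Even with these placeholder parameters, the median portfolio average and the homerun hit rate both rise with portfolio size, mirroring the qualitative behaviour described for the plot.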

For this specific example distribution, a Blind Squirrel investor needs about 300 investments to be made before the sample size effect goes away and average portfolio company performance saturates. This is the core of why the Moonfire model found that best performance occurred when deploying all their capital in the first round - by considering portfolio sizes below this threshold, they are operating in a regime where extra portfolio performance could be extracted simply by making more investments.

With power laws, under-sampling causes real underperformance.

How does this get reconciled with traditional VC?

The obvious question, given the above, is “how can VCs still operate with portfolios of 20-50 companies?” For a given niche, and at a later stage, it is actually possible to pick winners to some extent. What this means in mathematical terms is that in some narrow circumstances, it is possible to change the underlying distribution to a power law where average performance saturates for smaller portfolio sizes (mostly this is about reducing the exponent in the power law). B2B SaaS, where VC has proven effective, is one such niche. Later stages of investment across a broader set of companies (series A and beyond, for example) also have distributions with reduced exponents. In these cases, VC can be successful because it is possible, to some degree, to pick winners and bias the pool of possible investments toward a more forgiving power law.

That being said, in most cases, VC performance would similarly be improved through larger portfolio sizes. The limitation is due diligence times - changing the exponent that drives the power law is a lot of work.

This generalizes poorly to making early investments in emerging technologies and deep tech, though, and this is the reason that traditional VC does poorly here relative to B2B SaaS. Picking winners (by which we mean reducing the power law exponent) is much more difficult in the earliest stages and in sectors for which the metrics used by traditional VCs to perform valuation break down. Here we are at the mercy of the underlying power law, but we can use our statistical understanding of the impact of sampling to correct for challenges with picking winners.

Picking Winners is a Losing Strategy

There is a fascinating piece of research that gets into the dynamics of the unavoidable tradeoff between false negatives and false positives in public spending at a cultural level. The authors explore cultural differences in consideration of “second-best fairness”, the idea that there is “a trade-off between giving some individuals more than they deserve, false positives, and others less than they deserve, false negatives.”

While they focus on welfare payments, the idea generalizes well to anything that involves spending of public funds. If we optimize our systems to reduce the possibility of making a payment in error, we will end up refusing payments in error at least some of the time (false negative), which can have severe consequences for the individuals involved, or be the reason a startup moves south. On the other hand, if we optimize to ensure that payments are made to those who need them, we will make some payments in error, whether honest mistake or the result of fraud (false positive). It is not possible to build a system that completely eliminates both of these, and the balance selected is an active policy choice.

The Canadian tax code is a good example of a system optimized to avoid false negatives. CRA assumes your return to be accurate, a few basic checks aside. There is some degree of error and fraud occurring at any given time as a result, but it is only worth correcting if the amount recovered by doing so exceeds the cost of recovery, which leaves some false positives as acceptable outcomes.

Where innovation policy is concerned, Canada optimizes heavily to avoid false positives. If you have engaged with almost any public funding for innovation, you find that these policy frameworks require that a success narrative be told for every project funded irrespective of their aggregate impact. Tom Goldsmith suggests that many policy decisions are made not to ensure maximal benefit, but to avoid blame for failures. No politician or public servant wants their name attached to a project that will not bear fruit in their term, since it becomes an easy target for opponents.

When the goal is to avoid the possibility of blame for failure at the level of individual investments, we lose sight of the broader context in which that investment occurs, a problem that is exacerbated by the lack of a unifying mission that characterizes the Canadian innovation policy space generally. Our innovation support programs are incentivized to avoid individual failures by only supporting low-risk projects, which in turn ensures low reward.

We try to pick winners, and we are not very good at it.

Many programs have minimum revenues or headcounts to be eligible, for example, the implicit assumption being that revenues (or jobs created, or T4s issued, or any number of other short-term measures of economic impact) represent de-risking of an idea. This is a problematic assumption when considering emerging technologies, which are often years of research away from any revenue and are often best advanced by small, agile teams. Where emerging technologies are concerned, the signals that our innovation support systems use to pick winners are uncorrelated with long-term impact until far too late in the game.

Power law math tells us that Blind Squirrels cannot afford risk aversion. The power law that dictates the value of emerging technology portfolios, combined with Canadian systemic public risk aversion, means that much of Canadian innovation policy is constructed to miss out on value creation arising from research. Canada’s Productivity Paradox is anything but paradoxical: it falls naturally out of even the simplest toy model we can conceive.

The models and discussion above make a key assumption: that our investment strategy does not alter the underlying distribution of investment returns. In reality, any program that invested in literally everything would immediately be subject to fraud that would drive the value of the underlying distribution to zero. Some degree of selectivity is required, but the math suggests that, at the earliest stage of technology development, the selection should not be much more stringent than filtering out obvious fraud, making sure the founder is both credible and coachable, and ensuring that the amount requested is aligned with the actual need.

Value is More than Profit

While the track record shows that on average VCs do outperform the Blind Squirrel investor, the effort required to do deep due diligence limits a typical fund to 20-50 investments, and having profit as the primary driver limits the timespan over which those investments can be held. Both of these limitations make traditional VC poorly suited to supporting emerging technologies, where timescales are extended and picking winners is all but impossible. It’s also a vicious cycle: the need to pick winners increases due diligence requirements, which in turn further limits the number of investments that can be made, further increasing the required success rate. It works for B2B SaaS, for which standardized playbooks have been developed to assist in the process, but it translates poorly to other sectors.

There are two sources of investment that have the ability to truly play the numbers game while being tolerant to long development timelines: the public sector (which, as discussed, lacks the required risk tolerance), and venture philanthropy.

To make early investment strategies compatible with securing socioeconomic impact from emerging technologies, we need to expand our definition of value creation to recognize that value is not synonymous with profit. If instead of pure profit-seeking we include in our definition things like economic development, security, and independence, retention of talent and IP, education of entrepreneurs, and progress toward ambitious societal goals like climate change targets, the Blind Squirrel approach becomes much more attractive. While such positive spillovers are of no use to a for-profit VC’s balance sheet, they are of clear value to the Canadian taxpayers and donors seeking to establish their legacy.

Unlike traditional VC, which is limited to just a few investments over a fixed timespan and for which all that matters is (some metric of) profit, public or philanthropic investment in emerging technologies can afford to place many bets with patient capital, all but guaranteeing that opportunities are not missed. In a model that seeks economic development and social impact over profit, it suffices to be self-sustaining, which is an easy target to hit given a sufficiently large portfolio.

Companies that a for-profit VC firm would write off as failed bets have value in this model. A failed entrepreneur is now someone with invaluable entrepreneurial experience who is better equipped to navigate the process on round two and is incentivized to stay in Canada to try again, knowing they will be supported. Companies that return 2-5X in the long run, while complete failures on a VC’s balance sheet, are all contributors to a resilient domestic economy. Exits to a foreign buyer need not be a detriment to Canada, but to secure benefit, companies must develop sufficiently domestically first, and the ecosystem must incentivize retention of the talent and capital that results. An acquisition can be the basis for a cascade of positive spillovers as entrepreneurs become mentors and investors in the next generation, as long as the talent and capital are retained in the ecosystem. The story of the transformative impact of the Skype acquisition on the Estonian ecosystem is a great example of this.

Effective Emerging Technology Development

There is a desperate need in Canada to embrace the idea that a high failure rate in supporting emerging technologies is perfectly acceptable, indeed unavoidable, so long as the aggregate, long-term impact of the whole portfolio is net positive, and to expand the definition of “impact” to include positive spillovers beyond direct return on investment.

Other ecosystems are way ahead of us.

DARPA had an 85-90% failure rate as of 1975, and the US government continued to fund and even expand the program. As of 2019, the SBIR (“America’s seed fund”) has failure rates that exceed 90% in some cases. Far from being embarrassments, these programs have existed for decades and are widely recognized as cornerstones of American technological dominance. Early-stage risk-taking is a common feature of many other ecosystems that innovate effectively, as well.

A more directly comparable example to Canada is France, where Les Deep Tech has set an ambitious goal of “500 startups created every year, 10 deeptech unicorns, 50 new industrial sites per annum” by 2030, and has $3B in public funding committed to achieving this. Some quick math shows that they are comfortable with targeting a 2% success rate (10 unicorns out of 500 startups per year). They are taking the Blind Squirrel approach of making thousands of small, low-overhead bets, understanding that positive spillovers will ensure net positive value creation.

To make commercialization of emerging technology possible and to secure value from Canada’s public investment in research, we first need to embrace risk and recognize that aggregate value creation is more important than individual project success or profit. The Canadian public sector must accept that a high failure rate is an essential and intentional feature of an effective innovation ecosystem and adopt a whole-of-government approach to enacting this in its innovation strategy.

For policy makers and those seeking to support emerging technologies, the important point is to recognize that underperformance of Canadian deep tech (either relative to the Americans or in absolute terms) is the result of a self-fulfilling prophecy that reinforces the systemic risk-aversion that leads to that underperformance in the first place. It is at least partially, and possibly entirely, because of risk-aversion-driven under-sampling that Canadian startups underperform. Breaking this cycle requires an active choice to take more risks, and the math tells us that we will be rewarded for doing so.

In other words, our innovation policy frameworks have cause and effect precisely backwards. Risk-intolerance is a direct cause of real underperformance, not the other way around. Canadian deep tech and emerging technology startup underperformance can be remedied simply by embracing risk and investing broadly, staying the course long enough to see the impact.

Where emerging technologies are concerned, if we acknowledge that there is not much we can do to predict technology portfolio value early, the only winning move is to take the Blind Squirrel approach and invest broadly. This is the core of why the SAIL Fund adopts a risk-tolerant, philanthropic approach to investment that combines public and private capital toward a common goal. Our approach can tolerate long timelines and high levels of failure, can consider the value of positive spillovers that are missed by traditional VC, and, most importantly, can place lots of bets.

Canada produces world-class research, yet many promising discoveries fail to reach the market due to a lack of early, risk-tolerant capital. The SAIL Fund addresses this gap by using venture philanthropy to deploy truly patient capital. We de-risk technologies aligned with national priorities, and support a new generation of entrepreneurial talent that can grow and remain in Canada.