Just thinking aloud here, rather than presenting a polished thought: Generally speaking, massive scaling is often connected to establishing a new product/category/industry.
That makes it dependent on discovering new problems in these emerging categories. To pick up on Roman’s iterative approach, it’s about discovering new bottlenecks and solving them one by one through the application of additional resources and focus.
For example, based on their volume, Netflix and PayPal both had more data available than their competitors, along with the resources to invest in these areas.
For one, this generated an additional moat in the form of customer insights that weren’t available to anyone else at that scale of pattern recognition, which allowed them to tweak their product closer to what the customer wants.
But the data also brought previously unknown problems to the surface. For example, PayPal understood that fraud was one of the critical bottlenecks that needed to be addressed, while everyone else still thought the problem was the interface/technology to send money digitally from email to email.
Once they declared fraud a major problem, they added focus and actively developed technology dedicated to just that problem.
Then, in addition to understanding the problem, being targeted by the fraudsters forced them to apply themselves even more, as it turned into a sink-or-swim scenario. This turned into a cold-war-like, nuclear-capabilities competition, with both sides heavily investing, iterating, and progressing at an accelerated pace. Due to the short feedback loop based on the volume, development cycles (or OODA loops) were shorter than for anybody else, resulting in most other competitors vanishing.
That makes scaling not only a function of efficiently administering more resources but of utilizing them in a way that builds additional moats. It’s ultimately scale/volume plus focus on the right problems that accelerated growth for these companies.
Good observation on reorienting on fraud. Certainly not every competitor was savvy enough to focus on fraud while growing; those that did cracked down so hard and so onerously that they crippled their growth.
Discussion for the Texas Instruments case goes below. What are the similarities or differences to the cases that came before?
Building off @roman’s thought on PayPal…
For Ford’s vision, scale was a feature
For PayPal’s plans, it was inevitable
For TI’s industry, it was a prerequisite.
But what makes sense for infrastructural manufacturing doesn’t always translate into scaling for mass consumer goods.
I have a TI Datamath calculator and a TI 99/4A computer - about which Bill Cosby famously joked that it’s easy to get people to buy a computer when you pay them $100 to do it. The situation with Commodore and TI reminds me of an anecdote from this book about two railroad builders who were competing over a route. One started subsidizing cattle transport to drive the other out of business; the other started investing in cattle and then transported them on his competitor’s line.
It strikes me that TI’s turmoil occurs at about the same time (just a few years before) Intel chose to drop their DRAM business.
Finally, it seems that TI’s approach subtly shifted over time. In the beginning, they recognized that scale was the only thing holding back certain types of development. (You can, I think, make a similar case for solar panels and lithium batteries over the past decade). But by the 70s they were looking on scale as a way to conquer markets, period.
Catching up on these - there are a few, so I’m going to try to be brief. Apologies if I repeat myself; I’m working through the ideas out loud.
This framing makes a lot of sense to me. When I read it, I was thinking of it in financial terms first:
For Ford, pricing was a byproduct of scale bringing unit economics down.
For TI, pricing was a way to build scale.
I’m going to assume (hopefully not incorrectly) that Ford never really underpriced themselves. They maintained the margin by using scale and operational efficiency. Meanwhile, TI/Chang underpriced to reach scale because their business could only exist at scale.
Chang calls it “learning curve pricing”, which is an important distinction from what TI had later, aka the “race to the bottom”. Learning curve pricing only works while there is a learning curve… when an industry matures and yield is high, there is no learning curve - it becomes a war of attrition over who can bring prices down. That tends not to be entirely within a company’s control: international competition can benefit from lower-cost labor, proximity to vendors/suppliers, government subsidies in key developing industries, or tariff imbalances.
“Learning curve pricing”/underpricing to build scale is also a pattern I think we see with modern venture-funded startups that pursue growth at all costs. I.e., food/grocery delivery and on-demand transportation are the poster children of what it looks like to try to build scale while also having competition flush with cheap cash, driving pricing down. It becomes a race to see who can survive longer, though instead of overseas competition, they compete on access to cheap capital. Many of these companies don’t have the same problem of a “learning curve”, but they do have similar problems vis-à-vis needing to build two-sided marketplaces large enough to support each other.
Within the TI case study itself, there’s another comparison/difference to draw out. As Peter pointed out, DTC sales and sales to enterprise buyers are really different beasts.
Enterprise sales tend to be made as bulk precommitments and are driven by a sales team. A single sale might be for a million-plus units, and an enterprise-focused company can thrive even with relatively few clients. (Arguably, enterprise sales is itself a form of scale.)
Individual DTC product sales are made per-unit and are mostly driven by a marketing team. Even if you wholesale to accounts, many have provisions where they can return unsold inventory. (Another way to think of it might be as supplier vs retailer.)
Chang moved from high-revenue bulk sales (chips) to low-revenue individual sales (watches). With chips, you could presell them in bulk at a lower cost, knowing that you have some lead time to iron out the kinks and make the low price point work (that’s learning curve pricing in a nutshell – the optimization catches up to the price; said another way, optimization follows price rather than price following optimization). The commitment reduces uncertainty and risk – you can forecast what pricing should be if yield were at X%. Whereas for consumer goods, you need to have the product to sell it. You enter a market with uncertainty as to current and future consumer preferences, and that uncertainty/risk grows as others bring products into the marketplace. If you’re competing on price, can you hold out long enough to put competitors out of business before consumer preferences change? I’d hate to be in that business.
Another thing I am noticing is that Ford and TI were individual product sales, while PayPal and Netflix were more akin to marketplaces (if you squint at it).
Ford and TI sold individual products and optimized for the production of those individual products. Their revenue model was directly related to the individual products they produced, and they were optimized for producing the same thing over and over. When consumer preferences changed, they had a hard time changing with them.
PayPal and Netflix tackled the whole marketplace problem. (Again, you have to kind of squint at PayPal’s underlying business.) PayPal wasn’t concerned with any individual fraud-fighting feature – they skim off the top of any and all transactions. Likewise, Netflix isn’t concerned with the relative success of any individual show/movie – they get paid for access to a variety of movies/shows. You can be into horror or sci-fi or K-dramas, you can watch one show or one hundred, and they’ll get the same value from you, and the same amortization of production across the whole consumer base. They were designed for producing lots of different one-off things rather than the same thing over and over. And whereas Ford/TI used scale to bring pricing down, PayPal/Netflix used scale to drive market resilience up.
Also, reflecting further on CFT - in addition to teasing an idea apart and finding differences/similarities between cases, I’m also noticing the inverse effect: as you look at different cases that might be unrelated, you can build up an ill-structured domain.
Particularly, I’m noticing the disintegration of a first-mover advantage across all of the cases (which extends to the next case as well).
 - Matt Stoller has a really in-the-weeds newsletter about monopoly power (https://mattstoller.substack.com/) and one of the recurring themes is using underpricing to drive competition out of business, then raising prices back up beyond their original levels. I’m starting to notice that same sort of behavior in VC-funded markets as well, especially as businesses are being pushed to show actual profit.
 - Or to use Andrew Chen’s framing, most network-effect businesses are like this. If your two-sided network has an imbalance (too much or too little supply, too much or too little demand) then the network can collapse on itself, and there’s a critical point to hit for the network to be useful.
 - As a personal aside, pricing psychology is super interesting. For example, at one of my businesses we could not compete on price, and the lower price positioned us for completely different buyers and market segments. We found that price-conscious buyers were often more critical and more selective, and were looking for different tradeoffs than what we were bringing to the market. By raising our prices (we currently sell at probably 500% more than what our former competitors charged), we actually entered into a new market with different consumers. Our sales grew because of the price increase and what consumers positioned us against – we didn’t change anything but the price.
The learning curve and economies of scale are two separate ideas. The learning curve sees the reduction of cost over time–as learning takes place. Economies of scale deals with cost as a function of scale for a given period.
So you’re getting at something really interesting @kharling. I think one common way of talking about concepts and reality is that reality is ‘continuous’ and concepts are ‘discrete’ — that is, we lift certain clumps up from continuous reality to an abstract level, so that we may discuss them with each other or manipulate them in books, etc.
And so one of the things that’s come up while writing this series of cases is ‘do we call out all the discretised concepts?’ For instance, we could have written a concluding piece explaining:
Scale economies deals with cost as a function of scale (as you’ve noted)
If we want to be pedantic, learning economies are separate from scale economies.
The Nike case study is actually an example of growth economies … which Wikipedia defines as ‘growth economies occur when a company acquires an advantage by increasing its size. These economies are due to the presence of some resource or competence that is not fully utilized, or to the existence of specific market positions that create a differential advantage in expanding the size of the firms.’
The reason I’ve refrained from going down this path is because I’m not sure that calling out exact definitions is that useful to the business practitioner.
(Feel free to convince me otherwise, by the way — I’m ambivalent about this).
What do I mean by this? Well, let’s take the learning economies as an example. To cite Wikipedia:
Learning and growth economies are at the base of dynamic economies of scale, associated with the process of growth of the scale dimension and not to the dimension of scale per se. Learning by doing implies improvements in the ability to perform and promotes the introduction of incremental innovations with a progressive lowering of average costs. Learning economies are directly proportional to the cumulative production.
That last sentence is key. Let’s say that a 5% reduction in costs occurs every 100k units produced. That means that in theory, anyone can drive their costs down so long as they cumulatively produce many 100k units. Of course the scale player will reach there first, but in theory everyone could do it!
But in practice, the scale player hits 100k units first, enjoys the 5% cost reduction, slashes its prices to screw with the non-scale players (while guaranteeing yet more scale for itself), and then continues to push its advantage (see: the Texas Instruments case).
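The milestone race above can be sketched quickly in code (all numbers hypothetical, just to illustrate how the same learning rule plays out differently for the scale player and the laggard):

```python
# Sketch (hypothetical numbers): two players under the same learning rule --
# a 5% unit-cost reduction for every 100k units of cumulative production.
# The scale player simply produces faster, so it reaches each cost
# milestone first and can price below the laggard's current cost.

def unit_cost(cumulative_units, base_cost=10.0, step=100_000, reduction=0.05):
    """Cost after applying a 5% reduction per completed 100k-unit step."""
    steps = cumulative_units // step
    return base_cost * (1 - reduction) ** steps

# Suppose after one year the scale player has shipped 1M units, the laggard 300k.
scale_cost = unit_cost(1_000_000)   # 10 steps of learning
laggard_cost = unit_cost(300_000)   # only 3 steps

print(round(scale_cost, 2), round(laggard_cost, 2))  # → 5.99 8.57
```

In theory the laggard can catch up by producing another 700k units; in practice the scale player is pricing near its own (lower) cost the whole time, which makes those 700k units very expensive to sell.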
Is there a modern instantiation of this? Yes — Samsung’s 5nm node process reportedly has a measly 35% yield (circa March this year), while the scale player — TSMC — has yields in the 80% range.
So can you really separate the learning economies effect from scale economies? I’m not so sure; I’d want to go hunting for real world cases where that is true.
Ok but here’s the interesting thing. The reason I brought up David MacIver’s “discretised concepts vs continuous reality” is because the Texas Instruments case is the first time learning economies were weaponised in this way. It is literally an example of someone innovating in continuous reality, before his hired consultants came in to concretise the concept for their own benefit.
While researching that case, @guanjief and I learnt that:
Henry Ford experienced learning curve effects intuitively. He never calls out the concept, but you see him living it by constantly slashing prices as a way to increase volume; he then forces his company to drive costs downwards. The interesting thing is that he doesn’t seem to realise there is a relationship between volume and learning opportunities. He just thought the two things were good ideas.
As such, Wright’s Law turns out to be the first articulation of the concept, in 1936 — Theodore Paul Wright observed that every time total aircraft production doubled, costs dropped by 20%.
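Wright’s observation has a simple closed form: if each doubling of cumulative production cuts unit cost by a fixed fraction, cost follows a power law in cumulative volume. A quick sketch (the 20% learning rate is Wright’s aircraft estimate from above; the first-unit cost is made up):

```python
import math

# Wright's Law: each doubling of cumulative production drops unit cost
# by a fixed fraction (20% in Wright's 1936 aircraft estimate).
# Equivalently: cost(n) = cost(1) * n ** b, where b = log2(1 - rate).

def wright_cost(n_units, first_unit_cost=100.0, learning_rate=0.20):
    b = math.log2(1 - learning_rate)
    return first_unit_cost * n_units ** b

print(round(wright_cost(1), 1))  # → 100.0
print(round(wright_cost(2), 1))  # → 80.0 (one doubling: -20%)
print(round(wright_cost(4), 1))  # → 64.0 (two doublings)
```

Note the curve depends only on *cumulative* production, not on output per period, which is exactly the distinction drawn earlier between learning economies and scale economies.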
But the strategic implications of the learning curve effect were really thanks to Morris Chang, who used BCG to crunch the numbers for this intuition he had about scale and pricing.
In other words, Chang the practitioner was operating in continuous reality, while BCG — then a small handful of consultants — came in later to discretise the concept and claim it for themselves. BCG’s early growth was really due to the learning curve pricing — they took Chang’s insight, firmed it up, renamed it ‘experience curve’ and turned it into a thing for their firm.
Right now what we’ve decided to do is to write out our cases to include as many related concepts as possible, without necessarily calling out every discretised concept.
I’m a bit torn up about this:
On one hand, ideally we should highlight every concept that shows up in the text.
On the other hand, calling out discretised concepts isn’t very helpful to the business practitioner? If you’re an exec faced with a scale incumbent, calling out the discretised concepts won’t give you ideas on how to beat the player — but giving you stories of many players who won through scale and then lost it might.
Sorry for the messy reply — I got very very excited about the history of learning economies and went into that history in the middle of my reply. What do you think though? Should we try as much as possible to call out discretised concepts?
One idea I’ve been toying with is just telling the stories, and letting the community call out the concepts on their own. That way we can move quickly on case creation, while the community gets the bulk of the pedagogical benefit for calling out the discretised concepts.
This speaks to something I was trying to figure out how to say around cash strapped hiring - that the opposite of cash strapped hiring is to be overpaying for people because you’re hiring without a sense of what you actually need. But another way to look at this is “learning curve hiring” - hiring when you’re not sure yet what you need.
I would suggest that there’s an opportunity here…to think of this case study project as a learning economy. The inherent problem in ill-structured domains is that we don’t know what’s important - by learning to highlight it, we’re bringing structure to the domain, and we’re refocusing our attention on the known knowns and known unknowns.

A good starting point could be something like a collection of “further reading” links similar to what you send out in your weekly email update, and to actively augment the case study content with the sorts of references called out in this post. That would create an interesting opportunity for lateral reading through the cases - where someone engages with the study series because it’s adjacent to something they’re interested in, but as they go through it they recognize that there’s a more precise way to discuss the thing they’re interested in and can pivot off naturally from there.
I like this, provided it can be tied in closely to the case text. I have found it more difficult to cross-reference ideas through this forum conversation.
Noted. I said earlier in this thread that I wasn’t sure if having a separate thread for each case was a better idea, but I’ve received enough feedback now to know that it likely is. Will make that change going forward.
I think having separate threads is a win - but I’m also attracted to the idea of being able to engage with the case more as hypertext, where comments about a part of the text can be more closely mapped to their context in the source document. (I’m thinking something along the lines of where Medium or my Kindle will show me sections that others have highlighted, but then allow me to dig in more deeply to their comments.)
I’m a late starter in the series, so still parsing the Ford & Netflix cases that landed in my mailbox.
I’m not sure I agree with the scale description of the Netflix case. Their original scale came from the number of customers. The way streaming video content works over the internet, you are likely to experience both scale and network effects based on your audience size. And Netflix surely did outrank its closest competitors. As is clear from the case, it was 4x the size of Amazon in 2010, while Amazon was giving the service away for free.
Pushing large volumes of data over the internet becomes cheaper as you scale due to the nature of pricing network capacity on bandwidth instead of bits sent, and due to peering arrangements. In addition, you enjoy network effects as a large player: once you have established a local cache with a local-loop network provider for your customers, any additional customer is “free” or much, much cheaper…
You can still see that in Netflix’s GTM strategy of bundling with local-loop providers and later mobile providers.
There is a sentence in the case on 2008 changes that is undervalued from a strategic perspective: " (At the time, Netflix paid studios based on content utilisation. It has since switched to fixed fees)"
Had they not managed to make this shift in cost model this early in the game, they would never have been able to start their second tier of scale advantage in content and (I think) would have had to make their original content bet much earlier and under less favorable financial conditions. I think this is worth highlighting more clearly in the case description.
There is an additional aspect to Netflix scaling that I’ve always found interesting; curious what other folks think about this aspect/flywheel.
Because of the huge costs in content acquisition/content distribution, your R&D costs become almost invisible in your overall COGS. Combined with a fixed-cost content model instead of a rev-share model, you have basically incentivised your product/R&D group to run as many experiments on demand generation and/or customer engagement as possible, because the potential payoff is large. Even small increases in customer content engagement yield large benefits, especially when combined with demand-generation capabilities that steer additional engagement to high-profit parts of your content catalogue.
A digital streaming service is uniquely suited for these kinds of experiments as opposed to the other earlier content business models. But it puts the recent subscription woes in a different light IMO.
Looking forward to the rest of the cases, really like the experiment
I’m going to add my vote to individual threads per business case
My late entry after reading the Netflix case ended up after a whole discussion on TI, PayPal et al. that I was trying to avoid reading because of spoilers.
Hi @ramon! You’re right — the Netflix case needs to be completely rewritten to focus on the switch to content creation as a scale advantage (i.e. turning variable costs into a fixed cost, amortised across the entire subscriber base). We were so in the weeds that we didn’t realise that it was not clear. A clearer, shorter version may be found in the 7 Powers summary (in the scale economies section)
Just finished reading the PayPal case that was emailed to me today
I am curious why PayPal was included as a scale economy case? To me it read like an example of Hamilton Helmer’s cornered resource or process power instead.
Or potentially a learning economy on fraud.
I can’t make out from the case description how transaction costs went down when they captured more market share. The entire case focuses on the differential fraud detection/risk management tech, and at the end of the interview/case they point out that PayPal is still maintaining its lead in that area by dropping its fraud rate by another 10 basis points from an already low level for the industry. The case lacks fraud comparison numbers, so it’s hard to make out.
Their access to that tech certainly allowed them to scale, because they had much lower costs than competitors. But my understanding was that economy-of-scale power meant that the lower cost/higher margin would need to be causally linked to the larger scale. And I can’t find that causality in this case.
Competitive equilibrium will mean that the casino that can bid the highest for the “customer” is the house that can:
a) source the most uncorrelated offsets to the wager
b) has the biggest bankroll
In the trading business, condition A is satisfied by the market makers with the best data/analytics who “see the most flow”. A firm entrenched in both equity markets and futures markets with licenses from both the SEC and CFTC is going to be more efficient at laying off the risks it acquires from serving tourists, regardless of the venue they choose to play in.
A and B will create a virtuous loop. The best players will build larger bankrolls which allow them to outbid competitors further which earns them first look at the flow which improves their models and so forth.
The overall problem that market makers face in public markets is something called ‘adverse selection’ — that is, the risk that the person putting on a trade is an extremely smart institution with privileged information about the securities being traded.
To give a simplified example: let’s say that you’re a market maker. The way market making works is that someone comes to a stock exchange and says “I want to sell $100 million of Amazon stock at the following price range: X — Y” and you say to the exchange “ok, I can fulfil that order with the following spread”, and if that spread is good (that is, small) the exchange will give you the order to fill, and you will be allowed to buy the securities from them and hold it on your books for a few seconds/minutes before you sell it off to someone else. The risk, of course, is that the person buying is some super smart hedge fund that knows something about Amazon that you don’t, and within minutes of the trade the market realises what is going on and the price of Amazon slides and you take a huge loss.
So what market makers want is a pool of relatively uninformed, uncorrelated traders to offset against the risk they’re taking when their counter-party is an ultra smart institutional trader. This is the logic for ‘payment for order flow’ — market makers will pay companies like Robinhood to send retail trades their way, since the odds are good that these folk are all uncorrelated, relatively misinformed folk who are willing to buy securities when the smart money is fleeing (and vice versa).
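A rough back-of-the-envelope sketch of that adverse-selection logic (all parameters hypothetical): the market maker earns the half-spread on uninformed trades and takes a loss on informed ones, so diluting the informed fraction with retail flow can flip expected P&L from negative to positive.

```python
# Sketch of the adverse-selection logic above (all numbers hypothetical):
# a market maker earns the half-spread on uninformed trades but loses
# to informed counterparties. Buying uncorrelated retail flow dilutes
# the informed fraction p, which is why paying for order flow can pay off.

def expected_pnl_per_trade(p_informed, half_spread=0.02, informed_loss=0.30):
    """Expected P&L per trade given the fraction of informed counterparties."""
    return (1 - p_informed) * half_spread - p_informed * informed_loss

print(round(expected_pnl_per_trade(0.10), 4))  # → -0.012 (mostly institutional flow)
print(round(expected_pnl_per_trade(0.02), 4))  # → 0.0136 (diluted with retail flow)
```

The asymmetry is the whole point: the per-trade loss to informed flow dwarfs the per-trade spread income, so the business only works when informed trades are a small enough fraction of the total.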
(Btw, I am not an expert in this, so do tell me if I got the fundamental description of market making wrong.)
Notice that in order to get trades, market makers have to offer competitive bid-ask spreads. And so the benefits of scale show up in the following way:
A and B will create a virtuous loop. The best players will build larger bankrolls which allow them to outbid competitors further which earns them first look at the flow which improves their models and so forth.
Of course, in practice the exchanges don’t want just one market maker (or the MMs would have too much power), so the exchanges spread the retail order flow out across a couple of market makers, and have complicated methods for figuring out how to do this while still getting good prices for their retail traders.
The most interesting property of this instantiation of scale economies, to me, is the idea that everyone else who isn’t a scale player plays in an environment that slowly becomes more toxic over time.
So … how is this similar to the Paypal case? And how is the Paypal case different?
I have now also received the TI/TSMC case, which is to some extent similar to the PayPal case/instantiation we’re discussing.
Caveat: I know next to nothing about stock trading internal mechanics, so I had to look up what a market maker is and learned a bunch of new stuff about how exchanges work along the way. So, thank you for that, but I’m certainly even more of a novice than you claim to be. This might end up being a case of the one-eyed leading the blind.
However, reading your analysis of the market-maker position and risks, and assuming bankroll and access to uncorrelated customers are both scale factors used to hedge against someone else’s superior knowledge advantage: I think both the bankroll and the access to uncorrelated customers will allow for offering a better spread if scale for either goes up.
So whether or not there is knowledge involved on the marketmaker side of the equation hinges entirely on your comment that:
And I really don’t understand enough about stock markets or market makers to assess whether this is a similar learning effect to the one described in the PayPal & TI cases. However, the example you give afterwards looks like a scale example to me, not a learning example (I think?!)
What makes PayPal stand out to me, even compared to TI/TSMC was that they needed knowledge to survive irrespective of scale. The toxicity of the market was generated by an (illegal) outside party, not through scale. And the knowledge was pretty closely tied to a phase-transition of taking internet payments from custom to product. Or in other words, the fraud knowledge that PayPal needed to survive and grow was a pre-requisite for this new form of digital payment. It would kill small and large alike.
It reminds me a lot more of the co-evolution of practices during a phase transition as described by Wardley Maps here
@ramon I find Wardley Maps intriguing — the tool seems well adapted for shared sensemaking in a team setting, though I haven’t dug too deeply into it yet.
This may be true, but PayPal’s position as the market leader likely contributed to their ability to arrest fraud. Jimmy Soni’s book The Founders is a little oblique on this — he dedicates one whole chapter to talk about the fraud problem, and how Max Levchin (sidelined by Musk after the X.com-Confinity merger) threw himself at the problem after realising that it was an existential threat.
Initially Levchin’s edge was that he could hang out on Russian hacker forums and speak to said hackers in Ukrainian, but eventually PayPal built tools to visualise patterns of transactions between wallets. This second thing was arguably key; it was later turned into data analysis software company Palantir — and it required some scale to do well. The more patterns PayPal’s tools saw, the better it could recognise and flag them to human checkers.
Interesting, if the volume of patterns is key to developing the capability then scale advantages clearly apply…
I still think the PayPal case is different from the tsmc case because of the transitional aspect of it.
If the interfaces that PayPal used to offer payments on the interwebs were already present and pretty much standardized, why didn’t any of the owners of those interfaces compete with them?
It’s reasonable to assume that those entities would have access to a much larger fraud pattern library than PayPal and likely to have more experience with fraud as well.
There is something to this case that feels very different from the Texas Instruments case, where increased scale was achieved through pricing manipulation. The PayPal case registers for me as a transitional scale shift; TI & TSMC register as optimisation scale shifts.
But perhaps this is just my personal bias because I’m more familiar with the complexity of the challenges in PayPal’s world and lack of context makes it easier to dismiss the challenges faced by TI/TSMC.
PS did you see the stratechery analysis or re-evaluation of the aggregator pattern for Netflix vs Spotify?
Wait, but to context set — the case we cited was from a 2004 talk by Thiel and Levchin at Stanford, only a few years after their PayPal experience and four years after the dotcom boom and bust. At the time:
No banks had good money transfer services tied to email (I think they still don’t to be fair — though in Singapore we now have a scheme that’s tied to phone numbers!)
No standardised interfaces existed (for context, PayPal had to ‘invent’ the captcha as well as invent the dual transaction verification method for verifying bank accounts — source: The Founders by Jimmy Soni)
Sure, banks had fraud departments, but a) you had to open a bank account to use their money transfer services, whereas with PayPal and its competitors anyone with an email address could sign up, b) their growth wasn’t as meteoric as PayPal’s, and c) it took a while before banking UIs caught up with internet-first companies.
So the patterns and behaviours were probably very different from the fraud patterns in more traditional companies.