Is the Boeing 737 MAX safe?

Photo by Oleg Belyakov, CC BY-SA 3.0

Two high-profile crashes in less than six months have raised questions about the safety of the latest iteration of the Boeing 737, the MAX. Aviation authorities in various parts of the world grounded the aircraft within 48 hours of the second crash. It took the United States longer to act. But do two fatal crashes prove that the aircraft is unsafe? This question is fundamentally statistical in nature — it is equivalent to asking whether bad luck could explain these two crashes, or whether they indicate some underlying flaw with the aircraft.


Historical crash rates

To understand how improbable these crashes were, one must first appreciate the incredible safety of modern aviation. In over 87 million flights by the MAX’s predecessor, the 737NG, there have been only seven “hull loss” accidents with fatalities. Only five of these involved double- or triple-digit fatalities. The rate of “catastrophic” accidents was thus less than 1 per 17 million flights. The 737’s main competitor, the Airbus A320, has a roughly similar safety record. The MAX has flown approximately 350,000 flights with two catastrophic accidents, or 1 per 175,000 flights. This rate is almost 100 times higher than its predecessor’s. With these statistics we can ask: if the MAX had the same safety level as the previous generation 737 or the A320, what is the chance that we would have observed two catastrophic crashes so quickly after its entry into service?
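The rates quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the approximate figures given in the text; the variable names are illustrative.

```python
# Approximate figures quoted in the text above.
NG_FLIGHTS = 87_000_000   # flights by the 737NG
NG_CATASTROPHIC = 5       # accidents with double- or triple-digit fatalities
MAX_FLIGHTS = 350_000     # approximate flights by the MAX
MAX_CATASTROPHIC = 2      # catastrophic MAX accidents


def flights_per_accident(flights, accidents):
    """Average number of flights per catastrophic accident."""
    return flights / accidents


ng_flights_per_accident = flights_per_accident(NG_FLIGHTS, NG_CATASTROPHIC)
max_flights_per_accident = flights_per_accident(MAX_FLIGHTS, MAX_CATASTROPHIC)

# How many times higher the MAX accident rate is than the 737NG rate.
rate_ratio = ng_flights_per_accident / max_flights_per_accident

print(f"737NG: 1 per {ng_flights_per_accident:,.0f} flights")
print(f"MAX:   1 per {max_flights_per_accident:,.0f} flights")
print(f"MAX rate is roughly {rate_ratio:.0f} times higher")
```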

Figure 1

Sources: “Statistical Summary of Commercial Jet Airplane Accidents, Worldwide Operations, 1959–2017,” Boeing Corporation; “Boeing 737 Max 8 Jets Are Grounded Nearly Everywhere,” New York Times, March 12; Boeing Commercial Deliveries Log.


The odds of crashing

There are many reasonable approaches to answering this question. All, however, yield the same answer: the MAX does not meet the safety standards set by the 737NG or the A320. The easiest way to see this is to note that, since its entry, the MAX has accounted for 1% of flights by 737NGs and A320s but is responsible for 100% of catastrophic crashes. If it had the same safety level as its contemporaries, the chance that the first catastrophic crash would hit the MAX, instead of a different model 737 or A320, is approximately 1 in 98. Thus the first crash was already cause for concern. But it was not definitive. Among other things, we did not set out to test whether the MAX was safe, as there was no reason ex ante to believe it would be less safe than any other new aircraft model. If Boeing and Airbus were to introduce dozens of new aircraft models over many decades, all of them just as safe as the 737NG and the A320, eventually one would crash as quickly as the MAX did.

The second crash foreclosed this possibility. The chance that both the first and second crashes would hit a MAX, instead of another model, is approximately 1 in 9,600, or 0.01%. Before the second crash the MAX was already under intense scrutiny, and regulators were implicitly testing whether it was safe. Mere bad luck is no longer a plausible explanation. Instead the only remaining explanation is that the MAX is, for whatever reasons, less safe than other 737s and A320s.
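The two odds above follow from one number, the MAX's share of flights. The sketch below assumes that share is exactly 1/98 (the value implied by the "1 in 98" figure) and that the two crashes are independent draws; both are simplifying assumptions rather than figures stated in the sources.

```python
from fractions import Fraction

# Assumption: the MAX's share of 737 and A320 flights is exactly 1/98,
# the value implied by the "1 in 98" odds quoted in the text.
max_share = Fraction(1, 98)

# If the MAX were as safe as its contemporaries, each catastrophic crash
# would hit a MAX with probability equal to its share of flights.
p_first_is_max = max_share

# Assumption: the second crash is independent of the first.
p_both_are_max = max_share ** 2

print(f"First crash is a MAX:  1 in {1 / p_first_is_max}")
print(f"Both crashes are MAXs: 1 in {1 / p_both_are_max}")
print(f"As a percentage: {float(p_both_are_max) * 100:.2f}%")
```

The exact product is 1 in 9,604, which the text rounds to 1 in 9,600.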

Figure 2

Picture these circles as roulette wheels. The roulette ball is tiny, the size of a single pixel on your screen. The chance that the ball lands inside the red stripe on the left wheel is 1 in 98. The chance that the ball lands inside the (barely visible) red stripe on the right wheel is 1 in 9,600.


How likely is it that the MAX is flawed?

It is tempting to conclude that the data reveal a 99.99% chance the MAX is unsafe, but that would be misleading. After all, most new aircraft models safely enter service. With enough new entries a string of improbable bad luck will eventually strike one of them. A formula known as Bayes’ theorem allows us to quantify how likely it is that the MAX is unsafe. If we believe that the chance of a new aircraft having a serious flaw is 1 in 10, then Bayes’ theorem reveals that the chance that the MAX is unsafe is 99.9%. Intuitively, the improbability of a flaw, 1 in 10, must be weighed against the (extreme) improbability of two fatal crashes of an unflawed aircraft, lowering the chance from 99.99% to 99.9%. Alternatively, suppose we have strong faith in Boeing and believe that the chance of a new aircraft having a serious flaw is only 1 in 100. Then Bayes’ theorem reveals that the chance that the MAX is unsafe is still 98.5%. Even optimistic observers should thus be deeply concerned about the safety of the MAX.
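The Bayesian update can be sketched as follows. The text does not say how likely two early crashes would be for a flawed aircraft, so the function takes that likelihood as a parameter. Its default of 1.0 is an assumption, and it is the value most favorable to the "flawed" conclusion. Under that assumption the sketch gives roughly 99.9% for a 1-in-10 prior and roughly 99.0% for a 1-in-100 prior. The 98.5% quoted above implies a somewhat smaller likelihood that the text does not specify, so this is an illustration of the method rather than a reproduction of the exact figures.

```python
def posterior_flawed(prior_flawed, p_data_if_safe, p_data_if_flawed=1.0):
    """P(flawed | two early crashes) via Bayes' theorem.

    p_data_if_flawed is an assumption; the text does not state it.
    """
    numerator = prior_flawed * p_data_if_flawed
    denominator = numerator + (1 - prior_flawed) * p_data_if_safe
    return numerator / denominator


# Chance of both crashes hitting a MAX if it were as safe as its peers.
P_TWO_CRASHES_IF_SAFE = 1 / 9600

for prior in (0.10, 0.01):
    post = posterior_flawed(prior, P_TWO_CRASHES_IF_SAFE)
    print(f"prior {prior:.0%} -> posterior {post:.1%}")
```

Lowering `p_data_if_flawed` below 1.0 lowers the posterior, which is one way the gap between 99.0% and the quoted 98.5% could arise.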

Statistics and probability reveal that the MAX is unsafe, but they cannot tell us why. Only thorough accident investigations by trained, impartial personnel will reveal these details. Until we know the causes of the two accidents, however, safety-conscious air carriers and authorities that understand data science should continue grounding the MAX.


If you are concerned about the safety of the MAX, please share this article.

Polls

In the first week of my graduate econometrics course, I try to emphasize the difference between using regression results for the purposes of prediction versus using them to make causal inferences. If your goal is to make predictions (without altering the data generating process), then simple regression, applied to observational data, can be a remarkably robust tool, and you don’t need to worry about designing a randomized controlled trial or coming up with a fancy quasi-experimental research design. This is also why polling is such a reliable predictive tool, and so I was surprised to see that in 2008 a two-day-old CNN poll apparently did a terrible job of predicting the Nevada GOP caucus results.

The linked reference, a tweet, claimed that CNN predicted vote shares of 29% for McCain, 19% for Romney, and 6% for Paul, but that the actual results came in at 51% for Romney, 14% for Paul, and 13% for McCain. The Nevada caucuses have very low turnout, so it’s conceivable that CNN completely mispredicted who would show up at the polls, but a deviation of that magnitude is still surprising. That the linked tweet had no reference piqued my suspicions, and a few minutes of searching on Lexis-Nexis confirmed them: the referenced CNN/ORC poll is a national level poll, not a Nevada state poll.

To eliminate any doubt, note the polling shares for each GOP candidate in the archived poll: McCain – 29%, Huckabee – 20%, Romney – 19%, Giuliani – 14%, and Paul – 6%. These numbers exactly match those in the tweet. The CNN/ORC poll was conducted on Jan 14-17, 2008, and the Nevada Republican caucuses occurred two days later on Jan 19, 2008. The probability that two separate CNN polls of different populations of voters, both supposedly conducted on Jan 17, 2008, would produce the exact same numbers for five candidates is on the order of 0.001 or less.

The bottom line:

  1. If a deviation looks too extreme to be plausible, maybe it is.
  2. Be even more suspicious of numbers that look implausible and have no references attached.

On Roads and Apple Cars

An abundance of evidence suggests that Apple is developing a car. Part of the motivation may be that automotive sales exceed $1 trillion of revenue per year in the US alone, making cars one of the few consumer goods that command greater spending than smartphones. The rumors started with frequent sightings of Apple-leased Dodge Caravan minivans on public roads. The minivans are equipped with cameras and other apparatus, sparking speculation that they either may be part of a Streetview-style mapping operation or that they may represent an attempt to develop a self-driving car. For a couple of reasons, I have been virtually certain since the initial article that they represent the former rather than the latter:

  1. Per the original article, Rob Enderle claims, “it’s a self-driving car rather than a mapping car.” The fact that Rob Enderle claims it’s not a mapping car should make you fairly confident that it’s a mapping car.
  2. Apple vans have been spotted in Hawaii. It seems incredibly unlikely that Hawaii would be a convenient test location for any Apple Car, self-driving or otherwise.

The Economics of Mapping

Adding to my confidence that the vans represent a mapping effort is the fact that Streetview-style maps are surprisingly cheap to create. Apple Maps has improved tremendously in coverage and accuracy since its rocky 2012 debut, but Streetview is one genuinely useful feature that it still lacks.1 It makes sense that Apple would add Streetview if the cost is reasonable, but how much might that cost be?

If a mapping van averages 20 mph – including stops, breaks, pauses for bad weather, etc. – for 4,000 hours per year (two shifts), then each van can cover 80,000 miles per year. The United States currently has 8.7 million lane-miles of public roads. Thus a fleet of 110 vans could map every public road in the US in one year. If accurate mapping requires multiple passes, it might take several hundred vans – conservatively assume 300 of them.

Hiring a light vehicle typically costs up to several dollars per mile, so the operating costs for a single van might be as high as $300,000 per year. Even at that level, total costs to map every public road in the US would be “only” $90 million (plus associated hardware, server, programming, and data processing costs).
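The back-of-the-envelope estimate above can be reproduced in a few lines. This sketch uses only the figures quoted in the text; the constant names are illustrative.

```python
import math

# Figures quoted in the text above.
AVG_SPEED_MPH = 20             # average speed including stops and pauses
HOURS_PER_YEAR = 4_000         # two shifts
US_LANE_MILES = 8_700_000      # public road lane-miles in the US
FLEET_SIZE = 300               # conservative fleet allowing multiple passes
COST_PER_VAN_PER_YEAR = 300_000  # upper-end operating cost in dollars

miles_per_van = AVG_SPEED_MPH * HOURS_PER_YEAR

# Vans needed to cover every lane-mile once in a year.
vans_single_pass = math.ceil(US_LANE_MILES / miles_per_van)

# Per-mile cost implied by the $300,000 annual figure.
implied_cost_per_mile = COST_PER_VAN_PER_YEAR / miles_per_van

total_cost = FLEET_SIZE * COST_PER_VAN_PER_YEAR

print(f"Miles per van per year: {miles_per_van:,}")
print(f"Vans for a single pass: {vans_single_pass}")
print(f"Implied cost per mile:  ${implied_cost_per_mile:.2f}")
print(f"Total annual cost:      ${total_cost:,}")
```

The single-pass result of 109 vans matches the rounded figure of 110 used in the text.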

In reality, mapping efficiency is probably higher than estimated here, and Apple needn’t map every public road in the first iteration (about 30% of lane-miles are on unpaved rural roads that receive almost no traffic). Still, even $90 million is much cheaper than I would’ve expected. It equates to 8 hours of operating income during one of Apple’s most recent financial quarters.

The bottom line is that creating Streetview-style maps costs a drop in Apple’s revenue bucket and eliminates one of the last two big gaps in their mapping product. It seems like a no-brainer, and I will be surprised if they don’t announce this feature by iOS 10, and possibly as soon as Monday.

Update, March 2019: We are now on iOS 12, and still no Streetview! They have, however, been putting the mapping data to use.

Update 2, June 2019: Finally. Better late than never!

The Apple Car

Although the camera-equipped vans are clearly related to maps, not car development, it is immediately clear why Apple might also have an interest in building a car. Cars represent a market that has existed for the better part of a century but has yet to be commoditized. From a purely utilitarian perspective, the Toyota Corolla and Toyota Sienna are more than “good enough” to meet the transportation needs of almost all American families; higher-end models and luxury vehicles provide virtually no functional advantage in real-life driving scenarios over these baseline vehicles. Yet many Americans buy vehicles that are significantly more expensive than a basic sedan or minivan. A market like this one that resists trends towards commoditization is one in which Apple can profit from its design skills.

For the same reason, I’m skeptical that Apple would want to design a fully autonomous vehicle like the proposed Google self-driving car. Fully autonomous vehicles make little sense to own, particularly in urban areas, because they can simply be summoned on demand as robo-taxis. The problem with a robo-taxi, from Apple’s perspective, is that most people don’t particularly care what kind of taxi or rental car they hire, as long as it meets a certain baseline. Hence traditional taxis (and now UberX and Lyft) have much more market share than limousine services, and midsize sedans from Kia, Hyundai, and the Big Three dominate rental fleets. The robo-taxi market, which would almost surely become commoditized, would be an unattractive market for Apple to play in. But the good news for Apple’s automotive ambitions is that true mass-market Level 4 autonomous vehicles are likely still far away.

Notes

1. The other is transit directions, but it sounds like this too may be addressed shortly.

MUNI–

If you’ve ever ridden San Francisco’s MUNI system, you might have felt that it’s run by political hacks and buffoons.  You might have felt that the average MUNI worker doesn’t care about anything except getting his or her next paycheck.  But you have no experience in operating a multimodal mass transit system under a budget constraint in a challenging urban environment.  Is MUNI terrible because the people who run it are terrible, or is MUNI terrible because the problems it faces are insurmountable?

The new Muni+ app allows us to resolve this conundrum.  Here MUNI has the opportunity to build an app within the well-defined boundaries of the iOS APIs and list it in the highly curated App Store.  There are no transients, drunks, or criminals that will board the app and interfere with its operation.  The app does not require hard-to-find spare parts for buses built 40 years ago.  The budget for the app is effectively unlimited in comparison to the overall MUNI budget.  MUNI does not need to negotiate a special union contract with their app developer.  So how does MUNI do when creating a product in this pristine, controlled environment?  Can they execute in a competent manner?

No!  This app is preposterously bad.  The reviews do not do it justice; you must download it and try it for yourself (as I did).  It almost makes me think that it’s a sick prank designed to discredit MUNI, except that they actually promoted it on their own website.  Anyway, one good thing has come from this app: we now know that MUNI is truly incompetent.

Update: Replaced some links that expired with archive.org links.

Update: “Desktop Class”

Re: last week’s post on the feasibility of ARM-based MacBooks, the A7 benchmarks are out, and it looks like an A7-based MacBook Air could be reasonably competitive with the Intel MacBook Airs. The bottom line is that the Haswell MacBook Air is 1.6x faster than the iPhone 5S across single-threaded integer tasks (I’m less interested in floating-point performance since that has less impact on perceived performance for the “average” user). If we assume a 25-30% increase in clock speed when moving to a laptop power envelope, then an A7-based MacBook Air could achieve 80% of the performance of the current Haswell MacBook Air.  I suspect that would qualify as “good enough” performance for an entry-level MacBook Air, but ultimately that’s Apple’s decision to make.
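The 80% figure comes from dividing the assumed clock boost by the measured gap. A minimal sketch, assuming that single-threaded performance scales linearly with clock speed (a simplification, since memory and thermal limits can reduce the gain):

```python
# Measured gap quoted above: Haswell MacBook Air vs. iPhone 5S (A7).
HASWELL_OVER_A7 = 1.6

# Assumed clock-speed increases in a laptop power envelope.
CLOCK_BOOSTS = (1.25, 1.30)

# Assumption: performance scales linearly with clock speed.
relative_performance = [boost / HASWELL_OVER_A7 for boost in CLOCK_BOOSTS]

for boost, rel in zip(CLOCK_BOOSTS, relative_performance):
    print(f"{boost - 1:.0%} clock boost -> {rel:.0%} of Haswell performance")
```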

“Desktop Class”

Like many, I was surprised by the new 64-bit A7 at today’s Apple Event. I didn’t think it would be possible to double CPU speeds again in one year (vis-à-vis the already fast A6), and I certainly didn’t expect a 64-bit CPU. However, I don’t think the biggest news here is the performance of the iPhone 5S, as impressive as it is. I think the biggest news is that the A7 is legitimately a “desktop class” processor, as Phil Schiller put it.

Performance

Is that possible? Could the A7 really be desktop class? After browsing some Geekbench 3 scores, I believe the answer is “yes.”  To be competitive with the current Intel MacBook Airs, an ARM-based MacBook Air will need to generate roughly similar single-threaded integer performance, because not all tasks can scale with additional cores.  There are currently no iPhone 5S benchmarks in the Geekbench browser (though that will change in about a week), but we can extrapolate from the iPhone 5 benchmarks.  Relative to the iPhone 5, a Haswell Core i5 MacBook Air is 2.1x to 4.2x faster on single-threaded integer tasks (I omit the AES benchmark since it is orders of magnitude faster on Intel hardware due to the AES Instruction Set).  The geometric mean across all 12 tests is 2.5x, so an A7-based MacBook Air would need to be around 2.5x faster than today’s iPhone 5.

Let’s assume that the A7 really is “2x faster” than the iPhone 5’s A6, as Apple claims.  Honestly I suspect this is conservative; the A6 is about 2.7x faster in Geekbench 3 integer performance than the A5, and Apple claimed a “2x” performance gain there as well.  Let’s also assume that Apple could crank the clock speed by at least 25% with a MacBook Air power envelope instead of an iPhone power envelope.  If so, an A7 optimized for a MacBook Air could achieve single-threaded integer performance similar to that of the current Haswell MacBook Air (2x*1.25 = 2.5x).  Combined with multiple cores and good memory bandwidth, I expect that an A7-based MacBook Air would perform quite well.  It would also get great battery life.

Update: the benchmarks are out, and while a 25-30% clock-boosted A7 doesn’t quite get to Haswell MacBook Air levels, it gets 80% of the way there.

Economics

What is the advantage of an ARM-based MacBook Air?  In a word, cost.  No one outside Apple and Intel knows exactly how much Apple pays Intel for each Core i5, but it’s likely in the range of $250–300.  According to Wikipedia the CPU’s price is $342, though large OEMs like Apple surely get a discount.  Most of this price represents pure margin, going either straight to Intel’s profits or into developing the next generation of CPUs and fabs.  The bottom line is that an A7 would cost Apple much, much less than $250–300 to manufacture.  For reference, the bill of materials for the entire iPhone 5S is very unlikely to exceed $300.

Cutting a couple hundred dollars of cost from the MacBook Air would give Apple a lot of options.  One option would be to lower the price of the low-end model by 20% while maintaining the same profit margin.  Another option would be to maintain the current price while dramatically boosting profit margins.  In reality I expect they would choose some combination of these two options.

Put another way, Apple could match Wintel prices and maintain a comfortable 25–30% margin while the Wintel OEMs struggle to maintain a 5–10% margin after paying the Microsoft and Intel IP “fees” for using Windows and x86/x64.  That’s not to say that Apple would offer $500 laptops; I doubt they’d be willing to make the build-quality sacrifices needed to produce a $500 laptop.  But they could literally price match any “Ultrabook” out there with similar specs and earn a much, much higher margin than their Wintel competitors.

The clear loser in this scenario would be Intel.  Not only would they lose current Apple sales, but they could lose current sales to other OEMs as well if Apple gets aggressive on pricing to gain additional market share.  In the longer term Microsoft could also suffer.  The ARM version of Windows, Windows RT, has been a resounding failure to date.  ARM MacBooks with competitive performance would put Microsoft in the uncomfortable position of either saddling its OEMs with high hardware costs (by forcing them to use expensive Intel CPUs) or abandoning its biggest competitive advantage (the huge library of existing x86-compatible Windows software).  If it becomes necessary to make that decision, Steve Ballmer might be glad he was forced out.

The Bottom Line

  • An A7-based MacBook Air could likely be performance competitive with current Haswell MacBook Airs
  • But it would likely cost Apple a couple hundred dollars less to manufacture than current MacBook Airs
  • This cost advantage could put Intel, PC OEMs, and eventually Microsoft in a bind

Palmdale Spur

Clem Tillier recently (and convincingly, in my view) argued that a Tejon alignment for the California High Speed Rail System would be significantly cheaper and faster than the proposed Tehachapi alignment. The main pushback from blog commenters (admittedly a non-representative population) appears to be that isolating the Antelope Valley from high-speed service is not worth billions of dollars in capital cost savings and 12 minutes of reduced running time. Furthermore, some are concerned that eliminating the connection to the future XpressWest line would doom that project.

Even if we accept these counterarguments at face value, it still appears that a Tejon alignment could save money while preserving Antelope Valley service and a connection to XpressWest. What both sides seem to have ignored is that laying track in the Antelope Valley is, for the most part, absurdly cheap because it is flat and relatively uninhabited.  Below I sketch out a spur that could connect Palmdale and Lancaster to the Tejon alignment at modest cost.

I don’t have a train performance calculator, but reasonable estimates of running times suggest it would take 18 minutes to reach the CAHSR mainline from Lancaster and 23 minutes to reach the CAHSR mainline from Palmdale. At that point it would be approximately 12 minutes to Sylmar or, eventually, 25 minutes to Union Station. Total nonstop running time to Union Station would be around 43 minutes from Lancaster and 48 minutes from Palmdale. This compares favorably to current Metrolink service, which takes 120 minutes from Lancaster (local) or 93 minutes from Palmdale (express). One-seat rides from the Antelope Valley to Bakersfield, Fresno, and the San Francisco Bay Area would also be feasible. A connection to XpressWest could be added north of Lancaster. The spur also allows for a future station at the planned Centennial development off of I-5 and SR-138, which could contain up to 70,000 people.1

What about cost? A detailed KML file listing major civil structures (viaducts, grade crossings, and cuts) is available for download here.2 The bottom line is that there would be approximately 52 miles of track at grade, 4 miles of viaduct, 1.5 miles of cuts or fills, 9 grade separations, and 1 canal crossing. Using the FRA cost measures found in the Merced-Fresno Final EIR, this works out to about $1.24 billion before any overhead, and $1.98 billion after accounting for overhead. This implies a cost of $34.7 million per mile, which is almost identical to the cost per mile of the initial Madera-Fresno segment.3

$2 billion is no small amount, and we can reasonably discuss whether the high-speed spur is worth it compared to double-tracking the Metrolink Antelope Valley line and purchasing tilting DMUs.  However, $2 billion is less than half as much as the $5.2 billion that Clem argues a Tejon alignment would save.  Thus the overall cost of a Tejon alignment with an Antelope Valley spur would still be significantly less than the proposed Tehachapi alignment.
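The cost comparison can be sanity-checked from the rounded figures quoted above. This sketch assumes the spur's total length is the sum of the at-grade, viaduct, and cut-or-fill mileage. With those rounded lengths the per-mile cost comes out near $34.4 million, slightly below the $34.7 million quoted, presumably because the original calculation used unrounded lengths.

```python
# Rounded figures quoted in the text above.
AT_GRADE_MILES = 52
VIADUCT_MILES = 4
CUT_OR_FILL_MILES = 1.5
COST_BEFORE_OVERHEAD = 1.24e9   # dollars
COST_WITH_OVERHEAD = 1.98e9     # dollars
TEJON_SAVINGS = 5.2e9           # savings Clem attributes to a Tejon alignment

# Assumption: total spur length is the sum of the three track categories.
total_miles = AT_GRADE_MILES + VIADUCT_MILES + CUT_OR_FILL_MILES

overhead_factor = COST_WITH_OVERHEAD / COST_BEFORE_OVERHEAD
cost_per_mile = COST_WITH_OVERHEAD / total_miles

# Savings that remain after paying for the spur.
net_savings = TEJON_SAVINGS - COST_WITH_OVERHEAD

print(f"Total length:    {total_miles} miles")
print(f"Overhead factor: {overhead_factor:.2f}")
print(f"Cost per mile:   ${cost_per_mile / 1e6:.1f} million")
print(f"Net savings:     ${net_savings / 1e9:.2f} billion")
```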

Finally, it is worth noting that for Antelope Valley residents, even a spur still would not be as fast to Los Angeles as the proposed Tehachapi alignment. From Lancaster, the spur would take about 12 minutes longer to LA than the Tehachapi alignment. From Palmdale, the spur would take about 20 minutes longer to LA than the Tehachapi alignment. However, these figures must be weighed against the fact that a Tehachapi alignment would add 12 minutes to the ride of virtually every other high speed rail rider in the state.4 At that point, I think the choice that maximizes public welfare becomes clear.

Notes

1. Naturally the state would get the rights to this development as well when it used eminent domain to acquire Tejon Ranch Company, per Clem’s suggestion. The development would undoubtedly become much more valuable after the state completed a Tejon alignment.

2. Those who wish to view the detailed map without installing Google Earth and its associated crapware (i.e., Google Software Update) may view it here.

3. Whether we should expect the cost per mile to be higher or lower than the Madera-Fresno segment is unclear. On the one hand, unlike the spur costs quoted above, the initial Madera-Fresno contract does not include systems or electrification. On the other hand, the share of track in viaducts, trenches, cuts, or fills is almost twice as high on the Madera-Fresno segment as it is on the Antelope Valley spur. I should also note that I do not have an estimate of ROW acquisition costs, though these seem likely to be modest given the low price of land in the desert.

4. The main exception would be travelers between the Central Valley and the SF Bay Area, for whom the southern mountain crossing is irrelevant except insofar as it affects train frequency.