Some petroleum is light crude oil just below the ground, which gushes forth if you dig a deep enough hole in the dirt. Other petroleum is trapped far beneath the earth or locked in sedimentary shale rocks, and requires deep drilling and elaborate fracking or high-heat pyrolysis to be usable. When oil prices were low prior to the 1973 embargo, only the cheaper sources were economically viable to exploit. But during periods of soaring prices over the decades since, producers have been incentivized to use increasingly expensive means of unlocking further reserves.
The same dynamic applies to data, which is, after all, the plural of the Latin datum. Some data exist in neat and tidy datasets: labeled, annotated, fact-checked, and free for download in a common file format. But most data are buried more deeply. They may sit on badly scanned handwritten pages; may consist of terabytes of raw video or audio, without any labels on relevant features; or may be riddled with inaccuracies and measurement errors or skewed by human biases. And most data are not on the public internet at all.
An estimated 96% to 99.8% of all online data are inaccessible to search engines—for example, paywalled media, password-protected corporate databases, legal documents, and medical records, plus an exponentially growing volume of private cloud storage. In addition, the vast majority of printed material has still never been digitized—around 90% for high-value collections such as the Smithsonian and U.K. National Archives, and likely a much higher proportion across all archives worldwide.
Yet arguably the largest untapped category is information that’s currently not captured in the first place, from the hand motions of surgeons in the operating room to the subtle expressions of actors on a Broadway stage.
Between the end of the Second World War and 1980, manufacturing employment as a share of total employment fell from around one-third of all jobs to one in five. Today, that share is down to fewer than one in 10.
Though economic nationalists argue that this trend was caused by globalization, in reality the primary driver, by a large margin, was technological change: Advances that substantially raised manufacturing productivity also displaced many middle-skill, middle-wage manufacturing workers.
These labor-market disruptions were not confined to manufacturing: New technologies also eliminated the need for many middle-wage administrative, office, clerical, construction, and manufacturing-production jobs. Instead of an administrative assistant, many white-collar workers now have Microsoft Outlook and voicemail. Instead of depositing a check by handing it to a bank teller, people use ATMs or their smartphones, or make and receive payments electronically.
But as disruptive as this wave of technological change was, the past several decades have been a net positive for American households. In fact, the story of the last half-century is mostly a story of upward mobility.