From Oil to Data – A Career that Lived the Metaphors


After a few years on those offshore platforms, battling the turbulence of Bombay High amid the excitement of India's newly discovered crude in the 1980s, I migrated to Australia to watch more cricket (my passion) and, in doing so, pivoted into the world of ones and zeros. Even after nearly four decades, the echoes of my first career are everywhere.

It’s almost uncanny how the language of oil and gas production has seeped, almost organically, into the lexicon of data and artificial intelligence. 

When I first heard "data pipeline," my mind immediately conjured images of miles of steel tubing, snaking across the seabed, carrying black gold from reservoir to refinery. In our world, it's not hydrocarbons flowing, but streams of information – from sensors, databases, user interactions – being transported, transformed, and delivered. The concept is identical: a structured, efficient pathway that carries raw material through continuous processing to its destination.
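The analogy can be made concrete with a minimal sketch: each pipeline stage below is a Python generator that consumes the previous stage's output, much like connected pipe segments. The stage names (extract, transform, load) and the sensor-reading format are illustrative, not any particular library's API.

```python
# A minimal data-pipeline sketch: three chained stages, each consuming
# the previous stage's stream, the way pipe segments connect end to end.

def extract(records):
    # Source stage: yield raw sensor readings one at a time.
    for r in records:
        yield r

def transform(stream):
    # Processing stage: drop malformed readings and convert units.
    for r in stream:
        if r.get("pressure_psi") is not None:
            yield {"pressure_bar": r["pressure_psi"] * 0.0689476}

def load(stream):
    # Delivery stage: collect the refined records at the destination.
    return list(stream)

raw = [{"pressure_psi": 100}, {"pressure_psi": None}, {"pressure_psi": 50}]
result = load(transform(extract(raw)))
```

Because generators are lazy, records flow through one at a time rather than in bulk, which is the "continuous flow" property the pipeline metaphor captures.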

Then there’s “model distillation.” On offshore platforms, we’d run crude through towering distillation columns, separating it into valuable components like gasoline, diesel, and jet fuel based on their boiling points. In AI, we’re taking a large, complex “teacher” model – often unwieldy for deployment – and “distilling” its knowledge into a smaller, more efficient “student” model. It’s about extracting the essence, the most critical patterns, without the bulk. The goal is the same: to get a more refined, usable product. 
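The core of the technique can be sketched in a few lines: the student is trained to match the teacher's "softened" output distribution, produced by dividing logits by a temperature before the softmax. The logits below are toy numbers standing in for real model outputs, and the function names are my own.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Softened probabilities: higher temperature flattens the distribution,
    # exposing the teacher's relative preferences across all classes.
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # Cross-entropy between teacher's and student's softened outputs;
    # minimising this pushes the student toward the teacher's behaviour.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(-np.sum(p_teacher * np.log(p_student + 1e-12)))

teacher = [4.0, 1.0, 0.5]  # large model's raw scores for 3 classes
student = [3.5, 1.2, 0.6]  # small model's scores on the same input
loss = distillation_loss(teacher, student)
```

The loss is lowest when the student's distribution matches the teacher's exactly, which is the sense in which the essence is "distilled" across while the bulk of the teacher's parameters stays behind.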

And “model refinery”? That one hits close to home. An oil refinery is where the real magic happens – crude is broken down, treated, and reassembled into high-value products. In AI, our “models” are constantly being refined. We’re not just training them once; we’re continuously optimizing, pruning, fine-tuning, and enhancing their performance, much like a refinery constantly adjusts its processes to yield better quality products and higher throughput. We’re taking raw algorithms and data and turning them into intelligent, actionable tools. 

“Data extraction” is another direct lift. On the rigs, we were literally extracting oil and gas from beneath the earth or the seabed, pulling valuable resources out of vast, complex geological formations. In data, we’re doing the same – pulling out meaningful features, insights, or specific records from massive, often unstructured, datasets. It’s about finding the signal in the noise, the valuable resource hidden within the raw material.
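A small sketch of that idea: pulling structured records out of unstructured text with a regular expression. The log format below is invented for illustration, but the pattern of extraction is the everyday reality.

```python
import re

# "Data extraction" in miniature: structured readings pulled from
# free-form text, the way a well pulls oil from a formation.
raw_log = """
2024-03-01 well=BH-12 pressure=2150psi status=OK
maintenance note: flare inspected
2024-03-02 well=BH-12 pressure=2142psi status=OK
"""

pattern = re.compile(r"well=(\S+)\s+pressure=(\d+)psi")
readings = [
    {"well": m.group(1), "pressure_psi": int(m.group(2))}
    for m in pattern.finditer(raw_log)
]
```

The maintenance note, like barren rock, simply doesn't match the pattern and is left behind; only the valuable records come up the well.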

The parallels don’t stop there. We talk about “data lakes” and “data warehouses,” which, to my ears, sound remarkably like the vast reservoirs of oil and gas we sought to tap. The process of “data exploration” feels very much like the geological surveys and seismic imaging we’d conduct to find new reserves. And “drilling down into the data” – that’s a direct echo of the physical act of drilling a well to access the resource. 

Even concepts like “flow assurance” – ensuring the continuous, uninterrupted movement of fluids through pipelines – find their counterpart in ensuring data quality, integrity, and the smooth, continuous flow of information through our systems. Managing “pressure” and “flow rates” in a pipeline system is akin to managing data throughput, latency, and system load in a distributed AI architecture. 

It’s a testament, I suppose, to the fundamental principles that underpin complex systems, regardless of the industry. Whether you’re dealing with physical fluids or abstract information, the challenges of sourcing, transporting, processing, refining, and optimizing for value remain remarkably consistent.  

The jargon, it seems, is just a natural byproduct of shared problem-solving across seemingly disparate fields. It makes me smile sometimes, hearing my data teams talk about “pipelines” or “distillation” – a little piece of my past life, alive and well in the digital frontier. 
