The First Data: Clay, Papyrus, Census
The first “data” weren't data; they were accounts. Some 5,400 years ago in Sumer, scribes pressed symbols into wet clay tablets, the beginnings of cuneiform. Each tablet was a record: how many sheep, how much wheat, how many slaves. Writing wasn't invented for poetry. It was invented for accounting.
Sumerian clay tablets
The oldest known written information: temple inventory lists. Over 500,000 tablets have been found, the first “database” in history. Some were later archived in libraries, such as the library of Ashurbanipal at Nineveh.
Egyptian papyrus
Lighter and more flexible than clay. The Egyptians recorded everything: taxes, population, soldiers, labor. Papyrus census returns survive from Roman Egypt. The best-known Roman census of the period is the one the Gospel of Luke links to Joseph and Mary, which took place in Judea rather than Egypt.
Domesday Book (England)
William the Conqueror ordered a complete survey of England, finished in 1086: every house, estate, animal, farmer. 13,418 places were recorded. The purpose? Taxation. Data was born as a tool of power, and it remains one.
"Data is not information, information is not knowledge, knowledge is not wisdom."
The Age of Statistics
The word “statistics” comes from the Latin status (state) — literally “the science of the state.” Data was power — and the first to use it systematically were governments.
🗺️ John Graunt (1662)
The London merchant analyzed the “Bills of Mortality” — weekly death reports. He discovered patterns: more died in winter, women lived longer, the plague appeared in cycles. He's considered the father of demography — statistical analysis before the term even existed.
🗃️ Florence Nightingale (1858)
Nightingale wasn't just a nurse — she was a pioneering statistician. She created “polar area diagrams” (circular charts) that proved more soldiers died from disease than from battle. Her data saved thousands of lives — through sanitary reforms.
🔢 Herman Hollerith (1890)
The US census of 1880 took 8 years to analyze. Hollerith invented machines with punch cards — the 1890 census was completed in 1 year. His company evolved into IBM.
The Digital Age: Bits, Bytes, and Databases
IBM RAMAC 305: the first hard drive
Capacity: about 5 MB. Weight: roughly 1 ton. Cost: about $3,200 per month to lease. It stored data on 50 metal disks, each 24 inches across. Today, a single iPhone photo can take up that much space.
Edgar Codd: the relational database
The British mathematician, then at IBM, published “A Relational Model of Data for Large Shared Data Banks” (1970), the foundation for SQL and for systems such as Oracle, MySQL, and PostgreSQL. The idea: data is stored in tables that “relate” to each other through shared values. Simple in theory, revolutionary in practice.
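To make the idea of tables that relate to each other concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and rows are hypothetical, chosen only for illustration; they do not come from Codd's paper.

```python
import sqlite3

# In-memory database; every table, column, and row here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (id, customer_id, item) VALUES
        (1, 1, 'wheat'), (2, 1, 'sheep'), (3, 2, 'clay');
""")

# The JOIN follows the shared key to relate rows across the two tables.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(rows)
```

Each customer is stored once, and each order points back to a customer through `customer_id`. The query reassembles the combined view on demand instead of duplicating names in every order row.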
Oracle: the first commercial SQL database
Larry Ellison, building on Codd's paper, created Oracle. First customer: the CIA. Oracle became the backbone of many banks, airlines, and telecoms. Ellison became a billionaire; Codd received almost nothing.
World Wide Web
Tim Berners-Lee created the World Wide Web — and suddenly data wasn't just internal: it was public, interconnected, searchable. The first website (info.cern.ch) explained... what the World Wide Web is.
[Chart: 💾 Data Storage Through the Ages]
Big Data: When Data Becomes Oil
In 2006, Clive Humby, the mathematician behind the Tesco Clubcard loyalty scheme, declared: "Data is the new oil." The phrase took off. Data was no longer just records; it was the raw material of a new economy.
Big Data is commonly characterized by the "3Vs": Volume (enormous scale), Velocity (speed of creation), Variety (range of types: text, image, GPS, audio). Later came Veracity (reliability) and Value. But the essence is simple: so much data that traditional tools can't process it.
📊 What We Produce Every Minute (2024)
Google: 5.9 million searches
YouTube: 500 hours of video uploaded
WhatsApp: 41.6 million messages
Instagram: 66,000 photos
Email: 231.4 million emails
Spotify: 40,000 hours of music listening
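Per-minute numbers are hard to picture, so a quick calculation can scale them to a full day. The sketch below simply multiplies the figures from the list above by 1,440 minutes; it assumes each rate stays constant around the clock, which is a simplification.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440

# Per-minute figures copied from the list above.
per_minute = {
    "Google searches": 5_900_000,
    "YouTube hours uploaded": 500,
    "WhatsApp messages": 41_600_000,
    "Instagram photos": 66_000,
    "Emails": 231_400_000,
    "Spotify listening hours": 40_000,
}

# Assumption: the per-minute rate is constant over the whole day.
per_day = {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}

for name, total in per_day.items():
    print(f"{name}: {total:,} per day")
```

Under that assumption, the email figure alone works out to roughly 333 billion messages a day, and YouTube receives 720,000 hours of new video daily.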
Who Owns Your Data?
The real question isn't how much data exists, but who controls it. And the answer is: fewer companies than you think.
🔵 Google/Alphabet
Search, Gmail, Maps, YouTube, Android, Chrome, Nest, Fitbit, Waze. Google knows: where you go (Maps), what you search (Search), what you watch (YouTube), what you write (Gmail), how you sleep (Fitbit). An average user: ~5.5 GB of data on Google.
🔵 Meta (Facebook)
Facebook, Instagram, WhatsApp, Messenger, Threads. Facebook alone had about 3.05 billion monthly users in 2023. Meta knows: your relationships, your interests, your politics, your face (recognition, since removed). 2023 profits: $39 billion, almost exclusively from advertising targeted with your data.
🟠 Amazon
What you buy, what you search, what you watch (Prime Video), what you say to Alexa/Echo (always listening for its wake word), what you read (Kindle). AWS (Amazon Web Services) hosts the data of other organizations: Netflix, Airbnb, the CIA.
🟢 Data Brokers
Companies that buy, aggregate, and sell data: Acxiom, Experian, Equifax, Oracle Data Cloud. Acxiom has data on 2.5 billion people, with ~1,500 data points per person (gender, income, debts, purchases, pets, political leanings).
Data Breaches: When Data Gets Lost
If data is “oil,” then data breaches are oil spills, but far worse, because leaked data can never be “cleaned up.”
Yahoo: 3 billion accounts
The largest breach in history. Every Yahoo account was compromised in the 2013 attack. In 2016 Yahoo reported “1 billion” affected accounts; in 2017 it admitted the real figure was “3 billion.” The Verizon acquisition price was reduced by $350 million.
Equifax: 147 million Americans
Full names, SSNs, birthdates, addresses: data that is hard or impossible to change. Equifax detected the breach 76 days after it began. Executives sold stock before the announcement. Settlement: up to $700 million.
Facebook: 533 million users
Phone numbers, emails, names, all published for free on a hacker forum in 2021. Meta didn't notify users, claiming the data was “old” (scraped in 2019). The Irish DPC imposed a fine of €265 million.
[Chart: 💀 Largest Data Breaches]
AI and the Era of “Synthetic Data”
Artificial intelligence wouldn't exist without data: data is its “food.” GPT-3, the predecessor of the model behind the first ChatGPT, was trained on roughly 570 GB of filtered text, distilled from a far larger crawl of the web. Midjourney was trained on billions of images. Training GPT-4 reportedly cost more than $100 million, but the data was “free” (scraped from the web).
New question: if an AI was trained on your data — your texts, your images, your music — who owns what it produces? The New York Times sued OpenAI. Artists sued Midjourney/Stability AI. The battle has just begun.
"In the 20th century, the most valuable raw material was oil. In the 21st, it's data. But there's one critical difference: oil runs out. Data multiplies."
Data started as a clay tablet in a Sumerian temple. Today it's the invisible substance that builds empires, decides elections, designs cities, cures diseases — and tracks your every step. The question is no longer “how much data exists.” The question is: who decides what happens to it — and whether you have a say in that decision.