Why Computers Won't Replace You Just Yet

By Sendhil Mullainathan, The New York Times | Updated: 3 July 2014 11:22 IST
Consider these two tweets by Al Gore, both promoting the same article:

- "40% of smartphone users connect to Internet immediately upon awakening, before leaving bed. #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

- "Cybercrime market now greater than annual global market for marijuana, cocaine, and heroin #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

Can you guess which one was retweeted more often?

Three computer scientists, Chenhao Tan, Lillian Lee and Bo Pang, have built an algorithm that also makes these guesses, as described in a recent paper, and the results are impressive. (The answer: The first one got more retweets.)

That an algorithm can make these kinds of predictions shows the power of "big data." It also illustrates big data's fundamental limitation: Guessing which tweet gets retweeted is significantly easier than creating one that gets retweeted.

To see why, it is useful to see how the algorithm was built. It used a data set of around 11,000 paired tweets - two tweets about the same link sent by the same person - to learn which word patterns looked predictive and then tested whether these patterns hold in new data. This is usually how "smart" algorithms are created from big data: Large data sets with known correct answers serve as a training bed and then new data serves as a test bed - not too different from how we might learn what our co-workers find funny.
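
The train-then-test routine described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' method: the pairs are synthetic (generated at random, with the longer tweet winning about 70 percent of the time by assumption), the only "pattern" learned is tweet length, and every name here (`make_toy_pairs`, `train`, `predict_a_wins`, `accuracy`) is hypothetical.

```python
import random

def make_toy_pairs(n, rng, p_longer_wins=0.7):
    """Synthetic stand-in for paired tweets: (tweet_a, tweet_b, a_won)."""
    pairs = []
    for _ in range(n):
        len_a, len_b = rng.sample(range(20, 141), 2)  # two distinct lengths
        longer_wins = rng.random() < p_longer_wins    # assumed, not measured
        a_won = (len_a > len_b) == longer_wins
        pairs.append(("x" * len_a, "x" * len_b, a_won))
    return pairs

def train(pairs):
    """Return +1 if the longer tweet usually won in training, else -1."""
    score = 0
    for a, b, a_won in pairs:
        score += 1 if (len(a) > len(b)) == a_won else -1
    return 1 if score >= 0 else -1

def predict_a_wins(direction, a, b):
    """Guess whether tweet a beats tweet b, using the learned direction."""
    a_is_longer = len(a) > len(b)
    return a_is_longer if direction == 1 else not a_is_longer

def accuracy(direction, pairs):
    """Share of pairs for which the guess matches the known answer."""
    hits = sum(predict_a_wins(direction, a, b) == a_won for a, b, a_won in pairs)
    return hits / len(pairs)

rng = random.Random(0)
pairs = make_toy_pairs(1000, rng)
train_pairs, test_pairs = pairs[:800], pairs[800:]  # training bed, test bed

direction = train(train_pairs)                 # learn from known answers
test_accuracy = accuracy(direction, test_pairs)  # check on unseen pairs
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The point of holding back the last 200 pairs is that the learned rule is scored only on data it never saw, which is what "tested whether these patterns hold in new data" means in practice.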

The end result is an algorithm that guesses well. It can guess which tweet gets retweeted about 67 percent of the time, beating humans, who on average get it right only 61 percent of the time.

This is striking when you think of the enormous handicap the algorithm has. Yes, it could learn from 11,000 pairs of tweets. But it has no other knowledge. It has none of the wealth of contextual information you have accumulated over the years. It has never heard friends' complaints about spouses checking their phone the first thing in the morning. It does not have a sense of humor or know what a pun is. It does not know what makes a turn of phrase elegant or awkward.

It must rely on a few crude features, such as length of the tweet, the presence of certain words ("retweet" or "please") or the use of indefinite articles. Yet with so little, it does so much. This is one of the miracles of big data: Algorithms find information in unexpected places, uncovering "signal" in places we thought contained only "noise."
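
To give a sense of how crude such features are, here is a minimal sketch that computes the three mentioned above for a single tweet. The function name `crude_features`, the dictionary keys and the simple word-splitting rule are assumptions made for illustration; the paper's actual feature set is not reproduced here.

```python
import re

INDEFINITE_ARTICLES = {"a", "an"}
REQUEST_WORDS = {"retweet", "please"}

def crude_features(tweet: str) -> dict:
    """Compute three simple, hypothetical features for one tweet."""
    # Lowercase and split into rough word tokens (letters and apostrophes).
    words = re.findall(r"[a-z']+", tweet.lower())
    return {
        "length": len(tweet),  # number of characters
        "asks_to_share": any(w in REQUEST_WORDS for w in words),
        "indefinite_articles": sum(w in INDEFINITE_ARTICLES for w in words),
    }

features = crude_features("By the way, share this article. Please.")
print(features)
```

Nothing in these numbers captures humor, elegance or context, which is what makes it surprising that features of this kind carry any signal at all.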

But we do not need to roll out the welcome mat for our machine overlords just yet. While the retweet algorithm is impressive, it has an Achilles' heel, one shared by all prediction algorithms.

We care about predicting retweets mainly because we want to write better tweets. And we assume these two tasks are related. If Netflix can predict which movies I like, surely it can use the same analytics to create better TV shows. But it doesn't work that way. Being good at prediction often does not mean being better at creation.

One barrier is the oldest of statistical problems: Correlation is not causation. Changing a variable that is highly predictive may have no effect. For example, we may find the number of employees formatting their résumés is a good predictor of a company's bankruptcy. But stopping this behavior is hardly a fruitful strategy for fending off creditors.

The causality problem can show up in very subtle ways. For example, the tweet predictor finds that longer tweets are more likely to be retweeted. It seems unlikely that you should therefore write longer tweets. The old adage that "less is more" is, if anything, truer in this medium. Instead, length is probably a good predictor because longer tweets have more content. So the lesson is not "make your tweets longer" but "have more content," which is far harder to do.
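
A toy simulation makes the distinction concrete. It is a made-up model, not data from the paper: it assumes a hidden quantity called `content` drives both a tweet's length and its retweets, and that padding adds length without adding content. All names and numbers are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(1)

def simulate_tweet(padding=0):
    """Toy model: content drives both length and retweets; padding adds length only."""
    content = rng.randint(1, 5)                   # hypothetical "amount of content"
    length = 20 * content + rng.randint(0, 15) + padding
    retweets = 10 * content + rng.randint(0, 5)   # depends on content, not length
    return length, retweets

# Observation: in the wild, longer tweets do get more retweets.
observed = [simulate_tweet() for _ in range(2000)]
median_length = sorted(length for length, _ in observed)[len(observed) // 2]
long_mean = mean(r for length, r in observed if length >= median_length)
short_mean = mean(r for length, r in observed if length < median_length)

# Intervention: make every tweet 40 characters longer without adding content.
padded = [simulate_tweet(padding=40) for _ in range(2000)]
plain_mean = mean(r for _, r in observed)
padded_mean = mean(r for _, r in padded)

print(f"longer half: {long_mean:.1f} retweets, shorter half: {short_mean:.1f}")
print(f"unpadded: {plain_mean:.1f} retweets, padded: {padded_mean:.1f}")
```

In this model length predicts retweets well, yet padding changes nothing, because length was only ever a stand-in for content.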

Another problem comes from an inherent paradox in predicting what is interesting. Rarity and novelty often contribute to interestingness - or at least to drawing attention. But once an algorithm finds those things that draw attention and starts exploiting them, their value erodes. When few people do something, it catches the eye; when everyone does it, it is ho-hum. Calling a food "artisanal" was eye-catching, until it became so common that we're not far away from an artisanal plunger.

In the Twitter example, the use of the words "retweet" or "please" was predictive. But if everyone starts asking you to "Share this article. Please," will it continue to work?

Finally, and perhaps most perversely, some of the most predictive variables are circular.

For example, in another paper, the computer scientists Lars Backstrom, Jon Kleinberg, Lillian Lee and Cristian Danescu-Niculescu-Mizil predict which posts on Facebook generate many comments. One of the most predictive variables is the time it takes for the first comment to arrive: If the first comment arrives quickly, then the post is likely to generate many more comments in the future. This helps Facebook decide which posts to show you. But it does not help anyone to write a highly commented post. It says: "Want to write a post people like? Well, write one that people like!"

These limitations are not meant to take away from the power of predictive algorithms. It is truly amazing, for example, how well an algorithm can predict which tweets will get retweeted.

But they do remind us to moderate expectations. Arthur C. Clarke once posited three laws of prediction. The third is apropos here: "Any sufficiently advanced technology is indistinguishable from magic." Because algorithms armed with big data can do some impressive things - self-driving cars! - we can too easily treat them like magic and overstate what they do. This can lead to extrapolations that are simply not realistic. (Soon computers will be doing my job!) It can create fears that are ill-founded. (Soon companies will know enough about me to get me to buy anything!) It can create expectations that we are very far from meeting. (Soon computers will write movies!)

The new big-data tools, amazing as they are, are not magic. Like every great invention before them - whether antibiotics, electricity or even the computer itself - they have boundaries in which they excel and beyond which they can do little.

By the way, share this article. Please.

© 2014 New York Times News Service

 
