Why Computers Won't Replace You Just Yet

By Sendhil Mullainathan, The New York Times | Updated: 3 July 2014 11:22 IST
Consider these two tweets by Al Gore, both promoting the same article:

- "40% of smartphone users connect to Internet immediately upon awakening, before leaving bed. #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

- "Cybercrime market now greater than annual global market for marijuana, cocaine, and heroin #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

Can you guess which one was retweeted more often?

Three computer scientists, Chenhao Tan, Lillian Lee and Bo Pang, have built an algorithm that also makes these guesses, as described in a recent paper, and the results are impressive. (The answer: The first one got more retweets).

That an algorithm can make these kinds of predictions shows the power of "big data." It also illustrates a fundamental limitation: guessing which tweet gets retweeted is significantly easier than creating one that does.


To see why, it helps to understand how the algorithm was built. It used a data set of around 11,000 paired tweets - two tweets about the same link sent by the same person - to learn which word patterns looked predictive, and then tested whether those patterns held in new data. This is usually how "smart" algorithms are created from big data: large data sets with known correct answers serve as a training bed, and then new data serves as a test bed - not too different from how we might learn what our co-workers find funny.
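To make that train-then-test loop concrete, here is a minimal Python sketch. It is not the authors' actual model: the synthetic data, the handful of features and the logistic-regression classifier are all assumptions chosen only to show the shape of the pipeline, in which patterns are learned on one slice of labeled pairs and then checked on pairs the model has never seen.

```python
# Minimal sketch of the train/test paradigm described above.
# NOT the authors' model: the synthetic pairs, features and classifier
# are illustrative assumptions only.
import random
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

random.seed(0)

def make_toy_pair():
    # Hypothetical stand-in for a real pair of tweets about the same link;
    # longer tweets containing "please" are artificially made more likely to "win".
    words = ["news", "data", "please", "retweet", "future", "phone", "morning"]
    a = " ".join(random.choices(words, k=random.randint(4, 12)))
    b = " ".join(random.choices(words, k=random.randint(4, 12)))
    score = lambda t: len(t) + 10 * t.split().count("please")
    return a, b, int(score(a) > score(b))  # label 1 if the first tweet got more retweets

pairs = [make_toy_pair() for _ in range(2000)]

def pair_features(tweet_a, tweet_b):
    # Crude per-tweet features, differenced across the pair.
    f = lambda t: [len(t), t.split().count("please"), t.split().count("retweet")]
    return [x - y for x, y in zip(f(tweet_a), f(tweet_b))]

X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]

# Learn which patterns look predictive on a training slice...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# ...then test whether those patterns hold on data the model has never seen.
print("held-out accuracy:", model.score(X_test, y_test))
```

On synthetic data the printed accuracy means nothing; the point is only the workflow of learning on labeled pairs and evaluating on held-out ones.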

The end result is an algorithm that guesses well. It correctly picks which of the two tweets will get more retweets about 67 percent of the time, beating humans, who on average get it right only 61 percent of the time.


This is striking when you think of the enormous handicap the algorithm has. Yes, it could learn from 11,000 pairs of tweets. But it has no other knowledge. It has none of the wealth of contextual information you have accumulated over the years. It has never heard friends' complaints about spouses checking their phones first thing in the morning. It does not have a sense of humor or know what a pun is. It does not know what makes a turn of phrase elegant or awkward.

It must rely on a few crude features, such as the length of the tweet, the presence of certain words ("retweet" or "please"), or the use of indefinite articles. Yet with so little, it does so much. This is one of the miracles of big data: algorithms find information in unexpected places, uncovering "signal" where we thought there was only "noise."
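For illustration only, here is a sketch of the sort of crude, surface-level signals the paragraph describes; the exact feature set used in the paper may differ.

```python
# Illustrative only: crude per-tweet features of the kind mentioned above.
# The exact features in the paper may differ.
def crude_features(tweet: str) -> dict:
    words = tweet.lower().split()
    return {
        "length": len(tweet),                                  # character count
        "asks_for_retweet": int("retweet" in words),           # explicit request
        "says_please": int("please" in words),                 # politeness marker
        "indefinite_articles": sum(w in ("a", "an") for w in words),
    }

print(crude_features("Please retweet: a quick look at the future of phones"))
```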


But we do not need to roll out the welcome mat for our machine overlords just yet. While the retweet algorithm is impressive, it has an Achilles' heel, one shared by all prediction algorithms.

We care about predicting retweets mainly because we want to write better tweets. And we assume these two tasks are related. If Netflix can predict which movies I like, surely it can use the same analytics to create better TV shows. But it doesn't work that way. Being good at prediction often does not mean being better at creation.

One barrier is the oldest of statistical problems: Correlation is not causation. Changing a variable that is highly predictive may have no effect. For example, we may find the number of employees formatting their résumés is a good predictor of a company's bankruptcy. But stopping this behavior is hardly a fruitful strategy for fending off creditors.

The causality problem can show up in very subtle ways. For example, the tweet predictor finds that longer tweets are more likely to be retweeted. It seems unlikely that you should therefore write longer tweets. The old adage that "less is more" is, if anything, truer in this medium. Instead, length is probably a good predictor because longer tweets have more content. So the lesson is not "make your tweets longer" but "have more content," which is far harder to do.

Another problem comes from an inherent paradox in predicting what is interesting. Rarity and novelty often contribute to interestingness - or at least to drawing attention. But once an algorithm finds those things that draw attention and starts exploiting them, their value erodes. When few people do something, it catches the eye; when everyone does it, it is ho-hum. Calling a food "artisanal" was eye-catching, until it became so common that we're not far away from an artisanal plunger.

In the Twitter example, the use of the words "retweet" or "please" was predictive. But if everyone starts asking you to "Share this article. Please," will it continue to work?

Finally, and perhaps most perversely, some of the most predictive variables are circular.

For example, in another paper, the computer scientists Lars Backstrom, Jon Kleinberg, Lillian Lee and Cristian Danescu-Niculescu-Mizil predict which posts on Facebook generate many comments. One of the most predictive variables is the time it takes for the first comment to arrive: If the first comment arrives quickly, then the post is likely to generate many more comments in the future. This helps Facebook decide which posts to show you. But it does not help anyone to write a highly commented post. It says: "Want to write a post people like? Well, write one that people like!"

These limitations are not meant to take away from the power of predictive algorithms. It is truly amazing, for example, how well an algorithm can predict which tweets will get retweeted.

It does remind us to moderate expectations. Arthur C. Clarke once posited three laws of prediction. The third is apropos here: "Any sufficiently advanced technology is indistinguishable from magic." Because algorithms armed with big data can do some impressive things - self-driving cars! - we can too easily treat them like magic and overstate what they do. This can lead to extrapolations that are simply not realistic. (Soon computers will be doing my job!) It can create fears that are ill-founded. (Soon companies will know enough about me to get me to buy anything!) It can create expectations that we are very far from meeting. (Soon computers will write movies!)

The new big-data tools, amazing as they are, are not magic. Like every great invention before them - whether antibiotics, electricity or even the computer itself - they have boundaries in which they excel and beyond which they can do little.

By the way, share this article. Please.

© 2014 New York Times News Service