Why Computers Won't Replace You Just Yet

By Sendhil Mullainathan, The New York Times | Updated: 3 July 2014 11:22 IST
Consider these two tweets by Al Gore, both promoting the same article:

- "40% of smartphone users connect to Internet immediately upon awakening, before leaving bed. #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

- "Cybercrime market now greater than annual global market for marijuana, cocaine, and heroin #TheFuture http://bit.ly/WYRz39 @TheAtlantic"

Can you guess which one was retweeted more often?

Three computer scientists, Chenhao Tan, Lillian Lee and Bo Pang, have built an algorithm that also makes these guesses, as described in a recent paper, and the results are impressive. (The answer: The first one got more retweets.)

That an algorithm can make these kinds of predictions shows the power of "big data." It also illustrates a fundamental limitation of big data: Guessing which tweet gets retweeted is significantly easier than creating one that gets retweeted.

To see why, it is useful to see how the algorithm was built. It used a data set of around 11,000 paired tweets - two tweets about the same link sent by the same person - to learn which word patterns looked predictive and then tested whether these patterns hold in new data. This is usually how "smart" algorithms are created from big data: Large data sets with known correct answers serve as a training bed and then new data serves as a test bed - not too different from how we might learn what our co-workers find funny.
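
The train-then-test routine described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the tweet pairs are invented, the two features (word count and a "please" flag) are stand-ins, and the simple perceptron learner is a swapped-in technique, not the method used in the paper.

```python
# Toy illustration only: every tweet pair below is invented, not from the paper's data.

def features(tweet):
    """Hypothetical crude features: word count and a 'please' flag."""
    words = tweet.lower().split()
    return [len(words), 1.0 if "please" in words else 0.0]

def diff(pair):
    """Feature difference between the first and second tweet of a pair."""
    a, b = pair
    return [x - y for x, y in zip(features(a), features(b))]

def train(pairs, labels, epochs=20):
    """Perceptron on feature differences; label is +1 if the first tweet won, else -1."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for pair, y in zip(pairs, labels):
            d = diff(pair)
            score = sum(wi * di for wi, di in zip(w, d))
            if y * score <= 0:  # wrong (or undecided) guess: nudge the weights
                w = [wi + y * di for wi, di in zip(w, d)]
    return w

def predict(w, pair):
    score = sum(wi * di for wi, di in zip(w, diff(pair)))
    return 1 if score > 0 else -1

def accuracy(w, pairs, labels):
    hits = sum(predict(w, p) == y for p, y in zip(pairs, labels))
    return hits / len(pairs)

# "Training bed": pairs with known answers.
train_pairs = [
    ("new study out", "new study on sleep and phones out now"),
    ("please read our long report on cybercrime markets", "report here"),
    ("big news today for phone users everywhere", "big news"),
    ("see this", "please see this chart of smartphone habits"),
]
train_labels = [-1, 1, 1, -1]

# "Test bed": new pairs the learner never saw.
test_pairs = [
    ("short one", "a much longer tweet with more words"),
    ("please share this long and detailed article today", "share"),
    ("a long tweet that nobody cared about at all", "wow"),
]
test_labels = [-1, 1, -1]

w = train(train_pairs, train_labels)
print("held-out accuracy:", accuracy(w, test_pairs, test_labels))
```

On this toy data the learned rule amounts to "pick the longer tweet," and it gets two of the three held-out pairs right. The miss is the reason for keeping a separate test bed: a pattern that fits every training pair can still fail on new data.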

The end result is an algorithm that guesses well. It can guess which tweet gets retweeted about 67 percent of the time, beating humans, who on average get it right only 61 percent of the time.

This is striking when you think of the enormous handicap the algorithm has. Yes, it could learn from 11,000 pairs of tweets. But it has no other knowledge. It has none of the wealth of contextual information you have accumulated over the years. It has never heard friends' complaints about spouses checking their phone the first thing in the morning. It does not have a sense of humor or know what a pun is. It does not know what makes a turn of phrase elegant or awkward.

It must rely on a few crude features, such as length of the tweet, the presence of certain words ("retweet" or "please") or the use of indefinite articles. Yet with so little, it does so much. This is one of the miracles of big data: Algorithms find information in unexpected places, uncovering "signal" in places we thought contained only "noise."
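
To make "crude features" concrete, here is a minimal sketch that extracts the three kinds of signal just mentioned. The function name, the tokenizing rule and the exact feature definitions are assumptions made for illustration; the paper's actual feature set is not reproduced here.

```python
import re

# Assumed word lists, taken from the examples named in the text.
ASK_WORDS = {"retweet", "please"}
INDEFINITE_ARTICLES = {"a", "an"}

def crude_features(tweet):
    """Hypothetical feature extractor: length, share requests, indefinite articles."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return {
        "length": len(tweet),  # length in characters
        "asks_to_share": any(t in ASK_WORDS for t in tokens),
        "indefinite_articles": sum(t in INDEFINITE_ARTICLES for t in tokens),
    }

gore_tweet = ("40% of smartphone users connect to Internet immediately upon "
              "awakening, before leaving bed. #TheFuture http://bit.ly/WYRz39 "
              "@TheAtlantic")
made_up_tweet = "Please retweet: a chart and an essay"  # invented example

print(crude_features(gore_tweet))
print(crude_features(made_up_tweet))
```

Nothing here knows what a pun is or why checking a phone in bed is relatable. Each tweet is reduced to a handful of counts and flags, which is all the learner gets to see.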

But we do not need to roll out the welcome mat for our machine overlords just yet. While the retweet algorithm is impressive, it has an Achilles' heel, one shared by all prediction algorithms.

We care about predicting retweets mainly because we want to write better tweets. And we assume these two tasks are related. If Netflix can predict which movies I like, surely it can use the same analytics to create better TV shows. But it doesn't work that way. Being good at prediction often does not mean being better at creation.

One barrier is the oldest of statistical problems: Correlation is not causation. Changing a variable that is highly predictive may have no effect. For example, we may find the number of employees formatting their résumés is a good predictor of a company's bankruptcy. But stopping this behavior is hardly a fruitful strategy for fending off creditors.

The causality problem can show up in very subtle ways. For example, the tweet predictor finds that longer tweets are more likely to be retweeted. It seems unlikely that you should therefore write longer tweets. The old adage that "less is more" is, if anything, truer in this medium. Instead, length is probably a good predictor because longer tweets have more content. So the lesson is not "make your tweets longer" but "have more content," which is far harder to do.
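
A small simulation shows how length can predict retweets without causing them. The causal model is entirely made up for illustration: it assumes, by construction, that only "content" drives retweets, while both content and filler add length. The numbers and function names are hypothetical.

```python
# Hypothetical causal model, invented for illustration: only content drives retweets.
def retweets(content, filler):
    return 10 * content            # filler has no effect, by construction

def length(content, filler):
    return 20 * content + filler   # content and filler both make a tweet longer

tweets = [(1, 5), (2, 3), (3, 8), (4, 2), (5, 6)]  # invented (content, filler) pairs

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

lengths = [length(c, f) for c, f in tweets]
shares = [retweets(c, f) for c, f in tweets]
r = pearson(lengths, shares)

# Intervention: pad every tweet with 50 units of filler and recount retweets.
padded_shares = [retweets(c, f + 50) for c, f in tweets]

print("correlation between length and retweets:", round(r, 3))
print("retweets unchanged after padding:", padded_shares == shares)
```

In this made-up world length is an excellent predictor, yet padding every tweet changes nothing, because the variable that matters was content all along.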

Another problem comes from an inherent paradox in predicting what is interesting. Rarity and novelty often contribute to interestingness - or at least to drawing attention. But once an algorithm finds those things that draw attention and starts exploiting them, their value erodes. When few people do something, it catches the eye; when everyone does it, it is ho-hum. Calling a food "artisanal" was eye-catching, until it became so common that we're not far away from an artisanal plunger.

In the Twitter example, the use of the words "retweet" or "please" was predictive. But if everyone starts asking you to "Share this article. Please," will it continue to work?

Finally, and perhaps most perversely, some of the most predictive variables are circular.

For example, in another paper, the computer scientists Lars Backstrom, Jon Kleinberg, Lillian Lee and Cristian Danescu-Niculescu-Mizil predict which posts on Facebook generate many comments. One of the most predictive variables is the time it takes for the first comment to arrive: If the first comment arrives quickly, then the post is likely to generate many more comments in the future. This helps Facebook decide which posts to show you. But it does not help anyone to write a highly commented post. It says: "Want to write a post people like? Well, write one that people like!"

These limitations are not meant to take away from the power of predictive algorithms. It is truly amazing, for example, how well an algorithm can predict which tweets will get retweeted.

They do remind us to moderate expectations. Arthur C. Clarke once posited three laws of prediction. The third is apropos here: "Any sufficiently advanced technology is indistinguishable from magic." Because algorithms armed with big data can do some impressive things - self-driving cars! - we can too easily treat them like magic and overstate what they do. This can lead to extrapolations that are simply not realistic. (Soon computers will be doing my job!) It can create fears that are ill-founded. (Soon companies will know enough about me to get me to buy anything!) It can create expectations that we are very far from meeting. (Soon computers will write movies!)

The new big-data tools, amazing as they are, are not magic. Like every great invention before them - whether antibiotics, electricity or even the computer itself - they have boundaries in which they excel and beyond which they can do little.

By the way, share this article. Please.

© 2014 New York Times News Service

 
