EU AI Act Checker Reveals Big Tech's Compliance Pitfalls

A new tool designed by Swiss startup LatticeFlow and partners has tested generative AI models developed by big tech companies like Meta and OpenAI.

By Reuters | Updated: 16 October 2024 13:22 IST
Highlights
  • The EU's AI Act will come into effect over the next two years
  • Most AI models currently fall short of the EU's regulations in key areas
  • Popular AI models were tested across categories in line with the AI Act

Some of the most prominent artificial intelligence models are falling short of European regulations in key areas such as cybersecurity resilience and discriminatory output, according to data seen by Reuters.

The EU had long debated new AI regulations before OpenAI released ChatGPT to the public in late 2022. The record-breaking popularity and ensuing public debate over the supposed existential risks of such models spurred lawmakers to draw up specific rules around "general-purpose" AIs (GPAI).

Now a new tool designed by Swiss startup LatticeFlow and partners, and supported by European Union officials, has tested generative AI models developed by big tech companies like Meta and OpenAI across dozens of categories in line with the bloc's wide-sweeping AI Act, which is coming into effect in stages over the next two years.

Awarding each model a score between 0 and 1, a leaderboard published by LatticeFlow on Wednesday showed models developed by Alibaba, Anthropic, OpenAI, Meta and Mistral all received average scores of 0.75 or above.

However, the company's "Large Language Model (LLM) Checker" uncovered some models' shortcomings in key areas, spotlighting where companies may need to divert resources in order to ensure compliance.

Companies failing to comply with the AI Act will face fines of up to 35 million euros ($38 million) or 7% of global annual turnover, whichever is higher.

Mixed Results

At present, the EU is still trying to establish how the AI Act's rules around generative AI tools like ChatGPT will be enforced, convening experts to craft a code of practice governing the technology by spring 2025.

But LatticeFlow's test, developed in collaboration with researchers at Swiss university ETH Zurich and Bulgarian research institute INSAIT, offers an early indicator of specific areas where tech companies risk falling short of the law.

For example, discriminatory output has been a persistent issue in the development of generative AI models, which can reflect human biases around gender, race and other areas when prompted.

When testing for discriminatory output, LatticeFlow's LLM Checker gave OpenAI's "GPT-3.5 Turbo" a relatively low score of 0.46. For the same category, Alibaba Cloud's "Qwen1.5 72B Chat" model received only a 0.37.

Testing for "prompt hijacking", a type of cyberattack in which hackers disguise a malicious prompt as legitimate to extract sensitive information, the LLM Checker awarded Meta's "Llama 2 13B Chat" model a score of 0.42. In the same category, French startup Mistral's "8x7B Instruct" model received 0.38.

"Claude 3 Opus", a model developed by Google-backed Anthropic, received the highest average score, 0.89.

The test was designed in line with the text of the AI Act, and will be extended to encompass further enforcement measures as they are introduced. LatticeFlow said the LLM Checker would be freely available for developers to test their models' compliance online.

Petar Tsankov, the firm's CEO and cofounder, told Reuters the test results were positive overall and offered companies a roadmap to fine-tune their models in line with the AI Act.

"The EU is still working out all the compliance benchmarks, but we can already see some gaps in the models," he said. "With a greater focus on optimising for compliance, we believe model providers can be well-prepared to meet regulatory requirements."

Meta declined to comment. Alibaba, Anthropic, Mistral, and OpenAI did not immediately respond to requests for comment.

While the European Commission cannot verify external tools, the body has been kept informed throughout the LLM Checker's development and described it as a "first step" in putting the new laws into action.

A spokesperson for the European Commission said: "The Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI Act into technical requirements."

© Thomson Reuters 2024

 
