AI coding tools show reliability gaps in structured output tasks

Tuesday 17 March 2026 - 16:00

A new study from the University of Waterloo finds that leading artificial intelligence coding tools still fail in roughly one out of four cases when generating structured outputs, raising concerns about their reliability in real-world software development workflows.

The research, released on March 16 and scheduled for presentation at the International Conference on Learning Representations 2026, evaluated 11 large language models across 18 structured output formats and 44 tasks. Even the best-performing proprietary systems reached only about 75 percent accuracy, while the top open-source models achieved close to 67 percent.

Structured output remains a critical weak point

The study, titled “StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs,” focused on formats commonly used in development pipelines, including JSON, YAML, CSV, HTML, React and SVG. These formats are essential for integrating AI-generated code into production systems.

Researchers assessed model outputs using a combination of syntax validation, keyword matching and visual question answering. The results showed that while models performed reasonably well on text-based tasks such as documentation and simple data structures, they struggled with more complex outputs.
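As a rough illustration of the first two checks, the sketch below scores a JSON output for syntax validity and keyword coverage. It is not the StructEval scoring code: the function name `score_json_output` and the plain substring match are assumptions made for this example, and the visual question answering step is left out.

```python
import json


def score_json_output(text: str, required_keywords: list[str]) -> dict:
    """Hypothetical scorer for illustration only (not the study's implementation)."""
    # Syntax validation: does the output parse as JSON at all?
    try:
        json.loads(text)
        syntax_ok = True
    except json.JSONDecodeError:
        syntax_ok = False

    # Keyword matching: fraction of expected keywords found in the raw text.
    hits = sum(1 for kw in required_keywords if kw in text)
    keyword_score = hits / len(required_keywords) if required_keywords else 1.0

    return {"syntax_ok": syntax_ok, "keyword_score": keyword_score}


if __name__ == "__main__":
    print(score_json_output('{"name": "a", "age": 3}', ["name", "age"]))
    print(score_json_output('{"name": "a",', ["name", "age"]))
```

The second call shows why the two checks are kept separate: a truncated output can still mention some of the expected keywords while failing to parse.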

Tasks involving visual or layout elements, including image generation, video content, dynamic web design and diagram code, produced the highest error rates. The study also found that generation tasks, where models convert natural language instructions into structured formats, were significantly more difficult than conversion tasks between existing formats.
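To make the distinction concrete, a conversion task starts from an input that is already structured, so the expected result is largely determined by that input. The sketch below performs one such conversion, CSV to JSON, with Python's standard library; the function name `csv_to_json` and the sample data are assumptions for illustration and are not taken from the benchmark. A generation task would instead give the model only a natural-language description and ask it to produce the JSON from scratch.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


if __name__ == "__main__":
    # Illustrative input only.
    print(csv_to_json("name,role\nAda,engineer\n"))
```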

Human oversight remains essential

The research team included Dongfu Jiang, Jialin Yang and Wenhu Chen, supported by a group of 17 contributors involved in annotation and evaluation. According to Jiang, the study measured both syntactic correctness and whether outputs meaningfully addressed the task.

He noted that despite rapid advances, AI coding systems still require close human supervision. Developers using these tools cannot rely solely on automated outputs, particularly in environments where precision is critical.

Chen emphasized the collaborative research model at Waterloo, where students contribute to and lead benchmarking efforts, reflecting a broader trend in AI development that combines experimentation with evaluation.

Widespread adoption meets practical limitations

The findings come at a time when AI-assisted coding tools have become deeply embedded in software engineering workflows. A recent survey by The Pragmatic Engineer indicates that 95 percent of respondents use AI tools at least weekly, and 75 percent rely on them for at least half of their engineering tasks.

Platforms such as GitHub Copilot, Claude Code and Cursor are now standard in many development environments. However, the Waterloo study highlights a key risk: errors in structured outputs may not always be immediately visible, increasing the likelihood of hidden bugs or configuration issues.

In complex systems, such issues can propagate and lead to broader failures, making validation and review processes more important than ever.
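One way such a validation step might look is sketched below: a small gate that checks a model-generated JSON configuration before anything downstream consumes it. The field names in `REQUIRED_FIELDS` and the function `validate_config` are hypothetical, chosen only for this example; a real project would more likely rely on its own schema and tooling.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
REQUIRED_FIELDS = {"service": str, "port": int, "replicas": int}


def validate_config(raw: str) -> list[str]:
    """Return a list of problems found in the generated config (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif not isinstance(data[field], expected) or isinstance(data[field], bool):
            problems.append(f"wrong type for {field}")
    return problems


if __name__ == "__main__":
    # Parses as valid JSON, yet the port is a string and a field is missing.
    print(validate_config('{"service": "api", "port": "8080"}'))
```

The example input illustrates the kind of error the study warns about: the output is syntactically valid, so the problem surfaces only when the content itself is checked.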

The study has been published in Transactions on Machine Learning Research and contributes to ongoing discussions about the role of large language models in production-grade software development.

