
Exploring the key developments shaping the artificial intelligence landscape this week, from major product launches to emerging policy and safety debates.

OpenAI has significantly expanded its Codex desktop app by introducing computer control, an integrated browser, image generation, and more than 90 plugins, aiming to make it a central tool for developers. With over 3 million weekly users, Codex can now interact directly with a Mac by viewing the screen, controlling the cursor, and operating within apps, while running multiple agents simultaneously. The new browser allows users to guide tasks directly on webpages, and built-in image generation removes the need for separate API access. These additions are designed to streamline workflows, particularly in frontend development, testing, and environments without APIs.
The update also strengthens integration across developer tools and platforms, adding features such as multiple terminal tabs, GitHub pull request handling, SSH connections, and a detailed summary pane for tracking tasks. A proactive mode suggests tasks based on context from connected apps like Google Docs, Slack, and Notion, helping users resume work efficiently. The expanded capabilities position Codex alongside competitors such as Claude Code and OpenClaw, while combining multiple functions into a single desktop experience tied to a ChatGPT account. Despite its broad rollout, some features, including personalisation and computer control, are not yet available in the EU or UK. Source
Artificial intelligence use across the US federal government has expanded quickly in recent years, but adoption is uneven and concentrated in a small number of large agencies. According to the Brookings Institution, five major agencies account for over half of reported use cases, while smaller agencies struggle to keep pace, highlighting a widening gap in capability. Despite this growth, progress is constrained by persistent bottlenecks including shortages of AI-specialised talent, risk-averse organisational cultures, and procurement rules that are poorly suited to fast-evolving AI systems, with additional concerns that political dynamics, such as pressure from the Department of Government Efficiency, may be reinforcing caution in some areas.
Public trust and accountability are emerging as central challenges, with surveys indicating only 17% of Americans believe AI will benefit the country and just 16% expressing strong trust in the federal government to act in the public interest. The report also finds that more than 85% of high-impact AI systems lack required documentation on risk mitigation, despite official requirements, raising concerns about oversight and transparency. While poorly implemented systems could undermine confidence further, well-designed AI applications could improve public services and help rebuild trust. Brookings recommends expanding AI literacy across agencies, modernising procurement processes, strengthening transparency, and focusing deployment on use cases with clear public benefit. Source
Two months after a federal judge in New York ruled that AI chatbot conversations can be accessed by prosecutors, the legal industry has been rapidly reassessing how clients and lawyers use tools like ChatGPT and Claude. The ruling stemmed from a fraud case involving Anthropic’s Claude, where the court found that the defendant’s AI-generated materials could be seized because the chatbot is not a legal professional and therefore cannot provide privileged communication. As a result, more than a dozen major law firms have begun warning clients that chatting with AI systems about legal matters may risk waiving attorney-client privilege.
Law firms are now embedding these warnings into engagement contracts and updating internal guidance, with some advising clients to use only enterprise-grade, closed AI systems under lawyer supervision. Others are attempting to preserve privilege by ensuring AI use is explicitly directed by counsel, sometimes instructing clients to state this inside prompts to help support potential legal protections under doctrines that extend privilege to agents of attorneys. However, court decisions remain inconsistent, with some rulings suggesting AI chats are not privileged while others have protected them as work product depending on context and whether a user is represented by counsel.
The emerging legal consensus is still forming, but a clear divide is appearing: individuals using consumer AI tools independently are more exposed, while those operating under legal supervision may have stronger protections. Courts are beginning to shape rules through conflicting decisions, but until clearer precedent emerges, law firms are increasingly treating AI chat logs as potentially discoverable evidence and advising clients to assume their prompts could be read in court. Source
Anthropic has introduced identity verification for certain users of Claude, requiring government-issued ID and, in some cases, a live selfie through a third-party provider. The move is notable because it goes further than any major competing AI chatbot, and comes shortly after a surge in users who joined Claude due to concerns about privacy and surveillance policies at rival companies. The company says verification will only be triggered in specific situations, such as suspected abuse, fraud prevention, or access to certain capabilities, rather than being a universal requirement.
The verification system uses Persona as a third-party identity provider, meaning documents and biometric data are handled outside Anthropic’s own systems and are not used for model training. Accepted forms of ID include passports and national identity cards, and the process may involve live image checks to confirm authenticity. While Anthropic frames this as a limited safety and compliance measure, the policy has sparked concern among users who worry about increased data exposure, especially given past incidents involving breaches of identity verification databases and broader questions about how securely such sensitive information can be stored and managed. Source
Starbucks has launched a beta app inside ChatGPT that uses artificial intelligence to recommend drinks based on how users describe their mood or even from uploaded images. The integration allows customers to explore menu items, customise drinks, and select pickup locations directly within the chat interface, although final ordering still has to be completed through the Starbucks app or website. The idea is to shift the starting point of ordering away from fixed menus and towards more personalised, conversational prompts that reflect how customers are feeling in the moment.
The move is part of a wider trend of major brands embedding AI into shopping and discovery experiences, alongside companies like Walmart, Target, Etsy and Booking.com, which are experimenting with similar ChatGPT-based tools for retail and travel. Starbucks is also using AI internally, including tools for staff support and operational efficiency, as it tries to improve service and recover from a period of declining sales. While early financial results show some improvement in US transactions, the company still faces operational challenges such as meeting peak-time service targets, making AI-driven customer engagement a key part of its strategy. Source
Anthropic is preparing to release Claude Opus 4.7 alongside a new AI-powered design tool aimed at building websites, presentations and landing pages using plain English prompts. The tool is expected to appeal to both developers and non-technical users, and could be released as soon as this week, with reports suggesting it has already unsettled parts of the design software market. Alongside this, Opus 4.7 is described as part of Anthropic’s broader Claude model family, which is increasingly being positioned for enterprise and developer-focused use cases rather than purely consumer-facing applications.
More significantly, Anthropic’s most advanced system is said to be Claude Mythos, a restricted cybersecurity-focused model that is not being released publicly and is instead being shared only with selected security firms. Evaluations suggest it can carry out highly complex autonomous cyber attack simulations at a level exceeding other known models, highlighting rapid capability growth but also raising concerns about safety and measurability. However, comparing model progress remains difficult due to unreliable benchmarks, making it unclear how much of an improvement Opus 4.7 represents. Overall, Anthropic appears to be shifting towards a full-stack AI platform approach where models are used to build, deploy and manage complete digital products rather than simply generate text. Source
Frontier AI models were tested across a full 2023–24 English Premier League season to see whether they could profit from sports betting using a structured framework called KellyBench, which applies the Kelly criterion for optimal stake sizing. Eight leading models, including Claude, GPT-5.4, Gemini and Grok, were each given a virtual bankroll and tasked with building and executing betting strategies over time. Despite being able to explain the correct mathematical principles, every model lost money, with several going completely bankrupt and even the best-performing systems still finishing the season in deficit.
The results highlighted a persistent gap between knowing a strategy and reliably executing it in a dynamic environment. Some models identified errors in their assumptions or even wrote correct staking functions, but failed to implement fixes or consistently apply their own logic during live decision-making. A simpler statistical baseline from the 1990s, the Dixon-Coles model, outperformed most of the frontier models, reinforcing how difficult it is for current AI systems to operate in long-horizon, real-world scenarios with shifting conditions. Researchers concluded that the main issue is not lack of knowledge but a breakdown in translating reasoning into sustained action, with even sophisticated systems unable to avoid compounding execution errors over time. Source
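For readers unfamiliar with the Kelly criterion at the heart of the benchmark, here is a minimal sketch of the standard formula for optimal stake sizing. This is the textbook calculation, not the KellyBench framework itself, and the function name and example odds are illustrative:

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Optimal stake as a fraction of bankroll under the Kelly criterion.

    p_win:        the bettor's estimated probability of the outcome
    decimal_odds: bookmaker decimal odds (total payout per unit staked)
    """
    b = decimal_odds - 1.0           # net winnings per unit staked
    q = 1.0 - p_win                  # probability of losing
    f = (b * p_win - q) / b          # classic Kelly formula: f* = (bp - q) / b
    return max(f, 0.0)               # never stake when the edge is negative

# A 60% chance at decimal odds of 2.0 gives f* = (1*0.6 - 0.4)/1,
# i.e. a stake of roughly 20% of the bankroll.
print(kelly_fraction(0.6, 2.0))
```

The point the study makes is that every model could state this formula correctly; the losses came from failing to apply it consistently across a full season of shifting odds and noisy probability estimates.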
Markethive presents its “HIVE Intelligence” as a distinct approach from conventional artificial intelligence, positioning it as a system built around proactive innovation, user ownership of data, and a combination of algorithmic processing with human input. Rather than relying on centralised data exploitation, the platform claims to prioritise privacy and autonomy, with its AI Assistant designed to support entrepreneurs by analysing activity across areas such as banners, blogs and networks. It generates insights intended to help users optimise engagement, content performance and overall digital strategy within the ecosystem.
The assistant is described as a multifunctional coaching tool that produces reports, suggests improvements, tracks performance metrics and can export data for further analysis, while integrating with broader platform features such as community chat systems and structured data access. Users can interact through different interfaces, including an AI Chat and a community “My Chat” split into global and country-based channels, with ongoing development aimed at deeper integration across platform modules. The article also notes that the platform’s development philosophy is framed by a stated spiritual principle centred on divine intelligence, which it presents as a guiding foundation for its technological direction. Source
Allbirds has announced a dramatic strategic shift away from footwear, revealing plans to rebrand as NewBird AI and reposition itself as a GPU-as-a-Service and AI infrastructure company. The company has also entered a $50 million convertible financing agreement with an institutional investor to fund the transition, while simultaneously selling its original footwear and brand assets to American Exchange Group for $39 million. The move effectively marks the end of Allbirds as a sustainable shoe brand and the beginning of its attempt to enter the AI compute market.
The announcement triggered a major market reaction, with shares surging more than 400% at peak trading as investors responded enthusiastically to the pivot into AI infrastructure. Under the proposed strategy, NewBird AI plans to acquire high-performance GPU hardware and lease computing capacity to customers needing AI processing power, positioning itself as an alternative provider in a constrained compute market. However, the move draws comparisons to past speculative corporate pivots that initially boosted valuations but later collapsed, raising questions about whether the surge reflects genuine long-term value or short-term hype. Source
A new open-source model family called Gemopus has been released by developer Jackrong, building on Google’s Gemma 4 foundation and aiming to replicate aspects of Anthropic’s Claude Opus-style reasoning in a lightweight, locally runnable format. The project follows earlier work called Qwopus, which attempted a similar distillation approach using Alibaba’s Qwen models, but Gemopus shifts to Gemma 4 in response to concerns about using non-US base models. The result is a set of fine-tunes designed to deliver more structured, conversational, and “frontier-like” reasoning behaviour while remaining suitable for consumer hardware.
The Gemopus family includes a larger mixture-of-experts model and a smaller edge version intended to run on devices like laptops and even smartphones. Performance claims suggest strong results in instruction following, long-context handling, and reasoning benchmarks, with support for very large context windows when extended techniques are applied. However, the developer emphasises that the models are experimental rather than production-ready, with known issues such as unreliable tool calling in common local inference frameworks. Overall, Gemopus is positioned as a community exploration of how far open models can be pushed towards frontier-model behaviour while still remaining lightweight and locally accessible. Source
Japan’s leading industrial and tech firms, including SoftBank, Sony, Honda, and NEC, have formed a new joint company aimed at developing a trillion-parameter AI model focused not on conversational systems, but on controlling physical machines. The initiative is backed by approximately $6.7 billion in government support through Japan’s NEDO agency, with additional investment from major banks and steelmakers, reflecting a coordinated national push rather than a standalone private-sector project. The goal is to create what is being described as “Physical AI”, where models are embedded into robotics, autonomous vehicles, and industrial systems.
Unlike Western AI efforts that prioritise chatbots and digital assistants, this project is explicitly designed to integrate AI into real-world machinery and infrastructure, leveraging Japan’s established strengths in robotics and manufacturing. Key participants will divide responsibilities across development and deployment, with firms like Honda focusing on autonomous driving applications and Sony contributing hardware expertise. The plan also emphasises keeping Japanese data and compute infrastructure domestic, reducing reliance on foreign cloud providers. With hiring underway and government funding expected to scale over several years, the project signals a long-term industrial strategy aimed at making Japan a leader in embodied AI systems rather than language-based models. Source
MiniMax has released its new M2.7 AI model on Hugging Face, presenting it as a state-of-the-art open-weight system that rivals leading closed models on several coding and reasoning benchmarks. The model reportedly achieves strong results on software engineering and real-world task evaluations, placing it close to top-tier systems like Claude Opus 4.6 in performance. It is built as a large mixture-of-experts architecture, designed to deliver high capability while activating only a fraction of its total parameters during inference, making it more efficient to run than its full size might suggest.
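The sparse-activation idea behind mixture-of-experts architectures like this can be sketched in a few lines. This is a generic, illustrative toy (the function names, dimensions, and linear "experts" are all assumptions for the sketch), not MiniMax's actual routing code:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Sparse mixture-of-experts forward pass for a single token.

    x:       input vector, shape (d,)
    gate_w:  router weights, shape (num_experts, d)
    experts: list of callables, each mapping (d,) -> (d,)

    Only the top_k highest-scoring experts are evaluated, so most of the
    model's parameters stay inactive for any given token.
    """
    logits = gate_w @ x                         # router score per expert
    top = np.argsort(logits)[-top_k:]           # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                    # softmax over selected experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 4 linear "experts", but only 2 run per token.
rng = np.random.default_rng(0)
d, n_exp = 8, 4
experts = [(lambda W: (lambda v: W @ v))(rng.normal(size=(d, d)))
           for _ in range(n_exp)]
gate_w = rng.normal(size=(n_exp, d))
out = moe_forward(rng.normal(size=d), gate_w, experts)
print(out.shape)  # same dimensionality as the input
```

This is why a model's total parameter count overstates its inference cost: with 2 of 4 experts active here, roughly half the expert parameters are touched per token, and production MoE systems push that ratio far lower.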
Shortly after release, however, the licensing terms were quietly updated to restrict commercial use without written permission, creating confusion and criticism within the developer community. While non-commercial and research use remains free, commercial deployment now requires authorisation, despite the model being described in “MIT-style” terms. MiniMax defended the change by arguing it was necessary to prevent degraded or misconfigured third-party hosting that could harm the model’s reputation. The update reflects a broader shift in parts of the open-source AI ecosystem, where companies are increasingly tightening control over how widely their models can be used even after public release. Source
The UK’s AI Safety Institute has assessed Anthropic’s Claude Mythos Preview and found that it may represent a major cybersecurity capability, with the model able to autonomously carry out complex cyber attacks in controlled testing. Originally revealed through a leak in late March and later confirmed by Anthropic, the system is described as being capable of identifying and exploiting vulnerabilities in software at a level not previously seen in other AI models, including issues in operating systems and web browsers.
In structured evaluations, Claude Mythos Preview was able to complete a 32-step corporate network attack simulation known as “The Last Ones”, achieving partial or full success in several attempts without human assistance, and outperforming previous models such as Claude Opus 4.6 by completing more stages on average. It also achieved a 73% success rate on advanced capture-the-flag cybersecurity tasks and demonstrated scaling performance with increased compute resources. Despite these results, Anthropic has not released the model publicly, instead limiting access and warning stakeholders such as banking executives about the potential security risks posed by its capabilities. Source
Meta is reportedly developing a photorealistic AI-powered digital clone of Mark Zuckerberg, designed to act as a conversational stand-in for the CEO during interactions with employees, according to reporting from the Financial Times. The aim is to create a scalable form of leadership presence that is always available, allowing staff to interact with an AI version of Zuckerberg that can speak and respond in his style. This marks a significant shift from the earlier Horizon Worlds avatar, which became widely mocked online and was associated with Meta’s struggles to gain traction in the metaverse space.
The AI clone is being trained on Zuckerberg’s mannerisms, voice patterns, public statements and strategic thinking, with Zuckerberg himself reportedly involved in testing the system. The project sits within Meta’s Superintelligence Labs and reflects a broader company push into advanced AI tools, alongside heavy investment in infrastructure and acquisitions such as voice technology firms PlayAI and WaveForms. Meta’s projected capital expenditure for 2026 is estimated at between $115 billion and $135 billion, highlighting the scale of its AI ambitions, while internal initiatives include tools like OpenClaw and new employee training exercises focused on building AI agents. The shift contrasts sharply with the costly metaverse period, suggesting a move towards AI-driven internal communication and control as the company’s next major focus. Source
A new initiative from Matterhorn and the Artificial Superintelligence Alliance aims to reduce the risks associated with AI-generated blockchain code by introducing additional safeguards for so-called “vibe coding”, where developers describe an application in natural language and AI generates smart contract code automatically. The concern driving the project is that while this approach speeds up development and lowers technical barriers, it can also produce insecure or flawed code that may be exploited once deployed on a blockchain.
To address this, the platform combines automated AI analysis, human security audits and testing tools to review smart contracts before they go live, and is being built on ASI:Chain, a blockchain ecosystem linked to organisations including Fetch.ai, SingularityNET and CUDOS. Developers will be able to access third-party auditors and AI-driven review agents through the system, although the creators stress that it does not provide absolute guarantees of security. The initiative also includes “blessed templates” and formal verification approaches intended to make contract design more reliable, as well as plans to onboard 20,000 developers in 2026 while expanding infrastructure for AI-driven blockchain applications. Source
A new multi-university study involving economists, AI experts and superforecasters suggests that the long-standing belief that technology mainly augments rather than replaces jobs is under strain. Across all groups surveyed, there is agreement that faster AI progress is likely to reduce overall employment, with researchers from institutions including the Federal Reserve Bank of Chicago, Yale, Stanford and the University of Pennsylvania examining different scenarios for how AI could reshape the US labour market.
In the most aggressive “rapid” scenario, where AI reaches human-level performance across many tasks by around 2030, economists project a decline in labour force participation from 62% to 54% by 2050, with around 10 million jobs potentially displaced directly by AI. At the same time, they forecast strong economic growth, with GDP potentially reaching post-war boom levels, alongside a sharp rise in inequality. While current employment data does not yet show mass unemployment, younger workers in AI-exposed roles are already seeing noticeable declines, and debate is now shifting towards whether AI will eventually eliminate not just jobs, but the creation of new ones altogether. Source
Disclaimer: These articles are provided for informational purposes only. They are not offered or intended to be used as legal, tax, investment, financial, or any other advice.
