Anthropic finds a compromise to release its controversial AI

Anthropic’s AI-based hacking tool, once considered too dangerous for the general public, will be released after all, albeit in a slightly modified form. Claude Fable 5 is the most advanced and powerful model Anthropic has ever offered to the public, but it will not hunt for zero-day vulnerabilities in software.

By now, almost everyone has probably seen at least one demonstration of AI that made them a little uncomfortable, not because it was done well, but because it was done frighteningly flawlessly. Whether AI is pushing artists out of their jobs or producing entirely new proofs for hard mathematical problems, we keep discovering areas where computer systems deliver results so impressive that their power starts to seem alarming.

Most recently, developers at Anthropic openly expressed concern about releasing the latest version of Claude Mythos, which proved too effective at finding weaknesses in program code. Now, however, Anthropic is taking concrete steps to make that release happen.

Automated software bug detection is nothing radically new: computer scientists have long used tools such as fuzzers, which bombard software with huge amounts of random input data in the hope of triggering a crash. AI, however, poses a much more serious threat, and as vulnerability scanners like Mythos have grown more sophisticated, their authors have felt justified in their concern about making their best work freely available.
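To make the fuzzing idea concrete, here is a minimal sketch in Python. It is a toy, not how any production fuzzer or Anthropic's tooling works: `parse_record` is a hypothetical target with a deliberately planted bug, and `fuzz` is an illustrative helper that simply throws random bytes at it and records the inputs that raise an exception.

```python
import random


def parse_record(data: bytes) -> int:
    """Hypothetical toy target: trusts a length byte without a bounds check."""
    if len(data) < 2 or data[0] != 0x7F:
        return 0  # not a record, ignore it
    length = data[1]
    payload = data[2:2 + length]
    # Planted bug: reads the last declared byte even if the payload is shorter.
    return payload[length - 1] if length else 0


def fuzz(target, iterations=10_000, max_len=8, seed=0):
    """Feed random byte strings to `target` and collect inputs that raise."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception as exc:  # a real fuzzer would watch for process crashes
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_record)
    print(f"{len(found)} crashing inputs found")
    if found:
        print("example:", found[0])
```

Blind random input only stumbles on shallow bugs like this one. Real fuzzers add coverage feedback and input mutation, and the concern described in the article is that AI models can go further still by reasoning about the code itself.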

Instead of shelving its achievement indefinitely, Anthropic is now presenting a compromise intended to let the Mythos capabilities be deployed safely and responsibly. The essence of the compromise is to split the work into two separate models, Claude Fable 5 and Claude Mythos 5.

Fable 5 is aimed at a wide audience and is a full-featured tool for a wide range of analytical tasks. Its scope is not limited to finding security defects: Fable 5 can provide significant help with writing code, has highly effective visual analysis capabilities, and can independently form long-term plans. According to Anthropic, Fable 5 is significantly more capable than any other model the company has released to the public.

However, Anthropic is releasing it with certain limitations. Fable 5 is configured to resist attempts to use it to discover new zero-day vulnerabilities that attackers could exploit to wreak havoc on computer networks worldwide. When a user tries to push Fable 5 beyond the permitted limits, the model automatically redirects the request and falls back to Claude Opus 4.8.

Then there is Mythos 5, which is identical to Fable 5 in its internal architecture but lacks most of the safeguards described above. Anthropic plans to distribute it on a strict, selective basis, granting access only to vetted, trusted members of the cybersecurity research community. Defenders need tools like this to find and fix bugs before attackers do, and the scheme is designed to minimize the chance that Mythos 5 is used for malicious purposes.

To enforce the limits on Fable 5, Anthropic has built a set of classifiers whose job is to recognize users’ intentions as they interact with the model. These classifiers are designed not only to block hacking activity. They are also meant to prevent the model from being used to design methods for synthesizing dangerous chemical or biological agents, and to stop anyone from extracting enough internal data to build their own version of Fable 5 without the integrated safeguards.

The hope is that these measures will be enough to keep this Mythos-based public release safe, since a huge number of people will clearly do their best to circumvent the restrictions and unlock the system’s full capabilities.

Source: Android Authority