
Chinese AI firm DeepSeek releases new model after R1 rattled OpenAI

Chinese artificial intelligence developer DeepSeek has released a new open-weights large language model (LLM).

DeepSeek uploaded its newest model, Prover V2, to the hosting service Hugging Face on April 30. The model, released under the permissive open-source MIT license, aims to tackle the verification of mathematical proofs.

The DeepSeek-Prover-V2 Hugging Face repository. Source: Hugging Face

Prover V2 has 671 billion parameters, making it far larger than its predecessors, Prover V1 and Prover V1.5, which were released in August 2024. The paper accompanying the first version explained that the model was trained to translate math competition problems into formal logic using the Lean 4 programming language, a tool widely used for proving theorems.
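For a sense of what that formal logic looks like, here is a deliberately trivial Lean 4 statement and proof. It is purely illustrative and not drawn from DeepSeek's training data; real competition problems are far more involved, but they are encoded in the same language.

```lean
-- A trivial Lean 4 theorem, shown only to illustrate the kind of formal
-- statement and proof a prover model must produce and verify.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```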

The developers say Prover V2 compresses mathematical knowledge into a format that allows it to generate and verify proofs, potentially benefiting research and education.

Related: Here's why DeepSeek crashed Bitcoin and crypto

What does it all mean?

The model, also informally (and somewhat imprecisely) referred to as the “weights” in the AI space, is the file or collection of files that allows a user to run an AI locally without relying on external servers. It should be noted, however, that cutting-edge LLMs require hardware that most people do not have access to.

This is because large models tend to have an enormous number of parameters, which results in large files that demand a lot of RAM or VRAM (GPU memory) and processing power to run. The new Prover V2 model weighs in at roughly 650 gigabytes and is expected to run from RAM or VRAM.

To get it down to this size, Prover V2's weights have been quantized to 8-bit floating-point precision, meaning that each parameter has been rounded to take up half the space of the usual 16 bits, a bit being a single digit in binary numbers. This effectively halves the model's bulk.
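A back-of-the-envelope calculation (an illustration, not an official figure) shows how parameter count and numeric precision translate into file size:

```python
# Rough estimate of model file size from parameter count and precision.
# Illustrative arithmetic only; real checkpoints also carry some metadata.
params = 671e9                 # Prover V2's reported parameter count

fp16_bytes = params * 2        # 16-bit floats: 2 bytes per parameter
fp8_bytes = params * 1         # 8-bit floats: 1 byte per parameter

print(f"FP16: ~{fp16_bytes / 1e9:,.0f} GB")   # about 1,342 GB
print(f"FP8:  ~{fp8_bytes / 1e9:,.0f} GB")    # about 671 GB, near the ~650 GB reported
```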

Prover V1 was based on the seven-billion-parameter DeepSeekMath model and was fine-tuned on synthetic data. Synthetic data refers to training data that was itself generated by AI models, with human-generated data usually regarded as an increasingly scarce source of high-quality material.

Prover V1.5 reportedly improved on the previous version by optimizing both training and execution and achieving higher accuracy on benchmarks. So far, the improvements introduced by Prover V2 are unclear, as no research paper or other information had been published at the time of writing.

The number of parameters in the Prover V2 weights suggests that it is likely based on the company's previous R1 model. When it was first released, R1 made waves in the AI space with performance comparable to OpenAI's then state-of-the-art o1 model.

Related: South Korea suspends DeepSeek downloads over user data fears

The importance of open weights

Publicly releasing the weights of LLMs is a contentious topic. On one hand, it is a democratizing force that lets the public access AI on its own terms, without relying on a private company's infrastructure.

On the other hand, it means the company cannot step in and prevent abuse of the model by enforcing limits on dangerous user prompts. The release of R1 in this manner raised security concerns, and some described it as China's “Sputnik moment.”

Open-source proponents rejoiced that DeepSeek picked up where Meta left off with its release of the Llama series of open-source AI models, proving that open AI is a serious contender to OpenAI's closed models. The accessibility of these models also keeps improving.

Accessible language models

Now, even users without access to a supercomputer that costs more than the average home in much of the world can run LLMs locally. This is thanks primarily to two AI development techniques: model distillation and quantization.

Distillation refers to training a compact “student” network to replicate the behavior of a larger “teacher” model, retaining most of the performance while cutting the parameter count to make it usable on less powerful hardware. Quantization consists of reducing the numeric precision of a model's weights and activations to shrink its size and speed up inference, at the cost of only a small loss of accuracy.

An example of this is Prover V2's reduction from 16-bit to 8-bit floating-point numbers, though further reductions are possible by halving the bits again. Each of these techniques has consequences for model performance, but they usually leave the model largely functional.
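A minimal sketch of what quantization does, using NumPy and a reduction from 32-bit to 16-bit floats for illustration (NumPy has no native 8-bit float type), shows the memory saving and the small rounding error involved:

```python
import numpy as np

# Toy "weights" stored at full 32-bit precision.
weights_fp32 = np.random.randn(1_000_000).astype(np.float32)

# Quantize by casting to a lower-precision float type.
weights_fp16 = weights_fp32.astype(np.float16)

print("fp32 size:", weights_fp32.nbytes / 1e6, "MB")   # 4.0 MB
print("fp16 size:", weights_fp16.nbytes / 1e6, "MB")   # 2.0 MB

# The rounding error introduced by the cast is typically small.
error = np.abs(weights_fp32 - weights_fp16.astype(np.float32)).mean()
print("mean absolute rounding error:", error)
```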

DeepSeek's R1 has been distilled into versions built on retrained Llama and Qwen models, ranging from 70 billion parameters down to as little as 1.5 billion parameters. The smallest of these models can run reliably on some mobile devices.
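For illustration, here is a minimal PyTorch-style sketch of the core distillation idea. The tiny teacher and student networks are stand-ins invented for this example, since real distillation runs use full transformer LLMs, but the loss is the standard one: the student is trained to match the teacher's output distribution.

```python
import torch
import torch.nn.functional as F

# Stand-in networks: a larger "teacher" and a much smaller "student".
# Real distillation uses full LLMs; these are illustrative placeholders.
teacher = torch.nn.Sequential(
    torch.nn.Linear(128, 2048), torch.nn.ReLU(), torch.nn.Linear(2048, 1000)
)
student = torch.nn.Sequential(
    torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1000)
)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

x = torch.randn(32, 128)                  # a batch of dummy inputs

with torch.no_grad():
    teacher_logits = teacher(x)           # teacher predictions, no gradients

student_logits = student(x)

# KL divergence between the student's and teacher's output distributions:
# minimizing it pushes the smaller student to mimic the teacher's behavior.
loss = F.kl_div(
    F.log_softmax(student_logits, dim=-1),
    F.softmax(teacher_logits, dim=-1),
    reduction="batchmean",
)
loss.backward()
optimizer.step()
```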

Magazine: “Chernobyl” needed to wake people up to the dangers of AI