GPT-4’s ‘secret weapon’ is available as open source


After raising $415 million in its second funding round in a year, which took its valuation to approximately $2 billion, Paris-based startup Mistral AI released its latest large language model with almost no fanfare. The model's architecture appears to be similar to that of GPT-4.

The company simply posted a download link on X.

Mistral AI can hardly be blamed, as Google's Gemini had captured almost all of the tech world's attention.

One X user commented that companies usually publish a blog post on their website first and then announce the product on social media. Mistral AI did the opposite; the blog post came three days later.

An open-source alternative to GPT?

A step up from its previous model, Mistral-7B-v0.1, Mixtral-8x7B aims to improve how machines comprehend and generate text. To picture how it works, imagine a team of specialists, each with expertise in a distinct area. GPT-4 is rumored to rely on a similar design.

The latest release is a high-quality sparse mixture-of-experts (SMoE) model licensed under Apache 2.0. According to the company, it outperforms Llama 2-70B on most benchmarks with 6x faster inference, making Mixtral the strongest open-source model.

Notably, it also surpasses GPT-3.5 on standard benchmarks. It handles a context of 32k tokens, supports multiple languages, performs strongly in code generation, and, as an instruction-following model, scores 8.3 on MT-Bench.

"On MT-Bench, it reaches a score of 8.30, making it the best open-source model, with a performance comparable to GPT3.5," said the company blog.

It can handle specialized tasks

Despite having about 45B total parameters, Mixtral uses only about 12B of them for any given token. It therefore processes input and generates output at roughly the speed and cost of a 12B model.
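The gap between total and per-token parameters can be worked through with simple arithmetic. The sketch below uses the article's rounded figures and two assumptions that the article does not state: that the "8x7B" name means eight experts, and that two experts are consulted per token. It is a back-of-envelope estimate, not Mistral's published breakdown.

```python
# Back-of-envelope split of Mixtral's parameters, using the article's rounded figures.
TOTAL_B = 45.0       # total parameters, in billions (from the article)
ACTIVE_B = 12.0      # parameters used per token, in billions (from the article)
NUM_EXPERTS = 8      # assumption: implied by the "8x7B" name
ACTIVE_EXPERTS = 2   # assumption: experts consulted per token, not stated in the article


def split_parameters(total, active, num_experts, active_experts):
    """Solve the two equations
        total  = shared + num_experts    * per_expert
        active = shared + active_experts * per_expert
    for the shared and per-expert parameter counts."""
    per_expert = (total - active) / (num_experts - active_experts)
    shared = active - active_experts * per_expert
    return shared, per_expert


shared_b, per_expert_b = split_parameters(TOTAL_B, ACTIVE_B, NUM_EXPERTS, ACTIVE_EXPERTS)
active_fraction = ACTIVE_B / TOTAL_B

print(f"shared: ~{shared_b:.1f}B, per expert: ~{per_expert_b:.1f}B")
print(f"fraction of weights used per token: {active_fraction:.0%}")
```

Under these assumptions, the figures work out to roughly 1B shared parameters and 5.5B per expert, with about 27% of the weights used for each token. Because some layers are shared rather than duplicated, the total also lands well below 8 × 7B = 56B.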

Mixtral was pre-trained on data extracted from the open web, with the experts and routers trained simultaneously.
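To make the roles of experts and routers concrete, here is a toy sketch of sparse routing in plain Python. It is an illustration only, not Mistral's implementation: the number of experts, the choice of two experts per token, the tiny hidden size, and every function name (`make_expert`, `route`) are assumptions made for this example, and real experts are feed-forward blocks rather than single linear maps.

```python
import math
import random

NUM_EXPERTS = 8  # assumption: implied by the "8x7B" name
TOP_K = 2        # assumption: experts consulted per token
DIM = 4          # toy hidden size, chosen only for illustration


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng):
    """Build a toy 'expert': a single DIM x DIM linear map with random weights."""
    weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    return lambda x: [sum(w * v for w, v in zip(row, x)) for row in weights]


def route(token, router_weights, experts, top_k=TOP_K):
    """Score every expert, run only the top_k, and mix their outputs by gate weight."""
    # One router score per expert: a dot product between the token and that expert's router row.
    scores = [sum(w * v for w, v in zip(row, token)) for row in router_weights]
    # Keep the indices of the top_k highest-scoring experts; the rest are never executed.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalise over the chosen experts only, so the gate weights sum to 1.
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](token)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen, gates


rng = random.Random(0)  # fixed seed so the toy weights are reproducible
experts = [make_expert(rng) for _ in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen, gates = route(token, router_weights, experts)
print("experts used:", chosen, "gate weights:", [round(g, 3) for g in gates])
```

Only two of the eight toy experts run for the token, which is the source of the compute savings. Because the gate weights scale the expert outputs, a training setup of this shape can update the router and the experts from the same loss, which is one way to read the simultaneous training described above.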

Just months earlier, the company had closed its first funding round, raising $113 million at a valuation of $260 million.

Mistral AI is also committed to open-sourcing its technology, making its code freely available to copy, modify, and reuse. This approach contrasts with that of rivals such as OpenAI and Google, which have voiced concerns that open-source technology may pose risks, potentially enabling the spread of disinformation and other harmful content.

Originally published on Interesting Engineering.
