AI Act deal, Mistral.ai and open models

Last week, and specifically last Friday, was a big day for AI in Europe. Let’s talk about the AI Act and Mistral.ai.

The AI Act on its way

The European Parliament and the European Council have reached a deal on the AI Act, which “aims to ensure that fundamental rights, democracy, the rule of law and environmental sustainability are protected from high risk AI, while boosting innovation and making Europe a leader in the field.”

From the European Parliament’s website, the AI Act provides:

It looks like Europe is trying to prevent a horribly dystopian future for us. I believe those are good points, and I agree with them entirely.

The part of the AI Act that interests me the most is the set of articles on foundation models (basically LLMs), and this is what it stipulates:

To account for the wide range of tasks AI systems can accomplish and the quick expansion of its capabilities, it was agreed that general-purpose AI (GPAI) systems, and the GPAI models they are based on, will have to adhere to transparency requirements as initially proposed by Parliament. These include drawing up technical documentation, complying with EU copyright law and disseminating detailed summaries about the content used for training. For high-impact GPAI models with systemic risk, Parliament negotiators managed to secure more stringent obligations. If these models meet certain criteria they will have to conduct model evaluations, assess and mitigate systemic risks, conduct adversarial testing, report to the Commission on serious incidents, ensure cybersecurity and report on their energy efficiency. MEPs also insisted that, until harmonised EU standards are published, GPAIs with systemic risk may rely on codes of practice to comply with the regulation.

The press release gives a little more information on the governance aspect:

...an AI Office within the Commission is set up tasked to oversee these most advanced AI models, contribute to fostering standards and testing practices, and enforce the common rules in all member states. A scientific panel of independent experts will advise the AI Office about GPAI models...

So… there is a lot to unpack here. First and foremost, it looks like models such as GPT-4 are supposed to be regulated the moment the AI Act comes into force. Second, they want to set up an AI Office, essentially an AI institution that will decide how those models will be overseen. While the word “regulation” is not used for it, it sure sounds like regulation to me.

Also, the way AI orgs like OpenAI and Anthropic have been operating up until today is completely misaligned with the AI Act: closed source, no technical documentation beyond a vague rumour of 1.78 trillion parameters and an MoE design for GPT-4, and complete secrecy about the training dataset (a common question that never gets answered: was the Books3 dataset used to train GPT-4?). The Act does, however, lean in the direction of the narrative Sam Altman and the Effective Altruism gang have been pushing: strong and restrictive regulation of foundation models, in the name of preventing an AI disaster. Now, I don’t think I would say the AI Act pushes for extreme regulation, but it does demand a form of transparency from those models. If AI is indeed the real deal, then an institution overseeing those models may be an acceptable solution; it really depends on the reach this institution is planning to have. As long as it does not restrict open-source work from being freely released, I guess it would be OK… however, this seems to be the point of contention.

A very interesting viewpoint was shared by the CEO of Mistral.ai on Twitter a few weeks ago:

A very good read. In short, Arthur Mensch’s position is that the EU AI Act should focus solely on product safety, not on regulating foundation models or systemic risk. He also makes some good points about systemic risk and its vague definition. Mistral.ai and its leaders take a hardcore position on open-sourcing models, and today we’re not quite sure how this fits with the EU AI Act. That’s a shame.

Let's talk about Mistral.ai

Mistral.ai started operating in June 2023 (only six months ago), and they made a lot of noise by raising over 100 million euros after presenting nothing more than a seven-page strategic memo, the biggest seed investment ever made in Europe. The document is definitely worth a read, as it lays out Mistral’s goals and sketches a technical roadmap for the company. The 100 million is mostly justified by the enormous compute cost of building an LLM, and the project’s credibility comes from the team behind Mistral: strong experts who worked on and led AI projects at Meta and Google DeepMind.

Here’s a summary of Mistral.ai’s development roadmap, taken from the memo:

Mistral’s first six months have been wildly successful, and frankly met every expectation set by the memo and more. They released their first teaser model, Mistral 7B, at the end of September: a very small model, but with performance nobody expected from something that size. The AI community absolutely loved it and very quickly started fine-tuning their own models using Mistral 7B as the base (e.g. Zephyr, Dolphin). In a very short time, it became a preferred base model for LLM fine-tuning.
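To give a sense of how accessible this kind of fine-tuning is, here is a minimal sketch using Hugging Face transformers and peft with LoRA adapters. It is an illustration only: the dataset and hyperparameters below are placeholders, not the recipes actually used by the Zephyr or Dolphin teams.

# Minimal LoRA fine-tune of Mistral 7B (illustrative placeholders throughout).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer ships no pad token
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# Train small low-rank adapters on the attention projections
# instead of updating all 7B base weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Any instruction dataset works here; this one is just an example.
data = load_dataset("databricks/databricks-dolly-15k", split="train")

def tokenize(row):
    text = f"{row['instruction']}\n{row['response']}"
    return tokenizer(text, truncation=True, max_length=512)

data = data.map(tokenize, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments("mistral-7b-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1),
    train_dataset=data,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

The LoRA trick is a big part of why hackers adopted Mistral 7B so quickly: since only the small adapter matrices are trained on top of the frozen base model, a fine-tune can run on a single high-end GPU rather than a cluster.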

The memo talks about raising 200M in Q3 2024. As of December 2023 (nine months early), this has already been secured, with 385 million raised at a 2Bn valuation. In that sense, Mistral has overachieved its objectives for 2023, after only six months of operations.

On December 8th 2023, last week, they released a new, larger model, a mixture-of-experts implementation called Mixtral 8x7B, that gets frighteningly close to the performance of GPT-3.5 for quite possibly an order of magnitude less compute cost at inference. The reception from the community has been phenomenal. It is still too early to tell what the impact will be, but we should expect fine-tunes from hackers that reach tremendous performance. How good it will get, we don’t know yet.
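The efficiency claim comes from sparse routing: each token is processed by only two of the model’s eight expert feed-forward networks, so only a fraction of the weights are active per token (Mistral reports roughly 13B active parameters out of about 47B in total). Below is a toy sketch of top-2 expert routing in that spirit, with shrunken dimensions; it illustrates the idea, not Mistral’s actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Toy top-2 mixture-of-experts feed-forward layer (illustration only)."""
    def __init__(self, dim=64, hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts))

    def forward(self, x):  # x: (tokens, dim)
        # The router scores every expert, but each token only runs
        # through its top_k experts, with softmax-renormalised weights.
        weights, chosen = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out  # only 2 of the 8 expert FFNs ran per token

layer = SparseMoELayer()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])

Note that every expert must still sit in memory, so the savings are in compute per token rather than in RAM: that is why Mixtral can be cheap to run per query while still being a large model to host.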

Now, GPT-4 is still way ahead, and OpenAI has access to a vast amount of compute to work on GPT-5, so let’s see how the race goes. However, we could say that OpenAI is to Mistral what Microsoft/Windows is to Red Hat/Linux: today the whole web runs on open-source software, and tomorrow all of AI could run on open models. An open-source, or more specifically open-weight, strategy makes complete sense, and to me and many hackers out there it is very appealing.

Mistral’s position on open-sourcing has been clear from the very beginning with their strategic memo, and we can hope that they have enough influence in France and Europe to push for a meaningful AI agenda. It looks like Cédric O, one of the founders, who used to be Secretary of State for digital affairs under Macron, the current French president, is having some substantial influence.

Open-weight vs closed-weight

Arthur Mensch's tweet I mentioned earlier had an interesting reply:

I strongly recommend Jeremy Howard’s essay AI Safety and the Age of Dislightenment. It critically examines proposals for strict AI model licensing and surveillance, arguing that they may be ineffective or even harmful, potentially centralising power and undermining progress in society. It echoes the Mistral strategic memo on open models at various points, but goes into much more depth.

Unfortunately, the AI Act does not really reflect the advice from the essay, and it’s a shame that we’re already regulating LLMs with this deal in place. It looks like Sam Altman and OpenAI’s narrative of strong regulation has shaped the AI Act in some ways. It’s no secret that OpenAI has been lobbying hard for this in both the US and Europe.

I’m a little mad about this.

However, I’m very hopeful for the future. Open models such as Mistral’s are popping up every day, and their performance is getting closer and closer to that of proprietary models. The AI community is very excited about this too, simply happy to be able to play with the models and apply their own fine-tunes. It is truly an interesting time for AI.