Darío Gil, IBM Research: “This year there will be the first artificial intelligence model in Spanish with use cases” | Technology

Darío Gil, vice president of IBM and head of the company’s global research division.

The human brain is the most complex natural structure in the known universe, with 86 billion neurons that transmit 1,000 impulses per second. Imitating just a part of that extraordinary biological machinery so that it can learn, understand and respond in Spanish, a language spoken by more than 500 million people, is the monumental task commissioned by the Government. The Barcelona Supercomputing Center (BSC) has embarked on it with its MareNostrum supercomputer, together with the multinational IBM, which will make its full programming, research and global artificial intelligence infrastructure capabilities available to the project.

Darío Gil, a Murcian raised in Madrid who turns 49 in 2024, has been the main technological champion of this venture as head of IBM’s global research division. He gives this interview shortly after signing the agreement that will give rise to the first major artificial intelligence language model born in Spanish. Convinced that this advance is exponential and solid (“a revolution”, he proclaims), he is committed to an open and collaborative architecture with room for everything, from the minutes of parliamentary sessions to the Royal Spanish Academy or any Spanish-speaking university or group. He estimates that a first version can be shown this year.

Q. Why is IBM betting on artificial intelligence in Spanish?

A. Our view of artificial intelligence aligns with the Government’s strategy. We firmly believe that the future of artificial intelligence will be created and defined through an open ecosystem, which sets us apart from other companies. The same thing happened with operating systems more than 20 years ago, and it worked out well. It was a determined bet that the future of operating systems, both in supercomputing and in all the distributed systems on the Internet, would be based on open source. We have reached the same conviction here, and the community wants to take part in creating artificial intelligence.

Q. What will participation be like?

A. We are going to create collaborative environments to develop open source foundational models, with transparency about the data used for pre-training and about the methodology. It is very important to increase the models’ capabilities incrementally, day by day. Existing foundational models require six to nine months of pre-training and release new versions once a year. If we have a base model for the entire developer community, for everyone who wants to add knowledge or capabilities, we will work together to create the best open foundational models in Spanish and the co-official languages.

Q. How is it going to be developed?

A. From a computing point of view, we will use the existing capabilities at the BSC, with its MareNostrum. There is a commitment from the minister (for the Digital Transition, José Luis Escrivá) to continue investing in order to accelerate the transition from MareNostrum 5 to 6. For our part, we are contributing supercomputing centers dedicated to artificial intelligence and the latest advances from IBM Research.

Q. What are the priority sectors targeted by the model?

A. All of them, but the Government is keen to ensure that the benefits reach two sectors in particular: small and medium-sized companies, where there is greater reluctance or complexity when adopting this type of innovation, and the State Administration. We have defined use cases to spread the adoption of artificial intelligence in these two sectors.

Q. How much does it cost?

A. There is no concrete answer. But, as an approximation, I can say that creating a high-performance foundational model requires thousands of processing units, each costing at least $35,000 (32,300 euros). These are very ambitious projects; this is not two people talking one afternoon and making a PowerPoint. At IBM Research we have 3,600 scientists and engineers taking part, as well as our own supercomputing teams dedicated exclusively to creating foundational models. And, since it is an open system, the community of Spanish-speaking developers who want to participate must be added to that. These efforts run in parallel.

Q. Where will the data come from to feed the model?

A. It will be public data, but there is also a desire to use documents that are the property of the State, which is a very unusual and interesting aspect. The transcripts of all parliamentary debates are one example. The collaboration of the national libraries and the Royal Spanish Academy is also planned. It will all be data that is public.

Q. And how is the diversity among Spanish speakers handled?

A. From a mathematical point of view, a great diversity of languages can be incorporated. That diversity lives within the same neural network and then, during fine-tuning (adjustment or refinement), the model adapts to the different varieties of Spanish, even though it was trained on the base documents. You can ask the model for answers consistent with the Argentine experience and it will behave that way; it will learn from the context in which the interaction takes place.

Q. But “make an appointment” in Spanish may not have the same meaning in some Latin American countries, to give an example for an administrative use case.

A. The base model will grow, specialize and gain skills through everyone’s efforts so that it understands specific contexts. The open source model allows for this enormous diversity. And the goal is to expand it as much as possible, including to Brazil. Ibero-America is a huge market of opportunities, and it is important to make the most of the competitive advantage of Spanish.

Q. When will the first model be available?

A. The goal is to deliver something this year and, in parallel, to develop some use cases around the same time. In this world (of artificial intelligence), which is very dynamic, no one is interested in deadlines measured in years.

Q. And what does IBM gain?

A. I will give the example of Red Hat (a multinational open source software company owned by IBM). It bills billions of dollars a year and is the largest open source software company in the world. The model is to provide the software to companies and governments, which will then want maintenance or security compatible with their equipment. We are used to that business model. We do it not because we are altruistic, but because we believe in that model. We want strategic partners and we have found a great deal of common ground with the Government of Spain. In the artificial intelligence alliance that we have, there are more than 80 institutions that are part of this commitment.

Q. Is there an artificial intelligence bubble?

A. The technology itself is evolving at a speed I have never seen in anything else. We no longer plan for a year or two ahead, but for a month, weeks or days ahead. I don’t see a bubble. If there is one, it is a bubble of catastrophists, but the technological base is powerful and solid.
