Working Group on reproducibility on large language models (LLMs) and
non-deterministic algorithms



About this Working Group

This is an international open working group (WG) gathering computer science researchers interested in the problem of reproducibility of results produced by large language models (LLMs) and, more generally, non-deterministic algorithms. The topic is rather specific and technical. Our goal is to uncover relevant knowledge in the field and communicate it to the scientific community through conference publications, specialized journals, and technical reports, as well as to the general public.

LLMs have attracted great interest since generative language bots such as ChatGPT, and more recently its competitors, emerged. Understanding their underlying mechanisms in detail and achieving reproducibility with these models remains an open and challenging problem of major scientific interest. In contrast to more classic, simple, and deterministic algorithms, where reproducibility is relatively easy to achieve, LLMs pose several challenges, such as the availability and size of the training data and the computational resources required to re-train the models. Some of the latest LLM products have released both their source code and training data, making analysis and research on them feasible.

More generally, we will also address the problem of non-deterministic algorithms. Many machine-learning algorithms are non-deterministic by nature. It is nevertheless possible to assess their reproducibility: once the model is well understood, it should provide equivalent inferred results. Note that we write equivalent, and not exactly (bit by bit) the same.
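The distinction between equivalent and bit-by-bit identical results can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a method adopted by the WG: the function names and the relative tolerance of 1e-9 are hypothetical choices made for this example. It sums the same numbers in two different orders, as a parallel or non-deterministic computation might, and compares the results both exactly and within a tolerance.

```python
import math


def sum_forward(values):
    """Accumulate the values from first to last."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_backward(values):
    """Accumulate the same values from last to first."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


def is_equivalent(a, b, rel_tol=1e-9):
    """Hypothetical equivalence criterion: equal within a relative tolerance.

    The tolerance is an assumption chosen for this example only.
    """
    return math.isclose(a, b, rel_tol=rel_tol)


values = [0.1, 0.2, 0.3]
forward = sum_forward(values)
backward = sum_backward(values)

# Floating-point addition is not associative, so the order of
# accumulation changes the last bits of the result.
print("bit-identical:", forward == backward)
print("equivalent:   ", is_equivalent(forward, backward))
```

Here the two sums differ in their last bits but agree within the tolerance. Which tolerance, or which notion of equivalence, is appropriate for a given model or algorithm is exactly the kind of question that has to be settled case by case.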

Is this group open? Can I join?

Yes, it is open to any researcher interested in the field. The Call for Contributors is open, and it will be renewed regularly until the WG concludes its activity.

Why is the group open? Many other WGs are closed

Closed groups are indeed easier to manage internally, but there may be doubts about the legitimacy of the recommendations they issue, which could be biased towards specific interests. This can undermine credibility, which we certainly need to avoid in a WG on reproducibility!

What will be the activities?

This working group is just starting, and the exact activities will be decided collectively. However, you can expect regular video conferences to discuss the topic together and presentations of research on the subject conducted by members of the WG. Research itself is a fundamental activity of the WG, as is communicating it at conferences and in journal articles.

How long will this WG exist?

The WG will run for at least two years, after which we will decide whether to renew it. Establishing a duration gives the WG a clear horizon for accomplishing its objectives.

Which language will be used in the meetings and reports?

Since this is an open and international WG, we will use English. This not only allows us to communicate with all members of the WG regardless of their location, but also allows our research and activities to be disseminated globally.

How can I join?

The Call for Participants is open. You can use this form to express your interest in joining.

Do you have a mailing list, forums, or any other way to share with the community?

As soon as the Call for Participants ends, we will set up the mailing list and forums for everybody. This website is just a temporary bootstrap to get the WG started.