Making the spectrum of ‘openness’ in AI more visible

A (very) recent history of openness in AI

Google released demos of Gemini last week with much fanfare, but offered no way to test it except through a supposed integration with Bard.

Mistral AI tweeted a Magnet link to one of its models. No fanfare. No press. Anyone with decent LLM skills could download, use, and even fine-tune the model. For open-source enthusiasts, it was a much better release than Gemini. This kind of access to the pretrained parameters of a neural network is called open weights. It enables users to run the model for inference and to fine-tune it.

Open weights are better than just a demo or access to a product like ChatGPT or an API, no doubt. The example of Mistral is a case in point: what seems to be open source might not be open source, or not fully open source. A post from The Register discusses in detail how Meta’s Llama 2 isn’t exactly open source despite the claims.

Other models are more open. BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) provides fully accessible source code and uses responsibly sourced training data, with support for diverse languages and cultures.

My main argument is that whenever an AI model is released for public consumption, where the model falls on the spectrum of openness should be clearly expressed and understood, without forcing users to dig that information out of a tome of license agreements. AI, as a community of practice, should engage more in making that happen.

Spectrum of openness in AI

To make the idea of the spectrum of openness easier to understand, let’s take the example of openness in software. Openness, or a digital artifact being “open”, is often thought of as binary: something is either open or closed. A straightforward example is that Linux is open while Windows is not. OpenStreetMap is open while Google Maps is not.

Openness is not exactly binary; it’s a spectrum. It’s easier to understand with the example of open-source software, as the history of free/open/libre software movements paves the way for discussions of openness in other artifacts such as data, research, science, etc. Software can be open source but still vary in the level of “freedom” it provides its users.

Here’s what a spectrum of freedom in open source software might look like:

  • Freedom to modify source code and redistribute
  • Freedom to modify source code, but not to redistribute
  • Freedom to modify source code of core components, but additional features are proprietary
  • Freedom to view source code, but not to modify

This is only for software that’s considered open source. Some freemium software is free to use, but its source code is not available, and it is sometimes mistaken for open source. This kind of freedom is only one dimension in which we can discuss the openness of software. There are other dimensions to consider, for example: community engagement and governance, language support, documentation, interoperability, commercial engagement, and more.

Extrapolating the same concepts to openness in AI, even for an open weights model, the following (at the very least) are most likely closed:

  • Training dataset (with all potential bias and ethical issues, including legal compliance and copyright issues)
  • Ethical guidelines and safety measures behind the creation of the model
  • Training code, methodology, hyperparameters, optimization techniques, post-training
  • Complete model architecture
  • Documentation
  • Objective evaluation following the norms of open, reproducible science
  • Organizational collaboration, governance
  • Finance, GPU, labor, and other resources necessary

Why is openness to all this information important?

Mainly because we should be able to trust AI before using it, just as we need to trust any product before we use it. Some instances of what trustworthy AI might look like:

  • Model architecture can be studied to make further developments. For example, the publication of the “Attention Is All You Need” paper with details on the attention mechanism enabled many of the recent developments in Large Language Models.
  • An AI auditor can look at the training datasets and methodology to identify potential legal and ethical issues.
  • A startup developing an LLM-based app for their customers can understand potential security issues with the app and address those to save their customers from harm.
  • A lot of social bias and potential harm to underprivileged communities can be scrutinized so that it can be avoided or substantially mitigated.

However, as with all discussions of openness, the benefits of some level of privacy must be acknowledged. Information that might affect the privacy or security of stakeholders, including trademark and copyright concerns, should remain private. Ultimately, it’s about finding the right trade-off to maximize social utility.

What next?

Now that we understand the value of openness and its visibility in AI, here are some actions the community can take.

We should develop a framework to define openness in AI.

The framework should cover all the information about a model that its users need to be aware of. Some efforts have already been made: Sunil Ramlochan makes the distinction between open source, open weights, and restricted weights and suggests a simple framework for openness in AI. We can consolidate similar efforts to develop a comprehensive framework for openness in AI.

We should encourage the practice of discussing openness of AI models/products, not just using them.

AI, as a community of practice, has enabled discussions on fine-tuning models and building products on top of them, pushing the limits of diffusing AI to the masses. In addition to this, we should also discuss openness. Openness is not only an idealistic concept for academic discussions, but also a property of the models that can enable or hinder innovation and usefulness.

AI creators/companies should make openness information more accessible during release.

Instead of burying limitations in license agreements, creators/companies can state, in accessible language, where their models lie on the spectrum of openness. This would help users understand the possibilities and limitations more easily, and reduce friction for creators in enforcing compliance with their terms.

We should develop a community-supported index to track and discuss openness of AI models/products.

Leaderboards have been very helpful in facilitating discussions of the performance of newly released models. Since openness is more qualitative than benchmark performance, an index can be designed that represents the openness of models in various dimensions, in quantitative or well-defined qualitative terms. Open data has a rich history of using indices to assess the current state of openness and pinpoint areas for improvement. Open Knowledge Foundation’s Open Data Index and Web Foundation’s Open Data Barometer can serve as good references for an AI model openness index. [I was involved in the Open Data Index and Open Data Barometer as a country reviewer for Nepal.] It could be hosted on a platform with good community support, for instance, HuggingFace.

Stanford University has recently launched the Foundation Model Transparency Index, which rated the openness of 10 large foundation models. That project can provide lessons for a more active, community-managed project in which the openness of models can be assessed and compared with others soon after release.
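As a rough illustration, an entry in such an index might be represented along the following lines in R (the dimensions, scoring scale, and notes here are hypothetical assumptions for the sketch, not an established scheme):

# Hypothetical openness record for a single model; dimension names,
# scoring scale, and notes are illustrative assumptions, not a standard
openness_record <- data.frame(
  dimension = c("training_data", "training_code", "model_architecture",
                "weights", "documentation", "evaluation", "governance"),
  # Assumed scale: 0 = closed, 1 = partially open, 2 = fully open
  score = c(0, 0, 1, 2, 1, 1, 0),
  notes = c("Dataset composition undisclosed",
            "No training scripts released",
            "Architecture described in a technical report",
            "Weights downloadable under a custom license",
            "Model card available",
            "Benchmark results self-reported only",
            "No community governance process")
)

# A normalized aggregate score (0 to 1) could feed a leaderboard-style view
mean(openness_record$score) / 2

An aggregate of such per-dimension scores could then be compared across models, while the notes preserve the qualitative nuance.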

We should increase community engagement in developing licenses for AI models.

Similar to how Creative Commons has made licensing content (text, images, etc.) easier, we need a variety of licenses that suit AI models, developed with substantial community engagement. A notable initiative is the OpenRAIL project, which has made a great start but still feels niche. The conversation about licensing needs to be more mainstream, and for that we need greater community engagement. As someone involved with open data, open source software, and OpenStreetMap communities for over a decade, I have seen that vibrant community support is required to make open projects more widely accessible.

Summing up

Open access to AI research, openly available neural network architectures, open weights, and support for open source in various forms, even from large tech companies, have gotten us this far in making powerful AI more accessible. Openness in provenance information and source, and the freedom this enables, will help make the future of AI more trustworthy.

Embedding a Shiny App in WordPress

I mostly code in R and Python for my data science/machine learning projects and use WordPress for my portfolio blog. To communicate my experiments as interactive visualizations, I can publish them either as Shiny apps or as Quarto websites.

I wanted to test if I could embed a Shiny app in WordPress. That would let me write the data analysis and interactive visualization code in R and publish it to my WordPress-based personal website.

The solution was to embed a Shiny app as an “iframe” in a WordPress blog.

An iframe (short for inline frame) is an HTML element that allows us to embed another HTML document within the current document. It provides a way to include external content from another source or website in your web page. The content within the iframe is displayed as a separate, independent document within the parent page.
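For context, the app being embedded can be quite small. Here is a minimal sketch of a basic Shiny app in R (illustrative only; the actual code behind basic_shiny may differ):

library(shiny)

# UI: a slider controlling the sample size and a plot output
ui <- fluidPage(
  titlePanel("Basic Shiny app"),
  sliderInput("n", "Number of observations:", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# Server: redraw the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random normal sample", xlab = "Value")
  })
}

shinyApp(ui = ui, server = server)

Deploying an app like this with rsconnect::deployApp() yields a public shinyapps.io URL, which is what the iframe below points to.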

I published the example Shiny app at https://kshitizkhanal7.shinyapps.io/basic_shiny/. Then I used the following HTML code in this WordPress post to embed the app here.

<iframe src="https://kshitizkhanal7.shinyapps.io/basic_shiny/" width="150%" height="650"></iframe>

Let’s break it down:

  • <iframe>: This is the opening tag of the iframe element.
  • src="https://kshitizkhanal7.shinyapps.io/basic_shiny/": The src attribute specifies the URL of the external web page you want to display within the iframe. In this case, it is set to "https://kshitizkhanal7.shinyapps.io/basic_shiny/".
  • width="150%": The width attribute determines the width of the iframe. In this example, it is set to "150%", indicating that the iframe will be 150% of the width of its container. This allows the iframe to expand beyond the normal width of the container if needed.
  • height="650": The height attribute specifies the height of the iframe in pixels. In this case, it is set to "650" pixels.
  • </iframe>: This is the closing tag of the iframe element.

The resulting embedded app follows.

I plan to use this and explore other tools to create scrolly data stories in WordPress. Follow this space for more.

I am on Twitter @kshitizkhanal7.