How to use AI for better image accessibility

Feb 29, 2024

Stylized illustration of a piece of cake with a cherry on top inside a web browser window, alongside HTML code snippets.

If you have completed a Quality Course Review with Missouri Online, you already know that course materials must be digitally accessible and that one of the criteria for accessibility is alternative text (often shortened to “alt text”) for images. The primary purpose of alt text is to describe the image to anyone who cannot see it. This is critical for visually impaired users and anyone who uses text-to-speech technology. It also benefits users who have disabled image loading or who have encountered a broken image.

Adding alt text to your images as you are creating documents takes only a few extra minutes, and it provides an opportunity to be intentional about what purpose each image serves in your instructional materials. However, it can be tedious to add alt text retroactively to many documents, and writing adequate alt text for complex images can be especially challenging.
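For the retroactive case, a short script can at least locate the images that still need attention. The following is a minimal sketch that uses only Python's standard library. It assumes your materials are available as HTML; the class name MissingAltFinder, the function name, and the sample markup are hypothetical. Images with an empty alt attribute are reported separately, because an empty value is valid for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that lack alt text (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src of images with no alt attribute at all
        self.empty = []    # src of images with alt="" (fine if decorative)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or "(no src)"
        if "alt" not in attributes:
            self.missing.append(src)
        elif not (attributes["alt"] or "").strip():
            self.empty.append(src)


def find_images_needing_alt(html_text):
    """Return (missing, empty) lists of image sources found in html_text."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    return finder.missing, finder.empty


# Made-up sample page for illustration only.
sample = """
<p>Week 1 overview</p>
<img src="campus.jpg">
<img src="divider.png" alt="">
<img src="map.png" alt="Map of Missouri highlighting major highways.">
"""
missing, empty = find_images_needing_alt(sample)
print("No alt attribute:", missing)
print("Empty alt (check whether decorative):", empty)
```

A list like this only tells you where to look; deciding what each image should say is still up to you.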

The good news is that recent advances in generative artificial intelligence (AI) and large language models (LLMs) such as ChatGPT, Bard, and Copilot have the potential to simplify this task. These tools allow you to upload an image and, with the appropriate prompt, generate a description that can be used as alternative text.

Let's look at some examples of how you can incorporate these tools into your daily teaching practice.

AI-generated alt text for images

We'll use ChatGPT 4 (which requires a paid subscription) for the examples below, but many LLMs now have similar features, so use whichever one works best for you. Note: Arizona State University offers a free Image Accessibility Creator for this specific purpose.

Alt text prompt examples

We’ll start with the following prompts for generating alternative text for a given image:

Describe what you can see in the image, and generate the description so it is suitable to be used for alt text.
Please write accessible alt text for this image in 150 characters or less.
Write alt text for the attached image. The alt text should be concise. A short phrase or at most a couple of sentences. Should the image or graphic contain text, add a section and transcribe all the text presented.
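If you reuse these prompts often, it can help to keep them in one place and make the character limit adjustable. The snippet below is only a sketch: the dictionary keys and the build_prompt function are names I made up for illustration, and the prompt wording is copied from the list above.

```python
# The three prompts from this article, keyed by hypothetical short names.
ALT_TEXT_PROMPTS = {
    "descriptive": (
        "Describe what you can see in the image, and generate the "
        "description so it is suitable to be used for alt text."
    ),
    "limited": (
        "Please write accessible alt text for this image in "
        "{limit} characters or less."
    ),
    "concise_with_text": (
        "Write alt text for the attached image. The alt text should be "
        "concise. A short phrase or at most a couple of sentences. Should "
        "the image or graphic contain text, add a section and transcribe "
        "all the text presented."
    ),
}


def build_prompt(style, limit=150):
    """Return the prompt text, filling in the character limit where used."""
    return ALT_TEXT_PROMPTS[style].format(limit=limit)


# Example: ask for a tighter limit than the default.
print(build_prompt("limited", limit=120))
```

You would then paste the printed prompt into whichever tool you use, along with the image.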

Examples

Two students looking at a laptop on outdoor steps, with others studying in the background.

Let's see what ChatGPT 4 generates when we use these prompts for the image above:

Prompt: Describe what you can see in the image, and generate the description so it is suitable to be used for alt text.

Output: In the image, there are six people, likely students, engaging in various activities on the steps outside a brick building. In the foreground, two young women are sitting side by side on the steps. One has an open laptop on her lap and is gesturing as she talks to the other, who is listening and smiling. They appear to be collaborating on a project or studying together. The first woman is wearing a green shirt, headphones around her neck, and khaki pants, while the second woman is wearing a striped blue shirt and white pants.

In the background, on the left, another person is seated on the ground with a book open in their lap, while on the right, a group of three people are sitting on the steps. Two of them are engrossed in a large open book, while the third is looking on. Everyone is casually dressed, suggesting a relaxed, academic atmosphere, possibly on a college or university campus.

Prompt: Please write accessible alt text for this image in 150 characters or less.

Output: Two students share a laptop and smile on steps outside, with others studying in the background.

Prompt: Write alt text for the attached image. The alt text should be concise. A short phrase or at most a couple of sentences. Should the image or graphic contain text, add a section and transcribe all the text presented.

Output: Students sitting on stairs with laptops and books, engaging in study and conversation. No discernible text in image.

As we can see in these examples, each prompt has its advantages and disadvantages. The first AI-generated output is very descriptive, probably because our prompt started with the word "describe." The second output is much more concise and to the point, though its wording could still be edited for clarity. The third output is clearer, but it also appended "No discernible text in image" to the alt text, which isn't necessary.
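Some of this cleanup can be automated before the human review. The sketch below is one possible approach, not a complete solution: the function name tidy_alt_text and the lists of phrases to strip are my own assumptions, chosen to match the third output above and the common advice to avoid openers such as "image of."

```python
import re

# Sentences that add nothing to alt text (assumed list; extend as needed).
FILLER_SENTENCES = [
    r"No discernible text in (the )?image\.?",
]
# Openers commonly discouraged in alt text guidance (assumed pattern).
REDUNDANT_OPENER = r"^(an? )?(image|picture|photo) of\s+"


def tidy_alt_text(text):
    """Strip filler sentences and redundant openers from generated alt text."""
    for pattern in FILLER_SENTENCES:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    text = re.sub(REDUNDANT_OPENER, "", text.strip(), flags=re.IGNORECASE)
    # Collapse any leftover runs of whitespace.
    text = re.sub(r"\s+", " ", text).strip()
    # Restore the capital letter in case an opener was removed.
    return text[:1].upper() + text[1:]


raw = (
    "Students sitting on stairs with laptops and books, engaging in study "
    "and conversation. No discernible text in image."
)
print(tidy_alt_text(raw))
```

A filter like this only removes known patterns; it cannot judge whether the remaining text describes the image well.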

Let's look at another example, this one a highway map of the state of Missouri that includes text within the image.

A map of Missouri highlighting major highways, cities like Kansas City, St. Louis, and Jefferson City, and neighboring states.

Prompt: Describe what you can see in the image, and generate the description so it is suitable to be used for alt text.

Output: A map of Missouri showing major highways, roads, rivers, and cities. Key cities like Kansas City, Jefferson City, and St. Louis are marked, as are bordering states.

Prompt: Please write accessible alt text for this image in 150 characters or less.

Output: Map of Missouri highlighting major highways, rivers, and cities with state borders.

Prompt: Write alt text for the attached image. The alt text should be concise. A short phrase or at most a couple of sentences. Should the image or graphic contain text, add a section and transcribe all the text presented.

Output: A detailed road map of Missouri with labeled cities, highways, and state borders.

Recommendations

These examples demonstrate how slight variations in a prompt can produce significantly different outputs.

The World Wide Web Consortium (W3C), the organization that sets standards and guidelines for web development, does not set a maximum length for alt text in its guidelines for image accessibility. Other guidelines recommend a limit of one sentence, or 80–150 characters; the Canvas accessibility checker will flag any alt text longer than 120 characters as an error. I personally like the 150-character limit, as I find it allows for slightly more descriptive alt text while remaining concise.
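Whichever limit you settle on, it is easy to check generated text against it before pasting it into your course. Here is a minimal sketch; the constant and function names are hypothetical, and the two thresholds are simply the 120-character Canvas limit and the 150-character upper bound mentioned above.

```python
# Limits mentioned in this article; adjust to your own guidelines.
CANVAS_LIMIT = 120     # Canvas accessibility checker flags longer alt text
PREFERRED_LIMIT = 150  # upper end of the commonly recommended range


def check_alt_length(alt_text):
    """Return (length, notes) for a piece of alt text."""
    length = len(alt_text.strip())
    notes = []
    if length == 0:
        notes.append("empty")
    if length > CANVAS_LIMIT:
        notes.append(f"over the {CANVAS_LIMIT}-character Canvas limit")
    if length > PREFERRED_LIMIT:
        notes.append(f"over the {PREFERRED_LIMIT}-character target")
    return length, notes


# Two of the generated outputs from the examples above.
outputs = [
    "Two students share a laptop and smile on steps outside, "
    "with others studying in the background.",
    "Map of Missouri highlighting major highways, rivers, and cities "
    "with state borders.",
]
for text in outputs:
    length, notes = check_alt_length(text)
    print(length, notes or "ok")
```

Keep in mind that character count is only a proxy. Short alt text can still be unhelpful, so this check complements your own review rather than replacing it.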

Generally speaking, I suggest finding a prompt that works best for you, while keeping in mind that AI-generated alt text is not perfect and will often still require editing.

AI-generated descriptions for images

We'll use ChatGPT 4 (which requires a paid subscription) again for the following examples. Note: Arizona State University offers a free Image Accessibility Creator for this specific purpose.

Image description prompt

Because alt text must stay concise, you might also need to provide a more detailed image description in a caption or in accompanying text. A longer description reduces ambiguity because it is not bound by the length limits of alt text. The objective, however, is still to communicate the image's intent succinctly. Again, focus on the purpose the image serves: why you included it and what students need to know about it.
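If you edit your pages as HTML, one common way to pair short alt text with a longer visible description is a figure element with a caption. The sketch below builds that markup in Python. The function name figure_html and the sample file name are hypothetical, and your learning management system may offer its own caption field that makes this unnecessary.

```python
from html import escape


def figure_html(src, alt_text, long_description):
    """Build a <figure> with concise alt text and a longer visible caption."""
    return (
        "<figure>\n"
        f'  <img src="{escape(src)}" alt="{escape(alt_text)}">\n'
        f"  <figcaption>{escape(long_description)}</figcaption>\n"
        "</figure>"
    )


# Hypothetical file name; the texts are shortened from the examples in this article.
print(
    figure_html(
        "missouri-map.png",
        "Map of Missouri highlighting major highways, rivers, and cities.",
        "A stylized map of the state of Missouri, detailing its network of "
        "highways and major roads.",
    )
)
```

Escaping the text matters here: quotation marks or ampersands in a generated description would otherwise break the markup.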

For image descriptions, we will use the following prompt (which is a modified version of the prompt included in the Arizona State Image Accessibility Creator):

Using the provided image, create a description. The image description should be objective, concise, and descriptive. Descriptions should be straight forward and factual, avoiding interpretations. Begin with a general overview of what the image portrays before providing details. Not everything needs to be described if it is not contextually important. Descriptions should utilize vivid terminology to describe various features like composition, shapes, size, texture, and color. Avoid using picture of, image of, and photo of unless needed in the context of an illustration or painting type graphic.

Examples

For these two image description examples, we will use the same two images that were used previously to generate alt text.

Prompt: Using the provided image, create a description. The image description should be objective, concise, and descriptive. Descriptions should be straight forward and factual, avoiding interpretations. Begin with a general overview of what the image portrays before providing details. Not everything needs to be described if it is not contextually important. Descriptions should utilize vivid terminology to describe various features like composition, shapes, size, texture, and color. Avoid using picture of, image of, and photo of unless needed in the context of an illustration or painting type graphic.

Output: The image captures a group of students on a campus setting. Two young women are seated on concrete steps; one holds a laptop and gestures while talking, and the other listens, smiling. They are dressed in casual attire; one wears a green jacket and khaki pants, and the other sports a striped blue shirt and white pants. In the background, other students are engaged in reading and conversing. The surrounding architecture has red brick detailing, and the overall mood is lively and communal.

Prompt: Using the provided image, create a description. The image description should be objective, concise, and descriptive. Descriptions should be straight forward and factual, avoiding interpretations. Begin with a general overview of what the image portrays before providing details. Not everything needs to be described if it is not contextually important. Descriptions should utilize vivid terminology to describe various features like composition, shapes, size, texture, and color. Avoid using picture of, image of, and photo of unless needed in the context of an illustration or painting type graphic.

Output: The graphic is a stylized map of the state of Missouri, detailing its network of highways and major roads. Prominently displayed are red lines denoting highways, with blue and red shield icons indicating U.S. Route numbers. Major cities are marked with red dots and labeled in black text, while rivers are represented by sinuous blue lines. The map is bordered by its neighboring states: Kansas, Illinois, Kentucky, Tennessee, Arkansas, Oklahoma, and Iowa, which are labeled in bold black text. The color palette is primarily beige for the land, with red, blue, and black features providing contrast.

Conclusion

The importance of crafting accessible alt text and image descriptions cannot be overstated. By prioritizing accessibility, we not only follow best practices but also ensure that all students can engage fully with our course materials.

Using generative AI and large language models can streamline the process of creating these descriptions. The examples above illustrate how varying prompts can yield different levels of detail, reaffirming the need for a balance between brevity and descriptiveness.

As we continue to embrace these new tools, remember that while AI can provide a strong starting point, the human touch in fine-tuning this content is still needed.
