Audio
Is AI a blessing or a curse? Dr Scott Hollier (part 1)
Part 1 of a presentation on Artificial Intelligence and its use by people with print disabilities.
This series from Blind Citizens Australia looks at the organisation's work and issues surrounding fair access for people who are blind or have low vision.
In this episode: another presentation from the May 2024 Round Table on Information Access for People with a Print Disability.
Dr Scott Hollier, CEO of the Centre for Accessibility Australia, delivered this presentation on artificial intelligence (AI) and its use by people who are blind or vision impaired, and those with other print disabilities.
Our thanks to the Round Table for allowing us the use of this material.
New Horizons is produced in the studios of Vision Australia.
Pictured on this page: Dr Scott Hollier
Speaker 1 00:07 (Program theme)
Speaker 2 00:07
It's up to you and me / To shine a guiding light and lead the way / United by our cause / We have power to pursue what we believe...
Speaker 1
We'll achieve the realisation of our...
Speaker 2 00:29
Hello and welcome to this week's episode of New Horizons. I'm Vaughan Bennison, thanks for joining us. This week we go back to the Round Table on Information Access for People with a Print Disability once again, and we hear the first of a two-part series featuring Dr Scott Hollier. Scott is the founder and CEO of the Centre for Accessibility Australia, which is based in Western Australia.
Speaker 4 00:49
Today we're going to be talking a bit about the implications for AI in relation to supporting people with print disability and is it a blessing or a curse? So I think a good place to start is to try two live AI things and figure out pretty quickly if it's a blessing or curse sort of day. So the first one is happening on my slides right now. We do have live captioning happening on my slides. So in theory, as I'm talking, there should be captions appearing on the screen. And I'll circle back to that in a little while. So I can't see those myself. But for people who are able to see those captions, just keep an eye on them. And we'll check back later to see how well those are going and how accurate they've been.
And the second demo is that I always like to try and take a selfie. Now, until recently, that's been very difficult being legally blind. But Google has put a guided frame feature on my phone. So I'm going to try to take a live AI based selfie. So I'll turn up the volume a bit on my phone here and we'll see how she.... pardon? How successful this is. I'm going into the camera app, and I will choose the front-facing camera. Photo gap, switch to front face. Take photo button. Now in theory, double tap to activate. If I line up my camera. Actions available, use tap with three fingers to view. One face in frame, move your phone left and down. Okay, I'll try that. One face in frame... move your phone slightly left and hold for photo... three, two, one.
So AI seems to be our friend today. That's a great thing to know. However, if you've seen the news this week, we've seen ChatGPT4 launched. We've seen Google Gemini, so that's Google's own AI launched. We've also seen Microsoft starting to move away from its OpenAI efforts to form more of its own thing. We've also seen Apple announce that it is now partnering with ChatGPT and Siri will likely get an update based on ChatGPT4. And that's all been in the past week. So it has been quite the time and depending on your point of view, you may feel like either all our AI is going to be really, really friendly, enthusiastic, mostly female voices chatting to us, or you might feel like the Terminators are about to take us out.
So it really does vary a bit depending on your perspective. But importantly, from our point of view, we want to understand what the relevance is to our work, how we can best support people with disability, particularly print disability at this conference. So really what we want to do is work out how can we maximize the opportunity for AI and we really need to understand is it going to be a blessing or a curse in the work that we do. And as part of that, we really need to get two things right. We need to make sure that we have the right tools in our device of choice. And when we prepare our content, is it something that AI can help us with in terms of supporting people with print disability?
For the first one, we've just seen a good example with our selfie camera that in our everyday devices, whether it's on Windows or Mac, iPhone, Android, there's some great accessibility features and those are definitely being improved with AI support, as we've just noted, with AI scanning the image in front of it and helping me line up the photo and take that photo.
04:01
But what about creating our content? How can we ensure that the content we create is accessible? What's the current state of play of AI in terms of providing that support and exactly what features are available to us? And that's really a big part of what I want to walk through today. So really there's two main AI approaches when it comes to how AI can support us. One is, is AI at a place where it can change things in real time to make sure that we have the effective access that we need?
So can AI effectively add alt text to our images in a way that is helpful? Can AI pick up that, let's say for example, some text has been bolded but it's meant to be a heading? Can AI figure out that's a heading and then mark it up properly for us? Can AI look at text that's clumped together and figure out, okay, we need to spread that out. We need to adjust that or maybe the font's too small. Can we make that a better size? Is AI in a place where it can do these things on the fly?
And the second part is, can AI help us with the tools we need to do it ourselves? So where humans are still needed, can AI provide us with guidance and advice when we scan through our text and make the changes we need? So can AI be a supporting mechanism rather than changing it directly? So we'll have a look at some of these situations. So when it comes to alt text, a question that is often asked is how effective is alternative text in terms of generative AI? Can we just put an image now into Word or is there an option to just put an image out there and have confidence that AI is going to do a good job in figuring out what that image is? So I thought that's something that would be good to explore.
Now, if I was having this presentation two years ago and you asked how good AI is at figuring out what images are and giving us alt text, I'd say it's terrible. It wasn't that long ago when I put a picture of a submarine into Word and the auto-generated alt text declared it as a series of watches. I'm assuming it figured out the round submarine window in the illustration was probably a watch face or something like that. But in fairness, there have been massive updates in this area and because we have this rapid evolution of AI, we are seeing significant improvements in how AI is able to pick up visuals and try to make sense of them.
So I have an example on the screen here. So I have on the screen a bar graph and the auto-generated AI description when this was put into Word essentially said that it did identify it as a bar graph and it said that there are a number of colored bars. So is that accurate? Well, yes, technically the image is a bar graph and it does have a series of colored bars. So from a technical accuracy perspective, that would be correct. Is it useful? Well, no. So the graph is actually saying that there was a survey done of what the favorite color was of children. Yellow was the winner, we have nine votes. But we have no understanding of that based on the alt text that was created.
Which is interesting because there is text in this image but with AI currently, we tend to either have to get it to figure out what is the broad image or we can turn on OCR type capabilities to scan for that. So this is really interesting in that AI has evolved to a point where it doesn't get it completely wrong as often as it used to. It was actually quite hard to get an image into Word where it was just completely off track. Whereas a year or two ago, just about every image was off track. So we have seen some rapid evolution in this space. But even though it did broadly recognise what the image was, it didn't recognise the importance of that image and STEM related materials are particularly challenging, in those science, technology, engineering, mathematical areas.
07:53
So let's try another one. Using Microsoft's Seeing AI app, which many people I suspect use, and happily is now available on Android as of last year. And so we thought we'd take a photo of a mint packet called Yes Kiss Mints, which I'm assuming if you have a hot date lined up would be very useful. So took a photo of that with the Seeing AI app to see how that would go. And I should also mention that although both the Word one and this one are Microsoft based AI generators, I tried a few different ones and there's not a lot of difference between different platforms at this stage.
So I took a photo of that and it did identify that it's basically a square box. But it did make one error in that it said that there was white writing and blue writing where there is white writing but it's on the blue background. So we do have a slight error this time, but again, it does, in fairness, give you the gist of what that image is and that can be helpful to know. But it did occur to me that if you did buy Yes Kiss Mints specifically because you have a hot date in mind, you probably want to know, if you had a series of tins in front of you, which one had the mint in it. Because the alt text has not told us that. It has told us that there is a box. It has told us a little bit about the box.
But if we had like cough medicine next to the mint, you really want to make sure you got the right one. And unfortunately, the alt text is not helping us in that regard. So again, we have some information available to us. Is the alt text right? Well, there is an error in this one, but it's in the ballpark. But again, it's not giving us the information that is useful to us. If we want to get a mint, we'd probably have to switch to more of the text reading function of AI rather than just the broad, what is this image?
So to show you an example of where we just don't have AI in a place which is useful at the moment, I have an image on the screen from the James Webb Space Telescope. And I'm very happy to report that every single image from the James Webb Space Telescope does have a human generated alt text. And this is really, really important. So I'm going to play a sample of the alt text being read out. "The image is divided horizontally by an undulating line between a cloudscape forming a nebula along the bottom portion and a comparatively clear upper portion.
"Speckled across both portions is a star field showing innumerable stars of many sizes. The smallest of these are small, distant and faint points of light. The largest of these appear larger, closer, brighter and more fully resolved with eight point diffraction spikes. The upper portion of the image is bluish and has wispy translucent, cloud-like streaks rising from the nebula below." That is a very long alt text, but arguably probably as concise as is possible given the detailed image. So a human wrote that and really thankful that they did. But when I put the same image into a generative AI engine, it said, "A sky full of stars."
Now, I love it when programs like ChatGPT reference Coldplay songs, but I don't think it's really that helpful in terms of reflecting exactly what this image is. Clearly, we needed a lot more detail than sky full of stars. And this is, again, a really good example of where AI just isn't up to scratch. So we've seen some examples where AI is technically accurate, but doesn't give us the importance. We're seeing AI where it has some degree of accuracy, but again, misses the point. And in this case, it just completely doesn't know what to do. So this is a really, I think, good example to show that when we apply AI across a range of different areas, you know, is it effective? It's better. Is it solving our problems? Not yet.
So now I'm going to circle back to our live captioning. And the reason why live captioning is a good feature in addition to supporting people who are hearing impaired is that there are lots of technologies for which having generated speech is really important. It's the same technology we use if we upload a podcast, for example, to get it converted into text. It's the same process. And if it's all based on the same Microsoft engine, and if we want to have a record of spoken words, we want to convert that into different formats, then it's really critical that this is correct.
12:26
And I agree with the general consensus around the room that it's okay. It's not terrible. And again, if we went back three to four years, you don't have to look too far on YouTube to see that it was very primitive and really didn't do a good job. But now it's generally thought to be about 85 to 90% accurate. So when you think about the applicability of, you know, getting text, getting information converted into text, then converting that text into other accessible formats, then it is okay. But it's certainly not perfect. And this environment is probably as perfect as we can get.
We have, you know, a very... I'm right in front of the microphone, the microphone is broadcasting to the whole room. So the computer is able to pick up that audio quite well. But you know, if we were in a different scenario, if this was a skateboarding convention instead, and someone shouted something random into a mic as they flew past on the skateboard, I doubt that the live captioning would do so well. So again, we have an example here where the technology is actually pretty good. And when this is paired with curated slide content, then it is helpful. So unlike our alt text examples, I think this one is more helpful.
And I think it is getting close to a point where we can rely on it, but it's not accurate enough yet to be standalone. But as a supporting mechanism, it is okay. And if people aren't aware of how this feature is happening, it's built into PowerPoint, you can go to the slideshow tab, and there's a little "use subtitles" tick box. And that can just turn it on for you. So we tend to view it more as a quick win that we can provide that option, but with the caveats I just mentioned.
Speaker 2 14:09
And we'll bring you the second part of that presentation next week. If you'd like to get in touch with Blind Citizens Australia, 1-800-033-660 is the number to call, 1-800-033-660... or you can email bca@bca.org.au ... BCA at BCA dot org dot AU. I'm Vaughan Bennison, I'll talk to you again next week.
Speaker 3 14:30 (Theme)
We will cheer...
Speaker 2 14:38
Of a dream...