Thursday, 19 November 2009
Automatic captions in YouTube
Since the original launch of captions in our products, we’ve been happy to see growth in the number of captioned videos on our services, which now number in the hundreds of thousands. This suggests that more and more people are becoming aware of how useful captions can be. As we’ve explained in the past, captions not only help the deaf and hearing impaired, but with machine translation, they also enable people around the world to access video content in any of 51 languages. Captions can also improve search and even enable users to jump to the exact parts of the videos they're looking for.
However, like everything else on YouTube, captioning faces a tremendous challenge of scale. Every minute, 20 hours of video are uploaded. How can we expect every video owner to spend the time and effort necessary to add captions to their videos? Even with all of the captioning support already available on YouTube, the majority of user-generated video content online is still inaccessible to people like me.
To help address this challenge, we've combined Google's automatic speech recognition (ASR) technology with the YouTube caption system to offer automatic captions, or auto-caps for short. Auto-caps use the same voice recognition algorithms used in Google Voice to automatically generate captions for video. The captions will not always be perfect (check out the video below for an amusing example), but even when they're off, they can still be helpful, and the technology will continue to improve with time.
In addition to automatic captions, we’re also launching automatic caption timing, or auto-timing, to make it significantly easier to create captions manually. With auto-timing, you no longer need to have special expertise to create your own captions in YouTube. All you need to do is create a simple text file with all the words in the video and we’ll use Google’s ASR technology to figure out when the words are spoken and create captions for your video. This should significantly lower the barriers for video owners who want to add captions, but who don’t have the time or resources to create professional caption tracks.
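To make this concrete, here is a hypothetical example (the wording, timestamps and formats below are invented for illustration; the exact file formats YouTube accepts and produces may differ). The transcript you upload is nothing more than the words spoken in the video:

```
Welcome to this short lecture on photosynthesis. Today we will look at
how plants convert sunlight into chemical energy.
```

Auto-timing then aligns those words with the audio track and produces a timed caption track, shown here in an illustrative SubRip-style form:

```
1
00:00:00,000 --> 00:00:03,500
Welcome to this short lecture on photosynthesis.

2
00:00:03,500 --> 00:00:08,200
Today we will look at how plants
convert sunlight into chemical energy.
```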
To learn more about how to use auto-caps and auto-timing, check out this short video and our help center article.
You should see both features available in English by the end of the week. For our initial launch, auto-caps are only visible on a handful of partner channels (list below*). Because auto-caps are not perfect, we want to make sure we get feedback from both viewers and video owners before we roll them out more broadly. Auto-timing, on the other hand, is rolling out globally for all English-language videos on YouTube. We hope to expand these features to other channels and languages in the future. Please send us your feedback to help make that happen.
Today I'm more hopeful than ever that we'll achieve our long-term goal of making videos universally accessible. Even with its flaws, I see the addition of automatic captioning as a huge step forward.
* Partners for the initial launch of auto-caps: UC Berkeley, Stanford, MIT, Yale, UCLA, Duke, UCTV, Columbia, PBS, National Geographic, Demand Media, UNSW and most Google & YouTube channels.
Update on 11/24: We've posted a full-length video of our announcement event in Washington, D.C. on YouTube. We've included English captions using our new auto-timing feature.
Wednesday, 21 October 2009
More accessibility features in Android 1.6
The most recent release of Android 1.6, a.k.a. Donut, introduces accessibility features designed to make Android apps more widely usable by blind and low-vision users. In brief, Android 1.6 includes a built-in screenreader and text-to-speech (TTS) engine which make it possible to use most Android applications, as well as all of Android's default UI, when not looking at the screen.
Android-powered devices with Android 1.6 and future software versions will include the following accessibility enhancements:
- Text-to-Speech (TTS) is now bundled with the Android platform. The platform comes with voices for English (U.S. and U.K.), French, Italian, Spanish and German.
- A standardized Text-to-Speech API is part of the Android SDK, enabling developers to create high-quality talking applications.
- Starting with Android 1.6, the Android platform includes a set of easy-to-use accessibility APIs that make it possible to create accessibility aids such as screenreaders for the blind.
- Application authors can easily ensure that their applications remain usable by blind and visually impaired users by making sure that all parts of the user interface are reachable via the trackball and that all image controls have associated textual metadata.
- Starting with Android 1.6, the Android platform comes with applications that provide spoken, auditory (non-speech sounds) and haptic (vibration) feedback. Named TalkBack, SoundBack and KickBack, these applications are available via the Settings > Accessibility menu.
- In addition, project Eyes-Free (which includes accessibility tools such as TalkBack) provides several UI enhancements for using touch-screen input. Many of these innovations are available via Android Market and are already being heavily used. We believe these eyes-free tools will serve our users with special needs as well.
Saturday, 17 October 2009
A new home for accessibility at Google
We regularly develop and release accessibility features and improvements. Sometimes these are snazzy new applications like a new talking RSS reader for Android devices. Other times the changes aren't flashy, but they're still important, such as our recent incremental improvements to WAI-ARIA support in Google Chrome (adding support for ARIA roles and labels). We also work on more foundational research to improve customization and access for our users, such as AxsJax (an open-source framework for injecting usability enhancements into Web 2.0 applications).
We've written frequently about accessibility on our various blogs and help forums, but this information has never been easily accessible (pun intended) in one central place. This week we've launched a handy new website for Accessibility at Google to pull all our existing resources together: www.google.com/accessibility. Here you can follow the latest accessibility updates from our blogs, find resources from our help center, participate in a discussion group, or send us your feedback and feature requests. Around here, we often say, "launch early and iterate" — meaning, get something out the door, get feedback, and then improve it. In that tradition, our accessibility website is pretty simple, and we expect this site to be the first of many iterations. We're excited about the possibilities.
The thing we're most excited about is getting your feedback about Google products and services so we can make them better for the future. Take a look and let us know what you think.
Posted by Jonas Klink, Accessibility Product Manager
Tuesday, 14 April 2009
An ARIA for Google Moderator
Google-AxsJAX was launched in late 2007 as a library for access-enabling Web-2.0 applications. Since then, we have released accessibility enhancements for many Web-2.0 applications via the AxsJAX site as early experiments that eventually graduated into full-fledged products. Just recently we posted about using the AxsJAX library to provide ARIA enhancements for Google Calendar, Google Finance and Google News. Now we are happy to share an early AxsJAX extension for Google Moderator that enables fluent eyes-free use of the tool.
For details about AxsJAX enhancements, see the AxsJAX FAQ. Briefly, you need Firefox 3.0 and a screenreader that supports W3C ARIA to take advantage of these enhancements. Users who do not have a screenreader installed can most easily experience the results by installing Fire Vox, a freely available self-voicing extension for Firefox.
You can activate the AxsJAX enhancement for Google Moderator either by clicking on the link that says "Click here for ARIA enhanced Google Moderator" or by accessing the ARIA-enhanced version directly. After enabling the enhancement, you can use Google Moderator via the keyboard, with all user interaction producing spoken feedback via W3C ARIA.
Here is a brief overview of the experience (a minimal code sketch of the underlying ARIA pattern follows the list):
1. The user interface is divided into logical panes — one listing topic areas, and the other listing questions on a given topic. At times (e.g., before a meeting), you may find an additional Featured Question pane that shows a randomly selected question that you can vote on.
2. Users can ask new questions under a given topic, or give a thumbs-up/thumbs-down to questions that have already been asked.
3. Use the left and right arrow keys to switch between the two panes. You hear the title of the selected pane as you switch.
4. Use up and down arrows to navigate among the items in the selected pane. As you navigate, you hear the current item.
5. Hit enter to select the current item.
6. The current item can be magnified by repeatedly pressing the + (or =) key. To reduce magnification, press the - key.
7. When navigating the questions in a given topic, hit y or n to vote a question up or down.
8. When navigating items in the topic pane, hit a to ask a question. Once you confirm your request to post the question, it will show up in the list of questions for that topic so that others can vote that question up or down.
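For web developers curious about how this kind of enhancement is wired up, here is a minimal TypeScript sketch of the underlying pattern (this is not the AxsJAX source, and the selectors and markup are hypothetical stand-ins for Moderator's panes and items): arrow keys move a selection, and writing text into an aria-live region causes a supporting screenreader to speak it. Magnification and voting are omitted for brevity.

```typescript
// Minimal sketch: keyboard navigation with spoken feedback via an
// ARIA live region. Hypothetical markup: each pane is a '.pane'
// element (with an aria-label) containing '.item' children.
const panes = Array.from(
  document.querySelectorAll<HTMLElement>('.pane')
);
let paneIndex = 0;
let itemIndex = 0;

// A visually hidden element with aria-live="polite": screenreaders
// that support W3C ARIA announce whatever text is written into it.
const announcer = document.createElement('div');
announcer.setAttribute('aria-live', 'polite');
announcer.style.position = 'absolute';
announcer.style.left = '-9999px';
document.body.appendChild(announcer);

function announce(text: string): void {
  announcer.textContent = text;
}

function items(): HTMLElement[] {
  return Array.from(
    panes[paneIndex].querySelectorAll<HTMLElement>('.item')
  );
}

document.addEventListener('keydown', (e: KeyboardEvent) => {
  switch (e.key) {
    case 'ArrowLeft':
    case 'ArrowRight': {
      // Switch panes and announce the selected pane's title.
      const delta = e.key === 'ArrowRight' ? 1 : -1;
      paneIndex = (paneIndex + delta + panes.length) % panes.length;
      itemIndex = 0;
      announce(panes[paneIndex].getAttribute('aria-label') ?? 'pane');
      break;
    }
    case 'ArrowUp':
    case 'ArrowDown': {
      // Move through items in the current pane and speak each one.
      const list = items();
      if (list.length === 0) break;
      const delta = e.key === 'ArrowDown' ? 1 : -1;
      itemIndex = (itemIndex + delta + list.length) % list.length;
      announce(list[itemIndex].textContent ?? '');
      break;
    }
  }
});
```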
Please visit the Google Group for accessibility to provide feedback. This AxsJAX extension is still a work in progress, so we'd love to hear from you as we continue to work out the kinks.
Update on 4/14: Clarified in the second and third paragraphs that you do not need to install this enhancement. You can access it directly from Google Moderator.
Posted by T. V. Raman, Research Scientist, and Charles L. Chen, Software Engineer
Friday, 3 April 2009
ARIA for Google Calendar, Finance and News: In praise of timely information access
In our continued efforts to make Google applications more accessible, we have launched ARIA support for several Google applications over the last few months. W3C ARIA is a set of HTML DOM properties that enable adaptive technologies like screenreaders to work better with dynamic web applications. As with previous ARIA-enabled Google solutions, screenreader users can now switch on ARIA support in the following applications by activating an invisible Enable Screenreader Support link. Alternatively, simply browse to the links in this blog with a supporting screenreader and Firefox 3.0 to experience the interface enhancements. If you do not have a screenreader installed, but are curious to experience what eyes-free interaction with these applications feels like, we recommend the freely downloadable Firefox enhancement Fire Vox by Charles Chen.
- Google Calendar: The ARIA-enhanced Google Calendar enables speech-enabled access to the day view in Google Calendar. You can use the keyboard to move through events, move through the days of the week, as well as to cycle through your various calendars. As you work with the calendar, the application raises appropriate DOM events through W3C ARIA to invoke the relevant spoken feedback through the screenreader.
- Google Finance: The Finance page can be viewed as a set of logical panes, with related content appearing as items in each pane. The ARIA-enhanced version of Google Finance enables you to switch panes, and navigate the current pane with the arrow keys. Navigation produces spoken feedback through the screenreader. In addition, Google Finance provides several power user tools, including a stock screener, all of which are speech-enabled through ARIA. These power user tools provide interesting examples for Web developers experimenting with ARIA. (ARIA support for Finance was developed by intern Svetoslav Ganov as his starter project.)
- Google News: Finally, we have added ARIA support to enable rapid eyes-free access to Google News. These enhancements follow the same pattern as Google Finance, and the ability to navigate between the different views provided by Google News (e.g., World News vs. Sports) enables rapid access to the large volume of news available via the Google News interface.
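As a rough illustration of what raising the appropriate DOM properties means in practice, the TypeScript sketch below uses the standard aria-activedescendant pattern; the markup and ids are hypothetical, and this shows the general technique rather than Google's actual implementation. As the user arrows through items, the container's aria-activedescendant property is updated, and a screenreader that supports W3C ARIA announces the newly active item.

```typescript
// Sketch of the aria-activedescendant pattern on a hypothetical
// pane of items (say, a list of stock quotes on a finance page).
const pane = document.getElementById('quotes')!;
pane.setAttribute('role', 'listbox');
pane.tabIndex = 0; // make the container keyboard-focusable

const options = Array.from(pane.children) as HTMLElement[];
options.forEach((opt, i) => {
  opt.setAttribute('role', 'option');
  opt.id = `quote-${i}`; // activedescendant targets need ids
});

let active = 0;
function setActive(i: number): void {
  if (options.length === 0) return;
  active = (i + options.length) % options.length;
  // Screenreaders track this property on the focused container and
  // announce the element it points to.
  pane.setAttribute('aria-activedescendant', options[active].id);
}

pane.addEventListener('keydown', (e: KeyboardEvent) => {
  if (e.key === 'ArrowDown') setActive(active + 1);
  if (e.key === 'ArrowUp') setActive(active - 1);
});
setActive(0);
```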
Posted by T. V. Raman, Research Scientist, and Charles L. Chen, Software Engineer
Thursday, 6 November 2008
Accessible View: An ARIA for web search
In the spirit of a recent post discussing some of our search experiments, last week we launched an opt-in search experiment we're calling Accessible View, which makes it easy to navigate search results using only the keyboard. Like many of our recent accessibility-related enhancements, this experiment is built on W3C ARIA, an evolving set of HTML DOM properties that enable adaptive technologies to work better with AJAX-style applications, together with the Google-AxsJAX library.
The Accessible View experiment is another step toward making our search results more accessible for everyone. In July 2006, we launched Accessible Search on Google Labs, where the goal was to help visually impaired users find content that worked well with adaptive technologies. We continue to refine and tune the ranking on Accessible Search. And with Accessible View, users can easily toggle between regular Google search results and Accessible Search results by using the 'A' and 'W' keys.
When we designed the Accessible View interface, we first looked at how people used screen readers and other adaptive technologies when performing standard search-related tasks. We then asked how many of these actions we could eliminate to speed up the search process. The result: a set of keyboard shortcuts for navigating the results page efficiently, arranged so that the user's adaptive technology speaks the right information at each step.
We've also added a magnification lens that highlights the user's selected search result. Ever since we launched Accessible Search, one of the most requested features has been better support for low-vision users. While implementing the keyboard navigation described here, we incorporated the magnification lens first introduced by Google Reader.
Bringing it all together, we implemented keyboard shortcuts that extend what was originally pioneered by the keyboard shortcuts experiment. These shortcuts help users navigate through different parts of the results page with a minimal number of keystrokes. The left and right arrows cycle through the various categories of items on the page (e.g., results, ads, or search refinements), and the up and down arrow keys move through the current category. Power users can keep their hands on the home row by using the h, j, k, and l keys. In addition, the n and p keys provide an infinite stream of results, so you can move through the results without being disoriented by a page refresh after the first 10.
Key | Behavior
--- | ---
j / k | next/previous result
n / p | next/previous result, scrolling if necessary
enter | open the current result
up / down | next/previous item in the current category
left / right | switch between categories (results, ads, refinements)
a | jump to ads
A | switch to Accessible Search results
W | switch to default Google results
r | jump to related searches
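One plausible way to realize a shortcut table like this in code is a simple key-to-action dispatch map. The TypeScript sketch below is illustrative only: the '.result' selector and the fetchMoreResults helper are hypothetical (the latter stands in for whatever mechanism appends the next page of results without a page refresh), and the j/k versus n/p distinction is reduced here to whether more results get fetched.

```typescript
// Hypothetical sketch of shortcut dispatch for the table above.
let results: HTMLElement[] = Array.from(
  document.querySelectorAll<HTMLElement>('.result')
);
let current = 0;

// Stand-in for loading the next page of results without a refresh.
async function fetchMoreResults(): Promise<HTMLElement[]> {
  return []; // a real page would fetch and append new result nodes
}

function select(i: number): void {
  current = Math.max(0, Math.min(i, results.length - 1));
  const r = results[current];
  r.scrollIntoView();
  r.tabIndex = -1;
  r.focus(); // moving focus lets adaptive technology speak the result
}

async function next(stream: boolean): Promise<void> {
  if (stream && current === results.length - 1) {
    results = results.concat(await fetchMoreResults());
  }
  select(current + 1);
}

const shortcuts: Record<string, () => void> = {
  j: () => { void next(false); },  // next result
  k: () => select(current - 1),    // previous result
  n: () => { void next(true); },   // next result, extending the stream
  p: () => select(current - 1),    // previous result
  Enter: () => { results[current]?.querySelector('a')?.click(); },
};

// A real implementation would ignore keys typed into the search box.
document.addEventListener('keydown', (e) => shortcuts[e.key]?.());
```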
Try out the experiment and give us your feedback.
Posted by T.V. Raman, Research Scientist, and Charles L. Chen, Software Engineer
Wednesday, 15 October 2008
Google Health feels accessible
From time to time, our own T.V. Raman shares his tips on how to use Google from his perspective as a technologist who cannot see -- tips that sighted people, among others, may also find useful.
Keeping track of personal health records using printed paper is painful at best for most users; as someone with a visual impairment, this is a show-stopper for me. As I begin paying more attention to my own health, I've come to realize first-hand how hard it is at present to track one's health using the means that traditional health care programs provide.
As luck would have it, Google Health arrived at around the same time that I started dealing with these issues, so focusing on the usability of Google Health from the perspective of someone who cannot see was a no-brainer. Today, we are launching a version of Google Health that has been augmented with several usability enhancements for users of screen readers and self-voicing browsers. These enhancements are implemented using W3C ARIA, an emerging set of Web standards that make AJAX applications work smoothly with screen readers (see our related post on the GWT blog for details). With these enhancements, I can now easily navigate Google Health, not only to manage my own health records, but also to quickly research relevant health conditions, track medications and perform myriad other health-related tasks.
Google Health gives me a single unified web interface to manage all of my health-related information. Kudos to the Google Health and GWT teams for creating an extremely useful and usable solution!
Posted by T.V. Raman, Research Scientist