There are a couple of ways to extract subtitles from a YouTube video -
By filling in the language and video ID in this generic URL, http://www.youtube.com/api/timedtext?lang={LANG}&v={VIDEOID}, you can get an .xml file containing the subtitles in the desired language for a chosen video.
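As a rough sketch, the URL can also be assembled in Python. The function name `build_timedtext_url` and the default language `"en"` are illustrative assumptions, not part of any official API, and the snippet only builds the string; it does not check whether the endpoint still responds.

```python
from urllib.parse import urlencode

# Endpoint taken from the generic URL above.
TIMEDTEXT_ENDPOINT = "http://www.youtube.com/api/timedtext"


def build_timedtext_url(video_id: str, lang: str = "en") -> str:
    """Fill {LANG} and {VIDEOID} into the generic subtitle URL.

    Hypothetical helper: the name and the default language are assumptions.
    """
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"lang": lang, "v": video_id})
    return f"{TIMEDTEXT_ENDPOINT}?{query}"
```

Calling `build_timedtext_url("VIDEOID", "en")` with a real video ID gives a link you can open in a browser to save the .xml file.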
To get rid of the tags within that file and to just have the plain-text transcript, here is what you have to do:
- Open up Microsoft Excel
- Copy and paste the subtitles into a single cell
- Press Ctrl+H
- In the Replace tab, type <*> in the "Find What" textbox, leave the "Replace With" textbox blank, and click Replace All. The * is a wildcard, so the search expression will remove all tags within the original text.
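If you would rather skip Excel, the same tag stripping can be sketched in a few lines of Python. This assumes the downloaded file consists of `<text>` elements inside a root element and that the caption text may carry HTML entities such as `&#39;`; both are assumptions about the file layout, and `transcript_to_text` is a made-up name.

```python
import html
import xml.etree.ElementTree as ET


def transcript_to_text(xml_string: str) -> str:
    """Return the plain-text transcript, one caption per line.

    Assumes captions live in <text> elements (an assumption about the format).
    """
    root = ET.fromstring(xml_string)
    lines = []
    for node in root.iter("text"):
        # itertext() also collects text nested inside any inline tags.
        raw = "".join(node.itertext())
        # Decode leftover entities and collapse runs of whitespace.
        cleaned = " ".join(html.unescape(raw).split())
        if cleaned:
            lines.append(cleaned)
    return "\n".join(lines)
```

Read the saved .xml file into a string, pass it to `transcript_to_text`, and write the result to a .txt file.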
Alternatively, there is an open-source tool called Google2SRT that downloads all available subs from a YouTube video with one click and converts them into .srt format so that they can be used within media players like VLC Media Player.
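To give an idea of what such a conversion involves, here is a minimal Python sketch that turns the subtitle XML into .srt text. It is not how Google2SRT works internally; it assumes each `<text>` element carries `start` and `dur` attributes in seconds, and the function names are invented for illustration.

```python
import html
import xml.etree.ElementTree as ET


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in .srt files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def timedtext_to_srt(xml_string: str) -> str:
    """Convert subtitle XML to .srt text.

    Assumes <text start="..." dur="..."> elements (an assumption about the format).
    """
    root = ET.fromstring(xml_string)
    blocks = []
    for node in root.iter("text"):
        start = float(node.get("start", "0"))
        end = start + float(node.get("dur", "0"))
        text = html.unescape("".join(node.itertext())).strip()
        if not text:
            continue  # skip empty captions so the numbering stays contiguous
        index = len(blocks) + 1
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

Saving the returned string with an .srt extension should give a file that a player like VLC can load alongside the video.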
Update: Ted.com now provides transcripts of the talks on its site.