Java
Pulls the intro sentence from a random Wikipedia article:

    import java.io.InputStream;
    import java.net.URL;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    public class RandomSentence {
        public static void main (String[] args) throws Exception {
            String sentence;
            do {
                // Special:Random redirects to a random article
                InputStream in = new URL("https://en.wikipedia.org/wiki/Special:Random").openStream();
                Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
                // The first <p> element is taken as the article intro
                String intro = doc.getElementsByTagName("p").item(0).getTextContent();
                sentence = intro.replaceAll("\\([^(]*\\) *", "")   // drop text in parentheses
                                .replaceAll("\\[[^\\[]*\\]", "")   // drop [citation] markers
                                .split("\\.( +[A-Z0-9]|$)")[0];    // keep text up to the first sentence boundary
            // Retry on disambiguation pages, very short sentences, and unresolved "?" info
            } while (sentence.endsWith(":") || sentence.length() < 30 || sentence.contains("?"));
            System.out.println(sentence + ".");
        }
    }

Sometimes you get unlucky; I try to minimize this by requiring a minimum sentence length of 30 characters and filtering out sentences that end with ":" (all disambiguation pages start that way) or contain a "?" (there seem to be many articles with unresolved unknown info marked by question marks). A sentence boundary is a period followed by one or more spaces and then a digit or capital letter, or a period at the end of the paragraph.
I also filter out text in parentheses (the result is still a valid sentence) to try to remove some periods that aren't sentence boundaries. I filter out text in square brackets to remove source citation numbers.

Examples:
- Idle Cure was an arena rock band from Long Beach, California.
- Self-focusing is a non-linear optical process induced by the change in refractive index of materials exposed to intense electromagnetic radiation.
- TB10Cs4H3 is a member of the H/ACA-like class of non-coding RNA molecule that guide the sites of modification of uridines to pseudouridines of substrate RNAs.
- The Six-headed Wild Ram in Sumerian mythology was one of the Heroes slain by Ninurta, patron god of Lagash, in ancient Iraq.
- Sugar daddy is a slang term for a man who offers to support a typically younger woman or man after establishing a relationship that is usually sexual.
- Old Bethel United Methodist Church is located at 222 Calhoun St., Charleston, South Carolina.
- Douglas Geers is an American composer.
If you notice any grammar issues, well, that's your fault for not being a diligent Wikipedia editor! ;-)
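
To try the cleanup rules without hitting the network, here is a minimal sketch that applies the same three regular expressions and the same retry condition to a plain string. The class name `SentenceCleaner`, the method names `firstSentence` and `isAcceptable`, the constant `MIN_LENGTH`, and the sample text are all my own assumptions for illustration; only the regexes and the threshold of 30 come from the program above.

```java
final class SentenceCleaner {
    // Hypothetical constant name; the value 30 is the threshold used in RandomSentence.
    private static final int MIN_LENGTH = 30;

    // Removes parenthesised text and [citation] markers, then returns everything
    // before the first sentence boundary (without the trailing period).
    static String firstSentence(String paragraph) {
        String cleaned = paragraph
                .replaceAll("\\([^(]*\\) *", "")
                .replaceAll("\\[[^\\[]*\\]", "");
        return cleaned.split("\\.( +[A-Z0-9]|$)")[0];
    }

    // Mirrors the negated do/while condition: a sentence is kept only if it does not
    // end with ":", is at least MIN_LENGTH characters long, and contains no "?".
    static boolean isAcceptable(String sentence) {
        return !sentence.endsWith(":")
                && sentence.length() >= MIN_LENGTH
                && !sentence.contains("?");
    }

    public static void main(String[] args) {
        // Made-up sample paragraph, not taken from any real article.
        String intro = "Example Widget (also known as EW) is a fictional device "
                + "used only to illustrate this sketch.[1][2] It has no other purpose.";
        String sentence = firstSentence(intro);
        if (isAcceptable(sentence)) {
            System.out.println(sentence + ".");
        }
    }
}
```

Note that the "St." in the Old Bethel example survives because its period is followed by a comma rather than by spaces and a capital letter or digit, so the split pattern does not treat it as a boundary.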