Hi, I tried opening the link below in a browser and it works, but it does not work from my code. The link is built by combining the news site's address with the article path, which is read from another file, url.txt. When I run the same code against a normal website (www.google.com), it works perfectly.

    import sys
    import MySQLdb
    from mechanize import Browser
    from bs4 import BeautifulSoup, SoupStrainer
    from nltk import word_tokenize
    from nltk.tokenize import *
    import urllib2
    import nltk, re, pprint
    import mechanize  #html form filling
    import lxml.html

    with open("url.txt","r") as f:
        first_line = f.readline()
    #print first_line
    url = "http://channelnewsasia.com/&s" + (first_line)
    t = lxml.html.parse(url)
    print t.find(".//title").text

And this is the error I am getting.
And this is the content of url.txt:
/news/asiapacific/australia-to-send-armed/1284790.html
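
For reference, here is a minimal sketch of how the URL could be assembled, under two assumptions that the post does not confirm: that the "&s" in the base string is unintended, and that the trailing newline kept by readline() is unwanted in the URL. The helper name build_article_url is made up for illustration. It only builds the string and does not fetch anything.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2, as used in the post


def build_article_url(base, path_line):
    """Join a site address and an article path read from a file.

    `path_line` is the raw line from the file; readline() keeps the
    trailing newline, so it is stripped before joining.
    """
    path = path_line.strip()
    return urljoin(base, path)


# Assumed base address: the one in the post, without the "&s" part.
BASE = "http://channelnewsasia.com/"

# Same content as url.txt in the post, including the newline from readline().
example_line = "/news/asiapacific/australia-to-send-armed/1284790.html\n"
example_url = build_article_url(BASE, example_line)
```

Printing repr(url) in the original script would show whether a newline or the "&s" ends up in the string that is passed to lxml.html.parse.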
