0

I want to access a web page and store its contents in a database. This is the code I have tried for reading the contents of the web page:

    public static WebClient wClient = new WebClient();
    public static TextWriter textWriter;

    public static String readFromLink()
    {
        string url = "http://www.ncedc.org/cgi-bin/catalog-search2.pl";
        HttpWebRequest webRequest = WebRequest.Create(url) as HttpWebRequest;
        webRequest.Method = "POST";
        System.Net.WebClient client = new System.Net.WebClient();
        byte[] data = client.DownloadData(url);
        string html = System.Text.Encoding.UTF8.GetString(data);
        return html;
    }

    public static bool WriteTextFile(String fileName, String t)
    {
        try
        {
            textWriter = new StreamWriter(fileName);
        }
        catch (Exception)
        {
            return false;
            Console.WriteLine("Data Save Unsuccessful: Could Not create File");
        }
        try
        {
            textWriter.WriteLine(t);
        }
        catch (Exception)
        {
            return false;
            Console.WriteLine("Data Save UnSuccessful: Could Not Save Data");
        }
        textWriter.Close();
        return true;
        Console.WriteLine("Data Save Successful");
    }

    static void Main(string[] args)
    {
        String saveFile = "E:/test.txt";
        String reSultString = readFromLink();
        WriteTextFile(saveFile, reSultString);
        Console.ReadKey();
    }

But this code gives me the following output: This script should be referenced with a METHOD of POST. REQUEST_METHOD=GET

Please tell me how to resolve this.

1
  • 1
    You've invented a 24 line method that does what File.WriteAllText does. And lines of the first method don't even matter... Commented Jul 9, 2013 at 4:32

4 Answers

3

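You are mixing HttpWebRequest code with System.Net.WebClient code. They are different APIs: the HttpWebRequest you configure with Method = "POST" is never sent, and WebClient.DownloadData issues a GET. To send a POST with WebClient, use WebClient.UploadValues. You will also need to provide some POST data: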

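    // NameValueCollection requires: using System.Collections.Specialized;
    // url is the POST address from the question.
    string url = "http://www.ncedc.org/cgi-bin/catalog-search2.pl";
    System.Net.WebClient client = new System.Net.WebClient();
    NameValueCollection postData = new NameValueCollection();
    postData.Add("format", "ncread");
    postData.Add("mintime", "2002/01/01,00:00:00");
    postData.Add("minmag", "3.0");
    postData.Add("etype", "E");
    postData.Add("outputloc", "web");
    postData.Add("searchlimit", "100000");
    // UploadValues sends the collection as a form-encoded POST body.
    byte[] data = client.UploadValues(url, "POST", postData);
    string html = System.Text.Encoding.UTF8.GetString(data);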

You can find out what parameters to pass by inspecting the POST message in Fiddler. And yes, as commented by @Chris Pitman, use File.WriteAllText(path, html);
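Putting the two suggestions together, a minimal sketch of the whole program might look like the following. The class and method names (CatalogDownloader, BuildSearchParameters, PostSearch) are made up for illustration, the form fields are the ones shown above and may not match the search you need, and the UTF-8 decoding is an assumption about the server's response.

```csharp
using System;
using System.Collections.Specialized;
using System.IO;
using System.Net;
using System.Text;

public static class CatalogDownloader
{
    // POST endpoint taken from the question.
    private const string SearchUrl = "http://www.ncedc.org/cgi-bin/catalog-search2.pl";

    // Form fields copied from the answer above; adjust them to your search.
    public static NameValueCollection BuildSearchParameters()
    {
        var postData = new NameValueCollection();
        postData.Add("format", "ncread");
        postData.Add("mintime", "2002/01/01,00:00:00");
        postData.Add("minmag", "3.0");
        postData.Add("etype", "E");
        postData.Add("outputloc", "web");
        postData.Add("searchlimit", "100000");
        return postData;
    }

    // Sends the fields as a form-encoded POST and returns the response body.
    public static string PostSearch(string url, NameValueCollection postData)
    {
        using (var client = new WebClient())
        {
            byte[] data = client.UploadValues(url, "POST", postData);
            // Assumption: the server answers in UTF-8.
            return Encoding.UTF8.GetString(data);
        }
    }

    public static void Main(string[] args)
    {
        string saveFile = "E:/test.txt"; // path from the question
        try
        {
            string html = PostSearch(SearchUrl, BuildSearchParameters());
            // Replaces the hand-written WriteTextFile method from the question.
            File.WriteAllText(saveFile, html);
            Console.WriteLine("Data save successful");
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: " + ex.Message);
        }
        catch (IOException ex)
        {
            Console.WriteLine("Could not save data: " + ex.Message);
        }
    }
}
```

Unlike the original WriteTextFile, the status messages here are printed before control leaves the method, so they are actually reachable.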


6 Comments

Thanks :) But I have one more question: should I POST directly to the URL in the code, or do I have to POST to the page before it (ncedc.org/anss/catalog-search.html)?
You should use the one in the code, that is the POST url. This one (ncedc.org/anss/catalog-search.html), is the GET page where the values are compiled in a form and sent to the POST page. Are you having problems getting the POST response?
Thanks, I got the solution :) Now the content I am getting is all mixed in with HTML tags. How do I get only the required content and store it in a file (e.g. CSV)?
Use HtmlAgilityPack to parse HTML.
@fcuesta Can you post some code showing how to parse an HTML document? Can the same approach be used to parse text documents?
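As a rough sketch of the HtmlAgilityPack suggestion: the code below assumes the search results sit inside a single pre element as whitespace-separated columns, which has not been checked against the real page. HtmlAgilityPack is a third-party NuGet package, and ResultExtractor, ExtractPreText and ToCsv are hypothetical names.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // third-party package, install via NuGet

public static class ResultExtractor
{
    // Returns the text of the first <pre> element, or an empty string if none exists.
    // Assumption: the results are delivered inside a <pre> block.
    public static string ExtractPreText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode pre = doc.DocumentNode.SelectSingleNode("//pre");
        return pre == null ? string.Empty : HtmlEntity.DeEntitize(pre.InnerText);
    }

    // Naive conversion: splits each non-empty line on whitespace and joins the
    // fields with commas. Fields that themselves contain commas are not quoted.
    public static string ToCsv(string text)
    {
        var lines = text
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => string.Join(",",
                line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries)));
        return string.Join(Environment.NewLine, lines.ToArray());
    }
}
```

A plain text document needs no HTML parser at all: pass its contents straight to ToCsv, or split the lines yourself.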
0

I'm not sure if it's a fault on your side as I get the same message just by opening the page. The page source does not contain any html so I don't think you can do webRequest.Method = "POST". Have you spoken to the administrators of the site?


0

The .NET framework provides a rich set of methods to access data stored on the web. First you will have to include the right namespaces:

    using System.Text;
    using System.Net;
    using System.IO;

The HttpWebRequest object allows us to create a request to the URL, and the WebResponse allows us to read the response to the request.

We’ll use a StreamReader object to read the response into a string variable.

    HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(URL);
    myRequest.Method = "GET";
    WebResponse myResponse = myRequest.GetResponse();
    StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
    string result = sr.ReadToEnd();
    sr.Close();
    myResponse.Close();

In this code sample, the URL variable should contain the URL that you want to get, and the result variable will contain the contents of the web page. You may want to add some error handling as well for a real application.
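Since the page in the question expects a POST, here is a hedged sketch of how the same HttpWebRequest approach could send form data instead of a GET. FormPoster, BuildFormBody and Post are illustrative names, and the UTF-8 response encoding is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class FormPoster
{
    // Encodes key/value pairs as an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var field in fields)
        {
            parts.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", parts.ToArray());
    }

    // Sends the body as a POST and returns the response text.
    public static string Post(string url, string formBody)
    {
        byte[] payload = Encoding.UTF8.GetBytes(formBody);

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = payload.Length;

        // The form data travels in the request body, not in the URL.
        using (Stream body = request.GetRequestStream())
        {
            body.Write(payload, 0, payload.Length);
        }

        // Assumption: the server answers in UTF-8.
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

GetResponse throws a WebException for network failures and HTTP error statuses, so a real application would wrap the call to Post in a try/catch for that exception.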

1 Comment

That's nice information and all, but it's pretty obvious that the issue is that he is doing a GET instead of a POST to get some search results, and your answer does nothing to help with that.
0

As far as I can see, the URL you're requesting is a Perl script. I think it requires a POST carrying the search arguments, and only then returns search results.

