
Hi all. For a requirement I have, I would like to extract data from this site:

http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902

I would like to extract the data that is presented in the grid. How can I do that? Can anyone help me?

I tried this

    using System.IO;
    using System.Net;

    WebRequest request = WebRequest.Create("http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902");
    string html;
    // Dispose the response and the reader so the connection is released.
    using (WebResponse response = request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream()))
    {
        html = sr.ReadToEnd();
    }

enter image description here

The grid data I would like to extract is shown in the image. Please help.

  • Why don't you simply ask them where they get the data from and request it through an API that they might have? Otherwise you will probably run into legal issues here... Commented Nov 7, 2011 at 10:20
  • There is an Export To Excel action button. I believe you can export to Excel and then parse the table, or, more straightforwardly, read the whole HTML page and parse it by finding the specific table tag. Commented Nov 7, 2011 at 10:21
  • I am unable to find the specified tag when I view source. Commented Nov 7, 2011 at 10:22
  • How about using Html Agility Pack for this? Commented Nov 7, 2011 at 10:23

2 Answers


Use WebClient.DownloadString("http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902") to get the data from the server.
Then use HTMLAgilityPack to parse the HTML.
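
As a rough illustration of the parsing step, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The names GridScraper, ExtractRows and tableXPath are hypothetical, and the XPath you need depends on the real page markup, which has not been inspected here. If the grid is rendered by script or inside a frame (the asker could not find the table tag in the page source), the downloaded HTML will not contain it and this approach will return no rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package "HtmlAgilityPack"

public static class GridScraper
{
    // Hypothetical helper: returns each <tr> under the tables matched by
    // tableXPath as a list of trimmed cell texts.
    // Note: "//tr" also matches rows of nested tables, so narrow the XPath
    // (for example by id) once you know the real markup.
    public static List<List<string>> ExtractRows(string html, string tableXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var trNodes = doc.DocumentNode.SelectNodes(tableXPath + "//tr");
        if (trNodes == null) return rows;

        foreach (var tr in trNodes)
        {
            var cells = tr.SelectNodes("./th|./td");
            if (cells == null) continue;

            // Decode entities such as &amp; and strip surrounding whitespace.
            rows.Add(cells
                .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToList());
        }
        return rows;
    }
}
```

To use it, pass the string returned by WebClient.DownloadString (or the html variable from the question) together with an XPath such as "//table", then iterate over the returned rows.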


1 Comment

As @Oded wrote, you can do it all with Agility Pack: first download the page, then extract the data with XPath.

The straightforward way is to download the page and parse the HTML by locating the appropriate <table> tags, but then your "parser" has to be updated every time the HTML layout changes.

Another way is to use the "Export To..." feature that the site kindly provides: simulate the HTTP request sent by the "Export to Excel 2007" button. An Excel 2007 workbook (XLSX) is a ZIP archive of XML files (worksheet data, styles and so on), so you can load the data as one or more well-formed XML files.

Underlying URL:

http://loving1.tea.state.tx.us/Common.Cognos/Secured/ReportViewer.aspx?reportSearchPath=/content/folder[@name='TPEIR']/folder[@name='LS']/package[@name='Districts and Schools']/report[@name='AAG5_Dist_Over']&ui.name=AAG5_Dist_Over&year=2010&district=101902&server=Loving1.tea.state.tx.us/lonestar

Then download the XLSX file, which is a ZIP archive with embedded XML files:

  • xl\worksheets\Sheet1.xml
  • xl\workbook.xml

so just unzip, load XML and enjoy it...
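
The unzip-and-load step could look roughly like the sketch below. It assumes .NET 4.5 or later (for System.IO.Compression.ZipArchive) and that the export is a standard SpreadsheetML workbook; XlsxReader and ReadSheet are hypothetical names, and the default sheet path is an assumption to check against the actual archive. Text cells in an XLSX usually store an index into xl/sharedStrings.xml rather than the text itself, so the sketch resolves those as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class XlsxReader
{
    // Main SpreadsheetML namespace used by worksheet and shared-string parts.
    static readonly XNamespace Ns =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Loads one archive entry as XML; names are matched case-insensitively
    // because generators differ in how they case entries such as Sheet1.xml.
    static XDocument LoadEntry(ZipArchive zip, string name)
    {
        var entry = zip.Entries.FirstOrDefault(e =>
            string.Equals(e.FullName, name, StringComparison.OrdinalIgnoreCase));
        if (entry == null) return null;
        using (var s = entry.Open()) return XDocument.Load(s);
    }

    // Hypothetical helper: returns the sheet as rows of cell strings.
    // Empty cells are omitted in the file, so positions may not line up
    // with column letters; use the "r" attribute if exact columns matter.
    public static List<List<string>> ReadSheet(
        Stream xlsx, string sheetPath = "xl/worksheets/sheet1.xml")
    {
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            var shared = new List<string>();
            var sst = LoadEntry(zip, "xl/sharedStrings.xml");
            if (sst != null)
                shared = sst.Root.Elements(Ns + "si")
                    .Select(si => string.Concat(si.Descendants(Ns + "t").Select(t => t.Value)))
                    .ToList();

            var sheet = LoadEntry(zip, sheetPath);
            if (sheet == null)
                throw new FileNotFoundException("Worksheet not found in archive", sheetPath);

            var rows = new List<List<string>>();
            foreach (var row in sheet.Descendants(Ns + "row"))
            {
                var cells = new List<string>();
                foreach (var c in row.Elements(Ns + "c"))
                {
                    string type = (string)c.Attribute("t");
                    string raw = (string)c.Element(Ns + "v");
                    if (type == "s" && raw != null)
                        cells.Add(shared[int.Parse(raw)]);      // shared-string index
                    else if (type == "inlineStr")
                        cells.Add(string.Concat(c.Descendants(Ns + "t").Select(t => t.Value)));
                    else
                        cells.Add(raw ?? "");                    // number or other raw value
                }
                rows.Add(cells);
            }
            return rows;
        }
    }
}
```

Open the downloaded file with File.OpenRead and pass the stream to ReadSheet; each returned row is a list of cell values as strings.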

4 Comments

@sll So, as you say, I have to save the XLSX and then read the required content from it, right?
@Vivekh: yep, basically 1) save the XLSX 2) unzip it (use any of these libs) 3) load the required XML files
Hi sll, a small question: will I get the XLSX from the link you posted?
You can use Fiddler to see the exact URL while downloading the file (I believe you can see the Export button at the top of the page)
