
I'm working on a project that connects to a grade viewer and pulls the HTML from the site. However, something seems to get lost along the way. I'm connecting to the page and printing the page source with a Selenium WebDriver, yet the HTML it retrieves is slightly different from the HTML I can see on the page: little chunks are missing here and there. Here is my code:

    // Switch into the iframe that holds the grades
    driver.switchTo().frame(driver.findElement(By.id("sg-legacy-iframe")));
    WebElement full = driver.findElement(By.id("btnView"));
    full.submit(); // click the "show full view" button

    // Print out the source
    PrintWriter pw = new PrintWriter(new FileWriter(new File("grades.txt")));
    pw.println(driver.getCurrentUrl()); // confirms the driver is on the correct page
    pw.println(driver.getPageSource()); // prints out the HTML
    pw.close();

I suspect it's some sort of cookie issue when switching between the page and the iframe, but I really have no idea. I also have a copy of the correct HTML it should be fetching and of its actual output, but they're too large to fit in the body, so here are links to the expected HTML and the actual output, with any confidential info changed. The main issue is that the "AssignmentClass" divs are not being found.

Desired HTML Output(HTML of the site)

HTML being output by my program

If anyone can shed some light on why this could be happening or how to fix it, I would love you forever.

  • How do you compare the HTML? And how do you get the HTML into the txt file? Are they the same version? Commented Mar 2, 2015 at 0:52
  • For the desired output, I just connected to the site and copy-pasted the HTML into a text file. The actual output was all produced by Java. They are very similar; I just scroll through them side by side to compare them. The documents start to differ shortly after the "<form method="post" action="Assignments.aspx" id="fmMain">" line. Commented Mar 2, 2015 at 1:03
  • I'm afraid the versions of the HTML you are comparing are not the same. As I understand it, the HTML from the page source and the HTML from the client will not necessarily be the same. Commented Mar 2, 2015 at 1:09
  • It's the same version of the HTML; they come from the exact same page. It's just that mine is a subsequence of the desired output. Commented Mar 2, 2015 at 1:15
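
As an aside on the side-by-side comparison described in the comments above, the point where two dumps start to differ can also be located programmatically. The following is only a minimal sketch using the Java standard library; the class name HtmlDiff, the method firstDifference, and the sample lines in main are hypothetical and not part of the original code.

```java
import java.util.List;

public class HtmlDiff {
    /**
     * Returns the zero-based index of the first line at which the two
     * documents differ, or -1 if they are identical line for line.
     */
    public static int firstDifference(List<String> expected, List<String> actual) {
        int shared = Math.min(expected.size(), actual.size());
        for (int i = 0; i < shared; i++) {
            // Trim so that indentation-only differences are ignored.
            if (!expected.get(i).trim().equals(actual.get(i).trim())) {
                return i;
            }
        }
        // If one document is a prefix of the other, they diverge where the shorter one ends.
        return expected.size() == actual.size() ? -1 : shared;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the two real HTML dumps.
        List<String> expected = List.of(
                "<form id=\"fmMain\">",
                "<div class=\"AssignmentClass\">",
                "</form>");
        List<String> actual = List.of(
                "<form id=\"fmMain\">",
                "</form>");
        System.out.println(firstDifference(expected, actual));
    }
}
```

For real files, the two lists could be loaded with Files.readAllLines from java.nio.file, for example from grades.txt and from a saved copy of the page's HTML.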

1 Answer

In one of my past projects, I used getAttribute() to get the HTML source. Have you already tried something along these lines?

    driver.switchTo().frame(driver.findElement(By.id("sg-legacy-iframe")));
    WebElement full = driver.findElement(By.id("btnView"));
    full.submit(); // click the "show full view" button
    WebElement body = driver.findElement(By.tagName("body"));

    // Print out the source
    PrintWriter pw = new PrintWriter(new FileWriter(new File("grades.txt")));
    pw.println(driver.getCurrentUrl()); // confirms the driver is on the correct page
    pw.println(body.getAttribute("innerHTML")); // prints out the HTML
    pw.close();