
Java jsoup: why is the original HTML converted into additional encoded values?


<!DOCTYPE html>
<html xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" class="SAF" id="global-header-light">
 <head></head>
 <body>
  <div style="background-image: url(http://aka-cdn-ns.adtech.de/rm/ads/23274/HPWomenLOFT_1381687318.jpg);background-repeat: no-repeat;-webkit-background-size: 1001px 2059px; height: 2059px; width: 1001px; text-align: center; margin: 0 auto;">
   <div style="height:2058px; padding-left:0px; padding-top:36px;">
    <iframe style="height:90px; width:728px;"></iframe>
   </div>
  </div>
 </body>
</html>
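import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Sample markup as a Java string
String html = "<!DOCTYPE html>" +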
                "<html xmlns:og=\"http://opengraphprotocol.org/schema/\" xmlns:fb=\"http://www.facebook.com/2008/fbml\" xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\" class=\"SAF\" id=\"global-header-light\">" +
                "<head></head>" +
                "<body>" +
                "<div style=\"background-image: url(http://aka-cdn-ns.adtech.de/rm/ads/23274/HPWomenLOFT_1381687318.jpg);background-repeat: no-repeat;-webkit-background-size: 1001px 2059px; height: 2059px; width: 1001px; text-align: center; margin: 0 auto;\">" +
                "<div style=\"height:2058px; padding-left:0px; padding-top:36px;\">" +
                "<iframe style=\"height:90px; width:728px;\" /></div></div></body></html>";

// Parse with the XML parser instead of the default HTML parser
Document doc = Jsoup.parse(html, "", Parser.xmlParser());
System.out.println(doc);
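String url = request.getParameter("htmluri").trim();
System.out.println("Fetching " + url + "...");
// Fetch the raw response body as a String; get() would already run it through the HTML parser
String xml = Jsoup.connect(url).execute().body();
// Parse the raw markup with the XML parser so jsoup does not restructure it
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());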

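If you parse the markup with Parser.xmlParser() instead of the default HTML parser, jsoup won't add the additional values. For example, the snippet above that builds the html string and calls Jsoup.parse(html, "", Parser.xmlParser()) prints the document without them.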
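To see the difference side by side, here is a minimal sketch that runs the same fragment through both parsers. The class name ParserComparison and the helpers parseAsHtml and parseAsXml are hypothetical names used only for this illustration, and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.parser.Parser;

// Hypothetical helper class, for illustration only.
class ParserComparison {

    // Default HTML parser: normalizes the input into a full HTML document
    // (adds html/head/body and applies HTML parsing rules).
    static String parseAsHtml(String markup) {
        return Jsoup.parse(markup).outerHtml();
    }

    // XML parser: builds the tree from the tags as written, without
    // adding a surrounding html/head/body structure.
    static String parseAsXml(String markup) {
        return Jsoup.parse(markup, "", Parser.xmlParser()).outerHtml();
    }

    public static void main(String[] args) {
        // Shortened version of the markup from the question.
        String fragment =
            "<div><iframe style=\"height:90px; width:728px;\" /></div>";

        System.out.println("HTML parser:");
        System.out.println(parseAsHtml(fragment));
        System.out.println("XML parser:");
        System.out.println(parseAsXml(fragment));
    }
}
```

The HTML parser does not treat the self-closing iframe as closed, so the markup after it can end up as escaped text inside the iframe. That is most likely where the additional encoded values come from. The XML parser honors the self-closing tag and leaves the rest alone.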

You could first fetch the remote file as a String and then use the rest of my code as normal, as in the snippet above that starts from request.getParameter("htmluri").

Can you let me know how you would modify this code to match your suggestion? String url = request.getParameter("htmluri").trim(); System.out.println("Fetching %s..."+url); Document doc = Jsoup.connect(url).get();
