
Java: input values missing when parsing HTML with Jsoup?


Reading from the URL directly
Saving the browser (Chrome) content to an HTML file and reading from that file
http://www.4shared.com/get/i-EbooI0/batman_hd.html
public static void main(String[] args) throws IOException
{

    try
    {
        Map<String, String> cookieMap = new HashMap<String, String>();
        cookieMap.put("day1host", "h");
        cookieMap.put("d1.loginity.mark", "1");
        cookieMap.put("hostid", "-1314014314");
        cookieMap.put("__qca", "P0-2042580316-1371938383086");
        cookieMap.put("cd1v", "OOhB");
        cookieMap.put("c29", "1");
        cookieMap.put("__utma", "210074320.280144312.1371938377.1371938377.1371938377.1");
        cookieMap.put("__utmb", "210074320.4.10.1371938377");
        cookieMap.put("__utmc", "210074320");
        cookieMap.put("__utmz", "210074320.1371938377.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)");


        Document document = Jsoup.connect("http://www.4shared.com/get/i-EbooI0/batman_hd.html")
        .userAgent("Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36")
        .followRedirects(true)
        .cookies(cookieMap)
        .get();
        //System.out.println(document.html());
        //System.out.println("====================================================================");
        Elements elements = document.select("input[type=hidden]");
        for (Iterator<Element> iterator = elements.iterator(); iterator.hasNext();)
        {
            Element element = iterator.next();
            System.out.println(element);

        }
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }

}

@user2489210 That field is not listed in the HTML source when viewed in Chrome either.

Because the specific hidden field I am looking for is omitted from the HTML that Jsoup fetches. I would like to get all the hidden fields, but I am specifically looking for this one: <input type="hidden" id="baseDownloadLink" value="dc611.4shared.com/download/i-EbooI0/; This is NOT listed in my Jsoup output.

Here you go, try this out. I'll explain things if it works!

I need to get every "hidden" value in this document: view-source:4shared.com/get/i-EbooI0/batman_hd.html. This is Chrome's source view of the page. Jsoup does not give me all the hidden values. What am I missing? I previously tried Doc.html() and it did not help.

If you observe the same behavior for other URLs as well, then you have to write code that captures the cookies of each response and passes them in the subsequent request until you get the desired hidden fields.

I'm not sure whether the pattern below is the same for all the URLs you are trying.

I'm performing Step 3 directly in the code.

Jsoup cleans up your HTML content while parsing, and it can handle HTML that is not well-formed. Try dumping the HTML after parsing, i.e. Document.html(), and check in the dump whether the discarded elements would match your select clause.

No hidden fields in the <body> yet. Confirm this by looking at the Elements tab.

Now you have the required Hidden fields in the <body>.

There is a site redirect from /get/i-EbooI0/batman_hd.html to android/i-EbooI0/batman_hd.html. During the redirect, the server sends two cookies in response to the first request.
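
One way to capture the cookies sent during a redirect like this is to follow the redirect manually instead of letting the client do it, and to carry every cookie seen so far into the next request. The sketch below shows only that loop, using the standard library. Hop, Result, follow and the stub in main are hypothetical names introduced for illustration, the cookie names cookieA and cookieB are placeholders (the two real cookie names are not given here), and the leading slash on the redirect target is an assumption.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

final class RedirectCookieSketch {

    /** Hypothetical minimal view of one HTTP response (not a Jsoup type). */
    public static final class Hop {
        public final int status;                   // HTTP status code, e.g. 302 or 200
        public final String location;              // Location header, or null if absent
        public final Map<String, String> cookies;  // cookies set by this response

        public Hop(int status, String location, Map<String, String> cookies) {
            this.status = status;
            this.location = location;
            this.cookies = cookies;
        }
    }

    /** The URL that finally answered without redirecting, plus all cookies collected. */
    public static final class Result {
        public final String finalUrl;
        public final Map<String, String> cookies;

        public Result(String finalUrl, Map<String, String> cookies) {
            this.finalUrl = finalUrl;
            this.cookies = cookies;
        }
    }

    /**
     * Follows redirects by hand. The fetch function receives the URL and the cookies
     * collected so far, and must perform ONE request without following redirects.
     */
    public static Result follow(String startUrl,
                                BiFunction<String, Map<String, String>, Hop> fetch,
                                int maxHops) {
        Map<String, String> jar = new LinkedHashMap<>();
        String url = startUrl;
        for (int i = 0; i < maxHops; i++) {
            // Send a copy of every cookie collected so far with this request.
            Hop hop = fetch.apply(url, new LinkedHashMap<>(jar));
            jar.putAll(hop.cookies);
            boolean redirect = hop.status >= 300 && hop.status < 400 && hop.location != null;
            if (!redirect) {
                return new Result(url, jar);
            }
            // The Location header may be relative, so resolve it against the current URL.
            url = URI.create(url).resolve(hop.location).toString();
        }
        throw new IllegalStateException("Too many redirects from " + startUrl);
    }

    public static void main(String[] args) {
        // Stub standing in for real HTTP calls: the first hop redirects and sets two
        // placeholder cookies, the second hop answers 200.
        BiFunction<String, Map<String, String>, Hop> stub = (url, jar) -> {
            if (url.endsWith("/get/i-EbooI0/batman_hd.html")) {
                Map<String, String> set = new LinkedHashMap<>();
                set.put("cookieA", "1");
                set.put("cookieB", "2");
                return new Hop(302, "/android/i-EbooI0/batman_hd.html", set);
            }
            return new Hop(200, null, new LinkedHashMap<>());
        };
        Result r = follow("http://www.4shared.com/get/i-EbooI0/batman_hd.html", stub, 5);
        System.out.println(r.finalUrl + " " + r.cookies);
    }
}
```

With Jsoup, the fetch function could plausibly wrap Jsoup.connect(url).cookies(jar).followRedirects(false).execute() and build a Hop from the response's statusCode(), header("Location") and cookies(), with the IOException from execute() handled inside the lambda. The final request would then pass the collected cookies to the page that contains the hidden fields.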

When using Inspect Element, it is. Will post a screenshot.

Note