How to get the body content from a PHP-generated HTML page?

Squids4Life

I am trying to get the content of an HTML page, using this code:

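import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

String malSearch = "http://myanimelist.net/anime.php?letter=" + firstLetter;
URL url = new URL(malSearch);
URLConnection con = url.openConnection();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int len;
// try-with-resources closes the stream even if reading fails
try (InputStream in = con.getInputStream()) {
    while ((len = in.read(buf)) != -1) {
        baos.write(buf, 0, len);
    }
}
// getContentEncoding() returns the Content-Encoding header (e.g. "gzip"),
// not the character set, so the bytes are decoded as UTF-8 directly.
String body = new String(baos.toByteArray(), StandardCharsets.UTF_8);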

It works fine, but it doesn't give me what I really want. It gives me this:

<html>
 <head>
  <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
  <meta name="format-detection" content="telephone=no">
  <meta name="viewport" content="initial-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
 </head>
 <body style="margin:0px">
  <iframe src="/_Incapsula_Resource?CWUDNSAI=9&xinfo=6-122029399-0 0NNN RT(1404149034204 2) q(0 -1 -1 -1) r(0 -1) B12(4,315,0) U1&incident_id=124001330081285077-564449081699338326&edet=12&cinfo=4ee46646c753833e04000000" frameborder=0 width="100%" height="100%" marginheight="0px" marginwidth="0px">Request unsuccessful. Incapsula incident ID: 124001330081285077-564449081699338326</iframe>
 </body>
</html>

when it should give me the whole page (approximately 800 lines).

I think it's because the website uses PHP, but I'm not really sure. Can someone tell me how to get the whole HTML content?

Here's the page I'm trying to get the content from: http://myanimelist.net/anime.php?letter=A

ZigZag_IL

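This site uses a service called Incapsula. The website admins configured Incapsula to prevent bots from accessing its content, so your program receives Incapsula's "Request unsuccessful" page instead of the real one. The fact that the site runs on PHP is not the cause.

I suggest you contact the admins and ask to be whitelisted. Trying to bypass the system will likely get you banned and blacklisted.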
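
In the meantime, your program can at least recognise when it has been served the block page instead of the real content. The following is only a minimal sketch: the class and method names (BlockPageCheck, looksLikeIncapsulaBlock) are made up for illustration, and the two markers are simply the strings visible in the response quoted in the question, so they are an assumption about what the block page contains and may not cover every variant.

```java
import java.util.Locale;

/** Hypothetical helper; the class and method names are illustrative only. */
final class BlockPageCheck {

    // Markers copied from the blocked response shown in the question (assumed stable).
    private static final String[] MARKERS = {
        "_incapsula_resource",
        "incapsula incident id"
    };

    /** Returns true if the body looks like the Incapsula block page. */
    static boolean looksLikeIncapsulaBlock(String body) {
        if (body == null) {
            return false;
        }
        // Compare case-insensitively, since HTML casing can vary.
        String lower = body.toLowerCase(Locale.ROOT);
        for (String marker : MARKERS) {
            if (lower.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Shortened sample strings, not real responses.
        String blocked = "<iframe src=\"/_Incapsula_Resource?CWUDNSAI=9\">"
                + "Request unsuccessful. Incapsula incident ID: 123</iframe>";
        String normal = "<html><body><h1>Anime list</h1></body></html>";
        System.out.println(looksLikeIncapsulaBlock(blocked)); // prints true
        System.out.println(looksLikeIncapsulaBlock(normal));  // prints false
    }
}
```

Calling looksLikeIncapsulaBlock(body) right after building the body string lets the program stop and report the block instead of trying to parse the placeholder page.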
