I wrote a regular expression, applied to a String that holds JSON data, that finds all the images in that data:
Pattern pattern = Pattern.compile("<a[^>]*>");
Matcher matcher = pattern.matcher(contentString.toString());
while (matcher.find()) {
    Log.i(TAG, "MATCHER : " + matcher.group());
}
It returns:
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a rel="prettyPhoto[gallery-113]" href='http://www.bundoransurfco.com/wp-content/uploads/2014/11/april-13.jpg'>
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a rel="prettyPhoto[gallery-113]" href='http://www.bundoransurfco.com/wp-content/uploads/2014/11/10501752_10152650053307000_6249740615573255728_n1.jpg'>
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.windguru.cz/int/index.php?sc=103244">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.xcweather.co.uk/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.buoyweather.com/wxnav6.jsp?region=UK&program=nww3BW1&grb=nww3&latitude=55.0&longitude=-8.75&zone=0&units=e">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.windguru.cz/int/index.php?sc=103244">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.xcweather.co.uk/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.buoyweather.com/wxnav6.jsp?region=UK&program=nww3BW1&grb=nww3&latitude=55.0&longitude=-8.75&zone=0&units=e">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://magicseaweed.com/Bundoran-Surf-Report/50/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://magicseaweed.com/UK-Ireland-MSW-Surf-Charts/1/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.marine.ie/Home/site-area/data-services/marine-forecasts/wave-forecasts">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://magicseaweed.com/Bundoran-Surf-Report/50/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://magicseaweed.com/UK-Ireland-MSW-Surf-Charts/1/">
04-13 16:33:57.279 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.marine.ie/Home/site-area/data-services/marine-forecasts/wave-forecasts">
04-13 16:33:57.280 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://news.bbc.co.uk/weather/forecast/13000">
04-13 16:33:57.280 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.met.ie/">
04-13 16:33:57.280 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://news.bbc.co.uk/weather/forecast/13000">
04-13 16:33:57.280 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.met.ie/">
04-13 16:33:57.280 3642-3657/jardelcompany.bundoransurfco I/MainActivity﹕ MATCHER : <a href="http://www.irishtimes.com/weather/tides.html">
But now I want to count the links that start with <a rel="prettyPhoto[gallery-113]" and store their href values in an array.
Do you have any ideas? Cheers!
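You shouldn't use a regular expression to parse HTML; use a proper parser instead. There are many reasons for this. One is attribute order: rel and href can appear in either order, so you may encounter an element like <a href="..." rel="...">, which is easily skipped if your regex only describes <a rel="..." href="...">. Another is quoting: attribute values can be surrounded by " but also by ', which adds a further risk of skipping important data. I like to use jsoup, so here is an example with that library: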
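import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// htmlText is the String that holds the HTML to parse
Document doc = Jsoup.parse(htmlText);
// select every <a> element that has a rel attribute
Elements relLinks = doc.select("a[rel]");
// or, if you want only rel with the value "prettyPhoto[gallery-113]", use
// Elements relLinks = doc.select("a[rel=prettyPhoto[gallery-113]]");
System.out.println("number of `rel`: " + relLinks.size());
for (Element el : relLinks) {
    System.out.println(el.attr("href"));
}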
Output:
number of `rel`: 2
http://www.bundoransurfco.com/wp-content/uploads/2014/11/april-13.jpg
http://www.bundoransurfco.com/wp-content/uploads/2014/11/10501752_10152650053307000_6249740615573255728_n1.jpg
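To illustrate the fragility described above, here is a minimal sketch that uses only the standard java.util.regex classes. The class name RelLinkDemo, the STRICT pattern, and the example.com URLs are hypothetical and chosen only for this illustration. The pattern assumes rel comes first, in double quotes, followed by href in single quotes, exactly as in the log output of the question. It is run against three links that differ only in attribute order or quote style:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RelLinkDemo {
    // Hypothetical strict pattern: rel first (double quotes), then href (single quotes).
    static final Pattern STRICT = Pattern.compile(
            "<a rel=\"prettyPhoto\\[gallery-113\\]\" href='([^']*)'>");

    // Collects the href value (capture group 1) of every tag the strict pattern matches.
    static List<String> extractHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = STRICT.matcher(html);
        while (m.find()) {
            hrefs.add(m.group(1));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Three gallery links; only the first has the exact shape the pattern expects.
        String html =
                "<a rel=\"prettyPhoto[gallery-113]\" href='http://example.com/a.jpg'>"
              + "<a href=\"http://example.com/b.jpg\" rel=\"prettyPhoto[gallery-113]\">"
              + "<a rel='prettyPhoto[gallery-113]' href=\"http://example.com/c.jpg\">";

        // Store the matched href values in an array, as the question asks.
        String[] links = extractHrefs(html).toArray(new String[0]);
        System.out.println("matched " + links.length + " of 3 links");
        for (String link : links) {
            System.out.println(link);
        }
    }
}
```

Only the first link is matched. The second is skipped because href precedes rel, and the third because the quote characters are swapped. A selector such as a[rel] in a real HTML parser does not depend on either detail.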