Parse log with regular expression

Jacob

I'm looking for a way to parse a Varnish log file. A line looks like this:

178.232.38.87 - - [23/May/2012:14:01:05 +0200] "GET http://static.vg.no/iphone/js/front-min.js?20120509-1 HTTP/1.1" 200 2013 "http://touch.vg.no/" "Mozilla/5.0 (Linux; U; Android 2.3.3; en-no; HTC Nexus One Build/GRI40) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"

The following elements can be distinguished:

%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"

but I still have no idea how to do this. A simple String.split(" "); won't work, because the timestamp and the quoted fields contain spaces themselves.

I know regular expressions follow general rules, but a Java solution would suit me best.

Thanks

laune

I'd build the regular expression from chunks, each matching an individual field according to its possible/expected values.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String rexa = "(\\d+(?:\\.\\d+){3})";  // an IP address
    String rexs = "(\\S+)";                // a single token (no spaces)
    String rexdt = "\\[([^\\]]+)\\]";      // something between [ and ]
    String rexstr = "\"([^\"]*?)\"";       // a quoted string
    String rexi = "(\\d+)";                // unsigned integer

    // One chunk per field, in log order, separated by single spaces
    String rex = String.join( " ", rexa, rexs, rexs, rexdt, rexstr,
                              rexi, rexi, rexstr, rexstr );

    Pattern pat = Pattern.compile( rex );
    Matcher mat = pat.matcher( h );        // h holds one line of the log
    if( mat.matches() ){
        // groups 1..9 correspond to the nine fields of the format string
        for( int ig = 1; ig <= mat.groupCount(); ig++ ){
            System.out.println( mat.group( ig ) );
        }
    }

It is, of course, possible to make do with rexs in place of rexa or rexi if you don't need to check the format of those fields.
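For reference, here is how the same idea might look as a complete class, using the looser single-token chunk for the address, status and size fields as suggested above. This is only a sketch: the class name LogLineParser, the parse method and the field names (host, logname, and so on) are made up for illustration and do not come from Varnish or any library. It also assumes that quoted fields never contain an escaped double quote.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineParser {
    // Hypothetical field names, one per directive in
    // %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i"
    private static final String[] FIELDS = {
        "host", "logname", "user", "time", "request",
        "status", "bytes", "referer", "userAgent"
    };

    private static final String TOKEN = "(\\S+)";              // a single token (no spaces)
    private static final String BRACKETED = "\\[([^\\]]+)\\]"; // something between [ and ]
    private static final String QUOTED = "\"([^\"]*)\"";       // a quoted string without inner quotes

    // One chunk per field, joined by single spaces, compiled once
    private static final Pattern LINE = Pattern.compile(String.join(" ",
        TOKEN, TOKEN, TOKEN, BRACKETED, QUOTED, TOKEN, TOKEN, QUOTED, QUOTED));

    /** Returns the fields in log order, or an empty map if the line does not match. */
    public static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            for (int i = 0; i < FIELDS.length; i++) {
                result.put(FIELDS[i], m.group(i + 1)); // capture groups are 1-based
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "178.232.38.87 - - [23/May/2012:14:01:05 +0200] "
            + "\"GET http://static.vg.no/iphone/js/front-min.js?20120509-1 HTTP/1.1\" "
            + "200 2013 \"http://touch.vg.no/\" "
            + "\"Mozilla/5.0 (Linux; U; Android 2.3.3; en-no; HTC Nexus One Build/GRI40) "
            + "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1\"";
        parse(line).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Using the plain token chunk everywhere outside the brackets and quotes means a line is still accepted when a numeric field holds a placeholder such as "-"; swap in the stricter chunks where you would rather reject such lines.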
