Parse as Log

The PARSE-AS-LOG directive parses access log lines, such as those written by Apache HTTPD and nginx servers.


parse-as-log <column> <format>

Usage Notes

The PARSE-AS-LOG directive provides a generic log parser that you configure by specifying the <format> in which the log lines were written. The format lists the fields that the service wrote to each access log line, and so becomes the schema of the parsed record.

In addition to the format options specified in the Apache HTTPD manual under Custom Log Formats, these predefined format names are recognized:

  • common
  • combined
  • combinedio
  • referer
  • agent

For example, to parse the combined log format or the common log format, use:

parse-as-log body combined
parse-as-log body common
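
The predefined names are shorthand for token strings. As a rough sketch, the Python mapping below uses the format strings given for these names in the Apache HTTPD documentation and default configuration. The NAMED_FORMATS table and the resolve_format helper are hypothetical names used for illustration only; how the directive resolves names internally is not shown here.

```python
# Token strings for the predefined names, as given in the Apache HTTPD
# documentation and default httpd.conf. This table is an illustration;
# it is not the directive's own lookup code.
NAMED_FORMATS = {
    "common": '%h %l %u %t "%r" %>s %b',
    "combined": '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"',
    "combinedio": '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" %I %O',
    "referer": "%{Referer}i -> %U",
    "agent": "%{User-agent}i",
}


def resolve_format(fmt):
    """Return the token string for a predefined name, or fmt itself if custom."""
    return NAMED_FORMATS.get(fmt, fmt)


if __name__ == "__main__":
    print(resolve_format("common"))
    print(resolve_format("%h %u %t"))
```

Under this reading, a predefined name and its expanded token string describe the same log line layout.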

If your logs are in a format that is not covered by these names, you can specify a custom format. In that case, <format> is a string of tokens that describes the log line.

For nginx, the format tokens are specified by log_format and embedded variables.
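
To illustrate what "embedded variables" means, the Python sketch below lists the variables used in nginx's predefined combined log_format. The NGINX_COMBINED constant and the embedded_variables helper are hypothetical names for this example, and the sketch only handles the plain $name form, not ${name}.

```python
import re

# nginx's predefined "combined" format, as documented for the log_format
# directive in ngx_http_log_module.
NGINX_COMBINED = (
    '$remote_addr - $remote_user [$time_local] '
    '"$request" $status $body_bytes_sent '
    '"$http_referer" "$http_user_agent"'
)


def embedded_variables(log_format):
    """Return the names of the $variables in a log_format string, in order."""
    return re.findall(r"\$(\w+)", log_format)


if __name__ == "__main__":
    for name in embedded_variables(NGINX_COMBINED):
        print(name)
```

Each variable name corresponds to one field in the written log line.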


The format for common log is:

%h %l %u %t "%r" %>s %b

and a corresponding record, whose body column holds a log line in that format:

  "body": " - frank [10/Oct/2000:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326"

Applying either of these directives:

parse-as-log body %h %l %u %t "%r" %>s %b
parse-as-log body common

would result in this record:

  "ip_connection_client_host": "",
  "ip_connection_client_host.last": "",
  "number_connection_client_logname": null,
  "number_connection_client_logname.last": null,
  "http_path_request_firstline_uri_path": "/apache_pb.gif",
  "http_ref_request_firstline_uri_ref": null

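To make the token-to-field mapping concrete, here is a minimal Python sketch that splits a common log format line with a regular expression, one named group per token. It is not how the directive is implemented, and its field names (host, logname, and so on) are simplified stand-ins for the output column names shown above. The host 192.0.2.1 in SAMPLE is an illustrative documentation address, not a value from the example record.

```python
import re

# One named group per token in: %h %l %u %t "%r" %>s %b
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

# Illustrative line; the host address is a placeholder.
SAMPLE = (
    '192.0.2.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)


def parse_common(line):
    """Return a dict of fields for a common log format line, or None."""
    match = COMMON_LOG.match(line)
    if match is None:
        return None
    # A lone "-" in the log means the value was not available.
    fields = {k: (None if v == "-" else v) for k, v in match.groupdict().items()}
    # Split the request line (%r) into method, path, and protocol.
    parts = (fields["request"] or "").split(" ")
    if len(parts) == 3:
        fields["method"], fields["path"], fields["protocol"] = parts
    return fields


if __name__ == "__main__":
    print(parse_common(SAMPLE))
```

The "-" in the logname position becomes None, which mirrors the null logname columns in the record above.
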
Using the Combined Log Format:

%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"

and a corresponding log line:

 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "
html" "Mozilla/4.08 [en] (Win98; I ;Nav)"

Complex formats, such as this one, can also be parsed:

%t %u [%D %h %{True-Client-IP}i %{UNIQUE_ID}e %r] %{Cookie}i %s \"%{User-Agent}i\" \"%{host}i\" %l %b %{Referer}i

Example log line for such a format (note: reformatted from a single line):

[03/Dec/2013:10:53:59 +0000] - [32002 Up24RwpmBAwAAA1LWJsAAAAR GET
image.W0N539E3452S3991w313.original.jpg HTTP/1.1] __utmc=94539802; dtCookie=EFD9D09B6A2E1789F1329FC1
381A356A|_default|1; dtPC=471217988_141#_load_; Carte::KerberosLexicon_getdomain=6701c1320dd96688b2e
40b92ce748eee7ae99722; UserData=Username%3ALSHARMA%3AHomepage%3A1%3AReReg%3A0%3ATrialist%3A0%3ALangua
ge%3Aen%3ACcode%3Aae%3AForceReReg%3A0; UserID=1375493%3A12345%3A1234567890%3A123%3Accode%3Aae; USER_D
main%3A1386060868.51392%3A6701c1320dd96688b2e40b92ce748eee7ae99722; MODE=FONTIS; __utma=94539802.9110
97326.1339390457.1386060848.1386065609.190; __utmz=94539802.1384758205.177.38.utmcsr=google|utmccn=(o
rganic)|utmcmd=organic|utmctr=(not%20provided); __kti=1339390460526,,;
__ktv=28e8-6c4-be3-ce54137d9e48271; WT_FPC=id=
6046439530; _opt_vi_3FNG8DZU=42880957-D2F1-4DC5-AF16-FEF88891D24E; __hstc=145721067.750d315a49c642681
GvPmmyn8Bk67OUv-HwjVU4Ff3q1w|1351772962000; hubspotutk=750d315a49c64268192826b3911a4e5a; __ptca=14572
1067.jQ7lN5U3C4eN.1351758562.1381136713.1381283233.66; __ptv_62vY4e=jQ7lN5U3C4eN; __pti_62vY4e=jQ7lN5
U3C4eN; __ptcz=145721067.1351758562.1.0.ptmcsr=google|ptmcmd=organic|ptmccn=(organic)|ptmctr=domain.
com; RM=Lsharma%3Ac163b6097f90d2869e537f95900e1c464daa8fb9; wcid=Up2cRApmBAwAAFOiVhcAAAAH%3Af32e5e5f5
b593175bfc71af082ab26e4055efeb6; __utmb=94539802.71.9.1386067462709; edge_auth=ip%3D195.229.241.182~
87dcd44c1d837e2e54; has_js=1; SECTION=%2Fcontent%2Fsection%2Finspiration-design%2Fstreet-shots.html;
JSESSIONID=b9377099-7708-45ae-b6e7-c575ffe82187; WT_FPC=id=
3618209:ss=1386053618209; USER_GROUP=LSharma%3Afalse; NSC_wtfswfs_xfcgbsn40-41=ffffffff096e1a1d45525
d5f4f58455e445a4a423660 200 "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
"" - 24516
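
A format string like the complex one above is a sequence of % tokens separated by literal characters. The Python sketch below lists those tokens with a regular expression. The TOKEN pattern and the format_tokens helper are hypothetical names for this example, and the pattern is a simplification: it ignores status-code conditions such as %!200 and the literal %%.

```python
import re

# A token is "%", an optional "<" or ">", an optional {argument},
# and a single directive letter. Simplified for illustration.
TOKEN = re.compile(r"%[<>]?(?:\{[^}]+\})?[A-Za-z]")

# The complex format from this section, with the quote escapes removed.
COMPLEX_FORMAT = (
    '%t %u [%D %h %{True-Client-IP}i %{UNIQUE_ID}e %r] %{Cookie}i %s '
    '"%{User-Agent}i" "%{host}i" %l %b %{Referer}i'
)


def format_tokens(fmt):
    """Return the % tokens of a log format string, in order."""
    return TOKEN.findall(fmt)


if __name__ == "__main__":
    for token in format_tokens(COMPLEX_FORMAT):
        print(token)
```

Listing the tokens this way can help when checking a custom format against a sample log line, field by field.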