What regex matches a timestamp in a log file?

For ISO 8601 / RFC 3339 timestamps use `\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}`, optionally followed by a fractional part and a `Z` or `±hh:mm` offset. Apache's default format `[10/Oct/2024:13:55:36 +0000]` needs a different pattern built around the `dd/Mon/yyyy` layout inside square brackets.

How do I match an IP address with regex?

A format-only pattern like `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` is fine for grepping logs, but it also matches `999.999.999.999`. For real validation, use the octet pattern `(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)` repeated four times with dots between.

Are named capture groups safe to use in log regex?

Yes. Named groups `(?<name>...)` have been supported across all major browsers since July 2020, and PCRE and .NET have had them for far longer. Python and Go traditionally spell them `(?P<name>...)`, so check the flavor before you paste. They make field extraction self-documenting because you read `match.groups.ip` instead of `match[1]`.

Can I parse logs without uploading them to a server?

Yes. A browser-based regex tester runs the match locally in JavaScript, so log lines containing IPs, user IDs, or tokens never leave your machine. That matters because access logs are often classed as personal data under privacy rules.

Regex for Log Parsing: Extract Timestamps, IPs & Codes

Logs are dense, semi-structured text, and split(' ') breaks the moment a field contains a space or a quoted string. Regex is the right tool: one pattern can pull the timestamp, client IP, HTTP method, and status code out of a single access-log line in one pass. This guide gives you copy-ready patterns for each field, plus a single named-group regex that parses a full combined-format line.

TL;DR

Use regex, not split(), because log fields contain spaces and quotes.
Match ISO 8601 timestamps with \d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}.
Validate IPv4 octets with 25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d, not \d{1,3}.
Capture HTTP codes with \b[1-5]\d{2}\b near the request.
Named groups (?<name>...) make the whole line self-documenting.

Why parse logs with regex instead of splitting on spaces

The first instinct is to split each line on whitespace and index into the array. It works on toy examples and fails in production within an hour.

The quoting problem

The Apache combined format wraps the request line and user agent in double quotes: "GET /api?q=a b HTTP/1.1". That single field contains spaces, so a naive split shatters it into three or four pieces and every index after it shifts. Regex sidesteps the problem by matching the quotes explicitly with "([^"]*)" and treating everything between them as one unit.
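A minimal sketch of the difference, using a made-up fragment built from the request line above (the trailing status code is an assumption added for illustration):

```javascript
// Hypothetical log fragment: a status code follows the quoted request line.
const fragment = '"GET /api?q=a b HTTP/1.1" 200';

// Naive approach: the quoted request is broken into four pieces,
// so the status ends up at an index nobody expected.
const pieces = fragment.split(' ');
console.log(pieces.length); // 5

// Regex approach: everything between the quotes is captured as one unit.
const quoted = fragment.match(/"([^"]*)" (\d{3})/);
const request = quoted ? quoted[1] : null;
const status = quoted ? quoted[2] : null;
console.log(request); // GET /api?q=a b HTTP/1.1
console.log(status);  // 200
```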

When `split()` falls apart

Real log lines mix delimiters: spaces between top-level fields, colons inside timestamps, brackets around the date, and quotes around free text. A delimiter-based parser needs a different rule for each, which is just a regex written badly. One well-formed pattern handles all of them at once.

Regex as a single source of truth

When the pattern lives in one place, changing the log format means editing one string. You paste a sample line and the pattern into a browser-based regex tester and watch the groups light up before you wire it into a script. Because the match runs locally in JavaScript, log lines with IPs and tokens never leave your machine — which matters, since access logs frequently count as personal data under privacy rules and should not be pasted into a random server-side tool.

There is also a performance angle. A single anchored regex without nested quantifiers typically scans each line once, while a chain of splits, slices, and conditionals walks the same string several times and allocates intermediate arrays along the way. On a multi-gigabyte log, the difference between one pass and five is the difference between a query that finishes and one you abandon.

How to extract a timestamp from a log line with regex

Timestamps are the field most worth getting right, because nearly every downstream query filters on time.

ISO 8601 / RFC 3339 timestamps

Most modern services log in ISO 8601, standardised by ISO and profiled for the internet by RFC 3339 as 1994-11-05T13:15:30Z. The T separates date from time and the trailing Z (Zulu) means UTC. A pattern that covers the common variants:

(?<ts>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?)

Read left to right as a single pattern, that is a date, a T or space, a time, an optional fractional second, and an optional Z or ±hh:mm offset. The offset matters: the same instant can be written as 13:55:36+00:00 or as 08:55:36-05:00, and the two strings neither compare equal nor sort together as text, so normalise to UTC before comparing.
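As a rough sketch of that normalisation step, the helper below extracts the timestamp and converts it to epoch milliseconds. The name isoToEpochMs and the sample lines are hypothetical, and treating an offset-less timestamp as UTC is an assumption you should check against your own logs.

```javascript
// ISO 8601 / RFC 3339 pattern from the text, written on one line.
const isoRe = /(?<ts>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?)/;

// Hypothetical helper: returns epoch milliseconds, or null if nothing matches.
function isoToEpochMs(line) {
  const m = line.match(isoRe);
  if (!m) return null;
  // Date.parse expects the T separator, so swap a space for it.
  let ts = m.groups.ts.replace(' ', 'T');
  // Assumption: a timestamp with no Z or offset was written in UTC.
  if (!/(?:Z|[+-]\d{2}:\d{2})$/.test(ts)) ts += 'Z';
  const ms = Date.parse(ts);
  return Number.isNaN(ms) ? null : ms;
}

// Made-up lines: the same instant logged with two different offsets.
const utcLine = isoToEpochMs('2024-10-10T13:55:36+00:00 INFO started');
const estLine = isoToEpochMs('2024-10-10 08:55:36.000-05:00 INFO started');
console.log(utcLine === estLine);                 // true
console.log(new Date(utcLine).toISOString());     // 2024-10-10T13:55:36.000Z
```

Once both values are plain integers, range filters and sorting no longer depend on how each service formatted its offset.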

Apache / Common Log Format timestamps

Apache's default does not use ISO 8601. It writes [10/Oct/2024:13:55:36 +0000] with a dd/Mon/yyyy date inside square brackets. Match it with:

\[(\d{2})/(\w{3})/(\d{4}):(\d{2}:\d{2}:\d{2})\s([+-]\d{4})\]

The month is a three-letter abbreviation, so you map Oct to 10 in code rather than in the pattern.
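A sketch of that mapping in JavaScript, under the assumption that month names are the English abbreviations Apache writes by default. The names MONTHS and clfToEpochMs are hypothetical. Note that JavaScript months are zero-based, so Oct maps to 9 here rather than 10.

```javascript
// Apache timestamp pattern from the text (slashes escaped for a JS literal).
const clfRe = /\[(\d{2})\/(\w{3})\/(\d{4}):(\d{2}:\d{2}:\d{2})\s([+-]\d{4})\]/;

// Zero-based month numbers, as Date.UTC expects.
const MONTHS = {
  Jan: 0, Feb: 1, Mar: 2, Apr: 3, May: 4, Jun: 5,
  Jul: 6, Aug: 7, Sep: 8, Oct: 9, Nov: 10, Dec: 11,
};

// Hypothetical helper: returns epoch milliseconds, or null if nothing matches.
function clfToEpochMs(line) {
  const m = line.match(clfRe);
  if (!m) return null;
  const [, day, mon, year, time, offset] = m;
  const month = MONTHS[mon];
  if (month === undefined) return null; // unknown month abbreviation
  const [hh, mm, ss] = time.split(':').map(Number);
  // Treat the wall-clock fields as UTC, then subtract the logged offset.
  const asUtc = Date.UTC(Number(year), month, Number(day), hh, mm, ss);
  const sign = offset[0] === '-' ? -1 : 1;
  const offsetMin = sign * (Number(offset.slice(1, 3)) * 60 + Number(offset.slice(3, 5)));
  return asUtc - offsetMin * 60000;
}

const clfMs = clfToEpochMs('[10/Oct/2024:13:55:36 +0000]');
console.log(new Date(clfMs).toISOString()); // 2024-10-10T13:55:36.000Z
```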

Syslog's headerless timestamp

Classic syslog is harder: Oct 10 13:55:36 has no year, and single-digit days are padded with a space, so the ninth is written Oct  9 with two spaces. Use \w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}, where \s+ absorbs the padding, and supply the year from context. Once extracted, you can convert any of these to an epoch value with a Unix timestamp converter to make ranges sortable as plain integers.
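A small sketch of supplying the year from context. It uses the same pattern with capture groups added, which is a change from the pattern as written above. The helper name syslogToEpochMs, the sample line, and the choice to treat the time as UTC are all assumptions; classic syslog carries no zone, so use whatever zone your hosts log in.

```javascript
// The syslog pattern from the text, with capture groups added.
const syslogRe = /(\w{3})\s+(\d{1,2})\s(\d{2}):(\d{2}):(\d{2})/;

const SYSLOG_MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Hypothetical helper: the caller supplies the year, for example from the
// file's modification time or from the rotation date in its name.
function syslogToEpochMs(line, year) {
  const m = line.match(syslogRe);
  if (!m) return null;
  const month = SYSLOG_MONTHS.indexOf(m[1]); // zero-based, -1 if unknown
  if (month === -1) return null;
  // Assumption: the logged time is UTC.
  return Date.UTC(year, month, Number(m[2]), Number(m[3]), Number(m[4]), Number(m[5]));
}

// Made-up line with a space-padded single-digit day.
const syslogMs = syslogToEpochMs('Oct  9 13:55:36 host app: started', 2024);
console.log(new Date(syslogMs).toISOString()); // 2024-10-09T13:55:36.000Z
```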

How to match an IP address in a log file with regex

The client IP is usually the first field, but matching it correctly is where most patterns quietly go wrong.

The naive pattern that accepts 999.999.999.999

The everywhere-copied pattern is:

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

It matches the shape of an IPv4 address and nothing more. It happily accepts 999.999.999.999 and 300.1.1.1, neither of which is a real address. For grepping a log you already trust, shape-matching is fine and fast. For validation, it is wrong.

A correct IPv4 octet pattern

To enforce the 0–255 range, match each octet with an alternation that spells out the valid ranges:

(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)

Each branch covers a slice: 25[0-5] for 250–255, 2[0-4]\d for 200–249, 1\d{2} for 100–199, [1-9]\d for 10–99, and \d for 0–9. The full address repeats it four times:

\b(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)(?:\.(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)){3}\b

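Because the two-digit branch is [1-9]\d, and a lone \d cannot be followed by another digit before the next dot or the closing \b, leading zeros are rejected too, so 01.02.03.04 will not match.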
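To see the loose and strict patterns side by side, here is a short sketch that builds the strict pattern from the octet string. It anchors with ^ and $ instead of \b because it validates a whole string rather than searching inside a line; that swap, and the names OCTET, looseIp, and strictIp, are choices made for this example.

```javascript
// The octet alternation from the text, as a string so it can be reused.
const OCTET = '(?:25[0-5]|2[0-4]\\d|1\\d{2}|[1-9]\\d|\\d)';

// Shape-only pattern versus range-checked pattern, both anchored to the whole string.
const looseIp = /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/;
const strictIp = new RegExp(`^${OCTET}(?:\\.${OCTET}){3}$`);

const candidates = ['203.0.113.7', '999.999.999.999', '300.1.1.1', '01.02.03.04'];
for (const candidate of candidates) {
  console.log(candidate, looseIp.test(candidate), strictIp.test(candidate));
}
// 203.0.113.7 true true
// 999.999.999.999 true false
// 300.1.1.1 true false
// 01.02.03.04 true false
```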

A note on IPv6

IPv6 is a different problem: eight hex groups with :: compression that can appear once anywhere. A correct IPv6 regex is long and error-prone, so for IPv6-heavy logs it is usually better to match a loose [0-9a-fA-F:]+ candidate and validate it with a real parser. Do not hand-write a "perfect" IPv6 regex under deadline.
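One way to follow that advice in JavaScript without a library is to let the built-in URL constructor act as the parser, since it rejects a malformed IPv6 host written in square brackets. This is a sketch under that assumption: isIPv6 and findIPv6 are hypothetical helper names and the sample line is made up.

```javascript
// Loose candidate pattern from the text: any run of hex digits and colons.
const candidateRe = /[0-9a-fA-F:]+/g;

// Hypothetical validator: delegate the hard part to the URL host parser.
function isIPv6(candidate) {
  try {
    // An IPv6 host must sit inside square brackets in a URL; the
    // constructor throws if the address between them is malformed.
    new URL(`http://[${candidate}]/`);
    return true;
  } catch {
    return false;
  }
}

// Collect every candidate in the line and keep only the ones that validate.
function findIPv6(line) {
  return (line.match(candidateRe) || []).filter(isIPv6);
}

const v6Line = '2001:db8::1 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512';
console.log(findIPv6(v6Line)); // [ '2001:db8::1' ]
```

The loose pattern also picks up fragments such as 2024:13:55:36 and 200, which is expected; the validation step is what discards them.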

How to extract HTTP status and error codes with regex

Status codes are how you find the failures, so this field drives most ad-hoc log queries.

Matching 4xx and 5xx errors

An HTTP status is always three digits starting 1–5. To find only errors, anchor on the leading digit:

\b[45]\d{2}\b

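That matches 400 through 599 and ignores 200 and 301. On its own it also matches any other standalone three-digit number in that range, such as a byte count of 512, so combine it with position context: in the combined format the status sits immediately after the closing quote of the request.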
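The effect of position context can be sketched as follows. The anchored pattern assumes combined-format lines, where the status comes right after the closing quote of the request and is followed by the byte count; errorStatusRe and the three sample lines are made up for illustration.

```javascript
// Assumption: combined format, status after the request's closing quote,
// followed by a byte count or "-".
const errorStatusRe = /" (?<status>[45]\d{2}) (?:\d+|-)/;

const accessLines = [
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /a HTTP/1.1" 200 512',
  '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "GET /b HTTP/1.1" 404 153',
  '203.0.113.7 - - [10/Oct/2024:13:55:38 +0000] "POST /c HTTP/1.1" 503 -',
];

// The bare pattern also flags the first line, because its byte count is 512.
const bareHits = accessLines.filter((l) => /\b[45]\d{2}\b/.test(l)).length;

// The anchored pattern only looks at the status position.
const errorStatuses = accessLines
  .map((l) => l.match(errorStatusRe))
  .filter(Boolean)
  .map((m) => m.groups.status);

console.log(bareHits);      // 3
console.log(errorStatuses); // [ '404', '503' ]
```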

Capturing the status into a named group

In a full line you want the status tied to a name, not a magic index. Named capturing groups, written (?<name>...), have been supported across all major browsers since July 2020 per MDN, and you read the result off the groups object:

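// Hypothetical sample line and a cut-down pattern, for illustration only.
const line = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /x HTTP/1.1" 404 153';
const re = /^(?<ip>\S+) .*" (?<status>\d{3}) /;
const m = line.match(re);
if (m) {
  console.log(m.groups.status); // "404"
  console.log(m.groups.ip);     // "203.0.113.7"
}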

Application log levels: ERROR, WARN, INFO

App logs use words, not numbers. A level matcher, which becomes case-tolerant once you add the i flag:

\b(?<level>ERROR|WARN(?:ING)?|INFO|DEBUG|TRACE|FATAL)\b

Anchoring with \b stops INFORMATION from matching as INFO. Pair the level with the timestamp group and you can slice a log down to "every ERROR after 14:00" with two captures.
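A sketch of the "every ERROR after 14:00" query. It assumes application lines that start with an ISO 8601 timestamp; timeRe, errorsAfter, and the sample lines are hypothetical, and the hour comparison uses the hour as logged, without any time-zone conversion.

```javascript
// Level pattern from the text, with the i flag for mixed-case logs.
const levelRe = /\b(?<level>ERROR|WARN(?:ING)?|INFO|DEBUG|TRACE|FATAL)\b/i;
// Assumption: the time of day follows a T or space, as in ISO 8601.
const timeRe = /[T ](?<hh>\d{2}):(?<mm>\d{2}):\d{2}/;

const appLog = [
  '2024-10-10T13:59:59Z ERROR disk almost full',
  '2024-10-10T14:00:01Z info cache warmed',
  '2024-10-10T14:05:12Z error upstream timed out',
  '2024-10-10T15:30:00Z WARNING retrying',
];

// Keep lines whose level is ERROR and whose logged hour is >= the given hour.
function errorsAfter(lines, hour) {
  return lines.filter((line) => {
    const level = line.match(levelRe);
    const time = line.match(timeRe);
    if (!level || !time) return false;
    return level.groups.level.toUpperCase() === 'ERROR'
      && Number(time.groups.hh) >= hour;
  });
}

const lateErrors = errorsAfter(appLog, 14);
console.log(lateErrors); // [ '2024-10-10T14:05:12Z error upstream timed out' ]
```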

Putting it together: one regex for a full access-log line

Individual patterns are useful, but the real win is one regex that names every field of a line at once.

Named groups for every field

For an Apache/nginx combined line, a readable pattern looks like this:

^(?<ip>\S+) \S+ \S+ \[(?<ts>[^\]]+)\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<status>\d{3}) (?<bytes>\d+|-)

Each field is a named group, so the extraction reads like the format spec instead of a row of numbered indexes.
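Used from JavaScript, the pattern might be wrapped like this. The function name parseLine and the sample line are hypothetical, and turning a - byte count into null is a design choice for this sketch rather than anything the log format prescribes.

```javascript
// The combined-format pattern from the text, on one line.
const combinedRe = /^(?<ip>\S+) \S+ \S+ \[(?<ts>[^\]]+)\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<status>\d{3}) (?<bytes>\d+|-)/;

// Hypothetical helper: returns a plain object per line, or null if the
// line does not look like a combined-format entry.
function parseLine(line) {
  const m = line.match(combinedRe);
  if (!m) return null;
  const { ip, ts, method, path, status, bytes } = m.groups;
  return {
    ip,
    ts,
    method,
    path,
    status: Number(status),
    // "-" means no byte count was logged; keep that distinct from 0.
    bytes: bytes === '-' ? null : Number(bytes),
  };
}

const sampleLine = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /api?q=a HTTP/1.1" 404 153 "-" "curl/8.0"';
const parsed = parseLine(sampleLine);
console.log(JSON.stringify(parsed));
```

Because the result is a plain object, an array of them can go straight into JSON.stringify for a dashboard or a converter.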

Apache combined vs nginx combined

The two servers log almost the same line, which is why one pattern fits both. The field order is identical; only the source tokens differ. Apache's mod_log_config and nginx's ngx_http_log_module define them:

Field	Apache token	nginx variable
Client IP	`%h`	`$remote_addr`
Timestamp	`%t`	`$time_local`
Request line	`%r`	`$request`
Status code	`%>s`	`$status`
Bytes sent	`%b`	`$body_bytes_sent`

Because both default to the same field sequence, the named-group pattern above parses either without changes.

Testing before you deploy

Never ship a log regex you have not run against real lines. A quick checklist before it goes into a script:

Test against a normal line, an error line, and a malformed line.
Confirm every named group captures the expected substring.
Check that an empty field (a - for bytes) still matches.
Verify the timestamp branch covers your server's exact format.
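The checklist can also be run as a tiny script before the pattern goes anywhere near production. The sketch below is one possible shape, not a prescribed workflow: checkPattern and the three sample cases are hypothetical, and you would swap in real lines from your own server.

```javascript
// The combined-format pattern under test.
const underTest = /^(?<ip>\S+) \S+ \S+ \[(?<ts>[^\]]+)\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<status>\d{3}) (?<bytes>\d+|-)/;

// One normal line, one error line with an empty byte field, one malformed line.
const cases = [
  { name: 'normal',
    line: '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    expect: { ip: '203.0.113.7', status: '200', bytes: '512' } },
  { name: 'error',
    line: '203.0.113.7 - - [10/Oct/2024:13:55:38 +0000] "POST /c HTTP/1.1" 503 -',
    expect: { method: 'POST', status: '503', bytes: '-' } },
  { name: 'malformed', line: 'not a log line', expect: null },
];

// Returns a list of human-readable failures; an empty list means all passed.
function checkPattern(re, testCases) {
  const failures = [];
  for (const { name, line, expect } of testCases) {
    const m = line.match(re);
    if (expect === null) {
      if (m) failures.push(`${name}: expected no match`);
      continue;
    }
    if (!m) {
      failures.push(`${name}: pattern did not match`);
      continue;
    }
    for (const [group, want] of Object.entries(expect)) {
      if (m.groups[group] !== want) {
        failures.push(`${name}: ${group} was ${m.groups[group]}, wanted ${want}`);
      }
    }
  }
  return failures;
}

const failures = checkPattern(underTest, cases);
console.log(failures.length === 0 ? 'all cases passed' : failures.join('\n'));
```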

Paste a handful of real lines into the regex tester and confirm each group before automating. Once fields are clean, you can pipe them into structured output — a CSV ↔ JSON converter turns the extracted rows into JSON for a dashboard without a server round-trip.

References

Named capturing group - JavaScript | MDN — confirmed (?<name>...) syntax, the groups accessor, and cross-browser support since July 2020.
RFC 3339 — Date and Time on the Internet: Timestamps — internet profile of ISO 8601 used for the timestamp pattern and the Z / offset rules.
Apache Module mod_log_config — source of the combined-log field tokens (%h, %t, %r, %>s, %b).
Module ngx_http_log_module — nginx combined-format variables used in the field comparison table.

