Add BBE for Regexp find operations #5677

Merged
merged 7 commits into from
Oct 25, 2024
Changes from 6 commits
7 changes: 7 additions & 0 deletions examples/index.json
@@ -1201,6 +1201,13 @@
"verifyBuild": true,
"verifyOutput": true,
"isLearnByExample": true
},
{
"name": "RegExp find operations",
"url": "regexp-find-operations",
"verifyBuild": true,
"verifyOutput": true,
"isLearnByExample": true
}
]
},
70 changes: 70 additions & 0 deletions examples/regexp-find-operations/regexp_find_operations.bal
@@ -0,0 +1,70 @@
import ballerina/io;
import ballerina/lang.regexp;

public function main() {
    string logContent = string `
Contributor: IMO, it's better to include a non-matching example as well.

Contributor Author: There are other non-matching INFO and WARN logs in this `logContent`.

2024-09-19 10:02:01 WARN [UserLogin] - Failed login attempt for user: johndoe
2024-09-19 10:03:17 ERROR [Database] - Connection to database timed out
2024-09-19 10:04:05 WARN [RequestHandler] - Response time exceeded threshold for /api/v1/users
2024-09-19 10:05:45 INFO [Scheduler] - Scheduled task started: Data backup
2024-09-19 10:06:10 ERROR [Scheduler] - Failed to start data backup: Permission denied
2024-09-19 10:11:55 INFO [Security] - Security scan completed, no issues found
2024-09-19 10:12:30 ERROR [RequestHandler] - 404 Not Found: /api/v1/products`;

    // Regular expression to match error logs with three groups:
    // 1. Timestamp (e.g., 2024-09-19 10:03:17).
    // 2. Component (e.g., Database, Scheduler).
    // 3. Log message (e.g., Connection to database timed out).
    string:RegExp errorLogPattern = re `(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ERROR \[(\w+)\]\s-\s(.*)`;

    // Retrieve the first error log from `logContent`.
    regexp:Span? firstErrorLog = errorLogPattern.find(logContent);
    if firstErrorLog == () {
        io:println("Failed to find an error log");
        return;
    }
    io:println("First error log: ", firstErrorLog.substring());

    // Retrieving all error logs from the `logContent`.
    regexp:Span[] allErrorLogs = errorLogPattern.findAll(logContent);
    io:println("\n", "All error logs:");
    foreach regexp:Span errorLog in allErrorLogs {
        io:println(errorLog.substring());
    }

    // Retrieving groups (timestamp, component, message) from the first error log.
    regexp:Groups? firstErrorLogGroups = errorLogPattern.findGroups(logContent);
    if firstErrorLogGroups == () {
        io:println("Failed to find groups in first error log");
        return;
    }
    io:println("\n", "Groups within first error log:");
    printGroupsWithinLog(firstErrorLogGroups);

    // Retrieving groups from all error logs.
    regexp:Groups[] allErrorLogGroups = errorLogPattern.findAllGroups(logContent);
    io:println("\n", "Groups in all error logs");
    foreach regexp:Groups logGroup in allErrorLogGroups {
        printGroupsWithinLog(logGroup);
    }
}

function printGroupsWithinLog(regexp:Groups logGroup) {
    // The first element in the `logGroup` is the entire matched string.
Member: Similarly, wouldn't it be better to say why any of these values could be nil?

Contributor Author: I don't think we should do explicit null checks here; since there is a match, the expected groups should be there. WDYT?

    // The subsequent elements in `logGroup` represent the captured groups
    // (timestamp, component, message).
    string timestamp = extractStringFromMatchGroup(logGroup[1]);
    string component = extractStringFromMatchGroup(logGroup[2]);
    string logMessage = extractStringFromMatchGroup(logGroup[3]);

    io:println("Timestamp: ", timestamp);
    io:println("Component: ", component);
    io:println("Message: ", logMessage);
}

function extractStringFromMatchGroup(regexp:Span? span) returns string {
    if span !is regexp:Span {
        return "";
    }
    return span.substring();
}
12 changes: 12 additions & 0 deletions examples/regexp-find-operations/regexp_find_operations.md
@@ -0,0 +1,12 @@
# RegExp find operations

The `RegExp` type provides a set of langlib functions to find patterns within strings. `find` and `findAll` return the matched spans, while `findGroups` and `findAllGroups` also return the capturing groups of each match.
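
As a minimal sketch of what a span returned by `find` carries, the snippet below prints the matched text together with its `startIndex` and `endIndex`, and shows that a pattern with no match yields `()`. It is not part of this PR; the sample log line and the `levelPattern` and `missingPattern` names are illustrative assumptions.

```ballerina
import ballerina/io;
import ballerina/lang.regexp;

public function main() {
    // Hypothetical single log line, used only for illustration.
    string line = "2024-09-19 10:03:17 ERROR [Database] - Connection to database timed out";

    // `find` returns a `regexp:Span` for the first match, or `()` if there is none.
    string:RegExp levelPattern = re `ERROR|WARN|INFO`;
    regexp:Span? level = levelPattern.find(line);
    if level is regexp:Span {
        // `startIndex` and `endIndex` locate the match within `line`.
        io:println(level.substring(), " at [", level.startIndex, ", ", level.endIndex, ")");
    }

    // A pattern that does not occur in the input yields `()`.
    string:RegExp missingPattern = re `FATAL`;
    regexp:Span? missing = missingPattern.find(line);
    io:println("No match found: ", missing is ());
}
```

Checking the result against `()` before calling `substring()` mirrors the nil handling used in the example for this page.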

::: code regexp_find_operations.bal :::

::: out regexp_find_operations.out :::

## Related links
- [RegExp type](/learn/by-example/regexp-type)
- [RegExp API Docs](https://lib.ballerina.io/ballerina/lang.regexp)
- [string API Docs](https://lib.ballerina.io/ballerina/lang.string)
@@ -0,0 +1,2 @@
description: This BBE demonstrates how to use the regexp langlib functions relevant to regex find operations.
keywords: ballerina, ballerina by example, bbe, regexp, RegExp, regex, regular expressions, ballerina regex functions, regexp langlib functions, find, findAll, findGroups, findAllGroups
23 changes: 23 additions & 0 deletions examples/regexp-find-operations/regexp_find_operations.out
@@ -0,0 +1,23 @@
$ bal run regexp_find_operations.bal
First error log: 2024-09-19 10:03:17 ERROR [Database] - Connection to database timed out

All error logs:
2024-09-19 10:03:17 ERROR [Database] - Connection to database timed out
2024-09-19 10:06:10 ERROR [Scheduler] - Failed to start data backup: Permission denied
2024-09-19 10:12:30 ERROR [RequestHandler] - 404 Not Found: /api/v1/products

Groups within first error log:
Timestamp: 2024-09-19 10:03:17
Component: Database
Message: Connection to database timed out

Groups in all error logs
Timestamp: 2024-09-19 10:03:17
Component: Database
Message: Connection to database timed out
Timestamp: 2024-09-19 10:06:10
Component: Scheduler
Message: Failed to start data backup: Permission denied
Timestamp: 2024-09-19 10:12:30
Component: RequestHandler
Message: 404 Not Found: /api/v1/products