Introduction
In this lab, you will learn how to check if a given string matches a URL format in Java using regular expressions. We will define a regex pattern specifically designed for URLs, utilize the Pattern.matches() method to test strings against this pattern, and explore how to validate common URL schemes. This hands-on exercise will guide you through the practical steps of implementing URL validation in your Java applications.
Define URL Regex Pattern
In this step, we will learn how to define a regular expression pattern in Java to match URLs. Regular expressions, often shortened to "regex" or "regexp", are sequences of characters that define a search pattern. They are extremely powerful for pattern matching and manipulation of strings.
For validating URLs, a regex pattern helps us check if a given string follows the standard structure of a URL (like http://www.example.com or https://example.org/path).
Let's create a new Java file to work with regex.
Open the WebIDE. In the File Explorer on the left, make sure you are in the ~/project directory.
Right-click in the empty space within the ~/project directory and select "New File".
Name the new file UrlValidator.java and press Enter.
The UrlValidator.java file should open in the Code Editor.
Copy and paste the following Java code into the editor:

    import java.util.regex.Pattern;

    public class UrlValidator {
        public static void main(String[] args) {
            // Define a simple regex pattern for a URL
            String urlRegex = "^(http|https)://[^\\s/$.?#].[^\\s]*$";

            // Compile the regex pattern
            Pattern pattern = Pattern.compile(urlRegex);

            System.out.println("URL Regex Pattern Defined.");
        }
    }

Let's break down the new parts of this code:
- import java.util.regex.Pattern;: This line imports the Pattern class, which is part of Java's built-in support for regular expressions.
- String urlRegex = "^(http|https)://[^\\s/$.?#].[^\\s]*$";: This line defines a String variable named urlRegex and assigns it our regular expression pattern.
  - ^: Matches the beginning of the string.
  - (http|https): Matches either "http" or "https".
  - ://: Matches the literal characters "://".
  - [^\\s/$.?#]: Matches a single character that is NOT a whitespace character (\\s), a forward slash (/), a dollar sign ($), a period (.), a question mark (?), or a hash symbol (#). This is a simplified check that the host part does not begin with one of these characters.
  - .: Matches any single character (except a line terminator).
  - [^\\s]*: Matches zero or more characters that are NOT whitespace. This is a simplified way to match the rest of the URL: the remainder of the host plus any path and query.
  - $: Matches the end of the string.
  - Note the double backslashes (\\) before s. In a Java string literal, a single backslash starts an escape sequence, so we write \\ to place one literal backslash in the regex, which the regex engine then reads as \s.
- Pattern pattern = Pattern.compile(urlRegex);: This line compiles the regex string into a Pattern object. Compiling the pattern once is more efficient if you plan to use the same pattern multiple times.
- System.out.println("URL Regex Pattern Defined.");: This line simply prints a message to the console to indicate that the pattern has been defined.
Save the file (Ctrl+S or Cmd+S).
Now, let's compile this Java program. Open the Terminal at the bottom of the WebIDE. Make sure you are in the ~/project directory.
Compile the code using the javac command:

    javac UrlValidator.java

If there are no errors, the command will complete without output. A UrlValidator.class file will be created in the ~/project directory.
Run the compiled program using the java command:

    java UrlValidator

You should see the output:
URL Regex Pattern Defined.
You have successfully defined and compiled a Java program that includes a basic regex pattern for URLs. In the next step, we will use this pattern to test if different strings are valid URLs.
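The escaping rule from this step is easy to check directly. The following is a minimal sketch and not part of the lab files: the class name RegexPartsDemo and the helper isAllowedFirstHostChar are hypothetical names chosen for illustration. It prints the pattern as the regex engine receives it and tests the first character class on its own.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the lab's UrlValidator.java
class RegexPartsDemo {
    // Same pattern string as in UrlValidator.java
    static final String URL_REGEX = "^(http|https)://[^\\s/$.?#].[^\\s]*$";

    // Hypothetical helper: checks one character against the class that
    // guards the first character after "://"
    static boolean isAllowedFirstHostChar(char c) {
        return Pattern.matches("[^\\s/$.?#]", String.valueOf(c));
    }

    public static void main(String[] args) {
        // The doubled backslash exists only in Java source code;
        // the string itself contains a single backslash before each s
        System.out.println(URL_REGEX);

        // The Java literal "\\s" holds two characters: '\' and 's'
        System.out.println("\\s".length());

        System.out.println(isAllowedFirstHostChar('w')); // letter: allowed
        System.out.println(isAllowedFirstHostChar('.')); // period: rejected
        System.out.println(isAllowedFirstHostChar(' ')); // space: rejected
    }
}
```

Running it should print the pattern with single backslashes, which is the form the regex engine actually compiles.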
Test URL with Pattern.matches()
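In the previous step, we defined a regex pattern for URLs and compiled it into a Pattern object. Now, let's check whether different strings are valid URLs using the static Pattern.matches() method.
The Pattern.matches(regex, input) method is a convenient way to check if an entire input string matches a given regular expression. It takes the regex as a String, compiles it, and matches the input against it in a single step, so it does not use the Pattern object we compiled earlier. That is fine for a handful of checks; when the same pattern is tested many times, calling pattern.matcher(input).matches() on the compiled Pattern avoids recompiling the regex on every call.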
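As a rough illustration of the two styles, the sketch below runs the same check through the static Pattern.matches() call and through a Pattern compiled once and reused. The class and method names (MatchesEquivalenceDemo, viaStaticMatches, viaCompiledPattern) are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two ways to run a full-string match
class MatchesEquivalenceDemo {
    static final String URL_REGEX = "^(http|https)://[^\\s/$.?#].[^\\s]*$";

    // Compiled once and reused for every check
    static final Pattern URL_PATTERN = Pattern.compile(URL_REGEX);

    // Compiles the regex on every call, then matches the whole input
    static boolean viaStaticMatches(String input) {
        return Pattern.matches(URL_REGEX, input);
    }

    // Reuses the precompiled Pattern, then matches the whole input
    static boolean viaCompiledPattern(String input) {
        return URL_PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"http://www.example.com", "ftp://invalid-url.com"};
        for (String input : inputs) {
            System.out.println(input + " -> static: " + viaStaticMatches(input)
                    + ", compiled: " + viaCompiledPattern(input));
        }
    }
}
```

Both methods give the same answer for any input; the difference is only in how often the regex is compiled.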
Let's modify our UrlValidator.java file to test some example URLs.
Open the UrlValidator.java file in the WebIDE editor if it's not already open.
Modify the main method to include the test code below. Add the new lines after the System.out.println("URL Regex Pattern Defined."); line, so that the complete file looks like this:

    import java.util.regex.Pattern;

    public class UrlValidator {
        public static void main(String[] args) {
            // Define a simple regex pattern for a URL
            String urlRegex = "^(http|https)://[^\\s/$.?#].[^\\s]*$";

            // Compile the regex pattern
            Pattern pattern = Pattern.compile(urlRegex);

            System.out.println("URL Regex Pattern Defined.");

            // Test some URLs
            String url1 = "http://www.example.com";
            String url2 = "https://example.org/path/to/page";
            String url3 = "ftp://invalid-url.com"; // Invalid scheme
            String url4 = "http:// example.com"; // Invalid character (space)

            System.out.println("\nTesting URLs:");

            boolean isUrl1Valid = Pattern.matches(urlRegex, url1);
            System.out.println(url1 + " is valid: " + isUrl1Valid);

            boolean isUrl2Valid = Pattern.matches(urlRegex, url2);
            System.out.println(url2 + " is valid: " + isUrl2Valid);

            boolean isUrl3Valid = Pattern.matches(urlRegex, url3);
            System.out.println(url3 + " is valid: " + isUrl3Valid);

            boolean isUrl4Valid = Pattern.matches(urlRegex, url4);
            System.out.println(url4 + " is valid: " + isUrl4Valid);
        }
    }

Here's what we added:
- We defined four String variables (url1, url2, url3, url4) containing different example strings, some valid URLs according to our simple pattern, and some invalid ones.
- We added a print statement to make the output clearer.
- We used Pattern.matches(urlRegex, url) for each test string. This method returns true if the entire string matches the urlRegex pattern, and false otherwise.
- We printed the result of the validation for each URL.
Save the UrlValidator.java file.
Compile the modified code in the Terminal:

    javac UrlValidator.java

Again, if compilation is successful, there will be no output.
Run the compiled program:

    java UrlValidator

You should see output similar to this:

    URL Regex Pattern Defined.

    Testing URLs:
    http://www.example.com is valid: true
    https://example.org/path/to/page is valid: true
    ftp://invalid-url.com is valid: false
    http:// example.com is valid: false
This output shows that our simple regex pattern correctly identified the first two strings as valid URLs (according to the pattern) and the last two as invalid.
You have now successfully used the Pattern.matches() method to test strings against a regular expression pattern in Java.
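Because Pattern.matches() requires the entire string to match, it behaves differently from a search for a URL somewhere inside a longer text. The sketch below contrasts the two using Matcher.matches() and Matcher.find(). The unanchored pattern and the names WholeVsPartialDemo, matchesWhole, and containsMatch are assumptions made for this illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting full-string matching with searching
class WholeVsPartialDemo {
    // Simplified, unanchored pattern used only for this comparison
    static final Pattern UNANCHORED = Pattern.compile("(http|https)://[^\\s]+");

    // True only if the whole input is a match
    static boolean matchesWhole(String input) {
        return UNANCHORED.matcher(input).matches();
    }

    // True if any part of the input is a match
    static boolean containsMatch(String input) {
        return UNANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        String sentence = "Visit http://www.example.com today";
        String bareUrl = "http://www.example.com";

        System.out.println(matchesWhole(sentence));  // extra words: no full match
        System.out.println(containsMatch(sentence)); // a URL is found inside
        System.out.println(matchesWhole(bareUrl));   // the URL alone matches fully
    }
}
```

For validation, the full-match behavior is usually the one you want, since a string that merely contains a URL is not itself a URL.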
Validate Common URL Schemes
In the previous steps, we defined a simple regex pattern and used Pattern.matches() to test it. Our current pattern only validates URLs starting with http or https. However, URLs can have other schemes like ftp, mailto, file, etc.
In this step, we will modify our regex pattern to include more common URL schemes. A more robust regex pattern for URLs is quite complex, but we can expand our current pattern to include a few more common schemes for demonstration purposes.
Let's update the UrlValidator.java file.
Open the UrlValidator.java file in the WebIDE editor.
Modify the urlRegex string to include ftp and mailto schemes in addition to http and https. We will also add a test case for an ftp URL.
Replace the line:

    String urlRegex = "^(http|https)://[^\\s/$.?#].[^\\s]*$";

with:

    String urlRegex = "^(http|https|ftp|mailto)://[^\\s/$.?#].[^\\s]*$";

Notice that we simply added |ftp|mailto inside the parentheses (), which form a group; the | symbol acts as an "OR" operator. This means the pattern will now match strings starting with http, https, ftp, or mailto followed by ://. Keep in mind that real mailto addresses are written as mailto:user@example.com, without //, so this pattern does not match them; mailto is included here only to demonstrate adding alternatives to the group.
Because ftp is now an accepted scheme, the old url3 value (ftp://invalid-url.com) would be reported as valid. Change url3 so that it still demonstrates an invalid input:

    String url3 = "invalid-url.com"; // Invalid (missing scheme)

Add a new test case for an FTP URL. Add the following line after the definition of url4:

    String url5 = "ftp://ftp.example.com/files"; // Valid FTP URL

Add the validation for url5 after the validation for url4:

    boolean isUrl5Valid = Pattern.matches(urlRegex, url5);
    System.out.println(url5 + " is valid: " + isUrl5Valid);

Your complete UrlValidator.java file should now look like this:

    import java.util.regex.Pattern;

    public class UrlValidator {
        public static void main(String[] args) {
            // Define a simple regex pattern for a URL including more schemes
            String urlRegex = "^(http|https|ftp|mailto)://[^\\s/$.?#].[^\\s]*$";

            // Compile the regex pattern
            Pattern pattern = Pattern.compile(urlRegex);

            System.out.println("URL Regex Pattern Defined.");

            // Test some URLs
            String url1 = "http://www.example.com";
            String url2 = "https://example.org/path/to/page";
            String url3 = "invalid-url.com"; // Invalid (missing scheme)
            String url4 = "http:// example.com"; // Invalid character (space)
            String url5 = "ftp://ftp.example.com/files"; // Valid FTP URL

            System.out.println("\nTesting URLs:");

            boolean isUrl1Valid = Pattern.matches(urlRegex, url1);
            System.out.println(url1 + " is valid: " + isUrl1Valid);

            boolean isUrl2Valid = Pattern.matches(urlRegex, url2);
            System.out.println(url2 + " is valid: " + isUrl2Valid);

            boolean isUrl3Valid = Pattern.matches(urlRegex, url3);
            System.out.println(url3 + " is valid: " + isUrl3Valid);

            boolean isUrl4Valid = Pattern.matches(urlRegex, url4);
            System.out.println(url4 + " is valid: " + isUrl4Valid);

            boolean isUrl5Valid = Pattern.matches(urlRegex, url5);
            System.out.println(url5 + " is valid: " + isUrl5Valid);
        }
    }

Save the UrlValidator.java file.
Compile the updated code in the Terminal:

    javac UrlValidator.java

Run the compiled program:

    java UrlValidator

You should now see output similar to this, with the FTP URL also being marked as valid:
    URL Regex Pattern Defined.

    Testing URLs:
    http://www.example.com is valid: true
    https://example.org/path/to/page is valid: true
    invalid-url.com is valid: false
    http:// example.com is valid: false
    ftp://ftp.example.com/files is valid: true
You have successfully modified the regex pattern to include more common URL schemes and tested the updated pattern. This demonstrates how you can adjust regex patterns to match a wider range of inputs.
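One caveat about the pattern in this step: it requires :// after every scheme, but mailto addresses are normally written as mailto:user@example.com, with no slashes. The following is one possible sketch of a scheme-aware variant that gives mailto its own branch. The mailto branch is a deliberately loose assumption (some non-whitespace text, an @, more non-whitespace text) and not a full address validator, and the names SchemeAwareDemo, STEP_REGEX, SCHEME_AWARE, and isValid are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: separate branches for "://" schemes and mailto
class SchemeAwareDemo {
    // The pattern from this step, which expects "://" after every scheme
    static final String STEP_REGEX =
            "^(http|https|ftp|mailto)://[^\\s/$.?#].[^\\s]*$";

    // Assumed variant: http/https/ftp keep "://", mailto uses a plain ":"
    // followed by a loosely checked local-part@domain
    static final Pattern SCHEME_AWARE = Pattern.compile(
            "^(?:(?:http|https|ftp)://[^\\s/$.?#].[^\\s]*"
            + "|mailto:[^\\s@/]+@[^\\s@]+)$");

    static boolean isValid(String input) {
        return SCHEME_AWARE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String mail = "mailto:user@example.com";

        // The step's pattern rejects this because there is no "//"
        System.out.println(Pattern.matches(STEP_REGEX, mail));

        // The scheme-aware variant accepts it
        System.out.println(isValid(mail));

        // The other schemes behave as before
        System.out.println(isValid("ftp://ftp.example.com/files"));
        System.out.println(isValid("invalid-url.com"));
    }
}
```

Splitting the alternatives this way keeps each scheme's rules in its own branch, at the cost of a longer pattern.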
Summary
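In this lab, we began by learning how to define a regular expression pattern in Java specifically for validating URLs. We created a new Java file, UrlValidator.java, and imported the java.util.regex.Pattern class. We then defined a String variable urlRegex containing a basic regex pattern designed to match strings starting with "http" or "https" followed by "://", and compiled this pattern using Pattern.compile(). This initial step focused on setting up the necessary tools and defining the core pattern for URL validation using Java's built-in regex capabilities. Next, we used Pattern.matches() to test several example strings against the pattern and saw which were accepted and which were rejected. Finally, we extended the scheme group with ftp and mailto and added an FTP test case, which showed how adjusting a regex pattern changes the range of inputs it matches.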



