Step by step: URL validation in JavaScript
Uniform Resource Locators (URLs) are the backbone of the internet and a fundamental part of web development. They identify and provide access to resources on the web, including web pages, images, and other files. You will probably need to validate URLs at some point, and this article will give you several ways of doing that.
There are different scenarios during development where you may need to verify the validity of URLs provided by users, and JavaScript provides several techniques for doing so. One quick method is the native URL constructor, as shown below.
const url = new URL("https://www.example.com");
if (url.protocol === "https:" && url.host !== "" && url.pathname !== "") {
  console.log("Valid URL");
}
While this code provides a starting point for validating URLs, there’s much more to URL validation than just checking the protocol, host, and pathname. In this article, we’ll dive into the fundamentals of URL validation in JavaScript and explore various techniques and best practices for creating robust and reliable URL validation in your web applications.
URL Structure
Before we start validating URLs in our JavaScript application, let’s quickly explore how they work and the different structures that make up a URL.
- Protocol: Denoted by http:// or https:// at the start of the URL, the protocol determines how the resource is accessed.
- Domain name: Usually found after the protocol (for example, www.example.com), the domain name acts as a unique identifier for a website.
- Port number: An optional part of the URL that identifies the network port on the server to connect to. If it is omitted, the default port is used: 80 for http and 443 for https.
- Pathname: The path after the domain name (for example, /page1) identifies the location of the resource on the server hosting the website.
- Query string: An optional part introduced by a ? symbol that sends extra information to the server. It can contain multiple key=value pairs, for example, ?key1=value1&key2=value2.
- Fragment identifier: Another optional part, set apart from the other elements by a hash symbol (#). It designates a particular spot within the resource, for example, #section2.
A full URL could look like this:
https://www.example.com:8080/path/to/file?key1=value1#section2
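As a quick illustration of how these parts map onto code, the sketch below feeds the example URL to the built-in URL constructor (covered in the next section) and reads each component back. The variable names are chosen only for illustration.

```javascript
// Parse the example URL and read back each component.
const exampleUrl = new URL(
  "https://www.example.com:8080/path/to/file?key1=value1#section2"
);

const parts = {
  protocol: exampleUrl.protocol, // "https:"
  hostname: exampleUrl.hostname, // "www.example.com"
  port: exampleUrl.port, // "8080"
  pathname: exampleUrl.pathname, // "/path/to/file"
  search: exampleUrl.search, // "?key1=value1"
  hash: exampleUrl.hash, // "#section2"
};

// Individual query parameters are available through searchParams.
const key1 = exampleUrl.searchParams.get("key1"); // "value1"

console.log(parts, key1);
```

Note that the port property is an empty string when the URL uses the default port for its protocol.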
When it comes to validating URLs, there are several methods, and in the following sections, we will explore three practical approaches to ensure that your URLs are valid and secure.
1. Setting up the URL constructor
The URL constructor is a built-in Web API, available in browsers and Node.js, that parses and validates URLs. Given a string representing a URL, it parses the string and returns a new URL object with properties such as protocol, host, pathname, and search, which let you inspect the different parts of the URL.
The URL constructor returns a new URL object if the string passed to it is a valid URL. If the string is not a valid URL, the constructor throws a TypeError. Here’s an illustration:
const fccUrl = new URL("https://blog.openreplay.com/");
console.log(fccUrl);
When you log fccUrl to the console, the output is a URL object with properties such as href, protocol, host, and pathname. That object indicates that the string passed to the URL constructor was a valid URL.
Now, let’s examine the outcome when an invalid URL string is passed:
const fccUrl = new URL('openreplay');
console.log(fccUrl);
The string 'openreplay' is not a valid URL, so the constructor throws a TypeError.
In summary:
- Passing a valid URL string to the URL constructor returns a newly created URL object.
- Passing an invalid URL string to the URL constructor throws a TypeError.
With this understanding, it is possible to build a custom function that checks the validity of a given URL string.
Creating a URL validator function with the URL constructor
With the URL constructor and a try...catch statement, it is possible to develop a custom function named isValidUrl to determine the validity of a specified URL string:
function isValidUrl(string) {
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}
The custom isValidUrl function returns a boolean indicating whether the string passed as an argument is a valid URL: true if it is, and false otherwise.
console.log(isValidUrl("https://blog.openreplay.com/")); // true
console.log(isValidUrl("mailto://federico@openreplay.com")); // true
console.log(isValidUrl("openreplay")); // false
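Newer runtimes also expose a static URL.canParse() method that performs the same check without a try...catch. It is not available everywhere, so the sketch below falls back to the constructor when the method is missing. The name canParseUrl is hypothetical and used only for illustration.

```javascript
// Hypothetical helper: prefer URL.canParse where it exists,
// otherwise fall back to the try...catch approach shown above.
function canParseUrl(string) {
  if (typeof URL.canParse === "function") {
    return URL.canParse(string);
  }
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(canParseUrl("https://blog.openreplay.com/")); // true
console.log(canParseUrl("openreplay")); // false
```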
Validating Only HTTP URLs with the URL Constructor
However, you may need to check that the string represents a valid HTTP URL and exclude other valid URLs such as 'mailto://federico@openreplay.com'.
If you examine a URL object, one of its properties is protocol. For the URL object created earlier from "https://blog.openreplay.com/", the value of the protocol property is 'https:'.
You can therefore use the protocol property of the URL object to determine whether a string is a valid HTTP URL:
function isValidHttpUrl(string) {
  try {
    const newUrl = new URL(string);
    return ["http:", "https:"].includes(newUrl.protocol);
  } catch (err) {
    return false;
  }
}
console.log(isValidHttpUrl("https://blog.openreplay.com/")); // true
console.log(isValidHttpUrl("mailto://federico@openreplay.com")); // false
console.log(isValidHttpUrl("openreplay")); // false
In the code above, we first create a new URL object. We then check whether its protocol property is either 'http:' or 'https:'. If it is, we return true; otherwise, we return false.
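One reason the protocol check matters is that the URL constructor accepts any syntactically valid scheme, including ones you rarely want from user input, such as javascript:. The sketch below repeats both helpers so that it runs on its own and compares them on a few inputs; the sample strings are chosen only for illustration.

```javascript
function isValidUrl(string) {
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}

function isValidHttpUrl(string) {
  try {
    const newUrl = new URL(string);
    return ["http:", "https:"].includes(newUrl.protocol);
  } catch (err) {
    return false;
  }
}

// Illustrative inputs: all three parse as URLs, but only the last is HTTP.
const samples = [
  "javascript:alert(1)",
  "ftp://example.com/file.txt",
  "http://localhost:3000/",
];

for (const sample of samples) {
  console.log(sample, isValidUrl(sample), isValidHttpUrl(sample));
}
// javascript:alert(1) true false
// ftp://example.com/file.txt true false
// http://localhost:3000/ true true
```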
2. Validating URL with Regex
Validating URLs using regular expressions (regex) is a common way to check whether a string matches the pattern of a URL.
Most web URLs follow a pattern that consists of three main components:
- Protocol
- Domain
- Path
On occasion, a query string or fragment identifier appears after the path.
By understanding the composition of URLs, you can employ regular expressions to search for corresponding patterns in a string. If the pattern is found, the string passes the regular expression test; otherwise, it fails.
Also, using regex, you can check for all valid URLs or only check for valid HTTP URLs.
How to Validate URLs with Regex
Validating URLs with regex may not be the best approach: the patterns are complex, hard to maintain, and unlikely to match every valid URL accurately.
As a result, it is often advisable to use a package specifically designed for URL validation, such as the is-url-http package, which is discussed later in this article.
Here is an example of a regular expression that checks whether a string is a URL:
function isValidUrl(str) {
  const pattern = new RegExp(
    "^([a-zA-Z]+:\\/\\/)?" + // protocol
      "((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|" + // domain name
      "((\\d{1,3}\\.){3}\\d{1,3}))" + // OR IP (v4) address
      "(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*" + // port and path
      "(\\?[;&a-z\\d%_.~+=-]*)?" + // query string
      "(\\#[-a-z\\d_]*)?$", // fragment locator
    "i"
  );
  return pattern.test(str);
}
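console.log(isValidUrl("https://blog.openreplay.com/")); // true
console.log(isValidUrl("mailto://federico@openreplay.com")); // false
console.log(isValidUrl("openreplay")); // false
The regular expression in the isValidUrl function determines whether a string looks like a URL. Its protocol check, ^([a-zA-Z]+:\\/\\/)?, is not limited to https: it accepts any alphabetic scheme followed by ://, so a string such as "ftp://blog.openreplay.com/" also passes.
The second example still returns false, but for a different reason: the domain part of the pattern does not allow the @ character. The URL constructor accepts that same string, which shows how easily a hand-written pattern diverges from the built-in parser.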
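To see where the pattern and the URL constructor disagree, the following sketch runs both checks over a few strings. The regex is the one above, written as a literal to avoid the double escaping; parsesAsUrl is a hypothetical helper wrapping the constructor, and the sample inputs are illustrative.

```javascript
// The pattern from above, written as a regex literal.
const urlPattern =
  /^([a-zA-Z]+:\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(#[-a-z\d_]*)?$/i;

// Hypothetical helper wrapping the URL constructor.
function parsesAsUrl(string) {
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}

const samples = [
  "https://blog.openreplay.com/", // accepted by both
  "blog.openreplay.com", // regex accepts (protocol is optional), constructor rejects
  "http://localhost:3000/", // regex rejects (no dot-separated domain), constructor accepts
];

for (const sample of samples) {
  console.log(sample, {
    regex: urlPattern.test(sample),
    urlConstructor: parsesAsUrl(sample),
  });
}
```

Which behaviour is preferable depends on the input you expect; the point is only that the two approaches are not interchangeable.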
Validating HTTP URLs with Regex
To determine whether a string is a valid HTTP URL, the protocol part of the regular expression needs to be tightened.
Instead of ^([a-zA-Z]+:\\/\\/)?, use ^(https?:\\/\\/)?:
function isValidHttpUrl(str) {
  const pattern = new RegExp(
    "^(https?:\\/\\/)?" + // protocol
      "((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|" + // domain name
      "((\\d{1,3}\\.){3}\\d{1,3}))" + // OR IP (v4) address
      "(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*" + // port and path
      "(\\?[;&a-z\\d%_.~+=-]*)?" + // query string
      "(\\#[-a-z\\d_]*)?$", // fragment locator
    "i"
  );
  return pattern.test(str);
}
console.log(isValidHttpUrl("https://blog.openreplay.com/")); // true
console.log(isValidHttpUrl("mailto://federico@openreplay.com")); // false
console.log(isValidHttpUrl("openreplay")); // false
Only the first example, which has a valid https: protocol, returns true. Note that URL strings with http: are also valid.
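Because the protocol group in the pattern ends with ?, the scheme is optional, so a bare host name such as blog.openreplay.com also passes. If the scheme should be mandatory, one option is to drop that ?. The sketch below shows this variant as a regex literal; the function name isStrictHttpUrl is hypothetical, and the rest of the pattern is unchanged.

```javascript
// Hypothetical stricter variant: the http:// or https:// prefix is required.
function isStrictHttpUrl(str) {
  const pattern =
    /^https?:\/\/((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(#[-a-z\d_]*)?$/i;
  return pattern.test(str);
}

console.log(isStrictHttpUrl("https://blog.openreplay.com/")); // true
console.log(isStrictHttpUrl("http://blog.openreplay.com/")); // true
console.log(isStrictHttpUrl("blog.openreplay.com")); // false
console.log(isStrictHttpUrl("mailto://federico@openreplay.com")); // false
```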
3. Utilizing npm Packages
When it comes to validating URLs with npm packages, two options are is-url and is-url-http.
These packages offer the most straightforward approach to verifying if a string is a valid URL. You only need to pass the string as a parameter, and the packages will return either true or false.
Let’s look into the functionality of these two packages.
Validating URLs with is-url
The is-url package can be used to check whether a string is a valid URL. However, it does not validate the protocol of the URL passed in, which is a drawback if you need to restrict URLs to a particular protocol.
To use is-url, first install it by running the following command:
npm install is-url
Then, import the package and pass your URL string as an argument:
import isUrl from "is-url";
const firstCheck = isUrl("https://blog.openreplay.com/");
const secondCheck = isUrl("mailto://federico@openreplay.com");
const thirdCheck = isUrl("openreplay");
console.log(firstCheck); // true
console.log(secondCheck); // true
console.log(thirdCheck); // false
The is-url package returns true for strings that have a valid URL format and false for those that do not.
In the example provided, both firstCheck (which has the https: protocol) and secondCheck (which has the mailto: protocol) return true.
Validating HTTP URLs with the is-url-http Package
The is-url-http package is another option for checking whether a string is a valid HTTP URL. At 3.31KB it is lightweight, making it easy and quick to download and use, and it validates both the URL format and the HTTP protocol. Its downside is that it only accepts URLs with the HTTP or HTTPS protocol, so it is not a good fit if you need to validate other schemes.
Install the package using the command below:
npm install is-url-http
Next, import the package and pass the URL string to it as follows:
import isUrlHttp from "is-url-http";
const firstCheck = isUrlHttp("https://blog.openreplay.com/");
const secondCheck = isUrlHttp("mailto://federico@openreplay.com");
const thirdCheck = isUrlHttp("openreplay");
console.log(firstCheck); // true
console.log(secondCheck); // false
console.log(thirdCheck); // false
In this scenario, only firstCheck returns true. The is-url-http package not only verifies that the string is a valid URL, but it also ensures it’s a valid HTTP URL. This is why secondCheck returns false, as it is not a valid HTTP URL.
Conclusion
This guide introduced you to several methods of URL validation in JavaScript, including the pros and cons of some of them. We also discussed where each method is useful; it is now up to you to choose the one that best fits your needs.