Step by step: URL validation in JavaScript

Uniform Resource Locators (URLs) are the backbone of the internet and a fundamental part of web development. They identify and provide access to resources on the web, including web pages, images, and other files. You will probably need to validate URLs at some point, and this article shows several ways of doing that.

There are different scenarios during development where you may need to verify that URLs provided by users are valid, and JavaScript offers several techniques for doing so. One quick method is the native URL constructor, as shown below.

const url = new URL("https://www.example.com");
if (url.protocol === "https:" && url.host !== "" && url.pathname !== "") {
  console.log("Valid URL");
}

While this code provides a starting point for validating URLs, there’s much more to URL validation than just checking the protocol, host, and pathname. In this article, we’ll dive into the fundamentals of URL validation in JavaScript and explore various techniques and best practices for creating robust and reliable URL validation in your web applications.

URL Structure

Before we start validating URLs in our JavaScript application, let’s quickly explore how they work and the different structures that make up a URL.

  • Protocol: Denoted by http:// or https:// at the start of the URL, the protocol determines how the resource is accessed.

  • Domain name: Usually found after the protocol (for example, www.example.com), the domain name acts as a unique identifier for a website.

  • Port number: The port number is an optional part of the URL that identifies the network port on the server to connect to. If it is not specified, a default port is used: 80 for “http” and 443 for “https”.

  • Pathname: The path located after the domain name (for example, /page1) identifies the precise location of the resource on the server hosting the website.

  • Query string: A URL’s optional query string, denoted by a ? symbol, sends extra info to the server. It can have multiple key=value pairs, for example, ?key1=value1&key2=value2.

  • Fragment identifier: The fragment identifier is also optional and is introduced by a hash symbol (#). It designates a particular spot within the resource, for example, #section2.

A full URL could look like this:

https://www.example.com:8080/path/to/file?key1=value1#section2
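
As a quick check of the breakdown above, the snippet below passes that same URL to the built-in URL constructor (covered in the next section) and reads each part back. This is a minimal sketch; the variable names are only for illustration.

```javascript
// Parse the example URL and read back each component.
const parsed = new URL(
  "https://www.example.com:8080/path/to/file?key1=value1#section2"
);

console.log(parsed.protocol); // "https:"
console.log(parsed.hostname); // "www.example.com"
console.log(parsed.port);     // "8080"
console.log(parsed.pathname); // "/path/to/file"
console.log(parsed.search);   // "?key1=value1"
console.log(parsed.hash);     // "#section2"

// Individual query parameters are available through searchParams.
console.log(parsed.searchParams.get("key1")); // "value1"

// When the port is the protocol's default, the port property is an empty string.
const defaultPort = new URL("https://www.example.com:443/");
console.log(defaultPort.port); // ""
```

Note that the protocol property includes the trailing colon, which matters when you compare it against a string.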

When it comes to validating URLs, there are several methods, and in the following sections, we will explore three practical approaches to ensure that your URLs are valid and secure.

1. Setting up the URL constructor

The URL constructor is built into browsers and Node.js, and it both parses and validates URLs. Given a string, it returns a new URL object with properties such as protocol, host, pathname, and search, which let you inspect the different parts of the URL.

The URL constructor returns a new URL object if the string passed to it is a valid URL. If the string is not a valid URL, the constructor throws a TypeError. Here’s an illustration:

const fccUrl = new URL("https://blog.openreplay.com/");
console.log(fccUrl);

When you log fccUrl to the console, this is the output you will receive:

A new url object

The object output shown above indicates that the string passed to the URL constructor was a valid URL.

Now, let’s examine the outcome when an invalid URL string is passed:

const fccUrl = new URL('openreplay');
console.log(fccUrl);

The string 'openreplay' is not a valid URL, resulting in a TypeError:

TypeError

In summary:

  1. Passing a valid URL string to the URL constructor returns a newly created URL object.
  2. Passing an invalid URL string to the URL constructor throws a TypeError.
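
Both outcomes can be observed directly. The sketch below catches the error for the bare string and also shows that the same string is accepted once a base URL is supplied as the constructor's optional second argument, because it is then resolved as a relative URL. The describe helper is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper: report how the URL constructor treats an input.
function describe(input, base) {
  try {
    const url = new URL(input, base);
    return `parsed: ${url.href}`;
  } catch (err) {
    // Invalid input makes the constructor throw a TypeError.
    return err instanceof TypeError ? "threw a TypeError" : "threw another error";
  }
}

console.log(describe("https://blog.openreplay.com/"));
// parsed: https://blog.openreplay.com/

console.log(describe("openreplay"));
// threw a TypeError

console.log(describe("openreplay", "https://blog.openreplay.com/"));
// parsed: https://blog.openreplay.com/openreplay
```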

With this understanding, it is possible to build a custom function that checks the validity of a particular URL string.

Creating a URL validator function with the URL constructor

With the URL constructor and a try...catch statement, it is possible to develop a custom function named isValidUrl to determine the validity of a specified URL string:

function isValidUrl(string) {
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}

The custom isValidUrl function returns a boolean: true if the string passed as an argument is a valid URL, and false otherwise.

console.log(isValidUrl("https://blog.openreplay.com/")); // true
console.log(isValidUrl("mailto://federico@openreplay.com")); // true
console.log(isValidUrl("openreplay")); // false
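
Newer runtimes also expose a static URL.canParse() method that performs the same check without a try...catch. The sketch below uses it when it is available and otherwise falls back to the constructor approach shown above. The name canParseUrl is hypothetical, and whether URL.canParse() exists in your target browsers or Node.js version is something to verify before relying on it.

```javascript
// Hypothetical wrapper: prefer URL.canParse() where the runtime provides it.
function canParseUrl(string) {
  if (typeof URL.canParse === "function") {
    return URL.canParse(string);
  }
  // Fallback for runtimes without URL.canParse().
  try {
    new URL(string);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(canParseUrl("https://blog.openreplay.com/")); // true
console.log(canParseUrl("openreplay")); // false
```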

Validating Only HTTP URLs with the URL Constructor

However, you may need to check whether the string is a valid HTTP URL and reject other valid URLs like 'mailto://federico@openreplay.com'.

If you examine the URL object, one of its attributes is the protocol property:

Protocol property as a url object attribute

Above is an example of a URL object where the value of the protocol property is 'https:'.

The validity of a string as an HTTP URL can be determined by utilizing the protocol property of the URL object:

function isValidHttpUrl(string) {
  try {
    const newUrl = new URL(string);
    return ["http:", "https:"].includes(newUrl.protocol);
  } catch (err) {
    return false;
  }
}

console.log(isValidHttpUrl("https://blog.openreplay.com/")); // true
console.log(isValidHttpUrl("mailto://federico@openreplay.com")); // false
console.log(isValidHttpUrl("openreplay")); // false

In the code above, we first create a new URL object. We then check whether its protocol property is either 'http:' or 'https:'. If it is, we return true; otherwise, we return false.
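
One reason the protocol check matters is that the URL constructor also accepts schemes such as javascript:, which you rarely want to accept from users. The sketch below generalizes the function above to take a list of allowed protocols. The name hasAllowedProtocol and its default list are assumptions for illustration, not part of any library.

```javascript
// Hypothetical generalization: accept only URLs whose protocol is in an allowlist.
function hasAllowedProtocol(string, allowedProtocols = ["http:", "https:"]) {
  try {
    const { protocol } = new URL(string);
    return allowedProtocols.includes(protocol);
  } catch (err) {
    return false; // not parseable as an absolute URL
  }
}

console.log(hasAllowedProtocol("https://blog.openreplay.com/")); // true
console.log(hasAllowedProtocol("javascript:alert(1)")); // false
console.log(hasAllowedProtocol("mailto://federico@openreplay.com", ["mailto:"])); // true
```

Passing the allowlist as a parameter keeps the parsing logic in one place while letting each call site decide which protocols it is willing to accept.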

2. Validating URL with Regex

Validating URLs using regular expressions (regex) is a common way to check whether a string matches the pattern of a URL.

Most web URLs follow a pattern that consists of three main components:

  • Protocol
  • Domain
  • Path

On occasion, a query string or fragment identifier appears after the path.

By understanding the composition of URLs, you can employ regular expressions to search for corresponding patterns in a string. If the pattern is found, the string passes the regular expression test; otherwise, it fails.

Also, using regex, you can check for all valid URLs or only check for valid HTTP URLs.

How to Validate URLs with Regex

Validating URLs with regex is not always the best approach: a pattern that covers every valid URL is complex and hard to maintain, and simpler patterns accept or reject some URLs incorrectly.

For that reason, it is often advisable to use a package designed for URL validation, such as the is-url-http package, which is discussed later in this article.

Here is an illustration of a regular expression that checks the validity of URLs:

function isValidUrl(str) {
  const pattern = new RegExp(
    "^([a-zA-Z]+:\\/\\/)?" + // protocol
      "((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|" + // domain name
      "((\\d{1,3}\\.){3}\\d{1,3}))" + // OR IP (v4) address
      "(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*" + // port and path
      "(\\?[;&a-z\\d%_.~+=-]*)?" + // query string
      "(\\#[-a-z\\d_]*)?$", // fragment locator
    "i"
  );
  return pattern.test(str);
}

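console.log(isValidUrl("https://blog.openreplay.com/")); // true
console.log(isValidUrl("ftp://blog.openreplay.com/")); // true
console.log(isValidUrl("mailto://federico@openreplay.com")); // false
console.log(isValidUrl("openreplay")); // false

The regular expression in the isValidUrl function determines whether a string looks like a URL. The protocol check ^([a-zA-Z]+:\\/\\/)? accepts any alphabetic scheme followed by ://, so it is not limited to https:.

This is why the second example, which uses the ftp: protocol, returns true. The mailto: example returns false, not because of its protocol, but because the pattern has no place for the federico@ part that precedes the domain.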

Validating HTTP URLs with Regex

To check whether a string is a valid HTTP URL, the protocol part of the pattern needs to be restricted.

Instead of ^([a-zA-Z]+:\\/\\/)?, use ^(https?:\\/\\/)?:

function isValidHttpUrl(str) {
  const pattern = new RegExp(
    "^(https?:\\/\\/)?" + // protocol
      "((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|" + // domain name
      "((\\d{1,3}\\.){3}\\d{1,3}))" + // OR ip (v4) address
      "(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*" + // port and path
      "(\\?[;&a-z\\d%_.~+=-]*)?" + // query string
      "(\\#[-a-z\\d_]*)?$", // fragment locator
    "i"
  );
  return pattern.test(str);
}

console.log(isValidHttpUrl("https://blog.openreplay.com/")); // true
console.log(isValidHttpUrl("mailto://federico@openreplay.com")); // false
console.log(isValidHttpUrl("openreplay")); // false

Only the first example, which uses the https: protocol, returns true. URL strings that use http: are also accepted.
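
To make the trade-off between the regex and the URL constructor concrete, the sketch below runs the HTTP pattern from this section and a constructor-based check side by side. The protocol group in the pattern is optional, so it accepts a bare domain, and its domain part requires a dot, so it rejects localhost; the constructor behaves the other way around in both cases. The function names matchesHttpPattern and parsesAsHttpUrl are hypothetical.

```javascript
// Same pattern as isValidHttpUrl above.
const httpPattern = new RegExp(
  "^(https?:\\/\\/)?" + // protocol (optional)
    "((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|" + // domain name
    "((\\d{1,3}\\.){3}\\d{1,3}))" + // OR IP (v4) address
    "(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*" + // port and path
    "(\\?[;&a-z\\d%_.~+=-]*)?" + // query string
    "(\\#[-a-z\\d_]*)?$", // fragment locator
  "i"
);

function matchesHttpPattern(str) {
  return httpPattern.test(str);
}

function parsesAsHttpUrl(str) {
  try {
    return ["http:", "https:"].includes(new URL(str).protocol);
  } catch (err) {
    return false;
  }
}

for (const input of [
  "https://blog.openreplay.com/",
  "blog.openreplay.com",
  "https://localhost:3000/",
]) {
  console.log(input, matchesHttpPattern(input), parsesAsHttpUrl(input));
}
// https://blog.openreplay.com/ true true
// blog.openreplay.com true false
// https://localhost:3000/ false true
```

Which behavior is correct depends on what your application needs to accept, so it is worth testing either approach against the inputs you expect.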

3. Utilizing npm Packages

When it comes to validating URLs with npm packages, you are provided with two options, namely is-url and is-url-http.

These packages offer the most straightforward approach to verifying if a string is a valid URL. You only need to pass the string as a parameter, and the packages will return either true or false.

Let’s look into the functionality of these two packages.

Validating URLs with is-url

The is-url package verifies whether a string is a valid URL, but it does not check the URL’s protocol, so it is not the right choice if you need to restrict which protocols are accepted.

To use is-url, first install it by running the following command:

npm install is-url

Then, import the package and pass your URL string as an argument:

import isUrl from "is-url";

const firstCheck = isUrl("https://blog.openreplay.com/");
const secondCheck = isUrl("mailto://federico@openreplay.com");
const thirdCheck = isUrl("openreplay");

console.log(firstCheck); // true
console.log(secondCheck); // true
console.log(thirdCheck); // false

The is-url package returns true for strings that possess valid URL formats and false for those that have an invalid format.

In the example provided, both the firstCheck (which has the https: protocol) and secondCheck (which has the mailto: protocol) return a value of true.

Validating HTTP URLs with the is-url-http Package

The is-url-http package is another option for verifying whether a string is a valid HTTP URL. It weighs 3.31KB, which makes it quick to download and use, and it checks both the URL format and the protocol. Its downside is that it accepts only HTTP and HTTPS URLs, so it is not a good fit if you need to allow other protocols.

Install the package using the command below:

npm install is-url-http

Next, import the package and pass the URL string to it as follows:

import isUrlHttp from "is-url-http";

const firstCheck = isUrlHttp("https://blog.openreplay.com/");
const secondCheck = isUrlHttp("mailto://federico@openreplay.com");
const thirdCheck = isUrlHttp("openreplay");

console.log(firstCheck); // true
console.log(secondCheck); // false
console.log(thirdCheck); // false

In this scenario, only firstCheck returns true. The is-url-http package not only verifies that the string is a valid URL, but it also ensures it’s a valid HTTP URL. This is why secondCheck returns false, as it is not a valid HTTP URL.

Conclusion

This guide introduced you to several methods of URL validation in JavaScript, including the pros and cons of some of them. We also discussed where each method is useful; it is now up to you to choose the one that best fits your needs.
