Skip to main content

Levenshtein edit distance

The Levenshtein edit distance, also known as the minimum edit distance, is a metric used to quantify the difference between two strings. It measures the minimum number of single-character insertions, deletions, or substitutions required to transform one string into another. This distance metric finds applications in various fields, including spell checking, DNA sequence analysis, natural language processing, and more.

Problem Statement

Given two input strings, X and Y, we are tasked with finding the Levenshtein edit distance between them. Our goal is to compute the minimum number of operations required to transform string X into string Y, where each operation can be one of the following:

  1. Inserting a character into string X.
  2. Deleting a character from string X.
  3. Substituting a character in string X with another character.

Example

Let's take an example to understand the problem better. Consider two strings: X = "kitten" Y = "sitting"

To convert X to Y, we need to perform the following operations:

  1. Substitute 'k' with 's'.
  2. Insert 'i' after 's'.
  3. Insert 't' after 'i'.
  4. Substitute 'e' with 't'.
  5. Substitute 'n' with 'g'.

Thus, the Levenshtein edit distance between "kitten" and "sitting" is 5, as we need at least five operations to transform X into Y.

Standard Pseudocode

function levenshteinDistance(X, Y):
    n = length(X)
    m = length(Y)
    dp = 2D array of size (n+1) x (m+1)

    for i from 0 to n:
        dp[i][0] = i

    for j from 1 to m:
        dp[0][j] = j

    for i from 1 to n:
        for j from 1 to m:
            if X[i-1] = Y[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i-1][j], dp[i-1][j-1], dp[i][j-1]) + 1

    return dp[n][m]

Algorithm Explanation

  1. We create a 2D array dp to store the distances of substrings X[1:i] and Y[1:j].
  2. Initialize the first column dp[i][0] with the values from 0 to n (length of X).
  3. Initialize the first row dp[0][j] with the values from 0 to m (length of Y).
  4. Loop through the strings X and Y using nested loops, starting from index 1.
  5. If the characters at index i in X and j in Y are the same, set dp[i][j] to the value of dp[i-1][j-1]. Otherwise, set dp[i][j] to the minimum of the values of dp[i-1][j], dp[i-1][j-1], and dp[i][j-1], and increment it by 1.
  6. After completing the loops, the final edit distance will be stored in dp[n][m].

Resultant Output Explanation

In the given Java code, the main function creates an instance of the Distance class and initializes two strings X and Y. It then calculates the length of each string and calls the levenshteinDistance method to compute the edit distance between them. Finally, it prints the result on the console.

For the provided example: X = "bitterness" Y = "buttress"

The output of the code will be:

Given x : bitterness
Given y : buttress
Distance : 4

The Levenshtein edit distance between "bitterness" and "buttress" is 4, as it takes a minimum of 4 operations to transform X into Y. One possible transformation sequence is:

  1. Substitute 'b' with 'b'.
  2. Substitute 'i' with 'u'.
  3. Substitute 'e' with 't'.
  4. Substitute 'n' with 't'.

Code Solution

/*
  Java Program 
  Levenshtein edit distance
*/
public class Distance
{
    // Returns the minimum of a given three values
    public int minValue(int a, int b, int c)
    {
        if (a > b)
        {
            if (b > c)
            {
                return c;
            }
            else
            {
                return b;
            }
        }
        else
        {
            if (a > c)
            {
                return c;
            }
            else
            {
                return a;
            }
        }
    }
    public void levenshteinDistance(String x, String y, int n, int m)
    {
        int[][] dp = new int[n + 1][m + 1];
        // Loop controlling variables
        int i = 0;
        int j = 0;
        
        // Set the value of first column
        for (i = 0 ; i <= n; ++i ) 
        {
            dp[i][0] = i;
        }   
        // Set the value of first row
        for (i = 1 ; i <= m; ++i ) 
        {
             dp[0][i] = i;
        }   
        
        // iterate the loop through by length of x+1
        for (i = 1; i <= n; ++i)
        {
            // iterate the loop through by length of y+1
            for (j = 1; j <= m; ++j)
            {
                // minValue(top, top left, and left element)
                //  b   a
                //    ↖ ↑
                //  c ← x
                dp[i][j] = minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
            

                if (x.charAt(i - 1) != y.charAt(j - 1))
                {
                    
                    
                    dp[i][j] += 1 ;
                }
            }
        }
      	// Display given text
      	System.out.print("\n Given x : "+x);
      	System.out.print("\n Given y : "+y);
        // Display calculated result
        System.out.print("\n Distance : " + dp[n][m]);
    }
    public static void main(String[] args)
    {
        Distance task = new Distance();
        // Given text
        String x = "bitterness";
        String y = "buttress";
        // Get the length
        int n = x.length();
        int m = y.length();
        // Test 
        task.levenshteinDistance(x, y, n, m);
    }
}

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
// Include header file
#include <iostream>
#include <string>

using namespace std;
/*
  C++ Program 
  Levenshtein edit distance
*/
class Distance
{
	public:
		// Returns the minimum of a given three values
		int minValue(int a, int b, int c)
		{
			if (a > b)
			{
				if (b > c)
				{
					return c;
				}
				else
				{
					return b;
				}
			}
			else
			{
				if (a > c)
				{
					return c;
				}
				else
				{
					return a;
				}
			}
		}
	void levenshteinDistance(string x, string y, int n, int m)
	{
		int dp[n + 1][m + 1];
		// Loop controlling variables
		int i = 0;
		int j = 0;
		// Set the value of first column
		for (i = 0; i <= n; ++i)
		{
			dp[i][0] = i;
		}
		// Set the value of first row
		for (i = 1; i <= m; ++i)
		{
			dp[0][i] = i;
		}
		// iterate the loop through by length of x+1
		for (i = 1; i <= n; ++i)
		{
			// iterate the loop through by length of y+1
			for (j = 1; j <= m; ++j)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp[i][j] = this->minValue(
                  dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]
                );
				if (x[i - 1] != y[j - 1])
				{
					dp[i][j] += 1;
				}
			}
		}
		// Display given text
		cout << "\n Given x : " << x;
		cout << "\n Given y : " << y;
		// Display calculated result
		cout << "\n Distance : " << dp[n][m];
	}
};
int main()
{
	Distance task = Distance();
	// Given text
	string x = "bitterness";
	string y = "buttress";
	// Get the length
	int n = x.length();
	int m = y.length();
	// Test
	task.levenshteinDistance(x, y, n, m);
	return 0;
}

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
// Include namespace system
using System;
/*
  C# Program 
  Levenshtein edit distance
*/
public class Distance
{
	// Returns the minimum of a given three values
	public int minValue(int a, int b, int c)
	{
		if (a > b)
		{
			if (b > c)
			{
				return c;
			}
			else
			{
				return b;
			}
		}
		else
		{
			if (a > c)
			{
				return c;
			}
			else
			{
				return a;
			}
		}
	}
	public void levenshteinDistance(String x, String y, int n, int m)
	{
		int[,] dp = new int[n + 1,m + 1];
		// Loop controlling variables
		int i = 0;
		int j = 0;
		// Set the value of first column
		for (i = 0; i <= n; ++i)
		{
			dp[i,0] = i;
		}
		// Set the value of first row
		for (i = 1; i <= m; ++i)
		{
			dp[0,i] = i;
		}
		// iterate the loop through by length of x+1
		for (i = 1; i <= n; ++i)
		{
			// iterate the loop through by length of y+1
			for (j = 1; j <= m; ++j)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp[i,j] = minValue(dp[i - 1,j], dp[i - 1,j - 1], dp[i,j - 1]);
				if (x[i - 1] != y[j - 1])
				{
					dp[i,j] += 1;
				}
			}
		}
		// Display given text
		Console.Write("\n Given x : " + x);
		Console.Write("\n Given y : " + y);
		// Display calculated result
		Console.Write("\n Distance : " + dp[n,m]);
	}
	public static void Main(String[] args)
	{
		Distance task = new Distance();
		// Given text
		String x = "bitterness";
		String y = "buttress";
		// Get the length
		int n = x.Length;
		int m = y.Length;
		// Test
		task.levenshteinDistance(x, y, n, m);
	}
}

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
<?php
/*
  Php Program 
  Levenshtein edit distance
*/
class Distance
{
	// Returns the minimum of a given three values
	public	function minValue($a, $b, $c)
	{
		if ($a > $b)
		{
			if ($b > $c)
			{
				return $c;
			}
			else
			{
				return $b;
			}
		}
		else
		{
			if ($a > $c)
			{
				return $c;
			}
			else
			{
				return $a;
			}
		}
	}
	public	function levenshteinDistance($x, $y, $n, $m)
	{
		$dp = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
		// Loop controlling variables
		$i = 0;
		$j = 0;
		// Set the value of first column
		for ($i = 0; $i <= $n; ++$i)
		{
			$dp[$i][0] = $i;
		}
		// Set the value of first row
		for ($i = 1; $i <= $m; ++$i)
		{
			$dp[0][$i] = $i;
		}
		// iterate the loop through by length of x+1
		for ($i = 1; $i <= $n; ++$i)
		{
			// iterate the loop through by length of y+1
			for ($j = 1; $j <= $m; ++$j)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				$dp[$i][$j] = $this->minValue(
                  $dp[$i - 1][$j], $dp[$i - 1][$j - 1], $dp[$i][$j - 1]
                );
				if ($x[$i - 1] != $y[$j - 1])
				{
					$dp[$i][$j] += 1;
				}
			}
		}
		// Display given text
		echo "\n Given x : ". $x;
		echo "\n Given y : ". $y;
		// Display calculated result
		echo "\n Distance : ". $dp[$n][$m];
	}
}

function main()
{
	$task = new Distance();
	// Given text
	$x = "bitterness";
	$y = "buttress";
	// Get the length
	$n = strlen($x);
	$m = strlen($y);
	$task->levenshteinDistance($x, $y, $n, $m);
}
main();

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
/*
  Node Js Program 
  Levenshtein edit distance
*/
class Distance
{
	// Returns the minimum of a given three values
	minValue(a, b, c)
	{
		if (a > b)
		{
			if (b > c)
			{
				return c;
			}
			else
			{
				return b;
			}
		}
		else
		{
			if (a > c)
			{
				return c;
			}
			else
			{
				return a;
			}
		}
	}
	levenshteinDistance(x, y, n, m)
	{
		var dp = Array(n + 1).fill(0).map(() => new Array(m + 1).fill(0));
		// Loop controlling variables
		var i = 0;
		var j = 0;
		// Set the value of first column
		for (i = 0; i <= n; ++i)
		{
			dp[i][0] = i;
		}
		// Set the value of first row
		for (i = 1; i <= m; ++i)
		{
			dp[0][i] = i;
		}
		// iterate the loop through by length of x+1
		for (i = 1; i <= n; ++i)
		{
			// iterate the loop through by length of y+1
			for (j = 1; j <= m; ++j)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp[i][j] = this.minValue(
                  dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]
                );
				if (x.charAt(i - 1) != y.charAt(j - 1))
				{
					dp[i][j] += 1;
				}
			}
		}
		// Display given text
		process.stdout.write("\n Given x : " + x);
		process.stdout.write("\n Given y : " + y);
		// Display calculated result
		process.stdout.write("\n Distance : " + dp[n][m]);
	}
}

function main()
{
	var task = new Distance();
	// Given text
	var x = "bitterness";
	var y = "buttress";
	// Get the length
	var n = x.length;
	var m = y.length;
	// Test
	task.levenshteinDistance(x, y, n, m);
}
main();

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
#   Python 3 Program 
#   Levenshtein edit distance

class Distance :
	#  Returns the minimum of a given three values
	def minValue(self, a, b, c) :
		if (a > b) :
			if (b > c) :
				return c
			else :
				return b
			
		else :
			if (a > c) :
				return c
			else :
				return a
			
		
	
	def levenshteinDistance(self, x, y, n, m) :
		dp = [[0] * (m + 1) for _ in range(n + 1) ]
		#  Loop controlling variables
		i = 0
		j = 0
		#  Set the value of first column
		while (i <= n) :
			dp[i][0] = i
			i += 1
		
		i = 1
		#  Set the value of first row
		while (i <= m) :
			dp[0][i] = i
			i += 1
		
		i = 1
		#  iterate the loop through by length of x+1
		while (i <= n) :
			j = 1
			#  iterate the loop through by length of y+1
			while (j <= m) :
				#  minValue(top, top left, and left element)
				#   b   a
				#     ↖ ↑
				#   c ← x
				dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
				if (x[i - 1] != y[j - 1]) :
					dp[i][j] += 1
				
				j += 1
			
			i += 1
		
		#  Display given text
		print("\n Given x : ", x, end = "")
		print("\n Given y : ", y, end = "")
		#  Display calculated result
		print("\n Distance : ", dp[n][m], end = "")
	

def main() :
	task = Distance()
	#  Given text
	x = "bitterness"
	y = "buttress"
	#  Get the length
	n = len(x)
	m = len(y)
	#  Test
	task.levenshteinDistance(x, y, n, m)

if __name__ == "__main__": main()

Output

 Given x :  bitterness
 Given y :  buttress
 Distance :  3
#   Ruby Program 
#   Levenshtein edit distance

class Distance 
	#  Returns the minimum of a given three values
	def minValue(a, b, c) 
		if (a > b) 
			if (b > c) 
				return c
			else 
				return b
			end

		else 
			if (a > c) 
				return c
			else 
				return a
			end

		end

	end

	def levenshteinDistance(x, y, n, m) 
		dp = Array.new(n + 1) {Array.new(m + 1) {0}}
		#  Loop controlling variables
		i = 0
		j = 0
		#  Set the value of first column
		while (i <= n) 
			dp[i][0] = i
			i += 1
		end

		i = 1
		#  Set the value of first row
		while (i <= m) 
			dp[0][i] = i
			i += 1
		end

		i = 1
		#  iterate the loop through by length of x+1
		while (i <= n) 
			j = 1
			#  iterate the loop through by length of y+1
			while (j <= m) 
				#  minValue(top, top left, and left element)
				#   b   a
				#     ↖ ↑
				#   c ← x
				dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
				if (x[i - 1] != y[j - 1]) 
					dp[i][j] += 1
				end

				j += 1
			end

			i += 1
		end

		#  Display given text
		print("\n Given x : ", x)
		print("\n Given y : ", y)
		#  Display calculated result
		print("\n Distance : ", dp[n][m])
	end

end

def main() 
	task = Distance.new()
	#  Given text
	x = "bitterness"
	y = "buttress"
	#  Get the length
	n = x.length
	m = y.length
	#  Test
	task.levenshteinDistance(x, y, n, m)
end

main()

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
/*
  Scala Program 
  Levenshtein edit distance
*/
class Distance
{
	// Returns the minimum of a given three values
	def minValue(a: Int, b: Int, c: Int): Int = {
		if (a > b)
		{
			if (b > c)
			{
				return c;
			}
			else
			{
				return b;
			}
		}
		else
		{
			if (a > c)
			{
				return c;
			}
			else
			{
				return a;
			}
		}
	}
	def levenshteinDistance(x: String, y: String, n: Int, m: Int): Unit = {
		var dp: Array[Array[Int]] = Array.fill[Int](n + 1, m + 1)(0);
		// Loop controlling variables
		var i: Int = 0;
		var j: Int = 0;
		// Set the value of first column
		while (i <= n)
		{
			dp(i)(0) = i;
			i += 1;
		}
		i = 1;
		// Set the value of first row
		while (i <= m)
		{
			dp(0)(i) = i;
			i += 1;
		}
		i = 1;
		// iterate the loop through by length of x+1
		while (i <= n)
		{
			j = 1;
			// iterate the loop through by length of y+1
			while (j <= m)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp(i)(j) = this.minValue(
                  dp(i - 1)(j), dp(i - 1)(j - 1), dp(i)(j - 1)
                );
				if (x.charAt(i - 1) != y.charAt(j - 1))
				{
					dp(i)(j) += 1;
				}
				j += 1;
			}
			i += 1;
		}
		// Display given text
		print("\n Given x : " + x);
		print("\n Given y : " + y);
		// Display calculated result
		print("\n Distance : " + dp(n)(m));
	}
}
object Main
{
	def main(args: Array[String]): Unit = {
		var task: Distance = new Distance();
		// Given text
		var x: String = "bitterness";
		var y: String = "buttress";
		// Get the length
		var n: Int = x.length();
		var m: Int = y.length();
		// Test
		task.levenshteinDistance(x, y, n, m);
	}
}

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
import Foundation
/*
  Swift 4 Program 
  Levenshtein edit distance
*/
class Distance
{
	// Returns the minimum of a given three values
	func minValue(_ a: Int, _ b: Int, _ c: Int)->Int
	{
		if (a > b)
		{
			if (b > c)
			{
				return c;
			}
			else
			{
				return b;
			}
		}
		else
		{
			if (a > c)
			{
				return c;
			}
			else
			{
				return a;
			}
		}
	}
	func levenshteinDistance(_ x1: String, _ y1: String, _ n: Int, _ m: Int)
	{
        let x = Array(x1);
      	let y = Array(y1);
		var dp: [[Int]] = 
          Array(repeating: Array(repeating: 0, count: m + 1), count: n + 1);
		// Loop controlling variables
		var i: Int = 0;
		var j: Int = 0;
		// Set the value of first column
		while (i <= n)
		{
			dp[i][0] = i;
			i += 1;
		}
		i = 1;
		// Set the value of first row
		while (i <= m)
		{
			dp[0][i] = i;
			i += 1;
		}
		i = 1;
		// iterate the loop through by length of x+1
		while (i <= n)
		{
			j = 1;
			// iterate the loop through by length of y+1
			while (j <= m)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
				if (x[i - 1]  != y[j - 1])
				{
					dp[i][j] += 1;
				}
				j += 1;
			}
			i += 1;
		}
		// Display given text
		print("\n Given x : ", x1, terminator: "");
		print("\n Given y : ", y1, terminator: "");
		// Display calculated result
		print("\n Distance : ", dp[n][m], terminator: "");
	}
}
func main()
{
	let task: Distance = Distance();
	// Given text
	let x: String = "bitterness";
	let y: String = "buttress";
	// Get the length
	let n: Int = x.count;
	let m: Int = y.count;
	// Test
	task.levenshteinDistance(x, y, n, m);
}
main();

Output

 Given x :  bitterness
 Given y :  buttress
 Distance :  3
/*
  Kotlin Program 
  Levenshtein edit distance
*/
class Distance
{
	// Returns the minimum of a given three values
	fun minValue(a: Int, b: Int, c: Int): Int
	{
		if (a > b)
		{
			if (b > c)
			{
				return c;
			}
			else
			{
				return b;
			}
		}
		else
		{
			if (a > c)
			{
				return c;
			}
			else
			{
				return a;
			}
		}
	}
	fun levenshteinDistance(x: String, y: String, n: Int, m: Int): Unit
	{
		var dp: Array < Array < Int >> = Array(n + 1)
		{
			Array(m + 1)
			{
				0
			}
		};
		// Loop controlling variables
		var i: Int = 0;
		var j: Int ;
		// Set the value of first column
		while (i <= n)
		{
			dp[i][0] = i;
			i += 1;
		}
		i = 1;
		// Set the value of first row
		while (i <= m)
		{
			dp[0][i] = i;
			i += 1;
		}
		i = 1;
		// iterate the loop through by length of x+1
		while (i <= n)
		{
			j = 1;
			// iterate the loop through by length of y+1
			while (j <= m)
			{
				// minValue(top, top left, and left element)
				//  b   a
				//    ↖ ↑
				//  c ← x
				dp[i][j] = this.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
				if (x.get(i - 1) != y.get(j - 1))
				{
					dp[i][j] += 1;
				}
				j += 1;
			}
			i += 1;
		}
		// Display given text
		print("\n Given x : " + x);
		print("\n Given y : " + y);
		// Display calculated result
		print("\n Distance : " + dp[n][m]);
	}
}
fun main(args: Array < String > ): Unit
{
	var task: Distance = Distance();
	// Given text
	var x: String = "bitterness";
	var y: String = "buttress";
	// Get the length
	var n: Int = x.length;
	var m: Int = y.length;
	// Test
	task.levenshteinDistance(x, y, n, m);
}

Output

 Given x : bitterness
 Given y : buttress
 Distance : 3
Levenshtein edit distance solution

Time Complexity

Let's analyze the time complexity of the algorithm. The two nested loops iterate over the lengths of strings X and Y (n and m, respectively). In each iteration, we perform constant-time operations. Therefore, the time complexity of the algorithm is O(n * m).

Finally

In this article, we explored the Levenshtein edit distance problem, where we aimed to find the minimum number of single-character insertions, deletions, or substitutions needed to transform one string into another. We provided a detailed explanation of the problem, along with a suitable example and standard pseudocode. The Java code provided in the example was analyzed, and the output for a specific input was explained. Additionally, we discussed the time complexity of the algorithm, which is O(n * m), where n and m are the lengths of the input strings.





Comment

Please share your knowledge to improve code and content standard. Also submit your doubts, and test case. We improve by your feedback. We will try to resolve your query as soon as possible.

New Comment