Levenshtein edit distance
The Levenshtein edit distance, also known as the minimum edit distance, is a metric used to quantify the difference between two strings. It measures the minimum number of single-character insertions, deletions, or substitutions required to transform one string into another. This distance metric finds applications in various fields, including spell checking, DNA sequence analysis, natural language processing, and more.
Problem Statement
Given two input strings, X and Y, we are tasked with finding the Levenshtein edit distance between them. Our goal is to compute the minimum number of operations required to transform string X into string Y, where each operation can be one of the following:
- Inserting a character into string X.
- Deleting a character from string X.
- Substituting a character in string X with another character.
Example
Let's take an example to understand the problem better. Consider two strings: X = "kitten" Y = "sitting"
To convert X to Y, we need to perform the following operations:
- Substitute 'k' with 's'.
- Insert 'i' after 's'.
- Insert 't' after 'i'.
- Substitute 'e' with 't'.
- Substitute 'n' with 'g'.
Thus, the Levenshtein edit distance between "kitten" and "sitting" is 5, as we need at least five operations to transform X into Y.
Standard Pseudocode
function levenshteinDistance(X, Y):
n = length(X)
m = length(Y)
dp = 2D array of size (n+1) x (m+1)
for i from 0 to n:
dp[i][0] = i
for j from 1 to m:
dp[0][j] = j
for i from 1 to n:
for j from 1 to m:
if X[i-1] = Y[j-1]:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = min(dp[i-1][j], dp[i-1][j-1], dp[i][j-1]) + 1
return dp[n][m]
Algorithm Explanation
- We create a 2D array
dp
to store the distances of substrings X[1:i] and Y[1:j]. - Initialize the first column
dp[i][0]
with the values from 0 to n (length of X). - Initialize the first row
dp[0][j]
with the values from 0 to m (length of Y). - Loop through the strings X and Y using nested loops, starting from index 1.
- If the characters at index
i
in X andj
in Y are the same, setdp[i][j]
to the value ofdp[i-1][j-1]
. Otherwise, setdp[i][j]
to the minimum of the values ofdp[i-1][j]
,dp[i-1][j-1]
, anddp[i][j-1]
, and increment it by 1. - After completing the loops, the final edit distance will be stored in
dp[n][m]
.
Resultant Output Explanation
In the given Java code, the main
function creates an instance of the Distance
class and
initializes two strings X and Y. It then calculates the length of each string and calls the
levenshteinDistance
method to compute the edit distance between them. Finally, it prints the result on
the console.
For the provided example: X = "bitterness" Y = "buttress"
The output of the code will be:
Given x : bitterness
Given y : buttress
Distance : 4
The Levenshtein edit distance between "bitterness" and "buttress" is 4, as it takes a minimum of 4 operations to transform X into Y. One possible transformation sequence is:
- Substitute 'b' with 'b'.
- Substitute 'i' with 'u'.
- Substitute 'e' with 't'.
- Substitute 'n' with 't'.
Code Solution
/*
Java Program
Levenshtein edit distance
*/
public class Distance
{
// Returns the minimum of a given three values
public int minValue(int a, int b, int c)
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
public void levenshteinDistance(String x, String y, int n, int m)
{
int[][] dp = new int[n + 1][m + 1];
// Loop controlling variables
int i = 0;
int j = 0;
// Set the value of first column
for (i = 0 ; i <= n; ++i )
{
dp[i][0] = i;
}
// Set the value of first row
for (i = 1 ; i <= m; ++i )
{
dp[0][i] = i;
}
// iterate the loop through by length of x+1
for (i = 1; i <= n; ++i)
{
// iterate the loop through by length of y+1
for (j = 1; j <= m; ++j)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i][j] = minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
if (x.charAt(i - 1) != y.charAt(j - 1))
{
dp[i][j] += 1 ;
}
}
}
// Display given text
System.out.print("\n Given x : "+x);
System.out.print("\n Given y : "+y);
// Display calculated result
System.out.print("\n Distance : " + dp[n][m]);
}
public static void main(String[] args)
{
Distance task = new Distance();
// Given text
String x = "bitterness";
String y = "buttress";
// Get the length
int n = x.length();
int m = y.length();
// Test
task.levenshteinDistance(x, y, n, m);
}
}
Output
Given x : bitterness
Given y : buttress
Distance : 3
// Include header file
#include <iostream>
#include <string>
using namespace std;
/*
C++ Program
Levenshtein edit distance
*/
class Distance
{
public:
// Returns the minimum of a given three values
int minValue(int a, int b, int c)
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
void levenshteinDistance(string x, string y, int n, int m)
{
int dp[n + 1][m + 1];
// Loop controlling variables
int i = 0;
int j = 0;
// Set the value of first column
for (i = 0; i <= n; ++i)
{
dp[i][0] = i;
}
// Set the value of first row
for (i = 1; i <= m; ++i)
{
dp[0][i] = i;
}
// iterate the loop through by length of x+1
for (i = 1; i <= n; ++i)
{
// iterate the loop through by length of y+1
for (j = 1; j <= m; ++j)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i][j] = this->minValue(
dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]
);
if (x[i - 1] != y[j - 1])
{
dp[i][j] += 1;
}
}
}
// Display given text
cout << "\n Given x : " << x;
cout << "\n Given y : " << y;
// Display calculated result
cout << "\n Distance : " << dp[n][m];
}
};
int main()
{
Distance task = Distance();
// Given text
string x = "bitterness";
string y = "buttress";
// Get the length
int n = x.length();
int m = y.length();
// Test
task.levenshteinDistance(x, y, n, m);
return 0;
}
Output
Given x : bitterness
Given y : buttress
Distance : 3
// Include namespace system
using System;
/*
C# Program
Levenshtein edit distance
*/
public class Distance
{
// Returns the minimum of a given three values
public int minValue(int a, int b, int c)
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
public void levenshteinDistance(String x, String y, int n, int m)
{
int[,] dp = new int[n + 1,m + 1];
// Loop controlling variables
int i = 0;
int j = 0;
// Set the value of first column
for (i = 0; i <= n; ++i)
{
dp[i,0] = i;
}
// Set the value of first row
for (i = 1; i <= m; ++i)
{
dp[0,i] = i;
}
// iterate the loop through by length of x+1
for (i = 1; i <= n; ++i)
{
// iterate the loop through by length of y+1
for (j = 1; j <= m; ++j)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i,j] = minValue(dp[i - 1,j], dp[i - 1,j - 1], dp[i,j - 1]);
if (x[i - 1] != y[j - 1])
{
dp[i,j] += 1;
}
}
}
// Display given text
Console.Write("\n Given x : " + x);
Console.Write("\n Given y : " + y);
// Display calculated result
Console.Write("\n Distance : " + dp[n,m]);
}
public static void Main(String[] args)
{
Distance task = new Distance();
// Given text
String x = "bitterness";
String y = "buttress";
// Get the length
int n = x.Length;
int m = y.Length;
// Test
task.levenshteinDistance(x, y, n, m);
}
}
Output
Given x : bitterness
Given y : buttress
Distance : 3
<?php
/*
Php Program
Levenshtein edit distance
*/
class Distance
{
// Returns the minimum of a given three values
public function minValue($a, $b, $c)
{
if ($a > $b)
{
if ($b > $c)
{
return $c;
}
else
{
return $b;
}
}
else
{
if ($a > $c)
{
return $c;
}
else
{
return $a;
}
}
}
public function levenshteinDistance($x, $y, $n, $m)
{
$dp = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
// Loop controlling variables
$i = 0;
$j = 0;
// Set the value of first column
for ($i = 0; $i <= $n; ++$i)
{
$dp[$i][0] = $i;
}
// Set the value of first row
for ($i = 1; $i <= $m; ++$i)
{
$dp[0][$i] = $i;
}
// iterate the loop through by length of x+1
for ($i = 1; $i <= $n; ++$i)
{
// iterate the loop through by length of y+1
for ($j = 1; $j <= $m; ++$j)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
$dp[$i][$j] = $this->minValue(
$dp[$i - 1][$j], $dp[$i - 1][$j - 1], $dp[$i][$j - 1]
);
if ($x[$i - 1] != $y[$j - 1])
{
$dp[$i][$j] += 1;
}
}
}
// Display given text
echo "\n Given x : ". $x;
echo "\n Given y : ". $y;
// Display calculated result
echo "\n Distance : ". $dp[$n][$m];
}
}
function main()
{
$task = new Distance();
// Given text
$x = "bitterness";
$y = "buttress";
// Get the length
$n = strlen($x);
$m = strlen($y);
$task->levenshteinDistance($x, $y, $n, $m);
}
main();
Output
Given x : bitterness
Given y : buttress
Distance : 3
/*
Node Js Program
Levenshtein edit distance
*/
class Distance
{
// Returns the minimum of a given three values
minValue(a, b, c)
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
levenshteinDistance(x, y, n, m)
{
var dp = Array(n + 1).fill(0).map(() => new Array(m + 1).fill(0));
// Loop controlling variables
var i = 0;
var j = 0;
// Set the value of first column
for (i = 0; i <= n; ++i)
{
dp[i][0] = i;
}
// Set the value of first row
for (i = 1; i <= m; ++i)
{
dp[0][i] = i;
}
// iterate the loop through by length of x+1
for (i = 1; i <= n; ++i)
{
// iterate the loop through by length of y+1
for (j = 1; j <= m; ++j)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i][j] = this.minValue(
dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]
);
if (x.charAt(i - 1) != y.charAt(j - 1))
{
dp[i][j] += 1;
}
}
}
// Display given text
process.stdout.write("\n Given x : " + x);
process.stdout.write("\n Given y : " + y);
// Display calculated result
process.stdout.write("\n Distance : " + dp[n][m]);
}
}
function main()
{
var task = new Distance();
// Given text
var x = "bitterness";
var y = "buttress";
// Get the length
var n = x.length;
var m = y.length;
// Test
task.levenshteinDistance(x, y, n, m);
}
main();
Output
Given x : bitterness
Given y : buttress
Distance : 3
# Python 3 Program
# Levenshtein edit distance
class Distance :
# Returns the minimum of a given three values
def minValue(self, a, b, c) :
if (a > b) :
if (b > c) :
return c
else :
return b
else :
if (a > c) :
return c
else :
return a
def levenshteinDistance(self, x, y, n, m) :
dp = [[0] * (m + 1) for _ in range(n + 1) ]
# Loop controlling variables
i = 0
j = 0
# Set the value of first column
while (i <= n) :
dp[i][0] = i
i += 1
i = 1
# Set the value of first row
while (i <= m) :
dp[0][i] = i
i += 1
i = 1
# iterate the loop through by length of x+1
while (i <= n) :
j = 1
# iterate the loop through by length of y+1
while (j <= m) :
# minValue(top, top left, and left element)
# b a
# ↖ ↑
# c ← x
dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
if (x[i - 1] != y[j - 1]) :
dp[i][j] += 1
j += 1
i += 1
# Display given text
print("\n Given x : ", x, end = "")
print("\n Given y : ", y, end = "")
# Display calculated result
print("\n Distance : ", dp[n][m], end = "")
def main() :
task = Distance()
# Given text
x = "bitterness"
y = "buttress"
# Get the length
n = len(x)
m = len(y)
# Test
task.levenshteinDistance(x, y, n, m)
if __name__ == "__main__": main()
Output
Given x : bitterness
Given y : buttress
Distance : 3
# Ruby Program
# Levenshtein edit distance
class Distance
# Returns the minimum of a given three values
def minValue(a, b, c)
if (a > b)
if (b > c)
return c
else
return b
end
else
if (a > c)
return c
else
return a
end
end
end
def levenshteinDistance(x, y, n, m)
dp = Array.new(n + 1) {Array.new(m + 1) {0}}
# Loop controlling variables
i = 0
j = 0
# Set the value of first column
while (i <= n)
dp[i][0] = i
i += 1
end
i = 1
# Set the value of first row
while (i <= m)
dp[0][i] = i
i += 1
end
i = 1
# iterate the loop through by length of x+1
while (i <= n)
j = 1
# iterate the loop through by length of y+1
while (j <= m)
# minValue(top, top left, and left element)
# b a
# ↖ ↑
# c ← x
dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
if (x[i - 1] != y[j - 1])
dp[i][j] += 1
end
j += 1
end
i += 1
end
# Display given text
print("\n Given x : ", x)
print("\n Given y : ", y)
# Display calculated result
print("\n Distance : ", dp[n][m])
end
end
def main()
task = Distance.new()
# Given text
x = "bitterness"
y = "buttress"
# Get the length
n = x.length
m = y.length
# Test
task.levenshteinDistance(x, y, n, m)
end
main()
Output
Given x : bitterness
Given y : buttress
Distance : 3
/*
Scala Program
Levenshtein edit distance
*/
class Distance
{
// Returns the minimum of a given three values
def minValue(a: Int, b: Int, c: Int): Int = {
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
def levenshteinDistance(x: String, y: String, n: Int, m: Int): Unit = {
var dp: Array[Array[Int]] = Array.fill[Int](n + 1, m + 1)(0);
// Loop controlling variables
var i: Int = 0;
var j: Int = 0;
// Set the value of first column
while (i <= n)
{
dp(i)(0) = i;
i += 1;
}
i = 1;
// Set the value of first row
while (i <= m)
{
dp(0)(i) = i;
i += 1;
}
i = 1;
// iterate the loop through by length of x+1
while (i <= n)
{
j = 1;
// iterate the loop through by length of y+1
while (j <= m)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp(i)(j) = this.minValue(
dp(i - 1)(j), dp(i - 1)(j - 1), dp(i)(j - 1)
);
if (x.charAt(i - 1) != y.charAt(j - 1))
{
dp(i)(j) += 1;
}
j += 1;
}
i += 1;
}
// Display given text
print("\n Given x : " + x);
print("\n Given y : " + y);
// Display calculated result
print("\n Distance : " + dp(n)(m));
}
}
object Main
{
def main(args: Array[String]): Unit = {
var task: Distance = new Distance();
// Given text
var x: String = "bitterness";
var y: String = "buttress";
// Get the length
var n: Int = x.length();
var m: Int = y.length();
// Test
task.levenshteinDistance(x, y, n, m);
}
}
Output
Given x : bitterness
Given y : buttress
Distance : 3
import Foundation
/*
Swift 4 Program
Levenshtein edit distance
*/
class Distance
{
// Returns the minimum of a given three values
func minValue(_ a: Int, _ b: Int, _ c: Int)->Int
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
func levenshteinDistance(_ x1: String, _ y1: String, _ n: Int, _ m: Int)
{
let x = Array(x1);
let y = Array(y1);
var dp: [[Int]] =
Array(repeating: Array(repeating: 0, count: m + 1), count: n + 1);
// Loop controlling variables
var i: Int = 0;
var j: Int = 0;
// Set the value of first column
while (i <= n)
{
dp[i][0] = i;
i += 1;
}
i = 1;
// Set the value of first row
while (i <= m)
{
dp[0][i] = i;
i += 1;
}
i = 1;
// iterate the loop through by length of x+1
while (i <= n)
{
j = 1;
// iterate the loop through by length of y+1
while (j <= m)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i][j] = self.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
if (x[i - 1] != y[j - 1])
{
dp[i][j] += 1;
}
j += 1;
}
i += 1;
}
// Display given text
print("\n Given x : ", x1, terminator: "");
print("\n Given y : ", y1, terminator: "");
// Display calculated result
print("\n Distance : ", dp[n][m], terminator: "");
}
}
func main()
{
let task: Distance = Distance();
// Given text
let x: String = "bitterness";
let y: String = "buttress";
// Get the length
let n: Int = x.count;
let m: Int = y.count;
// Test
task.levenshteinDistance(x, y, n, m);
}
main();
Output
Given x : bitterness
Given y : buttress
Distance : 3
/*
Kotlin Program
Levenshtein edit distance
*/
class Distance
{
// Returns the minimum of a given three values
fun minValue(a: Int, b: Int, c: Int): Int
{
if (a > b)
{
if (b > c)
{
return c;
}
else
{
return b;
}
}
else
{
if (a > c)
{
return c;
}
else
{
return a;
}
}
}
fun levenshteinDistance(x: String, y: String, n: Int, m: Int): Unit
{
var dp: Array < Array < Int >> = Array(n + 1)
{
Array(m + 1)
{
0
}
};
// Loop controlling variables
var i: Int = 0;
var j: Int ;
// Set the value of first column
while (i <= n)
{
dp[i][0] = i;
i += 1;
}
i = 1;
// Set the value of first row
while (i <= m)
{
dp[0][i] = i;
i += 1;
}
i = 1;
// iterate the loop through by length of x+1
while (i <= n)
{
j = 1;
// iterate the loop through by length of y+1
while (j <= m)
{
// minValue(top, top left, and left element)
// b a
// ↖ ↑
// c ← x
dp[i][j] = this.minValue(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1]);
if (x.get(i - 1) != y.get(j - 1))
{
dp[i][j] += 1;
}
j += 1;
}
i += 1;
}
// Display given text
print("\n Given x : " + x);
print("\n Given y : " + y);
// Display calculated result
print("\n Distance : " + dp[n][m]);
}
}
fun main(args: Array < String > ): Unit
{
var task: Distance = Distance();
// Given text
var x: String = "bitterness";
var y: String = "buttress";
// Get the length
var n: Int = x.length;
var m: Int = y.length;
// Test
task.levenshteinDistance(x, y, n, m);
}
Output
Given x : bitterness
Given y : buttress
Distance : 3

Time Complexity
Let's analyze the time complexity of the algorithm. The two nested loops iterate over the lengths of strings X and Y (n and m, respectively). In each iteration, we perform constant-time operations. Therefore, the time complexity of the algorithm is O(n * m).
Finally
In this article, we explored the Levenshtein edit distance problem, where we aimed to find the minimum number of single-character insertions, deletions, or substitutions needed to transform one string into another. We provided a detailed explanation of the problem, along with a suitable example and standard pseudocode. The Java code provided in the example was analyzed, and the output for a specific input was explained. Additionally, we discussed the time complexity of the algorithm, which is O(n * m), where n and m are the lengths of the input strings.
Please share your knowledge to improve code and content standard. Also submit your doubts, and test case. We improve by your feedback. We will try to resolve your query as soon as possible.
New Comment