I'm using this LAMBDA function to calculate the Levenshtein distance between two strings. It is defined in Excel under the name LEV.

=LAMBDA(a,b,[ii],[jj],[arr],
        LET(
            i,IF(ISOMITTED(ii),1,ii),
            j,IF(ISOMITTED(jj),1,jj),
            a_i,MID(a,i,1),
            b_j,MID(b,j,1),
            init_array,MAKEARRAY(
                    LEN(a)+1,
                    LEN(b)+1,
                    LAMBDA(r,c,IFS(r=1,c-1,c=1,r-1,TRUE,0))
                    ),
            cost,N(NOT(a_i=b_j)),
            this_arr,IF(ISOMITTED(arr),init_array,arr),
            option_a,INDEX(this_arr,i+1-1,j+1)+1,
            option_b,INDEX(this_arr,i+1,j+1-1)+1,
            option_c,INDEX(this_arr,i+1-1,j+1-1)+cost,
            new_val,MIN(option_a,option_b,option_c),
            overlay,MAKEARRAY(
                    LEN(a)+1,
                    LEN(b)+1,
                    LAMBDA(r,c,IF(AND(r=i+1,c=j+1),new_val,0))
                    ),
            new_arr,this_arr+overlay,
            new_i,IF(i=LEN(a),IF(j=LEN(b),i+1,1),i+1),
            new_j,IF(i<>LEN(a),j,IF(j=LEN(b),j+1,j+1)),
            is_end,AND(new_i>LEN(a),new_j>LEN(b)),
            IF(is_end,new_val,LEV(a,b,new_i,new_j,new_arr))
            )
)

It works fine when comparing two strings. For example, if I pass in

=lev("book","back")

it returns 2.

However, if c2 contains

book look

And I put in

=lev(textsplit(c2," "),"back")

It returns #N/A. I want it to return

2
3

To be honest, I don't know where to start in troubleshooting this function. Can anyone help?

@Ike's answer very helpfully answered the question, but unfortunately my question was a bit ill-stated.

I need to get the lev distance between every word in c2 and d2.

Suppose c2 is

red ball

And d2 is

yellow window

The desired output would be the Levenshtein distance between each word in c2 and each word in d2.

There are 4 possible comparisons: red/yellow, ball/yellow, red/window, and ball/window.

Desired output:

5
4
5
6

Note that I only want comparison between arrays, not within them. I don't want the distance between red and ball, because they're part of the same array.

BTW, =BYROW(TEXTSPLIT(C2,," "),LAMBDA(r,LEV(r,"back"))) returns {#REF!;#REF!}? Commented Feb 1, 2024 at 20:11

3 Answers

You can use this formula:

=LET(a,C2,b,D2,
aSplit,TEXTSPLIT(a,," "),
DROP(REDUCE("",aSplit,LAMBDA(r,x,VSTACK(r,lev(x,b)))),1))


As @vbasic2008 said, BYROW doesn't work.

Passing an array to the Lev-Lambda would need another step of recursion within the Lambda function.

Using REDUCE like this keeps the Lev-Lambda untouched.
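
If the same pattern is needed in several cells, it could be wrapped in its own named LAMBDA. The following is only a sketch: LEVEACH is a hypothetical name (defined in the Name Manager alongside LEV), and words, target, acc and w are illustrative parameter names, not part of the original function.

```
=LAMBDA(words,target,
    DROP(
        REDUCE("",words,
            LAMBDA(acc,w,VSTACK(acc,LEV(w,target)))
        ),
    1)
)
```

With that name in place, =LEVEACH(TEXTSPLIT(C2,," "),"back") should spill 2 and 3 for the "book look" example, because REDUCE hands LEV one word at a time.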

**** EDIT due to comment ****

=LET(a,C2,b,E2,
aSplit,TEXTSPLIT(a," "),
bSplit,TEXTSPLIT(b," "),
r, DROP(REDUCE("",bSplit,LAMBDA(r_2,x_2,VSTACK(r_2,
DROP(REDUCE("",aSplit,LAMBDA(r,x,HSTACK(r,lev(x,x_2)))),,1)))),1),
HSTACK(VSTACK("",TRANSPOSE(bSplit)),VSTACK(aSplit,r))
)
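
If only the distances are wanted as a single column, as in the desired output in the question, the matrix could be flattened with TOCOL instead of adding the header row and column. This is a sketch, assuming the second phrase is in D2 as in the question (the formula above reads E2); m, acc_a, acc_b, w_a and w_b are illustrative names.

```
=LET(aSplit,TEXTSPLIT(C2," "),
bSplit,TEXTSPLIT(D2," "),
m,DROP(REDUCE("",bSplit,LAMBDA(acc_b,w_b,VSTACK(acc_b,
DROP(REDUCE("",aSplit,LAMBDA(acc_a,w_a,HSTACK(acc_a,lev(w_a,w_b)))),,1)))),1),
TOCOL(m))
```

TOCOL reads the matrix row by row by default, so for "red ball" and "yellow window" the order is red/yellow, ball/yellow, red/window, ball/window, which should give 5, 4, 5, 6.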



2 Comments

This works, thank you. I should have thought of it before but I need it to accept arrays on both 'sides' of the formula. That is, D2 needs to be textsplit as well as c2. I need to get the lev distance between every word in c2 and d2. Is there a way I can adapt your formula to do that?
I updated the formula - you need another REDUCE - and I enhanced the formula so that it shows the values of both sides

If you're committed to using the LAMBDA formula, this won't help. However, if you just need to calculate the Levenshtein distance, you can use this VBA function. It will be available in Excel as a worksheet formula, and you pass it the two strings whose distance you want.

I didn't write this myself; I found it online here, more specifically in this answer.

  1. Press Alt + F11, or add the "Developer" tab to your Excel ribbon and open the "Visual Basic" window.
  2. In the left section, right-click and insert a new module.
  3. Copy and paste the code below into Module1.
  4. Call it from a cell: =LEVENSHTEIN( [Input1], [Input2] )

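Option Explicit

' Returns the Levenshtein (edit) distance between s1 and s2.
Public Function Levenshtein(s1 As String, s2 As String) As Integer

Dim i As Integer
Dim j As Integer
Dim l1 As Integer
Dim l2 As Integer
Dim d() As Integer
Dim min1 As Integer
Dim min2 As Integer

l1 = Len(s1)
l2 = Len(s2)
' d(i, j) holds the distance between the first i characters of s1
' and the first j characters of s2.
ReDim d(l1, l2)
' Turning a prefix into the empty string takes one deletion per character.
For i = 0 To l1
    d(i, 0) = i
Next
For j = 0 To l2
    d(0, j) = j
Next
For i = 1 To l1
    For j = 1 To l2
        If Mid(s1, i, 1) = Mid(s2, j, 1) Then
            ' Characters match: no extra cost.
            d(i, j) = d(i - 1, j - 1)
        Else
            ' Take the cheapest of deletion, insertion and substitution.
            min1 = d(i - 1, j) + 1
            min2 = d(i, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            min2 = d(i - 1, j - 1) + 1
            If min2 < min1 Then
                min1 = min2
            End If
            d(i, j) = min1
        End If
    Next
Next
Levenshtein = d(l1, l2)
End Function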

1 Comment

Thank you. I need to share the spreadsheet with non technical users, who will balls it up if they have to enable macros. In general I try to avoid vb as it's on the way out.

Your LAMBDA seems to work with single strings, not arrays. When you pass an array to the function, it doesn't know how to handle it, which may be why it returns #N/A. Try using SEQUENCE to generate an array of indices and applying your function to each element of the array, then use BYROW to apply your LAMBDA to each row of the array:

=LAMBDA(a,b,[ii],[jj],[arr],
    LET(
        i,IF(ISOMITTED(ii),1,ii),
        j,IF(ISOMITTED(jj),1,jj),
        a_i,MID(a,i,1),
        b_j,MID(b,j,1),
        init_array,MAKEARRAY(
                LEN(a)+1,
                LEN(b)+1,
                LAMBDA(r,c,IFS(r=1,c-1,c=1,r-1,TRUE,0))
                ),
        cost,N(NOT(a_i=b_j)),
        this_arr,IF(ISOMITTED(arr),init_array,arr),
        option_a,INDEX(this_arr,i+1-1,j+1)+1,
        option_b,INDEX(this_arr,i+1,j+1-1)+1,
        option_c,INDEX(this_arr,i+1-1,j+1-1)+cost,
        new_val,MIN(option_a,option_b,option_c),
        overlay,MAKEARRAY(
                LEN(a)+1,
                LEN(b)+1,
                LAMBDA(r,c,IF(AND(r=i+1,c=j+1),new_val,0))
                ),
        new_arr,this_arr+overlay,
        new_i,IF(i=LEN(a),IF(j=LEN(b),i+1,1),i+1),
        new_j,IF(i<>LEN(a),j,IF(j=LEN(b),j+1,j+1)),
        is_end,AND(new_i>LEN(a),new_j>LEN(b)),
        IF(is_end,new_val,LEV(a,b,new_i,new_j,new_arr))
    )
)

=BYROW(TEXTSPLIT(C2," "),LAMBDA(x,LEV(x,"back")))

3 Comments

Didn't I write in the comments that this doesn't work? Have you tried it?
How about =ARRAYFORMULA(LAMBDA(x,LEV(x,"back"))(TEXTSPLIT(C2," ")))??
There is no ARRAYFORMULA in Excel, it's a Google Sheets thing.
