Skip to content

Possible edge case where suffixArray does not give a suffix array. #1

@vchoo

Description

@vchoo

This was the fastest Javascript implementation of a suffix array construction algorithm I could find. All seemed well - and then, to my surprise, I found this bug (I think it's a bug) as I was doing some testing.

From what I understand of suffix arrays, the array should give me a lexicographical ordering of the suffixes in a string. That is:

function checkSuffixArrayOutput(str) {
    var arr = suffixArray(str);
    for (var i = 1; i < arr.length; i++)
        if (str.substring(arr[i-1]) >= str.substring(arr[i]))
            return false;
    return true;
}

Should always return true for any given string.

However, for the test case string a m.***b***9.4.3**ions require it.***, checkSuffixArrayOutput() returns false.


More specifically, the output I'm getting when I use this suffixArray is that if I run the following code in NodeJS v8.10.0:

> var test = 'a m.***b***9.4.3**ions require it.***';
undefined
> // gets the suffixes of test, which should be in lexicographical order
> var suffixes = suffixArray(test).map(x => test.substring(x));
undefined
> suffixes[5] < suffixes[6]
false
> suffixes[5]
'***b***9.4.3**ions require it.***'
> suffixes[6]
'***'

To anyone fixing this bug, if it helps, for this particular test case, I think the bug occurs during the first call of _suffixArray() in the source code (I believe the single recursive call during the first call gives me the correct lexicographical ordering).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions