Data Download

We classify our string data by the lengths of the two strings, the alphabet size, and the similarity of the two strings. A CSV file "m_s_a_s.csv" has 500 sets of string data, where the four fields of the name are, in order: length of string A = "m", length of string B = "m*s" (the first "s" is the length ratio), alphabet size = "a", and similarity of A and B = "s"(%) (the last "s"). For example, the CSV file "100_5_4_85.csv" has 500 sets of string data with length of string A = "100", length of string B = "500", alphabet size = "4", and similarity of A and B = "85"(%).
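The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of the dataset itself: the names `DatasetParams` and `parse_dataset_name` are hypothetical, and it assumes all four fields of the file name are integers.

```python
from pathlib import Path
from typing import NamedTuple


class DatasetParams(NamedTuple):
    """Parameters encoded in a file name (hypothetical helper type)."""
    len_a: int           # length of string A ("m")
    len_ratio: int       # first "s": length of B divided by length of A
    alphabet_size: int   # "a"
    similarity_pct: int  # last "s": similarity of A and B in percent

    @property
    def len_b(self) -> int:
        # Length of string B is m * s, as described in the text.
        return self.len_a * self.len_ratio


def parse_dataset_name(filename: str) -> DatasetParams:
    """Split a name such as '100_5_4_85.csv' into its four fields.

    Assumption: every field is an integer; a non-integer field
    raises ValueError.
    """
    parts = Path(filename).stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields in {filename!r}, got {len(parts)}")
    return DatasetParams(*(int(p) for p in parts))
```

For instance, `parse_dataset_name("100_5_4_85.csv").len_b` evaluates to 500, matching the example in the text.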

Data format

An example of a row in "10_1_20_40.csv" is shown as follows.

Here we use numbers to represent the symbols. In the example row below, string A is "2 14 4 5 7 8 11 4 18 5", string B is "1 5 20 7 4 10 7 8 9 11", the symbols used are in the range [1, 20], and the similarity of A and B is "40"(%). The leading "-1" and "-2" in the row are not part of the strings.

Index,stringA,stringB
50,-1|2|14|4|5|7|8|11|4|18|5,-2|1|5|20|7|4|10|7|8|9|11
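A row in this format could be read roughly as follows. This is a sketch under stated assumptions rather than an official loader: the function names `parse_symbols` and `read_pairs` are hypothetical, the symbols are assumed to be separated by "|", and a leading negative value (such as "-1" or "-2") is treated as a marker and dropped, since it does not appear in the strings described above. Any stray backslash before a separator is ignored.

```python
import csv
from typing import Iterable, Iterator, List, Tuple


def parse_symbols(field: str, sep: str = "|") -> List[int]:
    """Turn a field like '-1|2|14|4' into the symbol list [2, 14, 4]."""
    # Drop stray backslashes, then split on the separator.
    values = [int(tok) for tok in field.replace("\\", "").split(sep)]
    # Assumption: a leading negative number is a marker, not a symbol.
    if values and values[0] < 0:
        values = values[1:]
    return values


def read_pairs(lines: Iterable[str]) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (index, string A, string B) for each data row.

    `lines` may be an open file or any iterable of text lines whose
    first line is the header 'Index,stringA,stringB'.
    """
    for row in csv.DictReader(lines):
        yield (
            int(row["Index"]),
            parse_symbols(row["stringA"]),
            parse_symbols(row["stringB"]),
        )
```

To read a file, pass an open handle, for example `with open("10_1_20_40.csv", newline="") as f: pairs = list(read_pairs(f))`.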

The files

	Length of string A(m)	Length of string B(n)	Alphabet size