
Data Download


Description

We classify our string data by the lengths of the two strings and by their similarity. A CSV file named "m_s_a_s.csv" contains 500 sets of string data, where the length of string A is "m", the length of string B is "m*s" (the first "s" in the file name), the alphabet size is "a", and the similarity of A and B is "s"(%) (the last "s" in the file name). For example, the file "100_5_4_85.csv" contains 500 sets of string data with length of string A = "100", length of string B = "500", alphabet size = "4", and similarity of A and B = "85"(%).
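
The naming scheme above can be decoded mechanically. The following is a minimal Python sketch, not part of the dataset itself; the function name `parse_filename` and the dictionary keys are hypothetical names chosen for illustration.

```python
from pathlib import Path


def parse_filename(name):
    """Split a file name such as "100_5_4_85.csv" into its four parameters."""
    # Fields, in order: length of A, multiplier for B, alphabet size, similarity (%).
    m, multiplier, alphabet, similarity = (
        int(part) for part in Path(name).stem.split("_")
    )
    return {
        "len_a": m,
        "len_b": m * multiplier,  # length of string B is m times the second field
        "alphabet_size": alphabet,
        "similarity_pct": similarity,
    }


print(parse_filename("100_5_4_85.csv"))
# {'len_a': 100, 'len_b': 500, 'alphabet_size': 4, 'similarity_pct': 85}
```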

Data format

An example of a row in "10_1_20_40.csv" is shown as follows.

Here we use numbers to represent the symbols. In this example, string A is "2 14 4 5 7 8 11 4 18 5", string B is "1 5 20 7 4 10 7 8 9 11", the symbols used are in the range [1, 20], and the similarity of A and B is "40"(%).


Index,stringA,stringB
50,-1|2|14|4|5|7|8|11|4|18|5,-2|1|5|20|7|4|10|7|8|9|11
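
A row in this format could be read with Python's standard `csv` module, as in the rough sketch below. It assumes that every file starts with the header line shown above, and that the first value of each field ("-1" for string A, "-2" for string B) is a marker rather than a symbol, since it does not appear in the strings quoted in the description. The helper names `parse_field` and `read_rows` are hypothetical.

```python
import csv
import io

# The sample row from "10_1_20_40.csv", including the header line.
SAMPLE = """Index,stringA,stringB
50,-1|2|14|4|5|7|8|11|4|18|5,-2|1|5|20|7|4|10|7|8|9|11
"""


def parse_field(field):
    """Turn "-1|2|14|..." into a list of integer symbols."""
    values = [int(v) for v in field.split("|")]
    # Assumption: the leading negative value is a marker, so it is dropped.
    return values[1:]


def read_rows(handle):
    """Yield (index, string A, string B) for each data row."""
    for row in csv.DictReader(handle):
        yield int(row["Index"]), parse_field(row["stringA"]), parse_field(row["stringB"])


index, a, b = next(read_rows(io.StringIO(SAMPLE)))
print(index)  # 50
print(a)      # [2, 14, 4, 5, 7, 8, 11, 4, 18, 5]
print(b)      # [1, 5, 20, 7, 4, 10, 7, 8, 9, 11]
```

To read a downloaded file instead of the inline sample, pass `open("10_1_20_40.csv", newline="")` to `read_rows`.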

The files


Length of string A (m)    Length of string B (n)    Alphabet size