  • Custom User Avatar

    Can't solve the last test, 'quick brown fox etc'.
    I wrote my own k-means implementation and it solves the rest of the cases without hardcoding anything, but this last test just seems wrong.

    I took the time to translate it from binary to Morse by hand. The expected clusters (taking the length of each run of consecutive 0s or 1s) are:
    c1 = [1,2,3,4]
    c2 = [5,6,7,8,9,10]
    c3 = [12,13,14,15,17,18]

    But if we calculate those clusters by hand, we get:
    c1 = [1,2,3,4,5]
    c2 = [6,7,8,9,10,12]
    c3 = [13,14,15,17,18]
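The hand calculation above can be sketched as a stability check: recompute each cluster's mean, reassign every length to its nearest mean, and see whether the partition survives. This is only a rough sketch. It assumes each distinct length counts once, because the post does not give how often each length occurs in the real test input, and repeated lengths would shift the means. The helper names `assign`, `mean` and `is_stable` are made up for illustration.

```python
def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid.

    Ties go to the lower-index centroid (an arbitrary choice for this sketch).
    """
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def mean(values):
    return sum(values) / len(values)


def is_stable(points, partition):
    """True if one k-means assignment step leaves the partition unchanged."""
    centroids = [mean(cluster) for cluster in partition]
    return assign(points, centroids) == partition


# Assumption: every distinct run length is counted once (frequencies unknown).
lengths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18]

expected = [[1, 2, 3, 4], [5, 6, 7, 8, 9, 10], [12, 13, 14, 15, 17, 18]]
by_hand = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 12], [13, 14, 15, 17, 18]]

print(is_stable(lengths, by_hand))   # True
print(is_stable(lengths, expected))  # False
```

Under that assumption, the expected partition has means 2.5 and 7.5 for its first two clusters, so the length 5 sits exactly halfway between them and the tie rule alone decides where it goes.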

    To check myself, I tried sklearn's KMeans (`sklearn.cluster.KMeans`), and it gives me the same centroids and the same clusters, so I am almost completely sure the mistake is not in my k-means implementation. The fact that it solves the rest of the cases reinforces that assumption.

    I have no idea what to do. Picking different starting centroids barely changes the outcome because of the nature of the algorithm.

    upd: So I solved it. A very frustrating and unsatisfying task. K-means won't work as-is because k-means tries to find, well, means. Pick random centroids and then step through with a debugger to see what standard k-means does wrong on the last test: 5 gets assigned to the first cluster, the one with the shortest lengths.
    From here on, the question is not how to classify lengths but how to pick the initial centroids correctly.
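To illustrate how much the starting centroids can matter, here is a minimal sketch of plain Lloyd-style k-means in one dimension, run from two different starting points. It is not the solution to the kata. It assumes each distinct length counts once (the real input repeats lengths, and the post does not give the counts), and both sets of starting centroids are arbitrary values picked for the demonstration. The function names are hypothetical.

```python
def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid (ties go to the lower index)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def lloyd_1d(points, centroids, max_iter=100):
    """Alternate between recomputing means and reassigning until nothing changes."""
    clusters = assign(points, centroids)
    for _ in range(max_iter):
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else old
                     for c, old in zip(clusters, centroids)]
        new_clusters = assign(points, centroids)
        if new_clusters == clusters:
            break
        clusters = new_clusters
    return clusters


# Assumption: every distinct run length is counted once (frequencies unknown).
lengths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18]

# Start A: minimum, midpoint and maximum of the data (illustrative choice).
from_spread = lloyd_1d(lengths, [1, 9.5, 18])
# Start B: a rough guess at one/three/seven time units (illustrative choice).
from_guess = lloyd_1d(lengths, [2, 7, 15])

print(from_spread)  # [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 12], [13, 14, 15, 17, 18]]
print(from_guess)   # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [12, 13, 14, 15, 17, 18]]
```

Both runs converge, but to different partitions, and in both the length 5 lands in the shortest cluster.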

  • Custom User Avatar

    This comment is hidden because it contains spoiler information about the solution

  • Custom User Avatar

    I am not sure there are easier ways to describe the algorithm behind a solution to a 2 kyu kata.

  • Custom User Avatar

    Giving Wikipedia as a source is an awful idea in this case. The strict scientific language the article is written in may be useful for someone who already knows the topic, but it is completely useless for everyone else.

  • Custom User Avatar

    Spent too much time on the second task.
    This is amazing, nothing destroys my self-esteem like these problems.