What about the example in the sample tests? And which points exactly are unclear to you?
A lovely series of kata! Maybe you could consider also adding a link to the second part in this first part. I almost missed it as I didn't read the title carefully and would have hated to miss out on the fun of solving the second part too.
Cheers
Yes, I see your point. The adjacency matrix would always require n^2 space, even for sparse graphs, so this might not be the best option.
For the indices, I then might as well change the dict to a list, since a dict of the form
{0: val0, 1: val1, 2: val2}
seems a bit odd. In that case, all of the previously submitted solutions will become invalid, though. But that shouldn't be an issue, since the kata is still in beta, right?
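(For illustration only, with the placeholder values from above: a dict whose keys are just the consecutive indices 0..n-1 carries no information that a plain list wouldn't.)

```python
# hypothetical values; any dict keyed by 0..n-1 can be replaced by a list
as_dict = {0: "val0", 1: "val1", 2: "val2"}
as_list = ["val0", "val1", "val2"]

assert all(as_list[i] == as_dict[i] for i in range(len(as_list)))
```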
Hi there! Great kata in general!
The setup of the Python test cases seems a bit... unique. The test.it("Result = " + str(result)) statement after the assert_equals appears as a separate test in the color red (regardless of whether the solution is correct). This had me scratching my head, since I thought my solution was still incorrect and I was failing test cases. It gets even more confusing when trying to print to the log, as the log message that pops up when clicking on the Result = ... "test-case" drop-down stems from the execution of the following test, so the numbers are all wrong. I'm not an expert on the Python test suite, but I believe there should only be a single test.it() statement before a test, and never one after the test it belongs to. Also, if I'm not mistaken, every test.it() should call at least one test.assert function.
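(To illustrate, a minimal sketch using the legacy Codewars Python test framework, where the test object is preloaded; solution, graph, start and expected stand in for the kata's actual names.)

```python
result = solution(graph, start)

# confusing pattern: the trailing test.it opens a new, empty test block,
# which shows up red in the output even when the assertion above passed
test.assert_equals(result, expected)
test.it("Result = " + str(result))

# clearer pattern: describe the case first, then put the assertion inside it
test.it("Result = " + str(result))
test.assert_equals(result, expected)
```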
I personally think it is easier to read for debugging purposes if the stones have a name that cannot be mistaken for a rope length. E.g. in the small random tests, the input net is printed when a test fails.
To me, an output like this:
is easier to read and debug by hand than an output like this:
But you're right, the current format is a bit wasteful. How about a format of s{i}?
Added
Yet again, a beautiful kata! Looking forward to reading a paper about the optimal solution for a piece with n moves. I also see the possibility for progressively harder kata, where the number of allowed guesses is reduced and the number of possible moves increases. Maybe even a version where each of the possible n moves is allowed a maximum of m > 1 times in the move pattern.
Thanks for sharing your insights, advice taken! I am just realizing that kata authoring is a lot more work than expected, but it's a great learning experience and very rewarding!
Now, you've got to appreciate the irony of having to go through all this trouble to cheat-proof a kata that is about people cheating on kata! Anyway, even if it might mean some extra trouble when authoring, I think we shouldn't refrain from having nice things just because a few people might cheat on them. In the end, couldn't people always just create a second account, look up the solution there and then copy it to their main account? So the whole point system relies on honor regardless of the effort we spend on making it cheat-proof, right?
Hi!
With the help of Blind4Basics, I've revised the test cases quite a bit and think that they now separate well between wanted and unwanted solutions. This is, however, mostly achieved with one fully connected graph with 1200 nodes and one somewhat sparser graph with 1500 nodes, so I can see that this might not be the best design. To see a meaningful difference from the unwanted solutions, graphs have to be roughly that size, though (see the plot in a reply to Blind4Basics' issue), and at that size they are already too big to allow many more tests without having wanted solutions time out as well.
Otherwise, I'm out of ideas, so I'm looking forward to hearing whether you think this is sufficient already or if more can be done.
Errrr, looks like I didn't have my smartest coding session last night :D
Done.
lst += [new_elem]
is a bad habit of mine I should get rid of, I guess.
You know, I actually briefly considered doing it the way you say now, but then thought "nah, why the extra effort?" :D Anyway, since I now only send a deep copy to the user's function and the original net instance to the reference, there's no need to switch the execution order anymore, so I just switched it back.
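(A quick illustration of the lst += [new_elem] idiom mentioned above, with hypothetical values.)

```python
lst, new_elem = [1, 2, 3], 4

lst += [new_elem]     # works, but builds a throwaway one-element list first
lst.append(new_elem)  # the idiomatic way to add a single element
```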
Done.
I've added 100 random tests on graphs with 5-10 nodes and removed the seeded fixed tests entirely, so no more perfect-number easter egg with random.seed(28) :(
It was just the first naive graph generating method I came up with. It was generating graphs with around num_nodes*(num_nodes-1)//16 edges, which were "heavier" on one side (higher average arity of the early added nodes). My generate_graph method can do a similar thing in around 150% of the time, so there's no big reason to keep the extra generating method around.
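(The actual generator isn't shown in this thread; the following is only a hypothetical sketch of a naive method with the properties described above: each new node connects to a sample of the earlier nodes, which yields roughly num_nodes*(num_nodes-1)//16 edges and a higher average arity for early nodes.)

```python
import random

def naive_generate_graph(num_nodes):
    """Hypothetical sketch: connect each new node to ~1/8 of the earlier nodes."""
    edges = set()
    for new_node in range(1, num_nodes):
        for neighbour in random.sample(range(new_node), max(1, new_node // 8)):
            edges.add((neighbour, new_node))
    return edges  # roughly num_nodes * (num_nodes - 1) // 16 edges in total
```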
Aha, I see what you mean now. After re-reading the testing guidelines I also found it there, so sorry for having bothered you with this. But thank you for being so patient with me! It is surprising what lengths people would go to in order to cheat on a kata...
I now first create all graphs and starting nodes in a list, shuffle it, and only then run the tests in a newly created run_single_test method, which also creates a deep copy.
I've switched the order and am now calling the reference solution first, and I've also simplified the run_n_tests method to always create a deep copy to be sent to the user solution (missing out on the micro-optimization).
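(A rough sketch of the pattern described in the last two paragraphs; reference_solution, user_solution and the (net, start) case format are assumptions, not the kata's actual code.)

```python
import random
from copy import deepcopy

def run_single_test(net, start):
    expected = reference_solution(net, start)     # reference runs first, on the original net
    actual = user_solution(deepcopy(net), start)  # the user only ever receives a deep copy
    test.assert_equals(actual, expected)

def run_random_tests(cases):
    random.shuffle(cases)                         # randomize the order of graph types
    for net, start in cases:
        run_single_test(net, start)
```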
For the smaller, debuggable tests I'm still not quite sure how to do things. I currently still have some fixed tests with the random seed, whose intention is to ease debugging. Though the graphs there already get too big to be human-friendly. So I was thinking it might make sense to either have those graphs be only about 5-10 nodes big and include all graph types (without a random order); or do that part in the first batch of random tests and get rid of the "Fixed tests" entirely, since currently they're pretty useless for debugging. Or do you think it would make sense to have that format (small graphs, all graph types) both as fixed and as random tests?
Besides that, I also dropped the generate_simple_graph method since the added code complexity wasn't worth the time benefit of ~50% over the generate_graph method for similar kinds of graphs.
I also played around with different solutions a little more and think that the tests, as they currently are, do separate well between wanted and unwanted solutions. This is, however, mostly achieved with one fully connected graph with 1200 nodes, so I can see that this might not be the best design. To see a meaningful difference from the unwanted solution, graphs have to be roughly that size, though (see the plot in an earlier reply), and at that size they are already too big to allow many more tests without having wanted solutions time out.
Hi,
implementation-wise, it should not be too big of an issue to set up the tests that way. If I generate the graphs, shuffle them and then still perform all n tests on one particular graph one after the other, it's a very small modification. If every single assertion (a test on this graph with that starting node) shall be randomized, it can still be done, but I'd change up the run_n_tests method. Which of the two options do you think is sufficient? And more importantly, I'd still love to know why everything should go into only two randomized batches, and not more batches with more specific feedback on the graph types. So what is the benefit of the random order of graph types?
About the new points:
random.seed(time()). Shouldn't that be enough to get rid of the seed?
net, though afterwards, I have the net variable point to copy_net instead (thus losing the reference to the original net). I do it in this (more complicated) way so that I don't have to copy the net in the final iteration (which does make the net useless after run_n_tests, but I don't need it anymore by then). I am just noticing, though, that I send the very same net instance first to the user's solution and only afterwards to my reference solution. Since I know that my reference solution doesn't modify the net, I should be able to get away with just changing the order and not having to create yet another net copy, right?
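(A hypothetical sketch of the copy-saving trick described above, with the order already switched so the reference runs first; names and signatures are assumptions.)

```python
from copy import deepcopy

def run_n_tests(net, start_nodes):
    for i, start in enumerate(start_nodes):
        is_last = i == len(start_nodes) - 1
        copy_net = net if is_last else deepcopy(net)  # skip the copy in the final iteration
        expected = reference_solution(net, start)     # reference first; it doesn't modify net
        test.assert_equals(user_solution(net, start), expected)
        net = copy_net                                # rebind: the untouched copy is used next round
```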
Hi, thanks a lot for the input!
I've played around a bit with the graph generators and different solutions and updated the test cases (hopefully for the better).
Measures I've taken so far:
The one random graph where I'm not controlling the number of edges exactly is the final big graph with 1000 nodes from the generate_simple_graph method. Though, I created 1000 of these graphs and found the following metrics for the number of edges:
, which I think should be consistent enough for an inefficient solution not to get a "lucky pass".
As I wanted to understand better how the solutions posted by dfhwze compare against my reference solution, I ran some time measurements for fully connected graphs on my computer and came up with the results attached below. From this, I gather that the graphs I'm currently testing on are still way too small to show a meaningful difference between the wanted and unwanted solutions. One possible solution I see is to include one single test on a fully connected graph with, say, 1000 nodes, and use the remaining time to get as much variety of other graphs as possible. With this approach, though, the total number of tests would be greatly reduced. What do you think about it?
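(For reference, a rough sketch of how such a measurement could be set up; the adjacency-dict representation and the solve(net, start) signature are assumptions.)

```python
from itertools import combinations
from time import perf_counter

def fully_connected(n):
    # complete graph on n nodes as an adjacency dict of neighbour sets
    adj = {i: set() for i in range(n)}
    for a, b in combinations(range(n), 2):
        adj[a].add(b)
        adj[b].add(a)
    return adj

def time_solution(solve, n, start=0):
    net = fully_connected(n)
    t0 = perf_counter()
    solve(net, start)
    return perf_counter() - t0  # single-run wall-clock time in seconds
```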
Also, when you say that the random tests should only be split into two batches, could you maybe briefly explain why that's a good idea? To me it seems like it could be helpful feedback for users to see on which types of graphs they fail or time out.
Very enjoyable puzzle to solve and a very nice description for it! Though, after looking at the other solutions, I feel a bit silly about the naive approach I came up with while trying out stuff with pen and paper.
About the helper functions in preloaded: if I understood your goal correctly, you'd like as little information as possible to leak out through the example tests. Right now the is_valid_solution_list function works on all inputs (even for n > 1000), while is_optimal_length only works for n in [2, 3, 4, 6] and always returns False for other inputs (I also found is_optimal_length([], 6) = None). To make it clearer that those functions only work for specific inputs, and to stop users from trying to get more information out of them, my suggestion would be to put assertions at the very beginning of them, like so:
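(The original snippet isn't included here; the following is only a sketch of what such guards could look like, with the supported values taken from the observations above.)

```python
def is_optimal_length(solution_list, n):
    # suggested guard: fail loudly for unsupported inputs instead of
    # silently returning False or None
    assert n in (2, 3, 4, 6), "is_optimal_length only supports n in (2, 3, 4, 6)"
    assert solution_list, "solution_list must not be empty"
    # ... existing implementation unchanged ...
```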