  • Custom User Avatar

    Great example, let's have a look:

    import itertools as i
    
    i.count("hello") # TypeError: a number is required
    i.accumulate(8, 8) # TypeError: 'int' is not iterable (error references the call site, not deep in some library somewhere)
    i.islice(None, []) # ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize
    

    But maybe that is unfair, since it's a built-in library. Let's try another good one, numpy:

    import numpy as np
    
    np.ndarray("foo") # TypeError: 'str' object cannot be interpreted as an integer. (Again referencing the call site, not deep in some library)
    np.ndarray(1).fill(lambda x: x) # TypeError: float() argument must be a string or a number, not 'function' (Once again, referencing call site)
    

    These all obviously threw errors, but most of them were explicit about what went wrong, and all of them threw the error from the call site, not from deep in some library. None of these errors would require delving into any 3rd party source code to debug.

  • Custom User Avatar

    and therefore much prefer a library which handled basic problems (which most good libraries do anyway)

    I'd like to know what those libraries are, because I'm quite sure that if you took any popular library, even a built-in one like Python's itertools (which you'd have to agree is quite a good library?), and instead of calling a method with an iterable as expected, you provided it with an integer, or a list of generators that produce TCP socket handles (or any of the millions of terrible examples one could come up with), it would just error out at whatever point in the library code the provided value was used. I don't expect it to first test whether my value is of the correct container type, with the correct elements, with the correct length, and that the moon is also currently aligned with Sirius. It simply crashes. And it's not because the library is badly designed; it's because you tried to use it in a way it was not designed to work.

  • Custom User Avatar

    I disagree that you are equipped to deal with it. In real life, yes, once you learn how to access the source code; however, that is not even a possibility here. The first time, or the tenth time, the best option is still to merely guess at what your code might have done wrong.

    I may not call such a library badly designed; however, I would have a much better experience, and therefore much prefer a library which handled basic problems (which most good libraries do anyway). Do you not? I also don't think there are even that many things which can go wrong. If the library relies on the data having a length, then check that it has a length. If it relies on it being a specific type, then check that it is that type.

  • Custom User Avatar

    Yes! And how terribly annoying of an experience that is.

    Indeed, it's not fun, but once you learn about it, you are then suddenly equipped to deal with it. To take this further: would you claim a library is "badly designed" if it doesn't somehow provide you with a beautiful error message politely and accurately pointing out exactly which of the millions of invalid things you specifically did to cause the error? Just how many cases are tests expected to anticipate and preempt? So you check for the correct type/subclass, the correct length, the proper types of values in a container, etc. etc., all before you actually dare try to assess the correctness of a solution. At what point do the tests stop being about checking the correctness of a solution and start being about ensuring the user isn't just throwing random stuff at them and then getting confused when things don't work as expected (whatever "expected" means in that context)?

  • Custom User Avatar

    In both cases the stack trace will point to the exact line where the test code is that is using the user solution.

    I don't think it's generally true, and if it is, it's incidental rather than deliberate. This argument holds only in languages which behave this way, with testing frameworks which behave this way, and when the test code happens to be (accidentally) written in a way that does not erase this information.

    the issue would be identical to using a 3rd party library, e.g. calling an API with invalid arguments. The "crash" will originate from somewhere in the depths of the library, but you should realise that with a high probability, you caused it.

    My general experience is the opposite (but YMMV), and I would propose an experiment: let's take a random library (not a C one tho) and feed it with invalid arguments, and see how many of them respond with "NullPointerException, bye bye loser!", and how many with "Hey, name cannot be null". I am honestly curious what the results would be.

  • Custom User Avatar

    The aim here is not merely to prevent people from raising bogus issues; it's also to improve their solving experience overall. For every user who decides to raise such an issue, there are probably 5-10 users who also encountered it, struggled with it, and eventually either worked it out or simply gave up.

    In both cases the stack trace will point to the exact line where the test code is that is using the user solution

    In this kata, sure, because the user solution happens to be called on the same line in which the error occurs; however, I thought we were speaking more generally. If the error happens on a separate line, then it comes down to guessing at what variable names might mean, why the variable is being sorted in the first place, etc.

    I understand that this is a 4 kyu; however, this isn't about handholding, it's about providing a basic level of solving experience. Debugging is of course a normal part of solving, however saying "Now you have to debug this line that you didn't even write, completely on its own, without any context" is not a normal part.

    The "crash" will originate from somewhere in the depths of the library, but you should realise that with a high probability, you caused it

    Yes! And how terribly annoying of an experience that is. Even knowing that I most definitely caused it, and with the ability to actually access and read the source code, it is still a huge pain in the ass to try to debug these things. Why would it not be preferable for tests to not subject solvers to that?

  • Custom User Avatar

    (on the other hand, the kata is :cough: 4 kyu. So one could also expect from a user attempting that level some minimal degree of autonomy, couldn't they?)

  • Custom User Avatar

    The thing is, at least for submission tests, that they are not visible to the solver.

    however simply seeing NoneType is not iterable originating from the tests when you are already fairly sure that your code never returns None, is the opposite of helpful.

    In both cases the stack trace will point to the exact line where the test code is that is using the user solution. It is obvious, unless you assume that your solution is perfect and bug-free because you are some kind of divine programmer, and really there's just a bug in the tests. That is the mentality we need to snuff out, though!

    And while yes, I agree that hidden tests as they are presented on Codewars are much different from real-life test suites, the issue would be identical to using a 3rd party library, e.g. calling an API with invalid arguments. The "crash" will originate from somewhere in the depths of the library, but you should realise that with a high probability, you caused it. Imagine the run on the issue page of every GitHub repo if this "is it me? No, it's everyone else who is wrong" mentality prevailed in this way outside of platforms like this.

    ...when you are already fairly sure that your code never returns None

    Indeed, how about we force people to be more than merely "fairly" sure before writing complaint letters about defective products? :)

  • Custom User Avatar

    I tend to agree with Hob; while I don't think it's a big issue for this kata, ideally the tests themselves should never crash.

    you'll have to put on your detective hat and work over your code to see where and what you got wrong.

    The thing is, at least for submission tests, that they are not visible to the solver. So all that is left to do for the solver is a) try to use the stack trace/other tools to get a look at the test code (which in itself would be a problem), or b) just take guesses at what might be the problem. It isn't merely an extra step on top of debugging your own solution; it transforms from debugging a solution to debugging a black box.

    In this particular case, the bug is obvious from the code. However, I could easily imagine someone writing a much longer solution, and somewhere accidentally returning None (for example, by mistakenly writing return some_list.reverse(), which reverses in place and returns None). For this kata, the stack trace combined with the code in the sample tests may be enough to figure it out, however simply seeing NoneType is not iterable originating from the tests when you are already fairly sure that your code never returns None, is the opposite of helpful.
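    For illustration, a minimal sketch of how that kind of bug would surface (the function name and values below are made up for the example, not taken from this kata):

    def solve(items):
        result = sorted(items)
        # Bug: list.reverse() reverses in place and returns None,
        # so this implicitly returns None instead of the reversed list.
        return result.reverse()
    
    actual = solve([2, 3, 1])
    sorted(actual) # TypeError: 'NoneType' object is not iterable (raised from the test's line, not the solution's)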

  • Custom User Avatar

    "Index out of bounds" and "Parse errors" are value errors, not type errors, so really a different matter with different considerations.
    I disagree. All of the "Index out of bounds", "Parse error", "X is not a property of null", "sort is not a function" etc have the same root cause: too optimistic assumptions of tests when handling the actual value.

    Considering that this is a platform for developers to hone their skills, the question arises as to just how much handholding we want to provide. After all, a test that fails due to it making valid assumptions about your incorrect return value/type is basically just another type of bug, and really no different from your solution returning incorrect (albeit non-crashing) values: you'll have to put on your detective hat and work over your code to see where and what you got wrong. The error originating from inside a test is then merely a single extra step in the deduction chain.

    Continuing the theme above, what about education? While completely avoiding such errors may be seen as a matter of convenience or UX, it also robs the user of a potential teachable moment, leaving them ignorant of very real pitfalls they might encounter in real life, whilst also possibly reinforcing their instinctive reaction that "the problem lies elsewhere, not with me".

    I understand this argument, but I do not agree with it fully for two reasons:

    • I am not sure I would equate "explicit feedback" with handholding. It is a favorite argument of one prominent author whose kata cause the dashboard to be spammed with "X created an ISSUE for kata Y" type of posts, and my take is: explicit feedback does not take away the requirement of diagnosing the mistake and fixing it. A user still has to debug their solution and fix it; this part cannot be avoided.
    • Continuing the theme above, I think that (with the exception of some specific domains) finding bugs in a black box of code that is not yours is not exactly educational. I would argue that usually you have access to the whole code (including tests) and can dig into it to see why it gets affected by your mistakes. Debugging a black box (of your code or not) is not exactly something a coder usually does, especially in the context of testing.
  • Custom User Avatar

    Perhaps a middle-ground approach would be to ensure that the sample tests catch such errors. E.g. if the test suite is going to perform a sort on the user's return value, the sample tests should do so too. A glance at the sample tests would then make it obvious enough why a test crashed when you return something that can't be sorted.

    This is what is done here. You could add a return [], if you want, to the initial code to make returning a list more obvious, or use type hinting.
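    For example, the initial code could look something like this (the exact function name and types are only illustrative, not necessarily the kata's):

    from typing import List
    
    def solution(data: List[int]) -> List[int]:
        # The return annotation and the placeholder value both signal
        # that the solver is expected to return a list.
        return []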

  • Custom User Avatar

    I don't mind continuing this, so here's a few more points:

    • "Index out of bounds" and "Parse errors" are value errors, not type errors, so really a different matter with different considerations.
    • Considering that this is a platform for developers to hone their skills, the question arises as to just how much handholding we want to provide. After all, a test that fails due to it making valid assumptions about your incorrect return value/type is basically just another type of bug, and really no different from your solution returning incorrect (albeit non-crashing) values: you'll have to put on your detective hat and work over your code to see where and what you got wrong. The error originating from inside a test is then merely a single extra step in the deduction chain.
    • Continuing the theme above, what about education? While completely avoiding such errors may be seen as a matter of convenience or UX, it also robs the user of a potential teachable moment, leaving them ignorant of very real pitfalls they might encounter in real life, whilst also possibly reinforcing their instinctive reaction that "the problem lies elsewhere, not with me".
    • Perhaps a middle-ground approach would be to ensure that the sample tests catch such errors. E.g. if the test suite is going to perform a sort on the user's return value, the sample tests should do so too. A glance at the sample tests would then make it obvious enough why a test crashed when you return something that can't be sorted.
    • "What bothers me more are some kata which are either not that clear ("return a regex" might mean a string expression, or an actual object)": This is an issue with bad descriptions, and another topic altogether. My argument here is that perhaps instead of trying to get authors to diligently check a return type before doing any other kind of analysis on every test ever authored, a first step would be to get authors to finally stop being ambiguous in their descriptions and provide a proper, exhaustive spec. Perhaps we could finally put an AI to good use: "Hey, it looks like you just spent 20 minutes writing a completely irrelevant, four paragraph intro story for your kata. Might I suggest you first provide a complete specification about the inputs and outputs of your challenge?". This would extend to every user participating in the beta process, and make it "obligatory culture" that they demand proper descriptions before allowing a kata to publish.
  • Custom User Avatar

    I think I sidetracked the discussion here: I argued the general approach too much and disconnected it from this specific kata, and this also caused some misunderstandings. I am sorry for that. But if you let me continue... :)

    The "if the user caused it, they have to deal with it" is not bad on itself, but it's tricky: someone, somehow, has to estimate what caused the problem. If you additionally consider the fact that a reporter was not able to get things right and caused a problem in the first place, their judgement on the cause of the problem might be not really good (especially if it's supported by feedback pointing outside of user solution). As a result, it induces a support event, someone else has to find out the cause, and this can be more expensive to handle than preemptively detecting the problem in the first place.
    Main premise behind my stance of "tests should not crash" is based on assumptions that:

    • a user caused the crash in the first place, so they cannot be relied upon to diagnose it,
    • crashes are easy to prevent (I would hope it boils down to adding a single line with a single assertion), and
    • a one-time effort of making tests better pays off when compared to potentially many (i.e. more than one) support requests. Adding a (figuratively) single line of assert_equals(actualtype, expectedtype, "Unexpected type {actualtype}") is literally less effort than answering two questions (see the sketch below).

    If any of the above does not hold for some specific case, then sure, it can be treated as out of scope of this argument.
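    As a rough sketch of the "single line with a single assertion" idea above (assuming the codewars_test framework used by Python kata, and a hypothetical solution function; this is not code from this kata):

    import codewars_test as test
    
    @test.describe("basic checks")
    def basic_checks():
        @test.it("returns a list")
        def returns_a_list():
            actual = solution([3, 1, 2]) # hypothetical user solution
            # One cheap, explicit check before anything relies on the value:
            test.expect(isinstance(actual, list), f"Expected a list, got {type(actual).__name__}")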

    Chrono's example of "you know it won't work from the start" is a good one, and he's right: it can be totally obvious (logically, or from specs, or whatever) that you need to return a list. Then fine, such questions can be replied to just with "Hey, you need to return a list, can't you read?" and I'd be fine with that. What bothers me more are some kata which are either not that clear ("return a regex" might mean a string expression, or an actual object), or attempt to perform some deeper analysis (for idx in 0..len(expected): verify(actual[idx]) fails with "index out of bounds", or actual.split(' ').map(parseInt) fails with "Invalid format"). My idea is that in such cases, preventing a crash is a one-time effort which saves our time later, and it makes the "the maintenance burden will increase" not true in the long run.
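    A minimal sketch of what that one-time effort could look like in Python (again assuming codewars_test; the helper name and the shape of the data are made up):

    import codewars_test as test
    
    def check(actual, expected):
        # Cheap structural checks with explicit messages first,
        # instead of letting the element-by-element comparison below crash.
        if not isinstance(actual, list):
            test.fail(f"Expected a list, got {type(actual).__name__}")
            return
        if len(actual) != len(expected):
            test.fail(f"Expected {len(expected)} elements, got {len(actual)}")
            return
        for idx in range(len(expected)):
            test.assert_equals(actual[idx], expected[idx], f"at index {idx}")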

  • Custom User Avatar

    You know it won't work from the start.

    You know, he knows, I know. But CoolNewb420 doesn't, which was kind of the argument I made above: at some point, a user might be expected to be able to figure this out on their own. The root of the issue is in their code, even if the tests/trace don't make this explicit. The cynic in me also wants to rant about this general "If something goes wrong, first blame everyone else before even considering to look at my own code, and even then I can't spot any problems, so I know my code is perfect, which is ironic because the issue arose precisely because I was too ignorant and inexperienced to see what I was doing wrong."

    That said, Hobs does make a few good points. Perhaps I am indeed too stuck on this "if the user caused it, they have to deal with it" mentality. After all, nowhere in the above list of arguments has there ever been mention of a detriment to providing more helpful, detailed tests, so really, it can only be beneficial (though the maintenance burden will increase).
