6 kyu
N smallest elements in original order (performance edition)
170 of 277FArekkusu
Loading description...
Algorithms
Arrays
Performance
View
This comment has been reported as {{ abuseKindText }}.
Show
This comment has been hidden. You can view it now .
This comment can not be viewed.
- |
- Reply
- Edit
- View Solution
- Expand 1 Reply Expand {{ comments?.length }} replies
- Collapse
- Spoiler
- Remove
- Remove comment & replies
- Report
{{ fetchSolutionsError }}
-
-
Your rendered github-flavored markdown will appear here.
-
Label this discussion...
-
No Label
Keep the comment unlabeled if none of the below applies.
-
Issue
Use the issue label when reporting problems with the kata.
Be sure to explain the problem clearly and include the steps to reproduce. -
Suggestion
Use the suggestion label if you have feedback on how this kata can be improved.
-
Question
Use the question label if you have questions and/or need help solving the kata.
Don't forget to mention the language you're using, and mark as having spoiler if you include your solution.
-
No Label
- Cancel
Commenting is not allowed on this discussion
You cannot view this solution
There is no solution to show
Please sign in or sign up to leave a comment.
This comment has been hidden.
Perhaps try the most basic stuff first and then build up.
Very nice kata on dynamic programming and algorithms optimization for the record, well done.
It seems like all the inputs have non-negative integers within the 0..2³² range. Yet the kata just speaks of integers. Would it not be good to include negative and bigger integers as well? I see some JavaScript solutions work on the assumption that there are no such integers in the input.
for 6 kyu...? na, 'don't think so...
This wouldn't change the intended approach in any way, only some concrete implementations would be affected. During the final iteration of redesigning the tests a couple years ago I stopped on the
[1; 50]
range, and IMO it'd be better to not change the specs again for such insignificant reason. I removed the note about the input range earlier, but now it's back as a part of the specifications.Super confused by the tests and the description. Are we taking the first
n
unique minimum elements? Or just the firstn
of the sorted elements? Or something else entirely? My current solution only deviates in the last few elements of these long lists, which seems like super strange behavior to me. An example:In the "correct" solution you can clearly see that an 8 is left out. In mine, the last four elements are
[8, 4, 6, 6]
and in the "correct" solution the last four are[4, 6, 6, 1]
so clearly the two are just offset by this one extra element in mine, BUT the "correct" solution includes other 8s, so clearly the "correct" solution considers 8 to be one of then
minimum elements... I don't understand why the "correct" solution throws this particular 8 out when others are let in...There is an upper limit of occurrence for the highest possible value. So we are taking the first n of the sorted elements. The highest possible value can be in the original array more times before other smaller numbers which should be accepted instead.
I added a Rust translation, please take a look!
@FArekkusu, The description says there are 15 tests with array of size 300,000. Are there more tests than that? I ran tests on my machine and it finishes in 4.2 sec for that size but it is showing timeout error here.
There are 15 tests, but the RNG was very inefficient, making the tests pass in ~11 seconds IIRC. I've replaced it with a much faster one, and adjusted the input sizes. Reset the kata and try again.
Thank you for your reply
should be 5 kyu instead of 6 performance edition katas shoud be one or two kyu difference than original kata
I must've gotten lucky; I used my same solution from the non-performant version and passed in
11976 ms
. How is the average performance of any given solution for this kata?( JS )
There is one example and fixed test that does not conform to the Numbers range specified for the Performance tests. Is this intentional?
Must have overlooked them during the most recent rewrite. Now there're no such tests.
This comment has been hidden.
It's not me who chose
6 kyu
, altough I wouldn't call this a5 kyu
as well.This depends on the server load, received inputs, and Python version you're using. When the estimated completion time is
12 sec
, even good solutions are not guaranteed to work all the time.I've changed the tests a little to be lighter (at least the completion time dropped for me from 12 to 11 sec).
Great, hopefully that helps alleviate the issues in your second comment. Thanks for responding quickly and helpfully. When something like just time of day can change whether or not a valid solution passes it really isn't great after all, at least a 50% success rate with a decent solution I'd hope, and no successes at all was what had made me particularly concerned, twenty seven consecutive failures is really quite a lot. Trying again it worked first shot though with a comfortable margin at about 11.3 seconds, so the little leeway to allow for things like that is nice.
Definitely agree it still could warrant a kyu raising though, although I have no idea how that would even happen, I've mostly just been solving katas around here, not helping make them.
This comment has been hidden.
If you're getting a time-out consistently, then your solution is not efficient enough. (Unfortunately) having a good time complexity is not enough, and solving it may require micro-optimization.
Heh, our solutions were both counting sort (mine was only so rigorously microoptimised it may not have been transparent), and it pulls 10049 ms in comparison to your 11000, so it wasn't a problem with my code, but rather your tests being too difficult.
This kata was not as interesting as it seems. Micro-optimization is vastly a waste of time and enforcing it on purpose should be avoid.
At the moment, Python solutions are nearly limited to the same bytecode - solving this kata sums up to get the right serie of instructions or to resign. Moreover, many sorting-like solutions and even linear solutions are rejected due to the very specific test suite.
I can't say I have learned something from this kata so I can't speak longer, but I was quite disconcerted by such a kata...
LOL, tempted to give up just by reading this
This comment has been hidden.
Improving the algo may not only mean improving its time complexity but also decreasing the overall amount of operations. I can see your solution is almost the same as in Giacomo's kata but using 2 sorts you can't pass all 3000 tests because you'd be doing more than required.
With a smaller range
[-20..20]
your "double sorting" passes all 3000 tests, same for me. With the current integer range[-200..200]
you complete almost 2900 tests. I do 3000 tests. With a bigger range[-500..500]
you complete almost 2500 tests. I still can complete all 3000.Am I disallowing same algorithms, or are they actually less optimal than needed?
wrong: both our solutions sometimes pass or sometimes fail, with 3000 random tests with values in range of -500 -> 500.
EDIT: oops, you talked about my former solution.
EDIT²: either way, that doesn't change anything because you're not talking about what I meant: you're comparing
O(N log N + n log n + x*n)
while I was talking aboutO(N log N + x*n)
. The double sorting can be considered as nicely fitered out, while your current implementation is filtering out many implementations of the second kind. And that is what I talked about.Wait there's only 3000 tests? I thought there were 5k. Damn.. I'm stuck at ~2700 with basically the same code I used for Giancomo's.
3000 in python, but 5000 in JS, iirc
This comment has been hidden.
https://www.codewars.com/kumite/5af0aa61783bb4651f00008d?sel=5af0bb2d2c5061aba20000f3
you are disabling same algorithms, yes. :p
EDIT: well, "most of the times", it fails x) (the ratio fails/passes is inverted with the original implementation)
But can you do this without sorting 2000 integers 3000 times in 6 seconds? ;)
No, seriously, I took my solution as a reference not to filter out "the unwanted", but because it's more effective than those dumb ones sorting back and forth. Yes, you're sorting the input array only once with this solution which saves you a lot but how dare you call it the same if it still passes only 2500 tests with integer range
[-500..500]
while mine one doesn't give a damn about it ???I could instantly win this argument by making the tests harder so all these mythical "same" solutions fail miserably, not only "some" of them, but I wonder if I should really make it that harsh.
you're joking...?
my current solution to YOUR kata behaves
exactly the samebetter than yours when increasing the range.Please check what you're saying before exposing such kind of things. Did you read the comment I wrote under your solution? Increasing the range, that's YOUR solution that becomes slower because of your dict that contains more items, not mine... (even if mine is a bit slower with the smaller ranges, yes, but it's independant of the number of duplicates, on the contrary of yours) Proof.
Meh, I'm focusing on too much stuff already and mixing up everything...
Well, I'm currently testing you actual solution against mine increasing the integers range (tested range
[-1000..1000]
with both 2000 numbers chosen out of 5000 and 2000 simply shuffled with each test) and they're still passing the tests in 12 sec both (despite mine theoretically getting slower and slower each time).In any case, I don't understand why I have to defend here if all you've presented until now is one working solution which does better than
O(N * log(N) + n * log(n))
, even though you're the one talking about "many solutions which are too slow for no reason"...well, you wanted proof, I provided. If you do not wanna see them or rather, if you do not wanna get my point, that's up to you.
cheers.
at the very least: add a
microoptimization
tag to your kata.EDIT: btw, not sure you saw this (to disable python 2.7)
I'd see proof, if you showed me at least one solution which has the same time complexity (as you're saying) but for some reason is not quick enough. All this time you've been talking about your quick, working solution and saying that "lots of these exist but unforunately, I (somehow) filter them out" -_-
EDIT: I can't disable Python 2.7 because I don't know how to.
This comment has been hidden.
you didn't look at the modification I did in the test suite, you...... :/
"er".translate(str.maketrans("e","r"))
-> add that in the test suite (anywhere) and python 2 will fail(if you wanna secure it, wrap it in the try/except block, using a Test.except(False, "don't use python 2")(no, that's a bad idea -> EDIT^n: well, actually, maybe not... Up to you...)Well, at least this one should be able to pass the test.
But I can still instantly come back and win by replacing
3000 tests with 2000 numbers
by2000 tests with 3000 numbers
which will make any sorting solution a lose (and invalidate everything:\
).EDIT: I saw the modification but I had no idea it would break Python 2 attempts -_-
well, and what do you think about this one: new third fork? ;)
In any case, thanks to you and smile67 now I know that the kata is hardly challenging and simple solutions may pass the tests. Soon it's gonna be different (JS version turned out to be easily passable with a sort-slice-sort approach, one of the reason for Python boys to suffer, on the other hand, is you :D )
my solution is marked as invalid, though it still passes the tests, actually... :/
Now it's 1100 out of 1500.
Well, now it's all fixed ¯\_(ツ)_/¯
seriously...? You're making it worse.
Could you be more specific than? If you're talking about even more solutions becoming inefficient, than that was the goal - enforcing the use of better algos under the given circumstances.
the problem is that the only thing you're currently enforcing is more MICRO-optimizations, not a better overall algorithm. That will lead to more and more similar algorithms/approaches that will time out where they shouldn't. That will lead to a low quality kata (from the user's point of view).
Again, your algo is efficient because you know what you're putting in the arrays and what will be the value of
n
, but you do not provide the informations to the user. Just look at the current solutions: not ANY of them use a similar approach. That is because you're the only one who knows that:That means that you're the ONLY ONE who knows that you're approach will be efficient. Proof: try your solution with range(-10000, 10000) for the values: your approach will fail the random tests. That means that your implementation is specific to a subtask of what is present in the description at the moment, but you do not tell about it. Using a solution the relies on hidden facts to setup the difficulty is just bad. So for now, the kata is badly setted up, yes.
I just tried again (And saw you actually provide a bit more information in the description, now. That's a beginning): the very same solution fails sometimes at 1068 tests and sometimes at 1425/1500. That means there is a problem in the variability of the random tests, don't you agree?
UPD: Somebody approved it with the same rank as non-performant version.
Insane >_>
This is definitely not 6 Kyu. It should be 5 at the very least.
Your solution is
O(len(arr) * log(len(arr)))
in the worst case, so what exactly are the requirements?The classical
O(n*k)
vsO(n*log(n))
debate again?Actually, my first
O(len(arr) * log(len(arr)))
solutions timed out, so I thought a different asymptotic complexity was expected, but I just had a big constant in the reordering part; so I think it should be OK.This comment has been hidden.
Assuming an infinite alphabet, the worst case is
m == n - 1
, the same asm == n
. So, more precisely, it'sO(k * log(k))
wherek = min(n, alphabet size)
, but there are enough characters in Unicode to consider it infinite in many contexts.