Word Correlations in Text Data

SQL
Databases
  • victorgarzonmz

    Guys, any tips for learning and memorizing JavaScript methods so I can improve my logic?

  • mortonfox

    Can a single word match more than one pattern?

    For example, "The quicklazyexample fox jumps over the dog." if "quicklazyexample" happens to be a valid word.

    If so, does that count as the relevant pattern pairs co-occurring in the sentence?

  • mortonfox

    Is there an implied ordering within the pair (pattern1, pattern2)?

    i.e. why does the example consider (lazy, quick) as a pair but not (quick, lazy)?

  • Voile

    The description does not rule out pairs that consist of the same pattern twice ((a, a)).

    It doesn't mention which pair should be selected either: when the pairs (a, b) and (b, a) both exist, which one should be kept in the results? Note that tiebreaking with pattern_text instead of pattern_id is inappropriate, as there is no guarantee that pattern_text is unique.

    The filter criterion is missing too: why is the pair (lazy, example) non-existent in the given example? (The kata probably requires both_count to be greater than 0, which should be specified. Or did you use a join where a left join would be more appropriate?)

    • Voile

      Counts the number of occurrences of each pattern and the number of sentences where both patterns co-occur.

      This sentence is outright wrong: the kata never requires us to count the number of occurrences of each pattern. We only need to count the number of sentences in which a pattern occurs.

    • dfhwze

      All of the above issues are valid, and I would like to add that regardless of whether we need to count occurrences of patterns or the number of sentences containing them, there should be fixed and random tests with multiple matches of patterns within a single sentence.

    • Voile

      there should be fixed and random tests with multiple matches of patterns within a single sentence

      Sample tests already have them, which is why I discovered this.

  • dfhwze

    The performance constraints seem a bit unbalanced to me. I would either use less data or add more (plus a performance tag).

  • dfhwze

    Somewhere in the description we read ..

    Counts the number of occurrences of each pattern

    And later we read ..

    The number of sentences where the first pattern appears.

    It's a bit of a contradiction, especially when a pattern occurs more than once in some sentences.
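
To make the two readings discussed in this thread concrete, here is a minimal Python sketch (not the kata's reference solution, which is SQL and is not shown on this page). Everything in it is an assumption made for illustration: the sample sentences, the `patterns` mapping, the helper `count_matches`, case-insensitive whole-word matching, ordering each pair by `pattern_id`, and dropping pairs whose `both_count` is 0.

```python
import re
from itertools import combinations

# Hypothetical sample data; the kata's real tables are not reproduced here.
sentences = [
    "The quick quick fox jumps over the lazy dog.",
    "A lazy afternoon.",
    "Nothing relevant here.",
]
patterns = {1: "quick", 2: "lazy", 3: "example"}  # pattern_id -> pattern_text


def count_matches(pattern_text, sentence):
    # Assumption: a match is a case-insensitive whole-word match.
    regex = rf"\b{re.escape(pattern_text)}\b"
    return len(re.findall(regex, sentence, flags=re.IGNORECASE))


# Reading 1: total occurrences of each pattern across all sentences.
occurrences = {
    pid: sum(count_matches(text, s) for s in sentences)
    for pid, text in patterns.items()
}

# Reading 2: number of sentences in which each pattern appears at least once.
sentence_counts = {
    pid: sum(1 for s in sentences if count_matches(text, s) > 0)
    for pid, text in patterns.items()
}

# Co-occurrence: each unordered pair is emitted once, smaller pattern_id first,
# and pairs that never co-occur are dropped (assumed filter: both_count > 0).
both_counts = {}
for a, b in combinations(sorted(patterns), 2):
    n = sum(
        1
        for s in sentences
        if count_matches(patterns[a], s) > 0 and count_matches(patterns[b], s) > 0
    )
    if n > 0:
        both_counts[(a, b)] = n

print(occurrences)      # {1: 2, 2: 2, 3: 0}
print(sentence_counts)  # {1: 1, 2: 2, 3: 0}
print(both_counts)      # {(1, 2): 1}
```

With this data, "quick" has 2 occurrences but appears in only 1 sentence, so the two readings give different answers as soon as a pattern repeats within a sentence. Iterating over `combinations` of sorted ids also sidesteps the (a, a) and (a, b) versus (b, a) questions, though whether the kata intends that ordering is not stated in the description.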