Improved the regex search by making it (relatively) simpler, and added a new test case.

Regex Explanation:
Let's break the regex down bit by bit.

  1. (?:\w+\.|\/\/)
  • (?:) means a "non-capturing group", so the findall function will not return what it matches inside this group.
  • \w+\. means a series of word characters ending with a "." (so "www." and "maps." both get matched)
  • \/\/ means "//" (the "\" is used as an escape character)
  • the middle "|" means either the first part ("\w+\.") or the second ("\/\/")
  2. (?:www\.)?
  • this again checks for an additional "www.", but the ()? means "either 0 or 1 of this". This is used so the new "http://www.youtube.com" test case is handled correctly (without it, the match would stop at "www.youtube" and return "www" instead of "youtube").
  3. (\w+)
  • this is the website name, a series of one or more word characters; it is the only capturing group, so it is what findall returns
  4. \.\w+
  • the ending portion of the website name, a "." followed by a series of word characters (like ".com", or the ".co" in ".co.jp")
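To make the breakdown above easier to follow, here is a minimal sketch that writes the same pattern with re.VERBOSE so each numbered piece carries its own comment. The names DOMAIN_RE and WITHOUT_WWW and the sample URLs (other than the youtube test case) are assumptions chosen for illustration; they are not part of the original solution.

```python
import re

# Same pattern as the solution, split across lines with re.VERBOSE.
# DOMAIN_RE is a hypothetical name used only in this sketch.
DOMAIN_RE = re.compile(r"""
    (?:\w+\.|\/\/)   # 1. either "word." or "//" (non-capturing)
    (?:www\.)?       # 2. an optional extra "www." (non-capturing)
    (\w+)            # 3. the website name, the only captured group
    \.\w+            # 4. a "." followed by more word characters
""", re.VERBOSE)


def domain_name(url):
    # findall returns only the captured group; take the first match.
    return DOMAIN_RE.findall(url)[0]


print(domain_name("http://www.youtube.com"))  # youtube
print(domain_name("www.xakep.ru"))            # xakep
print(domain_name("http://google.co.jp"))     # google

# Dropping piece 2 shows why it is needed: the match then stops at
# "//www.youtube", so the captured name is "www".
WITHOUT_WWW = re.compile(r"(?:\w+\.|\/\/)(\w+)\.\w+")
print(WITHOUT_WWW.findall("http://www.youtube.com")[0])  # www
```

Only the URLs shown here were checked; the sketch makes no claim about inputs outside that list.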
Code

    import re

    def domain_name(url):
        return re.findall(r"(?:\w+\.|\/\/)(?:www\.)?(\w+)\.\w+", url)[0]

Diff

      import re
      def domain_name(url):
    -     res = re.search(r"(?:(//www\.|//|(?!//)www\.|maps\.))(?P<reqResult>.+?)(?:\.)", url)
    -     return res.group(2)
    +     return re.findall(r"(?:\w+\.|\/\/)(?:www\.)?(\w+)\.\w+", url)[0]
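
As a quick sanity check of the change, the sketch below runs the previous and the new version side by side on a few URLs. The function names domain_name_old and domain_name_new and the SAMPLE_URLS list are hypothetical and exist only for this comparison; only "http://www.youtube.com" is taken from the description above.

```python
import re


def domain_name_old(url):
    # Previous version: explicit prefixes plus a lazy named group.
    res = re.search(
        r"(?:(//www\.|//|(?!//)www\.|maps\.))(?P<reqResult>.+?)(?:\.)", url
    )
    return res.group(2)


def domain_name_new(url):
    # New version: generic "word." or "//" prefix, optional "www.".
    return re.findall(r"(?:\w+\.|\/\/)(?:www\.)?(\w+)\.\w+", url)[0]


# Assumed sample inputs, not the full test suite.
SAMPLE_URLS = [
    "http://www.youtube.com",
    "http://github.com/carbonfive/raygun",
    "http://google.co.jp",
    "www.xakep.ru",
]

for url in SAMPLE_URLS:
    old, new = domain_name_old(url), domain_name_new(url)
    assert old == new, url
    print(url, "->", new)
```

Agreement on these inputs only suggests that the rewrite keeps the behavior for the listed cases; it does not cover URLs outside the list.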