Wikipedia Database Dump project:
1. Parse [login to view URL] files and extract only unique domain names. Domains on wikipedia.org should be excluded.
2. The script should work with a file from [login to view URL]
3. Two params: filename -> [login to view URL]; database settings -> SQL table to insert the URLs into (id, domain).
All params should be at the beginning of the file so we can customize them ourselves.
4. The software should run on Linux and use regex or another parsing technique.
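
As a rough illustration of requirement 3, the sketch below puts the editable parameters at the top of the file and writes unique domains into a table with columns (id, domain). The brief does not name a database engine, so SQLite (Python's built-in sqlite3 module) is used here only as a stand-in. INPUT_FILE, DB_PATH, TABLE_NAME and store_domains are hypothetical names, not part of the brief.

```python
import sqlite3

# --- Parameters: edit these (all names and values are hypothetical) ---
INPUT_FILE = "dump.sql"       # path to the downloaded dump file
DB_PATH = "domains.sqlite3"   # stand-in for the real database settings
TABLE_NAME = "domains"        # table that receives (id, domain) rows


def store_domains(conn, table, domains):
    """Insert each domain once into `table`; return the resulting row count."""
    # A table name cannot be passed as a bound parameter, so validate it
    # before placing it in the SQL text.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "domain TEXT NOT NULL UNIQUE)"
    )
    # The UNIQUE constraint plus INSERT OR IGNORE skips domains that are
    # already stored, so re-running the script does not create duplicates.
    conn.executemany(
        f"INSERT OR IGNORE INTO {table} (domain) VALUES (?)",
        ((d,) for d in sorted(domains)),
    )
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(DB_PATH)
    total = store_domains(connection, TABLE_NAME, {"example.com", "example.org"})
    print(f"{total} domains stored in {TABLE_NAME}")
    connection.close()
```

For MySQL or PostgreSQL the same structure would apply with a different driver and its own placeholder and "ignore duplicates" syntax.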
Example Domain Extraction:
[login to view URL] -> extract [login to view URL]
[login to view URL] -> extract [login to view URL]
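
One possible way to do the extraction, as a minimal sketch: a regex finds http(s) URLs in the raw text, urllib.parse pulls out the hostname, and a set keeps the result unique. The exact output format in the examples above is not visible, so this sketch keeps the full hostname (including any "www." prefix) and treats every subdomain of wikipedia.org as excluded; both choices are assumptions. URL_RE, EXCLUDED_SUFFIX, extract_domains and extract_from_file are hypothetical names.

```python
import re
from urllib.parse import urlsplit

# Matches http(s) URLs up to whitespace or a character that commonly
# delimits values in SQL dumps and wiki markup (quotes, brackets, commas).
URL_RE = re.compile(r"https?://[^\s'\"<>\[\]|{}\\,)]+", re.IGNORECASE)

# Assumption: wikipedia.org and all of its subdomains are excluded.
EXCLUDED_SUFFIX = "wikipedia.org"


def extract_domains(text, seen=None):
    """Add the hostnames of all URLs in `text` to a set and return it."""
    domains = set() if seen is None else seen
    for match in URL_RE.finditer(text):
        try:
            host = urlsplit(match.group(0)).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if not host:
            continue
        host = host.lower().rstrip(".")
        if host == EXCLUDED_SUFFIX or host.endswith("." + EXCLUDED_SUFFIX):
            continue
        domains.add(host)
    return domains


def extract_from_file(path):
    """Read a dump line by line so large files do not need to fit in memory."""
    domains = set()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            extract_domains(line, domains)
    return domains
```

Calling extract_from_file on the dump path returns the set of unique hostnames, which can then be written to the database table. If the dump is compressed, it would need to be decompressed first or opened with the matching module (for example gzip.open).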