Extract, word by word, all text and metadata from an HTML page. Store billions of HTML pages in a structured form in a SQL database so that words and tags can be analyzed, while minimizing the storage space required and maximizing performance. It must still be possible to reconstruct each HTML page with the same text, including punctuation marks and tags.
See the instructions, the data to be extracted, and the expected results for an example at:
[login to view URL]
You can use a popular, well-maintained HTML DOM parser such as AngleSharp.
The code must be written in C# using async methods, and it must return results quickly and efficiently.
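To make the storage requirement concrete, here is a minimal sketch of one possible layout, not a prescribed design: every distinct token (tag, word, punctuation mark, or whitespace run) is stored once, and a page is stored as an ordered list of token IDs, so the page can be rebuilt exactly. It is written in Python with SQLite only to keep it short and self-contained; a C# version would follow the same shape. The table names, the function names, and the regex tokenizer are all assumptions made for illustration, and the regex is a crude stand-in for a real DOM parser such as AngleSharp.

```python
import re
import sqlite3

# Hypothetical schema: each distinct token is stored once in `token`;
# `page_token` keeps the order of token ids for every page.
SCHEMA = """
CREATE TABLE token (
    id   INTEGER PRIMARY KEY,
    text TEXT NOT NULL UNIQUE
);
CREATE TABLE page_token (
    page_id  INTEGER NOT NULL,
    pos      INTEGER NOT NULL,
    token_id INTEGER NOT NULL REFERENCES token(id),
    PRIMARY KEY (page_id, pos)
);
"""

# Simplified tokenizer (assumption): a tag, a word, a whitespace run,
# or any other single character. Every character lands in exactly one
# token, so joining the tokens gives back the original string.
TOKEN_RE = re.compile(r"<[^>]*>|\w+|\s+|.", re.DOTALL)


def store_page(conn, page_id, html):
    """Split the page into tokens and store it as a sequence of token ids."""
    rows = []
    for pos, text in enumerate(TOKEN_RE.findall(html)):
        # Insert the token only if it is not in the dictionary yet.
        conn.execute("INSERT OR IGNORE INTO token (text) VALUES (?)", (text,))
        token_id = conn.execute(
            "SELECT id FROM token WHERE text = ?", (text,)
        ).fetchone()[0]
        rows.append((page_id, pos, token_id))
    conn.executemany("INSERT INTO page_token VALUES (?, ?, ?)", rows)


def load_page(conn, page_id):
    """Rebuild the original HTML by joining the tokens in stored order."""
    cur = conn.execute(
        "SELECT t.text FROM page_token p "
        "JOIN token t ON t.id = p.token_id "
        "WHERE p.page_id = ? ORDER BY p.pos",
        (page_id,),
    )
    return "".join(row[0] for row in cur)


def count_occurrences(conn, token_text):
    """Example analysis query: how often a word or tag occurs in all pages."""
    return conn.execute(
        "SELECT COUNT(*) FROM page_token p "
        "JOIN token t ON t.id = p.token_id WHERE t.text = ?",
        (token_text,),
    ).fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    page = "<p>Hello, world! Hello again.</p>"
    store_page(conn, 1, page)
    print(load_page(conn, 1) == page)        # True: the round trip is lossless
    print(count_occurrences(conn, "Hello"))  # 2
```

At the scale of billions of pages, the per-token lookups would need batching and the ID sequences would probably need a more compact encoding; the sketch only shows the dictionary idea and the lossless round trip.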
Please state in your answer:
1. Whether you have experience with this, and what that experience is
2. Which parser you would use
15 freelancers are bidding an average of €392 for this job.
Dear Client, I believe I am an expert in C#, and I am sure that I can satisfy your request. If you wish, you may test me. We can discuss the details of your project over chat. Sincerely.