Project Description:
I need a web scraper for vBulletin forums, developed in Python with Scrapy (scrapy.org).
In the item pipeline (see Scrapy's concept of item pipelines), I need to receive every message posted in a specified vBulletin forum, with the following fields:
- Member name
- Post HTML content
- Post date/time
- Post #id
- Post permalink
- Thread name
- Thread #id (the #id may appear at the beginning or at the end of the thread URL; this should be configurable)
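
To make the field list and the configurable thread #id concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: the class name ForumPost, its field names, the function extract_thread_id, the "start"/"end" setting values and the example URLs are all hypothetical, and real vBulletin URL schemes (pagination, extra query parameters) will need more handling. Recent Scrapy versions accept dataclass objects as items, so a plain dataclass is used here.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class ForumPost:
    """One forum post with the fields listed above (names are illustrative)."""
    member_name: str
    content_html: str
    posted_at: str
    post_id: int
    permalink: str
    thread_name: str
    thread_id: int


def extract_thread_id(url: str, position: str = "start") -> Optional[int]:
    """Return the numeric thread id found at the start or end of the URL slug.

    `position` is the hypothetical configuration value: "start" or "end".
    Returns None when no id is found at the configured position.
    """
    if position not in ("start", "end"):
        raise ValueError("position must be 'start' or 'end'")
    parsed = urlparse(url)
    # Use the query string if there is one (e.g. showthread.php?123-title),
    # otherwise the last segment of the path.
    slug = parsed.query or parsed.path.rstrip("/").rsplit("/", 1)[-1]
    # Drop a trailing file extension such as ".html".
    slug = re.sub(r"\.\w+$", "", slug)
    pattern = r"^(\d+)" if position == "start" else r"(\d+)$"
    match = re.search(pattern, slug)
    return int(match.group(1)) if match else None
```

For example, extract_thread_id("http://forum.example.com/threads/12345-some-title", "start") and extract_thread_id("http://forum.example.com/some-title-12345.html", "end") should both give 12345. The spider would fill one ForumPost per message and yield it, so that the pipeline receives it.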