WordPress Plugin: Sitemap Scraper and Visual Page/Category Creation

  • Status Closed
  • Budget $30 - $250 USD
  • Total Bids 4

Project Description

Plugin Overview

This plugin is meant to help someone build a brand-new website STRUCTURE in WordPress. That means analyzing the top 10 results in Google for a keyword, extracting the page, category, and post titles from each site's sitemap (or crawling through the site directly), and then letting the user visually pick which titles to use for home pages, category titles, or post titles. I have broken the workflow down into 2 stages.

Stage 1


A) Allow the user to enter a keyword

B) Scrape the top 10 Google results for that keyword and get the domain name of each.

C) Put those top 10 domains into an array [1-10]

D) Now go through each domain and check whether it has a sitemap, e.g. [url removed, login to view]

E) If it does, extract the category/page/post names from the sitemap

E) If it does not, use Scrapy (a Python framework) or a PHP crawling class to crawl the website to 'X' levels of depth and find all the internal links

i. For example, if the user enters 2 as the depth, the plugin will crawl 2 levels deep to identify the internal link structure. Maybe use the following routines (or variations of them)?

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

F) The plugin will then score each page title, category title, and post title by the Google rank of the site it came from. Titles from the first-ranked website get 100 points each, since Google has ranked that site as most important, and each site below it starts from a lower value based on its position: the 2nd website starts at 90, the 3rd at 80, and so forth. Any title that MATCHES a title on the 1st-position website gets a score of 100.
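
A minimal sketch of the scoring rule in step F, written in Python for illustration only (the plugin itself would presumably be PHP). The function names, the 10-point step between ranks, and the case-insensitive matching are assumptions; the posting only gives the 100/90/80 pattern and the "match the 1st site" rule.

```python
def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def score_titles(sites: list) -> dict:
    """Score titles by the Google rank of the site they came from.

    sites[0] is the list of titles from the rank-1 result, sites[1] from
    rank 2, and so on (hypothetical input shape).
    """
    scores = {}
    top_titles = {normalize(t) for t in sites[0]} if sites else set()
    for rank, titles in enumerate(sites, start=1):
        base = max(100 - 10 * (rank - 1), 0)  # 100, 90, 80, ... by position
        for title in titles:
            key = normalize(title)
            # A title that also appears on the rank-1 site is promoted to 100.
            value = 100 if key in top_titles else base
            # Keep the best score when several sites share a title.
            scores[key] = max(scores.get(key, 0), value)
    return scores


example = score_titles([
    ["Dog Food", "Dog Toys"],   # rank 1
    ["dog food", "Dog Beds"],   # rank 2
    ["Dog Leashes"],            # rank 3
])
```

Here "dog beds" ends up at 90 and "dog leashes" at 80, while "dog food" stays at 100 because it matches a title on the first-ranked site.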

Please see referenced document
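
For steps D and E, a rough Python sketch of the two discovery paths might look like the following. A sitemap lists URLs rather than titles, so guessing a title from the last path segment is an assumption of this sketch, not something the posting specifies. The `get_links` callable is hypothetical: it stands in for whatever fetches a page and returns the links found on it.

```python
import xml.etree.ElementTree as ET
from collections import deque
from urllib.parse import urlparse

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def titles_from_sitemap(xml_text: str) -> list:
    """Guess a title for each <loc> URL from its last path segment (assumption)."""
    root = ET.fromstring(xml_text)
    titles = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        path = urlparse((loc.text or "").strip()).path.strip("/")
        if not path:
            continue  # the home page has no slug to turn into a title
        slug = path.split("/")[-1]
        titles.append(slug.replace("-", " ").replace("_", " ").title())
    return titles


def crawl_internal_links(start_url: str, get_links, max_depth: int) -> set:
    """Breadth-first crawl that follows same-host links up to max_depth levels."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # record this page but do not expand it further
        for link in get_links(url):
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen
```

With `max_depth=2`, pages two clicks away from the start URL are recorded but their own links are not followed, which matches the "2 levels deep" example in the sub-item above.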

Stage 2


This is the trickier part; the above is mostly scraping and parsing with regex or some other DOM component. Stage 2 presents those titles and categories in a visual box list so that a user can manually drag and drop them to create a site structure. There is an abandoned plugin in the WordPress plugin directory called Visual Site Manager. It does not fully work right now because the developers haven't updated it.

[url removed, login to view]

Here's a good write-up of the plugin:

[url removed, login to view]

You can install it to test how it works, but again, not all of the functionality is there right now.

What I like about it is its use of the JIT SpaceTree visualization.

[url removed, login to view]

Obviously, if it's a brand-new site, all the main canvas would show is the top box representing the root of the site. By dragging and dropping each box onto the tree map that JIT lets you build, one can take a predefined set of titles, scraped from known authority sites and ranked in order of importance, and build a site structure from them.
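
As a sketch of the data behind that canvas, the tree could be kept as nested nodes and serialized to JSON for the visualization. The `id` / `name` / `data` / `children` shape below follows the tree format JIT's examples use as far as I know, but treat it as an assumption; the helper names and the `score` field are hypothetical, and Python is used only for illustration.

```python
import json
from typing import Optional


def make_node(node_id: str, name: str) -> dict:
    """Create an empty tree node in the assumed JIT-style shape."""
    return {"id": node_id, "name": name, "data": {}, "children": []}


def find_node(tree: dict, node_id: str) -> Optional[dict]:
    """Depth-first search for the node with the given id."""
    if tree["id"] == node_id:
        return tree
    for child in tree["children"]:
        found = find_node(child, node_id)
        if found is not None:
            return found
    return None


def drop_title(tree: dict, parent_id: str, node_id: str, name: str, score: int) -> dict:
    """Attach a scraped title under an existing node, as a drag-and-drop would."""
    parent = find_node(tree, parent_id)
    if parent is None:
        raise KeyError("no node with id %r" % parent_id)
    node = make_node(node_id, name)
    node["data"]["score"] = score  # hypothetical field carrying the Stage 1 score
    parent["children"].append(node)
    return node


# A brand-new site starts with only the root box.
site = make_node("root", "Home")
drop_title(site, "root", "cat-1", "Dog Food", 100)
drop_title(site, "cat-1", "post-1", "Best Dry Dog Food", 90)
tree_json = json.dumps(site)  # what would be handed to the front end
```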

Now, if someone has an existing site, the main JIT canvas will show the existing site structure and let them change titles there (click on a node to bring up an edit-text screen) or just drag and drop new categories/pages/posts into the site structure.
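
For the existing-site case, one possible approach is to load the current pages as flat rows and link them by parent id before showing them. The sketch below assumes rows of `(id, title, parent_id)` with parent `0` meaning top level, which mirrors how WordPress stores page hierarchy (`ID`, `post_title`, `post_parent`); the function names are hypothetical and the code is illustrative Python rather than plugin code.

```python
def build_tree(rows: list, root_name: str = "Home"):
    """Turn flat (id, title, parent_id) rows into a nested tree plus an id index."""
    nodes = {0: {"id": "0", "name": root_name, "data": {}, "children": []}}
    for page_id, title, _parent in rows:
        nodes[page_id] = {"id": str(page_id), "name": title, "data": {}, "children": []}
    for page_id, _title, parent in rows:
        # Rows whose parent is unknown are attached to the root.
        nodes.get(parent, nodes[0])["children"].append(nodes[page_id])
    return nodes[0], nodes


def rename(nodes: dict, page_id: int, new_title: str) -> None:
    """Change a node's title, as the click-to-edit screen would."""
    nodes[page_id]["name"] = new_title


def add_child(nodes: dict, parent_id: int, page_id: int, title: str) -> None:
    """Drop a new category/page/post under an existing node."""
    nodes[page_id] = {"id": str(page_id), "name": title, "data": {}, "children": []}
    nodes[parent_id]["children"].append(nodes[page_id])


rows = [(2, "About", 0), (5, "Team", 2), (7, "Blog", 0)]
root, index = build_tree(rows)
rename(index, 5, "Our Team")
add_child(index, 7, 9, "Dog Food Reviews")
```

Keeping an id-to-node index alongside the tree makes edits constant-time lookups instead of tree searches, which seems a reasonable trade-off for sites with many pages.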

Please have WordPress, JIT, and scraping experience.
