In Progress

Fix a web scraper program created in Python -- 2017

We have a web scraping program, written in Python, that has stopped working properly. I need someone to go through the program files and fix the part that is no longer working. We need to end up with a working program.

This program was fixed 2 years ago, and we have included all of the original files plus the fixed files (in several revisions; we do not know which revision is the most current).

The ECFR for [url removed, login to view] file is the most current program that was running correctly.

The purpose of the program is to:

1. Spider through a website and download all files that result from the spidering

2. Format each file downloaded to a specific format

The program appears to work properly and goes through the same steps as it always did, but no output files are created. There is a good chance that a change is needed in the 'search and replace' file, "[url removed, login to view]".

I have attached the program files, including source code.

The following is the program description:

Part one:

You will be given a batch of starting URLs that look like this:

[url removed, login to view]

You will follow each of these URLs, which leads to another page with links that look like this:

[url removed, login to view]

You will follow each of those URLs, which again leads to another page with links that look like this:

[url removed, login to view]

You will now follow each of these links, which leads to a page that links to specific documents. The links within these pages tend to look like this:

<table width="480"><tr>

<td><table width="120">

<tr><td>

<a class="tpl" href="/cgi/t/text/text-idx?c=ecfr&SID=f68f503ab8017206c54fb367aaaa7851&rgn=div8&view=text&node=10:1.0.1.1.4.1.56.1&idno=10">

§5.100</a></td></tr>

</table></td>

<td><table width="354">

<tr><td>Purpose and effective date.</td></tr>

</table></td>

</tr></table>
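
As a rough illustration of the link-following step above, here is a minimal sketch that pulls the `<a class="tpl">` links out of a page using only the Python standard library. The names `TplLinkParser` and `extract_tpl_links`, and the `https://example.invalid/` base URL in the usage note, are assumptions made for the example; they are not taken from the attached program, and the real site address is not shown in this posting.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TplLinkParser(HTMLParser):
    """Collect (href, link text) pairs for <a class="tpl"> anchors."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the tpl anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "tpl" in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_tpl_links(html, base_url):
    """Return (absolute URL, section label) pairs for every tpl link in html."""
    parser = TplLinkParser()
    parser.feed(html)
    parser.close()
    return [(urljoin(base_url, href), text) for href, text in parser.links]
```

Calling `extract_tpl_links(page_html, "https://example.invalid/")` on the sample table above would yield one pair whose label is the section number shown in the link text. A standard-library parser is used here only to keep the sketch dependency-free; the existing program may well use a different library.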

Each of these links leads to a page that needs to be saved using a naming structure that looks like this:

[url removed, login to view]

Other examples of naming structures:

6cfrAppendix A to Part [url removed, login to view]
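
The level-by-level link following described in Part one can be sketched as a depth-limited walk in which the number of levels is a parameter. This is only an illustration under assumptions: `crawl_to_depth`, `fetch`, and `extract_links` are hypothetical names, and the download and link-extraction functions are passed in by the caller so the sketch itself does no network access.

```python
def crawl_to_depth(start_urls, levels, fetch, extract_links):
    """Follow links `levels` times from the start URLs; return the last level.

    fetch(url) returns the page text, and extract_links(html, base_url)
    returns a list of absolute URLs. Both are supplied by the caller.
    """
    frontier = list(start_urls)
    seen = set(frontier)
    for _ in range(levels):
        next_frontier = []
        for url in frontier:
            for link in extract_links(fetch(url), url):
                if link not in seen:      # skip pages that are already queued
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return frontier
```

With this shape, a title that has one fewer level would be handled by passing a smaller `levels` value rather than by changing the code.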

Part two of this project:

After you have downloaded each file, you will need to put each file into a specific html page structure.

1. You will first strip all of the information before <!-- startDynamic --> and after <!-- endDynamic -->

2. You will now need to create a header for each record that looks like the files that are part of the samples.

3. You will need to replace the following strings wherever a graphic appears in the text:

Example:

Please replace:

<img src="/graphics/

With this string:

<img src="[url removed, login to view]

AND replace this string:

<a href="/graphics/pdfs/

With this string:

<a href="[url removed, login to view]

4. You will need to create a footer at the bottom of each section, after the <p class="cita"> paragraph, that looks like this example:

<p class="cita">[54 FR 53314, Dec. 28, 1989]</p>

<br><p><center>Copyright 2017 Compliance Publishing Corporation (877) 500-6737</center>

</body>

</html>

5. You must be able to accommodate both regular regulations and the Appendix sections

6. Some of the titles have one fewer level. The program must let the user select how many levels deep the individual text is located.
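
Steps 1, 3, and 4 above could be sketched roughly as follows. This is an illustration only, not the code in the attached files. The replacement prefixes are elided in this posting, so `IMG_PREFIX` and `PDF_PREFIX` below are placeholder values on a hypothetical `example.invalid` host. The header is passed in by the caller because the sample files are not shown here, and the sketch assumes the citation paragraph is the last element of the section so the footer can simply be appended.

```python
START_MARK = "<!-- startDynamic -->"
END_MARK = "<!-- endDynamic -->"

# Placeholder prefixes: the real replacement strings are not shown in the posting.
IMG_PREFIX = '<img src="https://example.invalid/graphics/'
PDF_PREFIX = '<a href="https://example.invalid/graphics/pdfs/'

FOOTER = (
    "<br><p><center>Copyright 2017 Compliance Publishing Corporation "
    "(877) 500-6737</center>\n</body>\n</html>\n"
)


def extract_dynamic(html):
    """Return only the text between the startDynamic and endDynamic markers."""
    start = html.find(START_MARK)
    end = html.find(END_MARK, start + len(START_MARK)) if start != -1 else -1
    if start == -1 or end == -1:
        # Fail loudly so a changed page layout does not pass unnoticed.
        raise ValueError("startDynamic/endDynamic markers not found")
    return html[start + len(START_MARK):end]


def rewrite_graphics(body, img_prefix=IMG_PREFIX, pdf_prefix=PDF_PREFIX):
    """Swap the site-relative graphics paths for the configured prefixes."""
    body = body.replace('<img src="/graphics/', img_prefix)
    return body.replace('<a href="/graphics/pdfs/', pdf_prefix)


def format_page(html, header):
    """Strip, rewrite, and wrap one downloaded page with header and footer."""
    body = rewrite_graphics(extract_dynamic(html)).strip()
    return f"{header}\n{body}\n{FOOTER}"
```

Raising an error when the markers are missing is a design choice for this sketch: if the site's markup has changed, the run stops with a message instead of finishing without writing anything.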

The project must be completed in 7 days.

Screenshot added to show the error while running the program:
cfrprogramerror.jpg

Skills: Javascript, PHP, Python, Software Architecture, Web Scraping

About the Employer:
( 76 reviews ) Edina, United States

Project ID: #15676112

Awarded to:

nasirhjafri

I will fix your python scraper. Relevant Skills and Experience I've 2+ years of experience in python scrapy and twisted. Proposed Milestones $88 USD - fix the script I've read the requirements. Could you kindly inbo More

$88 USD in 2 days
(14 Reviews)
4.8

24 freelancers are bidding on average $139 for this job

$155 USD in 3 days
(55 Reviews)
6.4
lkhelladi

Hello, I'd be glad to fix your scrapping tool . Looking forward to chat with you soon for more details. Best regards, Relevant Skills and Experience Web scraping , Python , urllib2 , beautifulsoup ... I cordiall More

$98 USD in 3 days
(104 Reviews)
6.6
schoudhary1553

Greeting, I have understood your web scraping task and can do it with your 100% satisfaction. Please ping me for more discussion. Relevant Skills and Experience I have more than 5 years of experience in Javascript, More

$250 USD in 5 days
(40 Reviews)
6.0
hunmin888

Hi, sir! I have a close look to your project. I have a good skill in python programming. If you award this project to me, we'll complete it in time. Our budget may be negotiable Thanks Relevant Skills and Experience More

$155 USD in 3 days
(30 Reviews)
5.8
IMdaystar

Hello,Sir How are you? Relevant Skills and Experience I have extension experienced in developing Javascript, PHP, Python, Software Architecture, Web Scraping for 5 years. I will work very hard and best for you. Prop More

$155 USD in 3 days
(6 Reviews)
5.3
webDevsolutionz

Already Done Many Similar Projects’, I will assure you that i will provide 100% Satisfaction Guarantee. 100% Quality Guaranteed Happy to show you previous work completed. Relevant Skills and Experience PHP, WordPress, More

$99 USD in 2 days
(53 Reviews)
5.4
$166 USD in 7 days
(48 Reviews)
5.2
kartikeyagupta

hello, i have expertise in data scrapping and i extensively use python for this purpose. Ping me to disucss more about fixing the scrpt or to make a new one Relevant Skills and Experience python/selenium/bs4/scrapy P More

$155 USD in 3 days
(34 Reviews)
4.9
shinelancer

*** I can do it. I have experience writing many web project using python. You can see my portfolio on software development. I hope to be your good friend through this project. Relevant Skills and Experience python, we More

$150 USD in 3 days
(12 Reviews)
4.7
superman1987417

Greeting I am expert Data Miner with high python programming skills. And then, I have experienced with some risk solving. I can do it quickly and perfectly. Regards. Relevant Skills and Experience Python, Data Mining More

$150 USD in 1 day
(15 Reviews)
4.4
etuannv

Dear Sir, I can do this some. I have done many similar jobs. Let's see them on my profile or these demo videos: [url removed, login to view] Relevant Skills and E More

$155 USD in 3 days
(16 Reviews)
4.4
Nada100200

Over 8 +years experience writing almost exclusively web scraping code. I've done it all. I can scrape all LinkedIn profile My languages in order of experience and use is Python, JavaScript, PHP. Python libraries ( sele More

$100 USD in 7 days
(5 Reviews)
3.9
utkarshv43

A proposal has not yet been provided

$166 USD in 2 days
(3 Reviews)
3.4
dinesh8921

I can help you with [url removed, login to view] me Relevant Skills and Experience more than 5 years experience in full stack development Proposed Milestones $250 USD - code

$250 USD in 2 days
(2 Reviews)
3.1
skriyaz09

I have been working as a software developer for more than two years on python scripting and having good experience in web scraping using python

$110 USD in 5 days
(8 Reviews)
3.5
nikksbagul

I have 3+ years of experience in python. I have done many web scrapping projects earlier. I can show you my work if you want. Relevant Skills and Experience python, web scrapping Proposed Milestones $222 USD - Done p More

$222 USD in 3 days
(2 Reviews)
2.3
lntrx

Will take no time :) In a few hours Relevant Skills and Experience Python, HTML, javascript Stay tuned, I'm still working on this proposal.

$122 USD in 3 days
(1 Review)
0.6
$155 USD in 3 days
(0 Reviews)
0.0
mohdlatief

Hello, My name is latief a certified  lead generation expert,got 6+ years of experience in lead generation. I have designed a tool that makes use of API technologies and i use that to generate leads. I am fully exper More

$30 USD in 3 days
(0 Reviews)
0.0
$61 USD in 4 days
(0 Reviews)
0.0