Project requires: scraping a predefined list of URLs, custom keyword search of the stored pages, and Google Translate of the results
1) Take a predefined list of URLs (obtained from a database, by reference to specific fields in each record);
2) Scrape and store the entire website at each URL, text only ("Level 0 Copy"). Avoid duplication of, or recursion into, pages generated from URL-local content databases. Time and depth limits must be admin-editable, and the run must be flagged if either limit is reached;
3) Search the stored Level 0 Copy of each page for certain keywords and keep only those pages with a positive search result ("Level 1 Copy");
4) Search the remaining Level 1 Copy pages for a separate set of keywords and keep only those pages with a positive result ("Level 2 Copy");
Up to 5 iterations of search-and-keep: the structure is multiple filters, applied sequentially, producing a small number of remaining pages that match all keyword sets;
5) Final step: if the keyword searches were conducted in non-English languages (defined at the beginning of the process), send the resulting pages (after all 5 filters) to Google Translate for translation into English;
6) Output: an Excel spreadsheet report on search metadata and results (template will be provided), and a PDF of all remaining pages at the end of the process, with translated copy if applicable.
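As an illustration only, steps 2 to 4 could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the required implementation: the function names (crawl, filter_cascade), the fetch callable, the default limits, and the rule that a page is kept when it contains any keyword from a set are all hypothetical and would need to be confirmed in discussion. Duplicate pages are avoided here only by dropping URL fragments and remembering visited URLs; pages generated from content databases would likely need stricter handling.

```python
import time
from collections import deque
from urllib.parse import urldefrag

MAX_FILTERS = 5  # the posting allows up to 5 search-and-keep iterations


def crawl(start_url, fetch, max_depth=3, max_seconds=60.0):
    """Breadth-first crawl with depth and time limits.

    fetch(url) is a hypothetical callable returning (text, links).
    Returns (pages, limits_hit): pages maps URL to text (the Level 0 Copy),
    limits_hit records whether the depth or time limit was reached.
    """
    deadline = time.monotonic() + max_seconds
    limits_hit = {"depth": False, "time": False}
    start = urldefrag(start_url).url      # drop any #fragment
    queue = deque([(start, 0)])
    seen = {start}                        # visited set prevents re-crawling
    pages = {}
    while queue:
        if time.monotonic() > deadline:
            limits_hit["time"] = True     # flag for the admin report
            break
        url, depth = queue.popleft()
        text, links = fetch(url)
        pages[url] = text
        for link in links:
            link = urldefrag(link).url
            if link in seen:
                continue
            if depth + 1 > max_depth:
                limits_hit["depth"] = True
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return pages, limits_hit


def filter_cascade(pages, keyword_sets):
    """Apply keyword sets in order; return [Level 0, Level 1, ...].

    Assumption: a page passes a filter if it contains at least one
    keyword from the set (case-insensitive substring match).
    """
    if len(keyword_sets) > MAX_FILTERS:
        raise ValueError("at most %d keyword sets are supported" % MAX_FILTERS)
    levels = [dict(pages)]
    for keywords in keyword_sets:
        lowered = [k.lower() for k in keywords]
        kept = {
            url: text
            for url, text in levels[-1].items()
            if any(k in text.lower() for k in lowered)
        }
        levels.append(kept)
    return levels
```

Passing fetch in as a parameter keeps the sketch testable without network access; a real fetcher would download each page, extract its text and links, and apply whatever rate limiting and error handling the target sites need.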
This is stage 1 of a larger project and must be completed quickly. Clean code is required because of the long-term nature of the project. The next stage follows on immediately.
You must advise on any external processing (e.g. cloud) requirements. All testing and development to be on your own account. Final large scale test only to be conducted on my accounts.
Your bid must have at least a price estimate, subject to confirmation following discussion and provision of more detailed description of sources and process.
Bids on this Project
I'm a computer and electrical engineer. I have done many projects here, ranging from web scraping to data mining and machine learning. I can code in most common languages (C++, Java, Python, PHP) and can also do scientific programming (MATLAB, R). I always deliver on time and will go over the project again if needed until the employer is satisfied.
San Ramon, United States
I am a Data Scientist interested in projects that allow me to explore and transform raw client data into new insights and value. I have extensive experience developing in Java and Python. Most of my projects have been deployed in Amazon Web Services (AWS), but I am comfortable with the Microsoft Azure platform as well. My skills also include applying statistical methods and machine learning to client data. Specifically, my projects have involved working in a cloud environment with large amounts of raw, unstructured data. Should your project require massively parallel computation, I have experience completing projects utilizing GPU clusters (CUDA), MapReduce/Hadoop, and Message Passing Interface (MPI). If you would like assistance turning your idea into an actionable project plan, general advice on your project, or have questions about my capabilities, please feel free to contact me. -William
Mirpur Azad Kashmir, Pakistan
Hello, it's Umair here. I am an experienced freelancer with more than 5 years of experience in the web scraping field. I have worked with big and small companies from all over the world, and I know how to fulfill client requirements. Client satisfaction is my priority, not the money. I provide the following services: ✔ Web scraping ✔ Data mining ✔ Email scraping ✔ Lead generation ✔ Database creation ✔ Business list creation ✔ Web research ✔ Business directory scraping. I guarantee you high-quality work with 100% accuracy. Consider me the best option for your project and I will never disappoint you. Thanks
Hochiminh city, Vietnam
PHAMTECH Co is a professional web design and programming group located in Hochiminh city, Vietnam, founded in 2007 by Hung Pham, a Tufts College graduate (Boston, MA, USA). We deliver US-standard service at a competitive price. Our areas of expertise are: + Website design (in Photoshop files). + Website coding: converting Photoshop designs into functional HTML5/CSS3 websites. + WordPress theme development: converting Photoshop designs (or HTML) into fully functional WordPress websites. + WordPress plugin development: designing and coding WordPress plugins to accomplish anything from general to specific tasks.
Midnapore West, India
Vooraf Technology was conceived and established in 2007. We felt there was a niche in the market for a company to provide 'value for money' custom website designs. We offer a full range of Web services - As well as website design and website redesigns we offer basic and full optimization services, marketing and promotion advice and search engine submission services as well as administrative services. We provide a competitive edge because our solutions are agile and flexible allowing our clients to react, on their own, to competitive threats and changes in the marketplace. Vooraf is proud to say customer satisfaction is one of our proven commodities. How do we know this? Because a majority of our clients still partner with us today as they continue to realize our speed & quality, low Total Cost of Ownership (TCO) and quick Return On Investment (ROI).
Virtual Assistant: We are available 24/7, round the clock. Data Entry: Data conversion, shopping cart product upload / management. Research: Research of the web or any data source. Customer Service: Round the clock. Back Office Management: For all types of back office responsibility. Custom Project: You can name it as you like, and we are here to provide that service too. We accept all niches! NOTE: If you require a service that you do not see listed here, please contact me, as I might be able to help you. Now something about us: An established team who love what they are doing! We value only quality. We work for our clients' 100% satisfaction. Friendly support, and we meet our deadlines. Not sure if we can get that done? Then please allow us to do a FREE sample piece of work. I am very sure you will be glad you did. Go ahead, try new things.
I specialise in one thing: Visual Basic .NET. Check out the CV website link that I gave in the bid I placed for your project to see screenshots and videos of my past projects and much more.
==================================================
Software
==================================================
1. Backup Software
A full-fledged, cross-platform desktop application that backs up files. It allows you to select the types of files to back up, and the folders to search in. The backup software runs on Windows, Mac, Linux and Solaris.
2. PPC Software
A cross-platform desktop application that lets you create PPC campaigns for Google AdWords and Bing Ads. The PPC software runs on Windows, Mac, Linux and Solaris.
==================================================
Websites
==================================================
1. Emerald Island Rentals
2. Florida Homes Disney
3. Insight Global Partners
4. Better Leadership Blog
==================================================
Flash Animations
==================================================
1. New Deal Used Cars
2. Aussie Taxi Driver
==================================================
Web Scrapers, Crawlers, Bots
==================================================
1. Google AdWords Keyword Planner
A bot that automates keyword research on the Google AdWords Keyword Planner.
2. Lenovo Outlet
A bot that automates the monitoring and adding to cart of products on Lenovo Outlet.
3. TULA Baby Carriers
A bot that automates the monitoring and adding to cart of products on TULA Baby Carriers.
4. BlaBlaCar
A bot that automates the creating and updating of trips on BlaBlaCar.
5. Costco
A scraper for products of Costco.
6. Booker
A scraper for products of Booker.
7. JJ Food Service
A scraper for products of JJ Food Service.