
Closed
Posted
Paid on delivery
Project Description: Find school districts and charter schools who use a specific vendor for a large list of domains.

I am seeking an experienced web scraping specialist to improve our Python script to analyze a large list of school district websites (approximately 4000+ URLs) and identify the ones who show a specific link on any page found in their sitemap. The primary method of identification must be to scan the website's [login to view URL] for specific, known vendor links.

Deliverables Required
1. A Production-Ready Python Script (.py file): The script must be commented, easily configurable, and capable of reading the provided CSV list, performing the scan, and generating the output CSV. It should handle timeouts and basic error handling gracefully.
2. The Final Results (CSV/Excel File): A clean data file containing the results for all URLs provided. The resulting CSV should include original variables like organization name, state and zip even though that data was not used in the scraper.

The script must perform the following steps for each URL in the input list:
1. Input: Read a list of URLs from a provided CSV file (single column of URLs).
2. Navigation/Rendering: Visit the URL (handling redirects is essential). The use of a headless browser (like Selenium/Puppeteer) or an advanced HTTP library is preferred, as some websites may load the footer content dynamically via JavaScript.
3. Targeted Scanning: Scan the HTML source code of all pages found in the sitemap, specifically looking for the presence of a specific link.
4. Output Logic:
   - If the link is found, record the identified vendor.
   - If no vendor is explicitly identified, the output should list the status as `"No Vendor Found"`.
   - If no website could be loaded, the script should log any failed connections or timeouts.

Output Format (CSV)
The final deliverable file should be structured with the same columns as the ones provided with the additional column to include your results.
Skills Required
- Expert proficiency in Python.
- Deep experience with web scraping libraries (e.g., Requests, BeautifulSoup, Scrapy, and especially Selenium/Puppeteer for dynamic content).
- Experience handling common web scraping challenges (redirects, user-agents, proxy usage (if necessary)).

To bid, please confirm your familiarity with scraping dynamic content and provide a brief description of the scraping approach you would use.
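
The output logic described above (record the vendor when a known link is found, otherwise report `"No Vendor Found"`, and keep every original column) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's script: the `VENDOR_DOMAINS` mapping, the `.example` domains, the `url` and `Vendor` column names, the `"Load Failed"` status, and all function names are hypothetical, and page fetching is deliberately left out.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical mapping of link domains to vendor names; the real list of
# known vendor links would come from the client.
VENDOR_DOMAINS = {
    "vendor-a.example": "Vendor A",
    "vendor-b.example": "Vendor B",
}


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_host(href):
    """Return the lowercased host of a link, or "" if it has none."""
    try:
        return (urlparse(href).hostname or "").lower()
    except ValueError:
        return ""


def find_vendor(html, vendor_domains=VENDOR_DOMAINS):
    """Return the first matching vendor name, or "No Vendor Found"."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        host = link_host(href)
        for domain, vendor in vendor_domains.items():
            # Match the domain itself and any subdomain of it.
            if host == domain or host.endswith("." + domain):
                return vendor
    return "No Vendor Found"


def annotate_rows(input_csv_text, pages_by_url, url_column="url",
                  result_column="Vendor"):
    """Copy every input column and append one result column.

    pages_by_url maps each row's URL to a list of already-fetched HTML
    strings; a URL missing from the mapping is treated as a failed load.
    """
    reader = csv.DictReader(io.StringIO(input_csv_text))
    out = io.StringIO()
    fieldnames = list(reader.fieldnames) + [result_column]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        pages = pages_by_url.get(row[url_column])
        if pages is None:
            row[result_column] = "Load Failed"  # hypothetical status label
        else:
            hits = [v for v in map(find_vendor, pages)
                    if v != "No Vendor Found"]
            row[result_column] = hits[0] if hits else "No Vendor Found"
        writer.writerow(row)
    return out.getvalue()
```

Matching on the link's host rather than on a raw substring avoids false positives where a vendor name merely appears in page text, though whether that is the right rule depends on what the client's known vendor links look like.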
Project ID: 40223773
138 proposals
Remote project
Active 18 days ago
138 freelancers are bidding on average $144 USD for this job

Hello, thank you for posting this job. I have checked your project carefully. I am a Python expert with experience in web scraping using Puppeteer, AdsPower, Scrapy, and the Selenium driver, so this project is very interesting and motivating for me. It is an ideal match for my skills and experience. If you hire me, you will get a perfect result and service as soon as possible. I will work my hardest for your success. Thanks & Regards.
$140 USD in 7 days
7.7

⭐⭐⭐⭐⭐ Improve Python Script for Web Scraping School Districts ❇️ Hi My Friend, I hope you are doing well. I've reviewed your project requirements and noticed you're looking for a web scraping specialist. You have no need to look any further; Zohaib is here to help you! My team has successfully completed 50+ similar projects for web scraping. I will enhance your Python script to scan 4000+ school district URLs effectively, identifying specific vendor links in their sitemap. ➡️ Why Me? I can easily improve your Python script as I have 5 years of experience in web scraping, focusing on Python, Requests, BeautifulSoup, and Selenium. My expertise includes error handling, data extraction, and script optimization. Besides, I have a strong grip on handling dynamic content and common web scraping challenges. ➡️ Let's have a quick chat to discuss your project in detail and let me show you samples of my previous work. Looking forward to discussing with you in chat. ➡️ Skills & Experience: ✅ Python Programming ✅ Web Scraping ✅ Data Extraction ✅ Error Handling ✅ CSV File Management ✅ Requests Library ✅ BeautifulSoup ✅ Selenium WebDriver ✅ Dynamic Content Handling ✅ Script Optimization ✅ Data Analysis ✅ Logging Connections Waiting for your response! Best Regards, Zohaib
$150 USD in 2 days
8.0

As a Full-Stack Developer specializing in Python and advanced web scraping, I can enhance your scraper to reliably process 4,000+ URLs with robust handling for redirects, dynamic content, and errors. I’ll deliver clean, scalable, well-documented code that generates structured CSV/Excel outputs with extracted vendor data or a clear “No Vendor Found” status, ensuring long-term maintainability and performance.
$150 USD in 3 days
7.0

Hi there, I am ready to start on the Python web scraping. Please drop me a message for further discussion. 100% accuracy will be delivered within a time frame and budget that suit you. Waiting for your reply, thanks.
$100 USD in 2 days
6.8

With my extensive experience in web scraping using Python, specifically with the libraries you mentioned such as Requests, BeautifulSoup, Scrapy, and Selenium/Puppeteer, I am an ideal candidate for your project. I have completed numerous projects similar to yours where I had to filter through massive amounts of domain URLs, scanning for specific links within their sitemap. Handling redirects, timeouts, and other common challenges is something I've become quite efficient with over time. I am particularly well-versed in dealing with dynamic content using Selenium/Puppeteer, which often comes in handy when some websites load footer content dynamically via JavaScript. By incorporating this knowledge into my work on your specific project requirements, I can analyze the large list of school district websites effectively within a reasonable timeframe. My keen eye for thoroughness means that even if no vendor is explicitly identified, your final deliverable will still indicate "No Vendor Found" to ensure all information is recorded. As a professional who values clean code and configurations, I will deliver a well-documented script that is easy to configure and understand. Thanks...
$250 USD in 7 days
7.1

Hello, I will create a PHP script to automate your task. Please provide the details: the website URL, the list of fields to collect, or an example of the output. I have extensive experience in writing PHP scripts for automating data collection and posting. Please see my reviews for reference.
$400 USD in 3 days
6.5

Hi, We’ve built similar web scrapers that extract data from multiple URLs and handle dynamic content using Selenium. We also have extensive experience with libraries like BeautifulSoup and Scrapy, and we’ve developed production-ready solutions that can run independently and manage proxies and user agents. For your project, I’d suggest using a combination of Selenium and a headless browser to ensure we capture all relevant data, even from JavaScript-rendered content. We can also implement a fallback mechanism to use direct HTTP requests when JavaScript is not needed, optimizing the overall scraping time. Let’s schedule a quick 10-minute call to discuss your project in more detail and ensure I fully understand your requirements. I’m available at any time that works for you. I’m eager to learn more about your exciting project. Best regards, Adil
$165.79 USD in 7 days
6.0

Hello, I hope you are doing great. I am an expert in web scraping. I can easily scrape all the target data from the website using Python or any other script, so you don't have to spend any time or effort doing it manually. Plus, I provide quality results quickly and efficiently within your budget. Let's connect through chat for further detailed discussion; I can start the work right after the discussion. Thank you, Gaurav D.
$140 USD in 4 days
6.4

Hey, I'm proficient in Python and have extensive experience in web scraping using libraries like Requests, BeautifulSoup, and Selenium. I'll enhance your Python script to analyze 4000+ school district websites, identifying specific vendor links. My approach involves reading URLs from a CSV, navigating websites with a headless browser, and scanning sitemaps for vendor links. The final output will be a clean CSV file with results. Let's discuss your specific requirements further.
$155 USD in 1 day
5.7

Greetings. This is a great match for my skillset and I guarantee the ability to finish the script in just 2 - 3 days. I am indeed familiar with scraping dynamic content using headless browsers and will not have any issues getting this working.

# Approach
1) The script will have a command line interface and will run without any human intervention from start to finish.
2) It will read the CSV of school websites and pass it to a procedure that iterates through it.
3) The procedure will schedule each school one at a time to be scanned by the crawler component of the application.
4) The crawler will implement best practices in respect to limiting connections and having fault tolerance (it will not crash if websites don't connect or have bot detection). Failed scans will merely be logged.
5) The scanner will pass each website page to a procedure that checks for the vendor link and records it if found. If not found after all pages are exhausted, this will be recorded when the scanner finishes with that site.
6) Final application output will be the CSV of schools and vendor link status and value.

# Design
1) The application will utilize a modular design for its components that isolates them from each other as much as possible. This will allow easy updates and extensions to the application scope in the future.
2) It will use Playwright or Selenium.

---

I am available to begin immediately and work until completion. Contact me if you wish to continue. Thanks.
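
Steps 4 and 5 of the approach above (a fault-tolerant crawler that walks a site's pages and stops at the first vendor hit) might look roughly like the sketch below. It is not this bidder's code: `parse_sitemap`, `scan_site`, the `fetch` and `check_page` hooks, and the `max_pages` cap are hypothetical names introduced for illustration, and the actual HTTP or browser fetching is injected by the caller rather than shown.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the XML namespace prefix, e.g. '{ns}loc' becomes 'loc'."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return (page_urls, child_sitemap_urls) from sitemap XML text."""
    root = ET.fromstring(xml_text)
    locs = []
    # Only read <loc> directly under each <url>/<sitemap> entry, so nested
    # extension tags (such as image entries) are not mistaken for pages.
    for entry in root:
        for child in entry:
            if local_name(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    if local_name(root.tag) == "sitemapindex":
        return [], locs
    return locs, []


def scan_site(sitemap_url, fetch, check_page, max_pages=500):
    """Scan the pages listed in a site's sitemap for a vendor link.

    fetch(url) returns text and check_page(html) returns a vendor name or
    None; both are caller-supplied hooks. Returns (vendor_or_None, errors).
    Failures are collected in errors instead of stopping the scan.
    """
    errors = []
    pending = [sitemap_url]
    seen = set()
    pages = []
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        try:
            found_pages, children = parse_sitemap(fetch(current))
        except Exception as exc:
            errors.append(f"sitemap {current}: {exc!r}")
            continue
        pages.extend(found_pages)
        pending.extend(children)
    for page_url in pages[:max_pages]:
        try:
            vendor = check_page(fetch(page_url))
        except Exception as exc:
            errors.append(f"page {page_url}: {exc!r}")
            continue
        if vendor:
            return vendor, errors
    return None, errors
```

Passing `fetch` in as a parameter keeps the crawler logic independent of whether pages come from plain HTTP requests or from a headless browser such as Playwright or Selenium.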
$190 USD in 3 days
5.5

I'm Doan, an experienced web scraping specialist who believes in delivering nothing but the best for my clients. I've built numerous Python scripts like the one you're requesting that involve conducting sophisticated analyses on large-scale datasets. My understanding of web scraping libraries and handling common web scraping challenges - such as redirects and user-agents - makes me a formidable choice for this project. Moreover, I possess in-depth proficiency in handling dynamic content using Selenium/Puppeteer – an expertise that will be vital in ensuring we don't overlook any vendor links. Furthermore, my skills and experience allow me to navigate websites effectively even when content is loaded dynamically via JavaScript - a crucial requirement you have specified. My dedication to thoroughness in coding practices means the resulting Python script will be well-commented, configurable, and handle all potential contingencies gracefully. Rest assured that the final output CSV file you'll receive will include all the original variables requested and deliver a crisp, clean analysis of each URL's vendor status. Let my proactivity and problem-solving skills benefit your project today.
$140 USD in 1 day
5.6

Hello, I have over 7 years of experience in Web Scraping, Data Extraction, and Python. I have carefully reviewed the project requirements and am confident in my ability to enhance the Python script for analyzing a large list of school district websites to identify specific vendor links. To achieve this, I propose to utilize Python along with web scraping libraries such as Requests, BeautifulSoup, and Selenium/Puppeteer for handling dynamic content. The script will read URLs from a CSV file, visit the websites while handling redirects, and scan the sitemap for the specified links. The output will be a clean CSV file containing the results for all URLs, with proper error handling in place. I am well-versed in handling common web scraping challenges like redirects, user-agents, and proxies if necessary. I am eager to discuss the project further and provide a detailed plan for its successful execution. Please connect with me in the chat to explore this opportunity in more detail. You can visit my Profile at: https://www.freelancer.com/u/HiraMahmood4072 Thank you.
$100 USD in 2 days
5.5

Hello, I see that you need your Python script improved to analyze a large list of school district websites and identify the ones that show a specific link on any page found in their sitemap. I have vast experience in scraping dynamic content. Please message me so that we can discuss your project requirements in more detail. Looking forward to our successful collaboration, Fahad.
$110 USD in 2 days
5.4

Hello! I understand that you are looking to enhance your Python web scraper to analyze over 4000 school district websites for specific vendor links found in their sitemaps. In my previous projects, I successfully developed robust web scraping solutions that handled dynamic content and significantly improved data accuracy. For instance, I created a scraper that analyzes educational site structures, yielding a clear output while maintaining compliance with web standards.

✅ My Plan
- Review the existing Python script to identify areas for enhancement.
- Ensure the script can read URLs from a CSV file and perform safe navigations, including handling redirects.
- Utilize Selenium for dynamic content scraping and implement error handling for timeout issues.
- Produce a well-commented script that generates a final CSV with the required data structure, including the original variables and vendor identification.

Looking forward to bringing this project to life! Best regards, Hongqiang Chen
$190 USD in 2 days
5.0

Hello, there! I am confident that my skill set perfectly qualifies me for your Python Web Scraper project. Automation, Data Analysis, and Python are all key proficiencies I boast to ensure a high-quality delivery of your Production-Ready Python Script (.py file). Not only have I utilized popular scraping libraries like Requests, BeautifulSoup, and Scrapy in my past projects, but I’m also quite skilled with Selenium and Puppeteer which are paramount for handling dynamic content. For redirect challenges, proxies or user-agents usage – don't worry, I've got you covered. My exceptional ability to adapt to changing environments and tireless work ethic have helped me build trusting relationships with my clientele. This is reinforced by consistent project completion within deadline and budget while exceeding expectations even when faced with complex tasks - just as this one promises to be. I'm confident that by choosing me you won't just rope in a capable python developer but also have a dependable hand who’ll intertwine Pythonic magic into every line of code I produce making sure ‘No Vendors Go Unnoticed’. Let's team up so we can create not just a great script together but start a collaborative relationship brimming with successful deliveries!
$150 USD in 5 days
4.7

Hello, I have 10 years of experience with Python and have been scraping throughout, for a myriad of different projects.
1. I will set up a basic Requests-based scraper with proxy, redirect, and user-agent handling.
2. For sites that can't be scraped using simple requests, I will set up Selenium or Playwright and get the rendered HTML.
I will search for the specified vendors and links. All the actions will be logged or saved to a DB. We will keep track of which sites failed, which ones mentioned our required vendors, etc. If required, I will create a separate UI (desktop-based or web-based) for you to look at the DB and download the cleaned output in CSV format. Message me if interested. I can handle at least 5-10 websites per day.
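
The two-tier idea above (try a plain HTTP request first, and only fall back to a rendered browser fetch when needed) can be sketched as below. This is a hedged illustration rather than the bidder's implementation: `fetch_html`, `needs_rendering`, and the `"<footer"` marker heuristic are assumptions made up for this example, and the real static and rendered fetchers (for instance Requests and Selenium or Playwright) would be passed in by the caller.

```python
def needs_rendering(html, marker="<footer"):
    """Heuristic (an assumption): a page lacking the marker is treated
    as incomplete and probably built by JavaScript."""
    return marker not in html.lower()


def fetch_html(url, fetch_static, fetch_rendered, log):
    """Return (html, source), where source is 'static', 'rendered' or 'failed'.

    fetch_static and fetch_rendered are caller-supplied callables that
    take a URL and return HTML text; failures are appended to log.
    """
    html = None
    try:
        html = fetch_static(url)
    except Exception as exc:
        log.append(f"static fetch failed for {url}: {exc!r}")
    if html is not None and not needs_rendering(html):
        return html, "static"
    # Either the static fetch failed or the page looked incomplete,
    # so pay the cost of a full browser render.
    try:
        return fetch_rendered(url), "rendered"
    except Exception as exc:
        log.append(f"rendered fetch failed for {url}: {exc!r}")
        return None, "failed"
```

Since browser rendering is far slower than a plain request, a check like this keeps the headless browser for the sites that actually need it, which matters across a list of 4000+ URLs.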
$200 USD in 7 days
4.1

With my extensive experience in guiding web scraping projects and proven skills as a full-stack developer, I am the perfect fit for this Python web scraping task. My collaborative approach of fostering clear communication through updates and iterative delivery aligns particularly well with this project's requirement. Moreover, my AI/ML proficiency coupled with strong background in handling all facets of data extraction will ensure that you not only get an efficient, usable code but also an output that meets your specifications. To tackle the challenge of dynamic content, I would utilize a combination of Selenium, BeautifulSoup, and Requests libraries. Selenium will enable me to elegantly handle redirects and dynamically loaded elements through its powerful headless browser functionality. BeautifulSoup will be used to parse HTML source code whereas Requests will be useful in handling proxies, user-agents, and potential timeouts, should they arise. This well-rounded approach ensures maximum flexibility in scraping both static and dynamic content on these school district websites to identify the presence or absence of the specific link. In addition to first-rate technical skills relevant to this project, I bring to the table a methodical approach that emphasizes maintainability and simplicity. Trust me to automate your web-scraping endeavor expertly and deliver excellent results
$200 USD in 5 days
4.3

Hello, As a seasoned web scraping specialist, I have the necessary expertise to enhance your Python script for this project and provide you with robust results. In terms of skill set, I am proficient in Python with deep experience in web scraping libraries like Requests, BeautifulSoup and Scrapy. Furthermore, I have considerable experience with Selenium and Puppeteer, which will enable me to adeptly handle dynamic content, a crucial element in your project given that some websites may load footer content via JavaScript. Personally, I place a strong emphasis on time management, transparency, and delivering high-quality work, which aligns closely with your project's needs. I will ensure that the final production-ready Python script is not only easily configurable but also commented comprehensively to enhance ease of use. I eagerly anticipate the opportunity to collaborate with you and produce outstanding results for your important project. Thank you. Regards, Hemlata G.
$140 USD in 5 days
4.1

Unlocking Insights: With expertise in Python and web scraping, I'll streamline your process to identify school districts linked to a specific vendor. Leveraging advanced tools like Selenium, I'll enhance your script to scan 4000+ URLs seamlessly. I have 5 years of experience working on similar projects offsite, promising efficient results. Let's collaborate to ensure your CSV output captures all essential data accurately. I'm ready to discuss how we can achieve your project goals effectively and exceed expectations. I’d love to chat about your project! Worst case, you get free advice that can guide your project. Regards, Chirag Pipal
$200 USD in 7 days
3.8

Hi , Good morning! I am an expert mobile coder with skills including Selenium, Scrapy, Django, Web Scraping, Python, BeautifulSoup, Software Architecture, Data Extraction, Data Analysis and Automation. "No Vendor Found" Please send a message to discuss more regarding this project. Always happy to hear from you
$155 USD in 5 days
3.8

Carbondale, United States
Payment method verified
Member since May 17, 2008