Closed

Make a function that produces a regex pattern to identify URLs of interest

Suppose we intend to scrape a job portal, [login to view URL], which contains many external sublinks, such as:

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

[login to view URL]

The idea is to apply a difference-checking algorithm that yields a generic regex matching the above routes. It should treat the parts of the URLs that vary as variables, and it should take into account whether each route yielded jobs or not.

Build a function, generatePattern(routes), where routes is an array of objects, each having:

URL: str

hasYieldedJob: bool
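As a rough illustration of that input shape, the routes could be typed and partitioned as below. This is only a sketch: the `Route` type, the `split_routes` helper, and the example.com URLs are hypothetical names and data introduced here, not part of the brief.

```python
from typing import List, Tuple, TypedDict


class Route(TypedDict):
    """One scraped link, using the two fields named in the brief."""
    URL: str
    hasYieldedJob: bool


def split_routes(routes: List[Route]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: separate job-yielding URLs from the others."""
    yielded = [r["URL"] for r in routes if r["hasYieldedJob"]]
    skipped = [r["URL"] for r in routes if not r["hasYieldedJob"]]
    return yielded, skipped


# Made-up sample data, for illustration only.
sample: List[Route] = [
    {"URL": "https://example.com/job/101/python-developer", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
]
yielded, skipped = split_routes(sample)
```

Keeping the two groups apart makes it possible to build the pattern from the job-yielding URLs and then check that it does not match the rest.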

In the above example, all the links except the last three yielded jobs, so the ideal (fictitious) pattern would be:

/job/{any number}/{any string}/?{any string}
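The fictive pattern above could be translated into a concrete Python regex along these lines. This is one possible reading only: it assumes that {any number} means one or more digits, that {any string} inside the path means a single path segment, and that the trailing "/?{any string}" means an optional slash followed by an optional query string. The names `JOB_PATTERN` and `is_job_route` are hypothetical.

```python
import re

# Assumed mapping of the placeholders:
#   {any number}   -> \d+
#   {any string}   -> [^/?#]+   (a single path segment)
#   /?{any string} -> optional trailing slash, then an optional query string
JOB_PATTERN = re.compile(r"^/job/\d+/[^/?#]+/?(?:\?.*)?$")


def is_job_route(path: str) -> bool:
    """Return True when a URL path (plus optional query) fits the pattern."""
    return JOB_PATTERN.match(path) is not None
```

Under this reading, a path such as "/job/123/python-developer/?ref=home" matches, while "/about" or "/job/abc/x" does not.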

Case scenarios

- Query parameters should be treated as variable because of their complexity. We do not want to apply a constant rule to them, even if they are identical across the given dataset of URLs. So if we have “/job/foo?parameter=true”, the pattern will be “/job/foo{any string}”. Additional brainstorming is welcome.

- If a route segment contains hyphens, say ".../foo-bar/...", it will be treated as ".../{any string}/...", even if that segment is invariant across the supplied URLs.
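Taken together, the rules above might be sketched as follows. This is a minimal, non-authoritative sketch, not the expected deliverable. It assumes that every job-yielding route has the same number of path segments, that a segment differing across routes is numeric only when every observed value is all digits, and that routes which did not yield jobs are simply ignored rather than used as counter-examples. The constants and the example.com URLs are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical building blocks for the {any number} / {any string} placeholders.
ANY_NUMBER = r"\d+"
ANY_STRING = r"[^/?#]+"


def generatePattern(routes):
    """Sketch: derive one path regex from the routes that yielded jobs."""
    # Keep only job-yielding routes and split each path into its segments.
    paths = [
        [seg for seg in urlsplit(route["URL"]).path.split("/") if seg]
        for route in routes
        if route["hasYieldedJob"]
    ]
    if not paths:
        raise ValueError("no job-yielding routes supplied")
    if len({len(p) for p in paths}) != 1:
        raise ValueError("this sketch assumes all job routes have the same depth")

    parts = []
    for column in zip(*paths):  # compare the routes position by position
        values = set(column)
        if len(values) == 1 and "-" not in column[0]:
            parts.append(re.escape(column[0]))  # invariant, hyphen-free: keep literal
        elif all(re.fullmatch(r"\d+", v) for v in values):
            parts.append(ANY_NUMBER)
        else:
            parts.append(ANY_STRING)  # varying or hyphenated segment

    # Query parameters are always variable: optional slash, optional query string.
    return "^/" + "/".join(parts) + r"/?(?:\?.*)?$"


# Made-up routes on example.com, for illustration only.
routes = [
    {"URL": "https://example.com/job/101/python-developer?src=a", "hasYieldedJob": True},
    {"URL": "https://example.com/job/202/data-engineer/", "hasYieldedJob": True},
    {"URL": "https://example.com/about", "hasYieldedJob": False},
]
print(generatePattern(routes))  # ^/job/\d+/[^/?#]+/?(?:\?.*)?$
```

A fuller version would probably group routes by depth and test the resulting pattern against the routes that did not yield jobs, tightening it whenever one of them matches.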

Skills: Python, Regular Expressions


About the Employer:
( 9 reviews ) Piazza Armerina, Italy

Project ID: #29045421

10 freelancers are bidding on average $102 for this job

shadabkhan92

I have experience in Python with a regex generator and checker for a license plate checker. Links to some previous projects: https://www.freelancer.com/projects/html/Project-for-Shadab https://www.freelancer.com/projects/pytho...

$140 USD in 7 days
(29 Reviews)
6.3
Rajat6905

Dear Client, warm greetings. I have been a Python developer for 3+ years and have experience building management, distributed, and database applications, with machine learning, ensemble learning, deep learning implementat...

$111 USD in 1 day
(6 Reviews)
3.6
sajazaeri

Dear employer, hi. I can develop the code to find the URLs that have yielded jobs. I read the description carefully and understood exactly what you want. I am a computer programmer with more than 10 years of working experienc...

$100 USD in 7 days
(9 Reviews)
3.8
Sayed95

Hello Sir, I have previous knowledge and experience with regex. I think I can meet your requirements. Inbox me please so I can help. Thanks

$70 USD in 7 days
(6 Reviews)
3.4
narsim3128

NOTE: I HAVE EXPERTISE IN WEB SCRAPING. With respect to this project, I would like to present myself as a candidate for your consideration. I have more than 12 years of IT experience. I have successfully completed pro...

$140 USD in 4 days
(1 Review)
2.8
AleksandarDikic

Hello, Python expert here. I have read your description and I am very interested in your project. I am an experienced and skillful Python developer with 15+ years of experience in software development. I am confident in your project and I...

$140 USD in 7 days
(5 Reviews)
2.5
anashaat95

Hi, I can build this function using Python and will of course give you the script. Ready to start right NOW. I could make a sample script for the details presented here if you want.

$60 USD in 1 day
(4 Reviews)
2.0
muzahidscl

Hello, this is Rahaman. I will build you a Python function that uses regex to identify whether a link on the given website has jobs or not. This job seems interesting to me. I have extensive experience in crawling websit...

$75 USD in 2 days
(1 Review)
1.4
SaudQadir

Hello, I am an individual freelancer. I have good experience with regular expressions and Python's re library. I am available for this task and will try to deliver the script today. Waiting for your kind respon...

$100 USD in 2 days
(1 Review)
0.4
joronoso

Hi, I can get you a working version of the function you need straight away. You will probably want to supply some additional test data, to see if you need it to account for some additional factors not present in the s...

$80 USD in 1 day
(0 Reviews)
0.0