Perl/Python Programmer for Text Parsing


We are looking for an experienced programmer for engagement in long-term freelance work. A background in natural language processing (NLP) and/or computational linguistics would be an asset, but is not required. Pay is commensurate with experience and can either be project- or hourly-based. As part of our hiring process, we ask that interested candidates successfully complete the following tasks to demonstrate basic competency:

1. The SEC stores various text files it receives from companies on its EDGAR website. These files are available for public download via FTP. A listing of all files sent to the SEC is stored in a quarterly index file. Go to the SEC’s EDGAR website and download any 4 consecutive Company Index Files for PC. Do not download any index files prior to 2007. The index will point to the physical location of various file types.

2. Using the index files, download all of the full .txt files with a file type of “10-K” only for the 4 consecutive quarters you have chosen.

3. You will have downloaded .txt files which embed HTML, SGML, or XBRL code, in addition to tables, special characters, images, and other embedded files such as PDFs. Flatten each .txt file to its raw text. That is, remove all code, tables, images, and embedded files, so that only raw text remains. Write the raw text to a .txt file. The filename for the raw text file should be that of its parent with the suffix “_raw” added.

4. Count the number of words and sentences in your flat text file. Count the number of words that match any word in the following array of words: {growth, sales, billion, forecast}. Write the output to an Excel or CSV file.

5. Identify any outstanding issues, questions, or concerns regarding the steps above.
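
For illustration, steps 1 and 2 could start from a sketch like the one below, which picks the “10-K” rows out of a Company Index File. It assumes the index is a fixed-width text file whose header row names the columns Company Name, Form Type, CIK, Date Filed, and File Name, and that data rows line up under that header; this layout is an assumption to check against the real files. The sample rows, CIK numbers, and paths are made up, and `parse_company_index` is a hypothetical helper name. Fetching the files is left out so the sketch runs offline.

```python
COLUMNS = ["Company Name", "Form Type", "CIK", "Date Filed", "File Name"]


def parse_company_index(text, form_type="10-K"):
    """Return one dict per index row whose Form Type equals form_type exactly."""
    lines = text.splitlines()
    # Locate the header row; column offsets are read from it, not hard-coded.
    header_idx = next(
        (i for i, ln in enumerate(lines) if ln.startswith("Company Name")), None
    )
    if header_idx is None:
        raise ValueError("no header row found in index text")
    header = lines[header_idx]
    starts = [header.index(col) for col in COLUMNS]
    ends = starts[1:] + [None]

    rows = []
    for ln in lines[header_idx + 1:]:
        # Skip blank lines and the dashed separator under the header.
        if not ln.strip() or set(ln.strip()) == {"-"}:
            continue
        fields = [ln[s:e].strip() for s, e in zip(starts, ends)]
        row = dict(zip(COLUMNS, fields))
        # Exact match: amended or late filings such as "10-K/A" are excluded.
        if row["Form Type"] == form_type:
            rows.append(row)
    return rows


def _row(name, form, cik, date, path):
    # Helper that builds an aligned line for the made-up sample below.
    return f"{name:<62}{form:<12}{cik:<12}{date:<12}{path}"


# Hypothetical sample data, not copied from a real index file.
SAMPLE_INDEX = "\n".join([
    "Sample index (made-up rows for illustration)",
    "",
    _row(*COLUMNS),
    "-" * 140,
    _row("EXAMPLE CORP", "10-K", "1234567", "2012-02-28",
         "edgar/data/1234567/0001234567-12-000001.txt"),
    _row("EXAMPLE CORP", "10-K/A", "1234567", "2012-03-15",
         "edgar/data/1234567/0001234567-12-000002.txt"),
    _row("OTHER INC", "NT 10-K", "7654321", "2012-03-30",
         "edgar/data/7654321/0007654321-12-000003.txt"),
    _row("SAMPLE HOLDINGS", "10-Q", "1111111", "2012-02-10",
         "edgar/data/1111111/0001111111-12-000004.txt"),
])

if __name__ == "__main__":
    for hit in parse_company_index(SAMPLE_INDEX):
        print(hit["Company Name"], hit["File Name"])
```

Whether related form types such as “10-K/A” should also count as “10-K” is a question worth raising under step 5; the sketch treats them as different.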
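
The flattening in step 3 might be approached as in the following sketch, which uses only the Python standard library. It rests on several assumptions: binary attachments appear as uuencoded blocks that start with a `begin <mode> <name>` line and finish with an `end` line, everything inside `<table>`, `<script>`, and `<style>` can be discarded, and the “_raw” suffix goes before the file extension. The names `flatten` and `raw_name` are hypothetical, and the sample filing is invented.

```python
import os
import re
from html.parser import HTMLParser

SKIP_TAGS = {"table", "script", "style"}
# Assumed shape of an embedded binary: a uuencoded block from "begin" to "end".
UUENCODED = re.compile(r"^begin \d{3} .*?^end\s*$", re.MULTILINE | re.DOTALL)


class _TextOnly(HTMLParser):
    """Collect text nodes, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decodes entities such as &amp;
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def flatten(markup):
    """Strip uuencoded blocks, tags, and tables; return whitespace-normalised text."""
    parser = _TextOnly()
    parser.feed(UUENCODED.sub(" ", markup))
    parser.close()
    # Joining with a space keeps words in adjacent elements from running together.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def raw_name(path):
    """Assumed naming rule: 'abc.txt' becomes 'abc_raw.txt'."""
    root, ext = os.path.splitext(path)
    return f"{root}_raw{ext}"


# Invented miniature filing used only to exercise the sketch.
SAMPLE_FILING = """<DOCUMENT>
<TYPE>10-K
<TEXT>
<html><body><p>Sales grew &amp; growth continued.</p>
<table><tr><td>1,000</td></tr></table>
<p>End.</p></body></html>
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>GRAPHIC
<TEXT>
begin 644 logo.jpg
MABCDEFGHIJKLMNOP
end
</TEXT>
</DOCUMENT>
"""

if __name__ == "__main__":
    print(flatten(SAMPLE_FILING))
    print(raw_name("0001234567-12-000001.txt"))
```

The sketch keeps SGML field values such as the document type, and an unclosed `<table>` would swallow the text after it, so real filings would likely need extra handling. Both points fit naturally into the response to step 5.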
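
For step 4, a minimal counting sketch follows. The posting does not define “word” or “sentence”, so the rules here are assumptions: a word is a run of letters (optionally joined by apostrophes or hyphens), numbers are not counted as words, a sentence ends at `.`, `!`, or `?` followed by whitespace or the end of the text, and keyword matching is case-insensitive on whole words, so “forecasts” does not match “forecast”. The function names and CSV column names are hypothetical.

```python
import csv
import re
from collections import Counter

KEYWORDS = ("growth", "sales", "billion", "forecast")
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")
# A run of terminators followed by whitespace or end of text. Abbreviations
# such as "Inc." are over-counted by this simple rule.
SENTENCE_END = re.compile(r"[.!?]+(?=\s|$)")


def count_text(text):
    """Return word, sentence, and per-keyword counts for one flat text."""
    words = [w.lower() for w in WORD.findall(text)]
    tally = Counter(words)
    row = {
        "words": len(words),
        "sentences": len(SENTENCE_END.findall(text)),
    }
    for key in KEYWORDS:
        row[key] = tally[key]
    row["keyword_total"] = sum(tally[key] for key in KEYWORDS)
    return row


def write_counts(rows, out_path):
    """Write one CSV line per file; each row needs a 'file' key plus the counts."""
    fields = ["file", "words", "sentences", *KEYWORDS, "keyword_total"]
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    sample = "Sales growth was strong. We forecast $2 billion in sales! Growth continues."
    print(count_text(sample))
```

To produce the output file, each result can be combined with its filename, for example `write_counts([{"file": "example_raw.txt", **count_text(text)}], "counts.csv")`, where both filenames are placeholders.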

For full consideration, please send your resume, a random sample of 100 raw/flat text files, the output files (counts and matches), and your response to #5 above by June 30th, 2013. We are an equal opportunity employer. Work permits or visas are not required.

Skills: Perl, Python

Project ID: #4587926

15 freelancers are bidding on average $424 for this job


Hello, I am a Perl scripting expert interested in your projects. I will open a PMB to discuss your task. A IDLER

$408 USD in 10 days
(97 Reviews)

I am capable of finishing your request.

$388 USD in 10 days
(8 Reviews)

Hi! I have a lot of experience in text processing. I read the project description carefully and have some questions; please check your PM.

$515 USD in 3 days
(4 Reviews)

Hi, I am interested in the project and will update details later. Regards, VC

$526 USD in 3 days
(3 Reviews)

Placing bid

$444 USD in 60 days
(5 Reviews)

I have experience with text processing methodologies, so I can handle this challenge.

$305 USD in 3 days
(0 Reviews)

I am an expert level Perl developer. I have worked with a batch process that does something similar at MSCI. Please refer to my CV.

$305 USD in 3 days
(0 Reviews)

I am familiar with Perl and have some web scraper modules on CPAN.

$305 USD in 3 days
(0 Reviews)

Dear Sir, I have experience with scraping and processing text files. Please also check the private message. Thank you.

$500 USD in 1 day
(0 Reviews)

Hello, Please check my private message for my bid.

$305 USD in 3 days
(0 Reviews)

Expertise with Python and extensive experience with natural language processing tools such as Freebase, NLTK, and WordNet place me in an ideal position to complete this project efficiently.

$388 USD in 4 days
(0 Reviews)

Hi. I have 4 years of industry experience in which I used Perl on a regular basis. Please review my profile and message me for further consideration. Thanks.

$555 USD in 7 days
(0 Reviews)

I have a wide array of experience in processing different types of flat files and generating different types of reports. I can do this.

$222 USD in 3 days
(0 Reviews)

Hi, ready to start the work.

$618 USD in 30 days
(0 Reviews)

Hi. I have good working experience in Natural Language Processing and Python. I have worked on similar projects before.

$305 USD in 3 days
(0 Reviews)

I have text processing experience and can handle large numbers of files.

$500 USD in 3 days
(0 Reviews)