script / software to parse file and extract data

Bids: 35
Avg Bid: $115 USD
CLOSED
  • Project ID: 463062
  • Project Type: Fixed
  • Budget: $30-$250 USD

Project Description:

I need a script / software to parse a huge "webcrawler index file", extract some information, and save it into my MySQL database.

data to extract:
-link relations: link from site - link to site
-link info: textlink or image link, link text, alt text, title text
-link type: nofollow, (meta tag nofollow), follow link
-some page info: content encoding, title, domain name...
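
As a rough illustration of the fields listed above, here is a minimal sketch using only Python's standard `html.parser`. It assumes the HTML of each crawled page and its URL are already available; how they are read out of the index file is not covered here. The names `LinkExtractor` and `extract_links` and the record keys are hypothetical, not part of any database layout from the posting.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the page title, a robots-meta flag, and one record per <a href>."""

    def __init__(self, page_url):
        super().__init__(convert_charrefs=True)
        self.page_url = page_url
        self.title = ""
        self.meta_nofollow = False
        self.links = []
        self._link = None       # record for the <a> element currently open
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and a.get("name", "").lower() == "robots":
            if "nofollow" in a.get("content", "").lower():
                self.meta_nofollow = True
        elif tag == "a" and a.get("href"):
            target = urljoin(self.page_url, a["href"])
            self._link = {
                "from_site": urlparse(self.page_url).netloc,
                "to_site": urlparse(target).netloc,
                "kind": "text",          # switched to "image" if an <img> is inside
                "text": "",
                "alt": "",
                "title": a.get("title", ""),
                "rel_nofollow": "nofollow" in a.get("rel", "").lower().split(),
            }
        elif tag == "img" and self._link is not None:
            self._link["kind"] = "image"
            self._link["alt"] = a.get("alt", "")

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._link is not None:
            self._link["text"] += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._link is not None:
            self._link["text"] = self._link["text"].strip()
            self.links.append(self._link)
            self._link = None


def extract_links(page_url, html):
    """Return (title, links) for one page; each link is a plain dict."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    parser.close()
    # Record the page-wide robots meta flag on every link, separately
    # from the per-link rel="nofollow" flag.
    for link in parser.links:
        link["meta_nofollow"] = parser.meta_nofollow
    return parser.title.strip(), parser.links
```

A link counts as a plain follow link when both `rel_nofollow` and `meta_nofollow` are false. Content encoding is left out of the sketch because it normally comes from HTTP headers or a `charset` declaration rather than from the link markup.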

The index files are between 15 GB and 100 GB in size; please make sure your script can handle files this large.
The script / software should run on our Linux root server.
I'll give you a MySQL database layout.
You can find all information about the index file here: http://www.dotnetdotcom.org/
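
One common way to cope with files of this size is to stream records and write them to the database in fixed-size batches, so memory use stays flat regardless of file size. The sketch below assumes the records arrive from some generator that reads the index file sequentially (the file format is not described in this posting). `batched`, `load`, and the `insert_batch` callback are hypothetical names; in practice `insert_batch` might wrap a DB-API `cursor.executemany(...)` call followed by a commit.

```python
from itertools import islice


def batched(records, size):
    """Yield lists of up to `size` records without loading them all at once."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(records, insert_batch, batch_size=1000):
    """Pass records to `insert_batch` one batch at a time; return the row count."""
    total = 0
    for batch in batched(records, batch_size):
        insert_batch(batch)   # hypothetical callback, e.g. an executemany wrapper
        total += len(batch)
    return total
```

Only one batch is held in memory at a time, so the batch size can be tuned against the database's insert throughput without affecting peak memory much.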

Please send me a PM if you have any questions.

Skills required:

C Programming, Java, Linux, Perl, Python

Project posted by:

crilla Germany
(3 Reviews)

Public Clarification Board

1 message

  • jesuse1998

    Why do you want to place the records in a DBMS when the authors of the webcrawler decided to use flat files, most likely due to performance issues? Just wondering...

    over 2 years ago



Awarded Bids

jimcrow
Russian Federation From Russian Federation     Offline
 Accepted
$40 in 7 days 
0
over 2 years ago
5.0

5.8

3 Reviews
71% Completion Rate
I can do it with python.

All Bids

cliver
Ukraine From Ukraine     Offline
  Foundation EUFreelance.com Member
$180 in 2 days 
0
over 2 years ago
5.0

6.5

23 Reviews
93% Completion Rate
Hello, Please look at the PMB. Regards, Sergey
srinichal
India From India     Gold Member     Offline
  General Freelancer Orientation (85%, 95th percentile)
$120 in 4 days 
0
over 2 years ago
4.9

6.3

73 Reviews
57% Completion Rate
I can write a bash script for the same
SigmaVisual
Pakistan From Pakistan     Gold Member     Online
  General Freelancer Orientation (80%, 90th percentile)
  Foundation EUFreelance.com Member
$250 in 4 days 
0
over 2 years ago
5.0

6.0

30 Reviews
73% Completion Rate
We can help in your project, please check PMB to see our related experience.
gangabass
Russian Federation From Russian Federation     Gold Member     Online
  Foundation EUFreelance.com Member
$70 in 3 days 
0
over 2 years ago
4.9

5.7

113 Reviews
57% Completion Rate
I can do this job for you. See PM for details.
ancosys
Pakistan From Pakistan     Offline
  Foundation LimeExchange Member
$130 in 3 days 
0
over 2 years ago
4.9

5.5

39 Reviews
78% Completion Rate
Hi, please check PM. Thanks
sureshdevi
India From India     Gold Member     Offline
  Freelancer Orientation (80%, 97th percentile)
  Employer Orientation (75%, 97th percentile)
  General Freelancer Orientation (90%, 98th percentile)
$200 in 5 days 
0
over 2 years ago
4.9

5.5

46 Reviews
85% Completion Rate
I can do this work. Thanks, Suresh
pawel100
Poland From Poland     Offline
  General Freelancer Orientation (90%, 98th percentile)
  Foundation EUFreelance.com Member
$60 in 3 days 
0
over 2 years ago
4.9

4.6

26 Reviews
74% Completion Rate
Hello, I'm interested in your project, Please check PMB for more details.
edatawiz
India From India     Offline
  General Freelancer Orientation (80%, 90th percentile)
  Foundation EUFreelance.com Member
$150 in 7 days 
0
over 2 years ago
5.0

3.7

5 Reviews
83% Completion Rate
Hi - Please check PM for details.
KelvinChen
China From China     Offline
  Foundation EUFreelance.com Member
$110 in 3 days 
0
over 2 years ago
5.0

3.6

5 Reviews
66% Completion Rate
Please check PM for details.
ulkas
Slovak Republic From Slovak Republic     Offline
  General Freelancer Orientation (95%, 100th percentile)
  Foundation EUFreelance.com Member
$100 in 2 days 
0
over 2 years ago
5.0

2.9

2 Reviews
55% Completion Rate
easy task, don't need any more info, just get me an example file and i can start asap and after then you can try it with your own big file.
yaroslavm
Russian Federation From Russian Federation     Offline
  Foundation EUFreelance.com Member
$100 in 3 days 
0
over 2 years ago
5.0

2.9

2 Reviews
100% Completion Rate
Hello, I can do that quickly and at low price.
Ellemer
Hungary From Hungary     Offline
$111 in 3 days 
0
over 2 years ago
5.0

2.8

4 Reviews
100% Completion Rate
Please check the PM, thanks
IstvanAntal
Romania From Romania     Offline
$100 in 2 days 
0
over 2 years ago
5.0

1.2

1 Review
81% Completion Rate
Hello, I understand what needs to be done, and I can start right away. See PM for details.
jporwal
India From India     Offline
$30 in 2 days 
0
over 2 years ago
Dear sir, I have experience developing parsers for VHDL and C++ using lex/yacc and ANTLR. I can develop a very efficient, cache-optimized parser for you in 2 days. It is an interesting and simple task for me. Regards, Janak
Kamerer
Ukraine From Ukraine     Offline
$150 in 7 days 
0
over 2 years ago
I am experienced in Python. I can write such a script for you.
xeNorthwest
India From India     Offline
$50 in 2 days 
0
over 2 years ago
can do this easily with perl
zub1uk
United Kingdom From United Kingdom     Offline
$100 in 2 days 
0
over 2 years ago
0.0

1.4

0 Reviews
100% Completion Rate
Hi, I am an information extraction specialist and would be happy to help you with this project. All I require is a small sample of the file to be parsed to proceed.
Fernando444
Sri Lanka From Sri Lanka     Offline
  Foundation EUFreelance.com Member
$200 in 2 days 
0
over 2 years ago
0.0

0.0

1 Review
22% Completion Rate
Please see the PM
nusch
Poland From Poland     Offline
  Foundation EUFreelance.com Member
$99 in 2 days 
0
over 2 years ago
I can do it fast with Python, I have experience with crawlers and other automated systems.
ibatica
Serbia and Montenegro From Serbia and Montenegro     Offline
$100 in 4 days 
0
over 2 years ago
0.0

0.0

0 Reviews
0% Completion Rate
I can do this for you.
ishantoraskar
India From India     Offline
$35 in 3 days 
0
over 2 years ago
pls see PM.
jcgasser
United States From United States     Offline
$140 in 5 days 
0
over 2 years ago
0.0

0.0

0 Reviews
100% Completion Rate
Please see my pm for details. Thanks
TipofIceberg
India From India     Offline
$80 in 4 days 
0
over 2 years ago
0.0

0.0

0 Reviews
25% Completion Rate
Hello. I can develop this software using Java. I can start the project right now. Regards.
maheshpmahadevan
India From India     Offline
  Foundation LimeExchange Member
$100 in 3 days 
0
over 2 years ago
0.0

0.0

0 Reviews
0% Completion Rate
Hello, I can do this job in Python or C/C++.
ThePilgrim
Romania From Romania     Offline
  General Freelancer Orientation (85%, 95th percentile)
  Foundation EUFreelance.com Member
$200 in 10 days 
0
over 2 years ago
0.0

0.0

0 Reviews
50% Completion Rate
I'm proficient in ANSI C, C++ and also Java, and I think the task at hand might or might not be efficient to complete. Anyway, please post the database structure in the PMB, with a description of how the extracted data should be handled.
mkedar
India From India     Offline
$100 in 3 days 
0
over 2 years ago
I have expertise in Java and SQL. Since your files are very large, reading them needs efficient I/O. I can do this for you ...
demigraff
United Kingdom From United Kingdom     Offline
  General Freelancer Orientation (90%, 98th percentile)
$120 in 9 days 
0
over 2 years ago
I have looked at the file format and believe this should be a relatively simple project in Perl. Thank you for your consideration.
bvfalcon
Russian Federation From Russian Federation     Offline
  Foundation LimeExchange Member
$100 in 3 days 
0
over 2 years ago
I have experience in creating web crawlers and in text data mining.
subhabrataban07
India From India     Offline
  Foundation EUFreelance.com Member
$100 in 3 days 
0
over 2 years ago
I can do it in Perl/PHP/Python.
serjant2600
Ukraine From Ukraine     Offline
$150 in 3 days 
0
over 2 years ago
0.0

1.8

0 Reviews
71% Completion Rate
I have done a similar project. Please see the PM.
rupaliT
India From India     Offline
$35 in 3 days 
0
over 2 years ago
0.0

1.0

1 Review
33% Completion Rate
Hi, I will do this work. I am very good at C/C++ and am currently working with it as well.
ricky867
India From India     Offline
  Foundation EUFreelance.com Member
$150 in 20 days 
0
over 2 years ago
Recently I wrote Perl scripts for Intel, Oregon: converting XML files and templates, plugging in register values, etc., for huge files. I will be glad to assist you.
periwebindia
India From India     Offline
  Foundation EUFreelance.com Member
$150 in 5 days 
0
over 2 years ago
Hello, I can help you out.
xjx922
China From China     Offline
  Foundation EUFreelance.com Member
$120 in 5 days 
0
over 2 years ago
0.0

0.0

0 Reviews
0% Completion Rate
I can help you.