In Progress

Build a Geohash Data Cleaning System

We need a software engineer to help us fix some bad data in a large dataset (5MM+ rows). We will sort the data and then look for orphan values. Depending on the values directly above and below an orphan, we might change it, and the replacement value will come from the surrounding data.
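As a rough illustration of the orphan idea, here is a minimal Python sketch. The rule it uses is an assumption, since the full logic is only shared after award: a value counts as an orphan when it differs from both the row above and the row below, and a replacement is proposed only when those two neighbours agree. The function name `propose_fixes` and the sample column are hypothetical.

```python
from typing import List, Optional, Tuple


def propose_fixes(values: List[str]) -> List[Tuple[int, str, Optional[str]]]:
    """Return (index, orphan_value, proposed_value) for each orphan.

    Assumed rule (not confirmed by the project description): a value is an
    orphan when it differs from both the row above and the row below. A
    replacement is proposed only when those two neighbours agree; otherwise
    proposed_value is None, meaning there is no consensus.
    """
    results = []
    # The first and last rows have only one neighbour, so they are skipped.
    for i in range(1, len(values) - 1):
        above, current, below = values[i - 1], values[i], values[i + 1]
        if current != above and current != below:
            proposed = above if above == below else None
            results.append((i, current, proposed))
    return results


# Made-up, already sorted column of formation names.
column = [
    "Massive Anhydrite", "Massive Anhydrite", "Rodessa",
    "James Lime", "James Lime",
    "Austin Chalk", "Austin Chalk", "Austin", "Austin Chalk", "Austin Chalk",
]
fixes = propose_fixes(column)
for index, orphan, proposed in fixes:
    print(index, orphan, "->", proposed)
```

In this sketch, Rodessa is flagged but left alone because its neighbours disagree, while Austin gets a proposed value because both neighbours match.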

We will be using geohash to filter our data. For example, in this very small geographic area, we see 18 rows of data. Formation name looks pretty good. The only orphaned value is Rodessa. We can't really correct it here, because James Lime is below and Massive Anhydrite is above, so there is no consensus on what to change it to. Right now, the geohash is set to 9vsn3su55e9, which is 11 digits. This is a very, very small area. As we back off the digits to 9vsn3su55e (10 digits, with the last 9 removed), the 18 rows don't change. We have to go all the way to 9vsn3s, six digits, to see any new rows show up, and even then we are only at 22 rows.
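Backing off the geohash one digit at a time amounts to matching on a shorter and shorter prefix. The sketch below, in Python, counts how many rows share each prefix of a target geohash. The function name `rows_per_precision` and the sample geohashes (other than 9vsn3su55e9) are made up for illustration, and the counts are not the real dataset's counts.

```python
from typing import Dict, List


def rows_per_precision(target: str, geohashes: List[str], min_len: int = 5) -> Dict[int, int]:
    """Count rows whose geohash starts with each prefix of `target`.

    Prefix lengths run from len(target) down to min_len, so the result shows
    how the matching row count grows as the area gets larger.
    """
    counts = {}
    for length in range(len(target), min_len - 1, -1):
        prefix = target[:length]
        counts[length] = sum(1 for g in geohashes if g.startswith(prefix))
    return counts


# Hypothetical geohash column: two rows in the exact cell, one that only
# shares six digits, and one that only shares five digits.
sample = ["9vsn3su55e9", "9vsn3su55e9", "9vsn3s7bq2d", "9vsn3m4kx8p"]
counts = rows_per_precision("9vsn3su55e9", sample)
for length, n in counts.items():
    print(length, n)
```

On a 5MM+ row dataset, a linear scan per prefix would likely be too slow; sorting by geohash or indexing by prefix would be the natural next step, but that choice is outside what the description specifies.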

When we finally get to five digits (this will happen often), we see the number of rows grow to 231. Let's walk through an example change we will make.

See the "Austin" (no chalk) formations? These need to be changed to "Austin Chalk." We will explain more of the logic once we award the project, but these are pretty obvious changes. We will be creating a new field to record all our changes. We don't want to just change the raw data; we need to have before and after values. We will also add a couple of new fields for analysis. One will be the length of the geohash when we made the Austin to Austin Chalk change, in this case 5. We will also want to record the values above and below. To show an example, let's look at a different part of the data.
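One way to keep the raw data untouched while still recording each change is to write the before and after values, the geohash length, and the neighbouring values into a separate record. This is a minimal Python sketch under that assumption; the class `ChangeRecord`, the function `record_change`, and all field names are hypothetical, not names from the project files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    row_index: int
    before: str          # original raw value, left unchanged in the source data
    after: str           # corrected value
    geohash_length: int  # number of geohash digits in use when the change was made
    value_above: str     # value in the row directly above
    value_below: str     # value in the row directly below


def record_change(values: List[str], index: int, new_value: str, geohash_length: int) -> ChangeRecord:
    """Build a change record without modifying `values`.

    Assumes `index` has a row both above and below it.
    """
    return ChangeRecord(
        row_index=index,
        before=values[index],
        after=new_value,
        geohash_length=geohash_length,
        value_above=values[index - 1],
        value_below=values[index + 1],
    )


formations = ["Austin Chalk", "Austin", "Austin Chalk"]
change = record_change(formations, 1, "Austin Chalk", 5)
print(change)
```

Keeping the record immutable and separate from the input list means the before value is always recoverable, which matches the requirement to have both before and after values.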

The project gets a little bit more complicated after that, but not much. The project's full description is too long for the description box here, so I've included it in a txt file in the attachments, along with a sample csv. Please take a look at the entire description.

Skills: Data Cleansing, Data Processing, Geographical Information System (GIS), Geospatial, Java

About the Employer:
( 3 reviews ) Oklahoma City, United States

Project ID: #14967534

Awarded to:

lovinagarwal21

I have 11 years of experience in developing web, Android, iOS, and desktop applications... I have expertise in Java, Spring, Hibernate, database, Android, iOS, Swing, security, SSO, OAuth. Relevant Skills and Experience java spring hib More

$1555 USD in 30 days
(120 Reviews)
6.4

14 freelancers are bidding on average $2531 for this job

winmaclin

Hi, We have gone through your requirements and recommend Spring Batch to process files. Spring batch supports multi threading and allows chunk reads which would help processing file faster and will be less heavy on More

$3000 USD in 30 days
(90 Reviews)
7.6
zhengnami13

Dear You're looking for a senior mobile developer for your business this is exactly what I specialise in. Talk about a perfect match! Relevant Skills and Experience This is my Freelancer profile. [url removed, login to view] More

$2647 USD in 30 days
(12 Reviews)
6.8
Vlzinch

Hi there! David asked me to check this project. I offer to create a Python script with pandas that will process the CSV and clean the data. Relevant Skills and Experience Data science and data processing are my main field of work. I have More

$2647 USD in 10 days
(26 Reviews)
6.2
mahershahmeer

Hi there! It took a lot of time to read and grasp the description you provided. And I really have limited characters to respond as a proposal. I'm confident, Adventurous and A decent Programmer. Relevant Skills and E More

$2500 USD in 30 days
(100 Reviews)
5.7
DavidLiu80

Hello, Its a pleasure to meet you. I’ve reviewed your job "Build a Geohash Data Cleaning System" in depth and I can add value and complete your project professionally and in a timely manner with my 8+ years of web More

$2882 USD in 30 days
(14 Reviews)
6.1
vorasiddh4it

You can see my last project which are based on GIS Image Processing Algorithm Development and I can complete your project perfectly. Relevant Skills and Experience We have 10+ years experience in software development. More

$3000 USD in 30 days
(23 Reviews)
5.1
$2500 USD in 30 days
(5 Reviews)
4.7
anuragiitk

I am an IITK graduate and I have 9 years of experience in software development. I have 100% completion rate and I have finished all the projects with the highest level of customer satisfaction. Relevant Skills and Exp More

$2500 USD in 30 days
(18 Reviews)
5.1
mike199

Hi, I’m a Web Designer/Developer from the UK. My name is Mike. Your project description sounds interesting to me and I do have skills & experience that are required to complete this project. Relevant Skills and Experi More

$2500 USD in 30 days
(1 Review)
3.2
intelgeek

It seems the CSV file is missing, can you please share that? Will you be providing the data in .csv format or will there be database import for this? Relevant Skills and Experience I've following experience: Data Scie More

$2999 USD in 30 days
(3 Reviews)
2.7
zdesign77

Hey, how’s it going? My name is Mike, I’m a Web Designer & Developer from Boston. I've had a look at your project description and feel that my skills match your requirements perfectly. Relevant Skills and Experience G More

$2500 USD in 30 days
(0 Reviews)
0.0
king18yat

In the bid amount, You get website version, desktop, android and ios apps. with daily work updates , daily communication, 1 year complete (maintenance , updates , changes),No advance needed, SEO Relevant Skills and Ex More

$1500 USD in 30 days
(0 Reviews)
0.0
MetaoriginLab

We are a Team of Technical Consultants and Data Engineers having healthy experience into Big Data technologies,IOT/Cloud/AWS and Python/AI+Machine Learning. The Dynamic force has qualified engineers having expertise in More

$2706 USD in 7 days
(0 Reviews)
0.0