Kolkata, India
Member since March, 2009
0 Recommendations


Souvik Ghosh is a results-oriented professional with over 3 years of freelancing experience and 4 years of post-degree corporate experience. He started his career in web development and gradually shifted to the Hadoop ecosystem. He co-founded and ran a console game sharing startup for 1 year. He is currently seeking exciting big data challenges that offer the chance to solve problems and learn new technologies.
$30 USD/hr
34 Reviews
  • 69% Jobs Completed
  • 94% On Budget
  • 94% On Time
  • 30% Repeat Hire Rate



Assistant Manager

May 2014

Project: Loggers Analysis for site Users
Analysis of hourly site usage of Afrihost users from a set of Apache HTTP Server logs. All legal procedures and complaints are maintained and recorded in this system. Previously, various downstream applications were implemented at Afrihost to fetch and amend the data.

Client: African Hosting Giant
Technologies: MapReduce, Hive, HDFS, Pig, Sqoop
Database: MySQL
Duration: May 2014 – till date
Role: Tech Lead
Team Size: 4

Responsibilities:
  • Create Hive schemas
  • Extract tables from the database using Sqoop and load them into HDFS
  • Create Hive partitions as required
  • Join tables to generate reports
  • Implement the analysis in MapReduce

Specialist - Technology

Aug 2012 - Feb 2013 (6 months)

Project: Overhaul of Reporting Section and Ad Hoc User Engagement Analysis
Overhaul of the reporting section for 123Stores to accommodate exponential growth in database size. Implementation of the Hadoop ecosystem as well as log data analysis for users of 123greetings.

Technologies: MapReduce, Hive, HDFS, Pig, HBase, Oozie, Sqoop
Database: MySQL
Duration: Aug 2012 – Feb 2013
Role: Senior Programmer
Team Size: 5

Responsibilities:
  • Developed MapReduce programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables in the EDW
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data
  • Shared responsibility for administration of Hadoop, Hive, and Pig

Software Engineer

Oct 2010 - Aug 2012 (1 year, 10 months)

Project: Device Fault Prediction
Cisco’s support team deals daily with huge volumes of issues related to its network products, such as routers and switches. The support teams have been operating on a reactive model, i.e. responding to customer tickets and queries as they are raised. To improve customer satisfaction, they would like the system to predict network faults from the logs generated by various network devices, by loading the logs into a Hadoop cluster and analyzing them with machine learning algorithms, either those implemented in Apache Mahout or custom built.

Client: Cisco
Technologies: MapReduce, Hive, HDFS
Database: MySQL
Duration: Jan 2012 – Aug 2012
Role: Developer
Team Size: 7

Responsibilities:
  • Involved in infrastructure verification (file system, OS compatibility check, Java version verification)
  • Analyzed the requirements for setting up a cluster
  • Created two different users (hduser for performing HDFS operations and mapred user for performing MapReduce operations only)
  • Set up the Hadoop cluster on 10 VMs
  • Ensured NFS was configured for the NameNode
  • Set up Hive with MySQL as a remote metastore
  • Moved all log files generated by various network devices into an HDFS location
  • Wrote MapReduce code that takes log files as input, parses them, and structures them in tabular format to facilitate effective querying of the log data
  • Created an external Hive table on top of the parsed data

Project: Internal Ticketing System
An internal ticketing system for Cisco to manage, log, and monitor all issues received by its tech support team.

Client: Cisco
Technologies: PHP, MySQL, HTML, CSS, jQuery, JavaScript
Database: MySQL
Duration: Jan 2011 – Dec 2011
Role: Developer
Team Size: 15

Responsibilities:
  • Involved in database schema design and architecture
  • Ensured that Cisco-specific coding standards were met, including nomenclature
  • Optimized MySQL queries to lower fetch times
  • Responsible for unit testing and integration testing


Bachelor Of Technology

2006 - 2010 (4 years)


  • General Orientation
  • Freelancer Orientation

