Closed

Analysis of Big data -- 2

Big data refers to the growth in the volume of structured and unstructured data, the speed at which it is created and collected, and the scope of how many data points are covered. Big data often comes from multiple sources and arrives in multiple formats. The amount of data we produce every day is mind-boggling: about 2.5 quintillion bytes are created each day at the current pace, and that pace is accelerating with the growth of Internet use. The world's effective capacity to exchange information through telecommunication networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, and 65 exabytes in 2007, and predictions put the amount of internet traffic at 667 exabytes annually by 2014.

2016 was a landmark year for big data, with more organizations storing, processing, and extracting value from data of all forms and sizes. In 2017, the number of systems that support large volumes of both structured and unstructured data will continue to rise. Data from different sources, or even from different tables within the same source, often refers to the same information but is structured in entirely different ways, which increases the complexity of the data. With a huge amount of complex and heterogeneous data pouring in from anywhere, at any time, and from any device, this is undeniably the era of Big Data, a phenomenon also referred to as the Data Deluge, where we create new data every second. For example, about 40,000 search queries are performed every second on Google alone, which comes to roughly 3.5 billion searches per day and 1.2 trillion searches per year. We therefore require a set of techniques and technologies with new forms of integration to reveal insights from datasets that are diverse, complex, and of a massive scale.

For a very long time, data storage was supported by the RDBMS, which is based on the relational model and uses SQL for storing, manipulating, and retrieving data in a relational database. SQL is a standardized query language for requesting information from a database that organizes data into one or more tables (or "relations") of columns and rows. However, combining data across many tables and rows can be slow and difficult for huge and irregular datasets, and the relational model does not easily accommodate frequent schema changes.
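To make the point about relational tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (person, knows), the sample rows, and both helper functions are hypothetical and chosen only for illustration. It shows that a two-hop relationship question already needs two joins on the relationship table.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE knows (
            src INTEGER REFERENCES person(id),
            dst INTEGER REFERENCES person(id)
        );
        INSERT INTO person VALUES (1, 'Asha'), (2, 'Bilal'), (3, 'Chen'), (4, 'Dev');
        INSERT INTO knows VALUES (1, 2), (2, 3), (2, 4), (3, 1);
        """
    )
    return conn


def friends_of_friends(conn, person_name):
    """Return names exactly two 'knows' hops away from person_name.

    Each extra hop adds another join on the relationship table, which is
    where relational queries over highly connected data tend to get costly.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT p3.name
        FROM person p1
        JOIN knows k1 ON k1.src = p1.id
        JOIN knows k2 ON k2.src = k1.dst
        JOIN person p3 ON p3.id = k2.dst
        WHERE p1.name = ? AND p3.id != p1.id
        ORDER BY p3.name
        """,
        (person_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling friends_of_friends(build_demo_db(), "Asha") follows Asha to Bilal and then to Bilal's contacts. A three-hop version of the same question would need a third join.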

So NoSQL databases (for example, key-value databases, graph databases, etc.) came into existence.

The graph database model (for example, Neo4j, InfiniteGraph, Sparksee, Blazegraph, etc.) focuses on the relationships between different nodes, or data points. Graph databases are effective and fast because they are built to find these relationships and to represent them in a simple, graphical model. In a world of interconnected data, understanding the value of individual data items is not enough; we also need to look at the different ways in which data items connect with each other.
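As a rough illustration of the graph model, the sketch below stores relationships as typed, directed edges and answers a relationship question by traversal. It is plain Python, not the API of Neo4j or any other graph database; the class name PropertyGraph, its methods, and the sample relationship types are assumptions made for this example.

```python
from collections import deque


class PropertyGraph:
    """Toy directed graph with typed relationships (illustrative only)."""

    def __init__(self):
        self._out = {}  # node -> list of (relationship_type, neighbour)

    def add_edge(self, src, rel_type, dst):
        self._out.setdefault(src, []).append((rel_type, dst))
        self._out.setdefault(dst, [])  # register dst even if it has no out-edges

    def neighbours(self, node, rel_type=None):
        """Neighbours of node, optionally filtered by relationship type."""
        return [
            dst
            for rel, dst in self._out.get(node, [])
            if rel_type is None or rel == rel_type
        ]

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of nodes or None."""
        if start == goal:
            return [start]
        parents = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.neighbours(node):
                if nxt in parents:
                    continue
                parents[nxt] = node
                if nxt == goal:
                    # Walk the parent links back to the start node.
                    path = [goal]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(nxt)
        return None
```

Because each node keeps direct references to its neighbours, following a relationship is a lookup instead of a join, which is the intuition behind the speed claim above.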

2. Problem Statement: Interactive graph analytics supported by suitable visualizations is highly desirable to put the human in the loop for exploring and analyzing graph data. The existing separation between interactive query processing with graph databases and batch-oriented graph analytics should thus be overcome by providing all kinds of analysis in a unified, distributed platform with support for interactive and visual analysis. Some graph systems, e.g., Blazegraph, System G and Titan, try to move in this direction, but there are still many open issues in finding suitable visualizations and interaction forms for the different kinds of analysis. At the same time, this poses a number of implementation challenges, which are listed as follows:

● Overhead reduction in Cypher queries

● Graph visualization and summarization

● Complex link analysis to discover fraud patterns in a big data analysis

● Multi-threaded transactions in Neo4j
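One simple reading of the graph summarization item in the list above is to collapse all nodes that share a label into a single super-node and count the edges between labels. The sketch below assumes that reading; the function name summarize_by_label and the input shapes are hypothetical, and real summarization techniques can be considerably more involved.

```python
from collections import Counter


def summarize_by_label(node_labels, edges):
    """Collapse a graph into one super-node per label.

    node_labels: dict mapping node id -> label (assumed input shape)
    edges: iterable of (src, dst) node id pairs
    Returns (node_counts, edge_counts), where edge_counts is keyed by
    (src_label, dst_label).
    """
    node_counts = Counter(node_labels.values())
    edge_counts = Counter(
        (node_labels[src], node_labels[dst]) for src, dst in edges
    )
    return dict(node_counts), dict(edge_counts)
```

The resulting label-level graph is small enough to draw even when the underlying graph is not, which is why this kind of grouping is often a first step before visualization.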

The analysis of graph data has become one of the most important parts of many applications and a major focus of big data platforms.
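For the link analysis challenge named in the list above, a common starting point is to link accounts that share an identifier (such as a phone number or device) and report the resulting connected groups. The sketch below assumes that approach and uses a union-find structure in plain Python; the function name, the input format, and the min_size threshold are all hypothetical.

```python
from collections import defaultdict


def find_shared_identifier_rings(accounts, min_size=3):
    """Group accounts that are linked through shared identifiers.

    accounts: dict mapping account id -> set of identifier strings
    Returns a sorted list of sorted account groups with at least
    min_size members.
    """
    parent = {account: account for account in accounts}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # identifier -> first account seen using it
    for account, identifiers in accounts.items():
        for ident in identifiers:
            if ident in first_owner:
                union(first_owner[ident], account)
            else:
                first_owner[ident] = account

    groups = defaultdict(set)
    for account in accounts:
        groups[find(account)].add(account)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)
```

A group found this way is only a candidate for review, not proof of fraud: shared identifiers also occur legitimately, for example among family members using one device.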

Skills: Neo4j, Python, Apache, Spark, Java


About the Employer:
( 0 reviews ) Lucknow, India

Project ID: #29419997

6 freelancers are bidding on average ₹29491 for this job

ibrahimanjum330

Hi, I am Ibrahim, and I am a data scientist. I can help you with the analysis. Please share the specific instructions, as yours are a bit generic. Regards, Ibrahim Anjum

₹12500 INR in 3 days
(44 Reviews)
5.5
sharmaalana1

Hi, I am experienced in Python, Apache Spark, etc. I can start right now, but I have a few doubts and questions. Let's have a quick chat and get started. Waiting for your reply.

₹25000 INR in 7 days
(1 Review)
1.6
Shweta1112

Hello, I have read your job posting and I am interested in the same. I have a B. Tech (Computer Science) and MBA from the very best universities in the country with an acceptance rate of 0.01% of the applicants. I ha More

₹25000 INR in 7 days
(0 Reviews)
1.7
raviraje21

I have 5 years of hands-on experience in big data, so it will be very helpful while building the models.

₹30000 INR in 7 days
(0 Reviews)
0.0
rohitbigdata

Hello there, I have 5.5 years of experience working in the big data domain. I have handled large volumes of semi-structured and structured data for both batch and real-time processing. I have expertise in Apache Spark Proc More

₹40000 INR in 14 days
(0 Reviews)
0.0
dushyantsingh285

I am working on big data and ETL. I have good experience handling large amounts of data, and we have processed the data in an optimised way. Interested to work with you.

₹44444 INR in 7 days
(0 Reviews)
0.0