Goals and Objectives
Define what big data is
The main goals of analyzing big data
Present the main advantages of big data analysis
Big data categories
How big data is stored and processed
The power of data
Big data statistics
The risks of big data
Introduction
Big data exists mainly because of the growing number of information resources in use and the increasing number of technology tools that use them. Based on current reports, studies and analyses of big data, many services can be offered to both providers and customers that help in building huge real-time systems, optimizing processing, and controlling system performance.
What is Big Data?
Big data is an evolving term that describes a large volume of structured, semi-structured and unstructured data that has the potential to be mined for information and used in machine learning projects and other advanced analytics applications.
Big Data
“Big Data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.” (McKinsey Global Institute)
“Big Data is the term for a collection of datasets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.” (Wikipedia)
Analyzing big data
Extracting valuable information is the central goal of analyzing big data, and that information needs to be secured in order to avoid potential risks.
Many big data sets contain sensitive information, such as financial, legal or private data.
The main advantages of Big Data Analytics
Cost Reduction
The main big data technologies available in the market are Hadoop and cloud-based analytics. They bring significant cost advantages when it comes to storing large amounts of data, and they help identify more efficient ways of doing business. This saves a business both cost and time.
Faster and better decision making
By making use of the speed of Hadoop and in-memory analytics, combined with the ability to analyze new sources of data, businesses are able to analyze information immediately and make decisions based on what they have learned.
Creation of New products and services
With these analytics, a business can learn its customers’ interests and needs, and create new products and services to satisfy them. Thus the business will grow.
How Big Data Analytics actually works
Some of the big data sources
Artificial Intelligence
• Artificial intelligence (AI), largely manifesting through machine learning algorithms, isn’t just getting better. It isn’t just getting more funding. It’s being incorporated into a more diverse range of applications.
• AI is starting to make an appearance in almost every new platform, app, or device, and that trend is only going to accelerate in 2018.
• AI will become even more of a mainstay in all forms of technology.
Forbes
Turning Data Into Insights
Organizations often have a lot of data (internal as well as external, collected through a range of sources) but do not know how to process it to gain value.
There have been successful cases where data and intelligent techniques (derived from past data and experiences) put together have solved problems in mapping crime, disaster management, marketing campaigns with greater accuracy, predictions with respect to consumer demand/preference and positioning offerings accordingly, providing government schemes and services more effectively, and many more.
How big data is stored and processed
• Organizations must apply adequate processing capacity to big data tasks to achieve the required velocity.
• This can potentially demand hundreds or thousands of servers that can distribute the processing work and operate collaboratively in a clustered architecture.
• Clustered architectures can be built on the Apache Spark processing engine and related big data technologies.
• Amazon Elastic MapReduce (EMR) from Amazon Web Services (AWS) is one example of a big data service that runs in a public cloud; others include Microsoft’s Azure HDInsight and Google Cloud Dataproc.
• In cloud environments, big data can be stored in the Hadoop Distributed File System (HDFS) or in lower-cost cloud object storage, such as Amazon Simple Storage Service (S3). NoSQL databases are another option in the cloud for applications that are a good fit for them.
Big Data Analytics Tools and Technologies
Apache Hadoop
• The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers packaged code to the nodes to process the data in parallel.
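The split, map and reduce flow described above can be illustrated with a small word count. This is a minimal single-process sketch in plain Python, not Hadoop's actual API: the function names (split_into_blocks, map_block, shuffle, reduce_groups, word_count), the sample text and the block size of two lines are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain


def split_into_blocks(text, block_size):
    """Split the input into blocks of lines, a stand-in for HDFS file blocks."""
    lines = text.splitlines()
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block):
    """Map step: emit a (word, 1) pair for every word in one block."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(mapped_blocks):
    """Shuffle step: group the emitted values by word across all blocks."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped_blocks):
        groups[key].append(value)
    return groups


def reduce_groups(groups):
    """Reduce step: sum the counts collected for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(text, block_size=2):
    blocks = split_into_blocks(text, block_size)
    # On a real cluster each block would be mapped on the node that stores it;
    # here the blocks are simply processed one after another in one process.
    mapped = [map_block(block) for block in blocks]
    return reduce_groups(shuffle(mapped))


# Hypothetical sample input, three lines split into two blocks.
sample = "big data is big\ndata is everywhere\nbig clusters process data"
counts = word_count(sample)
print(counts["big"], counts["data"])
```

Because each block is mapped independently of the others, the map step is the part a cluster can run in parallel; only the shuffle and reduce steps need to bring results from different blocks together.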
Statistics
• According to a May 2018 Forbes article, the amount of data created every single day amounts to 2.5 quintillion bytes.
• Almost 90% of the data in the world was generated over the last two years.
• One can imagine the speed and volume of data being generated through innumerable sources like IoT, sensors, wearable devices, tweets, YouTube videos, mobile communications, chats, pictures, emails, blogs, Skype, print media, TV, smart devices, and so on.
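To put the daily figure into more familiar units, the short calculation below converts 2.5 quintillion bytes per day into exabytes per day and terabytes per second. It is a rough sketch that assumes decimal (SI) units, where 1 TB is 10**12 bytes and 1 EB is 10**18 bytes; the variable names are illustrative only.

```python
BYTES_PER_DAY = 2.5e18  # 2.5 quintillion bytes, the daily figure cited above
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: decimal (SI) units, 1 TB = 10**12 bytes and 1 EB = 10**18 bytes.
exabytes_per_day = BYTES_PER_DAY / 10**18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**12

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

Under these assumptions the figure works out to 2.5 exabytes per day, or roughly 29 terabytes every second.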
Facebook Knows You Better Than Anyone Else
• Researchers at the University of Cambridge and Stanford University tested their algorithm on more than 17,000 Facebook users, who completed a personality survey and provided the researchers with access to their “likes.” Many of their friends, colleagues and family members also completed a survey describing the users.
• The algorithm was better able to predict a person’s personality traits than any of the human participants.
• It needed access to just 10 likes to beat a work colleague
• It needed access to 70 likes to beat a roommate
• It needed access to 150 likes to beat a parent or sibling
• It needed access to 300 likes to beat a spouse.
https://applymagicsauce.com/demo
Big data Security Issues
• One of the key security issues involved with big data aggregation and analysis is that organizations collect and process a great deal of sensitive information regarding customers and employees, as well as intellectual property, trade secrets, and financial information.
• Centralizing data in one place makes it a valuable target for attackers. A breach can leave huge swathes of information exposed, which could undermine trust in the organization and damage its reputation. This makes it essential that big data stores are properly controlled and protected.
Big data Security threats
A Big Data compromise
Malicious attackers
Data manipulation
Data disclosure
Ransomware
Big data Privacy
• Another potential problem relates to regulatory compliance, especially with data protection laws such as the GDPR. Such laws are more stringent in some jurisdictions than others, particularly with regard to where data can be stored or processed.
• The privacy and security of big data are a high priority.
References
• http://www.thesmartmanager.com/technology/power-of-data.html
• https://www.forbes.com/sites/blakemorgan/2018/10/11/10-examples-of-customer-experience-innovation-in-banking/#23707478729d
• https://searchdatamanagement.techtarget.com/definition/big-data
• http://www.experian.com/blogs/news/2014/10/01/data-is-good/
• https://dustinstout.com/social-media-statistics/#facebook-stats
• https://www.iacae.org/English/ResourceCenter/data/Big_Data_and_the_Audit_Challenge.pdf
• https://www.nytimes.com/2015/01/20/science/facebook-knows-you-better-than-anyone-else.html
By Tamer Alajrami / Information Security Consultant / International Speaker
Information Security Team / Risk Management Professional Forum