Facebook goes IPO this Friday. This milestone event has me thinking. With 900 million active users generating terabytes of data each day, how valuable is Facebook data?
I do believe Facebook advertisers and application developers gain real value from Facebook data in order to grow and stay competitive. For example, a Facebook game developer can improve customer loyalty because its customers are Facebook users who grant it permission to access their private data, such as posts and comments, to perform sentiment analysis.
But unless you are a Facebook advertiser or application developer, how can you accurately measure sentiment from Facebook data? And if used, can it lead to bad decision making? I believe so. Most user profiles on Facebook are not public; therefore, you obtain a very small sample size of sentiment from your customers. Not all Facebook users post or comment on public group or fan pages; therefore, you again obtain a small sample size of sentiment.
Let me give you an example. A retailer uses Facebook to capture customer sentiment and discovers negative sentiment for its Mother’s Day Spa Basket through conversations on its public fan page and other related public conversations. To keep customers loyal and happy, the retailer sends a $10 gift card to all customers who had purchased the Mother’s Day Spa Basket. In reality, perhaps only a small percentage of customers are unhappy. As a result, it costs more for the retailer to offer gift cards than it costs to lose a small percentage of its customers.
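To make the arithmetic behind this example concrete, here is a rough sketch in Python. Only the $10 gift card comes from the scenario above; the number of buyers, the share of unhappy customers, the share of those who would actually leave, and the value of a lost customer are all made-up figures for illustration, not data from any real retailer.

```python
# Hypothetical comparison: blanket gift cards vs. the cost of losing
# the few customers who are actually unhappy.

def gift_card_cost(buyers: int, card_value: float) -> float:
    """Cost of sending one gift card to every buyer."""
    return buyers * card_value


def churn_cost(buyers: int, unhappy_rate: float,
               churn_rate: float, customer_value: float) -> float:
    """Expected revenue lost if unhappy buyers leave and nothing is done."""
    return buyers * unhappy_rate * churn_rate * customer_value


# The $10 card is from the example; every other number is an assumption.
buyers = 10_000          # assumed number of Spa Basket buyers
card_value = 10.0        # gift card value from the example
unhappy_rate = 0.02      # assumed: 2% of buyers are actually unhappy
churn_rate = 0.5         # assumed: half of unhappy buyers would leave
customer_value = 200.0   # assumed future value of one customer, in dollars

blanket = gift_card_cost(buyers, card_value)
churn = churn_cost(buyers, unhappy_rate, churn_rate, customer_value)

print(f"Gift cards for everyone: ${blanket:,.0f}")
print(f"Expected loss from churn: ${churn:,.0f}")
print("Blanket gift cards cost more" if blanket > churn
      else "Churn costs more")
```

With different assumptions the comparison can flip, which is exactly why a small, unrepresentative sample of public comments is a shaky basis for the decision.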
Enough about my perspective on the value of Facebook “Big Data”. Instead, let’s hear from Facebook directly, the expert at leveraging Big Data to successfully grow its business. Greg Dingle, Software Engineer at Facebook, provides a glimpse into how analysts at Facebook leverage Big Data to deliver value to Facebook users.
What is your role at Facebook?
I lead an engineering team that develops tools to make Big Data accessible, useful, and social.
What is the biggest misconception about Mark Zuckerberg?
I like to tell people that he is not like the nerdy character in the movie about Facebook. He is actually a good communicator with a warm sense of humor. In the movie they portray him as the classic guy who just wants to win the girl, but I think in truth he is closer to the guy who just wants to build the biggest thing ever.
What were the business drivers that led Facebook to address Big Data?
Facebook knew early on that they had to build for massive scale and support a data infrastructure that was distributed and simple to operate. Every data set here is Big Data. We never had to address it because we already planned for it.
Describe the Big Data ecosystem at Facebook.
Our ecosystem is similar to what the industry is using – Scribe, Hadoop, Hive, PHP, R, etc. Data from our applications flows through a distributed logging system called Scribe. Then most of it is copied into our distributed data warehouse called Hive, built on Hadoop. From there employees consume it in a variety of ways. Depending on whether you are a developer, analyst or data scientist, you will need different tools to access data, perform analytics, and create applications. For example, my team has built an application in PHP on top of Hive, our data warehouse, to make data accessible through SQL queries. The application also makes data more social by providing visibility into a library of metadata so users can share knowledge and collaborate around data sets.
I once heard that Facebook has a more accurate database than the government. Is that true?
I’m not sure how accurate that statement is, but I can say that we make it easier for the average citizen to update personal information such as address, phone number, marital status, etc. So yes, I would say that our database is probably more accurate than the DMV’s, since it is so simple to update an address in Facebook versus downloading a DMV form and mailing it in.
Who are the main internal users of Big Data Analytics at Facebook? How are these users or Data Scientists addressing business problems or uncovering business opportunities at Facebook?
We have all types of roles in the company performing Big Data Analytics. A simple use case would be Facebook’s goal of having at least 20 friends per user to improve user experience. Big Data Analytics addressed this through our ‘People You May Know’ feature. We store thousands of attributes per user and leverage them to predict which people a user is most likely to add as a friend.
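(An aside for readers who like to see an idea in code: the sketch below ranks friend suggestions by counting mutual friends. This is a deliberately simple stand-in, not Facebook's actual method, which the answer above describes only as a prediction over thousands of attributes per user. The function name `suggest_friends`, the toy graph, and the names in it are all hypothetical.)

```python
from collections import Counter


def suggest_friends(graph: dict[str, set[str]], user: str, k: int = 3) -> list[str]:
    """Rank non-friends of `user` by how many friends they share with `user`."""
    friends = graph.get(user, set())
    mutual_counts: Counter = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and people who are already friends.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Most mutual friends first; break ties alphabetically for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Toy friendship graph (hypothetical data).
graph = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee", "eli"},
    "cy": {"ana", "dee"},
    "dee": {"ben", "cy"},
    "eli": {"ben"},
}

print(suggest_friends(graph, "ana"))
```

Here "dee" shares two friends with "ana" and "eli" shares one, so "dee" is suggested first. A production system would weigh many more signals than mutual friends alone.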
Do you provide social media data generated by Facebook to external customers or partners for data monetization?
Anyone can access Facebook data that is “public” through our APIs.
For our partners such as advertisers, we provide access to data through dashboards on ad performance such as number of impressions, cost per click, etc.
For application developers, we provide access to data for only those users that have authorized or given permission for the application to access their data.
Many see the value in Big Data for improving the quality of life, such as being able to predict the right cancer treatment or the next placement of a large-scale wheat farm. What is exciting to you about how Facebook data can be leveraged to improve the quality of life?
Improving relationships with friends and family. I have a friend who developed a Facebook App that monitors status updates in a user’s network, and notifies the user when anyone in their network has something to celebrate – birthday, anniversary, graduation, new job, etc. The notification also includes product recommendations based on the user’s profile so that the user can immediately take action, such as sending an e-card or e-gift.