Hot topic for 2012? Two words come to mind… Big. Data.
Anyone who is anyone in the technology industry knows that big data is one of the biggest trends of the year. Of course, we could not resist jumpin’ on Twitter to talk about it.
We want to extend a huge thanks to our Big Data experts, Craig Warthen, Logan McLeod, Barton George, and Gina Rosenthal, for joining our discussion today. I hope everyone who participated found it helpful and insightful.
We were joined by many well-known Big Data tweeps such as Stu Miniman, Mike Hoffa, Shel Israel, and Mike Fishman. The official questions this month were:
- How would you define big data?
- What is not big data?
- Do you have to be a big company to have big data problems?
- Any examples of how small companies can use big data?
There was lots of discussion on these questions, and some really interesting side questions popped up. For example: How does Hadoop play into #bigdata? Are there companies using Hadoop in VMware environments that are protected? If so, what methods? Check out the transcript for the discussion and lots of links, and leave a comment if you have an answer or a related question.
The best tweet of the chat has to go to Barton George:
Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂
You can find the full transcript below. Be sure to follow us on Twitter so that you stay up to date on upcoming SANchats, and tweet us if you have any follow-up questions or comments! Join us in April as we talk about end-to-end technology solutions!
dell_storage | We’ll be hosting a #SANchat all about Big Data today at 9am CST! Be sure to join us for the discussion: http://t.co/E8pW01OX |
gminks | RT @dell_storage: We’ll be hosting a #SANchat all about Big Data today at 9am CST! Be sure to join us for the discussion: http://t.co/E8pW01OX |
AlisonatDell | get ready to hear from our big data experts, @bigpapabigdata, @loganmcleod, @barton808, and @gminks! 8 minutes!! #SANchat |
dell_storage | RT @AlisonatDell: get ready to hear from our big data experts, @bigpapabigdata, @loganmcleod, @barton808, and @gminks! 8 minutes!! #SANchat |
Meesh_Says | RT @dell_storage: We’ll be hosting a #SANchat all about Big Data today at 9am CST! Be sure to join us for the discussion: http://t.co/E8pW01OX |
barton808 | All logged in and ready to go! 🙂 #SANchat |
loganmcleod | tap tap tap This thing on? Yay Bigdata! #SANchat |
barton808 | Is it ok if i talk about DAS given this is #SANchat? 🙂 #SANchat |
BigPapaBigData | Yes, DAS is a big help particularly behind Hadoop #SANchat |
gminks | hey I’m here too! Y’all ready to talk abt Big Data? #sanchat |
barton808 | Join me for a #SANchat TweetChat at: http://t.co/RFbYBWZo #SANchat |
shelisrael | #SANChat for big data junkies is about to start with Dell’s @BartonGeorge. |
gminks | Join me for a #sanchat TweetChat at: http://t.co/YDKpl6F8 > use http://t.co/YDKpl6F8 to join the convo! #sanchat |
stu | @barton808 @gminks interested to hear Dell’s network view on #bigdata – it’s not SAN but OK for #SANChat < I wrote http://t.co/zdCcaX5j |
BigPapaBigData | Join me for a #SANchat TweetChat at: http://t.co/DGe9SPnV on #bigdata #SANchat |
gminks | this chat is for anyone who is interested the topic of #bigdata #sanchat |
gminks | hey guys could you introduce yourself? #sanchat |
iSCSIKing | RT @gminks: Join me for a #sanchat TweetChat at: http://t.co/YDKpl6F8 > use http://t.co/YDKpl6F8 to join the convo! #sanchat |
barton808 | Howdy, im the dir of mktng for Dell’s Web|tech vertical. I focus on companies that use the internet as their platfrom #SANchat |
loganmcleod | @stu Reading the article real quick. #SANchat |
gminks | hey @stu & @shelisrael ! #sanchat |
gminks | RT @stu: @barton808 @gminks interested to hear Dell’s network view on #bigdata – it’s not SAN but OK for #SANChat < I wrote http://t.co/zdCcaX5j |
AlisonatDell | i’m alison and i work in storage social media!! excited to learn more about #bigdata! #SANchat |
BigDataClub | RT @BigPapaBigData: Join me for a #SANchat TweetChat at: http://t.co/NYP50G3R on #bigdata #SANchat |
loganmcleod | Hi SANchatter’s.. Logan McLeod… I work in our CTO office and help plot our cloud technology strategy & new tech R&D. #SANchat |
stu | #SANchat hi I’m an analyst w @Wikibon – watching the intersection of #bigdata and infrastructure |
barton808 | This might be helpful, its part of a glossary we put together. It focuses on the data tier eg big data etc http://t.co/jLqgXumO #SANchat |
gminks | @stu #sanchat is vendor neutral by design, so we’re more talking tech than Dell this am. |
gminks | @loganmcleod how are you reading Stu’s article! it is link packed! #sanchat |
gminks | @loganmcleod you are a #speedreader #sanchat |
BigPapaBigData | My Name is Craig and I’m in the Dell Solutions Group storage team. I’ve been focused on helping customers address their #bigdata #SANchat |
chriscastellani | #SANChat Anyone read the new IDC report on #BigData? Any insights to share? http://t.co/ZNrbUdDt |
gminks | Since @barton808 is already going nuts with links – how would you define big data? #sanchat |
loganmcleod | @gminks #speedreader. #beentheredonethat #SANchat |
barton808 | More reference stuff: heres a summary from the last Hadoop summit w/links to a bunch of interviews http://t.co/eiNGKCdJ #SANchat |
gminks | RT @chriscastellani: #SANChat Anyone read the new IDC report on #BigData? Any insights to share? http://t.co/g67W0ZOE #sanchat |
BigPapaBigData | The IDC report discussed a initial step of supporting #bigdata with archival type platforms that can scale. #SANchat |
stu | @chriscastellani #SANchat Wikibon also published a Big Data market study http://t.co/vkRQOOZi |
BigPapaBigData | @chriscastellani The IDC report was what I expected to see, but I was surprised by the low market penetration of the #bigdata SW #SANchat |
loganmcleod | Not surprised on low market penetration. High complexity with implementation and Big data is a means to an end. #SANchat |
gminks | Interesting: IDC defined big data as1:the system has to collect over 100TB of data, 2. data sets to be growing at a rate of 60% and #sanchat |
BigPapaBigData | @chriscastellani The report showed only about 2% of the market implementing analytics SW over the next few years. #SANchat |
coolsport00 | @gminks I would say – lots of data requiring much horsepower in resources & mgmt? #sanchat |
gminks | #. to be deployed on “scale-out architecture” –> do people agree? #sanchat |
barton808 | Forrester estimates that firms effectively utilize< 5%of available data since the rest is too expensive to deal w/. #SANchat |
gminks | So back to a definition – does anyone want to try to define big data? #sanchat |
loganmcleod | Architecturally, I’ve seen success in both SAN based implementation & DAS based architectures. #SANchat |
barton808 | Forrester says Bigdata is new cause it lets firms affordably dip into that other 95%. #SANchat |
VirtualHoffa | @gminks Even non big data related applications need to start looking hard at scale out architectures #SANchat |
BigPapaBigData | @loganmcleod Yes, I guess they are not counting existing data warehouses in that number. #SANchat |
BigPapaBigData | @gminks I try to define #bigdata as the data, particularly machine-generated and then there is the ecosystem around that data. #SANchat |
gminks | RT @VirtualHoffa: @gminks Even non big data related applications need to start looking hard at scale out architectures #sanchat |
gminks | RT @coolsport00: @gminks I would say – lots of data requiring much horsepower in resources & mgmt? #sanchat |
loganmcleod | Datasets larger than can be managed with traditional db mgmt tools, driving insight into trends & the previously unknown. #SANchat |
loganmcleod | @barton808 Agree. #SANchat |
barton808 | Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat |
chriscastellani | RT @barton808: Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat |
BigPapaBigData | Machine-generated data, event generated data, etc is #bigdata. They are all used for more trending and require billions of records #SANchat |
VirtualHoffa | @gminks I think of #bigdata – data could be gathered/generated throughout normal business process, but was unanalyzed in the past. #SANchat |
gminks | rt @barton808 Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat #sanchat |
barton808 | Besides Variety, the other two axis of Big Data are Volume and Velocity. #SANchat |
VirtualHoffa | But it’s #bigdata now because it has value. Analysis can be applied the data and business value can be derived from that analysis #SANchat |
barton808 | @VirtualHoffa yep thats the 95% of unanalyzed data that Forrester cites. #SANchat |
loganmcleod | @barton808 VVV #SANchat |
chriscastellani | Since this is #SANchat: which Big Data uses cases are SANs the best fit for? #SANchat |
BigPapaBigData | Don’t forget Volatility in #bigdata. #SANchat |
mike_fishman | #sanchat Big data and scale out are not synonymous. Scale out is a solution – Bigdata is a challenge ..um .. opportunity |
jayfry3 | RT @barton808: Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat [Of course.] 😉 |
gminks | RT @chriscastellani: Since this is #SANchat: which Big Data uses cases are SANs the best fit for? #sanchat |
BigPapaBigData | @mike_fishman Good point #SANchat |
Dome9 | RT @jayfry3: RT @barton808: Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat [Of course.] 😉 |
gminks | hi @mike_fishman welcome to #sanchat |
barton808 | @mike_fishman Id say scale-out is an architecture that is well suited to support the “opportunities” of big data #SANchat |
ZertoCorp | listening in to #SANchat interesting stuff… |
mike_fishman | #sanchat Hi, I’m Mike and I am crashing the big data party. I design BD solutions for the other 800lb gorilla. |
gminks | @mike_fishman so can you explain the diff between big data and scale out plz in 140 chars #sanchat |
BigPapaBigData | I like all the new database technologies coming out to help manage #bigdata.#SANchat |
coolsport00 | RT @barton808: @mike_fishman Id say scale-out is an architecture that is well suited to support the “opportunities” of big data #SANchat <+1 |
gminks | @BigPapaBigData is volatility a defining factor for big data? #sanchat |
VirtualHoffa | @chriscastellani TBH it depends on the architecture of the application containing / analyzing the #bigdata. SAN isn’t always best #SANchat |
coolsport00 | How does Hadoop play in to #bigdata? #sanchat |
mike_fishman | #sanchat BD is a collection of structured or unstructured information “big” is relative but IMO it exceeds normal OLTP or OLAP capabilities |
barton808 | Data is the currency of the Net, its whats monetized &when aggregated, parsed &made accessible, where the value lies 4biz &individs #SANchat |
KongYang | Cool #SANChat on #BigData going on. Hi I’m Kong and I’m a tech-a-holic. What’s a vacation without some tech goodness 🙂 |
loganmcleod | Big data technology itself is rapidly transforming. Bunch of innovators, more every day. Lots of change in the next couple years. #SANchat |
mike_fishman | #sanchat Scale out is a storage architecture that deploys parallel storage nodes and com;ute to deliver high tput and bandwidth -howdido? |
iSCSIKing | RT @KongYang: Cool #SANChat on #BigData going on. Hi Im Kong and Im a tech-a-holic. Whats a vacation without some tech goodness 🙂 #sanchat |
gminks | Hi @kongyang & @ZertoCorp welcome to #sanchat |
barton808 | @coolsport00 Hadoops a great platform for aggregating & processing big data. It can also analyze but thats not its core strength. #SANchat |
stu | @barton808 Data is the raw material – information/insight is what needs to be extracted using #bigdata tools #SANchat |
BigPapaBigData | @coolsport00 Hadoop is like a ETL tool that brings all the sources together and then sorts through it. Structured and unstructured #SANchat |
VirtualHoffa | @gminks Hadoop actually helps eliminate the reliance on SAN equipment for #bigdata – can use local storage on commodity servers #SANchat |
coolsport00 | @BigPapaBigData @VirtualHoffa Both good answers! #bigdata #SANchat |
VirtualHoffa | @stu @barton808 And that analysis of the raw data is where the real value comes from. Just having the #bigdata means nothing 😀 #SANchat |
coolsport00 | @barton808 Thx #SANchat |
mike_fishman | @coolsport00 #bigdata #sanchat Hadoop is a DW technology that is designed to leverage parallel processing – it is good for big data tasks |
BigPapaBigData | @VirtualHoffa Hadoop is still not mature enough to be the primary storage location. #SANchat |
gminks | OK – not sure we have settled on a good definition of big data. It still feels cloudy 🙂 So…what is NOT big data? #sanchat |
iSCSIKing | Lots of great info about #BigData this morning on #sanchat |
barton808 | @mike_fishman I see Hadoop and DW as separate. Hadoop can integrate with a DW but it can also act in place of one. #SANchat |
loganmcleod | It’s sitting in your RDBMS? #notbigdata #SANchat |
mike_fishman | #sanchat @gminks Great question – what is NOT big data? |
edwsonoma | RT @gminks: rt @barton808 Of course in Texas, we dont call it “Big Data” we just call it “Data” 🙂 #SANchat #sanchat |
BigPapaBigData | @gminks Defining BD is like defining cloud. All depends on who you are talking to, and what they are trying to accomplish. #SANchat |
stu | RT @gminks: OK – not sure we have settled on a good definition of big data. It still feels cloudy 🙂 So…what is NOT big data? #sanchat |
gminks | RT @loganmcleod: Its sitting in your RDBMS? #notbigdata <haha #sanchat |
zertojjones | #sanchat general question for #BigData..Are there companies using Hadoop in VMware environments that are protected? if so, what methods? |
storagebod | @mike_fishman @gminks Big Data is all your data, everything else is a subset of Big Data. Your Big Data is different to my Big Data #sanchat |
VirtualHoffa | @mike_fishman Personally, I think any dataset that is not gathered, or analyzed, with the intent of extracting additional value #SANchat |
BigPapaBigData | @gminks What Bigdata isn’t? Potentially all data and potentially nothing. #SANchat |
coolsport00 | @BigPapaBigData @gminks Does BD nec mean volume? Agreed it’s relative.. #SANchat |
gminks | RT @BigPapaBigData: @gminks What Bigdata isnt? Potentially all data and potentially nothing. < oh COME ON. #sanchat |
BigPapaBigData | @VirtualHoffa Agreed. Data that is not #bigdata is the data you are not interested in analyzing. #SANchat |
mike_fishman | #sanchat Yes, Hadoop can run in a virtualized env. AND can run on SAN, DAS or hydrid – Virt is a good way to leverage un-used resources |
BigPapaBigData | @gminks Depends on where you think you can extract value. If you think there is value in all your data, then it is all your data #SANchat |
gminks | RT @zertojjones: #sanchat general question for #BigData..Are there companies using Hadoop in VMware environments that are protected? if so, what methods? |
iaflash | #iaflash #. to be deployed on “scale-out architecture” –> do people agree? #sanchat http://t.co/pw53xmux |
BigPapaBigData | One of the challenges that defines #bigdata is how to handle some of the datasets that have exteme characteristics. #SANchat |
mike_fishman | #sanchat So it’s not big data unless I specifically plan to mine it? intent doesn’t define a noun. it’s still bigdata – still make a sound |
BigPapaBigData | @mike_fishman That is why it is up to the user to define their #bigdata. I don’t think anyone else really can. #SANchat |
gminks | @storagebod is that a def of what big data is or is not? #headspinning #sanchat |
barton808 | @mike_fishman I agree w/you.While Forrester implies it must be mined to be bigD i belive it can be defined by Vol,Velocity,Varitey #SANchat |
gminks | ok – last question then we’ll need to wrap up #sanchat |
gminks | Do you have to be a big company to have big data problems? #sanchat |
mike_fishman | @barton808 Agree – Mining and BI are ways to LEVERAGE big data to advantage #SANchat — lol go for it @gminks |
mike_fishman | @gminks ahahahahah ..sorry, funny question. #sanchat |
VirtualHoffa | @gminks You know the answer to that is a resounding NO 😛 #SANchat |
coolsport00 | RT @VirtualHoffa: @gminks You know the answer to that is a resounding NO 😛 #SANchat <Absolutely not |
BigPapaBigData | @gminks Not at all. I see really small companies with the same issues as the big guys. #bigdata #SANchat |
mike_fishman | RT @gminks: Do you have to be a big company to have big data problems? #sanchat <- nope, BIg is always relative |
BigPapaBigData | RT @VirtualHoffa: @gminks You know the answer to that is a resounding NO 😛 #SANchat |
barton808 | @gminks Size of co has nothing to do w/need to leverage BigD, 10 person web startups can have mountains o’Data #SANchat |
gminks | @mike_fishman I thought you would like it. You know I’m always abt the humor #SANchat |
coolsport00 | @gminks I would come close to saying it’s worse cuz of their size they don’t think they need to manage as well as Enterprise #sanchat |
gminks | OK so followup – any exps of how small companies can use big data? #SANchat |
VirtualHoffa | @coolsport00 As well as not having the knowledge of even how to gain value out of their existing data sitting there doing nothing #SANchat |
BigPapaBigData | @coolsport00 I see them having challenges getting analytics expertise. #SANchat |
mike_fishman | #sanchat hmm. Small companies can and should still leverage big data – and who says it needs to be “their” data? |
mattwbaker | @BigPapaBigData @coolsport00 – Bingo! (MR)ETL; adding a few new steps 2 old proc 2 gather & make relevant previously untapped data #sanchat |
coolsport00 | @BigPapaBigData @VirtualHoffa Definitely…concur #SANchat |
BigPapaBigData | I look at #Bigdata in a maturity model form. Store it, optimize it, manage it, analyze it, make use of it. #SANchat |
mike_fishman | @gminks SaaS and Iaas are some alternatives available to small companies with #bigdata challenges. #SANchat |
gminks | RT @BigPapaBigData: I look at #Bigdata in a maturity model form. Store it, optimize it, manage it, analyze it, make use of it. #SANchat |
mattwbaker | That said, shouldn’t we be talking abt #BigInsights vs. #BigData – then we can talk abt things that are revolutionizing analytics #SANChat |
gminks | Ok guys, we have to officially wrap up, but keep the conversation going! Anyone want to share what you are working on now? #SANchat |
barton808 | Big Data is the new Cloud: It represents the next not-completely-understood got-to-have strategy http://t.co/CQXJt1ZJ #SANchat |
mike_fishman | #sanchat @gminks Thank you all for a wonderful, lively, insightful twitter discussion today. |
gminks | @mattwbaker maybe next month’s SANchat can be on #biginsights vs #bigdata — hmmmmmmmmm #SANchat |
loganmcleod | @mattwbaker Explain your BigInsights thoughts for the crowd.. 🙂 #SANchat |
loganmcleod | @barton808 Yay Shiny objects! #SANchat |
BigPapaBigData | I’m launching a Big Data Solution in the coming months. I hope you like it #bigdata #SANchat |
gminks | @mike_fishman thank you for joining Mike! #SANchat |
barton808 | This was my first twitter chat, it was a lot of fun 🙂 #SANchat |
gminks | I’ll be at the #ATXVMUG tomorrow and at #Interop — hope to see some of you at one of those events #SANchat |
loganmcleod | Lots of fun at the … #SANchat |
mattvogt | Dang it, missed #SANchat again 🙁 |
coolsport00 | @mattvogt And was good stuff too! #SANchat |
JoeBugBuster | My thought exactly: RT @mike_fishman SaaS and Iaas are some alternatives available to small companies with #bigdata challenges. #SANchat |
gminks | @barton808 thanks for joining, also big thanks to @loganmcleod & @BigPapaBigData #SANchat |
BigPapaBigData | SXSW in Austin is kicking off the Music festival today. 2000+ bands. Bruce Springsteen is the keynote. #SANchat |
gminks | @mattvogt rats! We’ll have the transcript posted soon….#SANchat |
gminks | @BigPapaBigData wow you are working #SXSW? #SANchat |
mattvogt | @gminks thanks! Don’t think I’ll ever make a 7am #SANchat 🙂 |
barton808 | To end w/: you’ve heard about elevator pitches well here is our 90 sec Big Data _escalator_ pitch http://t.co/rWMyWjP1 🙂 #SANchat |
AlisonatDell | @mattvogt sorry, matt! i’ll post a few days early next time! #SANchat |
gminks | @mattvogt we will try better next month. right @alisonatdell #SANchat |
BigPapaBigData | I wouldn’t call it working! #SANchat |
loganmcleod | RT @barton808: …elevator pitches well here is our 90 sec Big Data _escalator_ pitch http://t.co/wFr016oO < IS AWESOME. 🙂 #SANchat |
mattwbaker | @loganmcleod – It’s simple, folks r looking for new ways of gaining insights (value). Size is just part of it – maybe least import #SANChat |
BigPapaBigData | This was good. Thanks everyone. #SANchat |
gminks | ok everyone, thanks for coming. If you have a suggestion for a SANchat topic plz let @dell_storage know! #SANchat |
BigPapaBigData | RT @mattwbaker: – Its simple, folks r looking for new ways of gaining insights (value). Size is just part of it -maybe least import #SANchat |
loganmcleod | RT @mattwbaker: Its simple, folks r looking for new ways of gaining insights (value). Size is just part of it – maybe least import #SANchat |
mattwbaker | @loganmcleod -Start by focusing on the desired outcomes, the inputs (data) & tools(cool stuff) come along for the ride #BigInsights #SANChat |
coolsport00 | @jaslanger @mattvogt Gina and/or Allison tweet it. It’s a tweet chat if you will with the #sanchat hash |
ZertoEric | Enroute to #atxvmug. Come see @ZertoCorp as were interested in discussing #virtualization and data protection. #SANchat |
stu | @gminks @barton808 thanks for the #SANchat – I’m presenting on bigdata at Interop, looking for customer successes to illustrate trends |
ZertoCorp | RT @ZertoEric: Enroute to #atxvmug. Come see @ZertoCorp as were interested in discussing #virtualization and data protection. #SANchat |