Big Data, Big deal

The recording for this session is not yet available

Are you ready for the exploding world of big data? Do you know the difference between Hive and Pig? Do you know why MapReduce is being taught in many universities rather than SQL? If not, pay attention because this talk will help get you started in understanding this new world. While sometimes the Hadoop toolkit (which includes HDFS, MapReduce, Hive, Pig, and Sqoop) is used as an alternative to relational database systems such as SQL Server, more frequently customers are using it as a complementary tool. Sometimes it may be used as an ETL tool or to perform an initial analysis of a freshly acquired data set to determine whether or not it is worth loading into the data warehouse, and sometimes to process massive data sets that are too big to even contemplate loading into all but the very largest data warehouses. In addition to covering the basics of the various parts of the Hadoop stack, this talk will discuss the strengths and weaknesses of the Hadoop approach compared to those of relational database systems and explore how the two technologies can be used productively in conjunction with one another.

Follow the discussion

  • Interesting talk on the paradigm shift in dealing with big data.

    My partial time annotations in mmss (MinutesSeconds) format are:

    315 some big data stats

    410 amount of data will increase by a factor of 35 to 40 by 2020

    450 the data deluge, G20 interest in big data

    504 why the sudden explosion of interest in big data?

    650 data is no longer thrown away + trend toward analysing social network sentiment data

    730 cost of data storage is down

    755 managing "big data": parallel DB vs NoSQL system

    845 Bing statistics

    900 NoSQL discussion

    957 why NoSQL (Not only SQL)?

    1140 NoSQL is driven by developers

    1220 Reducing time to insight explains interest in NoSQL

    1315 NoSQL vs. SQL approach = agile vs. not

    1325 NoSQL approach

    1405 two types of NoSQL systems:

    1. key/value: MongoDB, CouchDB, Cassandra, Azure tables

    2. (1525) Hadoop = distributed execution framework & file system

    1625 Two universes of data: structured and not

    1705 paradigm shift from SQL

    1840 what is Hadoop?

  • Is this video going to be put up on C9? 



