Big data summer institute

Rice University. June 18-19, 2012

Make sure you have the following software installed:

  • R 2.15

  • RStudio Desktop

  • Once you have R and RStudio installed, open RStudio and run the following code:

    install.packages(c("ggplot2", "plyr", "reshape2", "stringr", "lubridate"))
  • You can check everything is installed correctly by running:

    library(ggplot2)
    qplot(mpg, wt, data = mtcars)
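
If the check above fails, it can help to see which package is missing. The following is a minimal sketch that uses only base R; the variable names `pkgs`, `installed` and `missing_pkgs` are made up for illustration.

```r
# Packages named in the install step
pkgs <- c("ggplot2", "plyr", "reshape2", "stringr", "lubridate")

# installed.packages() returns a matrix with one row per installed package
installed <- pkgs %in% rownames(installed.packages())

# Names of any packages that still need installing (empty if none)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)
```

An empty result means all five packages were found; anything listed can be passed to `install.packages()` again.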

Course outline

    Introductions and course outline.

    ggplot2 basics

      Create informative scatterplots: add extra variables with aesthetics (like color, shape and size) or facetting.
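
As a rough illustration of this topic (not taken from the course materials), the sketch below maps extra variables to colour and size and splits the plot into panels. It assumes the built-in `mtcars` data; the choice of variables is arbitrary.

```r
library(ggplot2)

# cyl is converted to a factor so that it maps to discrete colours
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl), size = hp)) +
  geom_point() +
  facet_wrap(~ gear)   # one panel per number of gears

print(p)
```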

    Displaying distributions

      Create graphics for large data: histograms and bar charts for displaying distributional summaries; boxplots; scatterplot variations that overcome the over-plotting problems associated with large data.
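
A minimal sketch of these plot types, assuming the `diamonds` data that ships with ggplot2 (about 54,000 rows). The bin width and transparency level are arbitrary choices for illustration, not recommendations from the course.

```r
library(ggplot2)

# Histogram of a continuous variable with an explicit bin width
hist_plot <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.1)

# Boxplots of price for each cut quality
box_plot <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_boxplot()

# Two ways to reduce over-plotting: transparent points, or 2d binning
alpha_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 1/20)
bin_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d()
```

Each object is a plot specification; calling `print()` on it draws the plot.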

    Data input and output

      Learn how to get data into and out of R.
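
One common round trip is through a CSV file. The sketch below uses base R only and writes to a temporary file, so the path is an assumption made for illustration rather than a file from the course.

```r
# Write a built-in data frame to a temporary CSV file
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)  # row names (car models) are dropped

# Read it back in; keep text columns as character rather than factor
cars <- read.csv(path, stringsAsFactors = FALSE)
str(cars)
```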

    Data manipulation

      Basic tools for manipulating data: subsetting, transforming, summarising and re-arranging, and their group-wise variants.
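
The four operations and a group-wise variant might look like the following sketch, which assumes the plyr package from the install step and the built-in `mtcars` data. The derived column names are hypothetical.

```r
library(plyr)

# Subsetting: keep only the 4-cylinder cars
small <- subset(mtcars, cyl == 4)

# Transforming: add a new column computed from existing ones
ratio <- transform(mtcars, mpg_per_wt = mpg / wt)

# Summarising: collapse the whole data frame to one row
overall <- summarise(mtcars, mean_mpg = mean(mpg))

# Re-arranging: sort rows by descending mpg
sorted <- arrange(mtcars, desc(mpg))

# Group-wise variant: the same summary, computed separately for each cyl value
by_cyl <- ddply(mtcars, "cyl", summarise, mean_mpg = mean(mpg), n = length(mpg))
print(by_cyl)
```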

    Visualising time and space

      Basic techniques for visualising data that has time and/or space components. Merging and joining data.
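
A small sketch of the data preparation this usually involves: parsing dates with lubridate and joining location information with base `merge()`. Both data frames are hypothetical toy data made up for illustration, and the coordinates are approximate.

```r
library(lubridate)

# Hypothetical toy data: text dates and a city name per record
visits <- data.frame(
  date = c("2012-06-18", "2012-06-19", "2012-06-19"),
  city = c("Houston", "Austin", "Houston"),
  stringsAsFactors = FALSE
)
visits$date <- ymd(visits$date)        # parse year-month-day text into dates
visits$weekday <- wday(visits$date)    # 1 = Sunday, ..., 7 = Saturday by default

# Hypothetical lookup table with approximate city coordinates
cities <- data.frame(
  city = c("Houston", "Austin"),
  lat  = c(29.76, 30.27),
  long = c(-95.37, -97.74),
  stringsAsFactors = FALSE
)

# Join the coordinates onto each visit by the shared city column
located <- merge(visits, cities, by = "city")
print(located)
```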

    Polishing graphics for presentation

      Tweak your plots for maximum presentation impact: introduction to color theory; labels, legends and axes; tweaking the plot themes.
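
The sketch below shows the kind of tweaks involved: axis and legend labels, a colour palette, and theme settings. It assumes ggplot2 0.9.2 or later, where `theme()` is available; the palette and legend position are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  # Readable axis and legend titles instead of raw variable names
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders") +
  # A ColorBrewer qualitative palette for the discrete colour scale
  scale_colour_brewer(palette = "Set1") +
  # A built-in theme, followed by one specific adjustment
  theme_bw() +
  theme(legend.position = "bottom")

print(p)
```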

    Tidy data

      What tidy data is, why it’s useful and how to make messy data tidy.
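
One typical kind of messiness is a table with values spread across column headers. As a sketch, assuming the reshape2 package from the install step, `melt()` turns such a table into one row per observation. The data frame and its column names are hypothetical.

```r
library(reshape2)

# Hypothetical messy table: one column per year
messy <- data.frame(
  country = c("A", "B"),
  `2010` = c(10, 20),
  `2011` = c(12, 25),
  check.names = FALSE   # keep "2010" and "2011" as column names
)

# One row per country-year combination, with the year as a variable
tidy <- melt(messy, id.vars = "country",
             variable.name = "year", value.name = "cases")
print(tidy)
```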

    Introduction to modelling

      Models as tools, linear models, removing group means.
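
A minimal sketch of two of these ideas, assuming plyr and the built-in `mtcars` data: fitting a linear model with `lm()`, and removing group means by centring a variable within each group. The column name `mpg_centred` is hypothetical.

```r
library(plyr)

# Linear model: fuel economy as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Removing group means: subtract each cylinder group's mean mpg
centred <- ddply(mtcars, "cyl", transform, mpg_centred = mpg - mean(mpg))

# After centring, the mean within every group is (numerically) zero
print(tapply(centred$mpg_centred, centred$cyl, mean))
```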