Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Data Analysis

Use R for data analysis and visualization, handle geo-datasets, train models and estimate errors, and use GitHub for comprehensive documentation and task man...

Posts

examples

unit01

First Things First

Go through a brute-force introduction to R, R Markdown, the RStudio IDE, version management with Git and GitHub’s classroom functionality to get ready for ...

R and RStudio

To start with a clarification: R is the statistical programming language you will use in this course (and which is used by many other scientists). With R yo...

Example: Vector Basics

Vectors are the basis for many data types in R. Creating a vector A vector is created using the c function. Here are some examples: my_vector_1 <- c(1,2...

Example: Data Frame Basics

Data frames are one of the most heavily used data structures in R. Creation of a data frame A data frame is created from scratch by supplying vectors to the...

Example: R Markdown with html output

This page shows what a compiled R Markdown file looks like (in fact, all code examples in this course were compiled with R Markdown). This is a header This ...

Git and GitHub

To start with a clarification: Git is the version control system you will use in this course (and which is used by many other developers all around the world...

Assignments and GitHub

A note on individual learning log assignments with GitHub Within this course, you will individually submit your personal solutions for the course assignments...

unit02

First Things Second

Look closer at data sets and data types before focusing on the most important features of programming languages, namely run-time control and loop structures.

Data Types

For a quick introduction to data types in R check out our own material in the accompanying Base R course, this video by DataCamp or a more textual descriptio...

Object Types

For a quick introduction to object types (aka data structures) in R check out our own material in the accompanying Base R course, a brief description at RSpa...

Operations

For a quick introduction to operations in R, i.e. different types of operators and control structures, check out our own material in the accompanying Base R ...

Unmarked Assignment: Loop and Conquer

This worksheet provides some control structure and loop examples to help you get familiar with these probably most important properties of any programmin...

unit03

Look at Your Data

Become familiar with reading and writing data, computing summary statistics and visual data exploration as the basics of data analysis.

Tabulated Data I/O

Reading tabulated data into a data frame, or writing it out from one, is quite a common task in data analysis. You could use the read.table function for this. df <-...

Visualization

Do not wait until the very final analysis stage to produce some publication-quality graphics; instead, produce fast (not necessarily nice) visualizations all the w...

Example: CSV I/O

Reading data from CSV files Reading CSV files is realized using the read.table function from R’s utils package. The function will return a data frame whic...

Example: Aggregation Statistics

Summarizing a data set The most straightforward function which returns some aggregated statistical information about a data set is summary. a <- c("A",...

Example: Visual Data Exploration

Visual data exploration should be one of the first steps in data analysis. In fact, it should start right after reading a data set. The following examples ar...

Marked Assignment: Read and Plot

This worksheet will guide you in getting a first overview of the wood harvest in Hessen between 1997 and 2014 using visual data exploration. After completi...

unit04

Clean Your Data

Check the integrity of datasets and clean them up to ensure that the data basis for your analysis is consistent.

Example: Missing Values

Handling missing values is straightforward. Let’s start with a vector with one NA value at position 3. Please note that NA is not inside quotation marks sin...

Example: Date/Time

Coercing data types to date and/or time information is generally performed using as.Date, as.POSIXct, or as.POSIXlt. Let’s start with as.Date: as.Da...

Example: Sorting

Sorting vectors or lists Vectors can be sorted using the sort function. If you want to sort a list, you have to access the actual elements since sort require...

Example: Cleaning Columns

Cleaning data frames involves quite different aspects like splitting cell entries, converting data types or the conversion of “wide” to “long” format. In ge...

Example: Merging

When thinking about combining two data frames one has to distinguish between merging them by the values given in a specific column or consecutively putting t...

Unmarked Assignment: Cleaning Crops

This assignment is the first in a series which use regional statistical data. While the wood harvest data from Hessen was (i) quite small and (ii) quite tidy...

unit05

Describe your linear data

Compute simple statistical linear regression models that relate a dependent to an independent variable.

Basic idea of statistical modeling

Basic idea of statistical modeling Use observation samples to describe the relationship between a dependent variable and one or more independent variables. ...

unit06

Predict your linear data

Compute simple linear models to predict dependent data and assess the performance by independent test samples.

Cross-validation

Test statistics can describe the quality or accuracy of regression models if the assumptions of the models are met. However, the assessment would still be b...

unit07

Select your variables

Evaluate the importance of your independent variables and select an optimal subset for your prediction model.

unit08

Tune your model

Evaluate model tuning strategies and find optimal settings for your prediction model.

Generalized additive models

So far, the models have only considered linear relationships. The corresponding model type to simple linear models would be an additive model and for Poisson...

Unmarked Assignment: Model Tuning

This worksheet uses cross-validation strategies for tuning an additive model. After completing this worksheet you should have improved your skills on handli...

unit09

Predict Your Temporal Data

Look into some specific characteristics of time series data and predict future observations based on past dynamics.

Time Series

Although we have already had contact with some temporal datasets, we have not yet taken a closer formal look at time series analysis. Time series datasets often inhibi...

Predicting time series

Time-series analyses can generally be divided into forecasting future dynamics and describing and potentially explaining past patterns. Since the latter ofte...

unit10

Time Series Decomposition

After looking into time-series forecasting, we will now switch to some basics of describing time series. To illustrate this, we will again use the (mean mon...

Time series clustering

Just as one last example of time series analysis for this module, and mainly to demonstrate that this module has only touched on a very small set of analysis conce...

Unmarked Assignment: NAO and Cölbe

This worksheet focuses on the analysis of meteorological time series data recorded at a station near Marburg University Forest and some global teleconnection...

unit11

unit12

Graphics

Visualize your data, get some hints for publication quality graphics, and learn about some packages specifically made for visualizations.

Example: Colours

Before we expand our plotting capabilities, we want to spend a bit more time thinking about colours and colour spaces. A careful study of colour-spaces (e....

Example: Colours and maps

This is a short example of how to use the hcl colour palette for colouring features of a shapefile. Load the required packages library("rgdal") library("ras...

Example: The R Graph Gallery

Finally, check out the R Graph Gallery to get an impression of the many other data visualization possibilities in R.

worksheets