Using the new pipe feature in R

The magrittr package has introduced the pipe operator to R. It looks like this:

%>%

You use it the way you use any pipe: the result of the first operation is passed to the next one as its first argument, and so on.
This makes for more readable code than nested function calls.
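
To make the idea concrete, here is a minimal sketch comparing the nested and piped forms. It assumes magrittr is already installed (installation is covered just below); the vector x is made up for illustration.

```r
library(magrittr)

x <- c(1, 4, 9)  # made-up values, purely illustrative

# Nested form: read from the inside out.
nested <- round(sum(sqrt(x)), 1)

# Piped form: read left to right; each result becomes the
# first argument of the next call, so round(1) means round(<result>, 1).
piped <- x %>% sqrt() %>% sum() %>% round(1)

identical(nested, piped)
```

Both forms compute the same thing; the piped one just lists the steps in the order they happen.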

To use it you need to install the magrittr package. To also get the easy-to-use filter, group_by, etc. functions used below, you need the dplyr package. In fact, dplyr depends on magrittr, so install dplyr and magrittr comes along for the ride.

install.packages('dplyr')
library('dplyr')

You'll need the nycflights13 dataset for this example, so do this:

install.packages('nycflights13')
library(nycflights13)

Here is the example, which points the finger at the airlines with the longest average departure delays from NYC:

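# Each %>% must end a line, not start one; otherwise R treats
# the previous line as a complete expression and then fails to parse.
filter(flights, !is.na(dep_delay)) %>%
  group_by(carrier) %>%
  summarise(delay = mean(dep_delay)) %>%
  merge(airlines) %>%
  arrange(desc(delay))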
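
For comparison, here is a sketch of the same computation written as nested calls, assuming dplyr and nycflights13 are loaded as above. The name nested is just an illustrative variable, not something from either package.

```r
library(dplyr)
library(nycflights13)

# Same steps as the piped version, but written inside-out:
# the first operation (filter) sits in the innermost call.
nested <- arrange(
  merge(
    summarise(
      group_by(filter(flights, !is.na(dep_delay)), carrier),
      delay = mean(dep_delay)
    ),
    airlines  # joined on the shared "carrier" column
  ),
  desc(delay)
)

head(nested)
```

You have to read this from the middle outwards to follow the order of operations, which is exactly what the pipe avoids.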

I really like functions such as filter. I find subsetting the 'normal' way in R to be very tricky. This makes it a lot easier, to my mind.
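
One place where the difference shows up is missing values. The sketch below uses a small made-up data frame (toy is hypothetical, not part of nycflights13) to compare base subsetting with filter.

```r
library(dplyr)

# Toy data, made up for illustration, with one missing delay.
toy <- data.frame(
  carrier = c("AA", "UA", "DL"),
  dep_delay = c(90, NA, 10)
)

# Base R: the NA in the condition yields an extra all-NA row.
base_rows <- toy[toy$dep_delay > 60, ]

# dplyr: rows where the condition evaluates to NA are dropped.
dplyr_rows <- filter(toy, dep_delay > 60)

nrow(base_rows)
nrow(dplyr_rows)
```

With base subsetting you would typically need to add !is.na(dep_delay) to the condition yourself to get the same rows as filter.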

While looking at and getting the NYC data, I also stumbled upon these bits:

Find the datasets in your installed packages:
data()

In a specific package:
data(package = 'nycflights13')
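
If you want the dataset names as a value rather than in a viewer window, the object returned by data() can be inspected directly. A small sketch, assuming nycflights13 is installed; found is just an illustrative variable name.

```r
# data() returns an object whose `results` matrix has one row per dataset,
# with columns including "Item" (the dataset name) and "Title".
found <- data(package = "nycflights13")

found$results[, c("Item", "Title")]
```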


Here are the dplyr docs: a 63-page monster PDF.

This is a bit more manageable.

The magrittr GitHub repository, with nice examples, is here.
