Sunday, August 31, 2014

Using the new pipe feature in R

The magrittr package has introduced the pipe operator to R. It looks like this:

%>%

You use it the way you use any pipe: the result of the first operation is passed to the next one, and so on.
It makes for more readable code than nested function calls.

To use it you need to install the magrittr package. To get the easy-to-use filter, group_by etc. functions used below, you also need the dplyr package. In fact dplyr depends on magrittr, so install dplyr and magrittr comes along for the ride.

install.packages('dplyr')
library('dplyr')

You'll need the nycflights13 dataset for this example, so do this:

install.packages('nycflights13')
library(nycflights13)

Here is the example, which points the finger at the airlines with the longest delays from NYC:

# Each %>% must end a line; a line that starts with %>% is a syntax error in R
filter(flights, !is.na(dep_delay)) %>%
  group_by(carrier) %>%
  summarise(delay = mean(dep_delay)) %>%
  merge(airlines) %>%
  arrange(desc(delay))

I really like functions like filter. I find subsetting the 'normal' way in R to be very tricky. This makes it a lot easier, to my mind.

Some additional bits I stumbled upon while getting and looking at the NYC data:

Find the datasets in your installed packages:
data()

In a specific package:
data(package = 'nycflights13')


Here are the dplyr docs - 63 page monster pdf.

This is a bit more manageable.

magrittr github with nice examples is here.

Wednesday, August 13, 2014

Using R to analyse data from the Central Statistics Office in Ireland

I started a Coursera course this week called Getting and cleaning data. I was looking at some data from the CSO and realised that I needed to clean it up. The course is good, but quite difficult. It assumes you have not forgotten everything you learned in the previous course, R Programming. 
 
Midway through week 2 (I am playing a bit of catch-up) I stumbled across this package:

http://pxr.r-forge.r-project.org/

px is the format that the CSO releases its 'raw' data in. This package puts that into a data frame, which is more amenable to analysis. I reckon there has to be something lurking within the CSO dataset, which is one of my main motivations for getting up to speed on R.

Nothing to declare yet, but hopefully I will find something interesting soon :)

Monday, September 16, 2013

Get a list of your functions in MySQL

If you want a list of functions (as opposed to procedures) use this:

select * from information_schema.routines where routine_schema = 'your_schema_name' and routine_type != 'PROCEDURE'


Friday, August 9, 2013

Douglas Crockford Video 5 - 'The end of all things'

No let up in quality in this video. Here are my notes for the final video in the original series:

Cross-site scripting (XSS) is a big problem. Huge privileges are accorded to a successful attacker.
Caja and ADsafe - make JS safer.

Don't confuse a variable and a value.


How does an object get a reference:
By Creation
By Construction
By Reference


David Parnas:


Lazy programmers guide


Keep performance delays below 100ms - provide some sort of immediate feedback.
Don’t fiddle with code. Measure first. Use PageSpeed


Arrays can be slow in older versions of IE. No hashmaps.
Don’t add unnecessary chrome. Takes time.


Don’t tune for quirks. Keep code clean and readable. Future versions of JS engines will be much faster. Your quirk optimisations may cause trouble.


jslint.com

Avoid global variables.

Avoid ++ - too easy to mess up
Use jslint
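
On the ++ point above, here is a small sketch of my own (not from the video) showing why it is easy to mess up: the prefix and postfix forms hand back different values, while += 1 leaves no doubt.

```javascript
// Postfix returns the old value, prefix returns the new one.
var i = 5;
var a = i++; // a gets the value before the increment
var b = ++i; // b gets the value after the increment

// The explicit form does one thing only and reads the same everywhere.
var j = 5;
j += 1;

console.log(a, b, i, j); // prints 5 7 7 6
```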

Wednesday, August 7, 2013

Douglas Crockford's JavaScript Video 4 - AJAX

These videos are remarkable in the packed world of IT training videos in that they are clear and enjoyable to watch. The fourth instalment is about Ajax, but goes into plenty of detail that I didn't know about where the DOM came from and some info about the famous browser wars. Here are my notes so that you can see what is in there before you invest 90 minutes.

Markup languages

RUNOFF
GML - generalized markup language
SGML
HTML - simplified SGML
LaTeX

Angle brackets came from Scribe.

HTML

Does not fail on errors - allowed innovation. Otherwise the web would have frozen.
2 types of outlines - H1 - not nested and p type which are. Yuk.

CSS

Not modular - clashes can wreck your page.
difficult to manage selectors - classitis and iditis.
None of the browser vendors ever got it implemented!

The DOM

Brendan Eich - Netscape

Browser workflow

url -> Fetch -> cache  -> Parse -> Tree ->  Flow -> display list ->  Paint -> pixels

Comments around script tags just protect users of ancient browsers from seeing the script. Don't bother with this.

document.write

Very bad. Don't do it.

For performance improvement of scripts


  • minify
  • gzip
  • Reduce number of script files (concat at deploy)
  • Use something like Chrome PageSpeed to test


JavaScript uses camel case for style properties. CSS uses hyphens - incompatible with JS - in fact incompatible with most languages. Done on purpose. Source of annoying bugs.
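
To make the naming mismatch concrete, here is a rough sketch of my own that turns a hyphenated CSS property name into the camel case name used on an element's style object. The function name cssToStyleName is made up for illustration, and it only covers ordinary properties (special cases such as float, or the -ms- prefix, are not handled).

```javascript
// Hypothetical helper: 'background-color' -> 'backgroundColor'.
// Assumes a plain lower-case, hyphenated CSS property name.
function cssToStyleName(cssName) {
  // Replace each "-x" pair with the upper-case letter "X".
  return cssName.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

console.log(cssToStyleName('background-color')); // prints backgroundColor
console.log(cssToStyleName('font-size'));        // prints fontSize
```

In a browser you would then write something like element.style[cssToStyleName('font-size')] = '12px'.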

innerHTML

Nice and fast, but dangerous. Developed by Microsoft - all browsers support it.

Always err on the side of understanding and clean code over performance unless performance is a serious problem.


Events

Bubble up through the DOM - use stopPropagation to deal with this.
Allows attaching of a single event handler to a container. The container then dispatches the event to the appropriate element. Faster to set up.
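
The single-handler-on-a-container idea can be sketched roughly like this. It is my own minimal version, not code from the video: the delegate function and the 'menu' id in the usage comment are hypothetical names, and it assumes a browser that supports Element.closest.

```javascript
// Attach ONE click listener to the container and dispatch to whichever
// descendant matching `selector` was clicked. Elements added to the
// container later are covered too, with no extra set-up.
function delegate(container, selector, handler) {
  function listener(event) {
    // Walk up from the clicked node to the nearest matching element.
    var target = event.target.closest(selector);
    // Ignore matches that sit outside the container.
    if (target && container.contains(target)) {
      handler.call(target, event);
    }
  }
  container.addEventListener('click', listener);
  return listener; // returned so the caller can remove it later
}

// Usage in a browser (assumes an element with the made-up id 'menu'):
// delegate(document.getElementById('menu'), 'li', function () {
//   console.log('clicked', this.textContent);
// });
```

The handler relies on the click bubbling up to the container, so calling stopPropagation on an inner element would stop it from ever firing.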

Use good speed testing tools - Chrome best.

Server vs browser

Neither side should dominate. A balance is the best. The server is not a filesystem and the browser is not a dope that just displays returned content.




Thursday, August 1, 2013

Douglas Crockford's JavaScript Videos from his time at Yahoo

He has now moved on to other things, but these videos are still around. They are like reading a novel - long form is still best. No sound bites here. Each video is over an hour long and there are 8 of them. I am only on the third at the moment, but am getting a huge amount out of them. If you are like me and most JS developers, you will be bludgeoning your way through whatever tasks you need to complete without knowing the details. jQuery et al insulate us from having to know this stuff, right? Afraid not. There is no substitute for knowing JavaScript in detail. These videos help you do that.

Tuesday, July 30, 2013

Useful CSS Wisdom

I came across this in the most recent Web Design Weekly. I have called it wisdom as it is full of the type of advice that people only have the balls to give you when they really know what they are talking about. Why IDs in CSS should be avoided is detailed. A lot of people rant about this, but there is something about this document that got me sifting through my CSS to find all the # marks.
The document is short and I am going to look back at it a few times over the coming months to make sure I have squeezed the benefit out of it.