
Data Cleaning and Manipulation/Organization

This is the third part in an ongoing series on how and why you should be using R. If you missed the earlier ones, you can check out part 1 (Intro to R) and part 2 (R Basics). This post will go into some more specifics relating to data cleaning, organization, and manipulation.

In my opinion, the dplyr package is a game changer for those trying to learn R. It is what moved me from just recommending that people use R to basically demanding that my friends and co-workers switch to R. I remember the day that I finally got around to learning how to use the package’s functionality and all of the ways in which it lets you easily and clearly manipulate your data frames. I just kind of stared at my computer screen and imagined how much better my data-life was going to be with these tools. I realized that the hours and hours I used to spend in Excel trying to massage my data into the right form were over. Also, I wouldn’t have to decipher weird base R code anymore when trying to create new variables or filter datasets. The dplyr package and its friends make your code/scripts much easier to read, which will help both you and future you in trying to decipher what is going on.


Back to academia!

In August I moved across the country (again) for a new opportunity that I am so excited about. I have started a new position as a tenure-track assistant professor at the University of Illinois at Urbana-Champaign. Yay! I am part of a new campus-level initiative called Technology Innovation in Educational Research and Design (or TIER-ED). I have a split appointment within the College of Education; I have a 75% appointment in the Educational Psychology department and 25% in the Curriculum and Instruction department. The plan is to do a lot of interdisciplinary work across campus (e.g., with VR/AR researchers, speech researchers, and the physics department).



10,000 Tweets

I have been on Twitter for almost ten years. Twitter has changed a lot in that time and my enthusiasm for the platform has waned a bit over the years, but I still find it to be a compelling communication platform. Initially I used it to share the more mundane, personal parts of life and my stresses as I finished graduate school. Lately it’s become more professionally focused (most of the time) and more reflective of the many things that are happening in the world (but with important dog pictures also). I have met lots of people through Twitter, and I have listened to and learned from thousands of people whom I would never have met in my day-to-day life. It has helped me gain a wider audience for my academic work and has allowed me to share pictures of my awesome dog with strangers and friends alike.

I just hit 10,000 tweets (if I did this correctly, then the tweet linking to this post should be number 10,000), and I thought it would be a good opportunity to go back through my Twitter archive and get a sense of what all of those tweets were about and how I tweeted. (The analysis that follows is actually only on my first 9,945 tweets because I had to request my tweets a couple of weeks ago and then do the actual analysis.) This was also a fun R exercise for me.


Eclipse pinhole projector

Yesterday I posted some videos on Instagram of step-by-step instructions on how to build your own pinhole projector to safely view the eclipse on August 21st. In order to make the instructions easier to share, I’ve compiled them all here (well, screenshots from them at least) to help you turn an ordinary cardboard box into a pinhole projector. 

For more information on the eclipse check out the NASA website eclipse2017.nasa.gov.

Step 1: Find a cardboard box and cut a white piece of paper to fit the bottom. 


Traveling 2016

I flew about 75,000 miles this year. That’s a lot. For comparison, that’s about 1/3 of the way to the moon (238,855 miles on average). I went to:

  • LA (6 times – once for work, the rest for family stuff)
  • DC (3 times for work, including a memorable Snowpocalypse adventure)
  • New York, NY (for work, but managed to see Hamilton!)
  • Baltimore, MD (work conference)
  • Bloomington, IN (workshop)
  • Pittsburgh, PA (workshop)
  • Wichita, KS (work stuff)
  • Madison, WI (for a friend’s wedding!)
  • Arecibo, Puerto Rico (radio telescope!)
  • Carpinteria, CA (annual family Labor Day fun time)
  • Paris and Lyon, France (work conference)
  • Edinburgh, Scotland (work conference)
  • Singapore (work conference)
  • Hanoi and Ha Long Bay, Vietnam (vacation!)

So yeah, I’m a bit tired. That was probably too much. I’m going to try really hard not to repeat that in 2017, but who knows what will happen. I love traveling, but it is exhausting.

Here are some photo highlights from my year of travel.


Multi-modal Learning Data Collection at (Small) Scale

subtitle: even the best-laid plans…

Last year (spring 2015) we collected a really nice set of data of students collaborating in groups of three. The data collection process wasn’t entirely smooth or perfect, but it generally went off without any major technical or logistical problems. We ended up with a really nice dataset of almost 150 students with high quality audio data (four channels per group), video recordings (one per group), and computer log files (ideally one per group, practically more than one). [NB: The annotated audio from this first phase of data collection will be made available soon to other researchers. You can read the paper about the data set (presented at Interspeech 2016) here.]

In the spring of 2016 we set out to do our second phase of data collection, in classrooms during a regular class period. So unlike the first phase, where we had just two groups at a time with kids who had volunteered and were excited to try out some math problems (a.k.a. the best kids), we had up to 10 groups at once with varying levels of excitement and/or willingness to follow directions. We mostly wanted to test how well the audio capture worked with all of the background noise in a typical classroom environment and see if our speech models still held up.


Going Deep with David Rees

Today on the blog: a TV show recommendation. Season 2 of Going Deep with David Rees started last week and I think it’s a really good show. The basic idea of each episode is that David is trying to figure out how to do something. Something simple, like how to make an ice cube, because it turns out that even simple things are actually really complex and interesting when you break them down. While that premise is immediately interesting to me, one of the things I like best about the show is its warm sense of humor and an open and sincere quest for knowledge of everyday life. It’s this same sense of wonder and propensity for questioning things around me that initially made me want to be a scientist (and now, study how people learn science).

David Rees is a well-known artisanal pencil sharpener. OK, maybe not well-known to a large number of people, but still, if you send him a pencil he will sharpen it by hand for you. He wrote a book on How To Sharpen Pencils, so he probably knows what he’s talking about. He is probably actually better known for being the person responsible for the political cartoon Get Your War On which, at least for me, made the post-9/11 George W. Bush years slightly more bearable.

Season 1 of GDDR focused on important questions like How to Open a Door, How to Flip a Coin, How to Shake Hands, and How to Dig a Hole. Those might sound like silly topics for a show, and they are to a certain extent, but that’s not really what each episode is about.

Sadly, season 1 is not available to stream anywhere at the moment, but it’s not too late to get on the bandwagon for season 2. The first episode was about How to Pet a Dog and tonight’s second episode was about How to Eavesdrop. Tonight’s episode was a really good example of how they can take a simple question and expand it into a really interesting and engaging sciencey show.

How to Eavesdrop is not really about eavesdropping per se. It is about sound, which is one of my favorite physics topics. As David says in the episode, “how do sound waves get turned into something my brain recognizes as sound?” Even though he talks to a former CIA spy about actual eavesdropping, the heart of the episode (to me, at least) is talking to the audiologist and learning how the ear works, and talking to the cognitive scientist about how we interpret sound waves to understand speech. They even talked about the McGurk illusion, which is fascinating and is also something I wrote about on this very blog about four years ago. And, to make my little academic heart even happier, GDDR popped up a citation to the McGurk et al. paper when they talked about it!

If you’re looking for a fun and engaging bit of science on your TV (or computer), you should definitely check this show out.