Some years ago I read an article – I forget where – describing how our general knowledge often becomes frozen in time. Asked to name the tallest building in the world you confidently proclaim “the Sears Tower!”, because for most of your childhood that was the case – never mind that the record was surpassed long ago and it isn’t even called the Sears Tower anymore. From memory the example in the article was of a middle-aged speaker who constantly referred to a figure of 4 billion for the human population – again, because that’s what he learned in school and had never mentally updated.
Is this the case with programming too? Oh yes – as I learned today when performing the simplest of tasks: reading CSV files using R.
Here’s the scenario: given a directory containing CSV files with the same columns, read them into a single data frame with an additional column containing the file name.
We start with list.files() of course, something along the lines of:
csv_files <- list.files(path = "path/to/the/folder", pattern = "\\.csv$", full.names = TRUE)
My frozen, outdated knowledge tells me that the next steps are: (1) use lapply() to read the CSV files into a list of data frames, (2) use the vector of file names as names for the list and (3) use dplyr::bind_rows() to create a single data frame and add the column of file names, here named “path”.
library(dplyr)
library(readr)
csv_data <- lapply(csv_files, read_csv)
names(csv_data) <- csv_files
csv_data <- bind_rows(csv_data, .id = "path")
I’ve used readr::read_csv() for years. Only today did I learn that not only can it read multiple files given a vector of file names, but it can also add a column for those file names. All in one line.
csv_data <- read_csv(csv_files, id = "path")
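To see the two approaches side by side, here is a minimal sketch that can be run anywhere. The two sample files, their contents and the temporary folder are made up for illustration, and it assumes readr 2.0.0 or later plus dplyr are installed. The pattern is written as the anchored regular expression "\\.csv$" so that only names ending in .csv match.

```r
library(dplyr)
library(readr)

# Hypothetical sample data: two small CSV files with identical columns,
# written to a fresh temporary folder so nothing real is touched
csv_dir <- tempfile("csv_demo")
dir.create(csv_dir)
write_csv(data.frame(x = 1:2, y = c("a", "b")), file.path(csv_dir, "first.csv"))
write_csv(data.frame(x = 3:4, y = c("c", "d")), file.path(csv_dir, "second.csv"))

csv_files <- list.files(path = csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Old approach: read into a named list, then bind with the names as a column
csv_list <- lapply(csv_files, read_csv, show_col_types = FALSE)
names(csv_list) <- csv_files
old_way <- bind_rows(csv_list, .id = "path")

# New approach: one call that reads every file and records where each row came from
new_way <- read_csv(csv_files, id = "path", show_col_types = FALSE)

# Both results should have the same columns and the same number of rows
print(names(old_way))
print(names(new_way))
print(nrow(old_way) == nrow(new_way))
```

Both versions put the file name in the first column; the one-liner simply skips the intermediate list.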
Why did I not know this? I guess because I had a solution that worked, and I’d never bothered to go back and see if something better had been invented since I learned my solution.
How can we unlearn our frozen, outdated knowledge and update our skills? Right now my answer is “once in a while take the time to read the help page when you use a function, even if it’s one you use all the time, in case it’s been updated with something new and useful.”
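One rough way to build that habit from the R console, assuming readr is installed: open the help page, check the installed version and skim the package changelog. The 2.0.0 threshold is an assumption taken from the tidyverse blog's readr 2.0.0 announcement, which describes reading multiple files at once.

```r
# Open the current help page for a function you think you already know
help("read_csv", package = "readr")

# Check the installed version: reading several files in one call
# is described in the readr 2.0.0 release announcement
packageVersion("readr") >= "2.0.0"

# Browse the changelog of the installed package for features added since you last looked
news(package = "readr")
```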
Any better ideas?
5 thoughts on “Has your knowledge stopped updating?”
Always great to read you. Should I read all tidyverse help pages? Yes. Will I do it? Nope, I prefer spending that time on improving my skills in machine learning. All the best for 2023.
I’m glad that you blogged it, as I’ve put my Twitter account to sleep for now! This is so much easier than lapply or the purrr variants I’ve been using
Ouch. My code is officially dated! Thanks, this is genuinely useful and a good reminder to keep abreast of help files/Stackoverflow
I also try to read the tidyverse blog, they had written about this feature when it came out: https://www.tidyverse.org/blog/2021/07/readr-2-0-0/#reading-multiple-files-at-once
Of course, whether I read and remember everything from these very information-dense posts, that’s a different question (I don’t)!