Data Analysis of My 2023 Reading


[Image: reading data 2023 in the matrix]

Three years ago, I started a reading database. In 2023, I started learning about the data analysis process and, indeed, spent the best parts of the year reading and learning about data.

While many posts try to convince their reader to build a “Second Brain,” this post has no such Frankensteinian ambitions. Instead, I will show you my virtual brain, and we will consider what happens when you habitually use the e-brain and how it impacts reading and writing.

The Big Numbers

First, how many books did I read in the year? The 21st century adds significant complexity to an otherwise simple question. The books were primarily digital, borrowed, electronic, or audio. I could not pile the books. I often think about book piles as I write a newsletter about my reading piles, but there is no pile. The Pile Is A Lie.

So, without a database, this simple question would be difficult to answer because most books were not printed on physical paper.

Counting big numbers showed me interesting things about the veracity of truth and the grand, epistemological nature of knowledge. Far out, number stuff.

Number of Books – VERIFIED TRUSTWORTHY!

One hundred eighty-three books is a strongly verifiable data point. I opened many books and went from the first page to the last. It averages 3.5 books a week. Of course, this is all self-reported data. I could have forgotten to log a book, or I could be making all this stuff up! But to me, 183 is a strongly knowable number.
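The weekly average is simple arithmetic. A quick check in Python, assuming a 52-week year (the only number here that is not from the post):

```python
# 183 is the book count reported in the post.
books_read = 183
weeks_in_year = 52  # assumption: a plain 52-week year

books_per_week = books_read / weeks_in_year
print(f"{books_per_week:.1f} books per week")
```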

What’s interesting about this number is that I cared a lot about it from 1/1/2023 to 12/31/2023, but today, it is one of the least insightful numbers because the collection is complete. Creating the records drives the collection, but once it’s done, the records become self-evident.

Number of Pages – UNTRUSTWORTHY!

Here, things get vaguer. I am not confident I read 50,574 pages this year for a few reasons. About 40% of what I read was non-fiction, and those books include indexes and footnotes, valuable tools that show the skeleton of a good, well-argued book (or the amorphous blob of a poorly written one). But these pages make the numbers go up arbitrarily.

Furthermore, page count information online is not reliable. I’ve decided to track this more closely this year. I’ve found wrongly reported page counts for numerous books. I’ve held a copy of a book in my hands and seen a different number on Amazon. Various editions, layouts, and formats all impact what should be a quantifiable fact. But despite it being knowable information, it is not always known. Amazon has consistent errors, and this imprecision must somehow affect their warehouse usage and shipping costs. While exact page counts of books are mostly useless, random information, I find this imprecision hints at the fallibility of Big Data and monopoly capital.

Authors Read – VERIFIED TRUSTWORTHY!

In 2022, I read 73 authors and 91 books in total. Thus, I set the goal to read 100 authors. Achieving this goal was one of the drives pushing me to read this much in a year. Again, this is easy to count. I limited each book (record) to one author field, so multi-author books list both authors but count as one author.

Something I realized binging Elmore Leonard novels was that getting the hang of an author’s style dramatically increases the speed at which one reads their books. I could start pounding these out in a day, even with other stuff to do! I think this quality explains why readers can voraciously consume an author like Brandon Sanderson, who’s written dozens of books and well over a million pages.

Fiction Vs. Non-Fiction – UNTRUSTWORTHY!

A new addition to my database: [Fiction: Y/N?]. The checkbox asks the user to consider what is true and what is false.

I read Mike Tyson’s memoir, Undisputed Truth, which approaches this question from the title. Mike admits to a lot in the book, and there are undoubtedly truths in it. I marked this non-fiction.

But a book I marked fiction, The Siberian Job, begins with an introduction stating how the book is fact but barely fictionalized because those involved started getting death threats. The book might be total bullshit, but there are also a lot of truths about privatization in the USSR.

Or one like James Ellroy’s essay collection, Destination Morgue, which is half fiction, half non-fiction: the author recounts his childhood improprieties of huffing paint and breaking into houses, but the book also contains fictionalized accounts of him as a police officer solving murders.

That’s to say nothing of the obvious roman à clef (French for a book about real life that probably got the author a vicious enemy). No Longer Human by Osamu Dazai and Queer by William S. Burroughs are two such examples.

To binarize fact/fiction is to oversimplify reality into an abstraction that doesn’t always capture narratives’ strange, contradictory nature.


Genre & Format

Genre –

A pie chart that doubles as a nifty abstraction for my brain.

Categorizing what I read has made me keener on the contours of subcategorizations. Whereas most bookstores shelve crime, mystery, and true crime together, I break them apart because they’re my most represented reading.

Format –

I analyzed the information’s format independently of its genre. Or, how did I consume the book? Did I read it in print or digitally? Did I own it or borrow it?

There’s a clear insight from this chart. 64.1% of reading was done with the library. Using digital library apps makes reading cost-free. Most of the books I borrowed were in audio format. These are too expensive to buy individually, so a cost-free way of accessing this information doubled my consumption.

Multi-modal consumption made reading and writing easier. Borrowing the audio and the digital copies allowed me to copy and paste text or highlight useful passages.

Next year, I plan to correlate genre and medium to see which books I prefer in which medium. I still need to gain the technical skills to do that.
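One possible way to approach that genre-by-medium comparison is a simple cross-tabulation. The sketch below uses only the Python standard library; the sample rows, the field names `genre` and `format`, and the function `cross_tabulate` are hypothetical placeholders, not the actual database schema or data.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows; field names and values are placeholders,
# not the real records from the reading database.
books = [
    {"genre": "Crime", "format": "Audio"},
    {"genre": "Crime", "format": "Audio"},
    {"genre": "Crime", "format": "Ebook"},
    {"genre": "History", "format": "Print"},
    {"genre": "History", "format": "Audio"},
    {"genre": "History", "format": "Print"},
]

def cross_tabulate(records):
    """Return {genre: Counter({format: count})} for the given records."""
    table = defaultdict(Counter)
    for record in records:
        table[record["genre"]][record["format"]] += 1
    return table

table = cross_tabulate(books)

# Report the most common format within each genre.
for genre, formats in sorted(table.items()):
    favorite, count = formats.most_common(1)[0]
    total = sum(formats.values())
    print(f"{genre}: mostly {favorite} ({count} of {total})")
```

With real records exported from the database, the same counting would show which medium dominates each genre. It measures what was read in each format, which is only a rough proxy for preference.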


Age and Quality of Information

Publication year

This tells me the age of the information consumed. Most of the books are very new. My most represented year is 2023, with 34 books, and books from the 21st century (137) outnumber books from all previous centuries combined (46).
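Grouping publication years by century is a small calculation. Here is a minimal sketch, assuming each record carries a numeric publication year; the list `publication_years` is made-up sample data and `century_of` is a hypothetical helper, not part of the actual database.

```python
from collections import Counter

def century_of(year):
    """Return the century a year falls in, e.g. 1999 -> 20, 2001 -> 21."""
    return (year - 1) // 100 + 1

# Hypothetical publication years, not the real list of 183 books.
publication_years = [2023, 2023, 2019, 2001, 1999, 1985, 1948]

by_century = Counter(century_of(year) for year in publication_years)

for century, count in sorted(by_century.items(), reverse=True):
    print(f"Century {century}: {count} books")

# Sanity check on the totals reported in the post.
assert 137 + 46 == 183
```

The two totals in the post add up to the full 183 books, so every record appears to have a publication year.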

This graph prompts the most straightforward conclusion about how to broaden my reading depth. I only read things from the 20th and 21st centuries. I don’t read any classics. In 2024, I intend to read something written in the 19th century.

How did I read so many new books? I think there are three reasons.

  1. Advanced Reader Copies
    • Requesting ARCs explains why 2023 is the year most represented in the data. With a Goodreads account and a website, I could receive free copies in exchange for a review. ARCs are an efficient way to read contemporary books.
  2. Library Power User
    • Again, the library keeps more recent materials in its collection. Because I use the library, I am more likely to read more current work.
  3. Publishing Press Reader
    • I’m also subject to advertising for new books. Because I read blogs like ShelfAwareness, Booklist, and Crimereads, I am recommended new books more frequently than old ones.

Quality of information

This is arguably the most subjective data point captured. I grade a book on a 5-star scale and am not a harsh grader. I like most books I read, with the overwhelming majority getting five stars. I find rating the books to be my least favorite part of this process, as it feels arbitrary and inconsequential. So what, more or less? I grade the information on whether I found it relevant. Did I enjoy the thrill of the story? How does it compare to others in the genre? Comparing unlike things is the problem with this category. How do I compare a reference book on sentence structures to an exposé of an FBI coverup of the Osage massacre to a literary masterpiece to an enjoyable potboiler thriller? I can’t! I could rank the books I read this year, but that would be excruciatingly dull for me and the reader. If you want that, feel free to print out the blog post, cut it up, and rank it accordingly. Which one was better? WGAF?

I’ve also noticed this is the field I’m most likely to revise. If I pick a number in a bad mood, it’s lower than one in a good mood. It’s capricious, but my awareness keeps it relevant enough to keep recording. Perhaps it will be fun to compare year after year after a while.


Narrative Voice

My favorite category in the database is narrative voice. It asks, how is the story told, whatever the story is?

Traditionally, there’s first person (I, me), second person (you, yourself), and third person (he, his name), and it’s a way to define fictional tellings. Who Says? by Lisa Zeidner inspired me to look closer at narrative voice and how an author chooses to tell a story. Books can have more than one narrative voice. The bar has multiple colors overlapping to show this.

This year, I noticed more texture in non-fiction tellings. These are terms, or flavors, that I mostly made up. Historical would be citing multiple sources; Essay is opinionated writing; How To is a descriptive guide; Reporting is interviewing first-hand sources; Fandom is writing about a thing one has passionately engaged with for multiple years. I intend to write up something longer about this column in 2024.


Takeaways

I have internalized some big takeaways as I am steadfast about continuing and deepening this project in 2024.

Libraries vs. Book Buying

I started using the library with regularity in 2022, but I still bought books. I quit my job in 2023, so I “stopped” buying books (I buy fewer anyway). This has dramatically increased my reading, and I now think buying books wastes time more than it wastes money. The biggest takeaway for me in 2023: buying books cuts into reading time. When browsing the library, the evaluation is so much simpler. “Does this look cool?” When buying a book, I ask about ten questions, from “Is this a reference text?” to “Do I want to lug this around to more apartments?” Hours wasted browsing used bookstores were better spent walking and listening to books. Last year, I saved thousands of dollars not buying books. This year, I am thinking of it differently. I’ve internalized library usage, and I’ll never buy books in significant numbers again (lol, says the addict, time will tell).

It Is Surprisingly Easy To Read New Stuff For Free in 2023

This year, I realized what many book reviewers already knew: reading new books in exchange for a review is a valuable service readers provide. Publishers and authors need this, and asking for free virtual copies is not an imposition on them. Without 5-star reviews, new books are not aggregated on buying markets like Amazon, Kindle, Barnes and Noble, or even library apps like Hoopla. So, in 2024, I’ll keep asking for more free stuff!

Querying Time vs. Inputting Time

A subtler trend: I realized this year that I spent about as much time querying the database as creating it. Intuitively, cataloging should have taken more time in 2023: more records = more time. However, I streamlined my process for faster entry, so writing 183 lines in a spreadsheet didn’t take much time at all, especially compared to the time spent reading the books. How often I referred to my notes from this year compared to previous ones surprised me. I referenced the database in bookstores, online conversations, and while writing. I’m comfortable with this data now. I also created the dashboards at the end of 2022 and was able to use them throughout 2023. Through reflection, I could comprehend the iterative progress of these milestones.

Methodology and Tools

I used an Airtable database to input, track, and filter the information. Here’s an Interface displaying these visualizations. [link]


Here’s the entire list of all 183 books considered:

