Jan Piotrowski @Sujan
Yesterday I received Amazon's response to my "Data Subject Access Request" from 2 months ago: betamode.de/2018/09/18/res…
Today I started to hack together some scripts to analyze the data a bit. I will post my progress and results as I create them.
First look at `Kindle/Inhalte/KindleReadingActions.csv`, a ~55000 line file with data on my Kindle device and app usage.
Turns out the Kindle(s) "sent" data (actually: tracked, then sent later in batches) on almost every day in the last 15 months.
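A minimal sketch of how "days with data" could be counted from such a CSV. This is Python, not the PHP from my repository, and the column names (`timestamp`, `device`, `action`) and the ISO timestamp format are assumptions for illustration; the real export may look different.

```python
import csv
import io
from datetime import datetime


def active_days(csv_text, timestamp_column="timestamp"):
    """Return the sorted calendar days that have at least one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # A set collapses many rows on the same day into a single entry.
    days = {datetime.fromisoformat(row[timestamp_column]).date() for row in reader}
    return sorted(days)


# Made-up sample rows; column names and values are hypothetical.
sample = """timestamp,device,action
2018-05-01T08:15:00,Paperwhite,PageTurn
2018-05-01T21:40:00,Paperwhite,PageTurn
2018-05-03T12:00:00,iPhone,Open
"""

days = active_days(sample)
print(len(days), "days with data:", [d.isoformat() for d in days])
```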
Short detour: What devices were reporting data to Amazon in general? `Kindle/Geräte/registration.csv` tells us: lots of them.
Grouping the previous calendar data by device (only 4 devices sent "Kindle Reading Actions"), it becomes much clearer when each device was active:
Paperwhite all the time,
iOS + Android + desktop only sometimes.
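The per-device grouping could look roughly like this. It is a sketch that assumes each row has already been reduced to a (device, day) pair; the device labels and dates are made up.

```python
from collections import defaultdict
from datetime import date


def days_by_device(rows):
    """Group (device, day) pairs into a sorted list of distinct days per device."""
    grouped = defaultdict(set)
    for device, day in rows:
        grouped[device].add(day)
    return {device: sorted(days) for device, days in grouped.items()}


# Hypothetical input: one pair per tracked action.
rows = [
    ("Paperwhite", date(2018, 5, 1)),
    ("Paperwhite", date(2018, 5, 2)),
    ("iOS", date(2018, 5, 2)),
    ("Paperwhite", date(2018, 5, 1)),  # same day again, counted once
]

result = days_by_device(rows)
for device, days in result.items():
    print(device, len(days), "active days")
```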
Actual reading activity is simpler to extract from `Kindle/Inhalte/ReadingSessions.csv`. Seems I like reading outside in spring and during vacations. (Surprise!)
I use the Kindle app on the iPhone only when I have to (= forgot the Paperwhite), or for Audible audiobooks while lying in the park: buying the book + audiobook package keeps listening and reading progress in sync. (The Paperwhite has no audio and no headphone jack.)
Oh wow - Whispersync data in `Kindle/Inhalte/whispersync.csv` is available all the way back to my very first Kindle. All (?) the books and documents I opened and read.
Next `Kindle/Inhalte/digitalcontentownership`: one file for each of my 719 digital "items" (ebooks, ebook samples, music files, apps, uploaded documents) with type, ASIN and acquiredDate.
Some simple web scraping later (title of amazon.de/dp/%ASIN%), I have titles for ~50% of the 700 unique ASINs in this list.
And some additional web scraping later (title and html extract of amazon.de/product-review…%), all of the "out of stock" ASINs (not sold any more) also have a title.
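For the title lookup, a rough sketch of the idea: build the product URL from the ASIN and read the page's `<title>`. The `https://` scheme, the placeholder ASIN and the sample HTML are assumptions. The example parses a hard-coded string and does no actual downloading, and it is Python rather than the PHP in my repository.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def product_url(asin):
    # Follows the amazon.de/dp/%ASIN% pattern; the scheme is an assumption.
    return f"https://amazon.de/dp/{asin}"


def extract_title(html):
    """Return the stripped page title, or None if the page has none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip() or None


# Hypothetical ASIN and page content, for illustration only.
print(product_url("B000000000"))
print(extract_title("<html><head><title> Example Book </title></head></html>"))
```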

Seems I "own" a lot of Kindle User's guides and dictionaries I didn't know about.
Only items left nameless for now are my "Kindle Personal Documents" that I uploaded, so PDFs or ebooks from other, non-Amazon sources.
Interesting discovery along the way (because it listed books I didn't know): All items I can access via "Family Library" are also included here.
Short investigation later: the JSON contains a `rights.originType` that encodes this. Only half of my items are actual purchases.
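Tallying the origin types could be sketched like this. I'm assuming here that `rights` is a nested object with an `originType` key, and the value `Sharing` is made up; the actual JSON layout and values in the export may differ.

```python
import json
from collections import Counter


def origin_type_counts(json_documents):
    """Count items per rights.originType; items without one count as 'unknown'."""
    counts = Counter()
    for text in json_documents:
        item = json.loads(text)
        origin = item.get("rights", {}).get("originType", "unknown")
        counts[origin] += 1
    return counts


# Hypothetical file contents, one JSON document per item.
documents = [
    '{"asin": "A1", "rights": {"originType": "Purchase"}}',
    '{"asin": "A2", "rights": {"originType": "Sharing"}}',
    '{"asin": "A3", "rights": {"originType": "Purchase"}}',
    '{"asin": "A4"}',
]

counts = origin_type_counts(documents)
print(dict(counts))
```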
Those 91 samples turned into 21 purchases.
(Both sample and ebook fortunately have the same ASIN.)
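Because sample and ebook share an ASIN, the conversion count boils down to a set intersection. A sketch with made-up items; the `type` values `sample` and `ebook` and the ASINs are assumptions, not the real field values.

```python
def converted_samples(items):
    """Return ASINs that appear both as a sample and as a full ebook."""
    sample_asins = {item["asin"] for item in items if item["type"] == "sample"}
    ebook_asins = {item["asin"] for item in items if item["type"] == "ebook"}
    return sample_asins & ebook_asins


# Hypothetical items: A1 was sampled and later bought, A2 only sampled.
items = [
    {"asin": "A1", "type": "sample"},
    {"asin": "A1", "type": "ebook"},
    {"asin": "A2", "type": "sample"},
    {"asin": "A3", "type": "ebook"},
]

converted = converted_samples(items)
print(len(converted), "samples turned into purchases:", sorted(converted))
```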
All days with "Purchase" events.

Interesting (?) how this partly correlates with Whispersync data, and partly doesn't:
Anyone interested in applying the same analysis on their own Amazon DSAR data?

I put all the code on github.com/janpio/amazon_… - works on any shared host with PHP.