Useful Tools

PhD Tools: DevonThink for File Storage and Discovery

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

Discovering similar notes in one of my DevonThink databases

I first heard about DevonThink in the same breath as Tinderbox. They go together, though they serve different purposes. Some people want to make an either/or decision about which to use. I see them as sufficiently different to assess them on their own merits and as per your usage scenario.

As with all tools, you should come to the decision table with a set of features that you're looking for. Don't just shop around for new things for the sake of newness or for the sake of having a really great set of tools. These programmes are not cheap. Luckily almost all of them come with generous trial versions or periods, but I don't recommend 'newness' as a feature of any particular merit.

DevonThink (I use the Pro Office version) is a place to store your files and notes. It can, I think, take any file you can throw at it. It comes with software for processing PDFs into fully-searchable documents (OCR software, in other words), which is part of the reason why the license for the Pro Office version of the programme is so expensive.

If you're anything like me, you're drowning in PDF documents. They all come with helpful names like "afghanistan_final_report_02_16.pdf" and unless you have a rigorous file hierarchy and sorting system, you'll probably be unable to find the one you need. The basic file hierarchy also doesn't help when you want to store the same file in multiple folders (e.g. what if a report is about both Afghanistan and Tunisia?). DevonThink has a feature which allows you to store a file in multiple locations without saving two copies of it. Any changes or annotations you make in one location will automatically appear in the other.

You might ask yourself why you would need DevonThink and Tinderbox (see this post for more). The short answer is that they store different kinds of files/data, and that DevonThink is less about thinking than about storage (to a certain extent) and discovery.

One of the key features of DevonThink Pro Office is its smart searching: it can suggest similar texts based on the contents of what you are looking at. It does this by means of a proprietary algorithm, so I can't really tell you how it works, but just know that it does. It works best on smaller chunks of text. For example, I was reading through a particular source from the 3 million-word-strong Taliban Sources Project database and then I clicked the "See also" button and it had found a source on the same topic that I would never otherwise have read, even though it didn't use any of the keywords I would have searched for. It uses semantic webs of words to figure this stuff out. Beyond a certain database size, this power becomes really useful. DevonThink can also archive websites, store anything including text, do in-text searches on e-books and so on. (Read more on how I use DevonThink for research in general here.)

I also used it a little as an archive for substantive drafts / iterations of the writeup process. That's another important part of the process: making backups of many different kinds. I never found any use for them, but at least they were there (just in case).

If you're a data and document hoarder at heart, like me, you'll soon have a DevonThink database (or several databases, split up by topic) that is too big for you to fully comprehend, or to remember what is inside the files. At that point, search becomes really important. Not just straightforward search: the ability to put 'fuzzy' terms (e.g. if you search for "Afghanistan" it'll also find instances where it's incorrectly spelt "Afgahistan") and boolean operators into your query is really powerful. DevonThink is an amazing search tool. The company that developed the database software also makes something called DevonAgent, which is basically a power-user search tool for the internet. Google on steroids, if you will. Fully customisable, scriptable... you can really go crazy with this stuff. I use it, but my PhD wasn't really about searching things on the internet, so I didn't use it much for my research or writeup. But it's a great tool, too.
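For anyone curious what 'fuzzy' matching means in practice, here is a rough illustration. DevonThink's own matching is proprietary, so this is not how DevonThink does it; it is only a minimal Python sketch using the standard library's difflib module. The function name fuzzy_find, the sample word list and the 0.8 cutoff are all my own assumptions, chosen for illustration.

```python
import difflib


def fuzzy_find(query, words, cutoff=0.8):
    """Return the words that are close to `query`, ignoring case."""
    # Map lower-cased forms back to their original spelling.
    by_lower = {w.lower(): w for w in words}
    if not by_lower:
        return []
    # get_close_matches keeps candidates whose similarity ratio
    # to the query is at least `cutoff` (0.0 to 1.0).
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=len(by_lower), cutoff=cutoff
    )
    return [by_lower[h] for h in hits]


# Hypothetical word list; "Afgahistan" is the misspelling mentioned above.
words = ["Afghanistan", "Afgahistan", "Pakistan", "Tunisia"]
matches = fuzzy_find("Afghanistan", words)
print(matches)
```

With this cutoff the search picks up both the correct spelling and the misspelt one, while leaving "Pakistan" and "Tunisia" out. Lowering the cutoff makes the search more forgiving, at the cost of more false hits.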

In short, DevonThink is a research database tool that will help you store and find the documents that relate to your research, and do smart things to help you find sources and texts that maybe you'd forgotten you'd saved. Highly recommended for anyone working with large numbers of documents.

PhD Tools: Think better with Tinderbox

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

Tinderbox is a tool for writers and thinkers that can handle most things that you throw at it. If it has anything to do with thinking, Tinderbox can probably do what you want. That said, there is a slight learning curve to the programme, and it may not be to everyone's particular style. With those caveats stated, let's dive in.

Any PhD student generally takes a lot of notes. Notes on books or articles you're reading, or notes about points you want to make in the argument of your text / writeup. There are purely text-based / database-style systems that can handle these kinds of notes (like DevonThink, about which more soon) but none with the flexibility or visual features of Tinderbox.

A list of notes, for example, can be transformed into a visual / spaced-and-linked map of meaning like traditional 'mind maps'. You can switch back and forth between outlines and maps easily (or even display both on the same screen/window) and display notes as well.

It's fast, it doesn't break or crash or slow down your computer, and it helps you think things through in the way that is best suited to your needs. Too often, software forces you to think in a particular way (i.e. the way of the software creator), but Tinderbox adapts to the way you think and allows you to draft notes and structure accordingly.

I've written elsewhere about some things I've done using Tinderbox, so no need to mention all that here, but some things I found specifically useful for my PhD:

1. Small databases, constructed on the fly, while taking notes from books. An example of this is a database of key players or individuals from within the Taliban who occurred at various places in my notes. This grew to a pretty extensive document, but Tinderbox allows you to make these kinds of structured data sets without needing to think too much about how the data might eventually be presented or used. Changing things is easy.

2. Timelines -- Tinderbox can display lists of events with start and/or end dates on a timeline. I used this to create the TalQaeda timeline, for example, or the list of moments where the Afghan or international military forces claimed to have killed or identified a Chechen fighter in Afghanistan (chechensinafghanistan.com).

3. Working through 'unstructurable' ideas -- There's often a gap between the ideas you think you have in your head and the ideas as they are expressed on the page. I have found Tinderbox extremely useful in allowing me to find a way to make the two align closer together, or to figure out a structure or a sequence to parts of an idea in a way that makes most sense.

You can also use Tinderbox as a day-planner or a task outliner (like I discussed in my post about Trello), though I think it might be less suited to this task when compared to Trello.

The forum for Tinderbox and related products is a great place to discuss method, process and different ways of structuring ideas. Users are a mix of complete beginners and others who have been drafting books, novels and essays using Tinderbox for years. I find the discussions in the forum are often stimulating; asking questions there is an interesting way to rethink a particular mental quandary you might find yourself in.

BONUS: Listen to my podcast interview with Tinderbox's creator, Mark Bernstein, for more on the vision behind the software and for some practical tips on structuring ideas.

PhD Tools: Visualise Structure and Kanban Flow with Trello

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

What was leftover by the end of my PhD submission process...

I used Trello for structuring my PhD argument and for tracking my progress during the drafting and redrafting of the final text.

Trello is primarily associated with the Kanban workflow / movement and as such it offers a fast and easy-on-the-eyes way to visualise structure, the passage of tasks through a particular workflow and so on.

It only works with an internet connection, however, which makes this a somewhat qualified recommendation. The mobile apps associated with Trello also lack an offline mode.

Tasks are split up into lists, and these are organised in a sequence. Thus for me, my lists at one point were my different chapters. It's easy to email things (links, notes to yourself, or anything else text-based) into your lists from outside Trello, so it can function as a useful 'bucket' where you can deposit things you want to research in the future, or just tasks that need to be performed for a certain chapter.

It's a way of seeing what needs to be done, or what you want to add to a particular chapter, at a single glance. Not essential, but I found it useful at certain junctures of the editing process.

PhD Tools: Beeminder

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

I feel like I've mentioned the end of the PhD several times in recent posts (PHD IS OVER!). It occurred to me that it might be useful to go through some of the tools and principles that I found most useful in completing the doctoral thesis, the research and the work in general. Part of this is by way of giving thanks to the application or methodological creators, and the other part is me thinking that others (future / current PhD students?) might find this useful.

It took me many years to finally settle on these tools. It would probably be unwise to adopt my entire writing style and process for yourself, because everyone's unique. I read a lot of books and blogposts, and discussed things in forums and at meetings with others. This is all the product of a lot of procrastination (some active, some just resulting from hanging out on twitter or subscribing to a bunch of productivity-related blogs in my RSS reader).

Each post will vary in size. For some I'll go into a bit more detail because the principle will be somewhat unknown. Others are mega-players in the tech world so I'll just tip my hat in their direction.

Minding the Bees

My first pick is, of course, Beeminder. (I've written about Beeminder before here.) The principle behind this service is pretty simple: you commit to doing a certain thing (or things) by a certain date (or regularly each day etc) and if you don't do them, you're penalised with money taken from your credit/debit card. The amount of money taken depends on whether you're a first-time offender (free, or $5), but then it increases exponentially. Pretty soon you'll be facing $270 or even higher fines.

Needless to say, this is a pretty strong motivator. You can hear about some of the nitty-gritty details in a podcast interview I did with Matt Trevithick and the founders of Beeminder, Bethany Soule and Daniel Reeves.

I have used Beeminder for a really wide variety of things -- not just for my work but for my personal life, too -- but in terms of my PhD, I had three main goals it supported:

1) tracking the amount of time I spent writing. You can hook up RescueTime (a passive activity tracker on your laptop) to feed into Beeminder. I can then say that I want to make sure I do a minimum of 1 hour of writing in Scrivener each day (for example), and Beeminder keeps track of the rest. This is a good thing to track because, ultimately, the PhD is all about keeping writing. You can get lost in the research, but after a certain point you just have to deliver it and ship the damn thing. This keeps you honest about the writing part, the sitting down in the chair and putting words on the page.

2) words drafted -- this one's a bit more delicate, since often when you're starting out, drafting a new section or chapter, the words that come out are useless drivel (or replace with a far less charitable way of describing their quality, and say hi to my inner voice while you're there!). At the beginning, doing basic drafting, it's hard to get started because you feel everything has to be perfect. The best antidote to this is to work on a 'shitty first draft'. Here, the idea is simply to churn out enough thoughts to fill the blank space in the outline, or book, or chapter or wherever.
A specific example: I flew to Karachi in late 2012 to hammer out the first draft of my dissertation. I set up a Beeminder goal of having 100,000 words of text (approximately the maximum word count allowed for submission to the university) and a date 6 weeks in the future, and I got writing. Beeminder calculates and tells you how many words you have to get done each day in order to stay ahead of the curve. (There are graphs. They are awesome.) As long as you keep writing, you're ok. And I did it. Most was horrible, and some of it was inner conversations between myself and myself about the subject under consideration, almost all of which I had to rewrite in some shape or form later on. But... it was words on the page, and it was me thinking through the issues. It was essential.

3) Sources Read -- this might be unique to me, but at some point I had to return to the newly-gathered sources of the Taliban Sources Project. I looked in my DevonThink database (about which, more to come in a future post) and saw I'd flagged 1000+ articles to reread, catalogue/tag and integrate into the main thesis argument. So I plugged those numbers into Beeminder, gave myself a workable daily rate (50 or 100, I think) and then it calculated the rest and kept me honest.
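To give a sense of the numbers involved in goals 2) and 3), here is the back-of-the-envelope arithmetic as a small Python sketch. This is not Beeminder's actual calculation (the service tracks your progress against a line on a graph and does the maths for you); the function names are my own, and the 1000 figure is a floor, since I'd flagged "1000+" articles.

```python
import math


def required_daily_rate(total_units, days_available):
    """Units per day needed to reach the total in time, rounded up."""
    return math.ceil(total_units / days_available)


def days_needed(total_units, units_per_day):
    """Days needed to reach the total at a fixed daily rate, rounded up."""
    return math.ceil(total_units / units_per_day)


# 100,000 words in 6 weeks (42 days).
words_per_day = required_daily_rate(100_000, 6 * 7)
print(words_per_day)  # 2381

# At least 1000 flagged articles at 50 or 100 per day.
print(days_needed(1000, 50), days_needed(1000, 100))  # 20 10
```

Seeing roughly 2,400 words a day written down is exactly the kind of working backwards from the goal that keeps you from being overambitious.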

So, to sum up:

  • Beeminder forces you to think backwards from your goal if you have a specific endpoint in mind. This is extremely valuable as it makes sure you're not being overambitious.
  • Beeminder gives you accountability. It keeps you honest. This is what I initially found was most valuable, but later on I needed this less. YMMV.
  • The community of Beeminder users is wonderful. The forum is a great place to get ideas, discuss approaches / failures etc.
  • It works! Many people have had great results using Beeminder.

I'm not going to say I couldn't have written my PhD without Beeminder, but I'm almost saying it. Go check it out!

Learn all the districts of Afghanistan with Anki!

A friend was asking about using Anki to learn to recognise the districts of Afghanistan, so I made her a deck that tests you in the following way:

On the front of the card the question is presented along with a computer-generated audio pronunciation of the district name:

Then if you know it, you'll answer Badakhshan and then you'll click/tap through to the next screen to see if you got it right. You'll see this:

 
 

Then you can mark whether you got it right or not. There are around 400 districts to learn, so if you learn 13-15 new cards each day you'll finish the whole lot in a month.
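If you want to check the "month" estimate for a different pace, the arithmetic is simple enough to sketch in Python. The 400 figure is the approximate deck size mentioned above and the function name is my own. Note that this only counts the days needed to see every card for the first time; Anki will keep scheduling reviews after that.

```python
import math


def days_to_finish(total_cards, new_cards_per_day):
    """Days needed to introduce every card at a fixed daily pace."""
    return math.ceil(total_cards / new_cards_per_day)


# Roughly 400 districts, at 13, 14 or 15 new cards per day.
for pace in (13, 14, 15):
    print(pace, days_to_finish(400, pace))
# 13 31
# 14 29
# 15 27
```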

Why learn all the districts of Afghanistan? Sometimes you'll hear someone talking about a particular place or part of the country, and without knowing which province they're talking about you might not understand the context or the conversation. Plus, a little bit of geography never hurt anyone.

Give it a try. And let me know if you manage to complete the deck. You can download the full Anki file here. Enjoy!

Upcoming Maniac Week

 

 

I hereby commit to doing a maniac week. This is inspired by Nick Winter and the good people at Beeminder, namely Bethany Soule and Daniel Reeves. The idea is as follows (borrowing heavily from a format over here):

  • I will begin at 6am on Sunday December 6.
  • I will continue until 6pm on Saturday December 13th.
  • I will not be checking my email at all during the week. I will also be turning off and/or disabling all chat programmes and my phone.
  • I will not use any social media websites or check RSS news. (This block will be handled by the StayFocused plugin and RescueTime’s Get Focused mode.)
  • I will ensure I am in bed for 7 hours every night. This will be tracked via Fitbit.
  • I am allowed 3.5 hours every day for things which aren’t work (showers, preparing meals, eating, rest, meditation and walks outdoors). This will be tracked using TagTime using the tag “notwork”.
  • The remaining time will be for my work. This will be tracked using TagTime and RescueTime, and my main focus during this week will be my PhD dissertation.
  • As with others’ maniac weeks, I’ll be recording the whole week using time-lapse photography, though I’ll see how much hassle it is to assemble a video after the week is finished. Also, part of my work will involve me away from the computer, writing and outlining things by hand, and anything involving interview transcripts etc will obviously have to be blurred out or blacked out. Thus, I’m not committing to posting a video, but I will publish a post-maniac-week blogpost during the week that follows.
  • I reserve the right to tweak these rules (by editing this post) up until the evening of December 5. After that point it’s time to work, and I cannot change the rules any more.

No, I am not crazy. Yes, you can do one too.

Four Colours

 
 

A few years ago, I read a book that changed the way I took notes. That book was “How to Make a Complete Map of Every Thought You Think” by Lion Kimbro. Thanks to my podcast, Sources and Methods, I had the chance to chat with Lion a few weeks ago. The episode will be out in November but I wanted to share one of the ideas that I’ve found most useful. He wrote about it in his book and we discussed it again on the podcast.

It involves taking notes with a four-colour pen. I’m talking about pen-and-paper here, not digital notes, though I suppose it might work there too with some tweaking. You use a different colour to ascribe different meanings to your notes. Thus, quoting from his book:

RED: Error, Warning, Correction
BLUE: Structure, Diagram, Picture, Links, Keys (in key-value pairs)
GREEN: Meta, Definition, Naming, Brief Annotation, Glyphs
BLACK: Main Content [p. 26]

Most notes will thus be in Black, but other things can stand out by sticking to the system outlined in the quote. It takes a bit of getting used to, including sticking up reminders on walls showing the colour scheme, but after a week or two it’s instinctual and really helps when revisiting notes at a later date.

Lion shares lots of other note-taking tips in our podcast, which I’ll post here when it’s out.

A short practical tip: ever since reading Lion’s book, I’ve been using Bic four-colour biros, which are quite easy to find in most stationery stores. Lion mentioned a different type, which I’ve now been using for a week or two with really good results so far. It’s the Zebra Sarasa4 model (pictured above). If you want to get into taking notes using four-colour pens, I’d really recommend it.

Ecolinguism and the ethics of learning new languages

 
 

I was interviewed by Tammy Bjelland of the Business of Language podcast a few weeks ago, and the episode recently went live. Readers of this blog will know that I write about the study of language with some regularity – see the archives for some previous posts – but I don’t talk about it a great deal on my own podcast nor is it really the focus of my work. So it was nice to have a chance to talk through my background in learning languages and the challenges of learning languages with few materials available for self-study. There isn’t enough written about this.

It was also gratifying to find a forum to discuss Richard Benton’s ideas about ecolinguism. He wrote a blogpost summarising some of his ideas here:

I am an ecolinguist because I want my work to preserve the complexity of our world’s language and culture ecosystem. How do you create a strong community made up of hardened, poor refugees and rich, privileged natives? The privileged must work hard to create new connections. In middle school, the band geek or math nerd can’t simply decide to enter the “cool crowd.” Only those with strong social capital can invite in those on the outside.
The strength of our communities depends on the decisions of the privileged and the powerful. When insiders opt to forgo their comfort to commune with those who go without, they unite communities who would be isolated. When a well-educated privileged professional chooses to learn a language, for example, he forgoes his advantage in communicating in way where he feels most comfortable. The white Minnesotan, speaking elementary, broken Somali, puts the outsider, the refugee, in the position of power. Struggling to learn this difficult language allows new connections to grow.

The choices we make as to which language to learn next have a broader impact beyond our own lives. For the full discussion, visit Tammy’s website to listen to the full episode or subscribe via your preferred podcast client.

UPDATE: I now offer one-on-one language coaching. Read more about what it involves and what kinds of problems it's best suited to addressing.

Arabic Language Update: I did it! (Almost)

 
My Beeminder accountability graph showing how I reached my goal for a study challenge in June


Just a short post as I'm off away on an intensive language course for most of the next three months. This is the programme run by Middlebury College, but held in Oakland, California (USA) at Mills College. I was extremely lucky to win a Kathryn Davis Fellowship which covers the costs of the course and food and accommodation while I'm there. I have a BA degree in Arabic and Farsi from London's School of Oriental and African Studies, but 10 years in Afghanistan spent writing books and studying Dari and Pashto meant that my Arabic has atrophied considerably. I thought it was time to resurrect those old skills, in part as a way of deepening my understanding of some of the religious aspects of the Afghan Taliban and in part -- let's be honest here -- as a way of covering my bases prior to Afghanistan completely falling off the map a few months from now.

I'll be writing a much longer post after the course finishes on how to get out of the well-known "intermediate language plateau" when you are at a high-beginner to mid-intermediate level, specifically focusing on what resources are available to Arabic-language students who have good basic skills but want to go beyond that to more advanced materials. (Read these three posts for more on getting out of language plateaus in general terms.)

The Middlebury course caters to various levels of language ability, and since I didn't want to waste the opportunity just revising things I had already learnt at university, I had to do a good deal of preparatory work these past few months. I started getting serious about this preparation in February. This involved over 75 hours of spoken/conversation practice (and some grammar work) with a number of different native Arabic speakers over Skype (lessons made possible through iTalki.com), as well as a lot of reading and listening. In June, as you can see on the Beeminder graph displayed above, I challenged myself to get 100 hours of exposure to the Arabic language over a period of 30 days; this included some iTalki lessons, but was also a lot of listening to Arabic-language podcasts, time spent writing on lang-8 and lots of time spent doing so-called "extensive reading" (much more to follow on that in August/September). I managed 99.5 hours, in total, just short of the total required to successfully complete the challenge I'd set myself, but enough to really make my language proficiency come along in leaps and bounds.

An additional note to those who would like to get in touch with me during this period: as part of the Middlebury course, they expect participants to take a language pledge where you only speak the language of study (i.e. Arabic for me) for the duration of the period of study. Read more here. For non-Arabic speakers, if you want to get in touch with me, please visit Google Translate and translate your message into Arabic there before copying the full text and pasting that into the email. It's not perfect, but it allows me to continue to stay connected with the world without violating the language pledge. If I reply, I'll be doing that in Arabic, too, so you'll have to copy the text back into Google Translate to get a sense of what I replied.

I'll be away on the course until the end of August, and will thus ignore all non-essential email until then. If you write to me in English, I will also ignore your email until September. Thank you.

How I use Goodreads to pick what I read

So far this year, I have read 35 books. I'm trying something new for 2015 so I thought I'd write up the outline in case someone else finds it useful. As I wrote at the end of last year, I'll be reading 150 books over the course of 2015. That's fifty books more than I read in 2014. The point of it is to expose myself to lots of different ideas, different styles, different perspectives. I've found that 150 probably isn't an impossible amount to be reading (less than three a week) and I really relish brushing up against interesting authors and ideas.

I've used Goodreads as a way of tracking what I read for a long time now. I'm lucky enough to have an interesting group of 'friends' who also use it (more or less regularly) so there's usually a decent amount of new or niche books that I discover that way. I also use it as a way of noting down the books I want to read in the future. (Incidentally, I've never really had a problem in finding something new to read. The list of books I want to read will always be larger than the time I have to read them. That's just life.)

Goodreads offers a 'list' function whereby you can not only state that you 'want to read' a book, but where you can categorise things to your heart's content. Each year I set up a list ("2015toread" and so on) so I can see which books I think I'm more motivated to read that year. I'll usually take 5 or 10 minutes each week checking over the list to make sure the things I added to the list are actually things I still want to read (versus things I added in the heat of a moment, after reading a particularly persuasive review, for example, but which I probably don't need to spend my time on).

Previously, I was generally following my gut with what I wanted to read next. Unfortunately, this often meant I went with the easiest option, or the path of least resistance. Long books (weighty histories, or more abstruse theoretical texts) would be passed up for the latest *it* novel or someone's entirely forgettable memoir about their time in Afghanistan that I felt obliged to read.

This year I've been trying a different approach. Goodreads allows you to sort lists by various bits of metadata attached to each book (author name, date added etc) but you can also sort by "average rating". This is the average rating given to a particular book by the entire Goodreads user base (20+ million users). You can see how this pans out in my current set of 'up next' books:

 
 

This "average rating" isn't in any way a guarantee of anything resembling quality. It's not that hard for authors to game the system, and books with few reviews (common for niche subjects like Afghanistan or Islam) have either really high or low ratings. But I'm finding this approach brings me to read far more books outside my path-of-least-resistance choices and often brings me into contact with some real gems.
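Goodreads does the sorting for you in its own interface, but the idea is easy to sketch in Python if you want to play with it on your own list. Everything here is an assumption for illustration: the titles and numbers are made-up placeholders, and the min_ratings filter is my own addition (not a Goodreads feature) to show one way of setting aside books whose averages rest on only a handful of ratings.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    average_rating: float
    ratings_count: int


def up_next(books, min_ratings=0):
    """Sort books by average rating, highest first.

    Books with fewer than `min_ratings` ratings are left out.
    """
    eligible = [b for b in books if b.ratings_count >= min_ratings]
    return sorted(eligible, key=lambda b: b.average_rating, reverse=True)


# Placeholder data, not real books or real ratings.
shelf = [
    Book("Placeholder A", 4.10, 12000),
    Book("Placeholder B", 4.85, 3),
    Book("Placeholder C", 3.70, 450),
]

print([b.title for b in up_next(shelf)])
# ['Placeholder B', 'Placeholder A', 'Placeholder C']
print([b.title for b in up_next(shelf, min_ratings=50)])
# ['Placeholder A', 'Placeholder C']
```

The second call shows how a niche book with three glowing ratings drops out once you ask for a minimum number of ratings.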

Needless to say, this method of discovery is only a little better than putting all the names in a hat and picking one at random, but I am still finding some benefit. It does mess with my desire to read fewer male authors (you'll note in the picture above that only book number seven is by a woman; the rest are men) but everything in life is a tradeoff of some sort, I suppose.

Let me know if you find this useful or if you have other ways of picking what to read next.

Note-Taking Jujitsu, Or How I Make Sense Of What I Read

Note-taking is a problem. It's an interesting problem, but still a problem. Many people have switched over from paper books to digital copies. I am certainly one of the early adopters in this trend, having wrangled Graeme Smith and his sister into facilitating a first iteration of Amazon's Kindle to be delivered to my house in Kandahar.

My colleague Felix Kuehn and I used Kindle versions of books heavily in our research for An Enemy We Created. Using those references in footnotes was difficult at the time: the format was so new that established footnoting styles (APA/Chicago etc) hadn’t developed standards for referencing Kindle documents. All this was made harder by the fact that Kindle copies of books abandoned page numbers for ‘Kindle location numbers’. This changed a few years later, and current users probably won’t have this problem, but if you go look at the footnotes for An Enemy We Created, you’ll still find that many, if not most, of the references are to Kindle locations and not page numbers. In fact, I think our book was probably the first serious history work to rely so extensively on digital Kindle references in the footnotes; I remember having discussions with our publisher about it.

 
 

All this isn’t to say paper copies don't have their uses, and some books just aren't available in digital format. I'll get into the workaround for that later. The best way to make this less of a problem is to gently nudge publishers to issue their books in Kindle format.1 But I am already getting off track.

All this seemed to come to a head this past week, when a podcast I hosted together with Matt Trevithick took up the topic of notes and note-taking. Mark Bernstein, our guest on the show, wrote a really excellent book on the topic some years ago entitled The Tinderbox Way. I’d strongly recommend you read it if you’re involved in knowledge work in any way. Here’s a short excerpt defining the importance and use patterns for notes:

“Notes play three distinct roles in our daily work:

• Notes are records, reminding us of ideas and observations that we might otherwise forget.

• Shared notes are a medium, an efficient communication channel between colleagues and collaborators.

• Notes are a process for clarifying thinking and for refining inchoate ideas.

Understanding often emerges gradually from the accumulation of factual detail and from our growing comprehension of the relationships among isolated details. Only after examining the data, laying it out and handling it, can we feel comfortable in reaching complex decisions.”2

Later in the week, Maria Popova (of Brainpickings fame) was on Tim Ferriss’ podcast to talk about her website, her reading and her workflow. Both Tim and Maria expressed frustration over the lack of tools for people wanting to download and interact with their Kindle clippings:

“I highlight in the kindle app on the iPad, and then Amazon has this function that you can basically see your kindle notes on the desktop on your computer. I go to those, I copy them from that page, and I paste them into an Evernote file to have all my notes on a specific book in one place. But sometimes I will also take a screengrab of a kindle page with my highlighted passage, and then email that screengrab into my Evernote email, because Evernote has, as you know, Optical Character Recognition, so when I search within it, it’s also going to search the text in that image. I don’t have to wait till I’ve finished the book.

The formatting is kind of shitty in the kindle notes on the desktop (…) if you copy them, they paste into Evernote with this really weird formatting. (…) It’s awful. If you want to fix it you have to do it manually within Evernote. (…) There is no viable solution that I know.”[3]

She then goes on to some more detailed points of how this doesn’t work, and Tim commiserates, suggesting that maybe they should hire some people to fix this problem. But the good thing is that there are solutions. The problems Maria and Tim bemoan are things that every other Kindle user has had to deal with since day one, so thankfully there are a number of workarounds that simplify the process of reading, annotating and sifting through one’s notes on a book or document.[4]

So notes are important, we get that. But how do we use them to their utmost? How do we even gather them together and store them? How do we use them for our writing, for our thinking? These are all important questions which I don’t feel have been properly answered, and where those answers have been given, they’re buried or hidden somewhere out on the internet.

I want this post to get into the weeds about how to get your materials off a Kindle device, how to store them usefully on a Mac (my apologies, PC/Linux users), and how to repurpose those notes to be creative, to write, and to think.

This post has three parts:

  1. Storage
  2. Clipping & Splitting
  3. Discovery & Meaning

It will by necessity be an overview of some useful tools and options for researchers, but if you leave comments I can probably expand on individual points/sections in follow-up posts if needed.

1. Storage

This is a problem that wasn’t explicitly raised in the things that motivated this post, but it’s something I get asked frequently. Maria and Tim both seem to be avid Evernote users, and I know many others also use this, but there are other options. It’s worth starting here because the tools will determine what you can do with your notes.

I’ve offered advice to other Mac users on what software to use for research projects that require a certain deftness in handling large quantities of sometimes disparate materials. The same applies to people who are just trying to keep track of the things they read, trying to draw together connections, and to derive meaning from it all. I’ll get into the meaning-creation in the final section, but for the moment, let me briefly describe the four options for file/note storage as I see them.[5]

  1. Finder/PathFinder. This is the lowest-tech option. Basically, once you split your files up (see section two) you store them in folders and refer to them that way. I don’t find this option very attractive or useful, because it’s like a filing cabinet. Your ability to discover connections and to remember what’s in those folders is pretty limited. I don’t recommend this at all, but from conversations with other researchers and writers, it seems this is the default option.
  2. Evernote. I include this here because it’s part of a workflow that we’ll cover later on. Evernote is great for all the reasons you can read about on their site. It syncs across all your mobile and desktop devices, it OCRs images so you can search for text captured inside photos you upload into your library of notes.
  3. DevonThink. This is my default ‘bucket’ for information, documents and notes. You can read up on the many (MANY) things that DevonThink Pro Office or DTPO (the version you should get, if you’re getting this software) does. Not only does DTPO store your documents, but it allows you to access that information in a number of extremely useful formats. There is a mobile app, too, though it could do with a bit more work. The most interesting feature of DTPO is its search and discovery functionality (using some magic sauce algorithms). They don’t make as much of this on their website as they used to, but I’d strongly recommend you check out these two articles (one, and two) by Steven Berlin Johnson which explain a little of the wonderful things DevonThink can do for your notes. As with the next recommendation, it’s not cheap. But powerful doesn’t always come cheap. It’s a solid investment if you spend the time getting to know this piece of software.
  4. Tinderbox. I discussed this at some length on the Sources & Methods podcast with Mark Bernstein, so I’d recommend you listen to that as your first port of call. Tinderbox is not an everything-bucket in the way that Evernote and DevonThink are, and I use it slightly differently, but it’s a great place to actually do the work of thinking, organising and writing once you have something (i.e. a project of some sort) for which you want to use all your notes. I’ll explain more about this in section three.

I’d recommend getting to know the different bits of software to get a sense of what they can do. DevonThink has a handy section of their website where you can see how people use it in their work lives. Tinderbox has something similar, with some case studies of usage.

For DevonThink, it’s generally good to keep your ‘buckets’/databases of files separated by topic. I have a mix of these kinds of databases (50 in total): some are country-specific, some are project-specific (i.e. to contain the research that goes into a book or a long report), and some are topic-specific (e.g. I have one for clippings and notes relating to Mathematics, one for things relating to Cardiology etc). I’d also recommend you give Steven Berlin Johnson’s book Where Good Ideas Come From a read, particularly chapter 4.

Given the learning curve with some aspects of the workflow that follows, you might want to consider introducing these pieces of software one-by-one, or as needed. That way you’re using only what you understand and can implement things without being too overwhelmed by the novelty of the systems. It took me years (almost a decade) to implement and iterate the systems described below, and I’m still not finished modifying as the tools change.

2. Clipping & Splitting

This section is all about getting materials off mobile devices and onto your computer where you can put them into some sort of overarching database.

Accessing Your Amazon Kindle Clippings

First let’s sort out how best to get notes from a Kindle onto your Mac. Don’t use Amazon’s website. It’s going to create all sorts of problems for you in terms of formatting.

First things first: sync your Kindle to the cloud. Just turn on the wifi/3G and select the “Sync” option. This will ensure all your highlights are backed up to the cloud.

Then plug your Kindle into your computer via USB. Then go into the “Documents” folder, and search for a file called “My Clippings.txt”. If you’ve been using your kindle for a while, it’s probably going to be quite large. Nevertheless, copy that file to your desktop. Feel free to eject your Kindle from your laptop now. We won’t be needing it any more.

 

An example of what you might see when you open your "My Clippings.txt" file

 

If you open the txt file that is now saved to your desktop, you’ll find all your clippings and annotations preserved in a useful plaintext format. This may solve your problems straightaway, in which case, congratulations: you now have all your annotations in a useful format that you can use however you wish.

If you want to take it to the next level, though, you’ll want to split this file up. At the moment, you have a very large plaintext file which contains all your notes. You’re likely to have notes from a wide variety of topics and books in here, so it doesn’t make sense for you to keep them all in a single location. The ideal solution is for you to have a single file for every clipping, a single file for every annotation.[6]

This is where Split-ter.scpt comes in. I’m afraid I don’t know who to credit for this wonderful piece of code. I downloaded it somewhere on the internet some years back and can’t seem to find a link to the author either in the code or elsewhere online. (Whoever you are, thank you!)

This script works with another piece of software mentioned above: DevonThink Pro Office. For now, I’ll ask you to ignore that bit, and focus on what’s happening to the file. I use the script to convert our “My Clippings.txt” file into multiple files. It goes in, finds a delimiter (any piece of text or syntax that repeats itself in the original file) and creates a new note/file every time it comes across this delimiter. In this way, you’ll quite quickly go from the file shown above to something like this:

Now you have a note for every annotation and/or clipping. This is then something you can dump into Evernote, or keep in DevonThink. Again, more about the difference between these programmes in the next section. (Note that you can also use Tinderbox’s “Explode” tool to split up the “My Clippings.txt” file.)
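If you don't have the script to hand, the splitting step described above can be sketched in a few lines of Python. This is not the Split-ter script, just a minimal illustration of the same idea. It assumes that the delimiter between entries is a line of ten equals signs (check your own file and adjust if it differs), and the function names and output file names are my own inventions for the example.

```python
from pathlib import Path

# Assumption: entries in "My Clippings.txt" are separated by a line of
# ten equals signs. Open your own file and adjust this if it differs.
DELIMITER = "=========="


def split_clippings(text, delimiter=DELIMITER):
    """Return one string per clipping, dropping any empty chunks."""
    chunks = text.split(delimiter)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def write_notes(entries, out_dir):
    """Write each entry to its own numbered .txt file and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for number, entry in enumerate(entries, start=1):
        # Hypothetical naming scheme: clipping_00001.txt, clipping_00002.txt ...
        note_file = out_path / f"clipping_{number:05d}.txt"
        note_file.write_text(entry + "\n", encoding="utf-8")
        written.append(note_file)
    return written
```

To use it, read the copy of the file on your desktop into a string (opening it with the "utf-8-sig" encoding strips any byte-order mark at the start), pass that string to split_clippings, and hand the result to write_notes along with a folder name. The resulting folder of small text files can then be dragged into DevonThink or Evernote.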

UPDATE (a little later on Friday night): Seb Pearce has just let me know that there are other options available for dealing with the 'My Clippings.txt' file. Check them out on his site.

The second problem raised on the Tim Ferriss podcast was Amazon’s limitations for clippings. This differs from publisher to publisher, it seems, so there’s no way of predicting it. An unfortunate rule of thumb: the more useful the book, the more likely the publisher has locked it down. When you’re making clippings inside the book, Amazon gives you no notification that you’ve reached the book’s limitations. But when you go to check your “My Clippings.txt” file to start using your notes, then you may find the note says:

"<You have reached the clipping limit set by the publishers>"

All the work you’ve done selecting pieces of text is for nothing, it would seem. The publisher has prevented you from using your book.
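Since the Kindle gives you no warning, it can help to check your exported clippings for this message before you rely on them. The following is a rough sketch only: it assumes you already have your clippings as a list of strings (one per clipping, however you split the file), and that the blocked ones contain the wording quoted above. The function name and the marker constant are made up for the example.

```python
# Assumption: blocked clippings contain this phrase, as in the message
# quoted above. Adjust it if your file words the message differently.
LIMIT_MARKER = "You have reached the clipping limit"


def find_blocked(entries, marker=LIMIT_MARKER):
    """Return (index, first_line) for every entry containing the limit message."""
    blocked = []
    for index, entry in enumerate(entries):
        if marker in entry:
            # The first line of an entry is normally the book title,
            # which tells you which book has hit its limit.
            first_line = entry.strip().splitlines()[0]
            blocked.append((index, first_line))
    return blocked
```

The list it returns tells you how many clippings were blocked and in which books, which is the number you need for the workaround described further down.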

One solution is to remove the DRM from the book before you put it on your Kindle. This is legal so long as you’re not sharing the book with other people (as this process would theoretically allow you to do).[7] Follow this link to find out how to de-DRM your Kindle and iBooks documents. You can also visit libgen.org to download an already de-DRMed copy of the book you’ve purchased. These will often be in .epub format, so you’ll have to convert them to .mobi format if you want to use them on your Kindle device. (To convert from .epub to .mobi, use the free, cross-platform Calibre software.)

If you read a de-DRMed copy of a Kindle book on your Kindle device, there will be no limitations as to how much you can annotate. The publisher’s limitations will all be gone. So that’s one option.

If you aren’t comfortable removing the DRM on your books, you can still get all your annotations out, but it comes with a little bit of hassle.

Here’s an example of what I mean (screenshot from my DevonThink library). I was reading in Hegghammer’s excellent Jihad in Saudi Arabia and making highlights (at 4:06am, apparently) but at some point I hit the limit imposed by the publisher.

 
 

The workaround to bypass this limit from the publisher is to first export all your notes out of your “My Clippings.txt” file, so that all your clippings are saved, even though some of them may not have worked. Let’s say, for the sake of argument, that the final three notes aren’t working because of the publisher’s limitations. That’s the case in the screenshot above. What you do (again, once you’ve backed up the clippings txt file) is delete three of the earlier clippings that you already have. Then you sync your Kindle to the server and it will think that you have clipped three fewer quotes, so the blocked ones will then become available (both in the “My Clippings.txt” file and on the website). Like I said, it’s a bit fiddly. I would much rather remove the DRM completely and not have this hassle at all, though when you do that Amazon will not sync your clippings to the cloud and to their kindle.amazon.com database. You’ll have to export them using the tools I mentioned above.

Keeping Up With The Joneses, or How to Use Instapaper to Clip Web Articles

This may be something completely idiosyncratic to my own workflow, but I don’t enjoy reading articles in a web browser. I’d also prefer not to be hijacked into reading all these articles. For instance, when I’m in Tweetbot/Twitter or Facebook and I see a link that I like, I will almost never read that article then and there. Rather, I’ll send it to my Instapaper queue.

First, a quick word about Instapaper vs Pocket. I use Instapaper. I started off with them, switched over to Pocket for about two years, and now I’m back with Instapaper. They’re both more or less the same. Instapaper happens to be what I’ve chosen for myself because of their handy Kindle service. (If you have articles in your queue, you can have Instapaper send the most recent articles to your Kindle at a particular time, e.g. first thing in the morning, which you can then clip and archive to your heart’s content.) Both Pocket and Instapaper work with what follows, so just pick one and stick to it. I’d recommend Instapaper because they allow for the sharing of the full texts of articles and because of the Kindle digest feature.

I find I have so much to stay on top of and keep tracking online that I can’t just click around and read things as and when I see them. I set time aside for reading my Instapaper queue (and for reading books on my Kindle) and only read during those times. (I do the same with email, only checking and responding to email between 12-1pm and 5-6pm each day. The rest of the day email is off and disabled. I even deleted my email account on my iPhone, as inspired by this medium.com post.)

My workflow with web articles is to follow as much as possible via RSS. I prune the sites I’m following every three months, but in general the number is stable around 650. I use Newsblur as my RSS reader, and every time I find an article I’d like to read (later), I use the handy ‘send to instapaper’ bookmarklet. This sends the article to my Instapaper queue.

The same goes for Twitter. I follow enough people on Twitter for it to be impossible for me to read every post that passes through my stream. I will dip in once or twice a day, however, to see what people are saying. I use two services to monitor my Twitter and Facebook streams and pick out the most-shared articles, to ensure that I don’t miss the big stories of the day. They’re both free, and I’d strongly recommend signing up and getting their daily summaries of what people were talking about on Twitter that day. News.me has been around for a while and I trust their article selection. Nuzzel is newer, but it seems to have a few more options. I guess you could probably do with picking only one of the two.

After reading an article on your Kindle (or sometimes on a mobile device like an iPad or iPhone), you can clip it if you want to save it (just like making a clipping inside a book, only the entire article is saved).

 

This is what you see in an article when you click to "Clip This Article" on a kindle...

 

Then your clippings will be captured in the “My Clippings.txt” file as explained above and you can export them directly to DevonThink or Evernote or Tinderbox. (The main downside to doing things this way is that when the Kindle clips an article, all formatting is lost, including paragraph breaks.)

Alternatively, you can ‘Favourite’ the article. I use this setting because it then sends the article and URL to my @stricklinks twitter account, something I created to share the best things I was reading. It also saves the full text of the article to Pinboard (a service I’ve already written about on my blog here) and to Evernote. (I use If This Then That to facilitate this.)

Once I’m done reading, I can go into Evernote and all my articles are waiting for me to be sorted through. Because I use DevonThink as my everything-bucket, and because all the sorting and discoverability features are there, I have a separate stage of exporting my notes out of Evernote into DevonThink. I’ve already probably taken you a little too far down the rabbit-hole of my workflow, but this is an important stage because otherwise you can’t do anything with your notes.

Luckily, someone has written a script which makes this possible. Many many thanks to the good people at Veritrope for updating the script every time updates to the software get released. It’s fairly self-explanatory. You select the notes that you want to export, choose which DevonThink folder you want to export to and then it goes to work. It can occasionally be buggy and stop half-way through, but usually a little trial-and-error will let you pinpoint which Evernote note is causing the problem and you can transport that one over manually.

I usually do an export session to bring everything from Evernote into my DevonThink inbox once a week. This way the number of clippings doesn’t get too out of control, and I’m not constantly playing around with this during the week. You might find all this overkill, but it has become an essential part of my workflow for storing the various things I’m reading on a daily basis.

Pillaging Your Hard Copies, AKA Living the Paperless Dream

You may have hard copies of books that you want to use as part of this system. One way to use them is to scan the books into your DevonThink library. DevonThink Pro Office comes with an OCR package (via ABBYY FineReader) so whatever you scan can then become searchable and useful.

In the past, particularly with books I’ve purchased in Afghanistan and Pakistan that are unlikely (read: never) to be made available as electronic versions, I have taken a Stanley knife to the bindings and fed the pages into my ScanSnap scanner, which scans both sides and compiles all the scans into a single PDF document that is searchable on my laptop. The whole process is destructive of the book, but it gives the text inside a new life. Given how fast the new ScanSnap models work (around 25 pages per minute, both sides), this is an attractive way to get digital access to materials that are only available in paper form.

You can highlight text within the resulting PDFs and then later export your clippings from those PDFs as notes into DevonThink. There’s another useful script to help with that. It only works with the free Skim PDF reader, but that’s my default PDF reader so it works out well.

For more on paperless workflows, check out David Sparks’ Field Guide on the topic.

3. Discovery & Meaning

If you made it this far, congratulations. This is the section where all the fiddling with export starts to take on some meaning. After all, we’re not reading and exporting these notes purely because we are hoarders or to fetishise the art of collection (though in some cases, that may be what’s going on). No, we are taking notes because we are trying to understand difficult topics, because we are trying to solve important problems.

Discovering Links and Connections

The Steven Berlin Johnson articles referenced earlier are an essential first stop, particularly in demonstrating how DevonThink can add some serendipity into how you use your individual notes. To give you an example of how this works, here’s a screenshot from my ‘TalQaeda’ database that I put together while working on An Enemy We Created:

 
 

In the upper part you can see a bunch of notes relating to the Haqqani family. The lower left part is the contents of a note (Note: exported from Instapaper). The bottom right list of documents (under “See Also”) is a list of notes that may be related to this particular quote. This is the magic algorithmic sauce I mentioned earlier that makes DevonThink so powerful.

If I click through to some of those suggested notes, I’m taken to similar quotes on the same topics, two PDFs of reports (of dubious analytic value, but that’s a separate issue), and three clippings from Kindle books where people are making reference to the relationship between the Haqqanis and al-Qaeda (the subject of the original note). Note that I didn’t have to pre-tag documents for this ‘See Also’ functionality to work its magic. It analyses the text itself and makes its suggestions based on the similarities it identifies. (Needless to say, it’s not simply a matter of matching individual words. Some of the suggested notes don’t mention al-Qaeda or the Haqqanis by name, but they are implied; DevonThink catches all of this.)

Once you start to build up a decent database of notes (my Afghanistan database has just under 65 million words of notes, including 12,800+ PDFs) this ‘See Also’ functionality really allows for some unexpected links to be made, especially when you’re at the stage of writing up a project/book. One note will lead to another note, which will lead to another note. If you follow these trails of notes (like breadcrumbs) you can develop a pretty idiosyncratic picture.

I do not know of a manual method which allows for this kind of process.

DevonThink has an extremely robust search function which allows you to find things along similar principles (including a very useful ‘fuzzy spelling’ option, perfect when checking my database for notes on someone whose first name could be spelt Mohammad, Muhammad, Mohammed or any of the other variations).
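To give a feel for what fuzzy matching on names involves, here is a small Python sketch. I have no idea how DevonThink implements its ‘fuzzy spelling’ option, so this uses the difflib module from Python's standard library as a stand-in; the function name and the 0.7 cutoff are arbitrary choices made for the illustration.

```python
import difflib


def similar_names(query, candidates, cutoff=0.7):
    """Return the candidates whose spelling is close to the query.

    Not DevonThink's algorithm: this relies on difflib's similarity
    ratio, and the cutoff of 0.7 is an arbitrary illustrative threshold.
    """
    # Compare in lower case, but remember the original spelling.
    by_lowercase = {}
    for name in candidates:
        by_lowercase.setdefault(name.lower(), name)
    matches = difflib.get_close_matches(
        query.lower(),
        list(by_lowercase),
        n=max(1, len(by_lowercase)),  # return every match, not just the top few
        cutoff=cutoff,
    )
    return [by_lowercase[match] for match in matches]
```

Searching for "Mohammad" in a list that contains "Muhammad", "Mohammed" and "Ahmad" should return the first two but not the third. Lowering the cutoff casts a wider net at the cost of more false positives.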

Figuring Out What It All Means

Once you have an idea of the outlines of the topic, once you’ve been taking notes for a while, your database in DevonThink is probably starting to fill with useful information.

If you’re writing a book, though, you’ll want to start writing alongside this gathering process. (Check out Michael Lopp’s overview of the process of writing a large research book, which, to my mind, is fairly accurate.)

I don’t find DevonThink a particularly pleasant place to write, so I do that elsewhere. Before I write things out in long form, I usually do some outlining, particularly if it’s something where the dense collection of factual detail is important to the development of the argument (as was the case with An Enemy We Created). For this, I find Tinderbox indispensable for working up an overview of what I know, for figuring out how I’m going to structure it, and for helping me put together my first draft.

Tinderbox can display notes in a number of different ways. You can view your documents as outlines, as maps, or even as timelines:

 
 

In this image you can see the information arranged as an outline, but here (below) you see the same information organised as a map (mirroring the actual layout of the map of those districts in a particular part of Kandahar):

 
 

Just to show you that it can handle complexity, here’s a map created by Felix to help him figure out how people involved in militant Islamism were/are connected across different geographical sectors:

It's complicated...

I’ll often use Tinderbox maps to store outlines for how I’ll write a particular section or chapter, making notes inside the document, dragging quotes in from DevonThink to supplement the argument that’s being constructed.

Getting to the point where you can actually start writing on the basis of your notes is the whole point of all of this. Technology is useful, but mainly when directed at a specific problem or goal. All the tips, tricks and software described in this post have helped me write books, reports and (coming soon!) even my doctoral thesis/PhD. I have encountered only a few (barely a handful) researchers who use their computers for this collation, sifting and discovery process. There’s no way to keep it all in your head. Here’s hoping more people start adopting these tools…

Footnotes:

  1. For many years, Amazon offered users the ability to let publishers know that you wanted to see title X or Y in Kindle format, but they failed to make this piece of interaction useful by keeping track of what you'd requested of publishers (so as then to be able to let you know when it was finally released in Kindle format).
  2. Excerpt From: Mark Bernstein. “The Tinderbox Way.” iBooks.
  3. Selective transcript from around the 50-minute mark in the podcast audio. Needless to say, the rest of this blogpost constitutes a ‘viable solution’.
  4. Most of these are derived from other people, I should say. I try to give credit where I can, but sometimes I can’t remember where I first read something or who first recommended a particular tool or trick.
  5. Yes yes, I know, I’m going to leave out some mentions for useful software here. This is an overview, and I’m just trying to describe some options for what might work in certain situations.
  6. A clipping is when you have selected and copied a passage from the book for safe-keeping, and an annotation is when you yourself write a note connected to a particular passage.
  7. Needless to say, don’t take legal advice from me.