Productivity

My new book: The Taliban Reader

 
 

My new book is out (finally). The Taliban Reader is in some sense the culmination of years of work to drive studies of the Taliban back to primary sources. Some of this work was accidental; more recently it has been more purposeful. The book, which I produced together with Felix Kuehn, is long and detailed.

Comments and feedback prior to publication were extremely positive. It'll presumably take a while for independent reviews to start coming in, but I look forward to the feedback and to whatever conversation it generates.

You can pick up a copy at any good bookshop or from Amazon here.

Tweeting to the Void

I've previously written about how I turned off Facebook's news feed. I keep an account with Facebook because people occasionally contact me there. It is also an unfortunate truth that many companies in Jordan (where I live) and in the wider Middle East are present only on Facebook rather than having their own website. (Why they insist on doing this baffles me and is perhaps a topic for a future post.)

I have long preferred Twitter as a medium for filtering through or touching -- however obliquely -- things going on at any particular moment. I have no pretensions to actively follow every single tweet to pass through my feed. Rather, it's something I dip into every now and then.

Increasingly in recent months, I have found myself growing dissatisfied with the pull it often has on me. It has become something of a truism to state that 'Twitter isn't what it once was', but there's less and less long-term benefit in following discussions as and when they happen.

RescueTime tells me that I spent 86 hours and 16 minutes on Twitter in 2017 -- just under a quarter of an hour each day. That feels like a lot to me.
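For anyone who wants to check the daily figure, here is a minimal sketch of the arithmetic in Python. It assumes the total is spread evenly over all 365 days of 2017; the variable names are mine and have nothing to do with RescueTime itself.

```python
from datetime import timedelta

# Total time on Twitter in 2017, as reported by RescueTime.
total = timedelta(hours=86, minutes=16)

# Assumption: spread evenly across the 365 days of 2017.
per_day = total / 365
minutes_per_day = per_day.total_seconds() / 60

print(round(minutes_per_day, 1))  # 14.2
```

That works out to a little over 14 minutes a day, which is where "just under a quarter of an hour" comes from.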


Enter 'Tweet to the Void'. This is a Chrome extension. (For Firefox and other browsers, I have to imagine things like this exist.) When I visit twitter.com, the feed is not visible. All I see is somewhere to post a tweet if that's what I want to do. (There is still some value in posting blogposts and articles there, since I know some people don't use RSS.) Of course, I can always turn off the extension with ease, but adding this extra step has effectively neutralised Twitter for me.

Try it; see how you feel about having something standing in the way of your social media fix. Let me know how you get on.

Tabula for extracting table data from PDFs

Have you ever come across a PDF filled with useful data, but wanted to play around with that data yourself? In the past if I had that problem, I'd type the table out manually. This has some disadvantages:

  • it is extremely boring
  • it's likely that mistakes will get made, especially if the table is long and extends over several pages
  • it takes a long time

I recently discovered a tool that solves this problem: Tabula. It works on Windows and Mac and is very easy and intuitive to use. Simply take your page of data:

A page listing Kandahar's provincial council election polling stations from a few years back. Note the use of English and Dari scripts. Tabula handles all this without problems.

Then import the file into Tabula's web interface. It's surprisingly good at autodetecting where tables and table borders are, but you can do it manually if need be:

[Screenshot: selecting a table in Tabula's web interface]

Then check that the data has been correctly scraped and select a format for export (CSV, JSON, etc.):

[Screenshot: previewing the extracted data and choosing an export format]

And there you have it, all your data in a CSV file ready for use in R or Python or just a simple Excel spreadsheet:

[Screenshot: the exported CSV data]
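If you go the Python route, something along these lines should be enough to get started. This is only a rough sketch using the standard library's csv module: the column names and values below are invented stand-ins for illustration, not the real Kandahar data, and read_table is just a helper name I made up.

```python
import csv
import io


def read_table(handle):
    """Return the rows of a CSV export as a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


# Stand-in for an exported file; these columns and values are invented.
sample = io.StringIO(
    "station_id,name,voters\n"
    "1,Example School,1200\n"
    "2,Example Clinic,850\n"
)

rows = read_table(sample)
total = sum(int(row["voters"]) for row in rows)
print(len(rows), total)  # 2 2050
```

With a real export you would pass an open file instead of the sample, for example open("tabula-export.csv", newline="", encoding="utf-8"), where the filename is hypothetical. I'm assuming the export is UTF-8 encoded, which matters if the table mixes English and Dari text.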

Note that even though the interface runs through a browser, none of your data touches external servers. All the processing and stripping of data from PDFs is done on your computer, and isn't sent for processing to cloud servers. This is a really nice feature and I'm glad they wrote the software this way.

I haven't had any problems using Tabula so far. It's a great time saver. Highly recommended.

Highlights + DevonThink = Pretty Great

I’m late to the Highlights party, but I’m glad I got here.

Like many readers of this blog, I get sent (and occasionally read) a lot of PDFs. In fact, I did a quick search in DevonThink, and I am informed that I have 52,244 PDFs in my library. These are a mix of reports, archived copies of websites, scanned-and-OCRed photos and a thousand-and-one things in between.

Thus far, my workflow has been to read PDFs on my Mac. Any notes I took while reading the file were written up manually in separate files. I would laboriously copy and paste whatever text snippet or quotation I wanted to preserve along with its page reference. These would be fed into DevonThink’s AI engine and magic would happen.

Now, post-Highlights-installation, my workflow is much less laborious. I can take highlights in-app, export all the quotations as separate text or HTML files and have DevonThink go do its thing without all the intermediary hassle. If you’re a professional researcher or writer using DevonThink as your notes database -- and quite frankly, if not, why not? -- the Highlights app will probably please you.