On Reading in Arabic: The Evidence

[This is the second post in a series on the importance of reading when studying Arabic (or any other language). Read the first post here.]

It is notoriously difficult to study, let alone demonstrate, which methods of learning a second language are the most efficient. For starters, everyone is slightly different, so it's hard to compare between individuals. Learning a language is also such an involved pursuit (taking place over all hours of the day, and in the mind, where microscope or dictaphone can't usefully reach) that it is impractical to follow a student around the clock.

Having given the pitch for why I think reading is so important for students of Arabic, today I wanted to summarise a study that was carried out from 1970-1977. This study, by ElSaid Badawi, is entitled "In the quest for the Level 4+ in Arabic: training Level 2–3 learners in independent reading" and can be found as an article in Betty Lou Leaver and Boris Shekhtman's fascinating (and underrated / underread) edited volume, Developing Professional-Level Language Proficiency. Given its somewhat obscure provenance, it's unlikely you'd come across this article in the normal course of your day, hence my interest in summarising it for you here.

Badawi offers an overview of his experience running the CASA (Center for Arabic Study Abroad) programme between 1970-1977. This programme was originally started in 1967 for advanced-level students. The idea was to offer a year of intensive study that would catapult students into real competency: being able to read, speak and use Arabic in a professional capacity. (Badawi begins his article with a justification for reading, but I'll skip those details since there is a great deal of overlap with what I've already written).

The original CASA curriculum in the 1967-era programme was established around a 3000-word vocabulary list, reading of some short passages using those words in context, a grammar book and two long 'authentic' texts that would be covered over the course of the year. The students found this dull and unrewarding, however, so CASA's administrators decided to design a new course based around familiarising students with a 'language domain of their interests'. In other words: allowing them to read things that were related to their interests and professional trajectory.

Students taking part in the programme were assessed (prior to joining) as being at a high level, but their vocabulary was generally limited to political subjects. They had a poor understanding of morphology and little to no facility with semantics. They had, Badawi writes, bad reading habits in Arabic: too much focus on sentence structure, a tendency towards 'parsing-based reading' and only a minimal grasp of the "semantic role of punctuation". That last phrase refers to the way Arabic uses words, phrasing and sentence constructions to signal meaning that English largely conveys through punctuation. Most of all, students suffered from an 'excessive / crippling' use of the Arabic-English dictionary, which was identified as an obstacle to spontaneous and contextualised language learning; words were quickly forgotten.

The programme sought to encourage a switch in its students: "a change of attitude toward Arabic from that of a language they are being taught to one which they should start learning". The responsibility, at this level, generally should switch from the teacher to the students.

The programme was split up into three semesters / terms:

  • Semester 1: 8-week summer programme

This was made up of introductory cultural classes (based around Cairo, Egypt, where students were living). It offered classes to bring students up to a competent level in functional colloquial Arabic. (Students could solve all their problems and interact with Egyptians in a functional way, following the course). There was also a component of media Arabic where students would become familiar with the formalised language used in printed and spoken contexts.

  • Semester 2: 14-week autumn programme

This semester was for allowing students to gain a higher competence in MSA. Reading was one of the core elements here (news reading became effortless and there was some inclusion of classical language as well). Colloquial Arabic was encouraged through the reading of plays (which often used colloquial/dialect expressions and language). An intensive reading programme was added alongside this to boost confidence.

  • Semester 3: 14-week spring programme

The final semester included three graduate-level courses in subjects of the students' interest / choice. There was also some training in 'Educated Spoken Arabic' (i.e. the discussion of high-culture topics).

The Intensive Reading Course

The core belief behind the programme was that reading was important to the students' knowledge of Arabic in a fundamental way. All the other skills would benefit and develop alongside the reading done as part of the programme. There were different kinds of texts available and a set of selection criteria for what kinds of reading took place:

Finding materials for intensive / analytic reading was easy. The harder issue was finding materials suitable for extensive reading, i.e. the kind of wide reading that students are able to do with some level of ease. Arabic poses a particular problem in this regard, given its 'wide range of active vocabulary in use', and the 'complexity of the morpho-semantic system'.

Plays were believed to be the best for extensive reading. They carried a "high degree of word and sentence redundancy", usually had only a single theme and were of moderate length. (It was found that reading two 200-page books was much more satisfying than reading a single 400-page book). Plays also lend themselves to real-life activities. There is also the possibility of watching the plays being performed (or, now, on YouTube).

Novels were also considered useful, but the fact that dialogue is used only minimally means that they were kept for later in the semester. Short stories were denser in meaning and language use and thus harder. They were included in the programme, though, for the sake of variety.

Overall, texts were chosen for the language structures used rather than for their literary value / content.

Reading Texts

The course had students reading three items each week: usually one novel or play (a long item) plus a short story and a one-act play (i.e. two short items). These were generally from the same author, and difficulty would escalate over time. All texts were authentic and unabridged. Ideally they were selected from leading literary figures and they would all be texts for which no English translation existed. Selecting these texts was hard at the beginning, but over the years they settled into a broad pattern, escalating in difficulty:

  • Group 1 (first three weeks)

Plays by Tawfiq al-Hakim (short and long). These were good because he uses a lot of redundant vocabulary, draws on thematic sources with which students would already have been familiar, writes lively dialogue and generally uses "straightforward language".

  • Group 2 (5 weeks)

This consisted of works by Ihsan Abdul Quddus, a journalist, novelist and short story writer. These works tackled themes from social phenomena and thus were appropriate to a young audience. They referenced local customs and expressions. They included fewer dialogues in the novels and short stories. They had a lucid structure and controlled range of vocabulary.

  • Group 3

This consisted of works by Yusuf Idris, blending MSA with colloquial idioms, Qur'anic citations and quotations from the hadith literature. These were at a higher difficulty level.

  • Group 4

This was a mix of items chosen for special topical interest or artistic value. For example, in the final week, students read Fathy Ghanem's 1958 novel Al-Gabal. They also tackled some of the lesser-known novels by Nagib Mahfouz.

Mixed in these various groups were shorter items: one-act plays and short stories. There was generally a balance between length of a text and its linguistic difficulty.

Reading Instructions

I found this section of the article the most interesting / instructive. Students were told the following:

  • The beginning of a story / text is always the hardest. You don't know what's going on, who the characters are and what the context / scene is. Bear with it. A lot of this will be scene-setting. You can always return to it later on.
  • Arabic has a lot of redundancy. Compare what you are stuck on with what follows and check if you can figure out the meaning that way.
  • Continue reading as long as you can make out a story or theme for yourself. Don't worry or second-guess yourself as to whether what you understand from the story is the same thing as what the author intended you to understand.
  • If you find a word or part of the structure you don't understand and stop, DON'T look the word up in the dictionary unless:
    • you have failed to guess the meaning
    • there is nobody around to ask the meaning
  • Mark / highlight the words you were able to guess in the text. Mark the words you were able to do without understanding.
  • Make a list of cultural features that you'd like to be addressed in class.
  • Mark and make a list of any expressions and grammatical features or constructions that you want addressed in class.


Class sessions were essentially there to ensure that students were keeping up with the reading volume. Students would narrate their understanding of the texts they had read, and would raise any issues they wanted to learn more about.

Classes were also a good time to increase students' semantic understanding -- allowing students to identify shared roots and usages in different contexts and forums.

Students submitted written responses / follow-ups to the text in the class with the teacher present. A weekly conference with students gathered feedback on the choices of texts, allowing teachers to adapt the programme depending on the ease/difficulty perceived by each individual cohort of students.


By the end of the 14-week programme, students had read an average of 2500 pages of authentic Arabic texts. Graded text levels showed that their language was improving. They were encouraged by managing to review words and structures that had been marked as 'hard' earlier on in the semester. (Usually 25-40% of these words had become intelligible to them, despite no vocabulary learning strategy specifically targeted at learning these words.) The graduate-level courses (all taught in Arabic, obviously) of the final semester were also a proving ground for students.

This reading programme increased students' competence, and the gains were transferable to their other skills. (Yes, even their spoken Arabic.) Reading helped with writing. Reading 'complete texts' did a lot for the morale of the students at the intermediate level, too. And the literary focus of the content was useful for students even if their interests didn't lie in that particular area.

My next post about reading Arabic will detail some options that are available to the intermediate-level student of Arabic, and some practical considerations resulting from this article.

Highlights + DevonThink = Pretty Great

I’m late to the Highlights party, but I’m glad I got here.

Like many readers of this blog, I get sent (and occasionally read) a lot of PDFs. In fact, I did a quick search in DevonThink, and I am informed that I have 52,244 PDFs in my library. These are a mix of reports, archived copies of websites, scanned-and-OCRed photos and a thousand-and-one things in between.

Thus far, my workflow has been to read PDFs on my Mac. Any notes I took while reading the file were written up manually in separate files. I would laboriously copy and paste whatever text snippet or quotation I wanted to preserve along with its page reference. These would be fed into DevonThink’s AI engine and magic would happen.

Now, post-Highlights-installation, my workflow is much less laborious. I can take highlights in-app, export all the quotations as separate text or HTML files and have DevonThink go do its thing without all the intermediary hassle. If you’re a professional researcher or writer using DevonThink as your notes database (and quite frankly, if not, why not?) the Highlights app will probably please you.

PhD Tools: Goodreads for Cross-Pollination

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

During the period I was working most intensely on my PhD writeup, I read over 100 books. I put that number out there not as a confrontation, but as an illustration that reading is important to ensure you don't get lost in a small box of your own creation. Judging purely from my own experience and from sporadic conversations with a loose handful of fellow PhD candidates, this can be a real problem.

Reading widely and about issues and problems wholly unrelated to your field of study is, I believe, the hallmark of a curious mind. If I meet someone for the first time and I'm assessing their work, I'm far more likely to be interested in the last ten books they've read than many other data points. Even the fact that someone is taking time to read, and to read diversely, is an important indicator for me.

I think I can date my adoption of this books-and-ideas-for-cross-fertilisation approach to when I read Steven Johnson's book Where Good Ideas Come From. He makes a strong case for a more deliberate approach to how you develop and cultivate ideas in your thinking life. (The book is short and highly suggestive of specific approaches to work. I'd recommend it if this kind of thing interests you).

I've found that things that I don't track and monitor tend to fall by the wayside. Hence Goodreads and Beeminder and a number of other tracking tools. Goodreads allows you to set how many books you want to read each year and then keeps a convenient little widget reminding you how far ahead of or behind your goal you are. If you want a bit more of a 'sting' for non-compliance, you can hook up Beeminder and you'll be kept honest that way.
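The ahead-or-behind idea is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic under my own assumptions (a goal spread evenly across the calendar year); it is not how Goodreads or Beeminder actually calculate anything, and the function name `books_ahead` is made up for this example.

```python
from datetime import date


def books_ahead(goal, books_read, today):
    """Return how many books ahead (positive) or behind (negative) you are.

    Assumes the yearly goal is spread evenly over every day of the year.
    This is an illustrative assumption, not Goodreads' own formula.
    """
    year_start = date(today.year, 1, 1)
    year_end = date(today.year, 12, 31)
    days_in_year = (year_end - year_start).days + 1
    days_elapsed = (today - year_start).days + 1
    # Number of books you "should" have finished by today at an even pace.
    expected_so_far = goal * days_elapsed / days_in_year
    return books_read - expected_so_far


if __name__ == "__main__":
    # Hypothetical numbers: a 52-book goal, 30 books read by 2 July.
    margin = books_ahead(goal=52, books_read=30, today=date(2015, 7, 2))
    status = "ahead of" if margin >= 0 else "behind"
    print(f"{abs(margin):.1f} books {status} schedule")
```

A positive number means you have some slack; a negative one means it's time to pick up a book.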

Reading books on unrelated topics was something I would do in the afternoons or evenings after my Four Perfect Hours. The time would be mine and I could read without any sense of guilt or that I wasn't making progress on my PhD writeup. No, I'd done my work in the morning, so now I could read to my heart's content.

Encounters with books are encounters with other ideas, other minds. They refresh your approach and your sense of perspective, both so important for your PhD. Give it a try! See how you can add in some reading time to your daily routine. Even 30 minutes before bed each evening adds up in the end.

PhD Tools: DevonThink for File Storage and Discovery

[This is part of a series on the tools I used to write my PhD. Check out the other parts here.]

Discovering similar notes in one of my DevonThink databases

I first heard about DevonThink in the same breath as Tinderbox. They go together, though they serve different purposes. Some people want to make an either/or decision about which to use. I see them as sufficiently different to assess them on their own merits and as per your usage scenario.

As with all tools, you should come to the decision table with a set of features that you're looking for. Don't just shop around for new things for the sake of newness or for the sake of having a really great set of tools. These programmes are not cheap. Luckily almost all of them come with generous trial versions or periods, but I don't recommend 'newness' as a feature of any particular merit.

DevonThink (I use the Pro Office version) is a place to store your files and notes. It can, I think, take any file you can throw at it. It comes with software for processing PDFs into fully-searchable documents (OCR software, in other words), which is part of the reason why the license for the Pro Office version of the programme is so expensive.

If you're anything like me, you're drowning in PDF documents. They all come with helpful names like "afghanistan_final_report_02_16.pdf" and unless you have a rigorous file hierarchy and sorting system, you'll probably struggle to find the file you need. And using the basic file hierarchy system for storage doesn't help when you want to store the same file in multiple folders (e.g. what if a report is about both Afghanistan and Tunisia?). (DevonThink has a feature which allows you to store a file in multiple locations without saving two copies of it. Any changes or annotations you make in one location will automatically appear in the other).

You might ask yourself why you would need DevonThink and Tinderbox (see this post for more). The short answer is that they store different kinds of files/data, and that DevonThink is less about thinking than about storage (to a certain extent) and discovery.

One of the key features of DevonThink Pro Office is its smart searching algorithms, its ability to suggest similar texts based on the contents of what you are looking at, etc. It does this by means of a proprietary algorithm, so I can't really tell you how it works, but just know that it does. It works best on smaller chunks of text. In this way, I was reading through a particular source from the 3 million-word-strong Taliban Sources Project database and then I clicked the "See also" button and it had found a source I would never otherwise have read on the same topic, even though it didn't even use one of the keywords I would have used to search for it. It uses semantic webs of words to figure this stuff out. Anyway, beyond a certain database size, this power becomes really useful. It can also archive websites, store anything including text, do in-text searches on e-books etc etc. (Read more on how I use DevonThink for research in general here.)

I also used it a little as an archive for substantive drafts / iterations of the writeup process. That's another important part of the process: making backups of many different kinds. I never found any use for them, but at least they were there (just in case).

If you're a data and document hoarder at heart, like me, you'll soon have a DevonThink database (or several databases, split up by topic) that is too big for you to fully comprehend, or to remember what is inside the files. At that point, search becomes really important. Not just a straightforward search: the ability to put 'fuzzy' terms (i.e. if you search for "Afghanistan" it'll also find instances where it's incorrectly spelt "Afgahistan") and boolean operators into your query is really powerful. DevonThink is an amazing search tool. The company that developed the database software also makes something called DevonAgent, which is basically a power-user search tool for the internet. Google on steroids, if you will. Fully customisable, scriptable... you can really go crazy with this stuff. I use it, but my PhD wasn't really about searching things on the internet, so I didn't use it much for my research or writeup. But it's a great tool, too.
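To give a feel for what 'fuzzy' matching means, here is a minimal Python sketch using the standard library's `difflib`. To be clear, this is not DevonThink's algorithm (which I don't know); the function name `fuzzy_find` and the 0.8 similarity threshold are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_find(query, text, threshold=0.8):
    """Return the words in `text` that look sufficiently like `query`.

    Similarity is difflib's ratio (0.0 to 1.0). The 0.8 threshold is an
    arbitrary illustrative choice, not anything DevonThink documents.
    """
    hits = []
    for word in text.split():
        # Strip surrounding punctuation and ignore case before comparing.
        cleaned = word.strip('.,;:!?"\'()').lower()
        if not cleaned:
            continue
        ratio = SequenceMatcher(None, query.lower(), cleaned).ratio()
        if ratio >= threshold:
            hits.append(cleaned)
    return hits


if __name__ == "__main__":
    note = "A final report on Afgahistan and Tunisia."
    print(fuzzy_find("Afghanistan", note))
```

The misspelt "Afgahistan" shares most of its letters, in order, with the query, so it clears the threshold, while unrelated words like "Tunisia" do not.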

In short, DevonThink is a research database tool that will help you store and find the documents that relate to your research, and do smart things to help you find sources and texts that maybe you'd forgotten you'd saved. Highly recommended for anyone working with large numbers of documents.

North Waziristan: A Reading List

Technically, this is South Waziristan... Photo credit: Drregor (via Flickr)


I've been doing a bit of reading about North Waziristan in the English-language sources that are available outside Pakistan. It took a bit of time to put together a decent collection that gave real information. By 'real information', I mean things that speak of names, dates, places and events. I wasn't really interested in analysis, though that forms part of what follows. I was interested in the basic factual building blocks that must precede any analysis or understanding of a place. (That, and actually going there yourself). Most of these sources are filled with stories and little details, all of which need triangulating with one another and with interviews on the ground.

I can't vouch for the veracity of any of it -- my experience in Afghanistan has given me an innate distrust for anything I read in a report, particularly if it was assembled outside the country -- yet this is what we have. There are, of course, thousands if not hundreds of thousands of news articles in the databases of Pakistan's media outlets, but I didn't trawl those yet. Needless to say, this is a work in progress and I will continue to update as and when I read more. It seems the area is also missing a well-sourced chronology akin to something like what I did for Kandahar or for the Taliban/Al-Qaeda relationship. I don't have the time at the moment to do this myself, but perhaps someone will be inspired to work on it. If you have any suggestions for additions to this list, please let me know.

Books (Core)

Books (Supplementary / Tangential)




UPDATE: This continues to be added to as recommendations come in from various places here and there. (Last Update: January 3, 2015)