Useful Tools

Automating social media posting for my new blogposts

I love blogging and I've benefitted a lot from what it's done for me ever since I started my first Geocities page in the mid 1990s. I maintain a technical blog at mlops.systems and a somewhat less technical blog at alexstrick.com/blog, though I hope at some point to merge the two.

In the past I would have been content with ensuring that my blog published an RSS feed, knowing that anyone who wanted to follow what I was writing could do so simply by pointing their feed reader at it and subscribing. In recent years, though, I've become more conscious of a healthy brew of ambivalence, ignorance and even outright hostility towards the very idea of RSS feeds and readers. It seems many people no longer have RSS as an essential part of their informational hygiene. (I'll put my sadness / confusion about this to one side for now.)

And as much as I love blogging, I really dislike having to post my new blog posts to social media one by one, coming up with some catchy yet not overly breathless summary of what I wrote, since this is apparently what many people use instead of RSS.

I've been grumbling under my breath about this situation for a few years now, but when ChatGPT came out it seemed like an obvious use case: summarise my blogpost and post it to all my social media accounts, taking into account each platform's particular needs. (Mastodon uses hashtags more than the others, LinkedIn posts can be a bit longer, Twitter needs to be a bit shorter, and so on.)

I held off, thinking I'd want to set up some system fully under my control involving serverless function calls and so on, but then I was reminded that I already use Zapier for some other administrative tasks. So this afternoon I set up and turned on some automation for social media posting to my Mastodon, Twitter and LinkedIn accounts. Posting happens at one step removed, since I queue my posts in Buffer so that they go out at a time when people are more likely to see them. I apologise / don't apologise for this.

My blog writings remain wholly un-automated; it would completely defeat the point of 'learning through writing' if I were to automate the things I blog about. My social media postings (just one post per blogpost, so as not to spam you all) are from now on automated. As an additional courtesy / discourtesy, I've tweaked the prompt so that the social media posts should always read just slightly 'off', and they will be labelled with an #automated hashtag.
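
For the curious, here is a rough sketch of the same idea in plain Python rather than Zapier, pulling the latest post from an RSS feed and asking a model for platform-specific summaries. The feed path, model name and prompt wording are illustrative assumptions, and the Buffer queueing step is left out:

import feedparser
from openai import OpenAI

# Grab the most recent post from the blog's RSS feed (assumed feed path)
feed = feedparser.parse("https://mlops.systems/index.xml")
latest = feed.entries[0]

# Ask a model for three platform-tailored summaries (assumes OPENAI_API_KEY is set)
client = OpenAI()
prompt = (
    "Summarise this blog post for three platforms. "
    "Mastodon: short, with hashtags. LinkedIn: a bit longer. "
    "Twitter: under 280 characters. Add an #automated hashtag to each.\n\n"
    f"Title: {latest.title}\nLink: {latest.link}\nSummary: {latest.summary}"
)
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)

# Paste the output into Buffer (or whatever scheduler you prefer)
print(response.choices[0].message.content)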

Getting Out of the Intermediate Language Plateau: Arabic Edition / Principles

[This is part of a series on getting out of a language-learning plateau at the intermediate-advanced level. Check out the other parts here.]

Seasoned language learners are familiar with the concept of the 'language plateau'. If you're learning a second language for the first time, you will inevitably reach a point in your studies where your progress seems to flatten, and you will likely find that period extremely frustrating.

When you are in your plateau, it's hard to improve because you're already at a point of (some kind of) self-sufficiency. You can express yourself. You understand most of what is going on in a conversation or TV series you watch. You can write things and people will understand what you're saying. You could (and many do) stop your studies at this point and still be 'functional' in the language.

Getting out of this flat, dead zone is what I want to talk about today. It's hard, but it's by no means impossible, and making this kind of progress is possibly the most valuable work you'll do in your language studies, because all of it will be specifically tailored to your needs.

The starting point, though, is to identify your current status. What can you do? You don't (necessarily) need to take a formal language certification test to get a grade, though that can sometimes be useful. The kind of measurements you want to take are more subjective. You want to take stock of your capacity in certain situations and the level you are able to achieve in different contexts (your skills in reading will differ from those in writing, listening or speaking, for example), and you also want to assess your experience on the cultural level -- i.e. how much experience you have navigating the unspoken parts of a culture, whether that's body language, behaviours and so on.

Principles of Skill Acquisition

Now a slight detour into some more general principles of skill acquisition. Some of this is derived from my own personal experience, other parts from interviews with experts in this field (such as my conversation with K. Anders Ericsson, who more or less invented the field of expert performance studies), and other parts still from reading a bunch of books on the subject.

Three things are relevant here:

1) Stretch

When you're learning a new skill, you want to step outside your comfort zone. This is usually difficult work, and work that is mentally (and possibly emotionally) taxing. Thus, if you want to get better at speaking in Arabic, you'll need to speak more, but at the beginning this practice (i.e. talking with other people) will feel pretty horrible, simply because you're not used to doing it. It's a paradox that you need to do the thing to get better at doing the thing. It is this difficulty, pushing yourself a little past what you're capable of doing, that allows for personal growth. (I wrote about this in an entirely different context a few weeks ago with respect to my attempts to get better at climbing.)

2) Lots of practice coupled with speedy feedback

These two parts (practice and feedback) go together. It isn't practice alone that will allow you to improve, but rather the combination of making efforts to use new skills alongside getting some kind of feedback that tells you when you're getting it wrong vs when you're not. An implication of this, too, is the reality that this kind of practice is going to involve you making lots of mistakes. This can feel crappy, especially when you're getting immediate feedback on exactly when this is happening. You need to adopt a flexible mindset, if possible, in which you see the mistakes as indicators of growth rather than as any kind of personal or intellectual failures on your part.

3) Know what you're practising and focus on that

This is basically Ericsson's principle of "deliberate practice":

"Rather than chilling out in the comfort of skills you've already acquired, as an expert-to-be, you're relentless about heading to the frontier of your abilities. The practice shouldn't be so difficult that it overwhelms you—that would be depressingly demotivating, but not so easy that you're unconsciously languishing. In other words, you're arranging for flow, that space where you're right at the boundary of your abilities."

See also this summary of the routines that 'experts' tend to have around deliberate practice:

  • They can only engage in practice without rest for around an hour.
  • They practise in the morning with a fresh mind.
  • They practise the same amount every day, including on weekends.
  • They only have four to five hours of deliberate practice a day.
  • If they don't get enough rest, they can get overtraining injuries or burnout.

If you're hoping that 'using the language' in a general and non-specific way will get you out of your plateau, you'll be disappointed. It's perfectly possible to exist in the plateau zone without improvement ad infinitum. If you want to improve at a certain skill, you'll need to isolate that element and focus on it in a targeted way. This can be vocabulary, or speaking about a certain topic, or even something as small as 'using conditional sentences'. Whatever it is, you'll only get better if you concentrate your efforts.

Customisation & Your Individual Needs

Learning languages at the post-intermediate level will be a different experience from what you are used to in the early stages. Early on, you're doing a great deal of necessary-but-boring work to learn basic patterns, vocabulary and grammar.

Once you have mastered that, and you can explain yourself in most basic contexts, you reach the point where you have to customise. There's a great deal of science and research behind this claim. Check out this talk, by the always stimulating Alexander Arguelles, for an overview of some of that research.

You'll need to pick which areas you're most interested in. This is the hard work of advanced language studies -- you pick one area or context, conquer it, and then pick another area and repeat. This fulfils the principle of focus that I mentioned above.

To give an example from my own studies: my current big push for Arabic is to be able to read serious fiction (i.e. short stories and novels written for native speakers). I've written previously that this is a personal goal, but various realities of how modern literature is written make it hard to take the leap into complex native-reader-level fiction (especially novels). Arab writers like to pile up synonyms (for poetic effect, or perhaps as an exercise in pretension?), so when reading I often find myself stuck in dictionaries the whole time. Fortunately, a new textbook offering graded literature at just that 'stretch' level was released recently, which is giving me an entry point into that world. None of the texts are simplified, the language is hard and the number of unknown words is pretty large, but it's not so far up the scale of difficulty as to be off-putting.

On Making a Self-Study Plan

My next post will offer a host of suggestions for resources you can use to get out of this plateau / dead zone. Before you start reading through it and diving into things that seem interesting, I'd strongly advise you to take the time to figure out your specific goals. "Improve my Arabic" is not a useful goal; it's too unspecific. Even "improve my spoken Arabic" may not be particularly useful at the intermediate-advanced level. Once you figure out your goal, write it down somewhere. Maybe stick it to your wall or on the inside cover of your notebook. It's good to be reminded why we're doing the work.

Once you have your goal, you want to set yourself small targeted bursts or challenges to push out into your stretch zone. You don't want these challenges to feel like you're straining against the limits of what you are capable of. You want them to be just challenging enough that you feel uncomfortable, but not so much that you are constantly questioning yourself and your abilities in any kind of fundamental sense.

The scale of these challenges will be pretty variable, so examples will span a range of tasks, from taking a week to learn about and read deeply in a niche topic, to something longer-term (over six months, perhaps) like my modern literature challenge. The characteristic that you need to look for, however, is that you'll be able to tell when you're finished with the challenge. Part of defining the goal is finding a specific (and somewhat measurable) definition of what it means to have achieved what you want.

Then the rest of the trick is basically to keep moving, tracking your progress and achievements along the way. There are various ways of doing this, some of which will depend on what systems you already have in place. You can add in tools like Beeminder to encourage compliance and regularity, or you can do that in other ways.

When I work with people 1-on-1 to learn a language, a lot of what we do is figuring out this kind of ongoing goal setting and progress assessment. (If you want to learn more about this, click here and read through what I offer).

The next posts will offer a roadmap to the different resources available to the intermediate student of Arabic and some of the ways you can utilise them. They won't be exhaustive, but I'm pretty sure most readers will find something of use in them. Feel free to get in touch if there are specific things you want me to tackle in terms of skill development in Arabic.

Small, unexpectedly powerful boxes

Graphics Processing Units, or GPUs, are what your computer uses to render what you see on screen. Most computers (desktop or laptop) have one, and it is used to good effect to keep the screen refreshed and display everything at effectively realtime speeds. The world of gaming is also, perhaps unsurprisingly, quite dependent on fast GPU performance, with Nvidia the leading provider of this hardware.

[Image: an Nvidia GPU]

It was discovered a while back that GPUs are also pretty great at performing certain kinds of computation at incredible speed. Certain calculations which would take ages to complete on a standard CPU run much faster on a GPU. For this reason, they're the hardware of choice for training deep learning models.
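
To make that concrete, here is a minimal sketch -- assuming PyTorch is installed and a CUDA-capable Nvidia card is available -- timing the same large matrix multiplication on the CPU and then on the GPU:

import time
import torch

# Two large random matrices
a = torch.randn(4000, 4000)
b = torch.randn(4000, 4000)

# Matrix multiplication on the CPU
start = time.time()
torch.matmul(a, b)
print(f"CPU: {time.time() - start:.3f}s")

# The same multiplication on the GPU, if one is available
if torch.cuda.is_available():
    a_gpu, b_gpu = a.to("cuda"), b.to("cuda")
    torch.cuda.synchronize()
    start = time.time()
    torch.matmul(a_gpu, b_gpu)
    torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    print(f"GPU: {time.time() - start:.3f}s")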

GPUs also happen to be heavily used (for similar reasons) for cryptocurrency mining, and accordingly there has been a worldwide shortage for some time. Between the crypto bros and the deep learning practitioners, prices were inflated for a while. Nvidia has made some attempts to stop crypto miners from using its hardware, but to inconclusive effect.

Rosenblatt's Mark I Perceptron

I've now read a little about Rosenblatt's Perceptron in two different places: in the Howard/Gugger Deep Learning book, and also in Cade Metz's Genius Makers.

The Mark I Perceptron

Built in 1958, it is usually described as the first machine based on the principle of the artificial neuron. It used a single layer in this initial configuration, and even in that simple form you could already see glimpses of where the idea might go.
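
For anyone curious what a single artificial neuron amounts to in practice, here is a toy sketch of the classic perceptron update rule learning the AND function. The data, learning rate and number of passes are illustrative assumptions, not a description of the original machine:

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # inputs
y = np.array([0, 0, 0, 1])                      # labels for the AND function
w, b, lr = np.zeros(2), 0.0, 0.1                # weights, bias, learning rate

for _ in range(20):  # a few passes over the data
    for xi, target in zip(X, y):
        pred = int(np.dot(w, xi) + b > 0)  # step activation
        w += lr * (target - pred) * xi     # perceptron update rule
        b += lr * (target - pred)

print([int(np.dot(w, xi) + b > 0) for xi in X])  # should print [0, 0, 0, 1]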

Unfortunately, Marvin Minsky and Seymour Papert's apparently perceptive but also damning assessment of the perceptron as a technology without a future ushered in the first of the so-called 'AI winters', and the idea of using neural networks was buried for several years.

Thankfully, some ignored the herd and stuck with it.

Deep Learning: Best in Show?

Deep Learning is an incredibly powerful technology and there are a number of (focused / specific) areas where it already surpasses human-level abilities. Here are some examples:

  1. Translation: If you haven't been watching closely, the quality of Google Translate has really improved in recent years. This 2016 story is a little dated, but it explains how they made a big push a few years back, and the service continues to improve as the technology does.
  2. X-ray interpretation: In a matter of a few years, the performance of Deep Learning in reading and making diagnoses from x-rays has surpassed that of top radiology practitioners. See how DeepMind raised the bar on identifying breast cancer.
  3. Playing Go: Watch the AlphaGo documentary if you haven't already.
  4. Protein Folding: Check out AlphaFold from last November, where DeepMind blasted through a notoriously complicated problem in biology.
  5. Colourising images: A former fast.ai student, Jason Antic, made great progress with his work on DeOldify.

The really great thing about the fastai course is how it has successfully managed to democratise Deep Learning as a technology. I always enjoy reading about niche areas where specific burning problems were solved because someone took the opportunity to educate themselves.

Removing Barriers: Deep Learning Edition

I've been re-reading Jeremy Howard & Sylvain Gugger's Deep Learning for Coders with Fastai and PyTorch and I really appreciate the reminder that a lot of barriers to entry into the Deep Learning space can be productively put to one side.

Gatekeepers make four big claims:

  1. You need lots of maths to use Deep Learning to solve problems
  2. You need lots of data (think prodigious, Google-sized quantities) to use Deep Learning
  3. You need lots of expensive computers and custom hardware to use Deep Learning
  4. You need a PhD, preferably in Maths or Physics or some computation-heavy science

Needless to say, it's not that maths, more data or better hardware won't help or improve your experience. But to say that you shouldn't even start if you don't have those things is both inaccurate and unhelpful.

If you are a domain expert in something that has nothing to do with Deep Learning or data science, you probably have a lot of low-hanging-fruit problems that powerful techniques like Deep Learning could help you solve.

How to use jQuery and Handlebars in your website

jQuery and Handlebars are both external to the core functionality of JavaScript: they are libraries that we can include when making websites. Doing so is very simple. We include <script> tags in the head of our HTML file, as in the following example:

<html>
  <head>
    <script src="https://cdn.jsdelivr.net/npm/handlebars@latest/dist/handlebars.js"></script>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  </head>
  <body>
    <h1>Hello, World</h1>
  </body>
</html>

Note that jQuery is now modular, so you may want to consider whether you want to include the entire library. Above I chose to load it from Google's CDN, but there are other options listed here.

Tweeting to the Void

I've previously written about how I turned off Facebook's news feed. I keep an account with Facebook because people occasionally contact me there. It is also an unfortunate truth that many companies in Jordan (where I live) or in the wider Middle East only have representation on Facebook instead of their own website. (Why they insist on doing this baffles me and is perhaps a topic for a future post).

I have long preferred Twitter as a medium for filtering through or touching on -- however obliquely -- things going on at any particular moment. I have no pretensions to actively follow every single tweet that passes through my feed. Rather, it's something I dip into every now and then.

Increasingly in recent months, I have found myself growing dissatisfied with the pull it often has on me. It has become something of a truism to state that 'twitter isn't what it once was', but I find there's less and less long-term benefit in following discussions as and when they happen.

RescueTime tells me that I spent 86 hours and 16 minutes on Twitter in 2017 -- just under a quarter of an hour each day. That feels like a lot to me.

[Screenshot: RescueTime report of time spent on Twitter in 2017]

Enter 'Tweet to the Void', a Chrome extension. (For Firefox and other browsers, I imagine similar extensions exist.) When I visit twitter.com, the feed is not visible. All I see is somewhere to post a tweet, if that's what I want to do. (There is still some value in posting blogposts and articles there, since I know some people don't use RSS.) Of course, I can always turn off the extension with ease, but adding this extra step has effectively neutralised Twitter for me.

Try it; see how you feel about having something standing in the way of your social media fix. Let me know how you get on.

Installing PostgreSQL on a Mac

PostgreSQL is a relational (SQL) database system. It has been around for a while, and is in the middle of a sort of revival. Installing Postgres on your own system can be a little difficult. Last time I tried, I was helped through the process while doing the Udacity Intro to Programming Nanodegree.

Recently I had to reinstall Postgres, and I found some useful improvements to the process while being guided through it in my Dataquest lessons.

Postgres.app is an application you can install on your Mac which simplifies a lot of the legwork, particularly when setting up new databases, servers and so on.

If you want a commonly used Python library for interfacing with Postgres, psycopg2 is a good option. You can install it easily with Anaconda:

conda install psycopg2
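
Once installed, connecting from Python is straightforward. Here is a minimal sketch, assuming a local server (such as the one Postgres.app runs) and the default postgres database; the query is purely illustrative:

import psycopg2

# Connect to a locally running Postgres server (e.g. started via Postgres.app)
conn = psycopg2.connect(dbname="postgres", host="localhost")
cur = conn.cursor()

# An illustrative query to confirm the connection works
cur.execute("SELECT version();")
print(cur.fetchone())

cur.close()
conn.close()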

Tabula for extracting table data from PDFs

Have you ever come across a PDF filled with useful data and wanted to play around with that data yourself? In the past, if I had that problem, I'd type the table out manually. This has some disadvantages:

  • it is extremely boring
  • it's likely that mistakes will get made, especially if the table is long and extends over several pages
  • it takes a long time

I recently discovered a tool that solves this problem: Tabula. It works on Windows and Mac and is very easy and intuitive to use. Simply take your page of data:

A page listing Kandahar's provincial council election polling stations from a few years back. Note the use of English and Dari scripts. Tabula handles all this without problems.

Then import the file into Tabula's web interface. It's surprisingly good at autodetecting where tables and table borders are, but you can do it manually if need be:

[Screenshot: Tabula's table detection interface]

Then check that the data has been correctly scraped and select a format for export (CSV, JSON etc.):

[Screenshot: previewing the extracted data and export options]

And there you have it, all your data in a CSV file ready for use in R or Python or just a simple Excel spreadsheet:

[Screenshot: the exported data ready for use as a CSV file]

Note that even though the interface runs in a browser, none of your data touches external servers. All the processing and extraction of data from PDFs happens on your computer; nothing is sent to cloud servers. This is a really nice feature and I'm glad they wrote the software this way.
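
If you find yourself doing this often enough to want to script it, there is also tabula-py, a Python wrapper around the same extraction engine (it requires Java). A minimal sketch, with the filenames as illustrative assumptions:

import tabula  # from the tabula-py package

# read_pdf returns a list of pandas DataFrames, one per detected table
tables = tabula.read_pdf("polling_stations.pdf", pages="all")

# Save the first detected table as CSV
tables[0].to_csv("polling_stations.csv", index=False)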

I haven't had any problems using Tabula so far. It's a great time saver. Highly recommended.