WTF Internet acronyms, predictive text and txt spk say about the Lexical Approach

Do we, someone texted me the other week, actually use u r in text messages any more? Or are these Internet abbreviations a bit old fashioned?

This question got me thinking both about how the medium we use shapes language and also about the nature of language itself. It is also very much linked to the reason why it was so infuriating when people used to sneer at text speak, or when people now don’t recognise the sheer genius of some of the tricks people use online for getting their message across.

But my answer to the question is that u r is indeed a bit old hat.

This doesn’t mean some people won’t use it, but then some people also insist that it’s ‘may’ and never ‘can’ when we make a request and frankly, this is a line of argument up with which I will not put. I mean, I am a reasonably non-prescriptive sort of person, so I think you can do what you like with language by and large. Just don’t try to foist your antiquated and completely irrational random quirks on other people, is what I say.

Except about ‘however’ being used to join two ideas in one sentence. That’s just wrong.

The reason why u r, and gr8, and so on came into existence, or at least widespread use, was that at one point texting was done on phones where the numbers and the letters had to share the same buttons. To get to some letters you would be tapping the buttons multiple times. Which is why this method was called, wait for it, the multi-tap.

Image by Mabel Amber from Pixabay

‘E’, for example, required two button pushes (I think. It’s been a while).
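For anyone who wants to see the arithmetic, here is a rough sketch of multi-tap counting. It assumes the standard letter layout on keys 2 to 9 and, as a simplification of my own, that a space costs one press. Digits are left out on purpose, because how you got to them varied from handset to handset. The function name is just for illustration.

```python
# Letter layout of a standard phone keypad (keys 2-9).
KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# A letter's position on its key is the number of presses it needs.
PRESSES = {
    letter: position + 1
    for letters in KEYS.values()
    for position, letter in enumerate(letters)
}
PRESSES[" "] = 1  # assumption: one press of a dedicated space key


def multitap_presses(message):
    """Count the key presses needed to type a message of letters and spaces."""
    return sum(PRESSES[ch] for ch in message.lower())


print(multitap_presses("you are"))  # 15
print(multitap_presses("u r"))      # 6
```

Fifteen presses against six, for the same message. You can see the appeal.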

Add to this that you were restricted to 160 characters, and each further text would cost extra, or eat into your texting limit for the month, and it’s clear that finding abbreviations was the way forward. In much the same way that shorthand allowed secretaries to keep up with a spontaneous flow of speech for taking dictation. No time to write out ‘therefore’ in its entirety, use the three dots instead.

Texters, of course, were limited to the alphabet and numbers, so shorthand itself was out as adding in all the characters used in that would just make the button pushing more not less time consuming.

So what prolific texters in the 90s seem to have gone for is a greater relationship between sound and spelling than is usually allowed for in English, and leaving out vowels (hence, txt spk). As well as some acronyms like OMG.

OMG, incidentally, is not an Internet acronym as such. It seems to have been first used in the telegraph era at the beginning of the 20th century, another time when the fact that you needed to pay by the word encouraged people to start taking liberties with how they phrased their messages.

Of course, that doesn’t make txt spk easier to read. It’s basically like learning a whole new writing form, because we do not really read letter by letter, sounding each one out in our head laboriously until it matches the spoken form we recognise. Don’t get me wrong, it’s how we teach kids to read, but that’s really about getting them used to the basic sound spelling relationship. When they get good at it, they merrily take in whole words, whole groups of words at a time. As long as they are in forms they are already familiar with.

Txt spk, if you weren’t used to it, disrupted this process. I would put money* on it being a lot less prevalent in online chatrooms of the same era, when people had access to proper keyboards and wanted their interlocutor to understand their message with ease so they would be able to fire back a response as quickly as possible.

Except for some phrasings that were so widespread that they themselves became word formations which people would understand at a glance, without the need to consciously decode.

If you want to see why, here is a famous couple of sentences from a ‘back to school’ essay supposedly written by some teenager back in the day (probably apocryphal): My smmr hols wr CWOT. B4, we used 2go2 NY 2C my bro, his GF & thr 3 :-@ kids FTF. ILNY, it’s a gr8 plc.

Ow. I mean, it’s possible to puzzle it out, but it’s not a quick thing to do. Which is why it was suitable for 160 character messages, but not much more.

What did for this sort of very full on text messaging language was that mobiles started to be able to include full keyboards as part of their key pads. And, of course, predictive text. It’s harder, in fact, to get your phone to write ‘l8r’ than ‘later’ right now (5 taps vs 4, including having to switch from letters to numbers for the former).

So why bother, especially as it doesn’t actually add to comprehensibility for many people?

In fact, modern texting is a lot more like telegraph speak. Although the 90s keypad limitations have disappeared, we do still want to be concise because we are composing on the fly.

What I find interesting is that the Internet acronyms that have survived, proliferated in fact, are not the clever sound play abbreviations (youngsters don’t seem to know what BCNU means any more, for example, although you can work it out if you say the letters aloud), but the ones that reduce well known phrases to their initial letters.

In fact, a lot of the shorthands we use regularly now are just chunks of everyday phrasing that it would seem inefficient to write out in full. IDK, YMMV, IMO. Etc.

See what I did there?

What’s interesting is that in this they do rather mirror the point made by adherents to the Lexical Approach, that language is less about grammatical formulas into which we drop vocabulary, and more to do with combinations of words in fixed phrases that we store in our heads in their entirety. And then bang out in prefabricated chunks when we are trying to get our mouths and our brains lined up at speed and do not have time to be thinking about fabulous new combinations.

It’s not that we cannot play with language, it’s just that a lot of the time we don’t.

Computers helped us really solidify our understanding that this is how language is put together. They allowed us to see just how many times ‘tall’ went with ‘woman’ instead of ‘high’; that ‘terrible’ and ‘horrible’ are not exact synonyms because of the words they tend to be used with; that ‘have you ever…?’ is followed by one of only five verbs fifty percent of the time; or even that one meaning of ’cause’ takes ‘a’ and another takes ‘the’.

This is also what allows predictive text to work.

I mean, we all enjoy the hilarious results of the game where you type an opening phrase and allow autofill to do the rest. But the fact that what comes out is recognisable, and usually some very fixed if banal phrases, is the point. It doesn’t just work because you have typed in the first two or three letters of a piece of vocab and it’s coming up with the most frequent ways those can combine into a word for you to choose from. It’s working it out based on what you have said so far in the sentence, and what words frequently come after that.

Because that’s how language works.

And why we don’t feel the need to write it out in full, even if, sometimes, with Internet acronyms unfamiliar to us, it does still take us a while to work it out when we are reading it.
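The predictive text idea described above can be sketched in a few lines. This is very much a toy: it only looks at the one previous word, the training sentences are invented, and the function names are mine. Real keyboards are far more sophisticated, and I am not claiming this is how any of them actually do it.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count, for each word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def suggest(following, prev_word, prefix="", k=3):
    """Suggest up to k likely next words, filtered by letters typed so far."""
    candidates = following.get(prev_word.lower(), Counter())
    ranked = [w for w, _ in candidates.most_common() if w.startswith(prefix)]
    return ranked[:k]


# Invented mini-corpus, purely for illustration.
model = train([
    "see you later",
    "see you soon",
    "see you later then",
    "thank you very much",
    "see you tomorrow",
])

print(suggest(model, "you", k=1))         # ['later']
print(suggest(model, "you", prefix="s"))  # ['soon']
```

The first call relies only on the previous word; the second narrows the choice with the first letter you have typed, which is the combination the paragraph above describes.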

Text on image including internet acronyms: James Wong: [...] (Full disclosure: It took me ages to work out what 'cba' means.) Eve Simmons: I know baking bread is self care etc, but if you frankly cba, don't worry [...]

Amusingly we seem to be going full circle. I’ve caught my kids saying ‘press the ok button’, where ‘ok’ is pronounced as one word to rhyme with ‘clock’. Is this a Russian thing (it’s not the first time I’ve heard it here), or a young people thing (all the people who have said it to me are considerably more youthful than me)?

Genuinely want to know the answer to that, if anyone has any further data.

*I haven’t actually come across any studies where they’ve compared the two mediums, and I’m too lazy to go searching.

How good are you at online turn taking?

So I have discovered Gretchen McCulloch, and I am sulking a bit because I think she has stolen my ideal career. In a parallel universe, sort of thing.

Her raison d’etre is to delight in Internet linguistics. She’s written a book, Because Internet (a title I am wildly jealous of, best title for a book EVAH), and she co-hosts a podcast called Lingthusiasm, which to be fair is not just about the Internet, but is about being enthusiastic about linguistics for a reasonably non-specialist audience.

Obviously, the book, Gretchen McCulloch and the podcast are likely to come up quite a bit on this blog from now on as I work my way through the back catalogue of things I wish I had written and things I wish I had said. Although I must say that the potential pain of this is much mitigated by the delight in finding someone (well, people, including Lingthusiasm’s co-host, Lauren Gawne, and guests and so on), who actively agree with me about things like why Twitter speak is not a debased form of making the words go.

Anyway.

I was listening to the episode on conversational analysis, which had a lot about turn taking in*.

Now speaking as a sometime teacher of exam preparation classes, turn taking can get reduced to a set of functional phrases we try to din into students’ heads, mainly as a way of reminding them that there are bits of exams where they can in theory pick up marks by turning to their partner and saying ‘so what do you think?’ rather than trying to hog the limelight.

You can, if you are not careful, end up with students who have almost entirely content free conversations, consisting mainly of phrases for responding to each other.

But of course there’s a lot more to conversation than just the phrases, and a lot more to turn taking than explicit signals. Relinquishing a turn, holding the floor or diving into your part of the conversation are often managed by more paralinguistic means. For example (I learned from the podcast) we look away from our listener while taking our turn, and re-making eye contact is one way to show that you are about to pass the baton of speechifying over.

Which means I have been doing Zoom all wrong. Because I have been grimly staring at the camera for the entirety of my turn.

This may well be disconcerting for my listener, who is probably, if they are following face to face tendencies, looking at my face on screen and being freaked out by my unwavering gaze.

Of course, if they are staring at the window containing my face (likely), that means they are not actually looking at my eyes. Which is probably also sending the wrong signal.

Unless they’ve turned their camera off entirely, of course.

No wonder Zoom is tiring. All our usual cues are skewed.

I’ve heard it suggested that a good strategy for presenters is not to look at the camera except for key moments. You do it, the equivalent of catching an entire group of people’s eyes, to really make people pay attention. Which seems like good advice. Must try to follow it.

That said, turn taking is a horrible thing, I think, for non-native speakers to have to do at the best of times. Trying to get myself into a multi way chat in Russian along with trying to remember what conjugation of the verb, adjectival form and object case to use is extremely intimidating. It would be much easier if we did just manage every handover boundary with a nice fixed phrase. Because I am a teacher much more than a linguist, I consider this the real value of teaching set expressions, regardless of absolute authenticity. It builds people’s confidence to fling themselves into and out of the fast running waters of chat.

Image by 이룬 봉 from Pixabay

What makes it worse though (getting back to the topic of the podcast now) is that it’s not just about being a non-native speaker, but the norms of your local culture, or what personality type you are. And that can cause friction even between different flavours of native speaker, and not even ones from different countries.

Are you one of those people who dives in before an interlocutor has quite finished and finishes off their thought for them? (Yes). Or are you someone who needs a pause of some length before you recognise it’s your turn? (No). Will you top an anecdote with your own? (Yes). Or will you reserve your story until someone explicitly asks for it? (Hahaha. No).

These are all conflicting strategies used by different speakers, and you can irritate the crap out of other people if you are using strategy A with a strategy B type conversationalist.

Jump in whenever I even look as though I might be pausing for breath is my advice.

Except on Zoom or Skype, because the time lag means that instead of an elegant overlap of my final words, you’ll end up interrupting my next utterance. It’s type B people who rule that environment.

However, the bit of the podcast I found really interesting was when online written chat came up.

These days, whether you are talking about real time chat, or forums where more leisurely, asynchronous posting is the norm, you have to wait to see an utterance in full – you don’t have a chance to start thinking about your reply until after the other person has posted their utterance in full.

This is different to face to face communication. You can see (hear) face to face where your interlocutor is going a fair few syllables or more away from them completing their thought. Hence overlaps. Or, if you are a type B person, very short pauses (they are never very long).

(The reasons for this are quite interesting, but probably a diversion too far at this point. Hold that thought).

This, I expect, is why many of these chat programmes let you see a little ‘Bronwyn is typing’ to encourage you not to wander off when you don’t get a near instantaneous response to something you just typed in. Like you would in speech. Even from a type B person. I bet they ran tests and everything to see whether it is necessary to keep people on their platform for longer to have that little message.

But what is the actual etiquette of turn taking online (I started thinking), and can you spot whether someone is a type A or type B chatter from the way they post?

Take the issue of posting your whole thought, your whole turn, in one go vs turning it into a series of individual utterances or posts.

On the one hand, you have platforms which are not intended to be particularly live. Facebook, for example. It does rather invite longer, complete turn posts, not just for the original poster, but for every response thereafter. Multiple posts one after the other from the same person are a bit iffy. It’s the equivalent of hogging the floor, clogging up the thread like that. Or possibly coming across as overly scatterbrained.

On the other, you have Whatsapp and the like, which can be more spontaneous.

And then waiting for someone to finish a multiple utterance, full-thought post, complete with careful proof reading, could be intolerable. Even with ‘Bronwyn is typing…’

Plus, posting an initial short idea gives your chatees time to start thinking about the topic and their responses.

I mean, on Whatsapp, even if the next person’s turn then overlaps yours, because they have not just thought but also started typing, and even if there are a number of people participating who all press send at once resulting in, gasp, multiple overlaps, you can still dance through these multiple threads reasonably successfully. The quote function allows you to keep each strand reasonably coherent.

But perhaps this marks me out as a type A person even online, and drives type B people nuts. They, perhaps, would MUCH rather I manage to hold off mashing the enter button until I actually finish my whole thought. And possibly until I have edited my typos out too. They may also be praying that just this once I would just let them reply before I suddenly ping off in a different direction.

Ahem.

I used to take part in real time written meetings (this was before programmes like Zoom had made face to face online meetings between groups of people in many different locations reasonably doable. Ah the olden days. What was it, all of five years ago?).

To deal with people like me and make sure that turn taking was fairly even we had a system.

Type in + to raise your hand and bid for the next turn. And to ensure you got to keep the floor when you were typing, no-one could take over until the speaker had written * to show they had finished.
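If you like your etiquette spelled out as rules, the + and * system can be sketched as a little floor manager. The details here are my guesses rather than a record of how our meetings actually ran: I am assuming bids are served first come, first served, that there is no chair, and that anything typed by someone who does not hold the floor simply gets turned away. The class and method names are made up for the example.

```python
from collections import deque


class Floor:
    """Toy turn-taking manager: '+' bids for a turn, '*' ends your turn."""

    def __init__(self):
        self.holder = None    # who currently has the floor
        self.queue = deque()  # people who have bid, in order (assumption)
        self.log = []         # accepted contributions

    def post(self, user, text):
        """Return True if the post is accepted, False if it is out of turn."""
        if text == "+":
            if self.holder is None:
                self.holder = user
            elif user != self.holder and user not in self.queue:
                self.queue.append(user)
            return True
        if user != self.holder:
            return False      # not your turn
        if text == "*":
            # Hand the floor to the next bidder, if there is one.
            self.holder = self.queue.popleft() if self.queue else None
            return True
        self.log.append((user, text))
        return True


floor = Floor()
floor.post("Ann", "+")                   # Ann takes the floor
floor.post("Bob", "+")                   # Bob queues up
print(floor.post("Bob", "Can I just"))   # False: Ann has not finished
floor.post("Ann", "Here is my point.")
floor.post("Ann", "*")                   # Ann is done, Bob is up
print(floor.post("Bob", "Thank you."))   # True
```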

Of course, this was a more formal context, and turn taking was therefore more formally managed in the same way formal face to face spoken word meetings have management.

Definitely thought up by a type B person though. She says, provocatively.

So, which type of conversationalist are you, and do you think you behave the same online as you do off? And do you use the chatbox while someone is having a long turn in a Zoom meeting to add your own side commentary, and what does that say about you? Are there any other turn taking idiosyncrasies you have noticed? Answers below!

*Episode 39: How to rebalance a lopsided conversation.

Indexicality, Expertise and Twitter

It surely comes as no surprise that we use language differently depending on who we are talking to. Or where we are. And whether we are tapping away with our thumbs on a smartphone, using a fountain pen on our best headed notepaper, speaking face to face, or screaming at our other half on the phone over the roar of public transport.

Some of this is to do with the constraints of the method we are using. But a lot of it is to do with making sure that we are being sufficiently polite and paying enough respect to our interlocutor’s status and relationship with us.

While not overdoing it. Because that would be weird. And come across as sarcastic.

So, how do we know which version of all the available phrasings to use?

I mean, quite often the context is, or should be, enough. You don’t need any special cues to be super polite to every person you encounter in the building where your super important job interview is taking place.

Hopefully.

Who knows whether that woman or that man might turn out to be the interviewer and not the intern, despite their apparent youthfulness.

And whilst people used to make fun of the ‘sent from my iPhone’ signoff, it was a way of signalling that the production circumstances were less than ideal. Which might account for any mismatch between the way the message was expressed and what you might have thought was due to you.

I came across a reference recently* to the way that Korean telephone opening sequences are a bit longer and involve more phrases back and forth, because, it was speculated in this book, it gave the listener a bit longer to tune into the way of speaking of the other person when no visual clues are available. And this increased the chances that the participant would be able to use the appropriate form of polite address in a language where this is hardwired into the grammar and important to get right.

It’s easy to condemn the idea that we are swayed by someone’s accent into assuming their social class, likely educational background and status in relation to us. It’s a lot harder to say it doesn’t actually happen, that we don’t do it all the time.

This is the nice thing about moving to a completely new country. The intricacies of social positioning are largely a mystery, at least at the beginning. I still can’t pick out a Muscovite accent from a St Petersburg one, or even from that of someone from further away from the centre of civilisation, although I’ve internalised some of the visual stereotypes.

Which brings me to how we signal expertise in the relatively anonymous arena of Twitter, where everyone certainly does not know your name unless you are a truly global superstar.

Image by Karen Arnold from Pixabay

Of course, the first rule of being an expert on Twitter is to avoid being a woman. It is extremely noticeable that experts of the unfortunately female persuasion have a distinct tendency to add ‘Dr’ to their Twitter handles. Yes, I am aware it was one of those collective social media episodes from a while back. The fact that it has stuck does rather suggest that it is slightly easier for a lady nuclear physicist to comment on nuclear physics without having a relatively large number of people come and tell her to read her own authoritative book on the subject before she dares make such wrongheaded statements.

The blue ticks you say? Well indeed. That can backfire though if you say something and people note the blue tick and decide you are insufficiently famous or expert to have such a mark of distinction. It’s not quite the invulnerable shield, although it’s interesting that people tend to react as though it is a (wrongly given) blanket endorsement of any and all views by Twitter. Even though it’s been given for notability in a particular area.

But what prompted this post was the discovery of this perfect example of a linguistic trick. Or rather this perfect example of someone taking the piss out of a linguistic trick being used to signal expertise in a suddenly interesting field. To muscle their way to the top of the punditry pile in a mass of billions of voices on social media, presumably.

Text reads: one of the absolute, surest signs that somebody is about to start talking bullshit about China (zhongguo, or 'the Middle Kingdom') is when they start unnecessarily putting the Chinese in when addressing a foreign (waiguo) audience

This sort of thing is called indexing, by the way, in case you thought there wasn’t actually a discourse analysis connection.

Indexicality is the idea of language having contextually bound meaning, although as far as I can tell, when one uses the verb form, one is able to talk about people using language in a particular way so as to construct an identity in a particular context. Such as the identity of an expert.

So, what flexes do you use to index your expertise, and when do you find yourself needing to do it?

* In Barbara Johnstone’s Discourse Analysis.

Lexical Cohesion and SEO for Bloggers

At some point if you write a blog, you will get a nagging feeling that you ought to be doing more about SEO, or Search Engine Optimisation. Or the plugin you use round the back end of the blog to monitor the situation will start nagging you about it, which is what generally happens in my case.

Search Engine Optimisation is how you explain to search engines (well, Google really, let’s face it) what your blog post is about and why they should put it number one in the list of any results when someone types in a query related to your topic.

And this has a lot to do with what discourse analysis says about aspects of how texts hang together. And the limitations of computing. Or possibly the limitations of people’s understanding of computing and/ or discourse. I haven’t quite decided which yet.

Now, obviously we are all aware that texts have structure. Introductions, arguments for, arguments against, conclusion. Sort of thing.

But we also use words to tie it all together, which is called lexical cohesion.

Image by Alicja from Pixabay

One way we do that is via referencing. ‘Taylor Swift’ will not always appear as ‘Taylor Swift’ in a 2025 newspaper article about the highest selling recording artist of this or any other generation. She might become ‘she’.

But she may well also be ‘Taylor’, ‘Swift’, ‘the singer’, ‘the highly accomplished lyricist’, ‘the pop phenomenon’ and ‘the savvy social media influencer’. Which is an example of elegant variation in a lexical chain of words and phrases all really referring to the same thing. It is considered good style and is a feature of the way we recognise a text as a text and not a list of random sentences.

One of the things linguists have been able to do with the advent of computing is feed texts into a computer to make a bank of language samples called a corpus. They can then analyse this corpus for patterns of language use with far greater statistical authority than some old English geezer with a quill pen using his intuition and a few interviews to work out which words we actually use and what they mean in order to write them all down in the first dictionary.

And one of the things you can look at using this sort of software is how the words of a text tell you what it is about.

(I know! The revelation! Bear with me!)

This has led (via a few other steps) to the somewhat depressing conclusion that in order to understand comfortably any given English language text, you need to know in the region of the most frequent 16 000 words*. Just to give you some idea of why this is somewhat horrifying for a learner of English, survival level, a sort of everyday vocabulary for getting about, is said to be 2 000 – 3 000 words, and once you reach Upper Intermediate level, which is pretty jolly competent really, you might expect to know 6 000 – 10 000. For a given value of know. For a given value of words too.

So an extra 6 000 is quite a big ask, especially as there are some complicating factors which I am not going to get drawn into here.

Why has it led to this conclusion?

Well, a large proportion of a text, a very large proportion, will be made up of very common words. But the final few percent are the words that are specific to the topic of the text. And even in non-specialist texts, this will probably involve a whole bunch of words that are not at all frequent, words that are connected to the topic at hand. Words that, if you broke the text down into its component parts, you might be able to categorise as ‘verbs to describe ways of eating’, ‘adjectives to describe smell, taste and food texture’, ‘different ways of referring to cooking techniques’, and ‘lots of synonyms or near synonyms for “cod”, “fried potatoes” and “eating with your fingers”’. Until you could confidently say without actually reading it from start to finish that this text is probably about a foodstuff called fish and chips.

Which, of course, is what Google’s search engines (and similar) do when they are trying to work out what your online text is about in preparation for serving it up when someone types ‘fish and chips’ in the search bar.

More or less.
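The ‘throw away the common words and see what is left’ idea is easy enough to sketch. To be clear about what is invented here: the list of common words is a tiny one I made up for the example (a real one would come from corpus frequency data), the sample text is mine, and I am not suggesting this is what any actual search engine runs.

```python
import re
from collections import Counter

# A toy stand-in for a real high-frequency word list (assumption).
COMMON = {
    "the", "a", "and", "on", "of", "to", "in", "it", "is",
    "was", "were", "we", "with", "our",
}


def topic_words(text, common=COMMON, top=3):
    """Return the most frequent words once the very common ones are removed."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in common)
    return [w for w, _ in counts.most_common(top)]


sample = (
    "The batter on the cod was crisp and the chips were hot. "
    "We ate the cod and the chips with our fingers, "
    "and the chips were the best bit."
)

print(topic_words(sample, top=2))  # ['chips', 'cod']
```

No reading from start to finish required, and you would still put money on fish and chips.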

However, there is a lot of writing out there on the internet now. If you want your piece to compete with all of the many many lots of words about fish and chips you are going to have to make it stand up and SCREAM ‘my post is the most relevant’.

Preferably by using the exact fish and chip related phrase that people are googling for.

In the past, this meant breaking some of the offline text construction tendencies in that once you had settled on your keywords, you overdid the repetition element of lexical cohesion. Which is always a feature of texts, because after all there really are only so many ways you can call Taylor Swift fabulous and full of awesome before you run out of paraphrase. So a text about Taylor Swift will indeed contain a higher number of examples of the name Taylor Swift than you would expect in a text which is not about Taylor Swift.

But only to a certain degree. Normally.

Have you, in short, ever come across a post online and noticed that they seem to have rather overdone the term ‘learn about SEO and lexical cohesion’ by shoehorning that exact phrasing no less than three times into one short paragraph? And thereafter another twenty times with no variation at all for the rest of the piece?

It doesn’t help that such writers will also eschew ellipsis (If you want to learn about SEO and lexical cohesion this article will give you everything you need to learn about SEO and lexical cohesion) or substitution (If you have read a thousand articles where you learn about SEO and lexical cohesion, you may not want to read another article where you learn about SEO and lexical cohesion, but…).

Luckily for teachers trying to find authentic online texts to train our students on, this practice is starting to be considered outdated, presumably because either search engines are now a little more sophisticated, or the people who try to game them have become more sophisticated in their understanding of how they work.

Although my SEO plugin does still tell me off if I use the keyword I am trying to rank for fewer than X number of times for a given length of text, which generally results in me having to go back and do a bit of editing out of pronouns or synonyms.

I suspect there is still a bit of an advantage therefore in using your keywords more frequently than seems natural (to me), but I have not tried out what happens if I go overboard and use too many examples of the SEO keywords. Is there an upper limit according to Yoast? Might have to experiment.
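For the curious, the sort of count a plugin might be doing can be sketched like this. I do not know what formula or thresholds Yoast really uses, so treat everything here as an assumption: the function name, the ‘hits per 100 words’ measure and the sample text are all mine.

```python
import re


def keyphrase_stats(text, phrase):
    """Count exact occurrences of a keyphrase and express them per 100 words."""
    words = re.findall(r"\w+", text.lower())
    target = re.findall(r"\w+", phrase.lower())
    n = len(target)
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == target
    )
    per_100 = hits * 100 / len(words) if words else 0.0
    return hits, per_100


post = "Fish and chips is great. I love fish and chips."
print(keyphrase_stats(post, "fish and chips"))  # (2, 20.0)
```

Note that only the exact phrase scores. ‘She’, ‘the singer’ and ‘the pop phenomenon’ would all count for nothing, which is exactly why elegant variation and this kind of counting do not get along.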

Of course, more interesting is what will happen if copywriters win and shift what is considered the optimal, correct, elegant or [insert your own adjective here] way to achieve lexical cohesion in a text. Certainly one wonders what keyword stuffing has done to corpus data in the meantime.

And of course, I have made a massive assumption that this practice should be considered ‘wrong’, which is always a dangerous thing to do when considering language use in the wild.

Have you ever noticed this phenomenon or is it less glaringly obvious than I think it is? Would it not bother you if it became the norm or are you relieved it is on the wane?

*I got this from From Corpus to Classroom: Language Use and Language Teaching by O’Keeffe, McCarthy and Carter.

Hello World

My name is Heather. I am a native Brit, and I have been an English as a Foreign Language teacher since 1996.

I started out volunteering as a teaching assistant in Moscow. Having discovered to my complete surprise I really liked teaching, I got qualified in EFL instruction, moved to Russia full time and never looked back. Except that time I taught History to teenagers.

Since then I have worked in both the UK and Russia, in private language schools and the state sector, as a teacher, an academic manager and a teacher trainer.

This blog isn’t really about that though. It’s my love letter to discourse analysis, social media and online communication.

Image by Free Photos from Pixabay

What discourse analysis is can be quite hard to define. Whole books have been written on the topic, but let’s have a stab, shall we?

Discourse analysis is the study of language at text level, with text being defined much more widely than neatly complete written articles in newspapers or whole novels. It’s a fairly interdisciplinary sort of field involving everyone from linguists, the language teaching profession, sociologists, anthropologists, to computer scientists trying to programme AI, and that’s not even an exhaustive list.

To me, it’s super interesting because it’s where sentences or words stop and communication begins. It’s about the choice of phrasing. What intonation does to the message. It’s about the aspects of language which cannot be described by a grammar reference book. And it’s about the nature of how we cope with trying to construct utterances in real time, and what happens when we can wield words with more consideration.

And it’s about why it all goes wrong and we have cut all ties with Auntie Vera because of the way she used ‘well’ on WhatsApp.

I will mostly be writing about whatever I have last been reading on the topic, possibly illustrated with stuff people have said on social media. I love online communication. I happen to think that because it is an interesting blend of spoken and written language, it has turned us all into discourse analysts. Moves that people might have got away with in ephemeral speaking get clocked much more easily by casual onlookers on the Internet. Plus, of course, some of the gloves are off in a medium which transcends the need to get along with your neighbour for the foreseeable future.

I wanted to call the blog WEAPONISING DISCOURSE ANALYSIS ON SOCIAL MEDIA, in fact.

But I couldn’t figure out how to fit that neatly into a URL. Or come up with a version of the name that would not get me an immediate reputation on Twitter.

So Those Sharp Words it is. Thanks to my friend who is much better at snappy titles than I am.

I am highly unlikely to have an original thought on this topic. I am not going to be doing formal discourse analysis myself. But I hope there are other people out there who find this as interesting as I do, and I am looking forward to connecting with them.