Why Does the Word Count Differ Between Programs?
Counting characters, words, and lines matters a great deal in content writing, for both the writer and the client. The count sets the rate for an entire project and, with it, the price. So, if you rely on only one writing platform, you could be misled by its word count feature. Doesn’t sound like a big deal? Well, you could be paying more than you should for your articles.
That is because different writing platforms calculate the word count of a file in different ways. For instance, Google Docs and Microsoft Word can give you two different word counts, even though the text being processed is the same! Thus, no single count is entirely reliable.
“But why does that happen?” I hear you ask. It is a little complicated, but we have broken it down for you so that it’s easy to understand. Keep reading!
How Do the Most Common Programs Count Words?
Have you ever noticed a slight difference between the word count in one program and another? If you have, hats off to you! You weren’t just “seeing things”—there is actually a difference between how writing platforms count words.
When these platforms are designed, they have word count algorithms programmed into them. That is just a fancy way of saying a programmer fed the platform a formula for counting words. And because the platforms we use have been made by different companies (Microsoft, Apple, Google, and so on), the formulas are not all the same.
To better understand how and why word counts differ between programs, let’s look at the science behind this commonly used feature (hint: it’s easier than it sounds).
The Algorithm for the Microsoft Word Count Feature
Microsoft Word might be the most popular writing platform out there. Its programmers taught it to consider any string of “things” between two spaces a word. That is to say, a word doesn’t necessarily have to contain letters. It could simply be a long string of numbers or symbols. For example, “12345” is a word according to Microsoft Word!
Another way to see how this algorithm works is to look at these two sequences: “and/or” and “and / or.” You and I might say each sequence counts as two words. However, if you go by Microsoft Word’s algorithm, the first sequence is only one word (because there are no spaces there) and the second is three words (because it counts everything between spaces as a word, even the slash symbol). Odd, right?
Plus, the curiosities don’t end there! Microsoft Word typically counts fewer words than other writing platforms because it does not include headers, footers, and the words in text boxes in its count. However, you can change the settings so that the algorithm starts counting these if you want.
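If you are curious what the “everything between spaces is a word” rule looks like in practice, here is a minimal Python sketch. It only approximates the behavior described above and is not Microsoft Word’s actual code; the function name count_words_whitespace is made up for illustration.

```python
def count_words_whitespace(text: str) -> int:
    """Count every run of non-space characters as one word.

    Approximation of the rule described in the article, not the
    real implementation of any word processor.
    """
    # str.split() with no arguments splits on any run of whitespace
    return len(text.split())


for sample in ["12345", "and/or", "and / or"]:
    print(f"{sample!r}: {count_words_whitespace(sample)}")
# '12345': 1
# 'and/or': 1
# 'and / or': 3
```

Notice that the function never checks for letters at all, which is why a bare slash or a string of digits still counts.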
The Algorithm for LibreOffice
LibreOffice might not be as widely used as Word, but it is still an especially useful tool for content writers and their clients. It works very similarly to its Microsoft counterpart, which is to say it follows an algorithm that is practically identical to that of Microsoft Word. Thus, if you understand Microsoft Word’s counting feature, you understand LibreOffice’s as well!
Let’s break down the formula LibreOffice uses to tell you how many words are in your document:
• Every string of characters between two spaces is considered a word: This means “and / or” is three words, but “and/or” is only one word.
• Hyphenated words only count as one word: We can gather that, according to LibreOffice, “fast-paced” counts as one word, while “fast paced” counts as two. That is because in the first example a hyphen joins the two strings of letters, with no space between them. But in the second example, a space is there instead of the hyphen, which tells LibreOffice it must be two words.
• Combinations of characters and numbers are counted as words, if they are between spaces: So, LibreOffice will tell you that “Cat123 $1.34 * https://111.com” is actually four words because they are all between spaces—whereas you and I would agree that example has no words, just random numbers and characters.
Let this be your reminder that, no matter how advanced computers are, they still don’t behave like us and will tolerate all kinds of nonsense!
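To see which pieces of text end up as “words” under the rules listed above, it helps to print the pieces themselves. This is a small Python sketch under the assumption that splitting on spaces is all that happens; whitespace_tokens is a hypothetical helper, not part of LibreOffice.

```python
def whitespace_tokens(text):
    """Return the pieces of text that sit between spaces."""
    return text.split()


examples = ["fast-paced", "fast paced", "Cat123 $1.34 * https://111.com"]
for text in examples:
    tokens = whitespace_tokens(text)
    print(len(tokens), tokens)
# 1 ['fast-paced']
# 2 ['fast', 'paced']
# 4 ['Cat123', '$1.34', '*', 'https://111.com']
```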
The Algorithm for Pages
If you have an Apple device, you have likely used Pages before. This word processing tool is Apple’s alternative to Microsoft Word, and it counts words differently. Out of all the writing platforms we will be walking you through today, Pages often shows users the highest word count.
This is because Pages (unlike Microsoft Word and LibreOffice) counts the words used in charts, graphs, text boxes, headers, and footers. If you’re one to feel super productive when looking at the word count box in the corner of your writing platform, Pages will delight you.
Moreover, Pages has a more ingenious word count formula than other text processing tools. If you think back to the example we gave earlier, Pages counts “and/or” as two words, just like you and I would. The fact that there isn’t a space between “and” and “or” doesn’t throw off Pages’ word count. Similarly, “fast-paced” is counted as two words.
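A rule like this one can be imitated with a regular expression that looks for runs of letters and digits instead of splitting on spaces. The Python sketch below is a guess at the behavior described in this article, not Apple’s code. The name count_words_pages_like is hypothetical, and keeping apostrophes inside a word while treating underscores as separators is an assumption based on what the article reports about Pages.

```python
import re

# A word is a run of letters/digits; an apostrophe (straight or curly)
# may join two runs so contractions stay whole. [^\W_] means
# "a word character that is not an underscore".
PAGES_LIKE_WORD = re.compile(r"[^\W_]+(?:['’][^\W_]+)*")


def count_words_pages_like(text: str) -> int:
    """Count runs of letters and digits, splitting on all other symbols."""
    return len(PAGES_LIKE_WORD.findall(text))


for sample in ["and/or", "and / or", "fast-paced"]:
    print(f"{sample!r}: {count_words_pages_like(sample)}")
# 'and/or': 2
# 'and / or': 2
# 'fast-paced': 2
```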
The Algorithm for Google Docs
The last writing platform we will take a look at is Google Docs. This program only counts strings of letters and numbers as words.
So, if you write “???” in Google Docs, the platform will tell you there are no words there. But if you write “1234?1234”, Google Docs reports two words (here, the question mark is read as a space, thus giving rise to two distinct words). On top of all that, Google Docs counts a URL as several different words, split at the symbols (colons, periods, or dashes).
For example, if you write “www.xyz.com” in Google Docs (or Pages), you will get three words. But if you write that same URL in Word or LibreOffice, you will get only one word. If your article has a lot of links, your word count could be way off.
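The “only letters and numbers count” idea can be sketched in a few lines of Python. This is an approximation of the behavior described above, not Google’s implementation, and count_words_docs_like is a made-up name. One assumption to flag: the pattern uses \w, which also accepts underscores, and it lets an apostrophe join two runs.

```python
import re

# \w matches letters, digits, and the underscore; everything else
# (question marks, periods, slashes...) acts as a separator.
DOCS_LIKE_WORD = re.compile(r"\w+(?:['’]\w+)*")


def count_words_docs_like(text: str) -> int:
    """Count runs of letters/digits; symbols on their own count as nothing."""
    return len(DOCS_LIKE_WORD.findall(text))


for sample in ["???", "1234?1234", "www.xyz.com"]:
    print(f"{sample!r}: {count_words_docs_like(sample)}")
# '???': 0
# '1234?1234': 2
# 'www.xyz.com': 3

# The same URL under a plain split-on-spaces rule is a single word
print(len("www.xyz.com".split()))  # 1
```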
A Note on Different Languages
All the examples we gave you were in English. Yet, all these writing platforms apply the same formulas to other languages as well. To clarify this a little, let’s look at some examples.
In French, question marks (along with some other punctuation marks) are separated from the last word of the sentence by a space. Thus, you could find “Comment ça va ?” (meaning: How are you?) written in a document. Microsoft Word and LibreOffice will count that as four different words because the lone question mark sits between spaces, whereas Pages and Google Docs will count it as only three.
That makes sense, right?
There are many more examples of how all these popular word counting features work in different languages—too many for us to go into. However, they always fall into the more general rules we described for each writing platform.
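The French example is easy to reproduce with two rough counting rules side by side. The sketch below is in Python and uses two hypothetical helpers: one splits on spaces (the Word and LibreOffice style described earlier), the other only counts runs of letters and digits (the Pages and Google Docs style). Neither is the real code of any of these programs.

```python
import re


def count_by_spaces(text: str) -> int:
    """Everything between spaces is a word, even a lone '?'."""
    return len(text.split())


def count_by_letters(text: str) -> int:
    """Only runs of letters and digits are words."""
    return len(re.findall(r"\w+", text))


sentence = "Comment ça va ?"
print(count_by_spaces(sentence))   # 4
print(count_by_letters(sentence))  # 3
```

In Python 3, \w matches accented letters such as “ç”, so “ça” stays in one piece.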
How Words Are Counted in Popular Counting Tools
Does that sound like a lot of information to take in? If it does, don’t worry. We believe these next examples will illustrate all the main differences between word count programs.
• IFO 9002:2017 — Pages and Google Docs tell you this sequence contains three words. LibreOffice and Microsoft Word only count two words. Why? Because the colon between the digits tricks them into seeing the combination as one large number.
• Mary@qwr.com — Pages and Google Docs once more agree this sequence contains three words. LibreOffice and Microsoft Word band together in counting only one word.
• 1,324 — Surprise, surprise: Pages and Google Docs count this number as two words (because the comma there is read as if it were a space, thus giving rise to two words), while Microsoft Word and LibreOffice disagree and both state that it is one word.
• Shouldn’t — Because words such as this one (that is to say, contractions) are so common, programmers have tweaked their writing platforms to look at it as just one word. This is true for LibreOffice, Microsoft Word, Pages, and Google Docs. Even if you write two different words and join them with an apostrophe, these processing programs will still count it as one word. Try it out with Mary’Jane!
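The four examples above can be checked in one go with two simplified counting rules. This Python sketch is an illustration under stated assumptions: split_rule stands in for the Microsoft Word and LibreOffice behavior, symbol_rule for the Pages and Google Docs behavior, and both names are invented here.

```python
import re


def split_rule(text: str) -> int:
    """Word/LibreOffice style: count the pieces between spaces."""
    return len(text.split())


def symbol_rule(text: str) -> int:
    """Pages/Google Docs style: symbols split words, apostrophes do not."""
    return len(re.findall(r"\w+(?:['’]\w+)*", text))


examples = ["IFO 9002:2017", "Mary@qwr.com", "1,324", "Shouldn’t", "Mary’Jane"]
for text in examples:
    # text, count under the split rule, count under the symbol rule
    print(text, split_rule(text), symbol_rule(text))
# IFO 9002:2017 2 3
# Mary@qwr.com 1 3
# 1,324 1 2
# Shouldn’t 1 1
# Mary’Jane 1 1
```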
Alright, now that we have gone through so many isolated examples, let’s put your understanding of how the word count differs between programs to the test. We will be using the same written text in all these examples (yay for the scientific method!):
Hi, my name is Elmore and I live on Willow/Harrington street. If you would like to contact me please make use of my email address which is firstname.lastname@example.org. It will only take you 1.375.20 seconds to get in contact with me, that’s how quick I am at responding to my emails. If you feel that I am unresponsive in my emails, please connect with me through my website which is www.standfor.willow.com, hope to hear from you soon! If you would like and/or need more details do let me know!
Microsoft Word and LibreOffice
In this sample, the total word count is 93 words, and this is how it breaks down:
• Words with no spaces are considered one word, such as “Willow/Harrington” or “and/or”
• Phrases with a combination of characters are considered two words, for example, “IFO 9002:2017”
• URLs are seen as a single word
• Contractions are just one word, such as “shouldn’t” and “that’s”
• Sequences of numbers are one word, such as “1.375.20”
• Email addresses are counted as one word
• Words containing symbols are also one word, such as “Willow/Harrington”
As we have seen before, LibreOffice and Microsoft Word share a word counting algorithm, so there is no difference here between the two!
Google Docs
In this sample, Google Docs counts 103 words, ten more than Microsoft Word and LibreOffice do! Let’s look at why that happens.
• Words with symbols but no spaces between them are considered two words, which is what happens with “Willow/Harrington”
• Combinations of characters are seen as several words, such as in “IFO 9002:2017” which counts as three words
• Underscores are not read as spaces, which gives rise to “elmore_cook” being read as one single word
• A URL is registered as four words, for example “www.standfor.willow.com” because the periods between the letters count as spaces
• Contractions are counted as just one word, hence “that’s” is not seen as two words
• A number separated by symbols counts as several words, in this case, “1.375.20” is counted as three words
• Email addresses are recorded as more than one word, for example, email@example.com is three words
Pages
According to Pages, the text we wrote down has 102 words (one fewer than Google Docs). Here is a sneak peek into the logic of the program.
• Phrases with a combination of characters, numbers, and symbols are read as more than one word, in this case, three words make up “IFO 9002:2017”
• Contractions are counted as a single word, which is why “that’s” is not two words
• A URL is registered as several words, for instance, “www.standfor.willow.com” is read as four words separated by periods
• Underscores are read as spaces, thus resulting in “elmore_cook” being seen as two separate words
• Words with symbols between them (rather than spaces) are still read as two different words, such as in “Willow/Harrington” which counts as two words
In short, the only way Pages reads this example text differently from Google Docs is in how it treats underscores.
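To line up the three behaviors described in these breakdowns, here is a Python sketch that runs the tricky pieces of text through three simplified rules. All three functions are hypothetical stand-ins built from the bullet points above, not the programs’ real algorithms, so running them on a whole paragraph will not necessarily reproduce the totals each program reports.

```python
import re


def word_libreoffice_like(text: str) -> int:
    """Count the pieces between spaces."""
    return len(text.split())


def docs_like(text: str) -> int:
    """Symbols split words; underscores and apostrophes do not."""
    return len(re.findall(r"\w+(?:['’]\w+)*", text))


def pages_like(text: str) -> int:
    """Same as docs_like, except underscores also split words."""
    return len(re.findall(r"[^\W_]+(?:['’][^\W_]+)*", text))


pieces = ["Willow/Harrington", "www.standfor.willow.com", "that’s",
          "1.375.20", "elmore_cook"]
for piece in pieces:
    print(piece, word_libreoffice_like(piece), docs_like(piece), pages_like(piece))
# Willow/Harrington 1 2 2
# www.standfor.willow.com 1 4 4
# that’s 1 1 1
# 1.375.20 1 3 3
# elmore_cook 1 1 2
```

Only the last row differs between the second and third columns, which matches the point about underscores.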
Knowing the right word count is equally important for the client and the writer. Relying on just one program can throw off your count. Thus, when a project is paid for by the word, it’s key that you know exactly how many words you are paying for.
The easiest way to tackle this problem is to agree with the writer on which program’s word count feature to use. Alternatively, you could run the text through two platforms (for example, Google Docs and Microsoft Word) to get two different word counts, and then take the average of the two.
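As a quick worked example of the averaging approach, the Python sketch below uses the two totals quoted earlier for the sample text (93 and 103). The function name average_count is made up for illustration.

```python
def average_count(counts):
    """Return the mean of several word counts for the same text."""
    return sum(counts) / len(counts)


# Totals reported earlier in the article for the same sample text
counts = {"Microsoft Word": 93, "Google Docs": 103}

gap = max(counts.values()) - min(counts.values())
agreed = average_count(list(counts.values()))

print(gap)     # 10
print(agreed)  # 98.0
```

On a short sample the gap is only ten words, but on a long, link-heavy article billed by the word it can add up.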