“True” Word Count In LaTeX
February 7, 2007

Posted by Carthik in commands, packages, Readers' Tips.

By way of Wei comes this little nugget of useful information of the kind I love.

If you count the words in a LaTeX document using the “wc” command, you will find that you have counted, in addition to the words you wrote, all the LaTeX formatting text, like the “\paragraph”s and the “\textit”s.
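
To make the overcount concrete, here is a small sketch. The five-line sample file and the variable names are made up for illustration and are not part of Wei's tip; only wc, printf and mktemp are used.

```shell
# Hypothetical sample document, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
    '\documentclass{article}' \
    '\begin{document}' \
    '\section{Intro}' \
    'This is \textit{very} short.' \
    '\end{document}' > "$sample"

# wc -w counts every whitespace-separated token, markup included.
raw_count=$(wc -w < "$sample" | tr -d ' ')
echo "wc -w reports: $raw_count"

rm -f "$sample"
```

The sample contains five words of prose (Intro, This, is, very, short), yet wc -w reports 8, because lines such as \begin{document} each count as one word.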

Of course, if you use Kile like I do, all you have to do is go to “File -> Statistics” to see the word count. But if you don’t use Kile, you can follow Wei’s advice and install the “untex” package:
$ sudo apt-get install untex
and then run:
$ untex source.tex > target && wc -w target
to count the number of words in the file named “source.tex”.
Alternatively, you can use this online tool to count the words.

A word of caution here: untex does not ignore equations, so the word count might be off by a bit. If you are a perfectionist, I would recommend using detex instead. There is no separate package for detex; it ships in the Ubuntu package texlive-extra-utils.
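
As a rough sketch of how the two tools could be combined, the helper below prefers detex, falls back to untex, and otherwise falls back to a plain wc -w. The function name count_tex_words is hypothetical, and the raw fallback is an assumption added here so the snippet still runs on a machine with neither tool installed.

```shell
# Hypothetical helper: count words with detex if present, else untex,
# else fall back to a raw wc -w (which overcounts because of the markup).
count_tex_words() {
    if command -v detex >/dev/null 2>&1; then
        detex "$1" | wc -w
    elif command -v untex >/dev/null 2>&1; then
        untex "$1" | wc -w
    else
        echo "detex and untex not found; raw count includes markup" >&2
        wc -w < "$1"
    fi
}
```

Usage would be along the lines of: count_tex_words source.tex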

If your document has citations and references, includes other files, and so on, the only reasonably efficient way to count the words in the final result is to convert the PDF file to text and then count the words. Here is a command that will help you do that:
$ pdftotext file.pdf - | grep -E '\w\w\w+' | iconv -f ISO-8859-15 -t UTF-8 | wc

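pdftotext is a command-line utility provided by Xpdf. You may have to tweak the charsets in the previous command. Note that wc without options prints the line, word and byte counts, in that order, and that the '\w\w\w+' filter keeps only those lines that contain at least one run of three or more word characters.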
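
To show what the '\w\w\w+' filter in the command above actually does, here is a small sketch on made-up text standing in for pdftotext output. The grep -oE variant is not from the original command; it is an alternative shown only for comparison, for the case where you want to count just the words that are three or more characters long.

```shell
# Made-up stand-in for pdftotext output (three short lines).
text=$(printf '%s\n' 'a b' 'one of two' 'hello')

# grep -E keeps whole lines that contain a run of 3+ word characters,
# so short words on a kept line ("of") are still counted by wc -w.
kept_words=$(printf '%s\n' "$text" | grep -E '\w\w\w+' | wc -w | tr -d ' ')

# grep -oE prints each 3+ character match on its own line instead,
# so counting lines counts only the longer words.
long_words=$(printf '%s\n' "$text" | grep -oE '\w\w\w+' | wc -l | tr -d ' ')

echo "words on kept lines: $kept_words"
echo "words of 3+ characters: $long_words"
```

Here the first count is 4 (one, of, two, hello) and the second is 3 (one, two, hello); the line “a b” is dropped in both cases.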

Comments»

1. Luke - February 8, 2007

I’m assuming that this would also work:

$ untex source.tex | wc -w

It’s less typing and fewer disk operations, because the file is read only once (as opposed to two reads and a write in your example).

I haven’t tested it but it looks like it would work.

2. sam tygier - February 8, 2007

There is a nice script called texWordCount.pl at
http://www.comp.nus.edu.sg/~kanmy/software.html that shows the total word count and also the count per section. It can properly handle included files as well.

3. ubuntonista - February 8, 2007

Hey Sam, thanks for stopping by, and for the script!

4. Kimie Nakahara - May 6, 2007

Very good tip! Helped me a lot!

Thank you!

5. miscellaneous factZ » Blog Archive » Counting Words in a Latex File - August 24, 2007

[…] of this was inspired by this blog post. Having tested on my own set of files I would suggest that these methods could be ranked in order […]

6. Incie83 - September 17, 2007

I would highly recommend Sam’s script posted above… untex is a bit rubbish when you’ve got math in your paper.

7. Robert Rothenberg - December 3, 2007

Since I often have large documents broken up into multiple files, I use:

cat *.tex | untex - | wc -w

8. urban - November 9, 2009

I just made it… so it may be wrong.
Word count without Bibliography entries:

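#!/bin/bash
# Count words (w) or characters (c) in a PDF, skipping everything from the
# first line whose first field contains "Bibliography" onwards.
# The argument names in the usage text are inferred from how $1 and $2 are used.

if [ $# -ne 2 ]; then
    echo "Usage: $0 <pdf-file> <c|w>"
    exit 1
fi

if [ "$2" != "c" ] && [ "$2" != "w" ]; then
    echo "Usage: $0 <pdf-file> <c|w>"
    exit 1
fi

echo -n "Words|Characters Found: "
pdftotext "$1" - | awk 'BEGIN { disp = 1 } {
    if ($1 ~ /Bibliography/) {
        # Start of the bibliography: stop printing, start logging.
        print $0 > "./wc_skipped"
        disp = 0
        next
    }
    if (disp == 1) print $0
    else print $0 >> "./wc_skipped"
}' | wc -"$2"

echo "Check the Bibliography lines ..."
nano ./wc_skipped
echo "Cleaning Up"
rm -f ./wc_skipped

echo "Bye"
exit 0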

26. laubblat - November 9, 2011

To also count the words in included .tex files:
texcount -inc source.tex
