Sunday, July 11, 2010

How many keywords do you type in your code?



Have you ever wondered how many keywords (reserved words) you type in your programs? Can you type more or type less without changing what your program does? Can you change programming language and save some typing? And if so, which language? If you have, then I knew it! I’m not alone! These questions can be easily answered by looking at the graphs at the bottom!

UPDATE: second part of this post is here: http://carlosqt.blogspot.com/2011/01/how-many-keywords-in-your-source-code.html 

Today’s post is about programming language keywords, also known as reserved words, and how many of them you need to write some code. Let’s start with some definitions taken from Wikipedia: http://en.wikipedia.org/wiki/Keyword_(computer_programming)

“In computer programming, a keyword is a word or identifier that has a particular meaning to the programming language. The meaning of keywords — and, indeed, the meaning of the notion of keyword — differs widely from language to language.”

“In many languages, such as C and similar environments like C++, a keyword is a reserved word which identifies a syntactic form. Words used in control flow constructs, such as if, then, and else are keywords. In these languages, keywords cannot also be used as the names of variables or functions.”

“Languages vary as to what is provided as a keyword and what is a library routine. Some languages, for instance, provide keywords for input/output operations whereas in others these are library routines. In Python (versions earlier than 3.0) and many BASIC dialects, print is a keyword. In contrast, the C and Lisp equivalents printf and format are functions in the standard library.”

To be able to get some numbers I needed to have some input data:
  1. The Code (the program where the keywords are used) 
  2. The Language (the programming languages used to write the program)
  3. The Rules (that allowed me to compare 1 to 1 each of the 20 programming languages)

THE CODE
The first step was to decide which program to use. I quickly found exactly what I was looking for! An Object Oriented version of the famous Hello World program which included syntax for: creating a class, defining a private data member, defining a method, creating a program, creating an instance of an object and calling the method. Why did I choose that? You can find out in one of my previous posts (http://carlosqt.blogspot.com/2010/06/oo-hello-world.html).

THE LANGUAGE
Once I had identified the program to use, I took 20 programming languages from my list of “The most active .NET and Java languages” (http://carlosqt.blogspot.com/2010/06/most-active-net-and-jvm-languages.html) that provide built-in Object Oriented capabilities. Those languages were: 
.NET/CLR: C#, VB.NET, C++/CLI, F#, Boo, Phalanger, IronPython, IronRuby, Delphi Prism, Zonnon, Nemerle, Cobra and JScript. Java/JVM: Java, JavaFX, Groovy, Jython, JRuby, Fantom and Scala.

THE RULES
Finally, I knew that the combination of programming language and compiler gives you a lot of freedom in how you write your code. The import statement may not be required because System is already imported implicitly; you can write the program as a script (if the language supports it), which requires neither a program class nor a Main method; you can begin blocks on the same line or on the next one; sometimes the language defines its own best practices for how to write a program; and so on. So, I decided to add some rules:

1.       All programs written in any language must follow the same Program structure.

[Import / using / etc]
Class
                Data
                Constructor
                Method
Program
                Instantiate Class
                Call Method       => OUTPUT (Hello, World!)

2.       If the language supports extra built-in features such as a String.Capitalize() method or the ucwords() and UCase() global functions, I used them.

3.       Even if a language defines “string” as a keyword, I didn’t count it as such. I treated string as System.String even where I used the string keyword.

4.       To follow the first rule (same program structure) while also showing each language’s scripting support, I decided to create 2 versions of the same program per language:

Version 1 (Minimal): The minimum you need to type to get your program compiled and running.
Version 2 (Verbose): Explicitly adding instructions and keywords that are optional to the compiler.

If scripting is supported, it was done that way in the “Minimal” version.
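To make the counting concrete, here is a minimal sketch in Python of how keywords in a program could be counted automatically. Everything in it is an assumption for illustration: the Greet class and its salute method are hypothetical stand-ins that follow the structure from rule 1, not the program used for the numbers in this post, and the counting relies on the standard tokenize and keyword modules of Python 3. Python 3 treats print as a function, whereas the Python 2 based IronPython and Jython treated it as a keyword, so the result need not match the tables below.

```python
import io
import keyword
import tokenize
from collections import Counter

# Hypothetical OO Hello World following the structure in rule 1:
# a class with data, a constructor and a method, then a program that
# instantiates the class and calls the method. Illustration only.
SOURCE = '''
class Greet(object):
    def __init__(self, name):
        self.name = name.capitalize()

    def salute(self):
        print("Hello, " + self.name + "!")

g = Greet("world")
g.salute()
'''


def count_keywords(source):
    """Count reserved words that appear as NAME tokens in the source."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # String literals and comments are separate token types, so a
        # keyword that only appears inside them is not counted.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            counts[tok.string] += 1
    return counts


counts = count_keywords(SOURCE)
print(sum(counts.values()), dict(counts))
# prints: 3 {'class': 1, 'def': 2}
```

Counting tokens rather than searching the raw text avoids counting a word like "class" when it only shows up inside a string or a comment.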

THE RESULTS
You can find the 20 programming languages and their respective versions of the program in my previous 19 (+ JavaFX) blog posts, which can be easily located in the Blog Archive (2010 - July (6) and June (13)).

For God’s sake show me the numbers!!!

Voilà! Here you go!



Total Keywords by Language

Language Keywords
Cobra 139
VB.NET 137
Delphi Prism 134
C# 102
F# 97
Phalanger 92
C++/CLI 84
JScript.NET 75
JavaFX 70
Boo 66
Nemerle 59
Groovy 57
Zonnon 50
Java 50
Fantom 46
Scala 40
IronRuby 38
JRuby 38
IronPython 31
Jython 31



Version 1 (Minimal)

Language Keywords
Delphi Prism 35
VB.NET 24
Zonnon 16
Java 12
C++/CLI 11
Phalanger 10
C# 10
Cobra 9
Nemerle 9
JScript.NET 7
Boo 7
JRuby 6
IronRuby 6
F# 5
Scala 5
Fantom 5
JavaFX 5
Groovy 5
Jython 4
IronPython 4


Version 2 (Verbose)

Language Keywords
Delphi Prism 39
VB.NET 34
Zonnon 21
Boo 17
Java 16
Nemerle 16
Phalanger 16
C++/CLI 16
JavaFX 15
JScript.NET 15
C# 15
Cobra 14
F# 12
Groovy 11
Scala 10
Fantom 9
JRuby 6
IronRuby 6
Jython 6
IronPython 6



Both (Minimal and Verbose) 

Language Minimal Verbose
JavaFX 5 15
Scala 5 10
JScript.NET 7 15
Fantom 5 9
JRuby 6 9
Jython 4 6
Cobra 9 14
Java 12 16
Groovy 5 11
Nemerle 9 16
Zonnon 16 21
Delphi Prism 35 39
IronRuby 6 9
IronPython 4 6
Phalanger 10 16
Boo 7 17
F# 5 12
C++/CLI 11 16
VB.NET 24 34
C# 10 15



THE CONCLUSIONS

We can identify 3 main groups of languages:
1. Scripting/Dynamic Languages (Python, Ruby, Groovy, etc.) fewer keywords
2. Functional Languages (F#, Scala, etc.) about as many keywords as the previous group
3. Imperative/Static Languages (C#, C++, VB.NET, etc.) more keywords


Evidently, the scripting/dynamic languages are the ones that define the fewest keywords, with the exception of Groovy, which I think is because of its class modifiers or its Java syntax compatibility.

We can also see a relation (not in every case) between the total number of keywords a language defines and the number of keywords you need to type in code written in that language.
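One rough way to put a number on that relation is a Pearson correlation between the "Total Keywords by Language" table and the "Version 1 (Minimal)" table. The sketch below is only an illustration: the pearson helper is written by hand for this example, the data is copied from the two tables above, and with 20 data points taken from one tiny program the result should not be read as more than a hint.

```python
import math

# (language, total keywords defined, keywords typed in the Minimal version),
# copied from the tables above.
DATA = [
    ("Cobra", 139, 9), ("VB.NET", 137, 24), ("Delphi Prism", 134, 35),
    ("C#", 102, 10), ("F#", 97, 5), ("Phalanger", 92, 10),
    ("C++/CLI", 84, 11), ("JScript.NET", 75, 7), ("JavaFX", 70, 5),
    ("Boo", 66, 7), ("Nemerle", 59, 9), ("Groovy", 57, 5),
    ("Zonnon", 50, 16), ("Java", 50, 12), ("Fantom", 46, 5),
    ("Scala", 40, 5), ("IronRuby", 38, 6), ("JRuby", 38, 6),
    ("IronPython", 31, 4), ("Jython", 31, 4),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


totals = [total for _, total, _ in DATA]
minimal = [typed for _, _, typed in DATA]
r = pearson(totals, minimal)
print("correlation between defined and typed keywords: %.2f" % r)
```

A value close to 1 would mean that languages defining more keywords also make you type more of them, and a value close to 0 would mean no linear relation. Outliers such as Cobra (many keywords defined, few typed) are the "not in every case" part.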

Delphi Prism is the most verbose language followed by VB.NET!
Python is the most minimalistic language followed closely by Groovy, Ruby and Fantom.

I have read that Delphi Prism syntax is very elegant (and I agree), but the same is true of F# and Scala, so this tells us that the verbosity of a language has nothing to do with its elegance or expressiveness.

To answer the questions at the top, let’s go through them one by one:
How many keywords (reserved words) do you type in your programs?
> That will depend on your program (its size), but this can give you a basic idea.

Can you type more or type less without changing what your program does?
> Yes. That's why I did 2 versions of the same program: the Minimal and the Verbose.

Can you change programming language and save some typing?
> Yes. But in real life I don't think you would change languages just because of how many keywords you type! hehe you usually do it because of features, impositions or preferences.

And if so, which language?
> That's your choice! :D

So, what do you think?


Note: if I made a mistake in any of the counts, sorry about that. If you let me know where, I can update it asap :)

11 comments:

  1. I guess it would be more work to try extracting the same stats for language examples for a task from the Rosetta Code site, for example: http://rosettacode.org/wiki/Bulls_and_cows.

    This would allow the language to solve the task in its idiomatic way without too much forcing of style (in this case OO).

    Nice post!

  2. @Paddy3118
    Thanks for your comments

    I still have some topics I want to talk about with these examples before moving forward with more complex ones (as you said, idiomatic examples), but I probably will! Bulls_and_cows is a good example to work with.

    I'm working on my next post which will compare the languages based on Source Lines of Code for the same OO program.

  3. Very interesting comparison.

    How does Clojure compare?

  4. @Andreas Pauley

    Hi Andreas, concerning your question, I did not include Clojure because it does not support built-in OO syntax, which is one of the rules of the comparison, and also because, as was mentioned in another comment (on another post), it looks like Clojure does not have language keywords (reserved words), except for: "true, false, nil"?

  5. The graph goes from most to least keywords.
    The text goes from least to most keywords.
    What were you thinking?

  6. @ George,
    Hi George, I did notice that, but I did not consider it a big deal. Anyway, if it disturbs readers, I will change it :)

  7. i use a single keyword. thanks for sharing this article. graphs are really interesting. i have a very good analysis.

  8. This is really valuable and interesting information regarding keywords. I am sure it would be great help for everyone.

  9. This helps a lot, I just added this feed to my bookmarks. I have to say, I very much enjoy reading your blogs. Thanks!
