Tuesday, 21 January 2014

Compilation in C++ vs Java

If you know a programming language and how to use its syntax, you should also know how your instructions are converted into machine instructions (1s and 0s).

C++ is considered to be a purely compiled language whereas Java is not.
For those who are not sure what a compiled language is, the definition is as follows.

Compiled Language: A language that requires a compiler program to turn source code into an executable machine-language binary. After being compiled once, the program can be run from its binary form again and again without recompiling.

The last sentence of the above definition is really important. It is what highlights the main difference between a compiled language and an interpreted language.

To illustrate the main idea of this post, let's look closely at how the compilation process works in C++ and Java. The following image illustrates how a source file in C or C++ gets converted into machine language.


As you can see, in C++ the source is already converted into machine code once compilation is done (even before the linking step). An object file contains the machine instructions for one specific source file. The linking process combines all the object files into a single executable that can be run on that platform. Therefore, whenever you want to run the program, you simply run the executable. You do not need to compile it again.


In Java the process is a bit different. The following image explains how it is done.

In Java, when you run "javac MyClass.java", the output is a .class file containing bytecode. This file is platform independent: you can run the same file on different platforms (Linux, Windows or macOS). This means that if you compile the source on a Windows machine and run the .class file on a Linux machine, the program works exactly the same, provided a Java Runtime Environment (JRE) is installed there. The JRE includes the Java Virtual Machine (JVM), whose interpreter runs the .class file by converting its bytecode into machine instructions. When you execute "java MyClass", what really gets executed is this .class file.

Therefore, as you can see, Java is not a purely compiled language. The compiler alone does not convert the source file into a native executable. Instead, the interpreter in the Java Virtual Machine (JVM) runs each time you start the Java program, and it is what converts the .class file into 1s and 0s that your machine can understand. (Modern JVMs also compile frequently executed bytecode into machine code while the program is running, but that still happens at run time, not ahead of time.)


"goto" in C++ vs Java

Many programmers believe that "goto" is a keyword that exists in C++ but not in Java. Strictly speaking, that is wrong.

In Java, the keyword "goto" does exist. If you look up the list of Java keywords with a Google search, you will come across "goto". The difference compared to C++ is that Java has never defined any behaviour for this keyword; it is reserved, but it does nothing.

Why would anyone reserve a keyword which is not used? My feeling is that it was reserved so that it could be used one day (in a future release) if the language designers felt the need. Therefore it is an unimplemented, but reserved, keyword in Java. You cannot use it as an identifier; if you try, the compiler will report an error.

Control Flow in C++ vs Java

Control flow is a term which is used a lot in computer science. It refers to the different paths of execution a program may take when it runs.

"if", "else", "while", "for" and "do" are some of the keywords used in both languages, and they have almost the same meaning in each. Apart from "else", each of these keywords is associated with a condition to be checked.

In C++ the conditional test does not have to be a boolean: it may evaluate to any integer value (negative, positive or zero), or even to a pointer. In Java the result of the test must be a boolean (true or false) value and anything else is not permitted (if you try, you get a compile-time error). If a conditional test in C++ can produce an integer, how does the C++ compiler decide which path to choose?

The answer is that in C++ the value is converted to a boolean first: any non-zero integer or non-null pointer is considered to be "true", while the integer 0 and the null pointer are considered to be "false".