
(End User Programming with Hierarchical
Objects for Robust Interpreted Applications)
Euphoria is
a powerful yet simple programming language, developed by Robert Craig at Rapid
Deployment Software in 1993. It is very easy to use, and it has good support,
which makes it an excellent language for novice programmers. Euphoria is an
interpreted language, just like AWK or QBasic. Although Euphoria does not claim
to be object-oriented, some argue that its scope rules and
flexible data structures let you simulate any style of programming,
including object-oriented programming. "Euphoria is a small, fast, cheap programming
language and a true gift to young programmers," states Paul Smith in an
August 1997 article in the Monitor.
Euphoria is a new programming language with the following advantages over
conventional languages:
- A remarkably simple, flexible, powerful language definition that is easy to
  learn and use.
- Dynamic storage allocation. Variables grow or shrink without the programmer
  having to worry about allocating and freeing chunks of memory. Objects of any
  size can be assigned to an element of a Euphoria sequence (array).
- A high-performance, state-of-the-art interpreter that is 10 to 30 times
  faster than conventional interpreters such as Microsoft QBasic, Perl and
  Python.
- Lightning-fast pre-compilation. Your program is checked for syntax and
  converted into an efficient internal form at over 35,000 lines per second on
  a Pentium-150.
- Extensive run-time checking for out-of-bounds subscripts, uninitialized
  variables, bad parameter values for library routines, illegal values
  assigned to a variable, and many more. There are no mysterious machine
  exceptions: you will always get a full English description of any problem
  that occurs with your program at run-time, along with a call-stack
  trace-back and a dump of all of your variable values. Programs can be
  debugged quickly, easily and more thoroughly.
- Features of the underlying hardware are completely hidden. Programs are not
  aware of word-lengths, underlying bit-level representation of values,
  byte-order etc.
- A full-screen source debugger and an execution profiler are included, along
  with a full-screen, multi-file editor. On a color monitor, the editor
  displays Euphoria programs in multiple colors, to highlight comments,
  reserved words, built-in functions, strings, and level of nesting of
  brackets. It optionally performs auto-completion of statements, saving you
  typing effort and reducing syntax errors. This editor is written in
  Euphoria, and the source code is provided to you without restrictions. You
  are free to modify it, add features, and redistribute it as you wish.
- Euphoria programs run under Linux, 32-bit Windows, and any DOS environment,
  and are not subject to any 64K or 640K memory limitations. You can create
  programs that use the full multi-megabyte memory of your computer, and a
  swap file is automatically used when a program needs more memory than exists
  on your machine.
- You can make a single, stand-alone .exe file from your program.
- Euphoria routines are naturally generic. The example program below shows a
  single routine that will sort any type of data: integers, floating-point
  numbers, strings etc. Euphoria is not an "object-oriented" language, yet it
  achieves many of the benefits of these languages in a much simpler way.
Example Program
The following is an example of a complete Euphoria program:
sequence list, sorted_list

function merge_sort(sequence x)
-- put x into ascending order using a recursive merge sort
    integer n, mid
    sequence merged, a, b

    n = length(x)
    if n = 0 or n = 1 then
        return x -- trivial case
    end if
    mid = floor(n/2)
    a = merge_sort(x[1..mid]) -- sort first half of x
    b = merge_sort(x[mid+1..n]) -- sort second half of x
    -- merge the two sorted halves into one
    merged = {}
    while length(a) > 0 and length(b) > 0 do
        if compare(a[1], b[1]) < 0 then
            merged = append(merged, a[1])
            a = a[2..length(a)]
        else
            merged = append(merged, b[1])
            b = b[2..length(b)]
        end if
    end while
    return merged & a & b -- merged data plus leftovers
end function

procedure print_sorted_list()
-- generate sorted_list from list
    list = {9, 10, 3, 1, 4, 5, 8, 7, 6, 2}
    sorted_list = merge_sort(list)
    ? sorted_list
end procedure

print_sorted_list() -- this command starts the program
The above example contains 4 separate commands that are processed in order. The
first declares two variables, list and sorted_list, to be sequences (flexible
arrays). The second defines a function merge_sort(). The third defines a
procedure print_sorted_list(). The final command calls procedure
print_sorted_list().
The output from the program will be:
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
merge_sort() will just as easily sort {1.5, -9, 1e6, 100} or {"oranges",
"apples", "bananas"}.
The language is a possible
replacement for QBasic or AWK as a quick one-off file manipulation tool; for
QBasic, Fortran, or extensible matrix packages for coding mathematical or
statistical procedures; for QBasic or Pascal as a beginner's first procedural
language; and for C or assembler as a language for fast, all-out action/arcade
DOS game programming.
Euphoria is not a visual
rapid application development tool for the corporate client-side programmer or
a Web language for the Web site administrator: Delphi/Power++ and Java/PERL are
safe for the moment, but the rest of the procedural world may be in play.
No one has to learn Euphoria
to stay employed. On the other hand, for those of us who, through duty or
delight, crank programs out daily, this new language is worth a glance. Rapid
Deployment is an exact description of the capability Euphoria delivers to the
procedural programmer.
Small
Euphoria is a small
language. C and C++ have char, short, int, long, float, double (with signed and
unsigned variants) and pointers of one or another of these forms as built-in
scalar types. Euphoria has "atoms".
All scaling in Euphoria is
internal. Counts don't overflow. If the decimal value for an ASCII character is
sent to a function that speaks ASCII, then that value goes out as an ASCII
character. The use of a single scalar type should make Euphoria interesting to
those who analyze statistical data, track the national debt, or transform text
files.
There is only one other
primary built-in type: the sequence. A sequence is, simply, a sequence of zero
or more atoms or sequences. The concept is a bit startling, like LISP's lists.
It is much more than an array, because it can nest to any depth and does not
require equal "sizes" in its elements, rows, columns, and so on.
There are some subtle conventions in Euphoria's syntax that come into play
here. For example, single quotes surround a single character, which is an atom
holding that character's numeric code, while double quotes surround a sequence
(a string of zero or more characters).
atom a,b,c
sequence d,e,f,g,h,i
a = 'a' -- lowercase a, ASCII 97
b = ' ' -- blank, ASCII 32
c = 10 -- scalar value 10, also LF
d = "" -- an empty sequence
e = " " -- a sequence of one blank
f = {10} -- a sequence of one 10
g = "Hello, World.\n" -- a sequence of 14 atoms
h = {a,b,c,d,e,f,g} -- a sequence
i = {{a,b,c},{d,e,f},{g,h}} -- a sequence of sequences
The double dash is
Euphoria's only comment designator. The braces are a sequence-forming operator.
For example, d = {} is exactly the same as d = "", and completely
different from d = '' which is, after all, illegal since an atom must exist
rather than be only an empty molecule.
Euphoria takes from Pascal
and spreadsheets the double-dot convention in indexing. There are no "wild
pointer" errors in an executing Euphoria program. Of course, there are no
pointers either.
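The double-dot slice notation can be illustrated with a minimal sketch. The sequence s and the other variable names here are hypothetical and chosen only for illustration; they do not come from the original text.

```euphoria
sequence s, middle, tail, nothing

s = {10, 20, 30, 40, 50}
middle = s[2..4]          -- elements 2 through 4: {20,30,40}
tail = s[3..length(s)]    -- from element 3 to the end: {30,40,50}
nothing = s[3..2]         -- start one past the end gives an empty slice: {}
s[1..2] = {11, 21}        -- slice assignment: s becomes {11,21,30,40,50}

? middle
? tail
? length(nothing)
? s
-- ? s[6]   -- would stop the program with a subscript error, not corrupt memory
```

Because every subscript and slice is checked at run time, an index past the end is reported as an error instead of silently reading stray memory.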
Note the smallness of the
syntax: Parentheses only group arithmetical expressions and function
parameters. Braces only group sequence definitions. Brackets only group
sequence indexes. And there are only two primary built-in types, atoms and
sequences. Yet these are enough to reproduce almost all the complex types and
structures of all other procedural languages.
Actually there are two
secondary types built in, but they are idioms rather than independent species:
An object is a variable that can be either an atom or a sequence, and an
integer is an atom restricted to whole numbers in a signed 31-bit range,
-1,073,741,824 to +1,073,741,823 (about plus or minus 1 billion). The atom tops out as a
signed double floating point value of about 10 to the 300th power
with 15 or 16 significant decimal places. The whole Euphoria math package is
standard IEEE double precision, and it includes the infinities and
not-a-number values (NaNs) of Intel's floating-point processor.
Despite having only two
primary built-in types of data structures, Euphoria is a strongly typed
language. Type checking is automatic even at run time, but can be turned off
for speed. The programmer is free to define new types, as he or she pleases.
The type definition facility is unique: a type is defined by a
programmer-written test routine, and Euphoria runs that test whenever a value
is assigned to a variable of that type.
Euphoria's type definition
facility is similar to modern database languages with enforced data definitions
and business rules built into the defining mechanism: all under programmer
control and all alive at run time. Euphoria is completely type safe.
Euphoria's atoms and
sequences do it all. They just do it very, very fast.
The rest of the language is
just as clean but a little more conventional. All the arithmetic, relational,
and logical operators are in Euphoria, and with common precedence. Powers and
remainders are functions rather than special symbols. Sequences can be
concatenated with & although there are also append and prepend functions.
arithmetical: + - / *
relational: < > <= >= = !=
logical: and or not
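A short sketch of the function forms mentioned above, using the built-in routines power, remainder, append, and prepend. The variable names are hypothetical and serve only to hold the results.

```euphoria
atom p, r
sequence s, joined, nested, front

p = power(2, 10)             -- 2 raised to the 10th power: 1024
r = remainder(17, 5)         -- what is left after dividing 17 by 5: 2
s = {1, 2}
joined = s & {3, 4}          -- concatenation: {1,2,3,4}
nested = append(s, {3, 4})   -- adds ONE element, here a sequence: {1,2,{3,4}}
front = prepend(s, 0)        -- adds one element at the start: {0,1,2}

? p
? r
? joined
? nested
? front
```

The difference between & and append matters once sequences nest: & splices the elements of its right operand in, while append always lengthens the sequence by exactly one element.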
An assignment statement in
Euphoria is, indeed, a statement rather than an expression as it is in C. The
double equal sign (==) in C for the relation of equality is not needed because
an assignment statement and a relational expression can never be confused in
Euphoria's syntax.
The list of operators in
Euphoria is small, but in Euphoria small is powerful.
All Euphoria's operators are
vectorial, including the relational and logical operators. If the parameters
are both atoms, the operator works as it usually does in other languages. If
the parameters are an atom and a sequence, the operator applies the atom to
every member of the sequence. If the parameters are two sequences of equal
length, the operator applies elementwise along both sequences. Any other
combination gets an error message.
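The three cases described above can be sketched as follows. The sequences v and w and the result names are illustrative assumptions.

```euphoria
sequence v, w, shifted, summed, mask

v = {1, 2, 3}
w = {10, 20, 30}
shifted = v + 5       -- atom and sequence: 5 is added to every element, {6,7,8}
summed = v + w        -- two sequences of equal length: elementwise, {11,22,33}
mask = v * 2 > 3      -- relational operators are vectorial too: {0,1,1}

? shifted
? summed
? mask
-- ? v + {1, 2}       -- lengths differ, so this would stop with an error
```

Note that a relational operator applied to sequences yields a sequence of 1s and 0s, which is why the example program earlier uses compare() inside its if statement.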
The results arrive at a
furious rate. This feature turns a personal computer into something very like a
vector processor. Deep in the innermost loop of a linear algebra package is an
operator always called "saxpy" for "scalar a times vector x plus
vector y". In Euphoria that is a single statement. Of course that's true
for Cray FORTRAN, too. Statisticians, spreadsheets, and physicists all use code
built upon such vectorial operators. They also come in handy for updating
players' scores in a game and their position vectors.
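For illustration, "saxpy" can be written as a one-line function. This is a sketch with a hypothetical function name and sample values, and it assumes x and y are flat sequences of equal length.

```euphoria
function saxpy(atom a, sequence x, sequence y)
    -- scalar a times vector x plus vector y, in one vectorial expression
    return a * x + y
end function

? saxpy(2, {1, 2, 3}, {10, 20, 30})   -- {12,24,36}
```

No explicit loop or index appears: the multiplication applies the atom to every element of x, and the addition then works elementwise against y.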
Euphoria is still under
construction. Two language facilities are missing. The first is scalar
accumulation along a vector operator, for example cumulating the sum of the
products of a vectorial multiply. This is the physicist's scalar or dot
product, the statistician's variance/covariance summation, and (when the
operator is !=, not equal) the game programmer's collision detector. (Babel's
curse: each discipline renames the basic math concepts, often several times.)
Of course these functions can be programmed in Euphoria, but nothing can match
the speed of a built-in function, and Euphoria's design cries out for a scalar
inner product symmetrical with the elementwise vector product it already provides.
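One way such an accumulation could be programmed by hand is sketched below. The function name dot_product is an assumption made for illustration, and the sketch assumes both arguments are flat sequences of atoms with equal length.

```euphoria
function dot_product(sequence x, sequence y)
    -- sum of the elementwise products of x and y
    sequence products
    atom total

    products = x * y          -- vectorial multiply does the pairing
    total = 0
    for i = 1 to length(products) do
        total = total + products[i]   -- the accumulation needs an explicit loop
    end for
    return total
end function

? dot_product({1, 2, 3}, {4, 5, 6})   -- 1*4 + 2*5 + 3*6 = 32
```

The multiply is a single vectorial step, but the summation still runs as an interpreted loop, which is the cost a built-in accumulation would remove.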
The other missing syntactic
feature in Euphoria (as in AWK) is the run-time function specifier. C, C++, and
Java all use a pointer-to-function scalar type, and these are the only pointers
that Euphoria's index syntax can't replace. More powerful languages (like LISP,
Scheme, Forth, or PERL) have an eval, apply, or interpret function built-in
that will evaluate a string and apply the function it specifies.
Euphoria will have to gain
at least the function pointer to become an object oriented programming language
(OOP) because the "member function" is the only missing ingredient
for completely encapsulated "class" definitions. Notice that
Euphoria's type definitions are otherwise completely inheritable. The same
facility is at the heart of simulated annealing, genetic algorithms, nonlinear
function maximization, and other generalized tools of modern numerical
analysis. It is also the key to "strategy" routines in game
programming.
The last small features of
Euphoria are the statements themselves, and there are only a few of them. The
three control flow statements are if, while, and for. Euphoria adds a simple
elsif clause to the if statement. The while statement is also simple, and the
for statement is classic.
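The three control flow statements can be sketched in a few small routines. All routine names below are hypothetical and exist only to show the statement forms.

```euphoria
function sign_name(atom n)
    -- if with elsif and else
    if n < 0 then
        return "negative"
    elsif n = 0 then
        return "zero"
    else
        return "positive"
    end if
end function

function count_halvings(atom x)
    -- while: count how many times x can be halved before it drops to 1 or less
    integer count
    count = 0
    while x > 1 do
        x = x / 2
        count = count + 1
    end while
    return count
end function

function first_square_above(integer limit)
    -- for with exit: smallest i in 1..100 whose square exceeds limit, else 0
    integer found
    found = 0
    for i = 1 to 100 do
        if i * i > limit then
            found = i
            exit          -- leave the loop as soon as a match is found
        end if
    end for
    return found
end function

puts(1, sign_name(-2) & "\n")
? count_halvings(8)
? first_square_above(50)
```

Every block closes with a matching end keyword (end if, end while, end for), so no braces or begin markers are needed.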
The indexing variable in the
for statement is local to the loop and disappears outside it. Euphoria also has
a modern namespace structure where variable and function names are local to
their enclosing procedures or include files. A function or procedure must be
marked as global if its name is to be exported beyond its namespace.
Both the while and the for
statements can have an exit statement within them that exits to the first
statement following the innermost enclosing loop. Euphoria suffers nothing
comparable to Java's labelled continue statement.
There are no semicolons at
the end of lines. Euphoria is a modern stream language with new lines as white
space. We may break up any statement as we please, or put several on a single
line. The indentation wars should be glorious.
This small language has only
two built-in types, full type definition facilities, run-time type checking,
full IEEE math, vectorial operators, run-time bounds checking, structured
syntax, streaming statements with simple delimiters ("end if"), and
modern namespaces. It is an ideal language for the beginner because there is
only a little to learn now and nothing to unlearn later.
Fast
Euphoria is an interpreted
language, just like AWK or QBasic. The programmer codes in the swift
edit-test-edit cycle without waiting for compile/link operations. Euphoria has
an integrated debugger built into the interpreter and also has a compiler to
turn debugged code into a distributable EXE file: the best of both worlds in
one small fast language.
Coding is faster because
Euphoria supplies 8 function libraries containing 40 functions in addition to
the 48 built into the language. The function libraries cover graphics, image
processing, mouse reading, file and directory operations, command-line wildcard
specifiers, sorting, keyboard input, full machine access to user assembly
routines, interrupts, and memory assignments. The libraries (and the editor)
are all in open source so the programmer can lift and learn rather than
recreate.
Notice that there is no
"dimensioning" in statements defining Euphoria's sequences and no
arbitrary length or end markers for text strings. The programmer doesn't have
to work out how to fit objects into 64-kbyte blocks or the 640 Kbytes of lower
memory. Euphoria has a built-in memory manager that gives the whole 32-bit flat
address space of the machine's memory to the programmer, and then automatically
pages out to disk if more is needed. A compiled Euphoria program carries the
whole virtualized memory mechanism right along with it, and almost always
produces an EXE file smaller than 200 Kbytes.
In addition to the memory
manager and virtualized memory paging to disk, Euphoria has an exceptional
garbage collection algorithm that recaptures and reallocates unused memory
automatically.
Euphoria code executes very
rapidly: Creator Robert Craig claims 10 to 20 times the execution speed of
QBasic and 8 times the speed of Java.
Euphoria is fast to learn,
fast to code, and flies when it executes. Euphoria is fast.
BIBLIOGRAPHY
Craig, Robert and Junko C. Miura. “The Official Euphoria Homepage.” Rapid Deployment Software. 2001. http://www.rapideuphoria.com/ (22 Nov. 2001).
“Euphoria.” Hippy's Happy Home Page. 1999. http://www.hippy.freeserve.co.uk/euphoria.htm (23 Nov. 2001).