Disclaimer: Quite some time has passed since I started writing this page. At the time, .NET was at 2.0 and it has been pointed out to me that some of my concerns have been addressed in later versions of the language. Thanks to Nelson LaQuet for an extensive critique.
Why C# Is Not My Favorite Programming Language
1. Default Object Lifetime Is Non-Deterministic
In most object-oriented languages, there is a very specific time when an object constructor is called (namely, when an object is instantiated) and when its destructor is called (namely, when it falls out of scope).
In C#, they have taken the "garbage collection" paradigm one step too far. Not only does memory management rely on it, but even the object destructor is called "somewhen", at an unpredictable time! In a previous version of this document I wrote "somewhen after the object falls out of scope", but it turns out to be even worse, so I'll devote a separate section to the gruesome truth below.
In any case, this means that handy constructs such as an AutoLock can no longer work (example in C++):
    class AutoLock
    {
    public:
        AutoLock(Mutex& m): m_mutex(m) { m_mutex.Lock(); }
        ~AutoLock() { m_mutex.Unlock(); }
    private:
        Mutex& m_mutex;
    };
In a typical Microsoft way of thinking, they added a "special case" for this particular example by means of the lock keyword (which, incidentally, would be trivial to simulate in C++ should you like the keyword-taste of it). However, other "automatic" resource management using object lifetime (for example, for handles, GDI objects, etc.) still won't work.
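To back up the claim that a lock-like construct is easy to build in C++, here is a minimal sketch. The names with_lock and TracingMutex are made up for this illustration and are not part of any library; TracingMutex only exists so the example can observe whether the lock is held. Since with_lock is a template, a std::mutex should work in its place.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <utility>

// Hypothetical helper: runs `body` while holding `m`. The guard's destructor
// releases the mutex when the function is left, whether normally or by exception.
template <typename Mutex, typename Body>
void with_lock(Mutex& m, Body&& body)
{
    std::lock_guard<Mutex> guard(m);
    std::forward<Body>(body)();
}

// Hypothetical mutex stand-in that merely remembers whether it is locked.
struct TracingMutex
{
    bool locked = false;
    void lock()   { locked = true; }
    void unlock() { locked = false; }
};

// Returns true if the mutex was released even though the protected code threw.
inline bool released_after_exception()
{
    TracingMutex m;
    try
    {
        with_lock(m, [&] {
            assert(m.locked);   // the lock is held inside the protected block
            throw std::runtime_error("protected code failed");
        });
    }
    catch (const std::runtime_error&)
    {
        // Swallowed on purpose; we only care about the state of the mutex.
    }
    return !m.locked;
}
```

The call site reads almost like the C# keyword: with_lock(m, [&] { /* protected code */ });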
To resolve this, objects can implement the IDisposable interface, which has a Dispose() method. When object lifetime is important to you, you should put the relevant cleanup code in the Dispose() implementation and remember to call Dispose() on the object yourself.
It is good practice to have the destructor call Dispose() on the object too, but as Professional C#, 2nd Edition puts it: "The destructor is only there as a backup mechanism in case some badly behaved client doesn't call Dispose()" (emphasis mine). You see, only badly behaved clients would forget to clean up after themselves, so I guess only badly behaved clients would need a garbage collector in the first place, right?
The proposed "better solutions" for this are the using keyword, like so:
    using (AutoLock theLock = new AutoLock(m_lock))
    {
        // your protected code here
    }
or using the finally clause (which is often recommended over the using statement), like so:
    AutoLock theLock = new AutoLock(m_lock);
    try
    {
        // your protected code here
    }
    finally
    {
        theLock.Dispose();
    }
In the former case, it becomes a nuisance if I have more than one object for which I'd like a deterministic lifetime, and in the latter case (which, incidentally, is very similar to the code that gets emitted by the compiler when you use the lock keyword) I still need to remember typing Dispose() by hand.
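For comparison, here is a rough C++ sketch of several objects with a deterministic lifetime in one scope. The Tracked class and the resource names are invented for this illustration; the only point is that no extra nesting and no hand-written cleanup calls are needed.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper that records when it is acquired and released.
class Tracked
{
public:
    Tracked(std::string name, std::vector<std::string>& log)
        : m_name(std::move(name)), m_log(log)
    {
        m_log.push_back("acquire " + m_name);
    }
    ~Tracked() { m_log.push_back("release " + m_name); }

    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    std::string m_name;
    std::vector<std::string>& m_log;
};

// Uses three resources in one scope and returns the recorded sequence of events.
inline std::vector<std::string> three_resources()
{
    std::vector<std::string> log;
    {
        Tracked lock("lock", log);
        Tracked file("file", log);
        Tracked brush("brush", log);
        log.push_back("work");
        assert(log.size() == 4);   // three acquisitions plus the work itself
    }   // brush, file and lock are released here, in reverse order of construction
    return log;
}
```

Adding a fourth resource is one more line in the block, not another level of indentation.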
And it gets worse! Even program termination doesn't trigger proper cleanup. You can verify this with the following program:
    using System;
    using System.IO;

    class TestClass
    {
        static void Main(string[] args)
        {
            StreamWriter sw = File.CreateText("C:\\foo.txt");
            sw.WriteLine("Hello, World?");
            // Note: We "forget" sw.Close().
            // Incidentally, StreamWriter.Dispose(bool) is protected, so we can't call it directly.
        }
    }
The foo.txt file will be created, but it will be empty. Note that even C specifies that all unflushed data is written out, and files will be closed, at normal program termination. And even if I did remember to call Close() myself (I wouldn't want to be a badly behaved client, now would I?), this wouldn't be exception-safe. I am supposed to remember to use using, or litter my code with finally blocks.
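As a point of comparison, this is roughly what the same "forgetful" code looks like in C++. It is only a sketch: the function names and the file name passed in are chosen for this example. The stream's destructor flushes and closes the file when the scope is left, including when it is left through an exception that is caught further up.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Writes a line and then fails before any explicit close() or flush().
inline void write_then_throw(const std::string& path)
{
    std::ofstream out(path);
    assert(out.is_open());
    out << "Hello, World?\n";
    // No close() here: ~ofstream() runs during stack unwinding.
    throw std::runtime_error("something went wrong");
}

// Reads the whole file back into a string.
inline std::string read_all(const std::string& path)
{
    std::ifstream in(path);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Runs the failing writer, swallows the exception and returns what ended up on disk.
inline std::string contents_after_failure(const std::string& path)
{
    try
    {
        write_then_throw(path);
    }
    catch (const std::runtime_error&)
    {
        // Ignored; the file should be complete regardless.
    }
    return read_all(path);
}
```

Calling contents_after_failure("raii_demo.txt") should give back the line that was written, without a single finally block in sight.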
2. Object Lifetime is Not Determined by Scope
I wrote above that I initially thought that objects are destroyed "somewhen after they go out of scope", but in reality it seems to be far, far worse. As it turns out, the JIT compiler can do "lookahead optimization", and may mark any object for collection after what it considers its "last use", ignoring scope!
I have had a colleague ask me about the following code:
    {
        ReadAccessor access = new ReadAccessor(image);
        IntPtr p = access.GetPtr();
        // lengthy piece of code here doing stuff with the pixels from the image
    }
A ReadAccessor is an object which provides access to the pixel data in an image, which is stored in a memory mapped file for performance reasons. When a ReadAccessor is constructed, it maps in the memory, after which you can call GetPtr() to get at the actual data. Once it goes out of scope, it unmaps the memory again. So, the "validity" of the data is guaranteed for the lifetime of the ReadAccessor.
Incidentally, there is also a WriteAccessor, which makes sure that there is only a single writer at any given time. Of course, people using this code in C# quickly found out that they had to dispose of these WriteAccessors manually, because otherwise they'd get the error that this WriteAccessor would still be sitting in the garbage bin while they were trying to acquire a new one. But that's the problem mentioned in the item above. This one is far, far worse.
The colleague told me that his code crashed somewhere in the pixel-processing code. It took me a while to figure out what was happening: The JIT optimizer looked ahead a little bit, decided that access wasn't being used after the GetPtr() call, and marked it for collection. Later on in the code, in the same scope, mind you, the GC apparently decided it was a good time to destroy the ReadAccessor, which unmapped the memory still being used by the code.
I still find this hard to believe (even C# can't be this stupid), but the crash went away by modifying the code like so:
    {
        ReadAccessor access = new ReadAccessor(image);
        IntPtr p = access.GetPtr();
        // lengthy piece of code here doing stuff with the pixels from the image
        System.GC.KeepAlive(access);
    }
This particular item is so mind-boggling that I hope some dear reader can tell me it's just a bad dream and scope is, in fact, honored by the GC.
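To show the behavior I was expecting, here is a small C++ sketch of the same pattern. The Image and ReadAccessor below are simplified stand-ins written for this example (a flag instead of a real memory mapping), not the actual classes from the story. The accessor stays alive until the closing brace, even though it is never mentioned again after GetPtr().

```cpp
#include <cassert>

// Hypothetical stand-in for the image: a flag plays the role of the memory mapping.
struct Image
{
    bool mapped = false;
    unsigned char pixels[4] = {10, 20, 30, 40};
};

// Simplified accessor: "maps" on construction, "unmaps" on destruction.
class ReadAccessor
{
public:
    explicit ReadAccessor(Image& image) : m_image(image) { m_image.mapped = true; }
    ~ReadAccessor() { m_image.mapped = false; }

    ReadAccessor(const ReadAccessor&) = delete;
    ReadAccessor& operator=(const ReadAccessor&) = delete;

    const unsigned char* GetPtr() const { return m_image.pixels; }

private:
    Image& m_image;
};

// Sums the pixel values while checking that the mapping is still in place.
inline int sum_pixels(Image& image)
{
    int sum = 0;
    {
        ReadAccessor access(image);
        const unsigned char* p = access.GetPtr();
        for (int i = 0; i < 4; ++i)
        {
            assert(image.mapped);   // still mapped, although `access` is not used any more
            sum += p[i];
        }
    }   // ~ReadAccessor() runs here and nowhere earlier
    return sum;
}
```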
3. Every Function Must Be A Method
C# imposes an object-oriented paradigm and enforces it by prohibiting the definition of stand-alone functions: every function must be a member of a class.
If you take object-orientation to the extreme, you would not say
    float b = sin(a);
but
    float b = a.sin();
This is clearly impractical. (Ignore the question of how you would take the sine of a number instead of a variable.)
C# (and Java, for that matter) still try to go about half-way there by making the sine function a member of the Math class (or namespace; I can never tell them apart in C#):
    double b = Math.Sin(a);
If I want to add my own mathematical functions, I either have to extend the Math class (which I can't, because it's sealed) or put up with the strange distinction that I need to write
    double h = Math.Sqrt(a*a + b*b);
but
    double h = MyMath.hypot(a, b);
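For comparison, a minimal C++ sketch of adding your own mathematical function as a free function. The namespace my_math and the function hypotenuse are names invented for this example (the standard library already ships std::hypot, so you would not normally write this one yourself). With using-declarations, the standard function and the home-grown one are called in exactly the same way.

```cpp
#include <cassert>
#include <cmath>

namespace my_math
{
    // Hypothetical user-defined addition to the available math functions.
    inline double hypotenuse(double a, double b)
    {
        return std::sqrt(a * a + b * b);
    }
}

// Computes the same length twice: once with the standard function,
// once with our own. At the call site the two look alike.
inline double difference_between_styles(double a, double b)
{
    using std::sqrt;
    using my_math::hypotenuse;

    assert(a >= 0.0 && b >= 0.0);
    const double h1 = sqrt(a * a + b * b);
    const double h2 = hypotenuse(a, b);
    return h1 - h2;
}
```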
It gets even scarier if you look at the OracleNumber class, which also has a sin method. Luckily, it's static, and you can't call static member functions on instances.
This is related to the following item, but that is bad enough that I think it warrants its own item:
4. Containers Have Algorithms As Methods
The popular ArrayList container (an auto-resizing container, comparable to C++'s vector template) has a Sort() method. And a Reverse() method. But not a Randomize() method.
Why should some algorithms be member functions, but not others? The answer is that no algorithms should be member functions. What if I wanted to use a different sorting algorithm than the one the original implementers of ArrayList had in mind?
Note that an ArrayList sorts itself, while Array.Sort(...) is a static member function of the Array class.
If I decide, late in a project, at the performance-tuning stage perhaps, that I could better use an ArrayList for some particular collection than the Array I used up to now, I will likely have to modify my code in multiple places.
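To illustrate what algorithms outside the container buy you, here is a short C++ sketch. The helper sort_descending is a name made up for this example; it only relies on the standard std::sort, std::begin and std::end. The same call works for a fixed-size array and for a resizable vector, so switching containers does not touch the call sites.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper: sorts any container with random-access iterators,
// largest element first. The container does not need a Sort() member.
template <typename Container>
void sort_descending(Container& c)
{
    std::sort(std::begin(c), std::end(c), std::greater<>());
}

// Applies the same algorithm to two different container types.
inline void sort_both_kinds()
{
    std::array<int, 4> fixed = {3, 1, 4, 1};
    std::vector<int> growable = {3, 1, 4, 1};

    sort_descending(fixed);
    sort_descending(growable);

    assert(fixed.front() == 4 && fixed.back() == 1);
    assert(growable.front() == 4 && growable.back() == 1);
}
```

Reversing and shuffling follow the same pattern with std::reverse and std::shuffle, and a different sorting algorithm is just a different function taking the same pair of iterators.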
Note that this is not a shortcoming of the language, but it is partly a consequence of item number 3, above. Also note that C# shares this problem with some other languages; even C++ has a few quirks here (the string class comes to mind).
5. Default Comparison Behavior Is Dangerous
Given a class Vector, which doesn't overload the comparison operator==, I can still write
    Vector a, b;
    if (a == b) { ... }
In C++, the compiler will have the courtesy of telling me there is no operator== defined for Vectors; in C#, this will simply compile, but it means "compare the references a and b", i.e. it is true when a and b are the same Vector, not when their values are equal.
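The C++ side of that claim can be checked at compile time. The sketch below assumes a C++17 compiler (for std::void_t); Vector, Point and has_equality are names defined here purely for illustration. The trait reports whether a == b would compile for a given type, and for a Vector without operator== the answer is no.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Vector { double x, y; };   // deliberately has no operator==

struct Point { int x, y; };
inline bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

// Hypothetical trait: true only if `a == b` compiles for two const T objects.
template <typename T, typename = void>
struct has_equality : std::false_type {};

template <typename T>
struct has_equality<T,
    std::void_t<decltype(std::declval<const T&>() == std::declval<const T&>())>>
    : std::true_type {};

static_assert(!has_equality<Vector>::value, "comparing two Vectors does not compile");
static_assert(has_equality<Point>::value, "comparing two Points does compile");

// Runtime use of the one comparison that does exist.
inline bool points_compare_by_value()
{
    const Point a{1, 2};
    const Point b{1, 2};
    assert(&a != &b);   // two distinct objects
    return a == b;      // compared by value, not by identity
}
```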
Also, because of the following item, you can't add such an operator yourself without altering the Vector class:
6. Operator Overloading Is Severely Broken
In C++, given a class Vector, you can define an operator for adding two Vectors without altering the Vector class itself:
    class Vector {};

    Vector operator+(const Vector& lhs, const Vector& rhs)
    {
        return Vector(/* whatever it means to add two Vectors */);
    }
In C#, this is not possible without altering the Vector class itself. Because of the limitation mentioned in item number 3, above, you cannot make this operator a "free-standing" one. Of course, adding this operator has nothing to do with the interface to the Vector class, so you'd probably try something like this:
    public class VectorOps
    {
        public static Vector operator+(Vector lhs, Vector rhs)
        {
            return new Vector(/* whatever it means to add two Vectors */);
        }
    }
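This does not compile, however: C# requires at least one parameter of an overloaded operator to be of the type that declares it. So if you are handed a Vector class without overloaded operators, you'll have to modify the class itself, also introducing a dependency of your class on the module which happens to implement these operators.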
But wait, there's more.
Note that when you overload operator==, you also have to overload operator!=, but we'll forgive the compiler for not being able to auto-generate it. It will do a similar "helpful" trick with arithmetic and bitwise assignment operators, when it most definitely shouldn't. You cannot overload the arithmetic and bitwise assignment operators +=, -=, etc. Instead, they are evaluated in terms of other operators that can be overloaded. This is exactly the wrong way around; most C++ programmers implement operator+ in terms of operator+=.
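For readers who have not seen the C++ idiom, a minimal sketch follows. Vec2 is a toy type invented for this example. The in-place operator+= does the real work and creates no temporary; operator+ is a thin wrapper that copies its left operand and reuses +=.

```cpp
#include <cassert>

// Toy two-component vector, defined only for this illustration.
struct Vec2
{
    double x = 0.0;
    double y = 0.0;

    // In-place addition: modifies the left operand, no temporary involved.
    Vec2& operator+=(const Vec2& rhs)
    {
        x += rhs.x;
        y += rhs.y;
        return *this;
    }
};

// Free-standing operator+ written in terms of operator+=.
// `lhs` is taken by value, so it is already the copy that will be returned.
inline Vec2 operator+(Vec2 lhs, const Vec2& rhs)
{
    lhs += rhs;
    return lhs;
}

// Exercises both operators and checks that + leaves its operands untouched.
inline void add_both_ways()
{
    Vec2 a{1.0, 2.0};
    const Vec2 b{3.0, 4.0};

    const Vec2 c = a + b;
    assert(a.x == 1.0 && a.y == 2.0);   // operator+ did not modify a
    assert(c.x == 4.0 && c.y == 6.0);

    a += b;                             // in place
    assert(a.x == 4.0 && a.y == 6.0);
}
```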
Suppose you have a class Image, representing an image. Also, suppose you have some kind of image processing library, offering functionality to add two images together. For performance reasons, this library will likely have separate functions for adding one image in-place, overwriting the old contents, and for returning a new image containing the result of the addition:
    public class ImageProcessing
    {
        public static Image Add(Image lhs, Image rhs);
        public static void AddInPlace(Image lhs, Image rhs);
    }
You may decide that it's a nice service to clients of your Image class to offer operators for this, so they can write code like
    Image a, b, c;
    c = a + b;   // really c = ImageProcessing.Add(a, b)
    a += b;      // really ImageProcessing.AddInPlace(a, b)
(This assumes you are the author of the Image class, because you have to modify it for this; in addition, your Image class can now not be used without the ImageProcessing class.) You would think you'd overload operator+ with ImageProcessing.Add() and operator+= with ImageProcessing.AddInPlace(), but you can't. Instead, when your client types a += b, a whole temporary Image will have to be constructed, holding the result of the addition, after which the left operand is replaced with the result. Goodbye, performance!
Update: In the 3.0 version of the language, a new feature called "extension methods" has appeared. It is now possible to add methods to classes without modifying the original class file, so you could make img.AddInPlace(otherImage) work. However, extension methods don't work together with operator overloading.
7. Events Without Subscribers Raise Exceptions
If a tree falls down in the woods and there is nobody there to hear it, does it still make a sound? C# has a very interesting view on this popular Philosophy 101 question.
In C#, there is a concept called delegates. A multicast delegate is a set of methods to be called successively when the delegate is called. When the set of methods is empty, trying to call the delegate raises an exception.
However, events are implemented in terms of multicast delegates, too.
You declare a delegate and an event like so:
    public delegate void TreeListener();

    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            // Fall down, and make some noise. To be discussed.
        }
    }
The idea is that clients interested in hearing trees fall can subscribe themselves to the event using a very fancy syntax:
    class Client
    {
        public Client(Tree tree)
        {
            tree.Fell += new TreeListener(TreeFell);
        }

        // This will be called when the tree falls.
        private void TreeFell()
        {
            Console.WriteLine("I heard it!");
        }
    }
In the Tree.Fall() implementation, you'd simply call the event:
    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            // Fall down, and make some noise:
            Fell();
        }
    }
So, now comes the important question. What if nobody has subscribed to the Tree.Fell event? In that case, the multicast delegate will be empty, and calling it will raise an exception. You heard it right (or did you?): Trees simply aren't supposed to fall over when nobody's around.
The suggested solution is to check whether anybody's listening first (if the event is empty, it will be null):
    class Tree
    {
        public event TreeListener Fell;

        public void Fall()
        {
            if (Fell != null)
                Fell();
        }
    }
if (Fell != null)
above.
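For contrast, here is a rough C++ sketch of an event type for which "nobody is listening" is simply the empty case. Event, Tree and their members are written for this example and do not mirror any particular library. Raising the event walks a possibly empty list of listeners, so no check is needed at the call site.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical minimal event: a list of callbacks invoked in subscription order.
class Event
{
public:
    void subscribe(std::function<void()> listener)
    {
        assert(listener);   // an empty std::function would throw when called
        m_listeners.push_back(std::move(listener));
    }

    // Calls every listener; with no listeners this is a no-op.
    void raise() const
    {
        for (const auto& listener : m_listeners)
            listener();
    }

private:
    std::vector<std::function<void()>> m_listeners;
};

class Tree
{
public:
    Event fell;

    void fall()
    {
        // Fall down, and make some noise, whether or not anyone is around.
        fell.raise();
    }
};
```

A lonely tree can call fall() without anyone subscribed, and a tree with two listeners notifies both of them in order.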
Conclusion
C# is very nice for quickly building GUI applications. Especially programmers used to MFC can't seem to praise C# loudly enough. But then again, if you are used to rusty pins being driven under your fingernails daily, the prospect of being kicked in the groin at unpredictable times but only once a week must sound really attractive.
By the way: my introductory computer programming book is available here. As you can guess, it doesn't use C#. But I promise I didn't rant against it in the book.