A set of features for word-level confidence estimation is developed.
The features should be easy to implement and should require no additional
knowledge beyond the information which is available from the speech recognizer
and the training data.
We compare a number of features based on a common scoring method, 
the normalized cross entropy. We also study different ways to
combine the features. An artificial neural network leads to the best performance,
and a recognition rate of 76% is achieved.
The approach is then extended so that it
not only detects recognition errors but also distinguishes between insertion
and substitution errors.
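
For reference, the normalized cross entropy used as the common scoring method can be sketched as follows. This is a minimal illustration that assumes the usual NIST-style definition, in which the log-likelihood of the confidence values is compared against a baseline that assigns every word the overall fraction of correct words. The function name, the argument names, and the clipping constant are hypothetical and are not taken from the paper.

```python
import math


def normalized_cross_entropy(confidences, correct, eps=1e-12):
    """Normalized cross entropy (NCE) of word-level confidence values.

    confidences: estimated probability, per word, that the word is correct.
    correct:     booleans, True where the recognized word was actually correct.
    eps:         clipping constant (an assumption) to keep the logarithms finite.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    n_total = len(correct)
    n_correct = sum(1 for ok in correct if ok)
    if n_correct == 0 or n_correct == n_total:
        raise ValueError("NCE is undefined when all words share one label")

    # Baseline entropy: every word gets the same confidence p_c = n_correct / n_total.
    p_c = n_correct / n_total
    h_max = -(n_correct * math.log2(p_c)
              + (n_total - n_correct) * math.log2(1.0 - p_c))

    # Log-likelihood of the observed labels under the given confidences.
    log_lik = 0.0
    for p, ok in zip(confidences, correct):
        p = min(max(p, eps), 1.0 - eps)
        log_lik += math.log2(p) if ok else math.log2(1.0 - p)

    return (h_max + log_lik) / h_max
```

Under this definition a score of 0 corresponds to the constant baseline, values approaching 1 indicate confidences that separate correct from incorrect words almost perfectly, and negative values indicate confidences that are worse than the baseline.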