It's hard to say whether this is an error, a defect, or a fault, although it seems most like an error.
Assuming the "<" was removed to prevent XML tag mischief, this looks like a discrepancy in the set of admissible characters in the text input (as opposed to inadmissible control codes). Clearly the scanner was expected to handle alphanumeric characters, but what about the non-alphanumeric ASCII symbols, the 8-bit characters with the high bit set, or the entire Unicode character set? If the human writers of the scanned documents had been told never to use "<" but to write it out as "less than" instead, this would not have been a problem.
Thus one could argue that a problem like this was operator error (or a failure to train users). Given how common "<" is in technical text, though, it seems more like a failure either to specify the scanner correctly or to validate that the scanner adhered to a spec requiring correct handling of every character found in the documents.
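The validation step described above could be sketched roughly as follows: collect every character that actually occurs in the documents and compare it against the set the scanner is specified to handle. This is only an illustration. The function name `unsupported_characters`, the `SUPPORTED` set, and the sample documents are all hypothetical and do not come from any real scanner or its spec.

```python
import string
from collections import Counter


def unsupported_characters(documents, supported):
    """Count characters that occur in `documents` but are not in `supported`."""
    counts = Counter()
    for text in documents:
        counts.update(ch for ch in text if ch not in supported)
    return counts


# Hypothetical spec: letters, digits, whitespace, and some basic punctuation.
# A real spec would have to be taken from the scanner's documentation.
SUPPORTED = set(string.ascii_letters + string.digits + " \n.,;:'\"()-")

# Hypothetical sample of the documents the system is expected to process.
sample_docs = [
    "Reject the part if x < 5 mm.",
    "Tolerance: 10 (nominal)",
]

missing = unsupported_characters(sample_docs, SUPPORTED)
print(missing)
```

Any character that shows up in `missing` is one the spec never promised to handle, which is the kind of gap that would otherwise only surface in production.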
The other interesting possibility is that the scanner was an already-developed and "well-validated" module that no one would suspect of harboring a problem. An OCR library that had been developed, tested, and extensively used to scan old newspapers, magazines, fiction, and mainstream books might seem like error-free software because all of its previous use cases involved non-technical text in which "<" rarely, if ever, appears. In that case, the scanner was performing exactly as intended, but the system integrators were using the library outside its intended purpose (not unlike using Windows in a medical device).