Parsing is basically when a process (a parser) reads and interprets data. Generally the data has to follow a format and a syntax, and meet any other compliance rules, or the parser returns an error or warning.
The particular parser we are talking about is an SGML (Standard Generalized Markup Language) parser. Now here is where it all gets fun. Markup languages, like HTML and XML, are not a new idea. Back in the 1970s a need arose for document structure - a way to describe not only the formatting of text but also its meaning (e.g. separating the formatting text from the printed text), and to let programs interpret the document in a variety of ways - without having to embed all those ways in the actual document.
Programmers term this:
"The principle of separating document description from application function..."
http://www.sgmlsource.com/history/roots.htm
SGML parsers look at a document like an engineer might look at a machine: the component pieces, their arrangement, their relationships, and then their meaning (generally in that order too). The parser reconstructs the document according to SGML standards. In this strict reconstruction, errors become very clear to the parser. Some people compare the document structure to a tree, and that's often how parser interpretations are expressed. This sort of analogy lets you easily relate elements of the document to one another and to the document as a whole.
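To make the tree idea concrete, here is a rough sketch in Python. It is not an SGML parser: it leans on the standard library's html.parser module (which is lenient and knows nothing about DTDs) just to tokenize the tags, and the Node, TreeBuilder, build_tree and outline names are made up for this illustration. It also assumes every element has an explicit closing tag, which real HTML does not require.

```python
from html.parser import HTMLParser


class Node:
    """One element in the document tree (hypothetical helper class)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tree and records structural errors instead of guessing."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # A new element becomes a child of whatever element is open now.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Strict rule for this sketch: a close tag must match the open element.
        if self.current.tag != tag:
            self.errors.append(
                f"unexpected </{tag}> inside <{self.current.tag}>"
            )
            return
        self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_tree(markup):
    """Return (root node, list of error messages) for the given markup."""
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    if builder.current is not builder.root:
        builder.errors.append(f"<{builder.current.tag}> was never closed")
    return builder.root, builder.errors


def outline(node, depth=0):
    """Flatten the tree into indented lines, one per element."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Calling build_tree on a well-nested snippet gives an empty error list and a tree you can print with "\n".join(outline(root)). Feeding it something like <p><b>bold</p> reports the mismatched close tag, which is the "errors become very clear" part.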
HTML is defined by a simplified SGML DTD - a child markup language, so to speak. An SGML parser should be able to handle HTML if the opening DOCTYPE line exists (citing the HTML DTD, which declares the document as HTML), and the parser will hold the document exactly to the specification which the document cites.
A program that simply scans a document looking for something in particular (without being exhaustive or strict) is called a scanner. Basically, if you write an incomplete parser you can call it a scanner - that way the parser writers won't go after you for tarnishing their term.
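For contrast, a scanner can be as small as the sketch below. The scan_links function and HREF_PATTERN are hypothetical names for this example. It pulls quoted href values out of any text with a regular expression and never checks whether the surrounding markup is valid, which is exactly why it doesn't deserve to be called a parser.

```python
import re

# Matches href="..." or href='...' and captures the quoted value.
# Unquoted attribute values are deliberately ignored in this sketch.
HREF_PATTERN = re.compile(r"""href\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def scan_links(text):
    """Return every quoted href value found, with no structural checks."""
    return HREF_PATTERN.findall(text)
```

Note that scan_links will happily return links from a document with mismatched or missing tags. It finds what it is looking for and ignores everything else.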
For more than you want to know about parsing, try this:
http://www.cs.vu.nl/~dick/PTAPG.html