What is a Parser?

A parser breaks data into smaller elements according to a set of rules that describe its structure. Most data can be decomposed to some degree. For example, a phone number consists of an area code, a prefix, and a suffix; a mailing address consists of a street address, city, state, country, and zip code.

Consider the following data:
Because of the way these items are formatted, we recognize them as a list of phone numbers. The structure of these items may be described informally as:
"A phone number consists of a three-digit area code, enclosed by parentheses, followed by a three-digit prefix, followed by a dash, followed by a four-digit suffix."

This description can be expressed more formally as a grammar. A grammar is a set of rules that describe the structure, or syntax, of a particular type of data. The following grammar describes the syntax of phone numbers:
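One way to make such rules concrete is to write them down as data. The sketch below is reconstructed from the prose description in this section and is written in Python rather than in ProGrammar's own grammar notation. The symbol names `prefix`, `suffix`, and `digit`, the `GRAMMAR` dictionary, and the helpers `expand` and `format_rule` are assumptions made for illustration.

```python
# Hypothetical encoding of the phone-number grammar as a Python dictionary.
# Each key is a grammar symbol; each value is the sequence it "is composed of".
# Punctuation entries are literal characters; "digit" stands for any of 0-9.
GRAMMAR = {
    "phone_number": ["(", "area_code", ")", "prefix", "-", "suffix"],
    "area_code": ["digit", "digit", "digit"],
    "prefix": ["digit", "digit", "digit"],
    "suffix": ["digit", "digit", "digit", "digit"],
}


def expand(symbol):
    """Expand a symbol into the flat sequence of terminals it is composed of."""
    if symbol not in GRAMMAR:
        return [symbol]  # a literal character or "digit"
    terminals = []
    for part in GRAMMAR[symbol]:
        terminals.extend(expand(part))
    return terminals


def format_rule(symbol):
    """Render one production rule using the "::=" notation."""
    parts = []
    for part in GRAMMAR[symbol]:
        if part in GRAMMAR or part == "digit":
            parts.append(part)          # a named symbol
        else:
            parts.append(f'"{part}"')   # a quoted literal character
    return f"{symbol} ::= " + " ".join(parts)


if __name__ == "__main__":
    for name in GRAMMAR:
        print(format_rule(name))
```

Expanding `phone_number` with `expand` yields thirteen terminals: two parentheses, a dash, and ten digits, which matches the informal description.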
Each rule in the grammar, known as a production rule, describes the composition of a named symbol. The "::=" notation may be interpreted as "is composed of". Hence, the first production rule states that a phone_number is composed of a left parenthesis, followed by an area_code, followed by a right parenthesis, and so on. The next rule states that an area_code is composed of exactly three digits. Note how closely the grammar corresponds to the informal description of phone numbers.

Once the syntax of a data source has been described by grammar rules, a parser can use the grammar to parse the data source; that is, to break data elements such as phone numbers into smaller elements, such as area codes. The output of the parser is a parse tree. The parse tree expresses the hierarchical structure of the input data. For example, the following parse tree is generated when phone number "(800) 555-1234" is parsed, using the grammar shown above:
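A small hand-written parser can illustrate how a phone number is broken into a tree of labeled elements. The Python sketch below is an illustration only, not ProGrammar's implementation or its output format. The `Node` class, `ParseError`, and the functions `expect`, `parse_digits`, `parse_phone_number`, and `show` are hypothetical names, and skipping spaces after the right parenthesis is an assumption made so that "(800) 555-1234" is accepted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # name of the grammar symbol
    value: str                      # input text matched by that symbol
    children: list = field(default_factory=list)


class ParseError(ValueError):
    """Raised when the input does not follow the phone-number rules."""


def expect(text, pos, literal):
    """Require `literal` at `pos`; return the position just after it."""
    if not text.startswith(literal, pos):
        raise ParseError(f"expected {literal!r} at offset {pos}")
    return pos + len(literal)


def parse_digits(text, pos, count, label):
    """Match exactly `count` ASCII digits at `pos`; return (node, new_pos)."""
    chunk = text[pos:pos + count]
    if len(chunk) != count or not (chunk.isascii() and chunk.isdigit()):
        raise ParseError(f"expected {count} digits for {label} at offset {pos}")
    return Node(label, chunk), pos + count


def parse_phone_number(text):
    """Parse text such as "(800) 555-1234" into a parse tree."""
    pos = expect(text, 0, "(")
    area_code, pos = parse_digits(text, pos, 3, "area_code")
    pos = expect(text, pos, ")")
    # Assumption: spaces between ")" and the prefix are skipped.
    while pos < len(text) and text[pos] == " ":
        pos += 1
    prefix, pos = parse_digits(text, pos, 3, "prefix")
    pos = expect(text, pos, "-")
    suffix, pos = parse_digits(text, pos, 4, "suffix")
    if pos != len(text):
        raise ParseError(f"unexpected trailing text at offset {pos}")
    return Node("phone_number", text, [area_code, prefix, suffix])


def show(node, depth=0):
    """Return the tree as indented "label: value" lines."""
    lines = ["  " * depth + f"{node.label}: {node.value}"]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(show(parse_phone_number("(800) 555-1234"))))
```

Run as a script, this prints the root `phone_number: (800) 555-1234` followed by three indented children: `area_code: 800`, `prefix: 555`, and `suffix: 1234`. Input that breaks a rule, such as "800-555-1234", raises `ParseError` instead of producing a tree.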
Parsing is the process of matching grammar symbols to elements in the input data, according to the rules of the grammar. The resulting parse tree is a mapping of grammar symbols to data elements. Each node in the tree has a label, which is the name of a grammar symbol; and a value, which is an element from the input data.

While parsers have traditionally been used in the construction of compilers, they're also quite useful in routine programming tasks, such as reading comma-delimited files, extracting data from formatted reports, and verifying the correctness of data formats. In fact, as applications continue to become more information-centric, the need for robust parsing technologies continues to grow. Common uses for parsers include:
How do I build a Parser using ProGrammar?
For comments or questions about this site, please contact webmaster@programmar.com
Copyright © 1998-2008 NorKen Technologies, Inc. All rights reserved.