
SAX Parsing

Shared by: Nguyen Uyen | File type: PDF | Pages: 21


XML is a method of representing data. It differs from HTML by storing and representing data instead of displaying or formatting it. Its tags are similar to HTML tags, only they are user-defined. It follows a small set of basic rules and is stored as a plain text file (ASCII or another declared encoding such as UTF-8), so portability is insanely easy.


Text content: SAX Parsing

  1. SAX Parsing
     Presented by Clifford Lemoine
     CSC 436 Compiler Design
  2. Introduction
     - Review of XML
     - What is SAX parsing?
     - Simple example program
     - Compiler design issues
       - Demonstrated by a more complex example
     - Wrap-up
     - References
  3. Quick XML Review
     - XML: wave of the future
       - Method of representing data
       - Differs from HTML by storing and representing data instead of displaying or formatting data
       - Tags similar to HTML tags, only they are user-defined
       - Follows a small set of basic rules
       - Stored as a plain text file (ASCII or another declared encoding such as UTF-8), so portability is insanely easy
  4. Quick XML Review
     - Syntax
       - Every XML document has a preamble (the XML declaration, such as <?xml version="1.0"?>; XML 1.0 strictly makes it optional, but it is recommended)
       - An XML document may or may not have a DTD (Document Type Definition) or Schema
  5. Quick XML Review
     - Syntax cont.
       - Every element has a start and end tag, with optional attributes …
  6. Quick XML Review
     - Syntax cont.
       - Elements must be properly nested
       - The outermost element is called the root element
       - An XML document that follows the basic syntax rules is called well-formed
       - An XML document that is well-formed and conforms to a DTD or Schema is called valid
       - Once again, XML documents do not always require a DTD or Schema, but they must be well-formed
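The well-formedness rule above can be checked mechanically: a SAX parser reports a fatal error as soon as the nesting rules are broken. The following is a minimal sketch using the JDK's built-in JAXP parser; the class name `WellFormedCheck` and the sample strings are assumptions made for illustration and are not part of the presentation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reports whether a string is a well-formed XML document. */
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores all content events; we only care whether parsing fails.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // A fatal parse error (for example, crossed tags) means "not well-formed".
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<author><name>Ann</name></author>"));
        System.out.println(isWellFormed("<author><name>Ann</author></name>"));
    }
}
```

This only tests well-formedness. Checking validity against a DTD or Schema would need a validating parser configuration, which the sketch does not set up.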
  7. Quick XML Review
     - Sample XML files
       - Catalog.xml
       - authorSimple.xml
       - authorSimpleError.xml
  8. What is SAX Parsing?
     - Simple API for XML = SAX
     - SAX is an event-based parsing method
     - We are all familiar with event-driven software, whether we know it or not
       - Pop-up windows, pull-down menus, etc.
       - If a certain "event" (or action) happens, do something
     - A SAX parser reads an XML document, firing (or calling) callback methods when certain events are found (e.g. document start/end, element start/end tags, character data; attributes are delivered with the start tag)
  9. What is SAX Parsing?
     - Benefits of SAX parsing
       - Unlike DOM (Document Object Model), SAX does not store information in an internal tree structure
       - Because of this, SAX is able to parse huge documents (think gigabytes) without having to allocate large amounts of system resources
       - Really great if the amount of data you're looking to store is relatively small (no waste of memory on tree)
       - If processing is built as a pipeline, you don't have to wait for the data to be converted to an object; you can go to the next process once it clears the preceding callback method
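To make the memory argument concrete, here is a small sketch of a handler that keeps a single counter while the document streams past, so its own state does not grow with the input. The class name `ElementCounter` and the `book` element are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: counts elements with a given name; no document tree is built. */
final class ElementCounter extends DefaultHandler {
    private final String target;
    private long count;

    ElementCounter(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        // The only state kept is one counter, however large the input is.
        if (qname.equals(target)) {
            count++;
        }
    }

    static long count(InputStream in, String elementName) {
        ElementCounter handler = new ElementCounter(elementName);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        String xml = "<catalog><book/><cd/><book/></catalog>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(count(in, "book"));
    }
}
```

The same `count` method accepts any `InputStream`, so a file stream could be passed in place of the in-memory sample.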
  10. What is SAX Parsing?
     - Downside
       - Most limitations are the programmer's problem, not the API's problem
       - SAX does not allow random access to the file; it proceeds in a single pass, firing events as it goes
       - Makes it hard to implement cross-referencing in XML (ID and IDREF) as well as complex searching routines
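Because the parser never goes back, any cross-referencing has to be done by the handler itself: it remembers what it has seen and resolves references once the pass is over. The sketch below assumes plain attributes named `id` and `ref` (not DTD-declared ID/IDREF types) and a hypothetical class name `RefChecker`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: finds references that point at no declared id. */
final class RefChecker extends DefaultHandler {
    private final Set<String> ids = new HashSet<>();
    private final List<String> refs = new ArrayList<>();

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        String id = atts.getValue("id");    // assumed attribute name
        if (id != null) {
            ids.add(id);
        }
        String ref = atts.getValue("ref");  // assumed attribute name
        if (ref != null) {
            refs.add(ref);
        }
    }

    static List<String> danglingRefs(String xml) {
        RefChecker handler = new RefChecker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        // A ref may appear before its id, so resolution waits until the pass is complete.
        List<String> dangling = new ArrayList<>();
        for (String ref : handler.refs) {
            if (!handler.ids.contains(ref)) {
                dangling.add(ref);
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        String xml = "<lib><loan ref='b2'/><book id='b2'/><loan ref='b9'/></lib>";
        System.out.println(danglingRefs(xml));
    }
}
```

The bookkeeping (a set and a list) is exactly the kind of work a DOM tree would otherwise do for you, which is the trade-off the slide describes.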
  11. What is SAX Parsing?
     - Callback Methods
       - The SAX API has a default handler class built in so you don't have to re-implement the interfaces every time (org.xml.sax.helpers.DefaultHandler)
       - The five most common methods to override are:
         - startElement(String uri, String lname, String qname, Attributes atts)
         - endElement(String uri, String lname, String qname)
         - characters(char text[], int start, int length)
         - startDocument()
         - endDocument()
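A short sketch of a handler that overrides those five callbacks and simply records the order in which the parser invokes them. The class name `EventTrace` and the event labels are assumptions for illustration. The code uses `qname` because the JDK's default `SAXParserFactory` is not namespace-aware, in which case the local name may be empty.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: records each callback as a line of text. */
final class EventTrace extends DefaultHandler {
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        events.add("startElement " + qname);
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // Skip chunks that are only whitespace to keep the trace readable.
        String chunk = new String(text, start, length).trim();
        if (!chunk.isEmpty()) {
            events.add("characters " + chunk);
        }
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        events.add("endElement " + qname);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    static List<String> trace(String xml) {
        EventTrace handler = new EventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<author><name>Ann</name></author>")) {
            System.out.println(event);
        }
    }
}
```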
  12. Simple Example Program
     - Sax.java
       - Instantiates a SAX parser and creates a default handler for the parser
       - Reads in an XML document and echoes the structure to the standard out
     - Two sample XML documents:
       - authorSimple.xml
       - authorSimpleError.xml
     - Demonstration here
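The original Sax.java is not included in this text. As a rough idea of what such an echo program can look like, here is a sketch under that assumption; the class name `SaxEcho`, the two-space indentation, and the built-in sample document are all invented for illustration and are not the presenter's code.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical echo program: rebuilds an indented outline of the document. */
final class SaxEcho extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int depth;

    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        indent();
        out.append('<').append(qname);
        for (int i = 0; i < atts.getLength(); i++) {
            // Values are echoed as-is; this is an outline, not a faithful serializer.
            out.append(' ').append(atts.getQName(i)).append("=\"").append(atts.getValue(i)).append('"');
        }
        out.append(">\n");
        depth++;
    }

    @Override
    public void characters(char[] text, int start, int length) {
        String chunk = new String(text, start, length).trim();
        if (!chunk.isEmpty()) {
            indent();
            out.append(chunk).append('\n');
        }
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        depth--;
        indent();
        out.append("</").append(qname).append(">\n");
    }

    static String echo(InputSource source) {
        SaxEcho handler = new SaxEcho();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.out.toString();
    }

    public static void main(String[] args) {
        // With a file argument, echo that file; otherwise echo a small built-in sample.
        InputSource source = args.length > 0
                ? new InputSource(new File(args[0]).toURI().toString())
                : new InputSource(new StringReader("<author id=\"1\"><name>Ann</name></author>"));
        System.out.print(echo(source));
    }
}
```

Run against a file that is not well-formed, this sketch stops with an exception whose cause is the parser's error, which is one simple way to surface the problem.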
  13. Compiler Design Issues
     - What is actually happening when a SAX parser parses an XML document?
     - What type of internal data structures does it use?
     - How do the callback methods fit in?
     - Can it solve problems of world peace, hunger, and death? (Or at least can it help me pass Compiler Design?)
     - Demonstrated with SaxCatalogUnmarshaller example
  14. Compiler Design Issues
     - Heart of the Beast
       - Underneath it all, the SAX parser uses a stack
       - Whenever an element is started, a new data object is pushed onto the stack
       - Later, when the element is closed, the topmost object on the stack is finished and can be popped
       - Unless it is the root element, the popped element will have been a child element of the object that now occupies the top of the stack
  15. Compiler Design Issues
     - Heart of the Beast cont.
       - This process corresponds to the shift-reduce cycle of bottom-up parsers
       - It is crucial that XML elements be well-formed and properly nested for this to work
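The push and pop discipline can be watched directly with a handler that keeps its own stack of element names. In this sketch the stack lives in the handler rather than inside the parser, and the class name `NestingTracker` and the log wording are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: logs every push and pop on a stack of open elements. */
final class NestingTracker extends DefaultHandler {
    private final Deque<String> stack = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        // "Shift": a newly opened element goes on top of the stack.
        stack.push(qname);
        log.add("push " + qname + " (depth " + stack.size() + ")");
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        // "Reduce": the finished element comes off; its parent is now on top.
        String finished = stack.pop();
        String parent = stack.peek();
        log.add(parent == null
                ? "pop " + finished + ", root element"
                : "pop " + finished + ", parent is " + parent);
    }

    static List<String> track(String xml) {
        NestingTracker handler = new NestingTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.log;
    }

    public static void main(String[] args) {
        for (String line : track("<catalog><book><title>Dune</title></book></catalog>")) {
            System.out.println(line);
        }
    }
}
```

Proper nesting is what guarantees that the element being closed is always the one on top; with crossed tags the parser fails before the handler's stack could get out of step.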
  16. Compiler Design Issues
     - startElement()
       - Four parameters:
         - String uri = the namespace URI (Uniform Resource Identifier)
         - String lname = the local name of the element
         - String qname = the qualified name of the element
         - Attributes atts = list of attributes for this element
       - If the current element is a complex element, an object of the appropriate type is created and pushed onto the stack
       - If the element is simple, a StringBuffer is pushed onto the stack, ready to accept character data
  17. Compiler Design Issues
     - endElement()
       - Three parameters:
         - String uri = the namespace URI (Uniform Resource Identifier)
         - String lname = the local name of the element
         - String qname = the qualified name of the element
       - The topmost element on the stack is popped, converted to the proper type, and inserted into its parent, which now occupies the top of the stack (unless this is the root element; special handling is required)
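The SaxCatalogUnmarshaller source is not reproduced in this text, so the following is only a sketch of the startElement()/endElement() scheme just described, not the presenter's code. The `catalog`, `book`, `title`, and `author` element names and the `Book` class are assumptions, and `StringBuilder` stands in for the `StringBuffer` mentioned on the slides.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical unmarshaller: turns an assumed catalog document into Book objects. */
final class CatalogUnmarshaller extends DefaultHandler {
    static final class Book {
        String title;
        String author;
    }

    private final Deque<Object> stack = new ArrayDeque<>();
    private List<Book> catalog;

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        switch (qname) {
            case "catalog": stack.push(new ArrayList<Book>()); break; // complex element
            case "book":    stack.push(new Book());            break; // complex element
            default:        stack.push(new StringBuilder());   break; // simple element
        }
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // Text is only collected while a simple element's buffer is on top.
        Object top = stack.peek();
        if (top instanceof StringBuilder) {
            ((StringBuilder) top).append(text, start, length);
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void endElement(String uri, String lname, String qname) {
        Object finished = stack.pop();
        if (stack.isEmpty()) {
            // Root element: there is no parent to insert into, so keep the result.
            if (finished instanceof List) {
                catalog = (List<Book>) finished;
            }
            return;
        }
        Object parent = stack.peek();
        if (finished instanceof Book && parent instanceof List) {
            ((List<Book>) parent).add((Book) finished);
        } else if (finished instanceof StringBuilder && parent instanceof Book) {
            String value = finished.toString().trim();
            if (qname.equals("title")) {
                ((Book) parent).title = value;
            } else if (qname.equals("author")) {
                ((Book) parent).author = value;
            }
        }
    }

    static List<Book> unmarshal(String xml) {
        CatalogUnmarshaller handler = new CatalogUnmarshaller();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.catalog == null ? new ArrayList<Book>() : handler.catalog;
    }

    public static void main(String[] args) {
        String xml = "<catalog><book><title>Dune</title><author>Herbert</author></book></catalog>";
        for (Book book : unmarshal(xml)) {
            System.out.println(book.title + " by " + book.author);
        }
    }
}
```

Simple elements that the sketch does not recognise are still pushed and popped so the stack stays balanced; their text is just discarded.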
  18. Compiler Design Issues
     - characters()
       - Three parameters:
         - char text[] = character array holding the character data (a parser buffer that may contain more than the current chunk, so read only the indicated range)
         - int start = starting index of current data in text[]
         - int length = number of characters of current data in text[], counted from start (not an ending index)
       - When the parser encounters raw text, it passes a char array containing the actual data, the starting position, and the length of data to be read from the array
  19. Compiler Design Issues
     - characters() cont.
       - The implementation of the callback method inserts the data into the StringBuffer located on the top of the stack
       - Can lead to confusion because of:
         - No guarantee that a single stretch of characters results in one call to characters()
         - It stores all characters, including whitespace, encountered by the parser
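Both pitfalls are usually handled the same way: append every chunk to a buffer and only interpret the text when the element ends. The sketch below simulates a parser by calling the callbacks by hand, splitting one stretch of text across two characters() calls; the class name `TextCollector` and the sample buffer are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects the trimmed text of each simple element. */
final class TextCollector extends DefaultHandler {
    private final StringBuilder current = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String lname, String qname, Attributes atts) {
        current.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // Append only the indicated range; never assume this is the whole text.
        current.append(text, start, length);
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        // Interpret the text once, after all chunks have arrived, and drop pure whitespace.
        String value = current.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        current.setLength(0);
    }

    List<String> values() {
        return values;
    }

    public static void main(String[] args) {
        TextCollector handler = new TextCollector();
        char[] buffer = "<t>Hello world</t>".toCharArray();
        // Simulated parser: one stretch of text delivered in two chunks.
        handler.startElement("", "", "t", new AttributesImpl());
        handler.characters(buffer, 3, 6); // "Hello "
        handler.characters(buffer, 9, 5); // "world"
        handler.endElement("", "", "t");
        System.out.println(handler.values());
    }
}
```

Resetting the buffer in startElement() suits documents made of simple elements; mixed content would need a buffer per open element, for example one per stack entry.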
  20. Wrap-up
     - SAX is an event-based parser, using callback methods to handle events found by the parser
     - Applications are written by extending the DefaultHandler class and overriding the event handler methods
     - The SAX parser usually uses a stack to perform operations
     - And no, SAX will not save the world…