Working with P2A
In this chapter we discuss basic XML, XML validation and XML parsing. We also discuss how the analysis file is handled in this project.
XML understanding
XML (eXtensible Markup Language) is a general-purpose data format. Two versions of XML exist. Version 1.0 became a W3C standard in February 1998; it has since been revised through five editions without a change of version number. Version 1.1 was released in February 2004. XML has a basic syntax for carrying information between different kinds of computers and applications, and it is used as a format for transferring information. It is more popular than earlier data formats because it supports both tabular and structured data. It is platform independent: XML does not depend on any programming language, software vendor or operating system. [3][4][5][6]
Some Design Goals
• It should support a wide variety of applications.
• It should be easy to write and process.
• It should be reasonably clear and easy to use over the Internet.
An XML document should be well formed and valid.
In this thesis, users upload an analysis file in XML format and the system needs to extract the P2I from it. This XML file contains information about an object-oriented file, which is a standalone object-oriented program file.
A P2A file is divided into two basic parts. Both parts are discussed in detail below.
• Analysis introduction
• P2I
Analysis introduction
From this part of the P2A file the system extracts some basic information about the analyzed Java/object-oriented file. It also provides some basic information about the P2A file itself.
Project name
In the first part, one element contains the name of the project being analyzed. In the P2A file this information appears under the analyzedProject tag.
<analyzedProject>{name of the project}</analyzedProject>
<start tag> data / PCDATA <end tag>
Project version
One element in the P2A file contains the analysis version of the Java file. In this project the user is allowed to upload different analysis versions for one Java file, so it is important for the system to distinguish between two P2A files of the same Java file. One XML statement contains the project version information.
<analyzedProjectVersion>{version of analysis}</analyzedProjectVersion>
Project analysis setup
The third piece of information in the introduction part is the method by which the P2A was generated. One statement in the P2A file shows whether the analysis was generated automatically or manually. In the syntax of the P2A file this information appears under the analysisSetup tag.
<analysisSetup>{generation method}</analysisSetup>
P2I
The second part of the P2A file contains the P2I. This is the important part: it needs to be stored in the database, and for the users the P2I is the basis for comparing different files. This part can hold any kind of information related to object-oriented files, for example information about heap objects and method invocations. For heap objects it can be the names of the object-type attributes in the file and information about the data they will store. P2I can also contain information about the objects created in the file and about the methods called by different objects.
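The two parts of a P2A file described above can be illustrated with a small sketch. It uses Python's standard xml.etree.ElementTree module. Only the three introduction tags come from the text; the root element <p2a>, the <p2i> wrapper, the <heapObj> and <call> entries and all sample values are assumptions made for illustration, and the real P2A vocabulary may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical P2A document. The root <p2a>, the <p2i> wrapper and the
# <heapObj>/<call> entries are assumptions; only the three introduction
# tags are taken from the text.
SAMPLE_P2A = """<p2a>
  <analyzedProject>ExampleProject</analyzedProject>
  <analyzedProjectVersion>1.0</analyzedProjectVersion>
  <analysisSetup>automatic</analysisSetup>
  <p2i>
    <heapObj type="Person"/>
    <call method="getName"/>
  </p2i>
</p2a>"""

INTRO_TAGS = ("analyzedProject", "analyzedProjectVersion", "analysisSetup")


def split_p2a(xml_text):
    """Return (introduction dict, list of P2I entries) for a P2A document."""
    root = ET.fromstring(xml_text)
    # findtext returns None when an introduction element is missing.
    intro = {tag: root.findtext(tag) for tag in INTRO_TAGS}
    p2i_root = root.find("p2i")
    entries = []
    if p2i_root is not None:
        # Keep each P2I entry as (element name, attributes) without
        # interpreting it, since P2I may hold any kind of information.
        entries = [(child.tag, dict(child.attrib)) for child in p2i_root]
    return intro, entries


intro, entries = split_p2a(SAMPLE_P2A)
print(intro)
print(entries)
```

Keeping the P2I entries uninterpreted reflects the statement that this part may carry any kind of information about object-oriented files.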
XML document validation
Validation means checking the document against the construction rules. Whenever a user uploads a P2A file, the system needs to validate the analysis document. If the analysis file is valid, the system extracts the P2I and stores it in the database. If the file is not accepted by the validation technology in use, the user is not allowed to upload it to the system. Two technologies are widely used to validate XML files:
• DTD (Document Type Definition)
• XML Schema
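The accept-or-reject decision at upload time can be sketched as follows. Python's standard library does not include a DTD or XML Schema validator, so this sketch substitutes a much weaker stand-in: a well-formedness check plus a check that the three introduction elements are present. The function name check_p2a and the root element <p2a> are hypothetical.

```python
import xml.etree.ElementTree as ET

REQUIRED_TAGS = ("analyzedProject", "analyzedProjectVersion", "analysisSetup")


def check_p2a(xml_text):
    """Return (accepted, reason) for an uploaded analysis document.

    Stand-in check only: well-formedness plus presence of the three
    introduction elements. It is not DTD or XML Schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, "not well formed: %s" % error
    for tag in REQUIRED_TAGS:
        if root.find(tag) is None:
            return False, "missing element <%s>" % tag
    return True, "ok"


# Hypothetical uploads: one acceptable, one incomplete, one broken.
GOOD = (
    "<p2a><analyzedProject>Demo</analyzedProject>"
    "<analyzedProjectVersion>1</analyzedProjectVersion>"
    "<analysisSetup>manual</analysisSetup></p2a>"
)
INCOMPLETE = "<p2a><analyzedProject>Demo</analyzedProject></p2a>"
BROKEN = "<p2a><analyzedProject>Demo</p2a>"

for upload in (GOOD, INCOMPLETE, BROKEN):
    print(check_p2a(upload))
```

In a real system the body of check_p2a would be replaced by validation against the chosen schema technology; only the accept-or-reject flow is shown here.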
Document Type Definition (DTD)
“A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.” [7]
The purpose of a DTD is to define a structure for an XML document. It defines all the possible elements, attributes and attribute types that can appear in the XML file. If an XML document is written on the basis of a DTD, it is compared against the format defined by that DTD for validation. An XML file created on the basis of a DTD contains information about the DTD. [8]
XML Schema
XML Schema here means the W3C XML Schema technology. It is another technology for validating XML documents, and it is likewise used to draw the boundaries of an XML document. XML Schema was approved as a W3C standard on 2 May 2001. It is used to define the legal building blocks of an XML document: the structure of elements, attributes and child elements, and details about the data they contain. A schema lays down the rules for element data, which can be text or numeric data, empty, fixed or flexible. [9][10]
Technology used in the thesis and why?
Both technologies have their own advantages, but XML Schema is used in this thesis because:
• Schema uses basic XML syntax, while DTD uses a separate syntax.
• Schema supports the Namespaces recommendation.
• Schema lets the user create complex and reusable content very easily.
• Schema supports object-oriented concepts such as inheritance.
• Schema allows text content to be validated against both built-in and user-defined data types.
• Schema is extensible: it can be reused in other schemas.
Schema offers all the features of a DTD and provides extra functionality in addition. It allows more data types, allows custom data types to be created, and provides more control over elements. [11]
XML PARSING
Many technologies are available for parsing XML documents. Two of them are widely used [12]. Both have advantages and disadvantages, and one can choose between them according to the requirements.
• Simple API for XML (SAX).
• Document Object Model (DOM).
We will discuss these two in detail so we can decide which one can fulfill our needs.
SAX: Simple API for Xml
SAX is a standard API for XML. SAX 1.0 was released on 11 May 1998. It was developed to perform efficient parsing of large XML documents. SAX does not have a default object model; it only reads the events that occur in the document. SAX moves from the start tag of the document towards the end tag in a linear fashion. It is fast, so it is the best choice for large documents. A SAX parser is very useful when parsing a large file that we do not want to load completely into main memory. It can also work at low connection speeds because it processes the bytes as soon as they arrive instead of waiting for the whole document. SAX requires less memory and proceeds sequentially. SAX is an event-driven parsing technology: it works on the XML tags, and whenever a tag is found in the file a SAX function is called. The events correspond to the parts of an element: [13][14][15]
• Start tag (it includes the tag name, attributes and other information in the tag).
• Tag data (this portion carries the XML element data).
• End tag (this is the last part, which ensures the whole tag has been parsed).
There are three main steps to create a SAX parser:
• Create an object model.
• Create a SAX parser.
• Create a document handler.
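The event-driven behaviour described above can be sketched with Python's standard xml.sax module. The class name EventRecorder, the sample element and its mode attribute are hypothetical; the list of recorded events plays the role of a very small object model.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Document handler that records each SAX event in arrival order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Start tag: element name plus its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Tag data; whitespace-only chunks are skipped to keep the log short.
        # A parser may deliver one text run in several chunks.
        if content.strip():
            self.events.append(("data", content.strip()))

    def endElement(self, name):
        # End tag: the element has been parsed completely.
        self.events.append(("end", name))


handler = EventRecorder()
# parseString creates a SAX parser and drives the handler with events.
# The mode attribute is a made-up example, not part of the P2A format.
xml.sax.parseString(b'<analysisSetup mode="demo">automatic</analysisSetup>', handler)
for event in handler.events:
    print(event)
```

The handler never sees the document as a whole; it only reacts to one event at a time, which is why SAX keeps so little in memory.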
Advantages
• It reduces memory and CPU usage, since it works on one section at a time.
• It is customizable: when an event is fired, the associated application retrieves the data.
• It is fast and supports pipelining. It produces output as soon as data is parsed.
• We can stop processing at any time, so it is good for fetching particular data.
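Stopping as soon as a particular value has been found can be sketched as below. One common way to do this with Python's xml.sax is to raise an exception from the handler and catch it around the parse call; this is one possible technique, not necessarily the one used in the thesis. The names FoundValue, FirstValueHandler and first_value, and the sample document, are hypothetical.

```python
import xml.sax


class FoundValue(Exception):
    """Raised by the handler to abandon parsing once the value is known."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


class FirstValueHandler(xml.sax.ContentHandler):
    """Collects the text of the first element with the wanted tag name."""

    def __init__(self, wanted_tag):
        super().__init__()
        self.wanted_tag = wanted_tag
        self.chunks = None  # None while outside the wanted element

    def startElement(self, name, attrs):
        if name == self.wanted_tag:
            self.chunks = []

    def characters(self, content):
        if self.chunks is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == self.wanted_tag:
            # Nothing after this point needs to be parsed.
            raise FoundValue("".join(self.chunks).strip())


def first_value(xml_bytes, wanted_tag):
    """Return the text of the first wanted_tag element, or None if absent."""
    try:
        xml.sax.parseString(xml_bytes, FirstValueHandler(wanted_tag))
    except FoundValue as found:
        return found.value
    return None


DOCUMENT = (
    b"<p2a><analyzedProject>Demo</analyzedProject>"
    b"<analyzedProjectVersion>2.1</analyzedProjectVersion>"
    b"<analysisSetup>manual</analysisSetup></p2a>"
)
version = first_value(DOCUMENT, "analyzedProjectVersion")
print(version)
```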
Disadvantages
• It works in one direction only: it can only parse forward through the document.
• Since only one part of the XML file is in memory at a time, it is difficult to perform structural operations.
• It is not possible to access data in random order using SAX.
XML DOM: Document Object Model
“The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document.”[16]
The Document Object Model is a W3C (World Wide Web Consortium) standard for parsing XML documents. DOM places a layer between the XML file and the application. The parser reads the XML data and feeds it to the DOM; the application then uses the DOM to perform different operations on this data. DOM presents an XML file as a tree structure. It is language and platform independent. XML DOM defines a standard set of objects for an XML file and is now a standard way to access any XML document. DOM has three levels:
DOM level 1: It provides a basic level of functionality for navigating and manipulating HTML and XML documents. Level 1 was the first level to be declared a W3C standard.
DOM level 2: This is an upgraded version of level 1. It adds support for XML namespaces and style sheets.
DOM level 3: This is a very comprehensive level, since it gives a complete mapping between DOM and XML. It also supports entity declarations. [17][18]
DOM reads the XML document and creates a tree in memory. The tree is made up of a hierarchy of nodes, which fall into categories such as element nodes, attribute nodes, entity nodes and text nodes. DOM allows random access to each node. Unlike SAX, it allows very powerful document navigation. When working with large documents, it needs more memory. [19]
Figure 2:1 DOM Tree [20]
Figure 2.1 shows the tree structure of DOM. The root node (address book) contains two records (persons), shown as a tree. Unlike SAX, DOM provides complete access by allowing backward and forward movement. Although it is not a memory-efficient technology, it is useful for complex searches.
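Random access and backward movement in a DOM tree can be sketched with Python's standard xml.dom.minidom module. The address book below only mirrors the shape described for the figure (one root, two person records); its element names and values are assumptions.

```python
from xml.dom import minidom

# Hypothetical address book with two person records; element names and
# values are assumptions chosen to mirror the tree described in the text.
ADDRESS_BOOK = (
    "<addressBook>"
    "<person><name>Alice</name><city>Lund</city></person>"
    "<person><name>Bob</name><city>Oslo</city></person>"
    "</addressBook>"
)

# The whole document is loaded into memory as a tree of nodes.
document = minidom.parseString(ADDRESS_BOOK)
root = document.documentElement

# Random access: jump straight to the second record.
persons = root.getElementsByTagName("person")
second = persons[1]
second_name = second.getElementsByTagName("name")[0].firstChild.data

# Backward movement: from a leaf element up to its parent, and from the
# second record across to the previous sibling record.
city = second.getElementsByTagName("city")[0]
parent_tag = city.parentNode.tagName
previous_name = second.previousSibling.getElementsByTagName("name")[0].firstChild.data

print(second_name, parent_tag, previous_name)
```

The sample contains no whitespace between elements, so previousSibling is the first person element rather than a text node.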
Advantages
• It provides random access to the data.
• It is good when we need to implement complex searches.
• It is the best choice when we need to perform an XSLT transformation.
• We can use it when we need to modify the XML file.
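Modifying a document through the DOM can be sketched as follows, again with xml.dom.minidom. The root element <p2a> and the sample values are hypothetical.

```python
from xml.dom import minidom

document = minidom.parseString(
    "<p2a><analyzedProjectVersion>1.0</analyzedProjectVersion></p2a>"
)

# Change existing text in place.
version = document.getElementsByTagName("analyzedProjectVersion")[0]
version.firstChild.data = "1.1"

# Append a new element with a text child.
setup = document.createElement("analysisSetup")
setup.appendChild(document.createTextNode("manual"))
document.documentElement.appendChild(setup)

# Serialize the modified tree back to XML text.
updated_xml = document.documentElement.toxml()
print(updated_xml)
```

Because the whole tree is in memory, both edits are simple node operations; a SAX parser has no tree to edit.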
Disadvantages
• DOM is a resource-intensive parser: the XML structure stays in memory, so large files need more memory.
• It is slow compared to SAX.
Parsing P2A file
After a complete comparative study of DOM and SAX, we decided to use SAX for parsing the XML analysis file. Since we do not need to make any changes to the analysis file and the search is not complex, SAX works well in this scenario. “The thing I like most about SAX is that it allows you to ignore all the portions of your XML document that you don’t care about, making it not only trivial to only pick the information you are interested in, but also easier to migrate your schema over time, should you decide to do so.” [21] SAX is efficient and saves memory. A SAX parser will parse only files that are well formed.
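A minimal sketch of SAX-based P2A parsing, assuming Python's xml.sax, is given below. The handler picks up the three introduction fields, counts a few P2I entries and ignores everything else. The P2I element names heapObj and call, the root element <p2a> and the <comment> element are hypothetical; the thesis implementation itself may look quite different.

```python
import xml.sax


class P2AHandler(xml.sax.ContentHandler):
    """Collects the introduction fields and counts selected P2I entries."""

    INTRO_TAGS = {"analyzedProject", "analyzedProjectVersion", "analysisSetup"}
    # Hypothetical P2I element names; the real P2A vocabulary may differ.
    P2I_TAGS = {"heapObj", "call"}

    def __init__(self):
        super().__init__()
        self.intro = {}
        self.p2i_counts = {}
        self._current = None  # introduction tag currently being read
        self._text = []

    def startElement(self, name, attrs):
        if name in self.INTRO_TAGS:
            self._current = name
            self._text = []
        elif name in self.P2I_TAGS:
            self.p2i_counts[name] = self.p2i_counts.get(name, 0) + 1
        # Any other element is simply ignored.

    def characters(self, content):
        if self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == self._current:
            self.intro[name] = "".join(self._text).strip()
            self._current = None


def parse_p2a(xml_bytes):
    """Parse a P2A document and return (introduction, P2I entry counts)."""
    handler = P2AHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.intro, handler.p2i_counts


SAMPLE = b"""<p2a>
  <analyzedProject>Demo</analyzedProject>
  <analyzedProjectVersion>2</analyzedProjectVersion>
  <analysisSetup>manual</analysisSetup>
  <comment>ignored by the handler</comment>
  <heapObj/><heapObj/><call/>
</p2a>"""

intro, p2i_counts = parse_p2a(SAMPLE)
print(intro)
print(p2i_counts)
```

A document that is not well formed makes parseString raise xml.sax.SAXParseException, which matches the remark that only well-formed files can be parsed.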
Requirement Analysis and Design
In this chapter we analyze the system requirements and the interaction between the user and the system, using UML diagrams.
Methodology
In this section we discuss the software development methodology that we used for our work. After considering the goal our project has to attain, we chose the waterfall model to develop our software.
Waterfall software development methodology
The waterfall model has different phases, as shown in figure 3.1. The first phase of the waterfall model is requirements; in this phase all the software requirements are gathered and the software goals and constraints are analyzed. A better job in this phase leads to a better end result.
The system specification phase is based upon the requirements phase. This phase defines the system functionality. The system specification and requirements phases can be seen as a single phase.
In the system design phase the system has to be designed properly before any implementation. This phase defines the main blocks and components of the system. Software components should be compatible with the user requirements and with possible scalability of the system.
Figure 3:1 Waterfall model [23]
In the implementation phase the software engineering work is translated into code. Software development starts with small components called units. These components can be integrated later to achieve the complete functionality of the system.
In the system integration phase the different software components are integrated. In the previous phase every software component was developed independently and its functionality was tested through unit testing. In this phase the units are brought together and tested to ensure that they work in combination.
The last phase of the waterfall model is maintenance. In this phase the software is handed over to the customer with the expectation that it meets the user requirements. If the customer finds problems in the software, these problems are solved in this phase. [22][23]
Waterfall software development methodology implementation
We adopted this methodology because the requirements and specifications of the system are clear. The second reason for selecting the waterfall model is that the system is developed in units, such as user authentication, which reduces complexity and allows the effectiveness and efficiency of every unit to be tested. After unit testing is complete, the units are integrated to test the functionality of the whole system.
Requirements
Requirements analysis is the first phase of system development and of the waterfall model. The design and implementation of the system are based on the requirements specification. The requirements specification helps us to understand and define the scope of the system, and it also draws the boundaries of the system. There are two types of requirements:
• Functional Requirements
• Non-Functional Requirements
Functional Requirements
In this section we describe and discuss the functional requirements of the proposed system. Functional requirements define the important components and functions of the system. They also include the required functions of the proposed system and their level of importance, with a brief description.
Table of contents :
1 Introduction
1.1 Motivation
1.2 Problem definition
1.3 Report Structure
2 Working with P2A
2.1 XML understanding
2.1.1 Analysis introduction
2.1.2 P2I
2.2 XML document validation
2.3 Document Type Definition (DTD)
2.3.1 XML Schema
2.3.2 Technology used in the thesis and why?
2.4 XML PARSING
2.4.1 SAX: Simple API for Xml
2.4.2 XML DOM: Document Object Model
2.5 Parsing P2A file
3 Requirement Analysis and Design
3.1 Methodology
3.1.1 Waterfall software development methodology
3.1.2 Waterfall software development methodology implementation
3.2 Requirements
3.2.1 Functional Requirements
3.2.2 Non-Functional Requirements
3.3 Use Case Diagram
3.4 System Sequence Diagram
4 Implementation
4.1 Dreamweaver MX 2004
4.2 Apache 2.2.11
4.3 PHP 5.2.9-2
4.4 MYSQL 5.1.33
4.5 phpMyAdmin 3.1.3
4.6 Website Development
4.6.1 Three tier architecture
4.6.2 Client side development
4.7 Server side development
4.7.1 Createuser.php
4.7.2 fileDownload.php
4.7.3 Fileupload.php
4.7.4 loginScript.php
4.7.5 Newschema.php
4.7.6 Verification.php
4.7.7 Connect.php
5 Database design and implementation
5.1.1 Users
5.1.2 Files
5.1.3 New Schema
5.1.4 P2A
5.1.5 Main_Table
5.1.6 P2I_ analysis Setup (Example Table)
5.1.7 P2I_ Analyzed Project (Example Table)
5.1.8 P2I_ Analyzed Project Version (Example Table)
5.1.9 P2I_heapObj (Example Table)
5.1.10 P2I_Call (Example Table)
5.1.11 How to Extend database
6 Conclusion and Feature enhancement
6.1 Conclusion
6.2 Future work
7 References
7.1 XML basic and Schema
7.2 Related Studies