What is it?
XPath for Watir supplies a more powerful way to identify page elements in a Watir script: XPath expressions that select the specific elements of interest in the page.
Why is it required?
The attributes that Watir provides for identifying HTML elements are sometimes not sufficient to identify the element(s) you want, especially when you are testing non-trivial, real-world web applications. XPath is a well-established and powerful query language for XML documents. HTML can be cleaned and converted into XHTML, after which XPath can be used to identify elements structurally on the page. XPath lets the user write scripts that are more general and less brittle.
For example, if you have the following HTML:
<table>
  <tr><td><img src="1.jpg">First Image</td></tr>
  <tr><td><img src="2.jpg">Second Image</td></tr>
  <tr><td><img src="3.jpg">Third Image</td></tr>
</table>
Now if you want to try this using Watir (without XPath), then the code might be as follows:
ie.tables.each do |t|                        # since you don't have IDs, look at every table
  for i in 1..t.row_count                    # for every row in this table
    t[i].each do |cell|                      # for every column in this row, look at its contents
      if cell.image(:src, /3\.jpg/).exists?  # if true, this is your cell
        puts cell.text
      end
    end
  end
end
If you try to access that element using Watir with XPath the code will be:
puts ie.cell(:xpath, "//img[@src='3.jpg']/../").text
Third Image
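The structural query can also be tried without a browser. The sketch below is an illustration only, not Watir code: it runs an equivalent parent-of-image expression through REXML against a hand-written XHTML version of the table. The markup and the helper name cell_text_for are assumptions made for the example.

```ruby
require 'rexml/document'

# Hand-written XHTML standing in for the cleaned page source (assumed markup).
SAMPLE_XHTML = <<~XHTML
  <table>
    <tr><td><img src="1.jpg"/>First Image</td></tr>
    <tr><td><img src="2.jpg"/>Second Image</td></tr>
    <tr><td><img src="3.jpg"/>Third Image</td></tr>
  </table>
XHTML

# Hypothetical helper: text of the cell that contains the given image,
# or nil when no such image exists.
def cell_text_for(xhtml, src)
  doc = REXML::Document.new(xhtml)
  # Select the <img> by its src attribute, then step up to its parent <td>.
  cell = REXML::XPath.first(doc, "//img[@src='#{src}']/..")
  cell && cell.text.strip
end

puts cell_text_for(SAMPLE_XHTML, '3.jpg')
```

The query does not depend on which table or row holds the image, which is what makes it less brittle than the nested loops.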
What should I have before using XPath-Watir?
If you have the latest Ruby and Watir, you are fine. If not, you should have at least:
- Ruby 1.8.2
- REXML > 3.1.6
- Watir > 1.4.1
How to use XPath in a test script?
All commonly used HTML elements for testing have been given a new attribute in Watir, called xpath, that identifies them by an XPath query. This attribute is available for most elements, but not for frame. All you have to do is pass this attribute with the XPath query as the second argument and get the element.
For example:
ie.select_list(:xpath, "xpath query")
Example Code
If you want to get text from this link:
<a href="test.htm">click me</a>
you can use this code:
ie.link(:xpath,"//a[@href='test.htm']/").text # => "click me"
What to do for elements that have no class in Watir?
You can use the element_by_xpath function of the IE class to get the underlying ole_object. Then you can invoke any method supported by that element.
Example Code
Suppose you have a map and you want to click an area. As Watir doesn't have a class for map or area, use the element_by_xpath function directly to get the underlying ole_object.
HTML code:
<area shape="poly" coords="150,16,159,17,168,20,175,25,182,32,150,56,150,56">
<area shape="poly" coords="182,32,188,43,190,56,150,56,150,56" href="PieChart.html?category=Critical&pieIndex=0">
# get the underlying object and execute click method
ie.element_by_xpath("//area[contains(@href , 'PieChart.html')]/").click
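The contains() test in that query can be checked locally with REXML. This is a sketch under stated assumptions: the XHTML is hand-written, the map name and the first area's href are made up for the example, and matching_area_coords is a hypothetical helper. Note that the ampersand in the href has to be written as &amp; once the page is expressed as XHTML.

```ruby
require 'rexml/document'

# Hand-written XHTML stand-in; the map name and the first href are invented.
MAP_XHTML = <<~XHTML
  <map name="chart">
    <area shape="poly" coords="150,16,159,17,168,20" href="Other.html"/>
    <area shape="poly" coords="182,32,188,43,190,56,150,56,150,56" href="PieChart.html?category=Critical&amp;pieIndex=0"/>
  </map>
XHTML

# Hypothetical helper: coords of every <area> whose href contains the fragment.
def matching_area_coords(xhtml, fragment)
  doc = REXML::Document.new(xhtml)
  query = "//area[contains(@href, '#{fragment}')]"
  REXML::XPath.match(doc, query).map { |area| area.attributes['coords'] }
end

puts matching_area_coords(MAP_XHTML, 'PieChart.html')
```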
Limitations
There is no XPath support for frames, but the existing frame functionality in Watir remains. That is, you can access frames using the attributes that Watir already provides, but you can't use the xpath attribute to access them.
How is it implemented?
XPath support is added using REXML. We retrieve the source HTML from the IE DOM model, which is then cleaned and converted into XHTML. The cleaned XHTML is passed to the REXML parser. The XPath supplied by the user is resolved locally using REXML to get the complete path of the element(s) from the document root. We then traverse the element's complete path over the IE DOM using a series of COM calls and return the requested element.
In summary:
- Get the source HTML from the IE DOM model.
- Clean the HTML source and convert it to valid XHTML.
- Pass the cleaned XHTML as input to REXML.
- Resolve the XPath expression locally using REXML and get the complete path of the element(s).
- Traverse the element's complete path over the IE DOM using a series of COM calls and return the requested element.
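The parsing and resolving steps above can be sketched with REXML alone. This is a minimal illustration, not the Watir source: absolute_paths is a hypothetical helper, the XHTML is a hand-written stand-in for cleaned page source, and REXML's Element#xpath is used to obtain each match's path from the document root.

```ruby
require 'rexml/document'

# Hypothetical helper: resolve an XPath query locally and return the
# root-to-element path of every matching element.
def absolute_paths(xhtml, query)
  doc = REXML::Document.new(xhtml)
  REXML::XPath.match(doc, query).map { |element| element.xpath }
end

# Hand-written stand-in for the cleaned page source (assumed markup).
PAGE_XHTML = <<~XHTML
  <html>
    <body>
      <table>
        <tr><td><a href="a.htm">one</a></td></tr>
        <tr><td><a href="test.htm">click me</a></td></tr>
      </table>
    </body>
  </html>
XHTML

puts absolute_paths(PAGE_XHTML, "//a[@href='test.htm']")
```

Each returned path only names tags and positions, so it can be followed over a different representation of the same page, which is what the final step does against the IE DOM.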
Code Design
rexml_document_object
This function returns the REXML document object.
create_rexml_document_object
This is a private function, called only if a REXML document has not yet been created for the current HTML page. rexml_document_object calls it internally after checking the variable rexmlDomobject, which is set to nil whenever a new page is rendered in the browser. If you execute multiple XPath queries on the same source page, the object is created once on the first request and reused from then on.
elements_by_xpath(XPath)
This function resolves the XPath query provided by the user and returns all matching elements. The XPath query must be compatible with REXML, i.e. it should contain only those attributes or functions that REXML supports. The function first gets, locally using REXML, the complete path from the root element for every matching element. Each path is then passed to elements_by_absolute_xpath, and the matching elements are returned to the caller.
element_by_absolute_xpath(XPath)
This is the private function that maps an element's complete path (the XPath passed in) to the actual element in the IE DOM tree. It traverses the IE DOM starting from the tag, following the passed complete path. We call the IHtmlElement::getChildNodes function to get all the child nodes and select the one of interest, as indicated by the complete path. We loop until the complete path has been traversed, and the desired element is then returned.
html_source(element, htmlString, spaceString)
This function traverses the IE DOM model and creates a string containing the HTML displayed in the browser, the same as you see with the View Source option when you right-click in the browser. The function is initially called with element set to body. During processing, the function gets all the attributes of the element using IHtmlElement::outerHtml, and the attributes are then cleaned. If the element can't have child nodes, the tag is made self-closing; otherwise we just close the opening tag. The function is then called recursively on the element's child nodes. While unwinding the recursion, we check again whether the element can have child nodes: if it can, we emit the closing tag for the tag that was left open; otherwise we simply return. Script and comment tags are not taken into consideration.
tokenize_tagline(outerHtml)
This function scans the outer HTML of the element and returns an array of tokens. A token can be a tag name, "=", an attribute name, or an attribute value. An attribute value can be either a quoted string or a single word.
Input:
<input type=radio name=WATiR checked>
Output:
{'input', 'type', '=', 'radio', 'name', '=', 'WATiR', 'checked'}
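A tokenizer with this behaviour might look like the following. It is a from-scratch sketch written for illustration, not the Watir implementation; only the function name and the token kinds come from the description above, and the regular expressions are assumptions.

```ruby
# Illustrative re-implementation; not the Watir source.
def tokenize_tagline(outer_html)
  # Keep only the opening tag, without angle brackets or a trailing slash.
  tag = outer_html[/<([^>]*)>/, 1].to_s.sub(%r{/\s*\z}, '')
  # A token is a quoted string (quotes dropped), an equals sign, or a bare word.
  tokens = tag.scan(/"([^"]*)"|'([^']*)'|(=)|([^\s=]+)/)
  # Exactly one capture group is set per match; keep that one.
  tokens.map { |groups| groups.compact.first }
end

p tokenize_tagline('<input type=radio name=WATiR checked>')
```

Quoted values are tried before bare words, so a value such as "a b" stays a single token even though it contains a space.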
all_tag_attributes(outerHtml)
This function is called by the html_source function to get all the attributes of the tag. It calls tokenize_tagline to separate attribute names and values, then cleans the outer HTML by placing quotes around the attribute values. If only an attribute name is present, its value is set to the attribute name.
Input:
<input type=radio name=WATiR checked>
Output:
<input type="radio" name="WATiR" checked="checked">
xml_escape
This function escapes the characters that are not valid as data in XML, e.g. &, <, > and ".
More information on REXML
The XPath that is returned by REXML is of the format:
/html/body/table/tbody/tr[2]/td[2]/div/table/tbody/tr[3]/td[1]/a[2]
If this string is returned as a result, the path is read left to right, each tag being nested inside the one before it: the element resides inside html (the root element), then the body tag inside html, the table tag inside body, the tbody tag inside table, the second tr tag inside tbody, the second td tag inside that tr, the div tag inside td, the table tag inside div, the tbody tag inside table, the third tr tag inside tbody, the first td tag inside tr, and the second a tag inside td.
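Following such a path step by step can be sketched as below. This is an illustration under stated assumptions: Node is a plain Ruby stand-in for a DOM node (the real code walks the IE DOM over COM), parse_absolute_path, node_at and sample_tree are hypothetical names, and only numeric positions such as tr[2] are handled.

```ruby
# Stand-in for a DOM node; label exists only to tell nodes apart in the demo.
Node = Struct.new(:tag, :children, :label)

# Split "/html/body/tr[2]" into [["html", 1], ["body", 1], ["tr", 2]].
# A step without a position means the first (only) child with that tag.
def parse_absolute_path(path)
  path.split('/').reject(&:empty?).map do |step|
    tag, index = step.match(/\A([^\[]+)(?:\[(\d+)\])?\z/).captures
    [tag, (index || 1).to_i]
  end
end

# Follow the path from the root node, picking at each step the n-th child
# that has the wanted tag. Returns nil when the path leads nowhere.
def node_at(root, path)
  steps = parse_absolute_path(path)
  first_tag, = steps.shift
  return nil unless root.tag == first_tag

  steps.reduce(root) do |node, (tag, index)|
    return nil if node.nil?
    node.children.select { |child| child.tag == tag }[index - 1]
  end
end

# Hypothetical page: html > body > table > two rows; the second row's
# only cell holds two links.
def sample_tree
  links = [Node.new('a', [], 'first link'), Node.new('a', [], 'second link')]
  rows  = [Node.new('tr', []), Node.new('tr', [Node.new('td', links)])]
  Node.new('html', [Node.new('body', [Node.new('table', rows)])])
end

puts node_at(sample_tree, '/html/body/table/tr[2]/td/a[2]').label
```

Positions are 1-based and count only siblings with the same tag, which matches how the path above is read.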