Friday, April 9, 2010

Watir:XPath: What is XPath?: How to identify XPath in ruby scripting?

What is it?

XPath for Watir supplies a more powerful way to identify page elements in a Watir script: XPath expressions that pick out the specific elements of interest in the page.

Why is it required?

The attributes that Watir provides for identifying HTML elements are sometimes not sufficient, especially when you are testing non-trivial, real-world web applications. XPath is a well-established and powerful query language for XML documents. It is straightforward to clean HTML and convert it into XHTML, so XPath can be used to identify an element structurally on the page. XPath lets you write scripts that are more general and less brittle.
For example, if you have the following HTML:
<td><img src="1.jpg">First Image</td>
<td><img src="2.jpg">Second Image</td>
<td><img src="3.jpg">Third Image</td>
Now you want the text of the <td> tag that contains the image with source 3.jpg. This is called structural addressing: you locate an element relative to other elements that may be above or below it in the HTML tree that the browser renders.
Now if you want to try this using Watir (without XPath), then the code might be as follows:
ie.tables.each do |t|                        # since you don't have IDs, look at every table
  for i in 1..t.row_count                    # for every row in this table
    t[i].each do |cell|                      # for every column in this row, look at its contents
      if cell.image(:src, /3\.jpg/).exists?  # if true, this is your cell
        puts cell.text
      end
    end
  end
end
On the other hand, XPath is a very powerful mechanism for identifying an element structurally on the page.
If you access that element using Watir with XPath, the code is:
puts ie.cell(:xpath, "//img[@src='3.jpg']/../").text
In both cases the output will be
Third Image
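
The structural lookup itself can be tried outside the browser with Ruby's REXML library. This is only a minimal sketch, assuming the page has already been cleaned into well-formed XHTML; the xhtml string below is a hypothetical stand-in for that page, and no Watir call is involved.

```ruby
require 'rexml/document'

# Hypothetical, already-cleaned XHTML standing in for the page.
xhtml = <<~PAGE
  <table>
    <tr><td><img src="1.jpg"/>First Image</td></tr>
    <tr><td><img src="2.jpg"/>Second Image</td></tr>
    <tr><td><img src="3.jpg"/>Third Image</td></tr>
  </table>
PAGE

doc = REXML::Document.new(xhtml)

# Find the image by its src attribute, then step up to its parent cell.
cell = REXML::XPath.first(doc, "//img[@src='3.jpg']/..")
cell_text = cell.text
puts cell_text
```

The query names the image first and then moves to its parent, so no loop over tables, rows and cells is needed.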

What should I have before using XPath-Watir?

If you have the latest Ruby and Watir, you are fine.
If not, you should have at least:

How to use XPath in test script?

All commonly used HTML elements for testing have been given a new attribute in Watir, called xpath, for identifying them with an XPath query. This attribute is available for most elements, except frame. All you have to do is pass the XPath query as the second argument with this attribute to get the element.
For example:
ie.select_list(:xpath, "xpath query")

Example Code

If you want to get text from this link:
<a href="test.htm">click me</a>
you can use this code:
ie.link(:xpath, "//a[@href='test.htm']/").text # => "click me"
The Watir class that you use to identify the element with XPath (link in the example above) should match the element that you are trying to access. You can then use any of the methods or properties that Watir exposes for the Link element.

What to do for elements not having class in Watir?

You can use the element_by_xpath function of the IE class to get the underlying ole_object. You can then invoke any method supported by that element.

Example Code

Suppose you have a map and you want to click an area. As Watir doesn't have a class for map or area, use the element_by_xpath function directly to get the underlying ole_object.
HTML code:

      <area shape="poly" coords="150,16,159,17,168,20,175,25,182,32,150,56,150,56" >
      <area shape="poly" coords="182,32,188,43,190,56,150,56,150,56" href="PieChart.html?category=Critical&pieIndex=0">
Watir code:
# get the underlying object and execute click method
ie.element_by_xpath("//area[contains(@href , 'PieChart.html')]/").click


There is no XPath support for frames; however, the existing functionality that Watir provides remains. That is, you can access frames using the attributes that Watir provides, but you can't use the xpath attribute to access frames.

How is it implemented?

XPath support is added using REXML. We retrieve the source HTML from the IE DOM, which is then cleaned and converted into XHTML. The cleaned XHTML is passed to the REXML parser. The XPath supplied by the user is resolved locally using REXML to get the complete path of the element(s) from the document root. We then traverse the element's complete path over the IE DOM using a series of COM calls and return the requested element.
  1. Get the source HTML from the IE DOM.
  2. Clean the HTML source and convert it to valid XHTML.
  3. Pass the cleaned XHTML as input to REXML.
  4. Resolve the XPath expression locally using REXML and get the complete path of the element(s).
  5. Traverse the element's complete path over the IE DOM using a series of COM calls and return the requested element.
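
Steps 3 and 4 can be sketched with REXML alone. This is an illustration rather than Watir's own code: the xhtml string is a hypothetical cleaned page, and it is an assumption here that the complete path is obtained through REXML's Element#xpath method.

```ruby
require 'rexml/document'

# Hypothetical cleaned XHTML; in Watir this would be built from the IE DOM.
xhtml = '<html><body><p>intro</p><p><a href="test.htm">click me</a></p></body></html>'

# Step 3: hand the cleaned XHTML to REXML.
doc = REXML::Document.new(xhtml)

# Step 4: resolve the user's query locally, then ask each match
# for its complete path from the document root.
matches = REXML::XPath.match(doc, "//a[@href='test.htm']")
paths = matches.map { |element| element.xpath }

puts paths
```

Each entry in paths is a string of tag names separated by slashes, with a bracketed position added where a tag has same-named siblings. Step 5 would then follow such a string over the live DOM.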

Code Design


The rexml_document_object function returns the REXML document object.


This is a private function and is called only if the REXML document has not already been created for that HTML page. It is called internally by rexml_document_object, which checks the variable rexmlDomobject; that variable is set to nil when a new page is rendered in the browser. If you execute multiple XPath queries on the same source page, the object is created once on the first request and reused from then on.
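
The create-once, reset-on-navigation behaviour can be illustrated with a small stand-alone class. PageCache, page_changed and parse_count are hypothetical names invented for this sketch; only rexml_document_object is a name taken from the text, and the body shown for it is an assumption, not Watir's code.

```ruby
require 'rexml/document'

# Hypothetical stand-in for the browser wrapper; only the caching idea is shown.
class PageCache
  attr_reader :parse_count

  def initialize(source)
    @source = source
    @rexml_dom_object = nil   # nil means "no document built for this page yet"
    @parse_count = 0
  end

  # Called when a new page is rendered: drop the cached document.
  def page_changed(new_source)
    @source = new_source
    @rexml_dom_object = nil
  end

  # Build the REXML document on first use, then reuse it.
  def rexml_document_object
    @rexml_dom_object ||= begin
      @parse_count += 1
      REXML::Document.new(@source)
    end
  end
end

cache = PageCache.new('<html><body><a href="a.htm">a</a></body></html>')
cache.rexml_document_object
cache.rexml_document_object            # second query on the same page: no re-parse
count_same_page = cache.parse_count

cache.page_changed('<html><body/></html>')
cache.rexml_document_object            # new page: parsed again
count_after_navigation = cache.parse_count
```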


This function resolves the XPath query provided by the user and returns all qualifying elements. The XPath query should be compatible with REXML, i.e. it should contain only attributes or functions that REXML supports. The function gets the complete path from the root element for all qualifying elements locally using REXML. This path is then passed to elements_by_absolute_xpath, and the qualifying elements are returned to the caller.


This is the private function (elements_by_absolute_xpath) that maps the element's complete path (the passed XPath) to the actual element in the IE DOM tree. It traverses the IE DOM starting from the root tag, following the passed complete path. We call the IHtmlElement::getChildNodes function to get all the child nodes and select the one of interest as indicated by the complete path. We loop until we have traversed the complete path, and the desired element is finally returned.
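
The traversal can be sketched without a browser by walking a REXML tree in place of the IE DOM. The function name element_by_absolute_path and the sample document are hypothetical, and the real implementation makes COM calls on IE nodes rather than the REXML calls used here.

```ruby
require 'rexml/document'

# Follow a complete path such as "/html/body/p[2]/a[2]" one step at a time.
# root is a REXML element standing in for the root node of the IE DOM.
def element_by_absolute_path(root, path)
  steps = path.split('/').reject(&:empty?)
  current = nil
  steps.each do |step|
    tag, index = step.match(/\A([^\[]+)(?:\[(\d+)\])?\z/).captures
    position = index ? index.to_i : 1   # a step without an index means the first match
    # The first step is checked against the root itself, later steps against children.
    candidates = current ? current.children.grep(REXML::Element) : [root]
    same_tag = candidates.select { |child| child.name == tag }
    current = same_tag[position - 1]
    return nil if current.nil?          # the path does not exist in this tree
  end
  current
end

doc = REXML::Document.new(
  '<html><body><p>intro</p><p><a href="x.htm">one</a><a href="y.htm">two</a></p></body></html>'
)
found = element_by_absolute_path(doc.root, '/html/body/p[2]/a[2]')
found_text = found && found.text
missing = element_by_absolute_path(doc.root, '/html/body/div')
```

Positions in the path are 1-based, which is why the sketch subtracts one before indexing into the array of same-named children.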

html_source(element, htmlString, spaceString)

This function traverses the IE DOM and creates a string that contains the HTML displayed in the browser, the same as you see with the View Source option when you right-click in the browser. The function is initially called with the body element. During processing it gets all the attributes of the element using IHtmlElement::outerHtml, and the attributes are then cleaned. If the element can't have child nodes, the tag is made self-closing; otherwise we just end the opening tag. The function is then called recursively on the child nodes of this element. While unwinding the recursion we check again whether the element can have child nodes: if it can, we emit the closing tag for the tag that was left open; otherwise we simply return. Script and comment tags are not taken into consideration.
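
The recursion can be sketched over a plain Ruby structure instead of the IE DOM. Node, VOID_TAGS, SKIPPED_TAGS and html_source_sketch are hypothetical names for this illustration, the list of childless tags is an assumption, and the attributes are taken as already cleaned.

```ruby
# Hypothetical node type standing in for an IE DOM element.
Node = Struct.new(:tag, :attributes, :text, :children)

VOID_TAGS = %w[br hr img input meta link].freeze   # assumed set of childless tags
SKIPPED_TAGS = %w[script #comment].freeze          # left out of the output

def html_source_sketch(node, out = +'', indent = '')
  return out if SKIPPED_TAGS.include?(node.tag)

  attrs = node.attributes.map { |name, value| %( #{name}="#{value}") }.join
  if VOID_TAGS.include?(node.tag)
    out << "#{indent}<#{node.tag}#{attrs}/>\n"        # no children: self-closing tag
  else
    out << "#{indent}<#{node.tag}#{attrs}>#{node.text}\n"
    node.children.each { |child| html_source_sketch(child, out, indent + '  ') }
    out << "#{indent}</#{node.tag}>\n"                # closing tag emitted on the way back up
  end
  out
end

img = Node.new('img', { 'src' => '3.jpg' }, '', [])
script = Node.new('script', {}, 'x()', [])
para = Node.new('p', {}, 'hi', [])
body = Node.new('body', {}, '', [img, script, para])

source = html_source_sketch(body)
puts source
```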


This function (tokenize_tagline) scans the outer HTML of the element and returns an array of tokens. A token can be a tag name, "=", an attribute name, or an attribute value. An attribute value can be either a quoted string or a single word.
Output: An array with values.
{'input', 'type', '=', 'radio', 'name', '=', 'WATiR', 'checked'}
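
A tokenizer of this kind can be sketched with a single regular expression. The function name tokenize_tagline_sketch and the sample input are assumptions made for illustration; this is not the code Watir ships.

```ruby
# Split a tag's opening line into tokens: tag name, attribute names, "=",
# and attribute values (quoted strings or single words).
def tokenize_tagline_sketch(outer_html)
  # Keep only what sits between "<" and the first ">", minus a trailing "/".
  tag_line = outer_html[/\A<([^>]*)>/, 1].to_s.sub(%r{/\s*\z}, '')
  tag_line.scan(/"[^"]*"|'[^']*'|=|[^\s=]+/).map do |token|
    token.sub(/\A(["'])(.*)\1\z/m, '\2')   # drop the quotes around a quoted value
  end
end

tokens = tokenize_tagline_sketch('<input type=radio name=WATiR checked>')
quoted = tokenize_tagline_sketch('<a href="test.htm" title="click me">')
p tokens
```

Quoted values are matched before bare words, so a value that contains spaces stays in one token.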


This function is called by html_source to get all the attributes of the tag. It calls tokenize_tagline to separate attribute names and values, then cleans the outer HTML by placing quotes around the attribute values. If only an attribute name is present, its value is set to the attribute name.
<input type="radio" name="WATiR" checked="checked">


This function escapes characters that are not valid as data in XML, for example &, <, > and ".
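
The two cleaning steps, quoting attribute values and escaping characters that are not valid XML data, can be sketched together. The names xml_escape_sketch and quote_attributes_sketch are hypothetical, and the input is assumed to be a token array of the kind produced by tokenize_tagline.

```ruby
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Replace characters that are not valid as plain data in XML.
def xml_escape_sketch(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Rebuild a tag from tokens so that every attribute has a quoted, escaped value.
def quote_attributes_sketch(tokens)
  tag, *rest = tokens
  attributes = []
  until rest.empty?
    name = rest.shift
    if rest.first == '='
      rest.shift                      # discard the "=" token
      value = rest.shift.to_s
    else
      value = name                    # bare attribute: the value mirrors the name
    end
    attributes << %(#{name}="#{xml_escape_sketch(value)}")
  end
  "<#{[tag, *attributes].join(' ')}>"
end

cleaned = quote_attributes_sketch(%w[input type = radio name = WATiR checked])
escaped = xml_escape_sketch('a < b & "c"')
puts cleaned
puts escaped
```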

More information on REXML

The XPath that is returned by REXML is of the format:
The XPath string returned by REXML provides information about the location of the element from the root element for which XPath query is given.
For example:
If the string above is returned as the result, the element resides inside html (the root element), and each step names a tag nested inside the tag from the previous step: the body tag inside html, the table tag inside body, the tbody tag inside table, the second tr tag inside tbody, the second td tag inside that row, the div tag inside td, the table tag inside div, the tbody tag inside table, the third tr tag inside tbody, the first td tag inside tr, and the second a tag inside td.

Future Enhancements

Tweak the REXML code to get the position of an element with respect to its parent. This is possible provided TIDY doesn't change the ordering of the tags. It would avoid the series of COM calls required for iterating over children, since we could jump directly to the intended child by index.

More information

XPath and WATiR article at Angrez's blog.
