Russ Thomas – SQL Judo

The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded

XQuery for the SQL Server DBA: XPath

“I don’t always query XML, but when I do; I drink dos equis” – Dan Sorensen.

This is the first tutorial in a complete series on XPath/XQuery.  The master post for this series can be found here.

Whether you like it or not, XML is everywhere in SQL Server. It isn’t a matter of if you’ll have to deal with XML as a DBA but when.  Take a quick look at the sys.dm_os_ring_buffers.  This little gem stores a wealth of knowledge about what’s going on with your server’s memory.  If you run a select statement against that dmv however you will find that all the really helpful information is stored within an XML document.  XML also pops up in query plans, deadlock graphs, extended events, and a growing number of other areas within SQL Server critical to DBAs.

Grant Fritchey wrote an excellent article last year on navigating the XML message provided by the ring buffer dmv.  After you complete my walk-through I challenge you to use XPath to modify, extend, and build upon what he wrote.  The key to learning something is using it in practice.

labyrinth

Navigating XML with XPath

Even though some people mistakenly use the two terms interchangeably XPath is actually a subset of XQuery.  XQuery implements XPath; but each remain their own standard. Further, Microsoft does not implement 100% of either standard in SQL Server. The XPath and XQuery that you’ll find in SQL Server is a partial set of features defined in the XQuery 1.0 and XPath 2.0 W3C standards.  (see link for full MS support document)

XPath is used for navigation of an XML document and can return a sequence of values contained within the element(s) identified by the path.  XQuery can leverage XPath but also provides much more functionality of its own such as looping and re-ordering.  It’s important to know the difference because while not as powerful, XPath alone is much more performant.  I’ll tackle the larger topic of XQuery in my next post.

I’m a fantasy baseball geek so I wrote a simple baseball related XML document to use as an example.  Thorough explanations abound on the web if you need a refresher on what makes a document valid XML.

DECLARE @x XML
SET @x = '<fantasy>
  <player Position="3B">
    <name>Evan Longoria</name>
    <team>Rays</team>
    <slg>512</slg>
  </player>
  <player Position="OF">
    <name>Carolos Gonzales</name>
    <team>Rockies</team>
    <slg>530</slg>
  </player>
  <player Position="C">
    <name>Buster Posey</name>
    <team>Giants</team>
    <slg>486</slg>
  </player>
</fantasy>';

SELECT @x.query('fantasy/player/team')

@x is a variable of XML data type that has been set to contain my sample document.  The select statement is using the query method of @x which is one of the various methods available to the XML data type within TSQL.  The string ‘fantasy/player/team’ being passed to the query method is a simple XPath statement.  Run the query and you get:

<team>Rays</team><team>Rockies</team><team>Giants</team>

Steps and Node Tests

It’s usually easiest to read an XPath statement right to left. Stepping through the statement above we are saying:

  • give me all the team elements
  • that are children of player elements
  • that are children of fantasy root

It reads much like a folder system or a URI.  If you do read it right to left just don’t forget that it must ultimately find it’s way back to the root node ( or at least a level one node if the xml doc isn’t well formed ).

In XPath lingo each section of a path statement divided by a forward slash is called a step.  To dive further into XPath lingo, each of the steps above complete a node test.

Hint:  specifying a leading slash is optional, the query will work either way.  A trailing slash however will produce an error – it causes the query to think a step is missing.

SELECT @x.query('fantasy/player/team') -- works
SELECT @x.query('/fantasy/player/team') -- works
SELECT @x.query('/fantasy/player/team/') -- error

Msg 9341, Level 16, State 1, Line 20
XQuery [query()]: Syntax error near ‘<eof>’, expected a step expression.


It’s important to understand that XPath navigates the entire document.  It will always return every element that matches the path provided.  It will also always return them in the same order they are found in the document.  This is something that throws relational folks off.  Unlike relational data, ordinal position of elements matters in XML.

FYI: Attributes are a different story.  In the pure XML realm the order of attributes is maintained.  In the SQL Server world, when querying XML data types, SQL doesn’t guarantee order of attributes.

Predicates

Let’s say you’re only interested in the one team that Buster Posey plays for.  We need to filter out those elements that we are not interested in via a predicate.  This predicate is placed after the path element to be filtered in brackets [ … ].  This predicate can be an ordinal position, an equality/inequality comparison, or a number of other options.

Since we know order matters to XML, let’s look at ordinal position first.  If you know that Posey is contained in the third player element in the document; it’s completely valid to pass an XPath statement like this:

SELECT @x.query('/fantasy/player[3]/team')
  • give me all the team elements
  • that are children of player elements
  • [ that are the third player element down ]
  • that are children of fantasy root

There is only one team element that meets this path:

<team>Giants</team>

Another predicate you could use to get Buster Posey’s team is to filter the player element by its attribute value.  SQL folks get thrown off by the @ sign.  In TSQL this is a variable designator, but in XPath it’s shorthand for an attribute.

SELECT @x.query('/fantasy/player[@Position="C"]/team')

The result of this query is the same as above.  <team>Giants</team>.  However we arrived at this result by taking a different path.

  • give me all the team elements
  • that are children of player elements
  • [that have attribute Position equal to “C”]
  • that are children of fantasy root

This is a good time to point out that XPath is case-sensitive.  Position and position are two different words.  One does not equal the other.  Anytime you get unexpected results from XPath, check case.

text()

Now, let’s say you don’t want the team element.  You want the text contained inside of the team element.  You can do that with the text() function provided as part of the XPath standard.  Let’s try that previous query again, this time with the text() function.

SELECT @x.query('/fantasy/player[@Position="C"]/team/text()')
  • give me the text
  • contained within team elements
  • that are children of player elements
  • [ that have Position attribute equal to “C”]
  • that are children of fantasy root

The result now is just:   Giants

Data Type and the value() Method

When you run the above query in SSMS you should notice that it is underlined in blue just like the xml snippets returned by the previous queries.  Even though the return result is a single string with no markup, it is still of type XML.

Query Result

If you want the result returned as another value you could cast it, but SQL Server provides a better way.  The value method of the XML data type.  Unlike the query method that always returns a sequence, to use the value method, you must construct your XPath in a way to satisfy SQL Server that the data returned is singleton.  The value method also requires a second parameter that allows you to specify the data type of the value to be returned.

SELECT @x.value(
     '(/fantasy/player[@Position="C"]/team/text())[1]'
     ,'varchar(50)') AS TeamName

To satisfy the singleton requirement I enclose the xpath statement in ( ) and then add a predicate that says I only want ordinal position [1].  This is roughly the equivalent of using a TOP 1 designator in a Select Statement.

attribute value

It’s also important to note that the value method is the only way to obtain the value of an attribute.  To obtain the position of the 1st player for example:

-- this code works
SELECT @x.value('(fantasy/player/@Position)[1]','varchar(5)') 
-- this code produces an error
SELECT @x.query('fantasy/player[1]/@Position')

Msg 2396, Level 16, State 1, Line 27
XQuery [query()]: Attribute may not appear outside of an element


The AXIS ( of evil ?? )

Hopefully up until now this has all been pretty straight-forward.  Unfortunately this is where things get tricky; also where most people give up on XPath.  Let’s say you want the team that Posey plays for but you do not know the ordinal position of the player element nor do you know any unique player attribute values as in the previous examples.

You should understand XPath enough at this point to be able to navigate to the player name element using a predicate of text()=”Buster Posey”.  The question is; how do you get from there to the sibling team element under the same player element?  XPath provides an option to navigate a path and then set a reference point known as an axis, and then navigate elsewhere using that point as a reference.

SELECT @x.query('
/fantasy/player/name[text()="Buster Posey"]/../team/text()
')

The query above uses the axis .. which is shorthand for the parent axis.  XPath designates a variety of axis options including references for sibling, ancestor, descendant etc.  Check the XPath standards document for all axis designators supported by Microsoft in SQL Server.

Stepping through from right to left

  • give me the text value
  • that is contained within team elements
  • that are children of any element
  • that is the parent of a name element <– AXIS
  • [that has text equal to “Buster Posey”]
  • that are children of the player elements
  • that are children of the fantasy root

The strength of XML lies in its support of semi-structured data, flexibility, and cross platform support, not its ease of navigation for queries.  I doubt this post has made you a fan of XPath.  I do hope that I’ve at least cleared up the concept a little bit.  We’ll build from here in the next post.  You can also return to the master post of this series.

3 comments on “XQuery for the SQL Server DBA: XPath

  1. NoName
    December 9, 2013

    Great post on the basics of XPath, I’m looking forward to the next post to build on this.

  2. NoName
    June 6, 2014

    This was very helpful! Great job explaining this. My summer-internship has had me work with a lot of XML – I do not have experience with XML so I’m having to learn it as I go. Your article has helped me connect the dots. Thanks! Good job mixing humor in with this tricky/frustrating topic!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Information

This entry was posted on December 2, 2013 by in XPath XQuery XML.
%d bloggers like this: