Sunday, October 7, 2007

Query XML using XLINQ

LINQ to XML is a built-in LINQ data provider that is implemented within the System.Xml.Linq namespace in .NET 3.5.
It enables us to do the following with XML data:

  • Read.
  • Construct.
  • Write.

We can perform LINQ queries over XML from the file-system, from a remote HTTP URL or web-service, or from any in-memory XML content.
LINQ to XML provides much richer (and easier) querying and data-shaping support than the low-level XmlReader/XmlWriter API in .NET 2.0. It is also designed to use less memory than the DOM API that XmlDocument provides, in part because it does not require you to always have a document object in order to work with XML. You can work directly with nodes, and modify them as content of a document, without having to start from a root XmlDocument object. This is a very powerful and flexible feature that you can use to compose larger trees and XML documents from tree fragments.
Now that you have an overview of XLinq's capabilities, let's look at its main classes before getting to the query capabilities.
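
To make the "tree fragments" idea concrete, here is a minimal sketch. The class name FragmentDemo, the element names and the example.com links are all made up for illustration; only the XElement constructor and Elements() come from the LINQ to XML API.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class FragmentDemo
{
    // Builds two stand-alone <item> fragments and composes them into a larger tree.
    // All names and values below are hypothetical sample data.
    public static XElement BuildChannel()
    {
        // No XDocument is needed: each XElement exists on its own.
        XElement first = new XElement("item",
            new XElement("title", "First post"),
            new XElement("link", "http://example.com/1"));

        XElement second = new XElement("item",
            new XElement("title", "Second post"),
            new XElement("link", "http://example.com/2"));

        // Compose the fragments into a parent element.
        return new XElement("channel",
            new XElement("title", "Sample feed"),
            first,
            second);
    }

    public static void Main()
    {
        XElement channel = BuildChannel();
        Console.WriteLine(channel);                           // the composed XML tree
        Console.WriteLine(channel.Elements("item").Count());  // 2
    }
}
```

Notice that at no point did we create a document object; the fragments are ordinary objects that can be passed around and nested.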

[Figure 01: the XLinq class hierarchy] (The figure and the class explanations are taken from XLINQ overview.doc)

Of the classes shown in this figure, the XNode and XContainer classes are abstract. The XNode class is the base class for element nodes. It provides a Parent property and methods such as AddBeforeSelf, AddAfterSelf, and Remove (the first two are named AddBeforeThis and AddAfterThis in the overview document) for updates in the imperative style. For I/O, it provides methods for reading (ReadFrom) and writing (WriteTo).
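
Here is a small sketch of those XNode members in action. Note that in the System.Xml.Linq API the sibling methods are called AddBeforeSelf and AddAfterSelf, and Parent is a property. The class name NodeUpdateDemo, its helper methods and the sample XML are my own, for illustration only.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class NodeUpdateDemo
{
    // Imperative updates relative to an existing node.
    public static XElement Reorder()
    {
        XElement list = XElement.Parse("<list><b/><d/></list>");
        XElement b = list.Element("b");

        b.AddBeforeSelf(new XElement("a")); // insert a sibling before <b>
        b.AddAfterSelf(new XElement("c"));  // insert a sibling after <b>
        list.Element("d").Remove();         // detach <d> from its parent

        return list; // children are now a, b, c
    }

    // Reads one node from an XmlReader; the reader must be positioned on a node first.
    public static XNode ReadNode(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            return XNode.ReadFrom(reader);
        }
    }

    // Writes a node through an XmlWriter and returns the resulting markup.
    public static string WriteNode(XNode node)
    {
        StringBuilder output = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            node.WriteTo(writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        XElement list = Reorder();
        Console.WriteLine(list.Element("b").Parent.Name);
        Console.WriteLine(WriteNode(ReadNode("<note>hi</note>")));
    }
}
```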
Although the XElement class is bottom-most in the class hierarchy, it is the fundamental class. As the name suggests, it represents an XML element and allows you to perform the following operations:

  • Create elements with a specified element name
  • Change the element's contents
  • Add, change, or delete child elements
  • Add attributes to the element
  • Save the element as an XML fragment
  • Extract the contents in text form

This post introduces how to query XML, and it is the first in a series of posts about XLINQ.

Let's get a sense of how LINQ to XML works!

Query xml from URL

   1:  public static void GetRssFeedFromURL()
   2:  {
   3:      string url = "http://feeds.feedburner.com/MaorDavid?format=xml";
   4:   
   5:      // load the RSS feed into an XElement (the root <rss> element)
   6:      XElement feeds = XElement.Load(url);
   7:   
   8:      // requires: using System; using System.Linq; using System.Xml.Linq;
   9:      if (feeds.Element("channel") != null)
  10:      {
  11:          var query = from f in feeds.Element("channel").Elements("item").Take(10)
  12:                      select new { Title = (string)f.Element("title"), Link = (string)f.Element("link") }; // the cast yields null if the element is missing
  13:   
  14:          foreach (var feed in query)
  15:          {
  16:              Console.WriteLine("Feed title: {0}", feed.Title);
  17:              Console.WriteLine("Link: {0}", feed.Link);
  18:          }
  19:      }
  20:  }

XLinq builds on the LINQ query foundation: you can use the standard LINQ query facilities to query XLinq objects such as XElement and XDocument.

We start by loading the XML into memory using the Load() method of the XElement class (Line 6).


After loading the XML, the next step is to retrieve the first 10 items (Line 11). Now you can query and iterate over the results as described in my previous posts (Var keyword, Getting started with Linq).

Very simple! This example loads XML from a URL. What if you want to query XML from the file system? Nothing changes besides the URI parameter passed to XElement.Load (Line 6).


   1:  public static void GetRssFeedFromFile()
   2:  {
   3:      string path = @"C:\Work\Projects\Samples\VS2008Samples\LinqToXML\Maor Davids Blog.xml";
   4:   
   5:      // load the RSS feed into an XElement (the root <rss> element)
   6:      XElement feeds = XElement.Load(path);
   7:   
   8:      // requires: using System; using System.Linq; using System.Xml.Linq;
   9:      if (feeds.Element("channel") != null)
  10:      {
  11:          var query = from f in feeds.Element("channel").Elements("item").Take(10)
  12:                      select new { Title = (string)f.Element("title"), Link = (string)f.Element("link") }; // the cast yields null if the element is missing
  13:   
  14:          foreach (var feed in query)
  15:          {
  16:              Console.WriteLine("Feed title: {0}", feed.Title);
  17:              Console.WriteLine("Link: {0}", feed.Link);
  18:          }
  19:      }
  20:  }

 

As you can see, all the code snippets presented above are fairly simple. Once the XML is loaded into a LINQ to XML tree, you can write queries over that tree. The query syntax is easier than XPath or XQuery.
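
The same kind of query also works over in-memory XML content. The following is only a sketch: the InMemoryQueryDemo class, the TitlesContaining helper and the tiny RSS string are invented for this example. It uses XElement.Parse instead of Load, and adds a where and an orderby clause to the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class InMemoryQueryDemo
{
    // A tiny hand-written RSS document (sample data, not a real feed).
    private const string Rss =
        "<rss version='2.0'><channel><title>Sample</title>" +
        "<item><title>LINQ to XML basics</title><link>http://example.com/xml</link></item>" +
        "<item><title>Unrelated post</title><link>http://example.com/other</link></item>" +
        "<item><title>LINQ to SQL intro</title><link>http://example.com/sql</link></item>" +
        "</channel></rss>";

    // Returns the titles that contain the keyword, sorted alphabetically.
    public static List<string> TitlesContaining(string keyword)
    {
        // Parse builds the tree from a string instead of a file or URL.
        XElement feeds = XElement.Parse(Rss);

        var query = from item in feeds.Descendants("item")
                    let title = (string)item.Element("title") // null if <title> is missing
                    where title != null && title.Contains(keyword)
                    orderby title
                    select title;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string title in TitlesContaining("LINQ"))
        {
            Console.WriteLine("Feed title: {0}", title);
        }
    }
}
```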

Enjoy!!
