1   Introduction

In this document we provide examples of the use of SweetXml in Elixir to access data items in an XML document.

SweetXml is built on top of the Erlang Xmerl library. It returns data structures (lists and tuples) that map onto the records defined in the Erlang Xmerl include files. So we will also see how to make those Xmerl record definitions available in Elixir and how to use them to access nested data items by name.

2   Sample code

The sample Elixir code files are here:

  • The mix config file: mix.exs -- Notice that, because we are building an escript, we need to declare the module for our command line version.
  • The main code file -- It implements the functions that make the queries and do the work: lib/test16.ex
  • The Elixir file that implements the version that can be run from the command line: lib/test16cli.ex
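
As a rough guide to what the mix config file needs, here is a minimal sketch of a mix.exs for an escript project that depends on SweetXml. It is not the author's actual file: the module name Test16.CLI (assumed to live in lib/test16cli.ex and to export main/1), the version strings, and the extra_applications entry are all assumptions made for illustration.

```elixir
# mix.exs -- a hypothetical sketch, not the author's actual file
defmodule Test16.MixProject do
  use Mix.Project

  def project do
    [
      app: :test16,
      version: "0.1.0",
      deps: deps(),
      # An escript needs to know which module exports main/1.
      # Test16.CLI is an assumed name for the module in lib/test16cli.ex.
      escript: [main_module: Test16.CLI]
    ]
  end

  def application do
    # Xmerl ships with Erlang/OTP; listing it explicitly is an assumption
    # that should be harmless, since SweetXml depends on it anyway.
    [extra_applications: [:logger, :xmerl]]
  end

  defp deps do
    # The version requirement is an assumption; check hex.pm for the current one.
    [{:sweet_xml, "~> 0.7"}]
  end
end
```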

3   Creating, building, and using a project

I used mix to create a project named test16 as follows:

$ mix new test16

Build the project with the following:

$ cd test16
$ mix deps.get
$ mix deps.compile
$ mix compile
$ mix escript.build

Start the interactive Elixir interpreter and make our project available:

$ iex -S mix

Each time we edit and change a file, we can recompile at the interactive prompt with the following:

iex(47)> IEx.Helpers.recompile

4   Using the code from the sample project

You will need to create a mix project, then modify several of its files to agree with the sample files above. In particular, modify mix.exs and lib/my_project.ex (where my_project is the name of your project).

My project was called "test16".

Then, do the following, for example:

Create multiple processes and use a remote call to request a service from a process:

$ cd test16
$ iex -S mix
tree1 = File.stream!("Data/test02.xml") |> SweetXml.parse
{:ok, [p1, p2, p3]} = Test16.start(tree1, 3)
Test16.rpc(p2, :show_tags, "ddd")
Test16.rpc(p2, :show_namespace_defs, "ns1:ccc")
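
The implementation of Test16.start and Test16.rpc is in the sample file and is not reproduced here. As a minimal sketch of the general pattern (spawn several workers that hold the same data, then send one of them a request and wait for the reply), something like the following would work. The module name RpcSketch, the :count request, and the 5 second timeout are all hypothetical, and a plain list stands in for the parsed XML tree.

```elixir
defmodule RpcSketch do
  # Hypothetical stand-in for Test16.start/2 and Test16.rpc/3.

  # Spawn `count` worker processes, each holding the same `state`
  # (in the sample project this would be the parsed XML tree).
  def start(state, count) do
    pids = for _ <- 1..count, do: spawn(fn -> loop(state) end)
    {:ok, pids}
  end

  # Send a request to one worker and block until it replies or we time out.
  def rpc(pid, request, arg) do
    ref = make_ref()
    send(pid, {self(), ref, request, arg})

    receive do
      {^ref, reply} -> reply
    after
      5_000 -> {:error, :timeout}
    end
  end

  # Worker loop: answer one request, then wait for the next.
  defp loop(state) do
    receive do
      {from, ref, request, arg} ->
        send(from, {ref, handle(request, arg, state)})
        loop(state)
    end
  end

  # Placeholder handlers; real ones would run XPath queries against `state`.
  defp handle(:count, key, state), do: Enum.count(state, &(&1 == key))
  defp handle(_other, _arg, _state), do: {:error, :unknown_request}
end

{:ok, [_p1, p2, _p3]} = RpcSketch.start(["ddd", "ccc", "ddd"], 3)
IO.inspect(RpcSketch.rpc(p2, :count, "ddd"))
```

The unique reference created with make_ref() lets the caller pick out the reply to this particular request, even if other messages are waiting in its mailbox.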

Alternatively, you can call any of the helper functions directly:

# if you already have parsed an XML document into a tuple/tree
iex(11)> Test16.show_text(tree1, "ddd")
# if you want the tree created
iex(12)> Test16.show_text("Data/test02.xml", "ddd")

5   Using Erlang record definitions

We are going to extract record definitions from the Xmerl include files. In order to do so, we need to know how to reference those files. You can find information about file inclusion here: http://erlang.org/doc/reference_manual/macros.html#file-inclusion

Here are several examples of the declaration of records that are defined in Erlang. (see: lib/test16.ex):

defmodule XmerlRecs do
  @moduledoc """
  Define Xmerl records using record definitions extracted from Erlang Xmerl.
  """

  require Record
  Record.defrecord(:xmlElement, Record.extract(:xmlElement,
    from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlText, Record.extract(:xmlText,
    from_lib: "xmerl/include/xmerl.hrl"))
  # ...

end

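Given the above definition of the xmlElement record, we can access fields within an element as follows. Record.defrecord generates macros, so a module other than XmerlRecs must "require XmerlRecs" before calling them: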

# extract the tag (name) from the element record.
tag = XmerlRecs.xmlElement(element, :name)
# extract the attributes from the element record.
attributes = XmerlRecs.xmlElement(element, :attributes)
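
To see the record accessors working end to end without SweetXml, here is a small self-contained sketch that parses a string with Xmerl directly and pulls out the tag, attributes, and text. The module name RecordSketch and the function describe/1 are hypothetical; the record definitions are repeated so that the block stands alone. It also shows that the record macros can be used in patterns, which avoids calling the accessor once per field.

```elixir
defmodule RecordSketch do
  require Record

  # Same kind of extraction as in XmerlRecs, repeated so this block stands alone.
  Record.defrecord(:xmlElement,
    Record.extract(:xmlElement, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlAttribute,
    Record.extract(:xmlAttribute, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecord(:xmlText,
    Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  # Return {tag, attributes, text} for an element record.
  def describe(element) do
    # Match several fields at once instead of one accessor call per field.
    xmlElement(name: tag, attributes: attrs, content: content) = element

    # Xmerl stores values as charlists; convert them to Elixir strings.
    attr_pairs =
      for xmlAttribute(name: name, value: value) <- attrs do
        {name, to_string(value)}
      end

    # Keep only the direct text children; child elements do not match
    # the pattern and are skipped by the comprehension.
    text =
      for xmlText(value: value) <- content, into: "" do
        to_string(value)
      end

    {tag, attr_pairs, text}
  end
end

# :xmerl_scan.string/1 expects a charlist and returns {root_element, rest}.
{root, _rest} = :xmerl_scan.string(~c(<ddd id="7">hello</ddd>))
IO.inspect(RecordSketch.describe(root))
```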

6   Using SweetXml

In this sample application I use SweetXml to run XPath queries against an XML document.

You can learn about SweetXml here: https://hexdocs.pm/sweet_xml/SweetXml.html

You can find information on XPath here: http://www.w3.org/TR/xpath-30/. In particular, look for the description of the query language, here:

Here is a snippet of Elixir code that extracts the elements whose tag (name) matches a pattern:

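# The ~x sigil comes from SweetXml; the record accessors are macros.
import SweetXml
require XmerlRecs

File.stream!(file_path)
|> SweetXml.xpath(~x"//#{pattern}"l)
|> Enum.each(fn (el) ->
  IO.puts("tag: #{XmerlRecs.xmlElement(el, :name)}")
end)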

Notes:

  • SweetXml uses an "~x" sigil to specify patterns.
  • Also notice the letter "l" at the end of the query. That instructs SweetXml to return a list rather than a single element.
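
The effect of the modifiers is easiest to see side by side. The following is a minimal sketch written as a standalone script so that it can be tried outside the project; it assumes Elixir 1.12 or later (for Mix.install), network access, and a SweetXml version requirement of "~> 0.7". The inline XML document is made up for the example.

```elixir
# Standalone script; Mix.install/1 cannot be used inside a Mix project.
# The version requirement is an assumption.
Mix.install([{:sweet_xml, "~> 0.7"}])

import SweetXml

doc = """
<root>
  <ddd>first</ddd>
  <ddd>second</ddd>
</root>
"""

# No modifier: only the first matching node is returned (an xmlElement record).
first = xpath(doc, ~x"//ddd")

# "l": every matching node, as a list.
all = xpath(doc, ~x"//ddd"l)

# "s" casts each result to an Elixir string; combined with "l" it
# produces a list of strings.
texts = xpath(doc, ~x"//ddd/text()"sl)

IO.inspect(elem(first, 0))
IO.inspect(length(all))
IO.inspect(texts)
```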

Notice that in our sample code I used SweetXml.xpath to both parse the XML document and perform a query in a single step. That is not always optimal. In particular, when we want to run multiple queries against the same document, we can parse it once and then reuse the parsed tree in each query. Here is an example (done inside IEx):

iex(60)> import SweetXml
iex(61)> tree = File.stream!("Data/people.xsd") |> SweetXml.parse()
iex(62)> SweetXml.xpath(tree, ~x"//xs:simpleType")
iex(63)> SweetXml.xpath(tree, ~x"//xs:complexType")
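
The same parse-once approach can be sketched with an inline document so that the results are easy to inspect. This is a standalone script under the same assumptions as before: Elixir 1.12 or later, network access, and a SweetXml version requirement of "~> 0.7". The XML content is invented for the example. It also uses SweetXml.xmap, which runs several named queries against one tree and returns a map.

```elixir
# Standalone script; the version requirement is an assumption.
Mix.install([{:sweet_xml, "~> 0.7"}])

import SweetXml

xml = """
<people>
  <person id="1"><name>Ann</name></person>
  <person id="2"><name>Bob</name></person>
</people>
"""

# Parse once; the result is an Xmerl xmlElement record (a tuple).
tree = parse(xml)

# Reuse the parsed tree for as many queries as needed.
names = xpath(tree, ~x"//person/name/text()"sl)
ids = xpath(tree, ~x"//person/@id"sl)

# xmap/2 runs several named queries against the same tree in one call.
summary = xmap(tree, names: ~x"//person/name/text()"sl, ids: ~x"//person/@id"sl)

IO.inspect(names)
IO.inspect(ids)
IO.inspect(summary)
```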
