1   Introduction

I've been interested in structured (and semi-structured) data for a long time. Both Erlang and Elixir provide a variety of ways for representing structured data. This document summarizes them and provides a brief reference for using them.

2   Erlang data structures

2.1   Property lists

A property list is a list of 2-tuples. The equivalent data structure in Elixir is called a keyword or keyword list.

Examples:

1> Fruits = [{apr, "Apricot"}, {apl, "Apple"}, {grp, "Grape"}].
[{apr,"Apricot"},{apl,"Apple"},{grp,"Grape"}]
2>
2> l(keyword).
{error,nofile}
4> proplists:get_value(grp, Fruits).
"Grape"
5> proplists:get_value(apl, Fruits).
"Apple"
6> proplists:get_value(aplx, Fruits).
undefined
7> proplists:lookup(apl, Fruits).
{apl,"Apple"}
8> proplists:get_value(aplx, Fruits, not_found).
not_found

Notes:

  • lookup/2 returns the tuple, whereas getvalue/2 returns only the value associated with a property.
  • The proplists module ignores items that are not 2-tuples.

2.2   Records

Erlang records of a given type have a fixed number of fields with fixed field names.

Define a record with the -record module attribute. Examples:

-record(complexType, {
          name=nil,
          childDefs=[]
         }).
-record(element, {
          name=nil,
          type=nil
         }).
-record(simpleType, {
          name=nil,
          base=nil,
          constraints=nil
         }).

And, from .kerl/20.0/lib/xmerl-1.3.15/include/xmerl.hrl:

-record(xmlElement,{
          name,         % atom()
          expanded_name = [],   % string() | {URI,Local} | {"xmlns",Local}
          nsinfo = [],          % {Prefix, Local} | []
          namespace=#xmlNamespace{},
          parents = [],     % [{atom(),integer()}]
          pos,          % integer()
          attributes = [],  % [#xmlAttribute()]
          content = [],
          language = "",    % string()
          xmlbase="",           % string() XML Base path, for relative URI:s
          elementdef=undeclared % atom(), one of [undeclared | prolog | external | element]
         }).

Notes:

  • The default values for each field are optional.

Create an instance of a record -- Example:

Rec = #simpleType{name=Name, base=Base, constraints=Constraints},

Access a field in a record using dot notation -- Example:

Namespace = ComplexType#xmlElement.namespace,
Nodes = Namespace#xmlNamespace.nodes,

Or, more compactly:

Nodes = ComplexType#xmlElement.namespace#xmlNamespace.nodes,

Access fields using pattern matching -- Example:

#xmlElement{namespace=NSpace} = Root,

Modify the value of a field in a record -- Example:

NewRoot = Root#xmlElement{name="newTag"}

2.3   Maps

Maps provide a key-value lookup data structure in Erlang. The maps module provides helper functions for maps.

Create a map -- Example:

33> Veggies =#{cucumber => "green and crunchy", tomato => "red and juicy"}.
#{cucumber => "green and crunchy",tomato => "red and juicy"}

Get (retrieve) a value from a map -- Example:

34> maps:get(tomato, Veggies).
"red and juicy"

Update a map -- Example:

37> Veggies1 = maps:update(cucumber, "good in salads", Veggies).
#{cucumber => "good in salads",tomato => "red and juicy"}

Size -- Determine the number of pairs in a map -- Example:

41> map_size(Veggies1).
2
42> maps:size(Veggies1).
2

Process each of the items in a map -- Example:

43> Fun1 = fun (K, V, Acc) -> [{K, V * 2} | Acc] end.
#Fun<erl_eval.18.99386804>
45> M1 = #{aa => 11, bb => 22, cc => 33}.
#{aa => 11,bb => 22,cc => 33}
46> maps:fold(Fun1, [], M1).
[{cc,66},{bb,44},{aa,22}]

Convert a map to a list of 2-tuples -- Example:

47> maps:to_list(M1).
[{aa,11},{bb,22},{cc,33}]

Determine whether a key exists in a map -- Example:

48> maps:is_key(bb, M1).
true
49> maps:is_key(xx, M1).
false

Get all the keys or the values from a map -- Examples:

50> maps:keys(M1).
[aa,bb,cc]
51> maps:values(M1).
[11,22,33]

Filter (select) items based on a predicate (function) and create a new map -- Example:

52> Fun2 = fun (K, V) -> if V =< 22 -> true; true -> false end end.
#Fun<erl_eval.12.99386804>
53> maps:filter(Fun2, M1).
#{aa => 11,bb => 22}

Filter (select or reject) items based on a list of keys -- Examples:

54> maps:with([aa, cc], M1).
#{aa => 11,cc => 33}
55> maps:without([aa, cc], M1).
#{bb => 22}

Erlang is a functional language and does copying when you update complex data structures. However, apparently, Erlang (the Erlang Beam, actually) is very intelligent in doing this copying, and attempts to optimize with respect to space and time or both. For small maps, this likely does not matter. If your Erlang work involves large maps, you may want to read this thread on the "erlang-questions" Email list: http://erlang.org/pipermail/erlang-questions/2017-August/093155.html. And, another option for large maps (suggested in the above thread) is to use ETS. And, for persistence, use DETS.

3   Elixir data structures

Reminders:

  • Erlang atoms begin with a lower case letter or are enclosed in single quotes. Erlang variables begin with an upper case letter.
  • Elixir atoms begin with a colon or an upper case letter. Elixir variables begin with a lower case letter.

Additional help -- There is a very helpful set of Elixir examples, including ones that use Elixir data structures, here: https://elixir-examples.github.io/.

3.1   Tuples

Elixir tuples are quite like Erlang tuples.

Get the size and access individual elements -- Examples:

iex(2)> a = {11, 22, 33, 44}
{11, 22, 33, 44}
iex(3)> tuple_size(a)
4
iex(4)> elem(a, 0)
11
iex(5)> elem(a, 1)
22

There are helper functions in the Tuple module. See: https://hexdocs.pm/elixir/Tuple.html#content.

3.2   Keyword lists

Elixir keyword lists are lists of 2-tuples, where the first item of each 2-tuple is used as a key. Keyword lists are (roughly) the equivalent of Erlang maps. The Elixir Keyword module provides helper functions.

Note that the "keys" must be atoms.

You can create 2-tuple lists explicitly, or use Elixir's special syntax -- Examples:

iex(2)> veggies = [{:cucumber, "crunchy"}, {:tomato, "juicy"}]
[cucumber: "crunchy", tomato: "juicy"]
iex(3)> fruits = [peach: "suculent", watermelon: "yummy"]
[peach: "suculent", watermelon: "yummy"]

Retrieve values and keys from a keyword list -- Examples:

iex(6)> Keyword.fetch(fruits, :watermelon)
{:ok, "yummy"}
iex(7)> Keyword.fetch(fruits, :watermelonxxx)
:error
iex(8)> Keyword.get(fruits, :watermelon)
"yummy"
iex(9)> Keyword.get(fruits, :watermelonxxx)
nil
iex(10)> Keyword.keys(fruits)
[:peach, :watermelon]
iex(11)> Keyword.values(fruits)
["suculent", "yummy"]

For helper functions and more information, see:

3.3   Records

Elixir has no direct equivalent of Erlang records. (1) If you want something reasonably similar to Erlang records in Elixir, use structures. (2) If you have code written in Erlang and it contains Erlang records that you want to use from Elixir, then look at the Elixir Record module, and consider the following notes.

Remember that underneath, records are just tuples. So, you can, at the least, just work with them as you would Elixir tuples. In particular, you can use:

  • Pattern matching.
  • tuple_size/1 -- Get the number of items in a tuple.
  • elem/2 -- Get a specific item from a tuple by index.

Examples:

iex(20)> a = {:aa, :bb, :cc}
{:aa, :bb, :cc}
iex(21)> tuple_size(a)
3
iex(22)> elem(a, 2)
:cc

There is discussion here: https://stackoverflow.com/questions/28891758/elixir-how-to-convert-a-map-struct-to-a-record-struct

And, there is an example here: https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/file/stat.ex

3.4   Maps

Maps are the standard key-value store in Elixir.

Create a map with the "%{}" syntax, or with the helper function from the Map module. Examples:

iex(6)> a = %{}
%{}
iex(7)> b = %{:aa => 11, :bb => 22, :cc => 33}
%{aa: 11, bb: 22, cc: 33}
iex(10)> c = Map.new()
%{}
iex(12)> d = Map.new([{:aa, 11}, {:bb, 22}])
%{aa: 11, bb: 22}

If all the keys for a map are atoms, then you can use the following more convenient syntax for creating a map:

iex(25)> e = %{aa: 11, bb: 22, cc: 33}
%{aa: 11, bb: 22, cc: 33}

Access items in a map with the indexing syntax -- Examples:

iex(9)> b[:bb]
22

Or, with helper functions from Map -- Examples:

iex(14)> Map.fetch(b, :bb)
{:ok, 22}
iex(15)> Map.get(b, :bb)
22

And, if the keys are atoms, then you can use dot notation to access items in a map. Example:

iex(20)> b.bb
22

Determine whether a map contains a specified key -- Examples:

iex(20)> Map.has_key?(b, :cc)
true
iex(21)> Map.has_key?(b, :ccxxyy)
false

Get a list of keys or values from a map -- Examples:

iex(17)> Map.keys(b)
[:aa, :bb, :cc]
iex(18)> Map.values(b)
[11, 22, 33]

For helper functions and more information on maps, see:

3.5   Structures

According to the Elixir documentation: structs are extensions built on top of maps that provide compile-time checks and default values."

In particular: (1) structs are a data structure with named fields. (2) Restrictions are enforced on the field names used (which prevents errors).

Define a struct. Examples:

iex(40)> defmodule RGB do
...(40)>   defstruct red: 0, green: 0, blue: 0
...(40)> end

Notes:

  • The structure takes on the name of the module in which it is defined.

Create an instance of a struct. Examples:

iex(44)> a = %RGB{}
%RGB{blue: 0, green: 0, red: 0}
iex(45)> red = %RGB{red: 255}
%RGB{blue: 0, green: 0, red: 255}
iex(46)> green = %RGB{green: 255}
%RGB{blue: 0, green: 255, red: 0}
iex(47)> blue = %RGB{blue: 255}
%RGB{blue: 255, green: 0, red: 0}

Note that if we attempt to create an instance of a struct using a key that in not in that struct's definition, we get an error. Example:

iex(49)> stuff = %RGB{junk: 255}
** (KeyError) key :junk not found in: %RGB{blue: 0, green: 0, red: 0}

A struct is a map that has a special key :__struct__ (whose value is the name of the struct):

iex(48)> is_map(blue)
true
iex(49)> blue
%RGB{blue: 255, green: 0, red: 0}
iex(50)> blue.__struct__
RGB

We can access values in a struct with dot notation (as with maps). Examples:

iex(51)> blue.blue
255
iex(52)> blue.red
0

Update a struct -- Examples:

iex(57)> red_green = %RGB{red: 200, green: 200}
%RGB{blue: 0, green: 200, red: 200}
iex(58)> red_green_alt = %{red_green | red: 100, green: 100}
%RGB{blue: 0, green: 100, red: 100}
iex(59)> red_green
%RGB{blue: 0, green: 200, red: 200}
iex(60)> red_green_alt
%RGB{blue: 0, green: 100, red: 100}

We can also use pattern matching to access fields. Examples:

iex(65)> %RGB{red: red_amount} = red_green
%RGB{blue: 0, green: 200, red: 200}
iex(66)> red_amount
200

We can define a struct without giving default values to some fields, in which case the default value is nil. And, we can require that some fields are given a value when an instance is created. Examples:

iex(68)> defmodule Color do
...(68)>   @enforce_keys [:desc]
...(68)>   defstruct [:desc, :red, :green, :blue]
...(68)> end
 %Color{blue: nil, desc: nil, green: nil, red: nil}}
iex(707> color1 = %Color{}
** (ArgumentError) the following keys must also be given when building struct Color: [:desc]
iex(70)> color1 = %Color{desc: "color_no_1"}
%Color{blue: nil, desc: "color_no_1", green: nil, red: nil}
iex(71)> color2 = %Color{desc: "color_no_1", red: 0, green: 0, blue: 0}
%Color{blue: 0, desc: "color_no_1", green: 0, red: 0}

3.6   Enumerables and streams

For information on enumerables, streams, etc. see:

Functions in the Enum module are eager: they consume the items in a enumerable immediately. Functions in the Stream module are "lazy": they produce an object (a stream) that is capable of producing items when demanded. Example:

iex(11)> f1 = fn(x) -> x + 3 end
#Function<6.99386804/1 in :erl_eval.expr/5>
iex(12)> a = Stream.iterate(1, f1)
#Function<61.34375656/2 in Stream.unfold/2>
iex(13)> Enum.take(a, 10)
[1, 4, 7, 10, 13, 16, 19, 22, 25, 28]

Notes:

  • Stream.iterate/2 produces a stream.
  • Enum.take/2 creates a list that has the first n items from an enumerable.

A few suggestions:

  • You can "force" evaluation of a Stream by using Enum.each/2.

  • You can convert a Stream into a list with Enum.to_list/1, but be wary that the stream has a reasonable length.

  • You can "pipe" an enumerable from a function that produces it to a function that consumes it with the pipe operator "|>". The pipe operator takes an enumerable or a function that produces an enumerable on its left and a function that takes an enumerable as its first argument on its right. Examples:

    iex(22)> f1 = fn(x) -> x + 10 end
    #Function<6.99386804/1 in :erl_eval.expr/5>
    iex(23)> Stream.iterate(1, f1) |> Enum.take(8)
    [1, 11, 21, 31, 41, 51, 61, 71]
    
    iex(31)> b = Stream.iterate(10, f1)
    #Function<61.34375656/2 in Stream.unfold/2>
    iex(32)> b |> Enum.take(9)
    [10, 20, 30, 40, 50, 60, 70, 80, 90]
    
  • Look in the Enum, List, and Stream modules for more help with enumerables.

4   Advanced data structures

As in most programming languages, we can construct more complex data structures out of simpler one, for example, lists containing tuples containing maps containing ... Elixir offers another and more advanced way: DSL (domain specific languages). This sections explores and gives a simple, first level explanation of that facility.

Other things to explore W.R.T. data structures:

- Dave Kuhlman