String manipulation in Erlang. (Learning Erlang 9)

Published on Nov 23, 2012

Yesterday I was asked how you do string manipulation in Erlang, given that strings are represented internally as lists of integers, one per character.

I suspected that there should be such a thing as a string module, but I didn’t really know, so I said as much.

After coming back home I decided to open the Erlang docs and sure enough, there is a string module.

Let’s take a look.

Note of caution. These are my notes while learning Erlang. You are welcome to follow along and use them as a guide.

Some basic operations.

  1> Str = "this is a string".
  "this is a string"
  2> string:len(Str).
  16
  3> length(Str).
  16

We can easily get the length of a string using the len/1 function.

Notice that we can also use length on Str since it’s actually a list.
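Since a string really is a list of character codes, any list function works on it. Here is a minimal sketch of that idea; the module name str_as_list and its function names are made up for this example.

```erlang
-module(str_as_list).
%% string:len/1 belongs to the classic string API, which newer compilers
%% may flag as obsolete; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([codes/0, reversed/0, same_length/1]).

%% A double-quoted literal is shorthand for a list of character codes.
codes() ->
    "abc" =:= [$a, $b, $c] andalso "abc" =:= [97, 98, 99].

%% Functions from the lists module accept a string like any other list.
reversed() ->
    lists:reverse("this is a string").

%% length/1 counts list elements, which is what string:len/1 reports too.
same_length(Str) ->
    length(Str) =:= string:len(Str).
```

Calling str_as_list:reversed() gives "gnirts a si siht", and str_as_list:codes() returns true.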

We can also count the words in a string with words/1, which uses a space as the word separator, or with words/2, which takes the separator character as its second argument.

  1> string:words("this is a string").
  4
  2> string:words("this is a string", $a).
  2

Building strings.

We can create new strings using a number of functions.

join/2 takes a list of tokens and a separator string and “stitches” them together.

  1> string:join(["this", "is", "a", "string"], " ").
  "this is a string"

concat/2 joins two strings together.

  1> string:concat("this is a string", ", and now is longer.").
  "this is a string, and now is longer."

We can copy a string a given number of times with copies/2.

  1> string:concat("de",string:copies("do", 3)).
  "dedododo"

We can use chars/3 to build a string using the same character multiple times.

  1> string:concat("this is g", string:chars($r, 10, "eat")).
  "this is grrrrrrrrrreat"

There is another variation without the tail.

  1> string:chars($+, 20).
  "++++++++++++++++++++"
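As a small illustration of combining these building functions, the sketch below underlines a title with a row of "=" characters. The module heading_demo and the function underline/1 are invented for this example.

```erlang
-module(heading_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([underline/1]).

%% Build a rule with one "=" per character of the title, then join the
%% title and the rule with a newline between them.
underline(Title) ->
    Rule = string:chars($=, length(Title)),
    string:join([Title, Rule], "\n").
```

heading_demo:underline("Erlang") returns "Erlang\n======".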

You can use a number of functions to pad a string with either spaces or some character.

  1> Msg = "To be continued".
  "To be continued"
  2> string:left(Msg, string:len(Msg)+3, $.).
  "To be continued..."
  3> string:left(Msg, string:len(Msg)+3).
  "To be continued   "
  4> string:centre(Msg, string:len(Msg)+3, $.).
  ".To be continued.."
  5> string:centre(Msg, string:len(Msg)+4, $.).
  "..To be continued.."
  6> string:right(Msg, string:len(Msg)+3, $.).
  "...To be continued"
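The padding functions are handy for lining up columns. The following is only a sketch under my own assumptions: the module pad_demo, the function row/2 and the column widths (10 and 6) are made up for illustration.

```erlang
-module(pad_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([row/2]).

%% Format a name/amount pair as a fixed-width row:
%% the name is left-aligned in 10 columns and padded with dots,
%% the amount is right-aligned in 6 columns and padded with spaces.
row(Name, Amount) ->
    string:left(Name, 10, $.) ++ string:right(integer_to_list(Amount), 6).
```

pad_demo:row("apples", 42) returns "apples....    42". Keep in mind that left and right also truncate: a name longer than 10 characters is cut down to its first 10.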

Splitting and slicing.

We can easily split a string into a list of tokens using the tokens/2 function.

  1> Tokens = string:tokens("this is a string", " ").
  ["this","is","a","string"]

Notice that tokens/2 returns the list of tokens without the separator.

  1> string:tokens("this is a string", "i").
  ["th","s ","s a str","ng"]
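One detail worth knowing: the second argument of tokens/2 is a list of separator characters, not a single separator string. A minimal sketch (tokens_demo and fields/1 are names I made up):

```erlang
-module(tokens_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([fields/1]).

%% Split on commas, semicolons or spaces. Each character in the second
%% argument is a separator on its own, and runs of separators are
%% skipped, so no empty tokens are produced.
fields(Line) ->
    string:tokens(Line, ",; ").
```

tokens_demo:fields("a,b;;c  d") returns ["a","b","c","d"].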

You can get parts of a string using the familiar substr and sub_string functions.

  1> string:substr("abcdefghijklm", 5).
  "efghijklm"
  2> string:substr("abcdefghijklm", 5, 3).
  "efg"
  3> string:sub_string("abcdefghijklm", 5).
  "efghijklm"
  4> string:sub_string("abcdefghijklm", 5, 8).
  "efgh"

substr/2 and sub_string/2 are equivalent, but substr/3 takes a start position and a length, while sub_string/3 takes a start position and an end position. Positions count from 1.
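The relation between the two three-argument versions can be written down directly. This is just a sketch, with slice_demo and its function names invented for the example.

```erlang
-module(slice_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([by_length/0, by_stop/0, same/3]).

%% Start at position 5 and take 3 characters.
by_length() ->
    string:substr("abcdefghijklm", 5, 3).

%% Take positions 5 through 8, both included.
by_stop() ->
    string:sub_string("abcdefghijklm", 5, 8).

%% A Start..Stop range holds Stop - Start + 1 characters, so both calls
%% select the same slice.
same(Str, Start, Stop) ->
    string:sub_string(Str, Start, Stop) =:=
        string:substr(Str, Start, Stop - Start + 1).
```

by_length() returns "efg" and by_stop() returns "efgh".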

You can pick a word out of a string by its position with sub_word/2.

  1> string:sub_word("this is a string of characters", 4).
  "string"

Sometimes a string is not human language and the words are separated by some other character. In that case sub_word/3 takes the separator as its third argument.

  1> string:sub_word("this is a string of characters", 4, $i).
  "ng of characters"
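A more practical use of sub_word/3 is picking one field out of delimited text. Here is a minimal sketch, assuming a slash-separated path; field_demo and path_part/2 are hypothetical names.

```erlang
-module(field_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([path_part/2]).

%% Return the Nth component of a slash-separated path.
%% sub_word/3 returns an empty string when N is past the last word.
path_part(Path, N) ->
    string:sub_word(Path, N, $/).
```

field_demo:path_part("usr/local/bin", 2) returns "local", and asking for the fourth part returns an empty string.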

Finding and comparing strings.

String equality is easy: just call equal/2.

  1> string:equal("equality", "equality").
  true
  2> A = "equality".
  "equality"
  3> B = "equality".
  "equality"
  4> string:equal(A, B).
  true

In most cases you may want to normalize the strings before comparing them.

You can use to_lower or to_upper to normalize the capitalization of a string.

  1> A = "EQ".
  "EQ"
  2> B = "Eq".
  "Eq"
  3> string:equal(A, B).
  false
  4> string:equal(string:to_lower(A), string:to_lower(B)).
  true
  5> string:equal(string:to_upper(A), string:to_upper(B)).
  true
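If you compare this way often, the normalization can be wrapped in a small helper. This is a sketch only; ci_equal and equal_ignore_case/2 are names I made up, and it handles case differences but nothing else.

```erlang
-module(ci_equal).
%% string:to_lower/1 belongs to the classic string API, which newer
%% compilers may flag as obsolete; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([equal_ignore_case/2]).

%% Lower-case both strings before comparing, so that only the letters
%% matter and not their capitalization.
equal_ignore_case(A, B) ->
    string:equal(string:to_lower(A), string:to_lower(B)).
```

ci_equal:equal_ignore_case("EQ", "Eq") returns true.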

Using the strip functions you can remove spaces or padding characters from the ends of a string.

  1> string:strip("    no blank    ").
  "no blank"
  2> string:strip("    no blank    ", left).
  "no blank    "
  3> string:strip("    no blank    ", right).
  "    no blank"
  4> string:strip("    no blank    ", both).
  "no blank"
  5> string:strip("+++++++no plus signs++++++", both, $+).
  "no plus signs"
  6> string:strip("+++++++no plus signs++++++", left, $+).
  "no plus signs++++++"
  7> string:strip("+++++++no plus signs++++++", right, $+).
  "+++++++no plus signs"

We can check for inclusion of a character or a string in another string using the chr/2 and str/2 functions and their reverse versions.

They return the position in the string, counting from 1, or 0 if not found.

  1> Str = "this is a string".
  "this is a string"
  2> string:str(Str, "n").
  15
  3> string:str(Str, "p").
  0
  4> string:rstr(Str, "n").
  15
  5> string:rstr(Str, "s").
  11
  6> string:str(Str, "s").
  4
  7> string:chr(Str, 116).
  1
  8> string:rchr(Str, 116).
  12
  9> string:rchr(Str, 11).
  0
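Because 0 means "not found", a membership test is a one-liner. A minimal sketch, with find_demo and its functions invented for the example:

```erlang
-module(find_demo).
%% The classic string functions may be flagged as obsolete by newer
%% compilers; this only silences that warning.
-compile(nowarn_deprecated_function).
-export([contains/2, first_and_last/2]).

%% str/2 returns 0 when Sub does not occur in Str, so any positive
%% position means the substring is present.
contains(Str, Sub) ->
    string:str(Str, Sub) > 0.

%% Positions of the first and the last occurrence of Sub in Str.
first_and_last(Str, Sub) ->
    {string:str(Str, Sub), string:rstr(Str, Sub)}.
```

find_demo:first_and_last("this is a string", "s") returns {4,11}.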

Float and integer conversion.

These functions are very interesting.

to_integer/1 parses an integer at the start of the string and returns a tuple {Integer, Rest}, where Rest is whatever was left unparsed. If the string does not start with an integer it returns {error, no_integer}.

  1> string:to_integer("98.87").
  {98,".87"}
  2> {Ia, Irest} = string:to_integer("09.10").
  {9,".10"}
  3> Ia.
  9
  4> Irest.
  ".10"
  5> string:to_integer(Irest).
  {error,no_integer}
  6> {Ic,_} = string:to_integer("+3").
  {3,[]}
  7> {Id,_} = string:to_integer("-3").
  {-3,[]}
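Since to_integer/1 happily stops at the first character it cannot parse, you may want a stricter variant that rejects leftovers. This is a sketch of one way to do it; parse_demo, strict_int/1 and the trailing_characters atom are my own inventions.

```erlang
-module(parse_demo).
-export([strict_int/1]).

%% Accept the string only when to_integer/1 consumes all of it.
strict_int(Str) ->
    case string:to_integer(Str) of
        {Int, []} when is_integer(Int) ->
            {ok, Int};
        {error, Reason} ->
            {error, Reason};
        {_Int, _Rest} ->
            %% Something was parsed, but text was left over.
            {error, trailing_characters}
    end.
```

parse_demo:strict_int("42") returns {ok,42}, while "98.87" is rejected with {error,trailing_characters}.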

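The to_float/1 function behaves similarly, but it only accepts text that looks like a float, with digits on both sides of the decimal point. Anything else yields {error, no_float}.

  1> string:to_float("2.67").
  {2.67,[]}
  2> string:to_float("2.67 - 10").
  {2.67," - 10"}
  3> string:to_float("-10").
  {error,no_float}
  4> string:to_float("-10.2").
  {-10.2,[]}
  5> string:to_float("-10.").
  {error,no_float}
  6> string:to_float("-10.0").
  {-10.0,[]}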
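Because to_float/1 refuses plain integers, reading "any number" takes both functions. A minimal sketch of that fallback, where number_demo and parse_number/1 are hypothetical names:

```erlang
-module(number_demo).
-export([parse_number/1]).

%% Try to_float/1 first; when the text is not a float, fall back to
%% to_integer/1. The result keeps the {Number, Rest} shape, or is
%% {error, no_integer} when neither function finds a number.
parse_number(Str) ->
    case string:to_float(Str) of
        {error, no_float} -> string:to_integer(Str);
        FloatResult -> FloatResult
    end.
```

parse_number("-10") returns {-10,[]} and parse_number("-10.2") returns {-10.2,[]}.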

This covers the string module, which gives us most of the tools we are used to having in other languages.