Monday 17 August 2015

String Extensions

Like many programmers, I find myself adding and removing characters from either the end or the beginning of strings a lot. It's a simple process. To add to the end of a string, we use:

// s = original string, containing "Hello World"
// n = new value to be added containing "!"
s = s + n //s now contains "Hello World!"

or to add to the beginning of the string, we can just as easily write:

s = n + s // s now contains "!Hello World!"

Retrieving and removing the end, or beginning, of a string is a little more complex, but still very straightforward.
For example, getting the last character from a string and then removing it can be accomplished by:

// Assumes that s is a string that contains "Hello World"
v = s.Right(1) // v now contains "d"
s = s.Left(s.Len() - 1) // s now contains "Hello Worl"

And to get the first character, whilst removing it, we can use:

// Assumes that s is a string that contains "Hello World"
v = s.Left(1) // v now contains "H"
s = s.Right(s.Len() - 1) // s now contains "ello World"

Simple, right?

Why would I need to even address this?

Well, I like simplicity, particularly in my code. I like it to be readable, even if I forget to comment.

To make these simple exercises even simpler, I took a leaf out of JavaScript's array handling.
In JavaScript, we have four methods for adding and removing items from the ends of an array: push, pop, shift and unshift.

pop and shift remove the last and the first element of the array, respectively, and return the removed element.
push and unshift add the passed elements (yes, you can pass more than one) to the end or the beginning of the array, respectively.
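As a quick illustration of the four JavaScript array methods described above, here is a minimal sketch. The array name `items` and its contents are made up for the example.

```javascript
// Illustrative values only; "items" is a hypothetical array.
const items = ["b", "c"];

items.push("d", "e");        // adds one or more elements to the end
items.unshift("a");          // adds one or more elements to the beginning
// items is now ["a", "b", "c", "d", "e"]

const last = items.pop();    // removes and returns the last element
const first = items.shift(); // removes and returns the first element
// items is now ["b", "c", "d"]

console.log(first, last, items);
```

Note that pop and shift each take a single element off the array. That is the behaviour the string versions later in this post extend.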

So, I took this and applied it to the string type in Xojo, with a slight enhancement.
I didn't want to subclass the string type (in fact, Xojo won't allow you to do that anyway), so I decided to extend it with the Extends keyword (as I showed in Extending The Date Class). Although it is not possible to subclass the intrinsic data types in Xojo, we are able to extend them, adding methods as we would for a class.

First off, I'll show you the Push and Unshift methods:

Sub Push(Extends ByRef s As String, Value As String)
  s = s + Value
End Sub

Sub Unshift(Extends ByRef s As String, Value As String)
  s = Value + s
End Sub

As the eagle-eyed amongst you will have noticed, I have included the ByRef keyword as well as Extends. This allows us to modify the object, or datatype, being extended. If we didn't use the ByRef keyword, any changes we made to the string would be lost when we exit the method.
Now, if I want to append something to a string, I can simply use the following:

// Assuming s = "Hello" and v = " World"
s.Push(v) // s now holds "Hello World"

Or we could do this:

// Assuming s = "World" and v = "Hello "
s.Unshift(v) // s now holds "Hello World"

To complement these methods, I created the Pop and Shift methods. These, however, work a little differently from the JavaScript originals. In JavaScript, pop and shift return a single element from the array. However, as I sometimes need to get the first two or more characters from a string, I have made these functions accept a parameter that specifies the number of characters to retrieve:

Function Pop(Extends ByRef s As String, NumChars As Integer = 1) As String
  If NumChars < 1 Then NumChars = 1
  
  Dim ReturnValue As String = s.Right(NumChars)
  
  s = s.Left(s.Len() - NumChars)
  
  Return ReturnValue
End Function

Function Shift(Extends ByRef s As String, NumChars As Integer = 1) As String
  If NumChars < 1 Then NumChars = 1
  
  Dim ReturnValue As String = s.Left(NumChars)
  s = s.Right(s.Len() - NumChars)
  
  Return ReturnValue
End Function

Again, we are passing the extended string by reference, because we are removing characters from it. The methods also return the removed characters. So now, if I want to get the last two characters of a string and remove them from the original, I can use:

// Assuming s contains "Hello World"
v = s.Pop(2) // v now contains "ld" and s contains "Hello Wor"

If I wanted the first two characters, I would use:

// Assuming s contains "Hello World"
v = s.Shift(2) // v now contains "He" and s contains "llo World"

I have given the NumChars parameter a default of 1. I have also added a check to each method so that any value of NumChars less than 1 is treated as 1.

If I only want the first character, I can omit the NumChars parameter altogether:

v = s.Shift()

and if I only want the last character, I can use:

v = s.Pop()

I don't know if you'll find these methods useful, but I find they save me a little typing and make my code a tad more readable.

If you would like to use them, feel free. Either copy and paste the code from here, or you can download them from here.

One small caveat: I use Xojo 2014 r2.1, which doesn't have the new framework. This functionality may already be in the new framework, or this code might have issues there. If anyone would like to test it in a later version of Xojo and let me know, I would be grateful.