[Grace-core] [minigrace] Need more string functionality (#52)

Kim Bruce kim at cs.pomona.edu
Thu Aug 21 11:56:18 PDT 2014


This is an interesting question and has led me to think about alternatives.

First a quick review.  Most languages (including Smalltalk, much to my surprise!) return -1, 0, or some other impossible value to indicate the absence of the element searched for.  Python supports this, but also has a version that throws an exception if the value is not there.  This seems like a gratuitous use of exceptions (I feel exceptions should be used only in exceptional circumstances), so I won't consider that further.

Haskell has a function elemIndex (on lists) that returns a value of type Maybe Int -- which corresponds to the index if found and the special value Nothing if not found.  This seems less complicated that Andrew's interface, which requires a block to handle the case where the element is not found, as well as a match statement to extract the number if it is there.

The two most attractive alternatives to me are:
1.  The Haskell solution:  indexOfOpt(pattern:String -> Option<Number>
      where type Option<T> = T | nothing, where nothing is a special object with minimal behavior

2.  Use blocks for both the positive and negative results
      indexOfBlock<T>(pattern: String) ifPresent (posAction: Block1<Number,T>) ifAbsent(negAction: Block0<T>) -> T
      In the above, Block1<U,T> = {apply(p:U) -> T} while Block0<T> = {apply -> T}

These provide very different ways of handling the possible lack of a reasonable value.  

The first returns that absence and forces the calling context to deal with it, generally via a match statement.

The second forces the call to index to decide what to do with the value, rather than waiting for the value to come back and then handle it.

Here is an example of the use of each of these to write a simple contains method (outside of String), writing it as a method of String would be similar

method containsOpt(pattern:String) in (source:String) -> Boolean {
      match (source.indexOfOpt(pattern))
            case{index: Number -> true}
            case{nothing -> false}
}

method containsBlock(pattern:String) in (source:String) -> Boolean {
       source.indexOfBlock(pattern) ifPresent{ index: Number -> true}
                                                       ifAbsent{ false}
}

I personally like the second one slightly more than the first, though the first is closer to what is found in other languages if that is something we would like to consider in our design, and is only slightly more complex (you have to deal with Option types, whereas you can ignore that in the second).

By way of contrast, the simplest method I could come up with Andrew's spec is the following:

method containsBlack(pattern:String) in (source:String) -> Boolean {
       source.indexOfBlack(pattern) ifAbsent{ return false}
       true
}

While it is shorter than the above examples, if the index were actually to be used, you would have to add a (gratuitous) match statement and then ignore the case of type W.  I also find this solution difficulty to explain to novices as you only compute the index in order to handle the failure.

----------------
As for Andrew's question why we need the variant that starts at an arbitrary index, the reasons are purely efficiency.  If you are searching for all occurrences of a phrase in a large text (e.g., a book) then making all those substrings will be very inefficient.  We could argue that doesn't matter (efficiency has been a low priority for us), going from a linear to quadratic algorithm to find all occurrences of a substring grates a bit!

Kim



On Aug 21, 2014, at 11:03 AM, Andrew Black <notifications at github.com> wrote:

> The fist (indexOf) these methods is now implemented, as is `lastIndexOf. This and other methods on string are even documented in your BuiltInTypes.tex document, (which should probably be moved out of ObjectDraw). They return 0 if the pattern is not found.
> 
> The second is not implemented. I dispute the need for it, since the same thing can be accomplished by creating a substring. But I'm willing to be convinced.
> 
> I also believe that the right interface for these methods is
> 
> indexOf(pattern:String) ifAbsent(action:Block0<W>) -> Number | W 
> lastIndexOf(pattern:String) ifAbsent(action:Block0<W>) -> Number | W 
> where the client can put { 0 }, {-1}, { "Absent" }, {NoSuchObject.raise "pattern not found"}, or {return} as the action, depending on circumstances. We should not be encouraging students to believe that interfaces that return arbitrary numbers that then have to be tested are a Good Idea.
> 
>> Reply to this email directly or view it on GitHub.
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailhost.cecs.pdx.edu/pipermail/grace-core/attachments/20140821/8aff2283/attachment.html>


More information about the Grace-core mailing list