Java Jazzle: Replace word

Thursday, October 28, 2010

Replace word

Problem: Write a method to replaces all occurences a word in a string with another word. Assume the method signature is

   static String replaceWord(String original, String find, String replacement)

Note that Strings are immutable, so we must create a new String. The word "replace" is probably misleading. Also, the method is static because it doesn't reference instance variables. This allows some optimizations by the compiler.

A simple, but incorrect, solution

This first attempt only replaces one word, but we could fix that up later with a loop -- but it's not worth the effort because it's wrong anyway.

//--------------------------------------------- replaceWord() 
// BAD BAD BAD BAD BAD  
static String replaceWord(String original, String find, String replacement) {
    int i = original.indexOf(find);
    if (i < 0) {
        return original;  // return original if 'find' is not in it.
    }
  
    String partBefore = original.substring(0, i);
    String partAfter  = original.substring(i + find.length());
  
    return partBefore + replacement + partAfter;
}

This example method replaces the first occurrence of a one string with another. If we execute the following code,

   t = "A great man";
   s = replaceWord(t, "man", "woman");

the value in s will be "A great woman". Ok, but what if we changed the orginal string to contain the word "human", or anything else that contained a substring, but not a separate word, that has "man" in it.

   t = "A great human";
   s = replaceWord(t, "man", "woman");

Now s will contain "A great huwoman". Not what was intended.

The solution is to know about "words"

Simply finding a substring isn't sufficient. There are several ways to solve the problem.

Check the character preceding and following the string to see if they are delimiters, and can't be part of a word, or the word boundary is at the beginning or end of the string.
Use java.util.StringTokenizer. This is a common solution, but has been superceded by regular expressions (see below).
Java 1.4 added Regular Expressions, which is the most common way to solve the problem in most other programming languages. [TO DO: Write solution using regular expressions.]

StringTokenizer solution

This replaces all occurences of a "word", not just the first.

static String replaceAllWords1(String original, String find, String replacement) {
    String result = "";
    String delimiters = "+-*/(),. ";
    StringTokenizer st = new StringTokenizer(original, delimiters, true);
    while (st.hasMoreTokens()) {
        String w = st.nextToken();
        if (w.equals(find)) {
            result = result + replacement;
        } else {
            result = result + w;
        }
    }
    return result;
}

This breaks the string into tokens, consisting of the text between delimiters, as well as the tokens consisting of the delimiters. The true parameter causes the delimiters to also be returned as tokens. You might not need them if you are only interested in what comes between them. Naturally, you would choose the delimiters appropriate to your problem.

Buiding a string by sucessive concatenations is inefficient, as it creates new string as it is immutable and should be replaced by StringBuilder (or the equivalent pre-Java 5 StringBuffer) for efficiency. See below.

Using StringBuilder to build a String

Creating new strings in a loop can be inefficient because of the number of new objects that are created and discarded. It's better to use a StringBuilder (introduced Java 5) or the slightly slower, synchronized, StringBuffer, which has been in Java from the beginning.

static String replaceAllWords2(String original, String find, String replacement) {
StringBuilder result = new StringBuilder(original.length());
    String delimiters = "+-*/(),. ";

StringTokenizer st = new StringTokenizer(original, delimiters, true);
    while (st.hasMoreTokens()) {
        String w = st.nextToken();
        if (w.equals(find)) {
            result.append(replacement);
        } else {
            result.append(w);
        }
    }
    return result.toString();
}
This creates a StringBuilder with an initial capacity equal to the original string. It will expand if necessary, so this initial capacity isn't critical. The append(...) method is used to add to the end of it. The toString() call at the end returns a String. This is an efficient conversion.

Java Jazzle

Pages

Thursday, October 28, 2010

Replace word

A simple, but incorrect, solution

The solution is to know about "words"

StringTokenizer solution

Using StringBuilder to build a String

No comments:

Post a Comment

Chitika

Other Blogs

Blog Rankings

Total Pageviews

Disclaimer

About this blog