Chapter Seven
Doing Stuff with Strings


String Manipulation: The Single Most Important Skill You'll Learn

It is probably difficult for the beginning programmer to fathom just how important string manipulation is. It does sound boring, doesn't it?

But strings are the heart and guts of nearly every script you'll put together.

You'll use strings to assemble data input from the visitor to your site, to format HTML you wish to display on the visitor's screen, to read in data from a database, etc.

The scripts for this chapter are designed to give you examples of the various methods of "picking apart" and "putting back together" strings using the string manipulation calls.

variable1=variable2.charAt(x);

This built-in call of JavaScript tells you the character at any position you wish in a selected string.

variable1 is a string that will be set equal to the character at position x in the string you have defined as variable2.

variable2 is any literal string or a variable you've set equal to a string that you want to search in.

x is any whole number literal or numeric variable you've set equal to a whole number from 0 to the length of your string less one.

Things to Remember:

Characters in any string are counted from left to right. But you do not count from one.

The position of the first character is 0, and the position of the last character is the length of the string minus one. If the position you supply is out of range (less than zero or greater than the string length minus one) JavaScript returns an empty string.

Why is the highest number the length of the string minus one? Because you begin counting with 0 instead of one:

          string:       abcdef  (length=6)
                        ||||||
          positions:    012345  (length-1)

The following example displays characters at different locations in the string "Now is the time".

var str="Now is the time";
var chr=str.charAt(0);

This code snippet would result in:

chr="N";

And another example:

var str="Now is the time";
var chr=str.charAt(9);

This code snippet using our variable named "chr" would result in:

chr="e";

You do not need to use a variable string to find the character at a particular position. You can also use a literal:

var chr="Now is the time".charAt(9);

This code snippet using a literal string would also result in:

chr="e";

What happens if you ask for the character at a position past the end of the string?

var str="Now is the time";
var chr=str.charAt(15);

This code snippet would result in:

chr="";

Why? Because 15 is out of range. Whenever you ask for a character at a position that does not exist in the string, the variable will be set to "" (empty, or nothing).
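To see how the counting rules play out in a loop, here is a minimal sketch that walks a string one position at a time with .charAt();. The function name countChar is made up for this illustration, and the loop borrows the .length property covered later in this chapter.

```javascript
// countChar is a hypothetical helper, not a built-in call.
// It counts how many times the character ch appears in str.
function countChar(str, ch) {
  var count = 0;
  // Valid positions run from 0 to str.length - 1.
  for (var i = 0; i < str.length; i++) {
    if (str.charAt(i) == ch) {
      count++;
    }
  }
  return count;
}

var spaces = countChar("Now is the time", " "); // spaces is 3
var past = "Now is the time".charAt(15);        // out of range, so past is ""
```

Because the loop stops at the length minus one, it never asks for a position that is out of range.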

x=variable.indexOf(var,y);

This is a very important built-in call that you will use many times in your own scripts. It permits you to find out if a word exists in a sentence, for example, and where it is, if it does.

In conjunction with the built-in .substring(); call you can take apart sentences and rewrite them, check to see what information was entered in a form, etc.

variable is any literal string or a variable you have set equal to a string.

var is a literal string or a variable you have set equal to a string comprised of the single letter, group of letters or word you are looking for in variable.

y is the position in variable where you want the search for your characters to begin. It can be any literal whole number or a variable you have set equal to a whole number from 0 to the length of variable minus one.

Timesaver: The y does not have to appear in the .indexOf(); call. If you leave it out, the call will start looking at the first character of variable (position 0). Characters in your string are counted from left to right. The position of the first character is 0, and the position of the last character is one less than the length of your string.

You don't need to include a value for position for most searches, since you nearly always will be searching the whole string and JavaScript will assume you mean 0 for position if you don't place a value in y.

If your var group is not found in variable, you will get back a value of -1, that is, x will be set equal to -1.
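The optional starting position is most useful when you want every occurrence rather than just the first. The following is only a sketch under that assumption; findAll is a made-up name, and it collects positions in an array.

```javascript
// findAll is a hypothetical helper, not a built-in call.
// It returns an array of every position where word starts in str.
function findAll(str, word) {
  var found = [];
  if (word == "") {
    return found; // an empty search string would never return -1
  }
  var pos = str.indexOf(word);
  while (pos > -1) {
    found[found.length] = pos;
    // Resume the search one character past the last hit.
    pos = str.indexOf(word, pos + 1);
  }
  return found;
}

var hits = findAll("Now is the time", "i");   // positions 4 and 12
var none = findAll("Now is the time", "xyz"); // empty array, nothing found
```

The loop ends as soon as .indexOf(); hands back -1, which is the "not found" value described above.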

Here's a little script that uses both the .charAt(); and .indexOf(); calls:

<HTML><HEAD>
<TITLE>.charAt and .indexOf Calls</TITLE>
<SCRIPT LANGUAGE="JavaScript">
<!-- Hide from JS-Impaired Browsers
/* alph is simply a string to hold
   the letters of the alphabet in
   order of the corresponding .gif
   files. */
alph="ABCDEFGHIJKLMNOPQRSTUVWXYZa"
+"bcdefghijklmnopqrstuvwxyz0"
+"123456789!@#$%^&*()-_=+:;\"',.?/ ";
/* And a sentence to send to screen
   using image swaps. */
ls="The quick brown fox jumped over the"
+" lazy dogs";
 
function prtIt(){
 for (var i=0;i<45;i++){
  j=ls.charAt(i); // get the character
  k=alph.indexOf(j); // get its position
  document.images[i].src="alpha/"+k+".gif";
  }
 }
 
function showIt(){
 for (var i=0;i<45;i++){
  document.images[i+ad].src="nr/rbd.gif";
  }
 document.images[flg+ad].src="nr/rbl.gif";
 document.images[ad+45].src="nr/w.gif";
 /* This next line combines both the
     .indexOf() and .charAt() calls. */
 document.images[ad+45].src="alpha/"
 +alph.indexOf(ls.charAt(flg))+".gif";
 }
 
/* This little routine is simply used to
   count the number of images you may
   place on your web page prior to the
   radio buttons. Just makes the routine
   independent of how you lay your
   page out. */
function getImgAdd(){
 for (var i=0;i<50;i++){
  if (document.images[i].src.indexOf("rbd.gif")>-1){
   ad=i;
   i=50;
   }
  }
 }
// End Hiding -->
</SCRIPT>
</HEAD>
<BODY BGCOLOR="white">
<CENTER><P>
<B>JavaScript Using the
<FONT COLOR="blue">.charAt()</FONT>
 and <FONT COLOR="blue">.indexOf()
 </FONT> Calls</B>
<P><TABLE BORDER=0 WIDTH=450
 CELLPADDING=0 CELLSPACING=0>
<TR><TD>Click the Radio Button below any letter
in the red string to see the .charAt() call used to
return the letter at that position to screen<P></TD></TR>
 
<SCRIPT LANGUAGE="JavaScript">
<!-- Hide from JS-Impaired Browsers
document.write("<TR><TD>");
for (var i=0;i<45;i++){
 document.write("<IMG SRC='nr/w.gif' WID"
 +"TH=10 HEIGHT=14 BORDER=0>");
 }
document.write("</TD></TR><TR><TD>");
for (var i=0;i<45;i++){
 document.write("<A HREF='s12.htm'"
 +" onClick='flg="+i+";showIt();return fa"
 +"lse;'><IMG SRC='nr/rbd.g"
 +"if' WIDTH=10 HEIGHT=10 BORDER=0></A>");
 }
document.write("</TD></TR><TR><TD ALI"
+"GN=CENTER><BR><FONT SIZE=5><B>You"
+" clicked below this letter:</B><BR><IMG"
+" SRC='nr/w.gif' WIDTH=20 HEIGHT=2"
+"8 BORDER=0></TD></TR></TABLE>");
// End Hiding -->
</SCRIPT>
<SCRIPT LANGUAGE="JavaScript">
<!--Hide from JS-Impaired Browsers
getImgAdd();
prtIt();
// End Hiding -->
</SCRIPT>
</BODY>
</HTML>

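The heart of the script above is the pairing of .charAt(); with .indexOf(); to turn each letter into the number of a .gif file. The sketch below isolates that step so it can run without a web page; imageNames is a hypothetical function name, and it returns the file names instead of swapping images.

```javascript
// Same lookup string as the script above.
var alph = "ABCDEFGHIJKLMNOPQRSTUVWXYZa"
  + "bcdefghijklmnopqrstuvwxyz0"
  + "123456789!@#$%^&*()-_=+:;\"',.?/ ";

// imageNames is a hypothetical helper. For each character in the
// sentence it builds the file name the script would load.
function imageNames(sentence) {
  var names = [];
  for (var i = 0; i < sentence.length; i++) {
    var k = alph.indexOf(sentence.charAt(i)); // -1 if not in alph
    names[names.length] = "alpha/" + k + ".gif";
  }
  return names;
}

var names = imageNames("The fox");
// names[0] is "alpha/19.gif", since "T" sits at position 19 in alph
```

A character that does not appear in alph would produce "alpha/-1.gif", so a real page would want to test for -1 first.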

x=variable.lastIndexOf(var,y);

This built-in call of JavaScript is exactly like the .indexOf(); call, above, but works backwards (i.e., from the end of variable toward the beginning).

variable is any literal string or a variable you set equal to a string.

var is a literal string or a variable you set equal to a string of the single letter, group of letters or word you are looking for in variable.

y is the position in variable where you want the backward search for your var to begin. It can be any literal whole number or a variable you have set equal to a whole number from the length of variable minus one down to 0.

The following simple examples use the .indexOf(); and .lastIndexOf(); calls to locate values in the string "Now is the time".

var str="Now is the time";
var chr="is";
var pos=str.indexOf(chr);

This code snippet would result in:

pos=4;

Here's another code snippet:

var str="Now is the time";
var pos=str.indexOf("the");

This code snippet would result in:

pos=7;

Yet another example:

var str="Now is the time";
var pos=str.indexOf("the",8);

This code snippet would result in:

pos=-1;

Why? Because "the" is not found beginning at position 8 to the end.

Here's an example where it is found:

var str="Now is the time";
var chr="i";
var pos=str.lastIndexOf(chr);

This code snippet would result in:

pos=12;

12 being the position of the last "i" in our search string.

Another example:

var str="Now is the time";
var pos=str.indexOf("t");

This code snippet would result in:

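pos=7;

7 being the position of the first "t" in our search string (the "t" in "the"). To find the last "t", at position 11, use .lastIndexOf(); instead.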

And a final example:

var str="Now is the time";
var pos=str.lastIndexOf("the",12);

This code snippet would result in:

pos=7;

Take note that even when reading backwards from the end of the string, the returned position is counted from the beginning of the string, again with the count commencing with 0.
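Since both calls count from the beginning of the string, their results can be compared directly. As a small sketch of that idea, the hypothetical function occursOnce below reports whether a piece of text appears exactly one time.

```javascript
// occursOnce is a hypothetical helper, not a built-in call.
// If the first and last occurrence sit at the same position,
// there is only one occurrence.
function occursOnce(str, piece) {
  var first = str.indexOf(piece);
  var last = str.lastIndexOf(piece);
  return first > -1 && first == last;
}

var a = occursOnce("Now is the time", "the"); // true: found only at 7
var b = occursOnce("Now is the time", "i");   // false: found at 4 and 12
var c = occursOnce("Now is the time", "xyz"); // false: not found at all
```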

variable1=variable2.substring(x,y)

You'll almost always use this built-in function of JavaScript in conjunction with the .indexOf();, .lastIndexOf(); and .charAt(); calls that you have just learned something about.

This is how you harness the power of JavaScript to take database information in very long strings (30k or more) and find specific pieces of information in those strings. Or to take form data entered by a visitor to your pages and pre-process it before sending it to a server .cgi script for recording in files.

variable1 is any variable you wish to name to accept the results of this call.

variable2 is any string literal or variable you have set equal to a string that you wish to parse apart.

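x and y are any literal whole numbers or variables you have set equal to whole numbers from 0 to the length of the string. (Because the substring stops just before the higher number, the length itself is what you use to read all the way to the end.)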

How it works:

Just as with .indexOf();, .charAt(); and .lastIndexOf();, the characters in any string are counted from left to right.

The position of the first character is 0, and the position of the last character is the length of the string minus one.

So, once you have grasped the concept for any of these built-in calls you understand the others, including the .substring(); call.

When you have determined the portion of the string you wish to read into variable1, you may grab that substring portion without worrying whether x is less than y.

If your x is less than your y, the .substring(); function returns the portion starting with the character at x and ending with the character before y.

If x is greater than y, the substring function returns the portion starting with the character at y and ending with the character before x.

Now, if you re-read the prior two paragraphs very carefully, you will see that the resulting substring is identical!

So, the upshot of all that gobbledy-gook is this: ya don't have to worry about which is less or perform any tests to find out.

If x is equal to y, the .substring() call returns an absolutely empty string. Why? Because you have specified a substring commencing at the first number and ending before the second number, and since they are equal, nothing's left!
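The rules above are easy to check for yourself. This short sketch uses the same "Now is the time" string; the variable names are chosen only for this illustration.

```javascript
var str = "Now is the time";

var forward = str.substring(7, 10);       // "the"
var backward = str.substring(10, 7);      // also "the": the order does not matter
var empty = str.substring(5, 5);          // "": x equals y, so nothing is left
var toEnd = str.substring(11, str.length); // "time": read through to the end
```

Note that the last line passes the length of the string, not the length minus one, because the substring always stops just before the higher number.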

The following example uses the .substring(); call to display characters from the string "Now is the time".

str="Now is the time";
str1=str.substring(0,3);
str2="Now is the time".substring(0,3);
str3=str.substring(7,10);

This results in:

str1="Now";
str2="Now";
str3="the";
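In practice the positions usually come from .indexOf(); or .lastIndexOf(); rather than being typed in by hand. Here is a minimal sketch of that pairing; firstWord and lastWord are hypothetical names, and the sketch assumes words are separated by single spaces.

```javascript
// firstWord and lastWord are hypothetical helpers.
function firstWord(sentence) {
  var pos = sentence.indexOf(" ");
  if (pos == -1) {
    return sentence; // no space, so the whole string is one word
  }
  return sentence.substring(0, pos);
}

function lastWord(sentence) {
  // If no space is found, lastIndexOf gives -1 and we start at 0.
  var pos = sentence.lastIndexOf(" ");
  return sentence.substring(pos + 1, sentence.length);
}

var w1 = firstWord("Now is the time"); // "Now"
var w2 = lastWord("Now is the time");  // "time"
```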

OK. Now let's take a look at a script that makes use of the .substring(); call to "pick apart" and "reassemble" some strings.

This is an important call to get your head around, because you will be using it all the time for database retrieval and screen display, etc.

<HTML><HEAD>
<TITLE>Using .substring Calls</TITLE>
<SCRIPT LANGUAGE="JavaScript">
<!-- Hide from JS-Impaired Browsers
/* This "curse" string may be changed to
   "**********" or anything you deem more
   appropriate. It must be as long as your
   longest "bad" word. */
curse="#@&*%!#@&*%!#@&*%!";
nr=20; // Number of words in "bad"
 
function smutEngine(){
 /* Caution: this particular script contains
    really nasty and offensive words. That's
    the whole point, of course. You may
    well want to prevent such words from
    appearing in content submitted by
    visitors to your site. This routine uses
    the .substring call to replace any words
    you decide are offensive at your Web
    Site with the curse symbols. The "bad"
    string should only be comprised of
    those words you don't wish to appear
    in postings or submissions at YOUR
    Web Site. Don't forget the tilde after
    the last word in your own list of "bad"
    words. */
bad="sex~babes~shit~fuck~damn~porn"
+"o~cum~cunt~prick~pecker~asshole~pe"
+"dophil~man-boy~man/boy~dong~twa"
+"t~hell~whore~bastard~bitch~";
 txt=document.ug.ly.value; // get input
 /* We must "fold" the input text to
    lower case, so that any bad words
    will "match" when we look in the
    visitor's input for them. Note that
    we do NOT change the "txt" itself,
    but put it in temporary variable
    named "tmp" in this script. */
 tmp=txt.toLowerCase();
 for (var i=0;i<nr;i++){
  pos=bad.indexOf("~");
  /* Pick the word from the
     front of "bad" */
  wrd=bad.substring(0,pos);
  /* Rewrite "bad" from that word
     to end of string. */
  bad=bad.substring(pos+1,bad.length);
  /* Now look for every occurrence
     of that bad word in the temp
     variable. (Might be more than
     one, so the "while" call is used ) */
  while (tmp.indexOf(wrd)>-1){
   pos=tmp.indexOf(wrd);
   /* First, replace the offending
     stuff in the temporary variable
     so we won't be in fatal loop */
   tmp=tmp.substring(0,pos)
   +curse.substring(0,wrd.length)
   +tmp.substring((pos+wrd.length),tmp.length);
   /* Now, since it appears in exactly
     the same position in the input,
     we do the same to the visitor's
     string. */
   txt=txt.substring(0,pos)
   +curse.substring(0,wrd.length)
   +txt.substring((pos+wrd.length),txt.length);
  }
 }
 /* Once we are done testing for
    all bad words and have rewritten
    it, send it back to the form
    element. */
 document.ug.ly.value=txt;
 }
 
function htmlOut(){
 txt=document.ug.ly1.value;
 ctr=0;
 /* Replace each "<" with its HTML
    entity. ctr stops this loop after
    four replacements. */
 while ((txt.indexOf("<")>-1)&&(ctr<4)){
  pos=txt.indexOf("<");
  txt=txt.substring(0,pos)+"&lt;"
  +txt.substring(pos+1,txt.length);
  ctr++;
  }
 /* Do the same for every ">". */
 while (txt.indexOf(">")>-1){
  pos=txt.indexOf(">");
  txt=txt.substring(0,pos)+"&gt;"
  +txt.substring(pos+1,txt.length);
  }
 document.ug.ly1.value=txt;
 }
 
// End Hiding -->
</SCRIPT>
</HEAD>
<BODY BGCOLOR="white">
<CENTER>
<P><B>JavaScript employing the <FONT COLOR="blue">
.substring</FONT> call</B>
<FORM NAME="ug">
<TABLE BORDER=0 WIDTH=486>
<TR><TD><B>First example: A "Smut Engine" to
Replace Unwanted Words in Form Submissions</B>
<P>Depending upon the audience for your Web Site,
you may wish to screen form submissions (at least
perfunctorily) for content before processing them.
This routine uses the <B>.substring</B> call
to simply replace "offending" words by "taking the
string apart" and then "reassembling" it.
<P>Type some dirty words here and click the submit button.
<P><DIV ALIGN=CENTER>
<INPUT TYPE="text" NAME="ly" SIZE=60
 VALUE="">
<BR><INPUT TYPE="button" NAME="but1"
 VALUE=" Submit "
 onClick="smutEngine()"></DIV><P>
</TD></TR>
<TR><TD><P><HR NOSHADE>
<P><B>Second example: Stripping the HTML from Visitor
Entries</B>
<P>If entries will be displayed later, it may be
important to "disable" any direct HTML entries folks
decide they want to enter. Here we just look for
left and right carets (&lt;&gt;) and use the
<B>.substring</B> call to replace them with
"&amp;lt;" and "&amp;gt;" respectively.
<P>Type some HTML into the text element below.
Then click the submit button to see how it is altered.
<P><DIV ALIGN=CENTER>
<INPUT TYPE="text" NAME="ly1" SIZE=60
 VALUE="">
<BR><INPUT TYPE="button" NAME="but2"
 VALUE=" Submit " onClick="htmlOut()">
</DIV><P></TD></TR>
</TABLE></FORM>
</BODY>
</HTML>

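Both routines in that script follow one pattern: find a position with .indexOf();, then rebuild the string from the piece before it, the replacement, and the piece after it. The sketch below pulls that pattern into a single function. The name swapAll is made up for this illustration, and restarting the search just past the replacement is a choice of this sketch, made so that a replacement containing the search text cannot cause an endless loop.

```javascript
// swapAll is a hypothetical helper, not a built-in call.
// It replaces every occurrence of find in str with repl.
function swapAll(str, find, repl) {
  if (find == "") {
    return str; // nothing to look for
  }
  var pos = str.indexOf(find);
  while (pos > -1) {
    str = str.substring(0, pos) + repl
      + str.substring(pos + find.length, str.length);
    // Skip over the text we just inserted before searching again.
    pos = str.indexOf(find, pos + repl.length);
  }
  return str;
}

var safe = swapAll("<B>Hi</B>", "<", "&lt;");
safe = swapAll(safe, ">", "&gt;");
// safe is now "&lt;B&gt;Hi&lt;/B&gt;"
```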

variable1=variable2.toLowerCase();
and
variable1=variable2.toUpperCase();

Whenever you are doing searches using JavaScript, you will find these calls extremely important.

Why? Because searches are case sensitive. If you search a string for the word "now" and the word "Now" is in that string, you will not find it.

Thus, for databases you will be using, you may find it useful to store them either as all upper case or all lower case just to save yourself the additional line of JavaScript to convert that data to either all upper or all lower case.

But in either instance, if you are relying upon form data entered by the visitor to your site, you will need to use one of these calls to ensure that a valid search doesn't fail due to case sensitivity.

variable1 is any variable you decide upon to receive the results of these calls.

variable2 is any literal string or variable you have set equal to a string which contains (perhaps) both upper and lower case letters.

str="Now Is The Time";
str1=str.toLowerCase();
str2="Now Is The Time".toLowerCase();
str3=str.toUpperCase();
str4="Now Is The Time".toUpperCase();

This will result in the following:

str1="now is the time";
str2="now is the time";
str3="NOW IS THE TIME";
str4="NOW IS THE TIME";
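To show why this matters for searching, here is a minimal sketch of a search that ignores case by folding both strings to lower case first. The name findNoCase is hypothetical.

```javascript
// findNoCase is a hypothetical helper, not a built-in call.
// It returns the position of word in text, ignoring case, or -1.
function findNoCase(text, word) {
  return text.toLowerCase().indexOf(word.toLowerCase());
}

var plain = "Now Is The Time".indexOf("now");      // -1: the case differs
var folded = findNoCase("Now Is The Time", "now"); // 0: found at the start
```

Folding does not change the length of these strings, so the position returned applies to the original text as well.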

x=variable.length;

Although I've included the .length call in this section on string parsing, since it is used most often in that context, this call is useful in a number of other contexts as well.

x is a number set equal to the length of whatever you want to measure.

variable may be a string you wish to know the length of, or any of the following:

forms - how many different forms?
frame_name - how many children?
history - how many URL's in Go menu?
radio_name - how many in this group?
select element - how many select boxes?
windows - how many referenced?
anchors - how many?
elements - how many?
links - how many?
select_name.options - how many options?

The .length of a string is a read-only property. For an empty string (a string equal to "") length will be zero.

In the following example, the getSel(); function uses the length call to iterate over every element in the "a" array. "a" is a select element on the "isn" form.

function getSel() {
 len=document.isn.a.length;
 for (var i = 0; i < len; i++) {
  if (document.isn.a.options[i].selected) {
   x=i;
   i=len; // exit loop cleanly, clear register
   }
  }
 }

and using this form:

<FORM NAME="isn">
<SELECT NAME="a" SIZE=3>
<OPTION>First
<OPTION>Second
<OPTION>Third
</SELECT>
</FORM>
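If you want to try the loop outside a browser, the following sketch swaps the real select element for a plain object with the same shape. The object fakeSelect is purely a stand-in invented for this illustration, and this version returns the position instead of setting a global x.

```javascript
// fakeSelect is a hypothetical stand-in for document.isn.a,
// with the second of three options selected.
var fakeSelect = {
  length: 3,
  options: [
    { selected: false },
    { selected: true },
    { selected: false }
  ]
};

// Returns the position of the first selected option, or -1 if none.
function getSel(sel) {
  for (var i = 0; i < sel.length; i++) {
    if (sel.options[i].selected) {
      return i;
    }
  }
  return -1;
}

var x = getSel(fakeSelect); // x is 1, the "Second" option
```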

variable=document.location

This call sets variable equal to the entire URL (or file pathname when run locally). It is a read-only call; that is, you can't change location with this call.

To change location, use:

location.href="url";

Don't let the similarity of these calls confuse you.

The following example displays the URL of the document presently on screen:

str=document.location;
document.write("Current URL is "+str);

ISO LATIN-1 Character Set

Finally, some very esoteric scripts that you write may require ASCII encoding in the ISO Latin-1 character set. (I doubt it, though.)

However, on the off chance that this is something you'll want to do, the final two calls in this chapter are provided in the interests of completeness.

Skip them if you can!

variable1=escape(variable2);

This call sets the variable1 you specify equal to the ASCII encoding of a character (variable2) you also specify in the ISO Latin-1 character set.

variable2 is any literal character or variable you have set equal to a character that is not part of the lower case or upper case alphabet or a number.

If you do mistakenly set variable2 to an alpha-numeric, it will not error out on you, but will simply set variable1 equal to that self-same alpha-numeric.

variable1 will be set in this form: "%xx", where xx is the two-digit hexadecimal ASCII code of variable2.

This code snippet:

str="!";
str1="#&!";
str2=escape(str); // using variable
str3=escape("!"); // using literal
str4=escape(str1); // using variable
str5=escape("#&!"); // using literal

Would result in the following:

str2="%21";
str3="%21";
str4="%23%26%21";
str5="%23%26%21";

variable1=unescape(variable2);

This call sets the variable1 you specify equal to the character in the ISO Latin-1 character set that corresponds to the ASCII characters in variable2.

variable2 is any literal string or variable you have set equal to a string holding the encoded form of characters that are not part of the lower case or upper case alphabet or numbers.

If you do mistakenly set variable2 to an alpha-numeric, it will not error out on you, but will simply set variable1 equal to that self-same alpha-numeric.

variable2 should be set in this form: "%xx", where xx is the two-digit hexadecimal ASCII code of the character you want back in variable1.

This code snippet:

str="%21";
str1="%23%26%21";
str2=unescape(str);
str3=unescape("%21");
str4=unescape(str1);
str5=unescape("%23%26%21");

Would result in the following:

str2="!";
str3="!";
str4="#&!";
str5="#&!";
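The two calls undo one another, which a quick round trip makes plain. The sketch below is only an illustration with variable names chosen for it. The charCodeAt and toString(16) calls are not covered in this chapter; they are used here just to show where the "21" in "%21" comes from.

```javascript
var original = "#&!";
var encoded = escape(original);  // "%23%26%21"
var decoded = unescape(encoded); // back to "#&!"

// Letters and numbers pass through untouched.
var same = escape("Now123");     // "Now123"

// The xx in "%xx" is the character's code written in hexadecimal.
var hex = "!".charCodeAt(0).toString(16); // "21"
```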



© Copyright 1997, John H. Keyes john.keyes@intellink.net