File Pyth0012.htm
April 6, 2000
Something for everyone
Beginners start at the beginning, and experienced programmers jump in further along. Lesson 1 provides an overall description of this online programming course.
This lesson provides an introduction to the use of strings.
A person's first name usually consists of several characters, and these characters are treated as a unit to produce a name.
What is a literal?
Perhaps the best way to describe a literal is to describe what it is not.
A literal is not a variable. In other words, the value of a literal doesn't change with time as the program executes. You might say that it is taken at face value.
An expression using variables
For example, the following expression describes the sum of two variables named var1 and var2:
sum = var1 + var2
The result of this expression can vary depending on the values stored in var1 and var2 at the instant in time that the expression is evaluated.
An expression using literals
On the other hand, the following expression describes the sum of two literal numeric values:
sum = 6 + 8
No matter when this expression is evaluated, it will always produce a sum of 14.
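The difference can be seen in a short sketch. The values given to var1 and var2 here are arbitrary and chosen only for illustration, and the result names are used instead of sum to avoid hiding Python's built-in sum function.

```python
# Values chosen arbitrarily for illustration.
var1 = 3
var2 = 4
variable_total = var1 + var2   # depends on what var1 and var2 hold right now

var1 = 20                      # change one variable...
changed_total = var1 + var2    # ...and the same expression gives a new result

literal_total = 6 + 8          # literals only: always 14

print(variable_total, changed_total, literal_total)   # prints: 7 24 14
```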
String literals
Literal values can also be used for strings.
For example, the interactive code fragment in Figure 1 shows three entries typed at the interpreter prompt.
The first two entries are valid string literals. As you can see, in the first two cases, the interpreter displays my name in the output.
Note that in the first two cases, my name is surrounded by either quotes (sometimes called double quotes) or apostrophes (sometimes called single quotes).
A syntax error
However, the third entry is not a valid string literal, and the interactive interpreter produced a syntax error message. In the third case, my name is not surrounded by either double quotes or single quotes, and that is what produced the error.
So, what is a valid string literal?
According to the Python Reference Manual:
String literals can be enclosed in matching single quotes (') or double quotes (").
This explains why the first two input lines in the above interactive code fragment were accepted and the third line produced an error.
Proper syntax
In the first line, my name was surrounded by matching double quotes. In the second input line, my name was surrounded by matching single quotes.
Bad syntax
However, in the third input line, my name was not surrounded by quotes of either type and this produced a syntax error.
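As a rough sketch of the three cases, the following code uses the author's name as the assumed string text. It is written for a current Python 3 interpreter. The call to compile() is not part of the original lesson; it is used here only so that the syntax error can be caught and reported without stopping the script.

```python
# The name text is an assumption based on the author's name.
double_quoted = "Richard Baldwin"
single_quoted = 'Richard Baldwin'
print(double_quoted == single_quoted)   # both forms create the same string

# An unquoted name is not a string literal. compile() lets us catch the
# resulting syntax error instead of halting the program.
try:
    compile("Richard Baldwin", "<example>", "eval")
    unquoted_is_valid = True
except SyntaxError:
    unquoted_is_valid = False

print(unquoted_is_valid)
```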
More examples
Figure 2 shows two more examples of valid string literals with the input value highlighted in boldface.
(Note that I purposely colored the "\012" in red to make it stand out. It was not that color in the original interpreter output. I will explain what it means later.)
What does """...""" mean?
This syntax is explained by the following excerpt from the Python Reference Manual:
Strings can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings). The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.
Use of triple quoted strings
One of the main advantages of using triple-quoted strings is that this makes it possible to continue a string on a new line and to preserve the line break in the string.
The newline (\012) character
When this triple-quoted, multiple-line input was displayed by the interpreter, the display included "\012".
This is a numeric representation of the newline character. (I will show you another representation later.) It appeared in the output at the point representing the end of the first line of input. This indicates that the interpreter knows and remembers that the input string was split across two lines.
Why "represent" the newline character?
As the name implies, a newline character is a character that means, "Go to the beginning of the next line."
The newline character is sort of like the wind. You can't see the wind, but you can see the result of the wind blowing through a tree.
Similarly, you can't see a newline character, but you can see what it does. Therefore, we must represent it by something else, like \012 if we want to be able to see where it appears within a string.
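A minimal sketch of this idea follows, using an assumed two-line name as the string text. Note that a current Python 3 interpreter represents the newline as \n rather than \012; the two are the same character. The repr() function is used to ask for the representation explicitly.

```python
# A triple-quoted string split across two lines of input.
two_lines = """Richard
Baldwin"""

print(repr(two_lines))   # the newline is shown as an escape sequence
print(len(two_lines))    # the newline counts as one character: prints 15
```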
An escape sequence
The \012 is what we call an escape sequence. I will discuss escape sequences in detail a little later.
One more syntax option
The Python Reference Manual describes one more syntax option for strings, as shown below. I am going to let this one lie for the time being. I will come back and address it in a subsequent lesson if I have the time. I am including it here simply for completeness.
String literals may optionally be prefixed with a letter `r' or `R'; such strings are called raw strings and use different rules for backslash escape sequences. ... Unless an `r' or `R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
An example of the first category is the newline character. Except when using triple quoted strings, you cannot enter the newline character directly into a string.
Why? Because when you press the Enter key in an attempt to enter a newline, that simply terminates your input for that line. It doesn't enter the newline character into the string.
Using the newline character
The interactive code fragment in Figure 4 illustrates the use of an escape sequence to enter the newline character into a string. Note the \012 between my first and last names.
What does print mean?
This fragment uses a print statement. I haven't explained that statement to you before, but you can probably guess what it means.
When print is used interactively, it is a request to have its right operand (the expression to its right) printed on the next line. In this case, it is a request to have my name printed on the next line.
Including the newline character
In this fragment, I entered the newline escape sequence between my first and last names when I constructed the string. Then, when the string was printed, the cursor advanced to a new line following my first name and printed my last name on the new line. That is what escape sequences are all about.
print renders according to meaning
Note also that the print statement rendered the newline character according to its meaning.
What I mean by this is that the print statement did not print something that represented the newline character (\012) as we have seen before. Rather, it actually did what a newline character is supposed to do -- go to the beginning of the next line.
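The distinction between rendering and representing can be sketched as follows. This assumes a current Python 3 interpreter, where print is written as a function call with parentheses rather than as the statement described above. The name text is an assumption.

```python
name = "Richard\012Baldwin"   # \012 is the octal escape for the newline

print(name)         # rendered: output moves to a new line after "Richard"
print(repr(name))   # represented: the newline appears as an escape sequence
```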
Escaping the quote character
Suppose that you are constructing a string that is surrounded by double quotes, and you want to use a pair of double quotes inside the string. If you were to simply enter the double quote when you construct the string, that quote would terminate the string.
The interactive code fragment in Figure 5 shows how to escape the double quote character -- precede it with a backslash character.
What I mean by this is that if you want to include a double quote inside a string that is surrounded by double quotes, you must enter the double quote inside the string as follows: \"
Avoiding the quote problem
Because this is such a common problem, and because the escape solution is so ugly and difficult to read, Python gives us another way to deal with quotes inside of quotes. This solution, shown in Figure 6, is the use of single and double quotes in combination.
In Python, double quotes can be included directly in strings that are surrounded by single quotes, and single quotes can be included directly in strings that are surrounded by double quotes. This is much easier to read than the solution that requires you to place a lot of backslash characters inside your string.
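Both approaches can be compared side by side in a short sketch. The sentence used as the string text is made up for illustration.

```python
# Escape each inner double quote with a backslash.
escaped = "He said \"hello\" to me"

# Or surround the string with single quotes so no escapes are needed.
mixed = 'He said "hello" to me'

# The reverse also works: an apostrophe inside double quotes.
apostrophe = "It's easy to read"

print(escaped)
print(escaped == mixed)   # prints: True
print(apostrophe)
```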
List of escape sequences
A complete list of the escape sequences supported by Python is available in the Python Reference Manual.
End the line with a backslash
As shown in Figure 7, the use of a backslash at the end of the line makes it possible to continue the string on a new line. However, the backslash is not included in the output, and there is no newline character in the output.
Not restricted to strings
Actually, the backslash can be used at the end of a line to cause that line to be continued on the next line whether inside a string or not. This is illustrated in the review section.
A form of concatenation
When used in this way with a string, the backslash at the end of the line becomes a form of string concatenation. The portions of the strings on each of the input lines are concatenated to produce a single line containing both parts of the string in the output.
I will have more to say about string concatenation later in this lesson.
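Here is a small sketch of both uses of the trailing backslash, with assumed string text and arbitrary numbers. The continuation line of the string starts in the first column on purpose, because any leading spaces on that line would become part of the string.

```python
# Backslash at the end of a line inside a string: the string continues,
# and neither the backslash nor a newline ends up in the result.
joined = "Richard \
Baldwin"
print(joined)              # prints: Richard Baldwin
print("\n" in joined)      # prints: False

# Backslash at the end of a line outside a string: the statement continues.
total = 1 + 2 + \
        3
print(total)               # prints: 6
```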
Use the \n escape sequence
As shown in Figure 8, the inclusion of "\n" inside the string produces the same result as the inclusion of the numeric representation of the newline character, "\012", shown earlier.
This is the common form of the newline escape sequence typically used in C, C++, and Java.
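That the two escape sequences name the same character can be checked directly. The name text below is an assumption.

```python
octal_form = "Richard\012Baldwin"
common_form = "Richard\nBaldwin"

print(octal_form == common_form)   # prints: True
print(ord("\n"))                   # prints: 10, which is 012 in octal
```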
Combine backslash and \n
The code in Figure 9 shows how to combine the backslash at the end of the line with a newline character placed there to cause the output to closely resemble the input.
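A sketch of that combination, again with assumed string text: the \n puts a newline into the string, and the backslash that follows lets the input continue on the next line.

```python
combined = "Richard\n\
Baldwin"

print(combined)   # the output is split across two lines, like the input
```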
Literal string concatenation
You can cause literal strings to be concatenated just by writing one adjacent to the other as shown in Figure 10.
Note that you can mix the different quote types and it doesn't matter if there is whitespace in between.
Creating whitespace
However, if you want any space between the substrings in the output, you must include that space inside the quotes that delimit the individual strings as shown in Figure 11.
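The following sketch, with assumed string text, shows that whitespace between adjacent literals is ignored while whitespace inside the quotes is kept.

```python
# Adjacent literals are joined; the quote types may differ.
no_space = "Richard" 'Baldwin'

# Extra whitespace between the literals makes no difference, but the
# space inside the first pair of quotes is part of the string.
with_space = "Richard "      'Baldwin'

print(no_space)     # prints: RichardBaldwin
print(with_space)   # prints: Richard Baldwin
```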
Using + for concatenation
The plus operator (+) can be used to concatenate strings as illustrated in Figure 11.
This fragment assigns string literal values to two variables, and then uses the plus operator to concatenate the contents of those variables with another string literal.
Of course, it could also have been used to concatenate the contents of the two variables without the string literal in between.
Whitespace is included in the quotes
Note that the string literals contain space characters. There is a space after the d in my first name and before the B in my last name. That is what I meant earlier when I said that if you want any space between the substrings in the output, you must include that space inside the quotes.
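A sketch along these lines follows. The variable names and the middle initial used as the string literal are assumptions, not taken from the original figure.

```python
first = "Richard "    # space after the d
last = " Baldwin"     # space before the B

# Concatenate two variables with a string literal in between.
full = first + "G." + last
print(full)           # prints: Richard G. Baldwin

# Concatenating only the two variables keeps both spaces.
print(first + last)
```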
Ans: The common interpretation of the word string in computer programming jargon is that a string is a sequence of characters that is treated as a unit. For example, a person's first and last names are often treated as two different strings.
2. Describe the common meaning of the word literal in your own words.
Ans: Perhaps one way to describe the meaning of the word literal would be that the literal item is taken at face value, and its value is not subject to change as the program executes.
3. Describe three different ways to format string literals (without spanning lines) and show examples.
Ans: Surround with matching pairs of single quotes, double quotes, or triple quotes as shown in Figure 12.
4. What is one of the advantages of using triple quoted strings? Show an example.
Ans: The use of triple quoted strings, as shown in Figure 13, makes it possible for you to continue a string on a new line, and to preserve the line break in the string.
5. Show two different representations of the newline character.
Ans: \012 and \n as shown in Figure 14. Of the two, the latter is probably the most commonly used, perhaps because it is easiest to remember.
6. Describe, in your own words, the purpose of an escape sequence. Show two examples.
Ans: Escape sequences are special sequences of characters used to represent other characters that either
7. Show two different ways to include a double quote character in a string.
Ans: Surround with single quotes, or use an escape character as shown in Figure 16.
8. Show the escape sequence for the tab character.
Ans: The escape sequence for the tab character is \t as shown in Figure 17.
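A brief sketch of the tab escape sequence, using made-up column text: like the newline, the tab is a single character that print renders rather than displays.

```python
columns = "Name\tLanguage"

print(columns)          # the tab is rendered as horizontal space
print(repr(columns))    # the tab is represented as \t
print(len(columns))     # the tab counts as one character: prints 13
```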
Copyright 2000, Richard G. Baldwin. Reproduction in whole or in part in any form or medium without express written permission from Richard Baldwin is prohibited.
Richard has participated in numerous consulting projects involving Java, XML, or a combination of the two. He frequently provides onsite Java and/or XML training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin's Java Programming Tutorials, which has gained a worldwide following among experienced and aspiring Java programmers. He has also published articles on Java Programming in Java Pro magazine.
Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.