This lesson and the next several lessons will concentrate on network programming.
A similar situation exists for networking. It isn't very difficult to learn how to use the Java language to implement some network operations. However, in order to achieve much depth in this area, you probably also need to know something about the many other technical aspects of networking.
Many good books have been written on the technical details of networking, and you are referred to one or more of those books to gain an in-depth knowledge of networking. In particular, I would refer you to Java Network Programming by Elliotte Rusty Harold.
In addition, there are many other books that contain excellent sections on network programming. I would recommend that you take a look at the following:
Exploring Java by Patrick Niemeyer & Joshua Peck
Just Java 1.1 and Beyond, Third Edition by Peter van der Linden
Java Primer Plus by Tyma, Torok, and Downing
Java How to Program by Deitel and Deitel
For the most part, this and the next few lessons will be restricted to how you can use the programming capabilities of Java to write and execute network programs and won't attempt to go into overall network programming in any depth. However, a minimal amount of background information will be required, so we will attempt to provide that background in this lesson. Subsequent lessons will use this background along with the network programming capabilities of Java to write some simple, but interesting networking programs.
Each of the devices on the network can be thought of as a node, and each node has a unique address. The manner in which addresses are assigned will vary from one type of network to another, but in all cases, the address of each device must be unique so as to distinguish it from the other devices.
Addresses are numeric quantities that are easy for computers to work with, but are not easy for humans to remember. Therefore, some networks also provide names that humans can more easily remember than numbers.
Modern networks transfer data using a concept known as packet switching. This means that the data are encapsulated into packets, which are transferred from the source to the destination. At the destination, it is necessary to extract the data from one or more packets and use them to reconstruct the original message.
Teaching your children to say please and thank you involves teaching them something about a protocol. If they occasionally forget to say please, however, they will probably get the cookie anyway.
If a computer protocol requires the participating computers to say please, and they forget to say please, they probably won't get the cookie.
There are many protocols available. For example, the HTTP protocol defines how web browsers and servers communicate and the SMTP protocol defines how email is transferred (we will write programs that implement part of the HTTP and SMTP protocols).
Note here that I have been discussing application protocols that operate at the surface level. We will also be making mention of lower-level protocols that operate below the application level. Fortunately, as high-level Java programmers, we don't have to be too concerned about the lower-level protocols. We'll let the systems people worry about them.
The Application Layer is the layer that delivers data to the user. The layers below that are involved with getting data from the Application Layer at one end of the conversation to the Application Layer at the other end. For the most part, we will be concerned only with the Application Layer.
In fact, in some situations, some other protocol may be used to move our data between a client and a server. As long as it works, we really don't care too much.
In a nutshell, IP is a network protocol that moves packets of data from a source to a destination. As the name implies, this is the protocol normally used on the Internet.
The Transmission Control Protocol (TCP) was added to IP to give each end of a connection the ability to acknowledge receipt of IP packets and to request retransmission of lost packets. Also, TCP makes it possible to put the packets back together at the destination in the same order in which they were sent.
Therefore, you will often hear people using both acronyms in the same breath, as in TCP/IP. The two work together to provide a reliable method of encapsulating a message into data packets, sending the packets to a destination, and reconstructing the message from the packets at the destination.
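The idea of breaking a message into packets and rebuilding it at the destination can be sketched in a few lines of Java. This is only a conceptual illustration and not how TCP is actually implemented: the PacketSketch class, its Packet type, and the eight-byte packet size are all names and values I made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PacketSketch {
    /** Hypothetical packet: a sequence number plus a slice of the message. */
    public static final class Packet {
        final int seq;
        final byte[] payload;

        Packet(int seq, byte[] payload) {
            this.seq = seq;
            this.payload = payload;
        }
    }

    /** Splits a message into packets carrying at most size bytes each. */
    public static List<Packet> split(byte[] message, int size) {
        List<Packet> packets = new ArrayList<>();
        for (int off = 0, seq = 0; off < message.length; off += size, seq++) {
            int end = Math.min(off + size, message.length);
            packets.add(new Packet(seq, Arrays.copyOfRange(message, off, end)));
        }
        return packets;
    }

    /** Rebuilds the message, whatever order the packets arrived in. */
    public static byte[] reassemble(List<Packet> received) {
        List<Packet> sorted = new ArrayList<>(received);
        sorted.sort(Comparator.comparingInt(p -> p.seq));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Packet p : sorted) {
            out.write(p.payload, 0, p.payload.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = "Knock knock. Who's there?".getBytes(StandardCharsets.UTF_8);
        List<Packet> packets = split(message, 8);
        Collections.shuffle(packets); // simulate out-of-order arrival
        System.out.println(new String(reassemble(packets), StandardCharsets.UTF_8));
    }
}
```

The sequence number is what lets the receiver restore the original order. Acknowledgment and retransmission, which TCP also provides, are left out of this sketch.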
For example, if one computer is sending date and time information to another computer every 100 milliseconds, and the data in the packets is displayed on a digital clock as it is received, you might prefer that each packet make the trip as quickly as possible even if that means that occasionally a packet will be lost or damaged.
The User Datagram Protocol (UDP) is available to support this type of operation. UDP is often referred to as an unreliable protocol because there is no guarantee that a series of packets will arrive in the right order, or that they will arrive at all.
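To make the datagram idea concrete, here is a minimal sketch that sends a single UDP packet from one socket to another on the same machine. The class name UdpLoopbackSketch, the 512-byte buffer, and the two-second timeout are my own choices for illustration. Because UDP makes no promise of delivery, the receiver uses a timeout rather than waiting forever.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpLoopbackSketch {
    /** Sends one datagram to a receiver on the loopback address and returns what arrived. */
    public static String sendAndReceive(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after two seconds

            byte[] data = text.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(data, data.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming); // throws SocketTimeoutException if nothing arrives
            return new String(incoming.getData(), 0, incoming.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("12:00:00"));
    }
}
```

Notice that no connection is established. Each DatagramPacket carries its own destination address and port, and the sender receives no confirmation that the packet arrived.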
As Java programmers, we have the choice of TCP or UDP, and we need to know enough about the characteristics of each to be able to make informed choices between them.
Every computer attached to an IP network has a unique four-byte (32-bit) address.
Thirty-two bits are sufficient to define a large number of unique addresses, but the manner in which addresses are allocated is wasteful, and many of the addresses that have been allocated are not being used.
Efforts are underway to expand the number of possible unique addresses to a much larger number. The planned number is the number of unique addresses that can be represented with a 128-bit address. Elliotte Rusty Harold reports the figure to be 1.6043703E32 in his book entitled Java Network Programming, although 2 raised to the 128th power works out to roughly 3.4E38.
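The size of an address space follows directly from its width in bits, so the figures are easy to check. The following sketch uses BigInteger because a 128-bit count will not fit in a long. The class name AddressSpace and the method name count are simply names I chose for the example.

```java
import java.math.BigInteger;

public class AddressSpace {
    /** Number of distinct addresses that fit in the given number of bits: 2^bits. */
    public static BigInteger count(int bits) {
        return BigInteger.ONE.shiftLeft(bits);
    }

    public static void main(String[] args) {
        System.out.println(count(32));  // prints 4294967296
        System.out.println(count(128)); // prints 340282366920938463463374607431768211456
    }
}
```

In other words, a 32-bit address allows a little over four billion values, while a 128-bit address allows about 3.4E38.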
For human consumption, we usually convert the value of each of the four bytes to an unsigned decimal value and display them connected by periods to make them easier to remember. For example, as near as I can tell, as of this writing, the IP address of www.javasoft.com is 204.160.241.98.
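The conversion from four bytes to the familiar dotted form is worth a quick sketch, because Java's byte type is signed and each value must be masked to be treated as unsigned. The class name DottedDecimal and the method name format are hypothetical names used only for this illustration.

```java
public class DottedDecimal {
    /** Formats address bytes as unsigned decimal values joined by periods. */
    public static String format(byte[] address) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < address.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(address[i] & 0xFF); // mask: Java bytes are signed
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] address = {(byte) 204, (byte) 160, (byte) 241, 98};
        System.out.println(format(address)); // prints 204.160.241.98
    }
}
```

Without the mask, the byte holding 204 would be displayed as -52.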
Even though we can do some tricks to make the numeric IP addresses easier to remember, humans don't do a very good job of remembering long strings of numbers. Humans remember words and names better. Therefore, most IP addresses have a corresponding name known as a domain name. The domain name for the IP address 204.160.241.98 is www.javasoft.com.
The Domain Name System (DNS) was developed to translate between IP addresses and domain names. Whenever you log your browser onto the Internet and attempt to connect to a server using its domain name, the browser first communicates with a DNS server to learn the corresponding numeric IP address. The numeric IP address (and not the domain name) is encapsulated into the data packets and used by the Internet Protocol to route those packets from the source to the destination.
We will learn how to use the Java InetAddress class to find the domain name corresponding to an IP address, and to find the IP address corresponding to a domain name.
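As a preview, here is a minimal sketch of both directions of lookup using InetAddress. The class name LookupSketch and its two method names are my own. The example uses the loopback address so that it runs without a network connection. Passing a host name such as www.javasoft.com to addressOf would cause a real DNS query, and I make no claim about what that query returns today.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LookupSketch {
    /** Returns the dotted-decimal address for a host name or a literal address. */
    public static String addressOf(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the name the resolver reports for the given raw address bytes. */
    public static String nameOf(byte[] rawAddress) {
        try {
            return InetAddress.getByAddress(rawAddress).getHostName();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A literal address is parsed locally; a host name triggers a lookup.
        System.out.println(addressOf("127.0.0.1"));
        System.out.println(nameOf(new byte[] {127, 0, 0, 1}));
    }
}
```

If the reverse lookup finds no name, getHostName() falls back to the textual form of the address, so the second line of output depends on how the local machine is configured.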
If (like me) you use a commercial Internet Service Provider (ISP), you really don't have a fixed IP address or a fixed domain name. Rather, the ISP has a block of IP addresses reserved. When you dial up the ISP and log onto the Internet, the ISP temporarily assigns an IP address to you for the duration of that connection. If you disconnect and reconnect, chances are good that you will get a different IP address for that second session.
One of my Java books refers to the IP address as being analogous to the telephone number of a company and the port to be analogous to the employee's telephone extension within that company.
Theoretically, there are 65,535 available ports. Port numbers between 1 and 1023 are predefined to be used for certain standard services. For example, if you want to connect with server software that communicates using the HTTP protocol, you would normally connect to port 80 on the server of interest.
Similarly, if you want to connect to a port that will tell you the time, you should connect to port 13. If you want to connect to a port that will simply echo whatever you send to it (usually for test purposes), you should connect to port 7. We will write Java applications that connect to all of these ports.
In the interest of brevity, I am not going to attempt to provide a list of ports. However, you should be able to find all the information you might need about port numbers and the services they support by starting your favorite WWW search engine and searching for "well known ports".
Oftentimes the proxy server will have the ability to cache web pages for limited periods of time. For example, if ten people inside the company attempt to connect to the same Internet server and download the same web page within a (hopefully) short period of time, that page may be saved on the proxy server on the first attempt and then delivered to the next nine people without re-acquiring it from the outside web server. This can significantly improve delivery time and reduce network traffic into and out of the company. It can also result in the delivery of stale pages in some cases.
These two URLs will probably provide you with enough reading material to keep you busy for a while, and will also probably provide links where you can obtain additional information.
http://www2.austin.cc.tx.us/baldwin/index.html
http://www2.austin.cc.tx.us/baldwin/
For example, as of this writing, the file named index.html on my web page at the college contains several anchors inside the file. One of those anchors is identified as KnockKnock.
If you would like to cause your browser to download the file named index.html and then go directly to the anchor where the "KnockKnock" applet is located in the file, point your browser to the following URL:
http://www2.austin.cc.tx.us/baldwin/index.html#KnockKnock
The general syntax of a URL is:
protocol://hostname[:port]/path/filename#ref
You could fill in the optional port number and use the following URL to access the KnockKnock reference on my page on port 80 (if you want to do some extra typing).
http://www2.austin.cc.tx.us:80/baldwin/index.html#KnockKnock
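The parts of the general syntax map directly onto accessor methods of the URL class, as the following sketch shows. The class name UrlParts and its parse helper are my own. I build the URL by way of a URI object, which is one of several ways to construct one. Nothing is downloaded here; the URL is only taken apart.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UrlParts {
    /** Parses the text into a java.net.URL by way of a URI. */
    public static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = parse("http://www2.austin.cc.tx.us:80/baldwin/index.html#KnockKnock");
        System.out.println("protocol: " + url.getProtocol()); // http
        System.out.println("hostname: " + url.getHost());     // www2.austin.cc.tx.us
        System.out.println("port:     " + url.getPort());     // 80
        System.out.println("path:     " + url.getPath());     // /baldwin/index.html
        System.out.println("ref:      " + url.getRef());      // KnockKnock

        // With the optional port omitted, getPort() returns -1 and
        // getDefaultPort() supplies the standard port for the protocol.
        URL bare = parse("http://www2.austin.cc.tx.us/baldwin/index.html#KnockKnock");
        System.out.println(bare.getPort() + " " + bare.getDefaultPort()); // -1 80
    }
}
```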
Generally, the two socket classes are used to implement both clients and servers, while the ServerSocket class is used only to implement servers. We will see numerous examples of socket programming in this series of lessons.
Socket programming provides a low-level approach by which you can connect two computers for the exchange of data. One of those is generally considered to be the client while the other is considered to be the server.
Although the distinction between client and server is becoming less clear each day, there is one fundamental distinction that is inherent in the Java programming language. The client initiates conversations with servers. Servers block and wait for a client to initiate a conversation.
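The distinction can be seen in a minimal sketch that plays both roles on one machine. The server blocks in accept() until a client calls, and the client initiates the conversation by constructing a Socket. The class name EchoSketch, its helper methods, and the one-line echo exchange are assumptions of mine for illustration. Later lessons develop their own programs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class EchoSketch {
    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    /** Server role: block until a client connects, then echo one line back. */
    static void serveOnce(ServerSocket listener) {
        try (Socket conn = listener.accept(); // blocks here until a client calls
             BufferedReader in = reader(conn);
             PrintWriter out = writer(conn)) {
            out.println(in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client role: initiate the conversation, send one line, return the reply. */
    public static String roundTrip(String line) {
        // Port 0 asks the system for any free port on the loopback address.
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            new Thread(() -> serveOnce(listener)).start();
            try (Socket socket = new Socket(listener.getInetAddress(), listener.getLocalPort());
                 BufferedReader in = reader(socket);
                 PrintWriter out = writer(socket)) {
                out.println(line);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Knock knock"));
    }
}
```

The server runs on its own thread only so that both ends can live in one program. Ordinarily the two roles would be separate programs on separate computers.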
The governing application-level protocol will determine what happens after the connection is made and the conversation has begun. The fact that the two computers can connect doesn't necessarily mean that they can communicate. In order to communicate, they must implement some mutually acceptable application protocol.
For example, the fact that I can dial a telephone number for a telephone located in France doesn't mean that I can communicate with the person who answers the phone. I don't know how to speak the French language. Unless the person who answers the phone speaks English, very little communication is likely to take place.
Socket programming has been around for quite a while in the Unix world. Java simply makes it easier by encapsulating much of the complexity of socket programming into classes, and allowing you to approach the task on an object-oriented basis.
According to some authors, some of the generality and capability that Unix socket programmers have enjoyed has been lost in the encapsulation process.
Basically, socket programming makes it possible for you to cause data to flow in a full-duplex mode between a client and a server. This data flow can be viewed in almost exactly the same way that we view data flow to and from a disk: as a stream of bytes.
As with most stream data processing, the system is responsible for moving the bytes from the source to the destination. It is the responsibility of the programmer to assign meaning to those bytes.
Assigning meaning takes on a special significance for socket programming. In particular, as mentioned above, it is the responsibility of the programmer to implement a mutually acceptable communication protocol at the application level to cause the data to flow in an orderly manner.
An application protocol is a set of rules by which the programs in the two computers can carry on a conversation and transfer data in the process. For example, we will write a program using the SMTP mail protocol to send an email message to someone.
We will also write a program that implements a very abbreviated form of the HTTP protocol to download web pages from a server and display them.
We will also write a program that functions as an (abbreviated) HTTP server to deliver web pages to a client and also supports the echo protocol for both TCP and UDP programming.
Each of these programs will involve adherence to a fairly simple protocol (at least the part that we implement will be fairly simple).
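To give a feel for how simple the client side of such a protocol can be, here is a sketch of the text that a minimal HTTP/1.0 request consists of. The class name HttpRequestSketch and the method name get are hypothetical, and this is not necessarily the form the programs in later lessons will take. The request is only built as a string here. Sending it would mean writing it to a socket connected to the server.

```java
public class HttpRequestSketch {
    /** Builds the text of a minimal HTTP/1.0 GET request for the given host and path. */
    public static String get(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n"; // a blank line marks the end of the request
    }

    public static void main(String[] args) {
        System.out.print(get("www2.austin.cc.tx.us", "/baldwin/index.html"));
    }
}
```

Each line ends with a carriage return and line feed. A server that follows the protocol replies with a status line, some headers, a blank line, and then the content of the page.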
In addition, we will write a program that obtains the date and time from another computer. Here the protocol will be about as simple as it can possibly be: the client will simply make the connection and listen for a string containing the date and time. This will be sort of like dialing the local time service, except that we won't have to listen to an advertisement before getting the time.
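A rough sketch of that kind of client follows. So that it can run without access to a real time server, the example starts a stand-in server of its own on a free loopback port. That stand-in, the class name DaytimeSketch, and the method names are all inventions of mine. A real client would call readTime with the address of an actual server and port 13.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class DaytimeSketch {
    /** Client: connect, say nothing, and read the one line the server sends. */
    public static String readTime(InetAddress host, int port) {
        try (Socket socket = new Socket(host, port);
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     socket.getInputStream(), StandardCharsets.US_ASCII))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in server: answers one client with the current date and time. */
    static void answerOnce(ServerSocket listener) {
        try (Socket conn = listener.accept();
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     conn.getOutputStream(), StandardCharsets.US_ASCII), true)) {
            out.println(new Date());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Starts the stand-in server on a free loopback port and queries it. */
    public static String demo() {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            new Thread(() -> answerOnce(listener)).start();
            return readTime(listener.getInetAddress(), listener.getLocalPort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client never writes anything to the socket. Making the connection is the entire request.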
The bottom line is that with socket programming, it is easy to write code that will cause a stream of bytes to flow in both directions between a client and a server. This is no more difficult than causing a stream of bytes to flow in both directions between memory and a file on a disk.
However, getting the bytes to flow is the easy part. Beyond that, you must do all of the programming to implement an application protocol that is understood by both the client and the server.
In theory, by using the URL class, you can open a connection to a resource on the web, specified by a URL object, and simply invoke the getContent() method on that URL object. The content of the resource will then be magically downloaded and will appear as an object on the client machine, even if it requires an application protocol that didn't exist when you wrote the program, and contains content that you didn't understand when you wrote the program.
This description may be a bit of an overstatement, but it is pretty close to the claims being made. This is a powerful idea, which may or may not bear fruit in the future.
If fully implemented by browsers, the idea means that you can place new and unusual material on a web site along with special content handlers and protocol handlers. Then a cooperating browser will use those special handlers to move that material from the web site to the client and interpret its content once it gets there, without a requirement to install software (such as plug-ins) on the client computer on a permanent basis.
Unfortunately, this is what Peter van der Linden has to say about this topic in his excellent book entitled Just Java 1.1 and Beyond, Third Edition (emphasis added by Baldwin).
"If a browser doesn't recognize a media type, it should be able to download the code to process it from the same place it got the file. If they ever get this working, it will be ... a good thing."
That is not to say that you couldn't use the capability right now if you were developing an intranet and wanted the clients to have access to new and unusual content. It would be necessary for you to provide the appropriate protocol and content handlers, and it would probably be necessary for the clients to run Java applications written by you instead of standard browsers to access the data.
Also, the URL class provides an alternative way to connect one computer to another and transfer data on a stream basis, so we will see some examples of retrieving data from a server by obtaining a URL connection, and then opening and servicing I/O streams between the client and the server. We will see some sample programs that make use of this technique, but we will also see that it is redundant with the socket programming approach.
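As a taste of the stream-based approach, the following sketch opens an input stream on a URL and reads it line by line. To keep the example runnable without a network connection, it reads a small temporary file through a file: URL. That substitution, the class name UrlStreamSketch, and the method names are my own assumptions. An http: URL would be read through the same openStream() call.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class UrlStreamSketch {
    /** Opens a stream on the URL and returns everything read from it as text. */
    public static String readAll(URL url) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    /** Writes a small temporary file and reads it back through a file: URL. */
    public static String demo() {
        try {
            Path page = Files.createTempFile("index", ".html");
            try {
                Files.write(page, "<html>Knock knock</html>\n".getBytes(StandardCharsets.UTF_8));
                return readAll(page.toUri().toURL());
            } finally {
                Files.deleteIfExists(page);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Once the stream is open, the code is ordinary stream processing. The URL class takes care of the protocol details that a socket-based program would have to handle itself.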
-end-