Network programming

Network programming refers to all those operations that happen behind the scenes and involve cooperation between multiple applications running on multiple machines. These are the programs the user does not see, but which do the real work on the Internet and other networked environments.

The workhorse of network programming is the server. A server is a program that provides services to other programs. A server waits for a request from another program, decodes that request, and sends a response. Servers run unattended for days and weeks, so they must be robust.

The program requesting a service from a server is a client. A client issues a request and waits for a response from the server.

A middleware application is a program that acts as an intermediary between two other programs. Generally, a middleware program adds value to the transaction. An active content server that receives requests from a web server and queries a database to fill the request is acting as middleware. It manages the communication between the web server and the database server and adds value by adding HTML tagging to the database results.

A middleware application acts as both a client and a server. Multi-tier architectures are possible in which multiple middleware applications broker and add value to communications across a network. Though they sound exotic, servers and middleware applications are easy to program in OmniMark using the connectivity libraries.

When embarking on a network programming project, you will need to know a little bit about protocols. A protocol is simply an agreement between a client and a server about how they will communicate. If you use a common published protocol, or publish your own protocol, you can enable any number of clients to communicate with your server. On the other hand, if you keep your protocol private and encrypted you can help to secure your server against intrusion.

There are two important types of protocol you need to know about:

Transport protocols
Application protocols

Transport protocols are used to actually get messages across the network from one machine to another in good order. TCP/IP is the transport protocol used on the Internet, and supported by OmniMark's network libraries and the service and connection data types.

An application protocol is an agreement about what constitutes a message and what the message means. While disk files have natural ends, a network message is just a stream of bytes over an open connection. You have to look at the data itself to determine if you have found the whole message. The OmniMark TCP library supports all the common methods of delimiting a message.
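To make the idea of delimiting concrete, here is a minimal sketch in Python rather than OmniMark. It does not show how the OmniMark TCP library works internally. The function name split_messages and the choice of a single line-feed delimiter are assumptions made for illustration.

```python
def split_messages(buffer: bytes, delimiter: bytes = b"\n"):
    """Return (complete_messages, leftover) for the bytes received so far.

    Everything before a delimiter is a complete message. Whatever follows
    the last delimiter is an incomplete message that must wait for more data.
    """
    *complete, leftover = buffer.split(delimiter)
    return complete, leftover


# Two complete messages followed by the start of a third.
messages, leftover = split_messages(b"Mary\nTom\nJa")
print(messages, leftover)
```

The leftover bytes are the reason a receiver has to inspect the data itself: until another delimiter arrives, there is no way to know whether "Ja" is a whole message.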

Once you have a complete message you must decode it to see what it means, and then generate and send the appropriate response. OmniMark is the ideal language for decoding network protocols. Its streaming features make it very easy to interpret a message and to formulate a response quickly.

A simple server

The following is a simple OmniMark server program. This server returns the first line of a nursery rhyme when it receives a message naming the principal character of that rhyme.

  declare catch server-die
  import "omtcp.xmd" prefixed by tcp.
  
  process
     local tcp.service my-service
  
     set my-service to tcp.create-service on 5432
     repeat
        local tcp.connection my-connection
  
        set my-connection to tcp.accept-connection from my-service
        using output as tcp.writer of my-connection
           submit tcp.reader of my-connection protocol tcp.end-delimited "%10#"
  
        catch #external-exception
           identity catch-id
           message catch-msg
           location catch-loc
           put #error '%g(catch-id) : %g(catch-msg)%n%g(catch-loc)%n'
     again
     catch server-die
  
  find "Mary" "%13#"? =|
     output "Mary had a little lamb%13#%10#"
  
  find "Tom" "%13#"? =|
     output "Tom, Tom, the piper's son.%13#%10#"
  
  find "die" "%13#"? =|
     output "Argh!%13#%10#"
     throw server-die

A server operates rather like a telephone. First we place it in service by assigning it a telephone number. Then it must wait for a call. When a call comes it must answer it, listen to the message, and make an appropriate response. The conversation may consist of a single exchange, or of multiple exchanges. When the conversation is over, it hangs up and goes back to waiting for the next call.

The essential operation of a server, then, comes down to three things:

Start up: put the server in service
Request loop: wait for calls, respond, and repeat
Shut down: take the server out of service

Because it runs for a long time and has to handle many requests, a server has two overriding performance requirements:

No matter what happens while servicing a request, the server must not crash. It must stay running.
No matter what happens while servicing a request, the server must always return to a consistent ready state when the request is complete. If the server was in a different state for each request, its responses would not be reliable.

Let's look at how our sample server meets these requirements, line by line:

  process 
     local tcp.service my-service
     set my-service to tcp.create-service on 5432

This is the code that puts the server in service. It uses a service data type to establish a service on port 5432 of the machine it is running on. The server's address (its phone number) will be the machine's network address combined with the port number. Many different servers can run on the same machine using different ports.
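The relationship between a machine address and a port can be sketched with ordinary sockets. The following Python fragment is an analogy only, not the OmniMark API. The helper name create_service is hypothetical, and port 0 is used so that the operating system picks any free port.

```python
import socket


def create_service(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind a listening TCP socket on host:port (port 0 means any free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen()
    return listener


# Two services on the same machine, told apart only by their port numbers.
first = create_service(0)
second = create_service(0)
print(first.getsockname(), second.getsockname())
first.close()
second.close()
```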

  repeat
     local tcp.connection my-connection
     ...
     set my-connection to tcp.accept-connection from my-service
  again

This is the code that listens for an incoming call. accept-connection waits for a client to connect. When it receives a connection, it returns a value which represents the connection to the client. The connection variable "my-connection" is declared inside the repeat loop so that it will go out of scope at the end of the loop, providing automatic closure and cleanup of the connection.

  repeat
     local tcp.connection my-connection
     ...
     set my-connection to tcp.accept-connection from my-service
     ...
        submit tcp.reader of my-connection protocol tcp.end-delimited "%10#"
  again

A connection value provides an OmniMark source so that data can be read from the client. Reading data from a network connection, however, is different from reading from a file. A file is opened either for reading or for writing, and it has a natural end. A network connection, like a telephone connection, is two-way and stays open, so there could always be more characters coming. This means that OmniMark cannot detect the end of a message on a network connection the way it detects the end of a file. For this reason, all network data communication requires a specific application protocol for determining the end of a message. OmniMark supports the common methods of delimiting a message through the I/O protocols provided in the TCP library.

In this case we are using a line-based protocol. In our request protocol, the end of a message is signaled by a line-end combination (ASCII 13, 10). Because that combination always ends in ASCII 10, it can be recognized by using the end-delimited protocol with the delimiter set to "%10#". We submit data from that source to our find rules, which analyze the message and generate the appropriate response.
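What an end-delimited read has to do can be sketched in Python. This is an illustration of the general technique under stated assumptions, not a description of how tcp.end-delimited is implemented. The function name read_message is hypothetical, and a local socket pair stands in for a real client connection.

```python
import socket


def read_message(conn: socket.socket, delimiter: bytes = b"\n") -> bytes:
    """Read one message, excluding the delimiter.

    Reads a byte at a time so that nothing belonging to the next message
    is consumed. Stops early if the peer closes the connection.
    """
    data = bytearray()
    while not data.endswith(delimiter):
        chunk = conn.recv(1)
        if not chunk:          # connection closed before a delimiter arrived
            break
        data += chunk
    if data.endswith(delimiter):
        del data[-len(delimiter):]
    return bytes(data)


client_end, server_end = socket.socketpair()
client_end.sendall(b"Mary\r\n")
print(read_message(server_end))   # the carriage return is left for the caller
client_end.close()
server_end.close()
```

Note that the delimiter is only the line feed, so any preceding carriage return is still part of the message when it reaches the code that interprets it.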

  repeat
     ...
     using output as tcp.writer of my-connection
        submit tcp.reader of my-connection protocol tcp.end-delimited "%10#"
     ...
  again

Our my-connection represents a two-way network connection. Not only must we get a source from it to read data, we must also attach an output stream to it so that we can send data over the connection to the client. We do this with the tcp.writer function.

Once our submit is prefixed with using output as tcp.writer of my-connection, our find rules are reading from and writing to the network connection.

  find "Mary" "%13#"? =|
     output "Mary had a little lamb%13#%10#"

  find "Tom" "%13#"? =|
     output "Tom, Tom, the piper's son.%13#%10#"

Ours is a line-based protocol, but line ends are different on different platforms (13,10 on Windows, 10 on UNIX). Across a network, which can include machines from different platforms, we have to pick one for ourselves. Our protocol specifically requires 13, 10. But for matching purposes, we use "%10#" as the delimiter with an optional preceding "%13#" so that even if the client forgets to send the appropriate line end sequence, we can still read the message. When we send, however, we explicitly send "%13#%10#" rather than "%10#". In this we are following an important maxim of network programming: be liberal in what you accept, conservative in what you send.
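The same maxim can be sketched outside OmniMark. In the following Python fragment the request arrives with its line-feed delimiter already removed, an optional carriage return is tolerated on input, and every reply ends with an explicit 13, 10 pair. The names RHYMES and respond are hypothetical, and returning nothing for an unknown name is an assumption of this sketch, not a statement about the OmniMark server.

```python
RHYMES = {
    b"Mary": b"Mary had a little lamb",
    b"Tom": b"Tom, Tom, the piper's son.",
}


def respond(line: bytes) -> bytes:
    """Build the reply for one request line (LF delimiter already stripped)."""
    # Liberal in what we accept: the carriage return may or may not be there.
    if line.endswith(b"\r"):
        line = line[:-1]
    reply = RHYMES.get(line)
    if reply is None:
        return b""             # assumption: unknown names get no reply
    # Conservative in what we send: always the full CR LF pair.
    return reply + b"\r\n"


print(respond(b"Mary\r"))
print(respond(b"Mary"))
```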

  find "die" "%13#"? =|
     output "Argh!%13#%10#"
     throw server-die

This is the find rule that detects the poison pill message. To ensure an orderly shutdown, we provide a method of terminating our server by sending it a message to shut itself down. (In a production system, you might want to pick a slightly less obvious message for the poison pill.)

Shutting down the server is an exception to normal processing. We accomplish it by initiating a throw to a catch named server-die.

  process
     ...
     repeat
        ...
     again
     catch server-die

We catch the throw to server-die after the end of the server loop. OmniMark cleans up local scopes on the way, ensuring a clean and orderly shutdown. We are at the end of the process rule now, so the program exits normally.
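The control flow of the poison pill can be sketched with an ordinary exception in Python. This is an analogy to throw and catch, not OmniMark code. The names ServerDie, handle, and serve are hypothetical, and a list of strings stands in for incoming connections.

```python
class ServerDie(Exception):
    """Raised by a request handler to ask the server loop to stop."""


def handle(request: str) -> str:
    if request == "die":
        raise ServerDie
    return f"handled {request}"


def serve(requests):
    handled = []
    try:
        for request in requests:     # stands in for the accept loop
            handled.append(handle(request))
    except ServerDie:
        pass                         # caught outside the loop, so the loop ends
    return handled                   # normal exit after an orderly shutdown


print(serve(["Mary", "Tom", "die", "never reached"]))
```

Because the handler for ServerDie sits outside the loop, anything local to the loop body is released before the handler runs, which mirrors the cleanup of local scopes described above.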

Error and recovery

A server needs to stay running despite any errors that occur in servicing a particular request. On the other hand it should shut down if it cannot run reliably. The following code provides for both these situations:

  catch #external-exception
     identity catch-id
     message catch-msg
     location catch-loc
     put #error '%g(catch-id) : %g(catch-msg)%n%g(catch-loc)%n'

If there is an error in processing a request, OmniMark initiates a throw to #external-exception. We catch the throw at the end of the server loop. This provides for an automatic cleanup of any resources in use in servicing the request in progress, and ensures that the server returns to its stable ready state. (No attempt is made to rescue the specific request in which the error occurred. In a production server you would want to provide such error recovery, but make sure you always have a fallback that aborts the current request and returns to a stable ready state.)
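The pattern of catching errors inside the loop, so that one bad request cannot stop the server, can be sketched in Python as follows. The names handle and serve are hypothetical, the deliberately failing empty request is an assumption made for illustration, and a list of strings again stands in for incoming connections.

```python
import sys


def handle(request: str) -> str:
    if not request:
        raise ValueError("empty request")
    return request.upper()


def serve(requests):
    replies = []
    for request in requests:
        try:
            replies.append(handle(request))
        except Exception as error:
            # Report the failure, abandon this request, and carry on.
            print(f"{type(error).__name__} : {error}", file=sys.stderr)
    return replies


print(serve(["mary", "", "tom"]))
```

The failed request is logged and dropped, and the next iteration starts from the same state as every other iteration.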

This simple server program has the essential structure of a robust and usable production server. You would need to adapt the code to the protocol you are using, but apart from that, once input and output are bound to the connection, everything else is just regular OmniMark programming.

A simple client

Any client program, written in any language, can use our server as long as it knows the protocol. Here is a simple client written in OmniMark:

  import "omtcp.xmd" prefixed by tcp.
  
  process
     local tcp.connection my-connection
  
     set my-connection to tcp.connect to "localhost" on 5432
  
     using output as tcp.writer of my-connection
        output #args[1] || "%13#%10#"
  
     output tcp.reader of my-connection

This client is called with the name of the nursery rhyme character on the command line and prints out the line it receives from the server. Let's go through it line by line:

  set my-connection to tcp.connect 
                   to "localhost" on 5432

Like the server program, the client uses an instance of the connection data type to create a connection. Unlike the server, it does not require a service component, as it is not establishing a service, but simply making a connection to a service established elsewhere. The client takes a more active role than the server, however. While the server waits for a call, the client must take the initiative and make a call. It does this with the connect function. The connect function takes the network address and port number of a server, and when the connection is made it returns a connection value, which we can write to and read from just as we did in the server program.

  output tcp.reader of my-connection

When we read the data returned from the server we actually have two choices. Since ours is a line-based protocol, we could use the protocol parameter of reader to read the response. But we also know that the server will drop the connection as soon as it has finished sending data. (This behavior is part of our protocol as well.) So we choose to keep reading data until the connection is dropped. This way we will get at least partial data even if something goes wrong and the server never sends the end of line. Be conservative in what you send and liberal in what you accept.
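Reading until the connection is dropped can be sketched in Python like this. It is an illustration of the technique, not of tcp.reader. The function name read_until_closed is hypothetical, and a local socket pair stands in for the connection to the server.

```python
import socket


def read_until_closed(conn: socket.socket) -> bytes:
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:          # an empty read means the peer has hung up
            break
        chunks.append(chunk)
    return b"".join(chunks)


server_end, client_end = socket.socketpair()
server_end.sendall(b"Mary had a little")   # the line end is never sent
server_end.close()
print(read_until_closed(client_end))
client_end.close()
```

Even though the reply is cut short, the client still receives the partial data, which a strictly line-delimited read would have waited for indefinitely or discarded.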

Clients for common servers

Most of the client programs you write in OmniMark will probably be for well-known servers such as HTTP (Web), FTP, or SMTP/POP (Mail). OmniMark's connectivity libraries provide direct support for these and other common protocols, greatly simplifying the task of retrieving data from these servers.
