Forming Internet (IPv4) Socket Addresses
The most commonly used address family under Linux is the AF_INET family. It gives a socket an IPv4 socket address so that it can communicate with other hosts over a TCP/IP network. The structure sockaddr_in is declared in the include file named by the following C language statement:
#include <netinet/in.h>
Listing 2.7 shows the structure sockaddr_in, which is used for Internet addresses. The structure in_addr is also shown, because sockaddr_in uses it in its definition.
Example 2.7. The sockaddr_in Structure
struct sockaddr_in {
    sa_family_t    sin_family;  /* Address Family */
    uint16_t       sin_port;    /* Port number */
    struct in_addr sin_addr;    /* Internet address */
    unsigned char  sin_zero[8]; /* Pad bytes */
};
struct in_addr {
    uint32_t s_addr;            /* Internet address */
};
Listing 2.7 can be described as follows:
The sin_family member occupies the same storage area that sa_family does in the generic socket definition. The value of sin_family is initialized to the value of AF_INET.
The sin_port member defines the TCP/IP port number for the socket address. This value must be in network byte order (this will be elaborated upon later).
The sin_addr member is defined as the structure in_addr, which holds the IP number in network byte order. If you examine the structure in_addr, you will see that it consists of one 32-bit unsigned integer.
Finally, the 8-byte member sin_zero pads the structure out to a total of 16 bytes. This member does not require any initialization and is not used.
Now turn your attention to Figure 2.3 to visualize the physical layout of the address.
Figure 2.3. The physical layout of the structure sockaddr_in.
In Figure 2.3, you see that the sin_port member uses two bytes, whereas the sin_addr member uses four bytes. Both of these members show a tag on them indicating that these values must be in network byte order.
TIP
Information about IPv4 Internet addresses can be obtained by examining the ip(4) man page. On current Linux systems this page is found in section 7, as ip(7).
Understanding Network Byte Order
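Different CPU architectures have different arrangements for grouping multiple bytes of data together to form integers of 16, 32, or more bits. The two most basic byte orderings are big-endian, in which the most significant byte comes first, and little-endian, in which the least significant byte comes first.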
Other combinations are possible, but they need not be considered here. Figure 2.4 shows a simple example of these two different byte orderings.
Figure 2.4. Here is an example of the basic big- and little-endian byte ordering.
The value illustrated in Figure 2.4 is decimal value 4660, which, in hexadecimal, is the value 0x1234. Two bytes are required to represent it. You can either place the most significant byte first (big-endian) or place the least significant byte first (little-endian). The choice is rather arbitrary and boils down to the design of the CPU.
You might already know that the Intel CPU uses the little-endian byte order. Other CPUs like the Motorola 68000 series use the big-endian byte order. The important thing to realize here is that CPUs of both persuasions exist in the world and are connected to a common Internet.
What happens if a Motorola CPU writes a 16-bit number to the network and an Intel CPU receives it? "Houston, we have a problem!" The Intel CPU will interpret the bytes in the reverse order, causing it to see the value as 0x3412 in hexadecimal. This is the value 13330 in decimal, instead of 4660!
For agreement to exist over the network, it was decided that big-endian byte order would be the order used on a network. As long as every message communicated over the network obeys this sequence, all software will be able to communicate in harmony.
This brings you back to AF_INET addresses. The TCP/IP port number (sin_port) and the IP number (sin_addr) must be in network byte order. The BSD socket interface requires that you as the programmer consider this when forming the address.
Performing Endian Conversions
A few functions have been provided to help simplify this business of endian conversions. There are two directions of conversion to be considered: from host order to network order, and from network order to host order.
By "host order" what is meant is the byte ordering that your CPU uses. For Intel CPUs, this will mean little-endian byte order. Network order, as you learned earlier, is big-endian byte order.
There are also two categories of conversion functions: those that convert short (16-bit) integers and those that convert long (32-bit) integers.
The following provides a synopsis of the conversion functions that you have at your disposal:
#include <netinet/in.h>
uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);
TIP
These functions are all described in the byteorder(3) man page.
NOTE
In the context of these conversion functions, "short" refers to a 16-bit value and "long" refers to a 32-bit value.
Do not confuse these terms with what might be different sizes of the C data types. For example, a long data type on some CPUs running Linux could conceivably be 64-bits in length.
Use of these functions is quite simple. For example, to convert a short integer to network order, the following code can be used:
short host_short = 0x1234;
short netw_short;
netw_short = htons(host_short);
The value netw_short will receive the appropriate value from the conversion to network order. To convert a value from network order back into host order is equally simple:
host_short = ntohs(netw_short);
TIP
The h in the function name refers to "host," whereas n refers to "network." Similarly, s refers to "short" and l refers to "long."
Using these conventions, it is a simple matter to pick the name of the conversion function you need.
CAUTION
The byteorder(3) functions may be implemented as macros on some systems. Linux systems that run on CPUs using the big-endian byte ordering might provide a simple macro instead, because no conversion of the value is required.
Initializing a Wild Internet Address
Now you are ready to create an Internet address. The example shown here will request that the address be wild. This is often done when you are connecting to a remote service. The reason for doing this is that your host might have two or more network interface cards, each with a different IP number. Furthermore, Linux also permits the assignment of more than one IP number to each interface. When you specify a wild IP number, you allow the system to pick the route to the remote service. The kernel will then determine what your final local socket address will be at the time the connection is established.
There are also times when you want the kernel to assign a local port number for you. This is done by specifying sin_port as the value zero. The example code shown in Listing 2.8 demonstrates how to initialize an AF_INET address with both a wild port number and a wild IP number.
Example 2.8. Initializing an INADDR_ANY AF_INET Address
1: struct sockaddr_in adr_inet;
2: int adr_len;
3:
4: memset(&adr_inet,0,sizeof adr_inet);
5:
6: adr_inet.sin_family = AF_INET;
7: adr_inet.sin_port = htons(0);
8: adr_inet.sin_addr.s_addr = htonl(INADDR_ANY);
9: adr_len = sizeof adr_inet;
The steps used in Listing 2.8 are as follows:
The value adr_inet is defined using the structure sockaddr_in (line 1).
The address adr_inet is zeroed by calling memset(3) in line 4. (This is optional.)
The address family is established by assigning the value AF_INET to adr_inet.sin_family (line 6).
A wild port number is specified in line 7. Notice the use of the function htons(3) to convert the value from host order to network order. The value zero indicates a wild port number.
A wild IP number is assigned in line 8. Again, note the use of the htonl(3) function to perform the endian conversion.
The size of the address is simply computed as the size of the structure adr_inet (line 9).
Another commonly used IP number is 127.0.0.1. This refers to the loopback device, which lets you communicate with another process on the same host as your process. You'll see more of this IP number later. For now, just note how the address can be assigned. Line 8 of Listing 2.8 could be changed to the following statement:
adr_inet.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
This will address your current host through the loopback device. In the next section, you will learn how to set up any IP number and port number.
Initializing a Specific Internet Address
The previous section dealt with a simple case for AF_INET addresses. Things get more complicated when you want to establish a specific IP number in the address. Listing 2.9 shows a complete program that you can compile with the following command:
$ make af_inet
Then, just invoke the compiled program by the name af_inet.
Example 2.9. af_inet.c—Establishing a Specific AF_INET Address
 1: /* af_inet.c:
 2:  *
 3:  * Establishing a Specific AF_INET
 4:  * Socket Address:
 5:  */
 6: #include <stdio.h>
 7: #include <unistd.h>
 8: #include <stdlib.h>
 9: #include <errno.h>
10: #include <string.h>
11: #include <sys/types.h>
12: #include <sys/stat.h>
13: #include <sys/socket.h>
14: #include <netinet/in.h>
15:
16: /*
17:  * This function reports the error and
18:  * exits back to the shell:
19:  */
20: static void
21: bail(const char *on_what) {
22:     perror(on_what);
23:     exit(1);
24: }
25:
26: int
27: main(int argc,char **argv,char **envp) {
28:     int z;                       /* Status return code */
29:     int sck_inet;                /* Socket */
30:     struct sockaddr_in adr_inet; /* AF_INET */
31:     int len_inet;                /* length */
32:     const unsigned char IPno[] = {
33:         127, 0, 0, 23            /* Local loopback */
34:     };
35:
36:     /* Create an IPv4 Internet Socket */
37:     sck_inet = socket(AF_INET,SOCK_STREAM,0);
38:
39:     if ( sck_inet == -1 )
40:         bail("socket()");
41:
42:     /* Create an AF_INET address */
43:     memset(&adr_inet,0,sizeof adr_inet);
44:
45:     adr_inet.sin_family = AF_INET;
46:     adr_inet.sin_port = htons(9000);
47:     memcpy(&adr_inet.sin_addr.s_addr,IPno,4);
48:     len_inet = sizeof adr_inet;
49:
50:     /* Now bind the address to the socket */
51:     z = bind(sck_inet,
52:         (struct sockaddr *)&adr_inet,
53:         len_inet);
54:
55:     if ( z == -1 )
56:         bail("bind()");
57:
58:     /* Display all of our bound sockets */
59:     system("netstat -pa --tcp 2>/dev/null | "
60:         "sed -n '1,/^Proto/p;/af_inet/p'");
61:
62:     close(sck_inet);
63:     return 0;
64: }
The steps used in this program are almost identical to the others shown in Listings 2.3 and 2.5. The lines that set up the address (lines 30 to 48), however, require some explanation:
Line 30 defines the sockaddr_in structure with the name adr_inet. Additionally, the socket address length is defined as an integer in line 31 as len_inet.
An unsigned character array is defined as IPno[4] in lines 32 to 34. Here the individual bytes spell out the specific IP address 127.0.0.23.
Line 43 zeros out adr_inet as usual. Note that, again, this is optional.
Line 45 establishes the address family as AF_INET.
This example chose to establish a TCP/IP port number 9000 in line 46. Note the use of the conversion function htons(3) in line 46.
The character array IPno[4] is copied to the location adr_inet.sin_addr.s_addr in line 47. Because the bytes were already defined in network order in lines 32 to 34, there is no endian conversion required here. You will recall that network byte ordering has the most significant byte presented first.
The size of the address structure is computed as before (line 48).
You might have noticed that Internet addresses have a fixed length. If you review Figure 2.3, this is readily apparent. However, you will remember that the AF_LOCAL address was variable in length (refer to Figure 2.2). For AF_INET addresses, you merely need to supply the size of the socket structure sockaddr_in. In C language terms, this is
sizeof(struct sockaddr_in)
You should be well equipped now for forming Internet IPv4 addresses. To broaden your knowledge on socket addressing, the next sections will show you how some other address families can be specified.