Python Network Programming
by Sebastian V. Tiponut
Technical University Timisoara
Version 0.00, 16. July 2001

Contents

1 Introduction
2 Basic socket usage
  2.1 Creating a socket
  2.2 Connecting a socket and data transfer
  2.3 Binding a name to socket
  2.4 Listening and accepting connections
  2.5 UDP sockets
  2.6 Closing the socket
  2.7 Using functions provided in socket module
    2.7.1 Functions based on resolver library
    2.7.2 Service-related functions
    2.7.3 Miscellaneous functions
3 Basic network structures design
  3.1 Designing a TCP server
  3.2 The TCP client
  3.3 Modeling datagram applications
4 Advanced topics on servers
  4.1 Building a pristine environment
  4.2 Handling multiple connections
    4.2.1 Threaded servers
    4.2.2 Using select
    4.2.3 Fork servers
  4.3 Dealing with classes
    4.3.1 Simple connection object
    4.3.2 Applying a design pattern
  4.4 Advanced aspects concerning clients
5 HTTP protocol
  5.1 CGI module
    5.1.1 Build a simple CGI script
    5.1.2 Using CGI module
    5.1.3 Configuring Apache on Linux for using with CGI scripts
6 Common protocols
  6.1 Designing Telnet applications
  6.2 File Transfer Protocol
  6.3 SMTP protocol
7 TO DOs

List of Figures

1 TCP connection
2 UDP connection
3 Threaded server diagram
4 Fork server diagram
5 Designing a TCP connection with state pattern

1 Introduction

Network programming is a buzzword now in the software world. We see the market filled with an avalanche of network-oriented applications such as database servers, games, Java servlets and applets, CGI scripts, different clients for any imaginable protocol, and the examples may continue. Today, more than half of the applications that hit the market are network oriented. Data communication between two machines (on a local net or the Internet) is no longer a curiosity but a day-to-day reality. "The network is the computer," says the Sun Microsystems motto, and they are right. The computer is no longer seen as a separate entity, conversing only with its human operator, but as
part of a larger system, the network, bound via data links to thousands of other machines.

This paper presents a possible way of designing network-oriented applications using Python. Because the author is a Linux fan, the examples contained in this paper are related to Linux^1, and he apologizes to all the Windows or Mac OS users (fans?) for any inconvenience in reading this text. With a little effort, the examples are portable to other, non-UNIX operating systems. As a quick overview of the structure of this paper, the first four sections deal with the primitive design of network applications at socket level. The remaining sections treat specific protocols like HTTP, FTP, Telnet or SMTP. The section dealing with HTTP will contain a subsection about writing CGI scripts and using the cgi module.

Going further to more concrete subjects, we are going to analyze the possibilities for network programming provided in Python. Raw network support is implemented in Python through the socket module, which comprises mostly the system calls, functions and constants defined by the 4.3BSD Interprocess Communication facilities (see [1]), implemented in object-oriented style. Python offers a simple interface (much simpler than the corresponding C implementation, though based on it) to properly create and use a socket. Primarily, the socket() function is defined, returning a socket object^2. The socket has several methods corresponding to their counterparts from the C sys/socket.h, like bind(), connect(), listen() or accept(). Programmers accustomed to socket usage under the C language^3 will find it very easy to translate their knowledge to the easier-to-use socket implementation under Python. Python eliminates the daunting task of filling structures like sockaddr_in or hostent and eases the use of the previously mentioned methods and functions: parameter passing and function calls are easier to handle. Some network-oriented functions are provided too: gethostbyname(), getprotobyname() or the conversion functions ntohl(), htons(), useful when
converting integers to and from network format. The module provides constants like SOMAXCONN and INADDR_*, and the constants used in getsockopt() or setsockopt() calls. For a complete list of the above-mentioned constants check your UNIX documentation on socket implementation.

Besides socket, Python provides additional modules (in fact there is a whole bundle of them) supporting the most common network protocols at user level. For example, we may find useful modules like httplib, ftplib, telnetlib, smtplib. Support for CGI scripting is implemented through the cgi module; there are also a module for URL parsing, classes describing web servers, and the examples may continue. These modules are specific implementations of well-known protocols, the user being encouraged to use them rather than trying to reinvent the wheel. The author hopes that the user will enjoy the richness of Python's network programming facilities and use them in new and more exciting ways.

Because all the examples below are written in Python, the reader is expected to be fluent in this programming language.

^1 And to other *NIX systems that are POSIX compliant.
^2 From here on we call this simply a socket.
^3 The 4.3BSD IPC implementation, found on most UNIX flavors.

2 Basic socket usage

The socket is the basic structure for communication between processes. A socket is defined as "an endpoint of communication to which a name may be bound" [1]. The 4.3BSD implementation defines three communication domains for a socket: the UNIX domain for on-system communication between processes; the Internet domain for processes communicating over the TCP(UDP)/IP protocols; and the NS domain used by processes communicating over the old Xerox communication protocol.

Python uses only^4 the first two communication domains: the UNIX and Internet domains, the AF_UNIX and AF_INET address families respectively. UNIX domain addresses are represented as strings naming a local path, for example /tmp/sock. This can be a socket created by a local process or, possibly, created by a foreign process. The Internet domain addresses are represented as
a (host, port) tuple, where host is a string representing a valid Internet hostname, say matrix.ee.utt.ro, or an IP address in dotted decimal notation, and port is a valid port between 1 and 65535^5. It is useful to make a remark here: instead of a qualified hostname or a valid IP address, two special forms are provided: an empty string is used instead of INADDR_ANY and the string '<broadcast>' instead of INADDR_BROADCAST.

Python offers all five types of sockets defined in the 4.3BSD IPC implementation. Two seem to be generally used in the vast majority of new applications. A stream socket is a connection-oriented socket whose underlying communication support is the TCP protocol, providing a bidirectional, reliable, sequenced and unduplicated flow of data. A datagram socket is a connectionless communication socket, supported through the UDP protocol. It offers a bidirectional data flow, without being reliable, sequenced or unduplicated. A process receiving a sequence of datagrams may find duplicated messages or, possibly, messages in a different order from the one in which the packets were sent. The raw, sequenced-packet and reliably-delivered-message socket types are rarely used. The raw socket type is needed when an application requires access to the most intimate resources provided by the socket implementation. Our document focuses on stream and datagram sockets.

2.1 Creating a socket

A socket is created through the socket(family, type, proto) call; family is one of the above-mentioned address families, AF_UNIX or AF_INET; type is represented through the following constants: SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_SEQPACKET and SOCK_RDM. The proto argument is optional and defaults to 0. We see that the socket() function returns a socket in the specified domain with the specified type. Because the constants mentioned above are contained in the socket module, all of them must be used with the socket.CONSTANT notation when the module is loaded with import socket. Otherwise the interpreter will raise a NameError. To create a stream socket in the Internet domain we use the following
line:

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

Substituting socket.SOCK_STREAM with socket.SOCK_DGRAM we create a datagram socket in the Internet domain. The following call will create a stream socket in the UNIX domain:

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

So far we have discussed obtaining sockets of different types in different communication domains.

^4 The Xerox Network System is no longer used.
^5 Binding to a port below 1024 requires root privileges.

2.2 Connecting a socket and data transfer

A server, from our point of view, is a process which listens on a specified port. We may call the association (port, process) a service. When another process wants to meet the server or use a specific service, it must connect itself to the address and port number specified by the server. This is done by calling the socket method connect(address), where address is a (host, port) pair in the Internet domain and a pathname in the UNIX domain. When using the Internet domain, a connection is made with the following code:

    sock.connect(('localhost', 8000))

while in the UNIX domain,

    sock.connect('/tmp/sock')

If the service is unavailable or the server doesn't want to talk with the client process, a socket.error (111, 'Connection refused') is raised. Otherwise, after the connection is established with the desired server, data is sent and received with the send(data, flags) and recv(bufsize, flags) methods. send() takes as its mandatory parameter the data to transmit, and recv() the maximum number of bytes to receive; both accept some optional flags. For a description of the meaning of the flags consult the UNIX man page for the corresponding function^6.

2.3 Binding a name to socket

The socket, after creation, is nameless, though it has an associated descriptor. Before it can be used it must be bound to a proper address, since this is the only way a foreign process may reference it. The bind(address) method is used to "name" a socket. The meaning of the address is explained above. The next call will bind a socket in the Internet domain to an address composed from
hostname localhost and port number 8000:

    sock.bind(('localhost', 8000))

Please take care when typing: there are indeed two pairs of parentheses. Otherwise the interpreter will raise a TypeError. The purpose of the two pairs of parentheses is simple: address is a tuple containing a string and an integer. The hostname must be properly picked; the best method is to use the gethostname() routine in order to assure host independence and portability^7. Creating a socket in the UNIX domain uses address as a single string, naming a local path:

    sock.bind('/tmp/sock')

This will create the /tmp/sock socket file, which will be used for communication between the server and client processes. The user must have read/write permissions in the directory where the socket is created, and the file itself must be deleted once it is no longer of interest.

^6 See "man 2 recv" or "man 2 send".
^7 Don't bind a real hostname to a socket unless you do it for testing purposes only; if you do so the program will run solely on the particular system whose hostname was bound to the socket.

2.4 Listening and accepting connections

Once we have a socket with a proper name bound to it, the next step is calling the listen(queue) method. It instructs the socket to listen passively on the bound port. listen() takes as parameter an integer representing the maximum number of queued connections. This argument should be at least 1; the maximum is system-dependent (usually 5). So far we have a socket with a properly bound address. When a connection request arrives, the server decides whether it will be accepted or not. Accepting a connection is done through the accept() method. It takes no parameters but returns a tuple (clientsocket, address), where clientsocket is a new socket the server uses to communicate with the client and address is the client's address. accept() normally blocks until a connection is made. This behavior can be overridden by running the method in a separate thread, collecting the newly created socket descriptors in a list and processing them in order. Meanwhile, the server
can do something else. The above-mentioned methods are used as follows:

    sock.listen(5)
    clisock, address = sock.accept()

The code instructs the socket to listen with a queue of five connections and to accept an incoming "call". As you can see, accept() returns a new socket that will be used for further data exchange. Using the chain bind-listen-accept we create TCP servers. Remember, a TCP socket is connection-oriented; when a client wants to speak to a particular server it must connect itself, wait until the server accepts the connection, exchange data, then close. This models a phone call: the client dials the number, waits till the other side establishes the connection, speaks, then hangs up.

2.5 UDP sockets

We chose to deal with connectionless sockets separately because these are less common in day-to-day client/server design. A datagram socket is characterized by a connectionless and symmetric message exchange. Server and client exchange data packets, not data streams, with packets flowing between client and server separately. UDP communication resembles the postal system: each message is encapsulated in an envelope and received as a separate entity. A large message may be split into multiple parts, each one delivered separately (possibly out of order, duplicated and so on). It is the receiver's duty to assemble the message.

The server^8 uses the bind() method to assign a proper name and port. There are no listen() and accept() calls, because the server is not listening and does not accept connections. Basically, we create a P.O. Box where it is possible to receive messages from client processes. Clients only send packets, data and address being included in each packet.

Data packets are sent and received with the sendto(data, address) and recvfrom(bufsize, flags) methods. The first method takes as parameters a string and the server address, as explained above for connect() and bind(). Because the remote end of the socket is specified, there is no need to connect it. The second method is similar to recv(), but it returns a (data, address) pair that also identifies the sender.

2.6 Closing the socket

After the socket is no
longer used, it must be closed with the close() method. When a user is no longer interested in any pending data, a shutdown may be performed before closing the socket. The method is shutdown(how), where how is: 0 if no more incoming data will be accepted, 1 to disallow sending data, and 2 to prevent both sending and receiving of data. Remember: always close a socket after using it.

^8 Which process is the server and which is the client is hard to predict at socket level, because of the symmetrical connection.

2.7 Using functions provided in socket module

It was mentioned before that the socket module contains some functions useful in network design. These functions are related to the resolver libraries, the /etc/services or /etc/protocols mapping files, or conversions of quantities.

2.7.1 Functions based on resolver library

There are three functions using BIND8 or whatever resolver you may have. These functions usually convert a hostname to an IP address or an IP address to a hostname. One function is not related to the resolver, gethostname(); this function retu