Advanced Python Programming

From Grundy
Revision as of 06:45, 6 April 2020 by Parth1811 (talk | contribs) (→‎Metaclasses)

Many developers think that they know enough about a language, but there is always something new to learn. This article will help you learn some advanced concepts in Python.

This tutorial assumes that you know your way around the basics and are familiar with commonly used libraries.

Socket Programming

Socket programming is the fundamental technology behind communications on TCP/IP networks. A socket is one endpoint of a two-way link between two programs running on a network. The socket provides a bidirectional communication endpoint to send and receive data with another socket. Socket connections normally run between two different computers on a local area network (LAN) or across the internet, but they can also be used for interprocess communication on a single computer.

In any given connection, one of the sockets is treated as a client, while the other is treated as a server. Socket endpoints on TCP/IP networks each have a unique address that is the combination of an IP address and a TCP/IP port number. Because the socket is bound to a specific port number, the TCP layer can identify the application that should receive the data sent to it. Here is a typical lifecycle of a socket.

Lifecycle of a Socket

[Figure: Life cycle of a TCP socket]

The client opens a socket connection with the server, but only if the server program is currently running. To test this yourself, you will need two terminal windows open at the same time.

Next, the client sends some data to the server: I am CLIENT

Then the client receives some data it anticipates from the server.

Done! You can now get started streaming data between clients and servers using some basic Python network programming.
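The exchange described above can be sketched in a single script. This is only an illustration under stated assumptions: the server runs in a background thread instead of a second terminal, the reply text and the helper names serve_once and demo_lifecycle are made up for this example, and port 0 asks the operating system for any free port.

```python
import socket
import threading


def serve_once(server_sock):
    # accept one client, read its message, answer, then hang up
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"I am SERVER, got: " + data)


def demo_lifecycle():
    # server side: create -> bind -> listen
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()

    # client side: create -> connect -> send -> receive -> close
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect(("127.0.0.1", port))
        client.sendall(b"I am CLIENT")
        reply = client.recv(1024)

    worker.join()
    server_sock.close()
    return reply.decode()
```

Calling demo_lifecycle() walks through every stage of the life cycle once and returns the text the client received.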

Creating a Socket

import socket

# create an INET, STREAMing socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# now connect to the web server on port 80 - the normal http port
s.connect(("www.google.com", 80))

Here we made a socket instance and passed it two parameters. The first parameter is AF_INET and the second one is SOCK_STREAM. AF_INET refers to the IPv4 address family. SOCK_STREAM selects the connection-oriented TCP protocol.

In the next line, the connect function tries to connect to the server which is located at "www.google.com". When the connect completes, the socket s can be used to send in a request for the text of the page. The same socket will read the reply, and then be destroyed. That’s right, destroyed. Client sockets are normally only used for one exchange (or a small set of sequential exchanges).

Note: The domain "www.google.com" corresponds to a public IP address, which is resolved by a DNS server. If you used the IP address of the website directly, the result would be the same.
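The name lookup mentioned in the note can also be done explicitly. The following is a minimal sketch using socket.gethostbyname from the standard library; the helper name resolve is an assumption made for this example.

```python
import socket


def resolve(host):
    # gethostbyname returns the IPv4 address as a dotted string;
    # a string that already is an IPv4 address comes back unchanged
    return socket.gethostbyname(host)

# resolve("www.google.com") needs network access and a working DNS
# resolver, and the address it returns varies by time and location
```

Passing the resolved address to connect() instead of the host name should reach the same server, since connect() performs the same lookup internally when it is given a name.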

Writing a Server Socket

A server has a bind() method which binds it to a specific IP and port so that it can listen to incoming requests on that IP and port. Its listen() method puts the server into listening mode, which allows it to queue incoming connections. Finally, a server has accept() and close() methods. The accept() method accepts an incoming connection from a client and returns a new socket for talking to that client, and the close() method closes the connection.

# first of all import the socket library
import socket

# next create a socket object
s = socket.socket()
print("Socket successfully created")

# reserve a port on your computer in our
# case it is 12345 but it can be anything
port = 12345

# Next bind to the port
# we have not typed any ip in the ip field
# instead we have inputted an empty string
# this makes the server listen to requests
# coming from other computers on the network
s.bind(('', port))
print("socket binded to %s" % (port))

# put the socket into listening mode
s.listen(5)
print("socket is listening")

# a forever loop until we interrupt it or
# an error occurs
while True:

    # Establish connection with client.
    c, addr = s.accept()
    print('Got connection from', addr)

    # send a thank you message to the client.
    # sockets carry bytes in Python 3, hence the b'' literal
    c.send(b'Thank you for connecting')

    # Close the connection with the client
    c.close()

Explaining the code:-

  • First of all, we import socket, which is necessary.
  • Then we make a socket object and reserve a port on our PC.
  • After that, we bind our server to the specified port. Passing an empty string means that the server can listen to incoming connections from other computers as well. If we had passed 127.0.0.1, it would have listened only to connections made from within the local computer.
  • After that, we put the server into listening mode. The 5 here is the backlog: roughly 5 connections are kept waiting if the server is busy, and further connection attempts may be refused.
  • Finally, we start a while loop that accepts each incoming connection, sends a thank-you message, and closes that connection.
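The difference between binding to an empty string and binding to 127.0.0.1 can be observed directly. This is a small sketch, not part of the server above; the helper name bound_address is hypothetical, and port 0 is used so that the operating system picks a free port.

```python
import socket


def bound_address(host):
    # bind to the given host on an OS-chosen free port (port 0)
    # and report the address the socket actually ended up with
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[0]


print(bound_address(''))           # wildcard address: all interfaces
print(bound_address('127.0.0.1'))  # loopback only
```

The empty string is reported back as the wildcard address 0.0.0.0, which is why such a server is reachable from other machines, while the second socket is reachable only from the local computer.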

Writing a Client Socket

Before writing a client program, we need a running server program to which the client can connect. For this, save the server code shown above in a file named server.py. Now open a terminal in the same directory and run the server by typing the following command.

$ python server.py

Socket successfully created
socket binded to 12345
socket is listening

Once we are ready with this we can start writing the client-side program.

# Import socket module
import socket

# Create a socket object
s = socket.socket()

# Define the port on which you want to connect
port = 12345

# connect to the server on local computer
s.connect(('127.0.0.1', port))

# receive data from the server and decode the bytes to text
print(s.recv(1024).decode())
# close the connection
s.close()

Explaining the code:-

  • First of all, we make a socket object.
  • Then we connect to localhost on port 12345 (the port on which our server runs) and lastly, we receive data from the server and close the connection.
  • Now save this file as client.py and run it from the terminal after starting the server script.

Additional Resources

  • This article explains the mechanisms to set up multiple communication channels between a server & client(s). It also demonstrates how one can send & receive complex Python objects using the pickle library.
  • ZeroMQ (or ØMQ) is a high-performance asynchronous messaging library, aimed at use in distributed or concurrent applications. It removes the hassle of setting up sockets for communication by hand & allows one to focus more on the pattern & protocol for messaging. This documentation demonstrates in detail the capabilities of pyzmq, the Python library for ØMQ, along with examples for all the messaging patterns.

Speeding Up Python Code with Concurrency & Parallelism

In this section, we will walk through the concepts involving concurrency & parallelism, along with their implementation in Python. Concurrency means multiple computations are happening at the same time. Concurrency is everywhere in modern programming, whether we like it or not:

  • Multiple computers in a network
  • Multiple applications running on one computer
  • Multiple processors in a computer

Concurrency can make a big difference for two types of problems, generally called CPU-bound and I/O-bound. I/O-bound problems cause your program to slow down because it frequently must wait for input/output (I/O) from some external resource. They arise frequently when your program is working with things that are much slower than your CPU. CPU-bound problems, by contrast, are limited by how fast the processor itself can compute. To tackle the I/O issue, running several I/O operations concurrently or interleaving them with some CPU-bound work may help achieve large gains in performance.
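To make the I/O-bound case concrete, here is a minimal sketch that compares running slow operations one after another with running them in a thread pool. The function fake_download and the 0.2 second delay are assumptions for illustration only; time.sleep stands in for waiting on a real external resource.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item):
    # time.sleep stands in for waiting on a slow external resource
    time.sleep(0.2)
    return item * 2


def run_sequential(items):
    # each wait has to finish before the next one starts
    return [fake_download(i) for i in items]


def run_concurrent(items):
    # each waiting task sits in its own thread, so the waits overlap
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        return list(pool.map(fake_download, items))


def timed(func, items):
    # return the result together with the elapsed wall-clock time
    start = time.perf_counter()
    result = func(items)
    return result, time.perf_counter() - start
```

Calling timed(run_sequential, [1, 2, 3, 4]) and timed(run_concurrent, [1, 2, 3, 4]) should give the same results, with the concurrent version finishing in roughly the time of a single wait.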

There are several ways to achieve concurrency & some of the paradigms are discussed below:

Multithreading

As a brief refresher, we first go through the basics of processes & threads. In computing, a process is an instance of a computer program that is being executed. Any process has 3 basic components:

  • An executable program.
  • The associated data needed by the program (variables, work space, buffers, etc.)
  • The execution context of the program (State of process)

A thread is an entity within a process that can be scheduled for execution. Also, it is the smallest unit of processing that can be performed in an OS. Multithreading is the ability of a CPU to provide multiple threads of execution concurrently, supported by the operating system. Multithreading aims to increase utilization of a single core by using thread-level parallelism, as well as instruction-level parallelism.

  • This article introduces the basic implementation of threads in Python using the threading module. It starts with a single-threaded process & goes on to build multithreaded applications leveraging the ThreadPoolExecutor. It also talks about handling race conditions, which is of utmost importance while dealing with multithreaded programs.
  • While delving into multithreading with Python, one would definitely come across a controversial entity called the Global Interpreter Lock (GIL). To know more about it & the effects that it potentially has on multithreaded applications, we suggest going through this article.
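As a small illustration of the race-condition handling mentioned above, the following sketch protects a shared counter with threading.Lock. The names Counter and run_workers are hypothetical and chosen only for this example.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # the lock makes the read-modify-write step atomic
        with self._lock:
            self.value += 1


def run_workers(num_threads, increments_each):
    counter = Counter()

    def work():
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, run_workers(4, 1000) always counts every increment. Without it, two threads could read the same old value and one update could be lost.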

Since you are now familiar with sockets & multithreading, as an exercise, you could try developing a program which starts 2 (or more) threads listening on different ports. You can have another program which connects to these ports & sends messages at random intervals of time (within certain bounds, say, 10 seconds). The initial program is supposed to log the messages received at each port & mention the thread receiving the message along with the details of the sender (IP address & port).

Coroutines & Asynchronous Programming

As defined in the Python documentation, coroutines are a more generalized form of subroutines. Subroutines are entered at one point and exited at another point, whereas coroutines can be entered, exited, and resumed at many different points.

Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread. There are numerous benefits to using it, such as improved application performance and enhanced responsiveness.

The library for writing concurrent code using the async/await syntax in Python is asyncio. Coroutines declared with the async/await syntax are the preferred way of writing asyncio applications. For example:

import asyncio

async def main():
    print('hello')
    await asyncio.sleep(1)
    print('world')

asyncio.run(main())

Note that simply calling a coroutine with main() will not schedule it to be executed.

asyncio uses different constructs: event loops, coroutines and futures.
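The point about calling a coroutine can be checked with a short sketch. The coroutine greet is a made-up example; asyncio.iscoroutine and asyncio.run are standard library calls.

```python
import asyncio


async def greet():
    return 'hello'


coro = greet()                       # creates a coroutine object, runs nothing yet
is_coro = asyncio.iscoroutine(coro)  # True: we only hold an object so far
result = asyncio.run(coro)           # the event loop actually executes the body
```

Only the asyncio.run call executes the body of greet and produces its return value.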

  • An event loop manages and distributes the execution of different tasks. It registers them and handles distributing the flow of control between them.
  • Coroutines (covered above) are special functions that work similarly to Python generators: on await, they release the flow of control back to the event loop. A coroutine needs to be scheduled to run on the event loop. Once scheduled, coroutines are wrapped in Tasks, which are a type of Future.
  • Futures represent the result of a task that may or may not have been executed. This result may be an exception.
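How these three constructs fit together can be sketched as follows. The coroutine fetch and its delays are assumptions for illustration; asyncio.create_task and asyncio.gather are part of the standard library.

```python
import asyncio


async def fetch(name, delay):
    # await hands control back to the event loop while "waiting"
    await asyncio.sleep(delay)
    return name


async def main():
    # create_task wraps each coroutine in a Task and schedules it
    slow = asyncio.create_task(fetch('slow', 0.2))
    fast = asyncio.create_task(fetch('fast', 0.1))
    # gather waits for both and keeps the order of its arguments
    results = await asyncio.gather(slow, fast)
    return results, isinstance(slow, asyncio.Future)


results, task_is_future = asyncio.run(main())
```

Both tasks wait at the same time, so the whole run takes about as long as the slower one, and the results come back in the order the tasks were passed to gather rather than the order in which they finished.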

Following are some resources to understand asynchronous programming:

  • This tutorial on asyncio explains all the concepts related to asynchronous programming with asyncio along with intuitive examples & design considerations needed while working with this module.
  • This blog by Dan Bader demonstrates the design of a web server application using different levels & types of concurrency: starting from synchronous form, to asynchronous (blocking as well as non-blocking) form through the course of 8 examples. It will allow readers to get familiarized with other libraries like queue, gevent & twisted.

The above resources should give you a fair understanding of asynchronous programming in Python, & you should now be ready to implement your very own asynchronous applications including servers, event listeners, etc.

Multiprocessing

The approaches mentioned previously generally provide a good speedup for I/O-bound tasks, but they are not sufficient to improve the performance of CPU-bound tasks. To achieve greater performance with CPU-bound work, one must leverage parallelism instead of concurrency. If that confuses you, here is an explanation of both these concepts & the differences between them.

Multiprocessing is the use of two or more CPUs within a single computer system. In Python, one can write such programs using the multiprocessing library. It was designed to break down the barrier posed by the Python GIL (which allows only one thread at a time to execute Python bytecode within a single interpreter process) & run code across multiple CPUs. At a high level, it does this by starting separate Python interpreter processes and then farming out parts of your program to run on them.

  • Here is a tutorial that will help you understand the basic syntax & semantics used in programs that use multiprocessing, along with a primer on some key concepts involved in this domain.
  • This article demonstrates how the same problem can be approached with single-threaded, multithreaded & multiprocessing approaches, contrasting the results of each. It should give you a good understanding of when to use which paradigm to suit your need.
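As a rough sketch of the idea, the following spreads a CPU-bound function over worker processes with multiprocessing.Pool. The functions count_primes and parallel_count are hypothetical examples, and the prime test is deliberately naive so that it keeps the CPU busy.

```python
import multiprocessing


def count_primes(limit):
    # deliberately naive CPU-bound work: count the primes below limit
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_count(limits, workers=2):
    # each worker is a separate interpreter process with its own GIL
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)
```

In a script, parallel_count should be called from under an if __name__ == "__main__": guard, because on some platforms the worker processes import the main module again when they start.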

Metaprogramming

Metaprogramming is a branch of programming in which code manipulates code. This may seem like a strange and alien idea, but if you have ever worked with decorators or metaclasses, you were doing metaprogramming.

Most developers can use Python for years without ever using metaprogramming. It is a somewhat rare topic that is not often needed, but when you do need it, nothing else will solve your problem.

As Tim Peters put it: "Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why)."

Decorators

A decorator is a function that takes a function as its only parameter and returns a function. This is helpful to “wrap” functionality with the same code over and over again.

Let us say we have to log how much time some functions take to run. Normally, we would add a timer at the start and print the difference at the end.

import time

def foo(arg1, arg2):
    #Starting the timer
    start_time = time.time()

    #function does something here

    #computing the total time
    end_time = time.time()
    print("The function took: %s seconds" %(end_time - start_time))

But doing this for many functions would be tedious, since you would have to go and edit every function. Also, let's say you have to change the print statement: then you would have to go to each function and change it. This can be done easily by using decorators.

import time

def time_function(func):

    #defining a function which will call
    #the original passed function inside a timer
    def timed_func(*args, **kwargs):
        #Starting the timer
        start_time = time.time()

        #calling the original function here
        #and keeping its return value
        result = func(*args, **kwargs)

        #computing the total time
        end_time = time.time()
        print("The function took: %s seconds" %(end_time - start_time))
        return result

    return timed_func


@time_function
def foo(arg1, arg2):
    #does something here
    pass

The above code has the same effect as the previous one. The only difference is that now we can add the decorator @time_function above any function and the timing of that function will be logged.
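One detail worth knowing is that a plain wrapper hides the name and docstring of the function it replaces. A common remedy, sketched here as a variant rather than the original author's code, is functools.wraps from the standard library. The function add is a made-up example.

```python
import functools
import time


def time_function(func):
    @functools.wraps(func)  # copy func's name and docstring onto the wrapper
    def timed_func(*args, **kwargs):
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start_time
        print("%s took: %s seconds" % (func.__name__, elapsed))
        return result
    return timed_func


@time_function
def add(a, b):
    """Return the sum of a and b."""
    return a + b
```

Writing @time_function above add is shorthand for add = time_function(add), and thanks to functools.wraps the decorated function still reports its own name and docstring.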

Metaclasses

Metaclasses are an esoteric OOP concept, lurking behind virtually all Python code. You are using them whether you are aware of it or not. For the most part, you don’t need to be aware of it. Most Python programmers rarely, if ever, have to think about metaclasses. When the need arises, however, Python provides a capability that not all object-oriented languages support: you can get under the hood and define custom metaclasses.

What is Type class

In Python, everything has some type associated with it. For example, if we have a variable holding an integer value, then its type is int. You can get the type of anything using the type() function.

Every type in Python is defined by a class. Unlike C or Java, where int, char, and float are primitive data types, in Python values of these kinds are objects of the int class or the str class. So we can make a new type by creating a class.
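A few calls to type() illustrate both statements. The class Point is a hypothetical example.

```python
class Point:
    pass


# the type of a value is the class it was created from
kinds = [type(5), type('text'), type(3.14), type(Point())]

# classes are objects too, and their type is the class named type
class_kinds = [type(int), type(Point)]
```

Here kinds holds int, str, float and Point, while both entries of class_kinds are type itself.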

How is a Class created in Python

A class is also an object, and just like any other object, it is an instance of something, called a metaclass. A special class, type, creates these class objects. The type class is the default metaclass, which is responsible for making classes. Because classes are also objects, they can be modified in the same way. We can add or remove fields or methods on a class in the same way as with other objects.

# Defined class without any
# methods and variables
class test:pass

# Adding a class variable
test.x = 45

# Adding a method (it receives the instance as self)
test.foo = lambda self: print('Hello')
  
# creating object 
myobj = test() 
  
print(myobj.x)    #output: 45
myobj.foo()       #output: Hello
[Figure: Creation of Classes]
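Because type is the default metaclass, calling it with three arguments (a name, a tuple of base classes, and a namespace dictionary) builds a class without a class statement. The following is a minimal sketch; the names Dynamic and foo are chosen only for illustration.

```python
def foo(self):
    return 'Hello'


# type(name, bases, namespace) builds a class without a class statement
Dynamic = type('Dynamic', (), {'x': 45, 'foo': foo})

obj = Dynamic()
```

The resulting class behaves like one written with the class keyword: obj.x is 45 and obj.foo() returns 'Hello'.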

A metaclass is responsible for the generation of classes, so we can write our own custom metaclasses to modify the way classes are generated by performing extra actions or injecting code. Usually, we do not need custom metaclasses, but sometimes they are necessary. For many problems both metaclass and non-metaclass based solutions are available (and the latter are often simpler), but in some cases only a metaclass can solve the problem.

Writing your own Metaclass

To create a custom metaclass, our class has to inherit from the type metaclass and usually overrides:

  • __new__(): It’s a method which is called before __init__(). It creates the object and returns it. We can override this method to control how the objects are created.
  • __init__(): This method just initializes the created object passed as a parameter.
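The order in which these two methods run can be observed with a small tracing metaclass. This is a sketch with hypothetical names (Tracing, Widget, and the calls list); it only records which method is called when a class is created.

```python
calls = []


class Tracing(type):
    def __new__(mcls, name, bases, namespace):
        # runs first: builds and returns the new class object
        calls.append('__new__ ' + name)
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace):
        # runs second: receives the class that __new__ just created
        calls.append('__init__ ' + name)
        super().__init__(name, bases, namespace)


class Widget(metaclass=Tracing):
    pass
```

After the class statement has run, calls contains the __new__ entry followed by the __init__ entry for Widget.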

In the following example, we will be creating a metaclass MultiBases which will check if the class being created has inherited from more than one base class. If so, it will raise an error.

class MultiBases(type): 
    # overriding __new__ method
    def __new__(cls, clsname, bases, clsdict):
        # if the number of base classes is greater than 1
        # raise error
        if len(bases)>1:
            raise TypeError("Inherited multiple base classes!!!")

        # else execute __new__ method of super class, ie.
        # call __new__ of type class
        return super().__new__(cls, clsname, bases, clsdict)

# metaclass can be specified by 'metaclass' keyword argument
# now MultiBases class is used for creating classes
# this will be propagated to all subclasses of Base
class Base(metaclass=MultiBases): 
    pass
  
# no error is raised 
class A(Base): 
    pass
  
# no error is raised 
class B(Base): 
    pass
  
# This will raise an error! 
class C(A, B): 
    pass
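Note that the error is raised while the class statement itself is executed, not when an object is created later. The following sketch shows this by catching the exception. It repeats the same check under the hypothetical name SingleBaseOnly so that it can be run on its own.

```python
class SingleBaseOnly(type):
    def __new__(mcls, name, bases, namespace):
        # refuse to build any class that lists more than one base
        if len(bases) > 1:
            raise TypeError("Inherited multiple base classes!!!")
        return super().__new__(mcls, name, bases, namespace)


class Base(metaclass=SingleBaseOnly):
    pass


class A(Base):
    pass


class B(Base):
    pass


try:
    # the metaclass runs here, while the class is being defined
    class C(A, B):
        pass
except TypeError as exc:
    message = str(exc)
else:
    message = None
```

After this runs, message holds the text passed to TypeError, and the name C is never bound because the class was not created.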

Bonus Material for More Use-Cases

Here are links to some awesome tutorials that will allow you to diversify your knowledge of Advanced Python Programming & tackle several other potential use-cases of Python out in the real world:

  • Have you been wondering if one can communicate between C++ & Python code? Through this tutorial, you will learn how to call C/C++ functions from Python using something known as Python bindings & pass data between code written in these different languages.
  • Can Python be used to send emails? Here is a tutorial that will walk you through all you need to set up an SMTP server & send your first email using Python! Once you know how to use smtplib, you can also send emails from your Gmail account. Here is how you can do so.
  • Pattern matching in strings is useful almost everywhere. An effective way to do it is with regular expressions. This tutorial will help you understand the re library for using regular expressions (or regex for short) in your very own string pattern matching projects.
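As a small taste of the re library mentioned in the last item, here is a minimal sketch. The log line and its field names are made up for this illustration; re.search and re.findall are standard library functions.

```python
import re

# hypothetical log line used only for this illustration
line = "2020-04-06 06:45 user=alice action=edit"

# search returns the first match; groups() gives the captured parts
date_match = re.search(r"(\d{4})-(\d{2})-(\d{2})", line)

# findall with two groups returns a list of (key, value) tuples
fields = dict(re.findall(r"(\w+)=(\w+)", line))
```

Here date_match.groups() yields the year, month and day as strings, and fields maps each key in the line to its value.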