Hello everyone. Ahead of the launch of the Python Developer Base course, we have prepared a translation of an interesting article.



Introduction


Asynchronous programming is a type of parallel programming in which a unit of work can be performed separately from the main thread of the application. When the work is completed, the main thread is notified that the worker has either finished or failed with an error. This approach has many benefits, such as improved application performance and responsiveness.


Over the past few years, asynchronous programming has attracted close attention, and there are reasons for this. Although this kind of programming can be more complicated than traditional sequential execution, it is much more efficient.

For example, instead of waiting for an HTTP request to complete before continuing execution, you can send the request and do other work that is waiting in a queue, using asynchronous coroutines in Python.

Asynchrony is one of the main reasons Node.js is such a popular choice for backend development. Much of the code we write, especially in I/O-heavy applications such as websites, depends on external resources: anything from a remote database call to POST requests to a REST service. As soon as you send a request to one of these resources, your code simply waits for a response. With asynchronous programming, you let your code handle other tasks while waiting for a response from these resources.

How does Python manage to do several things at once?



1. Multiple Processes

The most obvious way is to use multiple processes. From the terminal, you can run your script two, three, four, ten times, and all the scripts will run independently and simultaneously. The operating system takes care of distributing processor resources among all the instances. Alternatively, you can use the multiprocessing library, which can spawn multiple processes, as shown in the example below.

from multiprocessing import Process

def print_func(continent='Asia'):
    print('The name of continent is : ', continent)

if __name__ == "__main__":  # confirms that the code is under main function
    names = ['America', 'Europe', 'Africa']
    procs = []

    # instantiating without any argument
    proc = Process(target=print_func)
    procs.append(proc)
    proc.start()

    # instantiating process with arguments
    for name in names:
        proc = Process(target=print_func, args=(name,))
        procs.append(proc)
        proc.start()

    # complete the processes
    for proc in procs:
        proc.join()

Output:

The name of continent is :  Asia
The name of continent is :  America
The name of continent is :  Europe
The name of continent is :  Africa

2. Multiple Threads

Another way to run multiple jobs in parallel is to use threads. A thread is a line of execution, much like a process, but you can have multiple threads within a single process, and they all share access to its resources. This, however, makes threaded code hard to write. As before, the operating system does all the hard work of allocating processor time, but the global interpreter lock (GIL) allows only one Python thread to run at a time, even in multi-threaded code. This is how the GIL on CPython prevents multicore concurrency: you are forced to run on a single core, even if you have two, four, or more.

import threading

def print_cube(num):
    """ function to print cube of given num """
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """ function to print square of given num """
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")

Output:

Square: 100
Cube: 1000
Done!
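The GIL's effect is easy to observe: for a CPU-bound function, two threads take about as long as running it twice sequentially, because only one thread can execute Python bytecode at a time. A small sketch (the count function and loop size N are illustrative):

```python
import time
import threading

def count(n):
    # CPU-bound busy loop; the GIL lets only one thread execute it at a time
    while n > 0:
        n -= 1

N = 5_000_000

# sequential: two calls, one after another
start = time.perf_counter()
count(N)
count(N)
sequential = time.perf_counter() - start

# threaded: two threads doing the same work "in parallel"
start = time.perf_counter()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.perf_counter() - start

# on CPython the threaded version is typically no faster than the sequential one
print(f"sequential: {sequential:.2f}s  threaded: {threaded:.2f}s")
```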

3. Coroutines and yield

Coroutines are a generalization of subroutines. They are used for cooperative multitasking, when a process voluntarily yields control at some frequency or during waiting periods to allow several applications to run at the same time. Coroutines are similar to generators, but with additional methods and minor changes in how we use the yield statement. Generators produce data for iteration, while coroutines can also consume data.

def print_name(prefix):
    print("Searching prefix:{}".format(prefix))
    try:
        while True:
            # yield is used to create the coroutine
            name = (yield)
            if prefix in name:
                print(name)
    except GeneratorExit:
        print("Closing coroutine!!")

corou = print_name("Dear")
corou.__next__()
corou.send("James")
corou.send("Dear James")
corou.close()

Output:

Searching prefix:Dear
Dear James
Closing coroutine!!

4. Asynchronous Programming

The fourth method is asynchronous programming, in which the operating system is not involved. From the operating system's point of view you have one process with a single thread, but you can still perform several tasks at once. So what's the trick here?

Answer: asyncio

asyncio is an asynchronous programming module introduced in Python 3.4. It is designed to use coroutines and futures to simplify writing asynchronous code and make it almost as readable as synchronous code, thanks to the absence of callbacks.

asyncio uses several constructs: event loops, coroutines, and futures.

  • The event loop manages and distributes the execution of various tasks. It registers them and handles distributing the flow of control between them.
  • Coroutines (which we discussed above) are special functions, similar to Python generators, that use await to return the flow of control back to the event loop. A coroutine must be scheduled to run on the event loop; scheduled coroutines are wrapped in Tasks, which are a type of Future.
  • A Future represents the result of a task that may or may not have completed. The result may even be an exception.
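To make the Task/Future relationship concrete, here is a minimal sketch using the asyncio.run API available since Python 3.7; the produce coroutine, the delay, and the value 42 are illustrative, not from the original article:

```python
import asyncio

async def produce(fut):
    # simulate some work, then fulfil the Future with a result
    await asyncio.sleep(0.1)
    fut.set_result(42)

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    # ensure_future wraps the coroutine in a Task and schedules it on the loop
    asyncio.ensure_future(produce(fut))
    # awaiting the Future suspends main() until the result is set
    return await fut

result = asyncio.run(main())
print(result)  # 42
```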

With asyncio, you can structure your code so that subtasks are defined as coroutines, and schedule them to run as you please, including simultaneously. Coroutines contain await points, where we mark possible context-switch points. If there are tasks in the waiting queue, the context will be switched; otherwise not.

A context switch in asyncio is an await, which transfers the flow of control from one coroutine to another.
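A small sketch of such a context switch, again using asyncio.run (Python 3.7+); the worker names and delays are illustrative:

```python
import asyncio

order = []

async def worker(name, delay):
    order.append(f"{name} start")
    # the await suspends this coroutine; the event loop runs the other one
    await asyncio.sleep(delay)
    order.append(f"{name} done")

async def main():
    # both coroutines run concurrently within a single thread
    await asyncio.gather(worker("A", 0.2), worker("B", 0.1))

asyncio.run(main())
print(order)  # ['A start', 'B start', 'B done', 'A done']
```

Note that B finishes before A even though A was scheduled first: while A waits, control switches to B.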

In the following example, we start three asynchronous tasks that individually make requests to Reddit, then extract and display the JSON contents. We use aiohttp, an HTTP client library that ensures even the HTTP request itself is performed asynchronously.

import signal
import sys
import asyncio
import aiohttp
import json

loop = asyncio.get_event_loop()
client = aiohttp.ClientSession(loop=loop)

async def get_json(client, url):
    async with client.get(url) as response:
        assert response.status == 200
        return await response.read()

async def get_reddit_top(subreddit, client):
    data1 = await get_json(client, 'https://www.reddit.com/r/' + subreddit +
                           '/top.json?sort=top&t=day&limit=5')
    j = json.loads(data1.decode('utf-8'))
    for i in j['data']['children']:
        score = i['data']['score']
        title = i['data']['title']
        link = i['data']['url']
        print(str(score) + ': ' + title + ' (' + link + ')')
    print('DONE:', subreddit + '\n')

def signal_handler(signal, frame):
    loop.stop()
    client.close()
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)

asyncio.ensure_future(get_reddit_top('python', client))
asyncio.ensure_future(get_reddit_top('programming', client))
asyncio.ensure_future(get_reddit_top('compsci', client))
loop.run_forever()

Output:

50: Undershoot: Parsing theory in 1965 (http://jeffreykegler.github.io/Ocean-of-Awareness-blog/individual/2018/07/knuth_1965_2.html)
12: Question about best-prefix/failure function/primal match table in kmp algorithm (https://www.reddit.com/r/compsci/comments/8xd3m2/question_about_bestprefixfailure_functionprimal/)
1: Question regarding calculating the probability of failure of a RAID system (https://www.reddit.com/r/compsci/comments/8xbkk2/question_regarding_calculating_the_probability_of/)
DONE: compsci

336: /r/thanosdidnothingwrong -- banning people with python (https://clips.twitch.tv/AstutePluckyCocoaLitty)
175: PythonRobotics: Python sample codes for robotics algorithms (https://atsushisakai.github.io/PythonRobotics/)
23: Python and Flask Tutorial in VS Code (https://code.visualstudio.com/docs/python/tutorial-flask)
17: Started a new blog on Celery - what would you like to read about? (https://www.python-celery.com)
14: A Simple Anomaly Detection Algorithm in Python (https://medium.com/@mathmare_/pyng-a-simple-anomaly-detection-algorithm-2f355d7dc054)
DONE: python

1360: git bundle (https://dev.to/gabeguz/git-bundle-2l5o)
1191: Which hashing algorithm is best for uniqueness and speed? Ian Boyd's answer (top voted) is one of the best comments I've seen on Stackexchange. (https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed)
430: ARM launches “Facts” campaign against RISC-V (https://riscv-basics.com/)
244: Choice of search engine on Android nuked by “Anonymous Coward” (2009) (https://android.googlesource.com/platform/packages/apps/GlobalSearch/+/592150ac00086400415afe936d96f04d3be3ba0c)
209: Exploiting freely accessible WhatsApp data or “Why does WhatsApp web know my phone’s battery level?” (https://medium.com/@juan_cortes/exploiting-freely-accessible-whatsapp-data-or-why-does-whatsapp-know-my-battery-level-ddac224041b4)
DONE: programming

Using Redis and Redis Queue RQ


Using asyncio and aiohttp is not always a good idea, especially if you are using older versions of Python. Besides, there are times when you need to distribute tasks across different servers. In that case you can use RQ (Redis Queue). This is an ordinary Python library for adding jobs to a queue and processing them with workers in the background. To organize the queue it uses Redis, a key/value database.

In the example below, we add a simple count_words_at_url function to the queue using Redis.

from mymodule import count_words_at_url
from redis import Redis
from rq import Queue

q = Queue(connection=Redis())
job = q.enqueue(count_words_at_url, 'http://nvie.com')

# ****** mymodule.py ******
import requests

def count_words_at_url(url):
    """Just an example function that's called async."""
    resp = requests.get(url)
    print(len(resp.text.split()))
    return len(resp.text.split())

Output:

15:10:45 RQ worker 'rq:worker:EMPID18030.9865' started, version 0.11.0
15:10:45 *** Listening on default...
15:10:45 Cleaning registries for queue: default
15:10:50 default: mymodule.count_words_at_url('http://nvie.com') (a2b7451e-731f-4f31-9232-2b7e3549051f)
322
15:10:51 default: Job OK (a2b7451e-731f-4f31-9232-2b7e3549051f)
15:10:51 Result is kept for 500 seconds

Conclusion


As an example, take a simultaneous chess exhibition where one of the best chess players competes against a large number of people. We have 24 games and 24 opponents. If the chess master plays them synchronously, it will take at least 12 hours (assuming the average game takes 30 moves, the master thinks about each move for 5 seconds, and the opponent takes about 55 seconds). In asynchronous mode, however, the master can make a move and leave the opponent time to think while moving on to the next board. This way a move can be made in all 24 games in 2 minutes, and all of them can be won in just one hour.
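The arithmetic behind these numbers can be checked directly, using the article's own assumptions:

```python
GAMES = 24
MOVES_PER_GAME = 30     # moves the master makes per game
MASTER_MOVE_S = 5       # master's thinking time per move, in seconds
OPPONENT_MOVE_S = 55    # opponent's thinking time per move, in seconds

# synchronous: the master waits through every opponent reply on every board
sync_hours = GAMES * MOVES_PER_GAME * (MASTER_MOVE_S + OPPONENT_MOVE_S) / 3600

# asynchronous: one pass over all 24 boards costs only the master's own time
round_minutes = GAMES * MASTER_MOVE_S / 60

# total: 30 such passes finish every game
async_hours = MOVES_PER_GAME * GAMES * MASTER_MOVE_S / 3600

print(sync_hours)     # 12.0
print(round_minutes)  # 2.0
print(async_hours)    # 1.0
```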

This is what people mean when they say that asynchrony speeds things up. The chess master does not start playing chess faster; the time is simply used more optimally instead of being wasted on waiting.

In this analogy, the chess master is the processor, and the main idea is to keep the processor idle as little as possible, so that it always has something to do.

In practice, asynchrony is defined as a style of concurrent programming in which some tasks release the processor during waiting periods so that other tasks can use it. Python has several ways to achieve concurrency to suit your requirements, code flow, data processing, architecture, and use case, and you can choose any of them.



Learn more about the course.



Source