What do we mean by “Poll”, “Push” and “Pull” in terms of data communication?
Communications systems either “Push” or “Pull” data; in some cases, when a host needs to know whether anything is waiting, it performs a “Poll”. Each of these techniques has advantages and disadvantages, discussed here.
Polling
Polling means asking whether data is available or can be sent; the POP3 protocol used by mail readers is a familiar example. It is very simple and has the advantage that the server being polled need not know anything about the polling client’s state. The polling client must make periodic requests to the server to determine whether data is ready or can be sent.
The disadvantage of “Polling” is that the polling client never knows exactly when it can send or receive data; to reduce latency the poll interval may need to be quite short, which increases server overhead, especially when the server handles many clients.
If the polling interval is n and the data transfer time is m, the average delivery/fetch latency is n/2 + m. For example, a client that polls every 60 seconds with a 1-second transfer time sees an average delivery delay of about 31 seconds. This can be a limiting factor in many systems.
Computer hardware has historically suffered from similar issues: some devices did not use interrupts to signal data reception, and on the IBM PC the old interrupt controller offered so few interrupt lines that devices had to share them. Sharing increased interrupt latency, because the PC had to poll every device on the shared line to determine the interrupt source; hence the evolution of the APIC.
Polling is a solution only to be used where servers need not know the availability of clients, low latency is unimportant, and the host being polled is able to handle the volume of polling requests.
Examples of services using “Poll” are NTP and POP3.
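To make the pattern concrete, here is a minimal sketch of a polling client in Python. It is illustrative only: the check_for_data() call and the 5-second interval are hypothetical placeholders, not taken from any real protocol.

```python
import random
import time

POLL_INTERVAL = 5.0   # n: seconds between polls (hypothetical value)

def check_for_data() -> bool:
    """Stand-in for a real poll such as a POP3 STAT request.
    Randomly pretends data has arrived so the loop has something to do."""
    return random.random() < 0.3

def polling_client(polls: int = 5):
    for _ in range(polls):
        if check_for_data():          # ask the server: is anything waiting?
            print("data available - fetching it")
        else:
            print("nothing waiting")
        # Average delivery latency is roughly POLL_INTERVAL / 2 plus the
        # transfer time, so shortening the interval trades server load
        # for responsiveness.
        time.sleep(POLL_INTERVAL)

if __name__ == "__main__":
    polling_client()
```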
Pushing
Pushing data is highly efficient: the sending host delivers data to the receiving host as soon as it is ready. Many protocols use such a scheme, for example:-
- CUPS (print jobs are pushed to the print server)
- FTP (files are pushed to a remote server; oddly, FTP uses “Pull” as well)
- LPR (print jobs are pushed to the print server)
- SFTP (secure file transfer over SSH)
- SNMP traps (Simple Network Management Protocol)
- SMTP (Internet email delivery – very old, very reliable)
- Hardware interrupts
“Pushing” is best used where data is ready for delivery and the client can accept data at any time.
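As a hypothetical sketch (the host, port and payload below are placeholders, not part of any of the protocols above), the following Python fragment shows the push pattern over a plain TCP socket: the sender initiates delivery the moment data is ready, and the receiver simply accepts whatever arrives.

```python
import socket

HOST, PORT = "127.0.0.1", 9000        # placeholder address of the receiving host

def push_sender(payload: bytes):
    """Push: the sender initiates delivery as soon as data is ready."""
    with socket.create_connection((HOST, PORT)) as s:
        s.sendall(payload)

def push_receiver():
    """The receiver accepts data whenever it arrives - no polling required."""
    with socket.create_server((HOST, PORT)) as srv:
        conn, _addr = srv.accept()
        with conn:
            print("received:", conn.recv(4096).decode())

# Run push_receiver() in one process and push_sender(b"report ready") in another.
```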
Double Buffering
In some cases, such as writing large amounts of data to a “block” based piece of hardware, double buffering can be used to vastly improve throughput by hiding the time spent refilling the buffer.
Consider a network adapter that has an output buffer and interrupts when it has completed a transmit. The OS writes a packet of data to the buffer, then waits for the card to send the data. When the hardware has successfully sent the data it interrupts the OS to say it is ready for more, but the time the OS takes to service the interrupt and refill the transmit buffer may delay the next network transmit, adding an unwanted gap between packets.
The solution is to utilise two transmit buffers in the hardware device, buffers A and B. Following successful transmission of buffer A, the network adapter starts to transmit the data in buffer B (if ready) as well as interrupting the computer to instigate a data copy into buffer A. This ensures that the delay between the interrupt and the OS copying data into buffer A does not add dead time on the wire. The OS software requires little change to cater for this type of system, but the throughput gains are massive. The overall latency of the system is not reduced, but the throughput is increased.
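The following Python fragment is an illustrative simulation only (all names are hypothetical and the “hardware” is just a thread with a sleep): while the hardware thread transmits one buffer, the OS thread refills the other, so refilling never stalls transmission.

```python
import threading
import time

class DoubleBufferedTx:
    def __init__(self):
        self.buffers = [bytearray(), bytearray()]
        self.fill_index = 0                      # buffer the OS writes into next
        self.ready = threading.Semaphore(0)      # full buffers awaiting transmission
        self.free = threading.Semaphore(2)       # empty buffers available to the OS

    def os_write(self, packet: bytes):
        """OS side: copy the next packet into whichever buffer is free."""
        self.free.acquire()                      # block only if both buffers are full
        self.buffers[self.fill_index][:] = packet
        self.fill_index ^= 1                     # alternate between buffer A and B
        self.ready.release()                     # hand the full buffer to the hardware

    def hardware_loop(self, packets_expected: int):
        """Simulated adapter: transmit one buffer while the other is refilled."""
        send_index = 0
        for _ in range(packets_expected):
            self.ready.acquire()                 # wait for a full buffer
            data = bytes(self.buffers[send_index])
            time.sleep(0.01)                     # pretend the wire transfer takes time
            print("transmitted:", data.decode())
            send_index ^= 1
            self.free.release()                  # the "interrupt": buffer is free again

tx = DoubleBufferedTx()
hw = threading.Thread(target=tx.hardware_loop, args=(4,))
hw.start()
for i in range(4):
    tx.os_write(f"packet {i}".encode())          # refills overlap with transmission
hw.join()
```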
Pulling
Pulling is done when a client requests data on demand, and is normally served by fast services. Most client user interfaces use “Pull” type services to achieve the fast response a user expects. Examples of such services are:-
- FTP
- HTTP (driving the internet)
HTTP is the best-known use case: users pull content “on demand”, and it has driven the last 20 years of Internet development.
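As an illustration (the URL is a placeholder), a minimal pull in Python looks like this: the client requests the resource only at the moment it needs it, using the standard library’s urllib.

```python
from urllib.request import urlopen

def pull_resource(url: str) -> bytes:
    """Pull: the client initiates the request only when it needs the data."""
    with urlopen(url) as response:                 # blocking HTTP GET
        return response.read()

if __name__ == "__main__":
    body = pull_resource("http://example.com/")    # placeholder URL
    print(f"pulled {len(body)} bytes on demand")
```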
Conclusion
Most data communications systems work best when “Pulling” or “Pushing” data; the use of a “Poll” type system should be avoided unless there is a clear business case.
When designing systems it’s often simpler to implement a scheme which works “Sufficiently Well”, but an inappropriate design will require more resources and power. It is often possible to implement systems that utilise very few resources through careful interface design, and for low-power embedded devices this is especially important.