This article introduces the concept of message queues and discusses the strengths and weaknesses of three specific message queue services: Beanstalkd, IronMQ and Amazon SQS.
Any information described in this article is correct at the time of writing and is subject to change.
What are Message Queues?
Queues allow you to store metadata for processing jobs at a later date. They can aid in the development of SOA (service-oriented architecture) by providing the flexibility to defer tasks to separate processes. When applied correctly, queues can dramatically improve the user experience of a web site by reducing load times.
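To make the idea concrete, here is a minimal sketch in Python that uses the standard library's in-process `queue.Queue` as a stand-in for a real message queue service. The `handle_signup` and `send_welcome_email` functions are hypothetical names invented for this illustration; the point is only that the request handler returns immediately while a separate worker does the slow part.

```python
import queue
import threading

jobs = queue.Queue()   # in-process stand-in for a message queue service
sent = []              # records which jobs the worker has processed


def send_welcome_email(user_id):
    # Hypothetical slow task; a real one would call a mail service.
    sent.append(user_id)


def worker():
    while True:
        user_id = jobs.get()        # blocks until a job is available
        if user_id is None:         # sentinel value: no more jobs
            jobs.task_done()
            break
        send_welcome_email(user_id)
        jobs.task_done()


def handle_signup(user_id):
    # The request only enqueues metadata (an ID) and returns straight away.
    jobs.put(user_id)
    return "signed up"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_signup(user_id) for user_id in (1, 2, 3)]
jobs.put(None)   # tell the worker to stop once the queue is drained
jobs.join()      # wait until every queued item has been handled
```

With a hosted or self-hosted queue service, the provider and the worker would usually be separate processes, often on separate machines, but the shape of the code stays much the same.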
Advantages of message queues:
- Asynchronous: Queue it now, run it later.
- Decoupling: Separates application logic.
- Resilience: Won't take down your whole application if part of it fails.
- Redundancy: Can retry jobs if they fail.
- Guarantees: Makes sure that jobs will be processed.
- Scalable: Many workers can process individual jobs in a queue.
- Profiling: Can aid in identifying performance issues.
Disadvantages of message queues:
- Asynchronous: you have to wait until a job is complete.
- Load: each job in the queue must wait its turn before it can be processed. If one job overruns, it affects each subsequent job.
- Architecture: the application needs to be designed with queues in mind.
Use cases of message queues:
Any time-consuming process can be placed in a queue:
- Sending/receiving data from third-party APIs
- Sending an e-mail
- Generating reports
- Running labour-intensive processes
You can also use queues in creative ways, such as locking jobs so that only one user can access information at a time.
There are many services that you can use to implement message queues. This article outlines the differences between Beanstalkd, IronMQ and Amazon SQS.
Beanstalkd is "… a simple, fast work queue". It is released as open source under the MIT license. It's well documented, unit tested and can be downloaded for free to run on your own server. The architecture is borrowed from memcached and it is designed specifically to be a message queue.
An article on SitePoint by author Dave Kennedy called Giant Killing with Beanstalkd contains information on how to start using Beanstalkd with Ruby.
IronMQ is a hosted RESTful web service. There is a free tier for developers and many other subscription tiers for commercial applications.
Amazon SQS is an inexpensive hosted solution for implementing message queues. It comes as part of Amazon Web Services (AWS). Amazon offers a Free Tier for evaluating their web services which includes SQS.
|Beanstalkd|IronMQ|Amazon SQS|
|Self-hosted|Remotely hosted|Remotely hosted|
Beanstalkd runs on Linux and Mac OS X. Read the installation instructions from the Beanstalkd website for details on how to get it working on your system. The Beanstalkd server does not work on Windows.
IronMQ and SQS
IronMQ and Amazon SQS are cloud-based web services. No applications need to be set up on your server; you simply need to sign up for an account and set up a queue.
Service Level Agreements (SLAs)
|Beanstalkd|IronMQ|Amazon SQS|
|None|99.95% per month|None|
As Beanstalkd is a server you host, you are responsible for ensuring its availability.
Iron.IO has a Service Level Agreement with an uptime percentage of at least 99.95% during any monthly billing cycle. Their Pro Platinum package ($2450/month) has custom contract terms, which include Service Level Agreements. Refunds are provided as Service Credits.
Amazon does not have a specific Service Level Agreement for SQS. They do have Support Services available which can cover SQS at an extra cost.
|Beanstalkd|IronMQ|Amazon SQS|
|PUSH (sockets)|HTTP Web Service|HTTP Web Service|
Beanstalkd communicates via PUSH sockets, providing instant communication between providers and workers.
When a provider enqueues a job, a worker can reserve it immediately if it is connected and ready. Jobs are reserved until a worker has sent a response (delete, bury, etc.).
IronMQ is a hosted RESTful web service.
There is push-like support for IronMQ. A subscriber can be called whenever a provider enqueues a job to the queue. Generally you will want to use the standard RESTful service to enqueue and dequeue jobs instead of the push approach.
SQS is a hosted web service.
There is no push support for SQS. You must poll at regular intervals to check if there are jobs in the queue.
SQS can use long polling, with a configurable wait time (default: 0 seconds, max: 20 seconds), to keep a connection open while the worker waits for a job. This can mean fewer requests, at the cost of sockets staying open for longer.
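The trade-off between short and long polling can be sketched without touching SQS at all. The example below is an in-memory simulation using Python's standard `queue` module, not the SQS API: `receive` and `poll_until_job` are hypothetical helpers, and the 0.2 second delay and 2 second wait are arbitrary values chosen for the demonstration.

```python
import queue
import threading
import time


def receive(q, wait_seconds=0):
    """Return one job, or None if nothing arrives within wait_seconds."""
    try:
        if wait_seconds == 0:
            return q.get_nowait()             # short poll: answer immediately
        return q.get(timeout=wait_seconds)    # long poll: hold the request open
    except queue.Empty:
        return None


def poll_until_job(q, wait_seconds, pause=0.01):
    """Call receive() until a job arrives; return (job, number_of_requests)."""
    requests = 0
    while True:
        requests += 1
        job = receive(q, wait_seconds)
        if job is not None:
            return job, requests
        time.sleep(pause)                     # back off before polling again


def run(wait_seconds):
    q = queue.Queue()
    # Simulate a provider that enqueues a job 0.2 seconds from now.
    threading.Timer(0.2, q.put, args=("job-1",)).start()
    return poll_until_job(q, wait_seconds)


short_job, short_requests = run(wait_seconds=0)   # many quick, empty requests
long_job, long_requests = run(wait_seconds=2)     # one request that waits
```

Both runs receive the same job, but the short-polling run makes many empty requests before the job appears, while the long-polling run makes a single request that stays open until the job arrives.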
There are many open source Beanstalkd client libraries available in a myriad of programming languages. These are all independent projects from Beanstalkd.
The IronMQ client libraries are provided by Iron.IO and can be downloaded from the Dev Center.
You can also use a Beanstalkd client library with IronMQ if you'd like the flexibility of switching between the two services; however some commands (e.g.: kick, bury) are not supported. You also may need to implement the oauth command manually to connect to the service.
The AWS client libraries include the SQS client libraries. These are provided by Amazon and are available in many programming languages.
Beanstalkd does not ship with a graphical management interface by default. There are some open source projects to help with debugging and administration which can be found on the Beanstalkd tools page.
The IronMQ dashboard manages queues. It contains a helpful tutorial describing how to set up queues and shows you how to add jobs (IronMQ: messages) to a queue via cURL.
The interface allows you to manage your queues in an AJAX-driven website. You can create, read and delete jobs, view historical information and manage queue configuration from the dashboard view.
The AWS Management Console allows you to manage SQS. The interface is built on top of a stateless protocol so you need to press the refresh button to get up-to-date information.
You can create, read and delete jobs (SQS: messages) and manage queue configuration.
With Beanstalkd, redundancy is handled on the client side, and if a server goes down you will lose jobs.
Beanstalkd does include an option to store jobs in a binary log. You must launch Beanstalkd with this option (-b) enabled; however, restoring the queue is a manual task and requires access to the server disks.
IronMQ is a cloud-based service with high persistence, availability and redundancy.
SQS stores jobs on multiple servers in a hosted zone. This approach ensures the availability of the service, and jobs should never be lost.
|Beanstalkd|IronMQ|Amazon SQS|
|None|Token|Key & secret|
No authentication is required to connect to Beanstalkd. Providers are able to enqueue jobs and workers are able to reserve jobs without passing through a security model. For this reason it is highly recommended to create a firewall blocking external connections to the port that Beanstalkd is running on.
With IronMQ, you can invite collaborators via the project settings to use your message queues. Authentication to the application is done via an Iron.IO token and a project ID.
Authentication to SQS is realised through the Amazon API key and secret. Permissions can be granted and revoked for other AWS accounts to access your queues via the AWS Management Console.
|Beanstalkd|IronMQ|Amazon SQS|
|Fast|Internet Latency|Internet Latency|
Beanstalkd is very fast as it should be on the same network as its providers and workers. Beanstalkd can sometimes be so fast that if a provider puts a job in a queue and follows it with a call to MySQL, a worker may pick up your job before MySQL has finished executing.
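One common way to sidestep this race is to finish the database write before enqueuing the job. The following is only a sketch of that ordering: SQLite (from Python's standard library) stands in for MySQL, an in-process `queue.Queue` stands in for Beanstalkd, and the `orders` table, `place_order` and `work_one` are hypothetical names made up for the example.

```python
import queue
import sqlite3

db = sqlite3.connect(":memory:")   # stand-in for the MySQL database
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
jobs = queue.Queue()               # stand-in for the Beanstalkd tube


def place_order(order_id):
    db.execute("INSERT INTO orders (id, status) VALUES (?, 'new')", (order_id,))
    db.commit()          # make sure the row is stored first...
    jobs.put(order_id)   # ...and only then tell the workers about it


def work_one():
    order_id = jobs.get_nowait()
    row = db.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        # This is what a worker would see if the job had been enqueued
        # before the database write had finished.
        raise LookupError(f"order {order_id} is not in the database yet")
    return order_id, row[0]


place_order(7)
result = work_one()
```

If the enqueue cannot be moved after the write, an alternative is for the worker to release the job and retry it a little later when the expected row is missing.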
Requests have an increased latency as they are sent to the IronMQ RESTful web service via HTTP.
Requests have an increased latency as they are sent to the SQS web service via HTTP.
Jobs may not be picked up straight away as they need to be distributed across different servers and data centres. This latency should be negligible if the application, a provider or a worker is hosted on an EC2 instance.
When you enqueue a job to SQS, it might not be immediately available. Jobs must be propagated to other servers. There is generally a one second wait at most.
|Beanstalkd|IronMQ|Amazon SQS|
|Prioritisable|No priority|No priority|
Beanstalkd queues are FIFO (first in, first out). Jobs with higher importance can be prioritised, which will affect the order in which jobs are dequeued.
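The combination of priorities with FIFO ordering can be modelled with a heap. This is an in-memory sketch rather than Beanstalkd itself: `PriorityTube` is a hypothetical class, and the default priority of 1024 is simply a value picked for the example. As in Beanstalkd, a lower number means a more urgent job.

```python
import heapq
import itertools


class PriorityTube:
    """In-memory model of a queue that is FIFO within each priority level."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()   # tie-breaker that preserves FIFO

    def put(self, body, priority=1024):
        # Tuples compare element by element: priority first, then arrival order.
        heapq.heappush(self._heap, (priority, next(self._sequence), body))

    def reserve(self):
        _priority, _sequence, body = heapq.heappop(self._heap)
        return body


tube = PriorityTube()
tube.put("report-a")
tube.put("report-b")
tube.put("password-reset", priority=0)   # urgent job enqueued last

order = [tube.reserve() for _ in range(3)]
```

Here `order` comes out as `["password-reset", "report-a", "report-b"]`: the urgent job jumps ahead, while the two reports keep the order in which they were enqueued.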
IronMQ queues are FIFO (first in, first out). Jobs cannot be prioritised.
With SQS, jobs are not guaranteed to come out in the same order that they entered the queue. Because SQS is a distributed service, jobs will be available on each server at different times. This is something to be acutely aware of when designing for SQS.
One-time pickup describes the restriction that unless a worker has timed out, two or more workers will never run the same job in parallel.
The socket-based architecture of Beanstalkd ensures one-time pickup.
IronMQ guarantees one-time pickup.
Because SQS is a distributed service, one-time pickup is not guaranteed, although it is unlikely that two workers will receive the same job.
Jobs are automatically returned to the queue if a worker doesn't respond to Beanstalkd within a set amount of time or if the socket closes without responding to the job.
It's then ready for immediate pick-up by the next requesting worker (it doesn't need to be kicked).
IronMQ & SQS
Workers connect to a queue and reserve a job. From this moment, the worker has a set amount of time to delete the job from the queue before it is released and becomes available for workers to reserve again.
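The reserve-then-delete cycle with a timeout can be sketched as a small in-memory model. `TimeoutQueue` is a hypothetical class written for this illustration, not the IronMQ or SQS API, and the current time is passed in explicitly as `now` (in seconds) so that the behaviour is deterministic.

```python
import itertools


class TimeoutQueue:
    """In-memory model of a queue whose reserved jobs reappear after a timeout."""

    def __init__(self, timeout=60):
        self.timeout = timeout
        self._ids = itertools.count(1)
        self._ready = []       # (job_id, body) pairs, oldest first
        self._reserved = {}    # job_id -> (body, deadline)

    def enqueue(self, body):
        job_id = next(self._ids)
        self._ready.append((job_id, body))
        return job_id

    def _release_expired(self, now):
        # Return jobs whose worker ran out of time to the ready list.
        for job_id, (body, deadline) in list(self._reserved.items()):
            if now >= deadline:
                del self._reserved[job_id]
                self._ready.append((job_id, body))

    def reserve(self, now):
        self._release_expired(now)
        if not self._ready:
            return None
        job_id, body = self._ready.pop(0)
        self._reserved[job_id] = (body, now + self.timeout)
        return job_id, body

    def delete(self, job_id, now):
        self._release_expired(now)
        # True only if the job was still reserved when the delete arrived.
        return self._reserved.pop(job_id, None) is not None


q = TimeoutQueue(timeout=60)
q.enqueue("resize-image-42")
first = q.reserve(now=0)        # a worker reserves the job
hidden = q.reserve(now=30)      # still reserved, so nothing is available
again = q.reserve(now=61)       # timeout passed: the job is handed out again
deleted = q.delete(1, now=70)   # the second worker finishes in time
```

A worker that deletes too late gets `False` back in this model, which is a useful signal that another worker may already have picked up the same job.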
Creating new queues
|Beanstalkd|IronMQ|Amazon SQS|
|Automatic|Auto & manual|Manual|
Queues (Beanstalkd: tubes) are automatically created when jobs are enqueued. They do not need to be created manually.
IronMQ requires you to create a project in the dashboard. One project contains many queues. Queues can either be created automatically when jobs are enqueued or manually created with configuration from the dashboard.
Queues must be manually set up from the AWS Management Console for SQS. Each queue will generate a unique URL which acts as the queue name.
Note the region (e.g.: us-west-1, eu-west-1, etc.) that the queue belongs to as it's required to connect to SQS.
The Laravel framework has an excellent built-in wrapper which encapsulates message queues for Beanstalkd, IronMQ and Amazon SQS. You can change servers through configuration without altering any of your application code.
PHP code samples
These code examples show you how you can connect to a server, enqueue, reserve and dequeue a job from a queue. If an exception is thrown, it will bury the job (if the server supports it).
Try stopping the execution after a job has been enqueued and using a management tool to debug your queue.
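As a language-neutral illustration of that flow, here is a minimal in-memory sketch in Python. It does not talk to Beanstalkd, IronMQ or SQS: `Tube`, `process` and `work` are hypothetical names, and the model only mirrors the enqueue, reserve, delete, bury and kick operations described in this article.

```python
from collections import deque


class Tube:
    """Tiny in-memory stand-in for a queue that supports bury and kick."""

    def __init__(self):
        self.ready = deque()
        self.reserved = {}
        self.buried = {}
        self._next_id = 1

    def enqueue(self, body):
        job_id = self._next_id
        self._next_id += 1
        self.ready.append((job_id, body))
        return job_id

    def reserve(self):
        if not self.ready:
            return None
        job_id, body = self.ready.popleft()
        self.reserved[job_id] = body
        return job_id, body

    def delete(self, job_id):
        del self.reserved[job_id]          # job completed, remove it

    def bury(self, job_id):
        self.buried[job_id] = self.reserved.pop(job_id)   # park a failed job

    def kick(self):
        for job_id, body in self.buried.items():
            self.ready.append((job_id, body))             # make it ready again
        self.buried.clear()


def process(body):
    # Hypothetical job handler that fails on one particular body.
    if body == "bad":
        raise ValueError("cannot process job")
    return body.upper()


def work(tube):
    results = []
    job = tube.reserve()
    while job is not None:
        job_id, body = job
        try:
            results.append(process(body))
            tube.delete(job_id)
        except Exception:
            tube.bury(job_id)   # keep the failed job aside for inspection
        job = tube.reserve()
    return results


tube = Tube()
for body in ("first", "bad", "second"):
    tube.enqueue(body)
results = work(tube)
```

After the run, `results` holds the two successful jobs and `tube.buried` holds the failed one, which stays out of the way until `tube.kick()` returns it to the ready list.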
Tips for message queues
Regardless of which service you select, here are some tips for keeping your queues robust:
Your job can contain whatever data you like, provided it's within the limit of the server's job data size. Use JSON in your job body to make metadata easy to transmit.
Limit your job data size
Try not to crowd jobs with too much metadata. If you can store some information in a database and only queue an ID for later processing, your queue will be more robust and easier to debug.
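The two tips above combine naturally: a small JSON body that carries only an ID, with the bulky details looked up when the job runs. In this sketch the `REPORTS` dictionary is a hypothetical stand-in for a database table, and the field names are invented for the example.

```python
import json

# Hypothetical stand-in for a database table of report settings.
REPORTS = {
    17: {"owner": "alice@example.com", "rows": 25000, "format": "pdf"},
}


def build_job(report_id):
    # Queue only what the worker needs to find the rest of the data.
    return json.dumps({"task": "generate_report", "report_id": report_id})


def handle_job(body):
    job = json.loads(body)
    report = REPORTS[job["report_id"]]   # fetch the details at processing time
    return f"{job['task']} for {report['owner']} as {report['format']}"


body = build_job(17)
summary = handle_job(body)
```

The job body stays a few dozen bytes long regardless of how much data the report itself involves, and the worker always sees the latest version of the record.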
Keep track of job states
If for some reason an item which has already been processed re-enters a queue, you probably don't want it to be reprocessed. Unfortunately the job data is not forced to be unique and it's important that you keep track of the state of a job in a database.
This can be as simple as having a column on your jobs table to mark an item as processed. You can delete the job from the queue if it has already been handled.
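A processed column of this kind might look like the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The `jobs` table, its `processed` column and the `handle` function are assumptions made for the illustration, not part of any queue service.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, processed INTEGER NOT NULL DEFAULT 0)"
)
db.execute("INSERT INTO jobs (id) VALUES (1)")
db.commit()

handled = []   # records the jobs that were actually worked on


def handle(job_id):
    # Claim the row: the UPDATE only matches if the job is still unprocessed.
    cursor = db.execute(
        "UPDATE jobs SET processed = 1 WHERE id = ? AND processed = 0",
        (job_id,),
    )
    db.commit()
    if cursor.rowcount == 0:
        return "skipped"       # already handled, so just delete it from the queue
    handled.append(job_id)     # the real work would happen here
    return "processed"


# The same job arrives twice, for example after a timeout or duplicate delivery.
outcomes = [handle(1), handle(1)]
```

Marking the row before doing the work keeps the example short; an application that needs to retry failed work would likely want more states than a single flag, such as pending, running and done.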
Some words are used differently between Beanstalkd, Amazon SQS and IronMQ. Here's a quick list of translations:
|Beanstalkd|Amazon SQS|IronMQ|
|Job data|Message body|Message body|
|TTR (time-to-run)|Visibility timeout|Timeout|
|–|Retention Period|Expires in|
When working with queues you may come across these terms:
Bury – puts a job in a failed state. The job cannot be reprocessed until it is manually kicked back into the queue. Not supported by IronMQ and SQS.
Consumer – see Worker.
Delay – defers a job from being sent to a worker for a predetermined amount of time.
Delete – see Dequeue.
Dequeue – marks a job as completed and removes it from the queue.
Enqueue – adds a job to a queue ready for a worker.
FIFO – describes the way jobs are handled in a queue as First In, First Out. This is the most common type of message queue.
FILO – describes the way jobs are handled in a queue as First In, Last Out.
Job – a deferred task in a queue containing metadata to identify what task is waiting to be processed. Akin to database rows.
Kick – returns a previously buried job to the queue ready for workers to pick up. Not supported by IronMQ and SQS.
Provider – a client which connects to the message server to create jobs.
Queue (Beanstalkd: tube) – a way to group similar jobs. Akin to database tables.
Reserve – delivers a job to a worker and locks it from being delivered to another worker.
Worker – a client which connects to the message server to reserve, delete and bury jobs. These perform the labour-intensive part of the processing.
There is no silver bullet for message queue services. Beanstalkd, IronMQ and Amazon SQS all have their strengths and weaknesses which can be used to your advantage. This article should provide you with enough information to help you make an informed decision as to which service is best for your skill level and project needs.
Which message queue service will you be using? If you currently use queues, will you consider switching? Have you used message queues in an unconventional way that could help others? Leave a comment and let everyone know.
Bash is a London-based creative technologist specialising in physical computing and backend development. In the past, he has studied and tutored at universities in Australia but now works in a digital agency.