Stacking Delays On Tasks

One of my favorite task queue tricks is to add a stacking delay whenever multiple tasks need to be added. Here’s a sample piece of code:

TaskOptions task = TaskOptions.Builder.withUrl("/example");
task.param(parameter_name, parameter_value);
//Stacking delay here. Every subsequent task is delayed 
//an extra 5 minutes.
task.countdownMillis((1000 * 5) + (i * 5 * 60 * 1000));
Queue queue = QueueFactory.getDefaultQueue();
queue.add(task);

Notice the i variable in the countdownMillis call. The countdownMillis method tells the task to delay its execution by a certain number of milliseconds.

Placing this code within a for loop (with i as the loop counter) causes each added task to be delayed for 5 seconds, plus an additional 5 minutes over the previous task. To see how it works, consider how the i variable is processed: the first task (when i = 0) will execute in 5 seconds (due to the 1000 times 5 part – 5,000 milliseconds translates into 5 seconds). The second task (when i = 1) will execute in 5 minutes and 5 seconds (1000 * 5 + 1 * 5 * 60 * 1000 becomes 5,000 + 300,000, or 305,000 milliseconds). The third task will wait for 605,000 milliseconds, which is 10 minutes plus 5 seconds, and so forth.
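
Putting it all together, here’s a rough sketch of that loop (the five-task count, the parameter name, and the parameter value below are placeholders):

Queue queue = QueueFactory.getDefaultQueue();
//Queue five tasks; each one waits 5 minutes longer than the one before it.
for (int i = 0; i < 5; i++) {
    TaskOptions task = TaskOptions.Builder.withUrl("/example");
    task.param("parameter_name", "parameter_value");
    //Stacking delay: 5 seconds plus an extra 5 minutes per task.
    task.countdownMillis((1000 * 5) + (i * 5 * 60 * 1000));
    queue.add(task);
}//end loop queuing the delayed tasks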

Staggering tasks like this can make an application run smoother and with fewer spikes in resource usage.

CORS Preflight Request Testing In cURL

When browsers make cross-origin AJAX requests, they often send a “preflight request” before the actual request. This preflight is an HTTP OPTIONS call asking the server whether it supports the cross-origin resource sharing specification (in other words, whether it will accept the cross-origin request).

To test a server’s support for cross-origin resource sharing (CORS), you can use the cURL utility to emulate an HTTP OPTIONS request. A server that supports CORS will return a number of Access-Control headers specifying the requests it supports. Here’s an example cURL command:

curl -H "Origin: http://www.example.com" \
  -H "Access-Control-Request-Method: POST" \
  -H "Access-Control-Request-Headers: X-Requested-With" \
  -X OPTIONS --verbose \
  http://ip.jsontest.com/

Here’s a rough sketch of what a proper CORS preflight response might look like (the exact values will vary from server to server; the headers below match the description that follows):
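
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST
Access-Control-Max-Age: 86400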

Access-Control-Allow-Origin is set to a wildcard, which means that all domains are permitted to make requests to it. Access-Control-Max-Age means that the results of this preflight request can be saved for 86,400 seconds (1 day). Access-Control-Allow-Methods means that GET and POST requests are supported.

Cloudflare Error

Many App Engine applications use Cloudflare for caching, SSL, and other services. However, if Cloudflare can’t reach your website (for example, because a request is taking too long to finish), Cloudflare shows an error screen to the user.

Extract All HTTP Headers In Java

Here’s a code snippet that extracts all HTTP request headers and loops through them.

The variable req represents a javax.servlet.http.HttpServletRequest reference, while header_name and header_value represent the name and value of the header. A single header name may have multiple values; if so, header_value will record the first value listed.

Enumeration<String> headers = req.getHeaderNames();
//Loop through all headers
while (headers.hasMoreElements()) {
    String header_name = headers.nextElement();
    String header_value = req.getHeader(header_name);
    //Do something with the header name and value
}//end while loop going through headers.
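
If a header has multiple values and you need all of them rather than just the first, getHeaders returns every value for a name. Here’s a minimal sketch, reusing the header_name variable from the loop above:

//Retrieve every value listed under a single header name.
Enumeration<String> values = req.getHeaders(header_name);
while (values.hasMoreElements()) {
    String single_value = values.nextElement();
    //Do something with each individual value.
}//end while loop going through the values of one header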

Remember that HTTP header names are case-insensitive. If you’re doing any comparisons, you may want to convert the name to lowercase first:

String header_name_lowercase = header_name.toLowerCase();

Don’t forget to import the Enumeration class:

import java.util.Enumeration;

Clearing Memcache

A quick note today: if you upload new data to the datastore via the bulk uploader or change your application’s data model, you should flush your application’s memcache to prevent stale data from being served to browsers. To do this, go to the Memcache Viewer screen (under the Data heading in the navigation bar) and press the button marked Flush Cache.
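
If you’d rather flush memcache from code instead of the admin console, here’s a quick sketch using the memcache API’s clearAll method:

//Flush every entry in the application's memcache.
MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
memcache.clearAll();

Remember to import the memcache classes:

import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;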

Looking Up App Engine Issues

Found a bug within App Engine? Want to request a feature? The best way to notify Google is to file an issue within App Engine’s tracker, located at https://code.google.com/p/googleappengine/issues/list.

You can file an issue by clicking the New Issue button in the top left corner of the page.

But before you file a new issue, make sure that someone else hasn’t already filed the same issue. You can search existing issues using the search box on the upper portion of the page.

If you don’t want to search, you can browse through the issues using the dropdown boxes in each header.

Retrieving All Entities Older Than An Arbitrary Date

Here’s a Java code example that searches the datastore for all entities of a kind that are older than a given date.

The variable kind is the entity kind being searched, add_date is a property on each entity that is set to the date the entity was created, and entities is a java.util.List object containing the returned entities. The variable time_point represents a point in time; we query the datastore for all entities whose add_date is less than or equal to that point.

/**
 * Retrieve all entities older than a set amount of time.
 */
Query q = new Query(kind);
//Represents a point in time 48 hours ago.
Date time_point = new Date((new Date().getTime()) - (1000 * 60 * 60 * 48));
Query.Filter time_point_filter = new Query.FilterPredicate("add_date", Query.FilterOperator.LESS_THAN_OR_EQUAL, time_point);
q.setFilter(time_point_filter);
PreparedQuery pq = DatastoreServiceFactory.getDatastoreService().prepare(q);
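//Note that withLimit(30) caps the results at 30 entities; raise the limit
//or use asIterable() if you need to retrieve more.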
List<Entity> entities = pq.asList(FetchOptions.Builder.withLimit(30));
System.out.println(entities.size() + " entities returned.");

Suppose you wanted to loop through all of the returned entities. Here’s an example:

//Loop through all entities
for (int i = 0; i < entities.size(); i++) {
    Entity entity = entities.get(i);
    System.out.println("Entity: " + entity.toString());
    //Do something with the entity variable.
}//end loop going through all entities
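
As with the other snippets, don’t forget the imports for the datastore classes (plus Date and List):

import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;
import java.util.Date;
import java.util.List;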

Hanging Memcache Calls

Recently there was a discussion in the App Engine forums about memcache calls that were hanging; in one instance, a memcache async put call was taking 2 hours to complete!

This was a particularly interesting issue, and I’d like to share a number of thoughts I had while solving it:

App Engine has a number of internal rate limiting/throttling controls on services. Moving large quantities of data around can quickly cause an application to hit these limits. In fact, I suspect that this was the actual problem – the original poster’s application was storing multiple megabytes of data into memcache in multiple asynchronous calls that occurred simultaneously; this design could easily be hitting a number of different rate limits. My suggestion for solving this problem (which ultimately worked) was to add a short delay after each memcache put call and to split the data amongst an increased number of memcache put calls. I suggested this fix for several reasons:

  1. Adding a short delay after each memcache put call buys time for App Engine’s rate limit to reset; it prevents App Engine from thinking that the application is malfunctioning or attempting to overwhelm the memcache pipeline.
  2. Delays are easy to implement – in Python it’s one call to time.sleep(number of seconds to delay), and in Java it’s a simple call to Thread.sleep(number of milliseconds to delay). Note that in Java, you have to catch the potential InterruptedException (see the sketch after this list). The Go call is similar to Python: call time.Sleep(delay duration). In PHP a delay is even simpler than in all of the above languages: all you need to do is call sleep(delay seconds).
  3. Increasing the number of memcache put calls means that a smaller amount of data is being stored for each memcache put. This contributes to point 1: preventing the pipeline to memcache from being overwhelmed with data.
  4. The delay doesn’t need to be long: two to five seconds is more than enough. In some cases, even a one second delay is enough to work.
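
To make that concrete, here’s a rough Java sketch of the delay-and-split approach. The chunks variable, the key names, and the two-second delay are all placeholder choices; the memcache imports are the same ones shown in the Clearing Memcache section above.

MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
//chunks is assumed to already hold the large payload split into smaller pieces.
for (int i = 0; i < chunks.size(); i++) {
    memcache.put("payload_chunk_" + i, chunks.get(i));
    try {
        //Pause briefly so any rate limiting has a chance to reset.
        Thread.sleep(2000);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
    }
}//end loop storing chunks with delays between puts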

Fortunately, the above fix worked in this case. But if it had not, I was prepared with a number of other possible fixes. For instance, I would have suggested the use of the task queue: split the data among multiple tasks, and then have each task store its data into memcache. Since each task would constitute a separate request and may be split amongst multiple instances, there’s less of a chance for any rate limiting to kick in. If that option wasn’t palatable for any reason, then another option would be to switch to dedicated memcache; it seems to be much more forgiving with regard to usage.

If none of the above options had worked, I would have suggested dumping memcache entirely and writing to the datastore/Cloud SQL. While memcache is a terrific service, it is not reliable – persisting the data through alternative sources is a much better way to manage large quantities of information.

The short version of this post: hanging or slow memcache calls can be fixed by inserting delays after each call and decreasing the amount of data handled in each memcache call.

Basic Java Task Queue Code

Here’s a simple example of how to use the task queue in Java. The code below retrieves the default queue and queues up a task. The task will request the /example_url path and pass in the parameter parameter1 with the value parameter1_value.

//Retrieve the default push queue and add a task that will POST to /example_url.
Queue queue = QueueFactory.getDefaultQueue();
TaskOptions task = TaskOptions.Builder.withUrl("/example_url");
//parameter1_value is a String variable holding the parameter's value.
task.param("parameter1", parameter1_value);
queue.add(task);

Remember to import the task queue classes:

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;