Saturday, March 31, 2012

Simple auto-completion with Google App Engine and Google Web Toolkit

This old post has been cross-posted from http://blog.yata.fr/ which is now closed.


Reading the post about self merge-joins and full-text search on Google App Engine, I thought this could be really helpful in some of my apps.

My needs were actually a bit different: I had simple names I wanted to search on, which is pretty much the same use case. Once my search was working nicely, I thought I could go further and add auto-completion to my search field.

Hold on! Auto-complete needs to be responsive, otherwise it is useless. Is there a faster way to retrieve entities from the App Engine datastore than checking whether one of their list properties contains a given word? Well... if there is one thing to remember about the datastore, it is that lookups by Key are much faster than any other query. Fine, let’s only use keys then!

Before going further, you can try the result directly there.

The code for this tutorial is given as an attachment to this post or can be downloaded from here.

Indexing
For the purposes of this post, we’ll consider a very simple entity class which only has a key and a name: the Country class. We want to enable auto-complete on this name field, so we need to build an index based on keys. Each index entity uses a part of the name as its own Key and holds a list of Keys: the keys of the actual Country entities.
In other words, for the name United Kingdom we would have the following index entities:


Key   | Key list
u     | “United Kingdom Key”, other keys of countries beginning with “u”...
un    | “United Kingdom Key”, other keys of countries beginning with “un”...
uni   | “United Kingdom Key”, other keys of countries beginning with “uni”...
...   | ...


Building such an index is actually really simple. Here is a piece of code that will do the job:

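// 'words' is assumed to hold the (lowercased) words of the country name and
// 'key' the encoded string key of the Country entity being indexed.
PersistenceManager pm = PMF.get().getPersistenceManager();
try {
    Key k = KeyFactory.stringToKey(key);
    for (String word : words) {
        for (int i = 0; i < word.length(); ++i) {
            String keyword = word.substring(0, i + 1);
            CountryIndex index;
            try {
                // getObjectById throws when no index entity exists for this keyword yet
                index = pm.getObjectById(CountryIndex.class, keyword);
            } catch (JDOObjectNotFoundException e) {
                index = new CountryIndex(keyword);
            }
            // Avoid storing the same country twice under one keyword
            if (!index.getKeys().contains(k)) {
                index.getKeys().add(k);
            }
            pm.makePersistent(index);
        }
    }
} finally {
    pm.close();
}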

Let’s explain this a bit:

For each word, we rebuild it from the beginning. For instance, United Kingdom is composed of two words:
    • United will have ‘u’, ‘un’, ‘uni’, ‘unit’... for indexes.
    • Kingdom will be broken down to ‘k’, ‘ki’, ‘kin’, ‘king’...
Each index is then given the Key of the country it indexes.
Doing this takes time, and you must not forget that a request can’t last more than 30s. The solution is to use task queues: not only to call this code, but also to break the work into several tasks, one per word for example.
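
To make the breakdown above concrete, here is a minimal in-memory sketch of the same idea. It is not the datastore code: the PrefixIndexSketch class, its methods and the "UK" / "NI" ids are hypothetical names used only for illustration, and a plain map stands in for the index entities.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** In-memory stand-in for the index entities (hypothetical, not datastore code). */
final class PrefixIndexSketch {

    // keyword -> ids of the countries having a word that starts with it
    private final Map<String, Set<String>> index = new TreeMap<>();

    /** Stores the country id under every prefix of every word of its name. */
    void add(String countryId, String name) {
        for (String word : name.toLowerCase(Locale.ROOT).split("\\s+")) {
            for (int i = 1; i <= word.length(); i++) {
                String keyword = word.substring(0, i);
                index.computeIfAbsent(keyword, k -> new LinkedHashSet<>()).add(countryId);
            }
        }
    }

    /** Returns the ids stored under the keyword, or an empty set if there is none. */
    Set<String> lookup(String keyword) {
        Set<String> ids = index.get(keyword.toLowerCase(Locale.ROOT));
        return ids == null ? new LinkedHashSet<String>() : ids;
    }

    public static void main(String[] args) {
        PrefixIndexSketch sketch = new PrefixIndexSketch();
        sketch.add("UK", "United Kingdom");
        sketch.add("NI", "Nicaragua");
        System.out.println(sketch.lookup("un")); // [UK]
        System.out.println(sketch.lookup("ki")); // [UK]
        System.out.println(sketch.lookup("ni")); // [NI]
    }
}
```

Each keyword maps directly to its list of ids, so a suggestion request is a single lookup, which is what the key-based index entities give you on the datastore.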

As you may have noticed, this snippet only creates a really simple index, which only lets you search for countries containing a word beginning with the keyword. Typing ‘ni’ will return Nicaragua but won’t give you United Kingdom.
I bet that if you came this far, it won’t take you long to find the algorithm that suits your expectations.
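
As one possible direction (an assumption on my side, not the algorithm used in the demo), the keywords could be every substring of each word instead of only its prefixes, so that ‘ni’ also leads to United. The SubstringKeywords class below is a hypothetical helper that only computes the keywords; storing them would work as in the indexing code.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: computes every substring of every word of a name. */
final class SubstringKeywords {

    static Set<String> keywords(String name) {
        Set<String> result = new LinkedHashSet<>();
        for (String word : name.toLowerCase(Locale.ROOT).split("\\s+")) {
            // Every start position combined with every end position after it
            for (int start = 0; start < word.length(); start++) {
                for (int end = start + 1; end <= word.length(); end++) {
                    result.add(word.substring(start, end));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> keywords = keywords("United Kingdom");
        System.out.println(keywords.contains("ni"));  // true
        System.out.println(keywords.contains("dom")); // true
    }
}
```

The price is the size of the index: a word of n letters produces up to n(n+1)/2 keywords instead of n, so this is probably only reasonable for short names.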


So what’s next?


Let’s have a look at what is needed to retrieve suggestions from a keyword:
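PersistenceManager pm = PMF.get().getPersistenceManager();
try {
    // One lookup by key to get the index entity for the typed keyword
    CountryIndex index = pm.getObjectById(CountryIndex.class, keyword);
    // One batch get by key to load the matching Country entities
    DatastoreService service = DatastoreServiceFactory.getDatastoreService();
    Map<Key, Entity> results = service.get(index.getKeys());

    ArrayList<String> suggestions = new ArrayList<String>();
    for (Entity entity : results.values()) {
        suggestions.add((String) entity.getProperty(Country.NAME));
    }
    return suggestions;
} catch (JDOObjectNotFoundException e) {
    // No index entity for this keyword: nothing to suggest
    return new ArrayList<String>();
} finally {
    pm.close();
}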

Simple and efficient: it only involves lookups by Key. I guess there is no need to go into more detail here.

Using GWT Suggest Box

To display these suggestions, GWT provides a really simple widget: SuggestBox. This widget must be instantiated with a SuggestOracle as a parameter. What we need is an oracle that can retrieve suggestions from the server. We’re simply going to use GWT RPC to do this.

Here is our RpcSuggestOracle:
public class RpcSuggestOracle extends SuggestOracle {
    private SuggestServiceAsync suggestService;

    public RpcSuggestOracle(SuggestServiceAsync searchService) {
        this.suggestService = searchService;
    }

    @Override
    public void requestSuggestions(final Request request, final Callback callback) {
        // We only support one keyword for now.
        if (!request.getQuery().contains(" ")) {
            suggestService.suggest(request.getQuery(),
                new AsyncCallback<ArrayList<String>>() {
                    public void onFailure(Throwable caught) {
                        Window.alert("Error while getting suggestions.");
                    }
                    public void onSuccess(ArrayList<String> result) {
                        if (result != null) {
                            ArrayList<Suggestion> suggestions =
                                new ArrayList<Suggestion>();
                            for (final String sug : result) {
                                suggestions.add(new Suggestion() {
                                    public String getReplacementString() { return sug; }
                                    public String getDisplayString() { return sug; }
                                });
                            }
                            Response resp = new Response(suggestions);
                            callback.onSuggestionsReady(request, resp);
                        }
                    }
                });
        } else {
            // Several keywords: answer with an empty list of suggestions.
            Response resp = new Response(new ArrayList<Suggestion>());
            callback.onSuggestionsReady(request, resp);
        }
    }
}
The list of Strings returned by our RPC service is used to create a list of Suggestion objects, which the SuggestBox will display. The only thing missing is the SuggestBox instance itself. I am a huge fan of UiBinder, so here is how to instantiate a SuggestBox with it.

SearchBox.ui.xml
<!DOCTYPE ui:UiBinder SYSTEM "http://dl.google.com/gwt/DTD/xhtml.ent">
<ui:UiBinder xmlns:ui="urn:ui:com.google.gwt.uibinder"
                xmlns:g="urn:import:com.google.gwt.user.client.ui">
    <ui:style>
        [...]
    </ui:style>
    <g:HTMLPanel>
        <g:SuggestBox ui:field="suggestbox" [...] />
        [...]
    </g:HTMLPanel>
</ui:UiBinder>
SearchBox.java
public class SearchBox extends Composite{
    private static SearchBoxUiBinder uiBinder = 
        GWT.create(SearchBoxUiBinder.class);

    interface SearchBoxUiBinder extends UiBinder<Widget, SearchBox> {}

    @UiField(provided = true)
    SuggestBox suggestbox;
 
    public SearchBox(RpcSuggestOracle rpcSuggestOracle) {
        suggestbox = new SuggestBox(rpcSuggestOracle);
        initWidget(uiBinder.createAndBindUi(this));
    }
}

Here we need to set the SuggestBox field to “provided = true” since its constructor takes a parameter.


Well, that’s pretty much all you need to get started!

Improvements


The first time you enter a keyword, you might notice that suggestions take a while to appear, while the following tries are really responsive. I don’t really know the actual reason for this. It seems like a cold start can take some time. To be investigated...

Here are some ideas to improve performance and responsiveness:
  • Use memcache as much as you can. It costs you about two lines of code and can significantly improve performance.
  • You can limit the number of suggestions returned. This can spare some computing time.
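
Both ideas can be sketched together in plain Java. This is only an illustration under assumptions: a small LRU map stands in for memcache (on App Engine you would go through the memcache service instead), and CachedSuggestions, its loader function and its two size parameters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical wrapper: caches suggestion lists and caps their length. */
final class CachedSuggestions {

    private final int maxSuggestions;
    private final Function<String, List<String>> loader; // e.g. the index lookup
    private final Map<String, List<String>> cache;       // stand-in for memcache

    CachedSuggestions(final int cacheSize, int maxSuggestions,
                      Function<String, List<String>> loader) {
        this.maxSuggestions = maxSuggestions;
        this.loader = loader;
        // Access-ordered map that drops its least recently used entry when full
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > cacheSize;
            }
        };
    }

    synchronized List<String> suggest(String keyword) {
        List<String> cached = cache.get(keyword);
        if (cached != null) {
            return cached; // cache hit: the loader is not called
        }
        List<String> all = loader.apply(keyword);
        // Keep only the first maxSuggestions entries
        List<String> limited =
            new ArrayList<>(all.subList(0, Math.min(maxSuggestions, all.size())));
        cache.put(keyword, limited);
        return limited;
    }
}
```

Note that a cached list becomes stale when a country is added or renamed, so the cache entry for each affected keyword has to be dropped or given an expiration when the index is updated.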
Some feature improvements:
  • Improve the index to find suggestions other than the ones beginning with the keyword.
  • Add multi-keyword support.
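
For multi-keyword support, one simple option (a sketch under assumptions, not something the demo implements) is to look up each keyword separately and keep only the countries found for all of them. The map below stands in for the index entities, and MultiKeywordSearch is a hypothetical name.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: intersects the index results of several keywords. */
final class MultiKeywordSearch {

    static Set<String> search(Map<String, Set<String>> index, String query) {
        Set<String> result = null;
        for (String keyword : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (keyword.isEmpty()) {
                continue; // blank query
            }
            Set<String> ids = index.getOrDefault(keyword, Collections.<String>emptySet());
            if (result == null) {
                result = new LinkedHashSet<>(ids); // first keyword: start from its ids
            } else {
                result.retainAll(ids);             // next keywords: keep common ids only
            }
            if (result.isEmpty()) {
                break; // nothing can match anymore
            }
        }
        return result == null ? new LinkedHashSet<String>() : result;
    }
}
```

With this in place, the oracle would no longer need to return an empty list as soon as the query contains a space.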

Other use

This index can be used to do simple search. However I must warn you about paging the results.
You might know that since SDK 1.3.6 there is no longer a 1000 entity limit on offset queries. This is really convenient for paging.
But with our index, you will have to find your own way to page results without losing performance. I may share my own way of doing this in another post dedicated to paging.
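
In the meantime, here is one simple possibility, which is only a sketch and not the method hinted at above: since the index entity already holds the full list of keys, you can slice that list and batch-get only the keys of the requested page. KeyPager and its parameters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: returns the slice of keys belonging to one page. */
final class KeyPager {

    static <T> List<T> page(List<T> keys, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= keys.size()) {
            return new ArrayList<>();            // past the last page
        }
        int to = (int) Math.min(keys.size(), from + pageSize);
        return new ArrayList<>(keys.subList((int) from, to));
    }
}
```

Only the keys of the current page are then fetched from the datastore. The index entity itself still has to be loaded whole, so this does not help when a single keyword matches a very large number of entities.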
Hope this post brought some help.

Saturday, September 3, 2011

Uploading (not that big) files to Google Sites via Google App Engine with GData Java client

This post will be a quick one.

I lost some time searching through forums and groups when I tried to do this, and it was actually so simple that I was almost angry at not having found it on my own...

Some context maybe.
Imagine your Google App Engine application has a form with which you intend to upload a file to Google Sites. The first question is: why? Well... in my case the idea was to use Google Sites as an Android package repository accessible by all my customer's users, and to manage this repository from Google App Engine.

Here is what the Google Sites API documentation says about uploading attachments.
However using a File on Google App Engine is not that easy.

I could also store a copy of these files in the Blobstore, but they're not important enough for me to spend storage quota on them. This means the ideal way of uploading them to Google Sites would be to "transfer" them in the same request as the form POST to our application.

And here is how I did it.

First of all, you have to get the file that was posted to your application. As a reminder, here is how to do it.
The only difference in our case is that instead of writing the stream back to the client, I used the IOUtils.toByteArray method to keep the file in memory, ready to be used for the transfer.
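
If you do not have IOUtils at hand, the same step can be sketched with the JDK only. The version below also refuses to buffer more than a given number of bytes, which seems prudent when everything is kept in memory. BoundedStreamReader and the maxBytes parameter are hypothetical names, and the right limit is for you to choose.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: reads a whole stream into memory, up to a size limit. */
final class BoundedStreamReader {

    static byte[] readFully(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            // Stop before the buffered content grows past the limit
            if (out.size() + read > maxBytes) {
                throw new IOException("Upload exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }
}
```

The resulting array plays the role of the bytes variable used in the upload code.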

// 'service' is an authenticated SitesService and 'bytes' the uploaded file content.
String contentFeedUrl = "https://sites.google.com/feeds/content/site/mysitename";
ContentFeed contentFeed = service.getFeed(
                 new URL(contentFeedUrl + "?kind=filecabinet"),
                 ContentFeed.class);
//I only have one file cabinet page, so I just take the first one.
FileCabinetPageEntry parentPage = contentFeed.getEntries(
                  FileCabinetPageEntry.class).get(0);
String apkType = "application/vnd.android.package-archive";

AttachmentEntry newAttachment = new AttachmentEntry();
//MediaByteArraySource accepts byte array instead of a File
newAttachment.setMediaSource(new MediaByteArraySource(bytes, apkType));
newAttachment.setTitle(new PlainTextConstruct("myAndroidAppName.apk"));
newAttachment.setSummary(new PlainTextConstruct("myAndroidAppDescription"));
newAttachment.addLink(SitesLink.Rel.PARENT,
                                    Link.Type.ATOM,
                                    parentPage.getSelfLink().getHref());

newAttachment = service.insert(new URL(contentFeedUrl), newAttachment);
String link = newAttachment.getLink(Rel.ALTERNATE, apkType).getHref();
//Guess you'll want to keep this link somewhere.
return link;

It is that simple. As stated in the title, I didn't test this with big files and I really don't know what would happen when keeping a large file in a byte array... So I leave that to you.
Feel free to let us know your results in the comments.

Also, I'm not saying this is the only solution, nor the best one. For example, you could first write the file to the Blobstore, then create a task that uploads it to Google Sites and deletes it from the Blobstore afterwards.

Having a quick look at the Google Data Java client source code, we can also find a MediaStreamSource class that could do the trick. So I guess this piece of code would work as well, and may even be a better solution for uploading larger files:

String apkType = "application/vnd.android.package-archive";
String contentFeedUrl = "https://sites.google.com/feeds/content/site/mysitename";
FileItemIterator iterator = new ServletFileUpload().getItemIterator(req);
while (iterator.hasNext()) {
    FileItemStream item = iterator.next();
    InputStream stream = item.openStream();

    if (!item.isFormField()) {
        SitesService service = new SitesService("AttachmentUploader-1");
        //Quick & Bad !! Always prefer OAuth when you can use it.
        service.setUserCredentials("myusername@gmail.com", "mypassword");
        service.setConnectTimeout(5000);
        service.setReadTimeout(5000);

        ContentFeed contentFeed = service.getFeed(
            new URL(contentFeedUrl + "?kind=filecabinet"), ContentFeed.class);
        FileCabinetPageEntry parentPage = contentFeed.getEntries(
            FileCabinetPageEntry.class).get(0);

        AttachmentEntry newAttachment = new AttachmentEntry();
        //MediaStreamSource reads from the stream instead of a byte array
        newAttachment.setMediaSource(new MediaStreamSource(stream, apkType));
        newAttachment.setTitle(new PlainTextConstruct("myAndroidAppName.apk"));
        newAttachment.setSummary(new PlainTextConstruct("myAndroidAppDescription"));
        newAttachment.addLink(SitesLink.Rel.PARENT, Link.Type.ATOM,
            parentPage.getSelfLink().getHref());

        newAttachment = service.insert(new URL(contentFeedUrl), newAttachment);
        String link = newAttachment.getLink(Rel.ALTERNATE, apkType).getHref();
        logger.info(link);
    } else {
        //Do what you have to do with other fields...
    }
}