OpenSocial in the Cloud

Lane Liabraaten, Google Developer Programs
May 2008

Intro

Some OpenSocial apps can be written entirely with client side JavaScript and HTML, leveraging the container to serve the page and store application data. In this case, the app can scale effortlessly because the only request hitting your server is for the gadget spec XML file, and even that is cached by the container.

However, there are lots of reasons to utilize your own third party server:

Setting up an OpenSocial app that uses a third party server is pretty simple. There are a few gotchas and caveats, but the real issues come up when your app becomes successful - serving millions of users and sending thousands of requests per second. Apps can grow especially fast on social networks, so before you launch your next social app you should think about how to scale up quickly if your app takes off.

The trouble is that scaling is hard to do quickly and expensive to implement. Luckily, several companies provide cloud computing resources: places where you can store data or run processes on virtual machines. These cloud computing solutions manage huge infrastructures so you can focus on your application and let the cloud handle all the requests and data at scale.

This tutorial focuses on a simple photo-sharing app that uses a third-party server to host photos and associated metadata. If this app is going to host millions of images and support many requests per second, we won't be able to run it on a single dedicated host. We'll break the app down and analyze the interactions between the OpenSocial App and the back end server. Then we'll implement the app in the cloud; first using Google App Engine, then leveraging Amazon's S3 data storage service. Finally, we'll look at several ways to reduce the amount of network traffic the app generates, making the app faster and less expensive to host in the cloud.

The Photo Pier App

Photo Pier is a photo-sharing app where users can upload photos and tag them with keywords. Users can also view the photos that their friends have uploaded and select their favorite photos to display on their profile.

To enable this functionality we'll need to store some data for each photo: a reference to the user who uploaded the photo (referred to hereafter as its owner), a unique name, and a collection of string-based tags or labels that the owner has set for the photo.

Software Requirements

You'll need a few resources to complete this tutorial:

Moving to the Cloud

If this app grows to serve millions of users and photos, shared hosting or even a dedicated server won't have the bandwidth or CPU cycles to handle all the requests. We could invest in more servers and network infrastructure, shard the database, and load-balance requests, but that takes time, money and expertise. If you'd rather work on the new features of the app, it's time to move into the cloud.

It's important to focus on the interactions between the app and your server when designing an application that will run in the cloud. If we standardize the communication protocol and data format, we can easily change the server side implementation without modifying the OpenSocial app.

Secure Communication from the app to the server

Before we look at the specific requests and responses flowing between app and server, let's cover how OpenSocial enables this communication to be secure. With a standard makeRequest call, anyone can send a request to our server with the appropriate parameters and tag images or even upload images into another user's profile. This could lead to an embarrassing situation for your users and even bigger trouble for you and your app. Luckily, OpenSocial provides a mechanism for preventing this type of malicious behavior.

You can configure the makeRequest method to digitally sign the requests your app makes to your server using OAuth's algorithm for parameter signing. This means that when your server receives a request, it can verify that the request came from your application hosted in a specific container. To implement this, the calls to makeRequest in the OpenSocial app spec XML specify that the request should be signed, and the code that handles requests on the server side verifies that a signature is included and valid.

The change in the app is pretty small—just add a parameter that tells the container to use SIGNED authorization:

var params = {};
params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.JSON;
params[gadgets.io.RequestParameters.AUTHORIZATION] = gadgets.io.AuthorizationType.SIGNED;

var url = buildUrl(this.request_base_url, 
                   ['photos'], [new Date().getTime()]);
gadgets.io.makeRequest(url, 
                       bind(this.closeFetchOwnerPhotos(callback), this), 
                       params);

When our server receives a request, we can verify that it came from our application by checking that the digital signature was signed by a valid container and that the application ID is correct. You can find code to do this in the opensocial-resources wiki.

Note that when signing a request, the container will add several request parameters including the id of the container and the id of the person that is using the app. The Photo Pier back-end leverages these parameters to know which user to associate each photo with.

Interactions between the app and the server

Based on the functionality described above, the following actions will result in the client-side app making a request to the server:

Uploading a new photo

To upload an image, the app creates a form that, when submitted, sends an HTTP POST request with the binary data and several additional parameters (don't worry, the form submission handles the encoding for you; all you have to do is specify an end-point).

http://<base_url>/photo

Note: <base_url> is used throughout this tutorial to denote the constant URL prefix. Every request URL begins with this prefix; only the end-point and the query string parameters change.

When the server receives this request, it stores the image file and some metadata (e.g. the ID of the owner) in the datastore. An HTML response is expected with the text "Photo added" and an <img> tag configured to display the newly uploaded photo.

Fetching the current user's photos

When the user first loads the canvas page, the app needs to send a request to the server to get the list of photos to display. The app uses the gadgets.io.makeRequest function to send an HTTP GET request to the following URL:

http://<base_url>/photos?arg0=<TIMESTAMP>

When the server receives this request, it uses the oauth_consumer_key and opensocial_owner_id request parameters to locate the photos previously uploaded by the current user. For each photo, the server will return a URL and a list of tags for the photo.

The response to this request is a JSON string in the following format:

{"resultsSet":[ { "url":"http://foo", "tags":["Aruba", "snorkling"] },
             { "url":"http://bar", "tags":["wedding", "cake"] } ]
}

Fetching photos for a user's friends

When the user clicks on the "Friends' Photos" tab, the app should display the photos that each of the user's friends has uploaded. Since our server isn't storing any relationship data, the app will need to send us a list of user IDs so we can fetch the appropriate photos. The fetchFriendPhotos method uses the makeRequest method to send an HTTP POST request with the list of IDs included as post data. This POST request will go to the following URL:

http://<base_url>/photos?arg0=<TIMESTAMP>

Notice that this is the same URL as above. The server will treat this request differently because the HTTP method is POST instead of GET. The post data in the request will be in the following format:


people=01495306580392390900,14088281537290874435

When our server receives the request it will parse this data and fetch the photos for each of the IDs provided by the request. The response is a JSON string in the following format:

{"resultsCollection":[ { "name" : "01495306580392390900",
               "photos" : [ { "url":"http://foo", "tags":["Aruba", "snorkling", "fish"] },
                            { "url":"http://bar", "tags":["snorkling", "shipwreck"] } ] },
             { "name" : "14088281537290874435",
               "photos" : [ { "url":"http://baz", "tags":["food", "pasta", "linguini"] },
                            { "url":"http://raz", "tags":["food", "dessert", "apple pie"] } ] } ]
}
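
As a rough illustration of the two response shapes, the following standalone sketch builds both with Python's standard json module (the server code in this tutorial uses simplejson, which offers the same dumps call). The helper names build_results_set and build_results_collection are made up for this example, and the sample data is invented.

```python
import json

def build_results_set(photos):
    """Response for a single user's photos: one list under "resultsSet"."""
    return json.dumps({"resultsSet": photos})

def build_results_collection(photos_by_person):
    """Response for several users: one entry per person under "resultsCollection"."""
    collection = [
        {"name": person_id, "photos": photos}
        for person_id, photos in photos_by_person
        if photos  # people without photos are left out of the response
    ]
    return json.dumps({"resultsCollection": collection})

# Invented sample data: the second person has not uploaded anything.
sample = [
    ("01495306580392390900", [{"url": "http://foo", "tags": ["Aruba", "fish"]}]),
    ("14088281537290874435", []),
]
single = build_results_set(sample[0][1])
multi = build_results_collection(sample)
```

Keeping the person IDs as strings matters here: IDs such as 01495306580392390900 would lose their leading zero if they were converted to numbers.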

Adding a tag to a photo

To add a text tag to a photo, the app sends the text along with the extended name of the photo (this includes the owner's OpenSocial ID and the photo name set during upload) to the server. The app uses the gadgets.io.makeRequest function to send an HTTP POST request to the following URL:

http://<base_url>/photo/<EXTENDED_PHOTO_NAME>

The actual text of the tag is sent as post data. When the server receives this request, it parses the extended photo name and uses the components to locate the stored photo to be tagged. The tag is then associated with the photo in the datastore. If the tagging is successful, the plain-text response should read "Tag added!".

Fetching tags added by a user

Finally, we'll add a request for returning all tags that a user has added. This set of tags can be shown in a drop-down list to enable the user to tag related photos more easily. Once again, the application uses a gadgets.io.makeRequest function call to send a GET request, this time to the URL below.

http://<base_url>/tags

The response is a stringified JSON object that looks like this:

{"tags":["Aruba", "snorkeling", "fish", "shipwreck"]}

Google App Engine

At this point you need to have all the software requirements installed on your development machine.

Our App Engine project will contain several files and a few third-party libraries detailed below:

The app.yaml file for our application is pretty simple:

application: datastore
version: 1
runtime: python
api_version: 1

handlers:
- url: /scripts
  static_dir: scripts

- url: /modules/.*
  script: modules.py

- url: /.*
  script: main.py

This file can be used to specify multiple handler scripts or locations for static content as we have done above. See Configuring an App for more details on this file.

Let's start coding our application. First we need to define handler classes for the various URLs that our app will be receiving requests on. The OpenSocial in the Cloud resource bundle contains the following skeleton of request handlers in cloud.py:

# cloud.py

import sys
sys.path.append('lib')

import re
import cgi
import urllib
import simplejson

from google.appengine.ext import webapp
from google.appengine.ext import db
from math import floor
from time import time

# Signature validation required libraries import
import base64
import hashlib
import oauth

from Crypto.PublicKey import RSA
from Crypto.Util import number

# Local port; change if another process is running on 8080
PORT = '8080'

class RootHandler(webapp.RequestHandler):
  def get(self):
    self.response.out.write("RootHandler received a GET request")

class TagsHandler(webapp.RequestHandler):    
  def get(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return

    self.response.out.write("TagsHandler received a GET request")

class PhotosHandler(webapp.RequestHandler):    
  def get(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return

    self.response.out.write("PhotosHandler received a GET request")

  def post(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return

    self.response.out.write("PhotosHandler received a POST request")

class PhotoHandler(webapp.RequestHandler):    
  def get(self):
    self.response.out.write("PhotoHandler received a GET request")
  
  def post(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return

    self.response.out.write("PhotoHandler received a POST request")

def _isValidSignature(self):

  # Code lab hack:
  # If the container is 'appengine' (e.g. app is running on localhost), return True
  if self.request.get('oauth_consumer_key') == 'appengine':
    return True

  # Construct a RSA.pubkey object
  exponent = 65537
  public_key_str = """0x\
00b1e057678343866db89d7dec2518\
99261bf2f5e0d95f5d868f81d600c9\
a101c9e6da20606290228308551ed3\
acf9921421dcd01ef1de35dd3275cd\
4983c7be0be325ce8dfc3af6860f7a\
b0bf32742cd9fb2fcd1cd1756bbc40\
0b743f73acefb45d26694caf4f26b9\
765b9f65665245524de957e8c547c3\
58781fdfb68ec056d1"""
  public_key_long = long(public_key_str, 16)
  public_key = RSA.construct((public_key_long, exponent))

  # Rebuild the message hash locally
  oauth_request = oauth.OAuthRequest(http_method=self.request.method, 
                                     http_url=self.request.url, 
                                     parameters=self.request.params.mixed())
  message = '&'.join((oauth.escape(oauth_request.get_normalized_http_method()),
                      oauth.escape(oauth_request.get_normalized_http_url()),
                      oauth.escape(oauth_request.get_normalized_parameters()),))
  local_hash = hashlib.sha1(message).digest()

  # Apply the public key to the signature from the remote host
  sig = base64.decodestring(urllib.unquote(self.request.params.mixed()["oauth_signature"]))
  remote_hash = public_key.encrypt(sig, '')[0][-20:]

  # Verify that the locally-built value matches the value from the remote server.
  return local_hash == remote_hash

To keep this code nice and orderly, we place the end-point-to-class mapping in a separate file called main.py. The contents of this file should look like the following:

import wsgiref.handlers

from google.appengine.ext import webapp

import cloud

def main():  
  application = webapp.WSGIApplication([('/', cloud.RootHandler),
                                        ('/tags', cloud.TagsHandler),
                                        ('/photos', cloud.PhotosHandler),
                                        ('/photo/.*', cloud.PhotoHandler),
                                        ('/photo', cloud.PhotoHandler)],
                                        debug=True)
                                        
  user = cloud.User.get_by_key_name(''.join(['appengine', '00000000000000000000']))
  if not user:
    _initializeDatastore()
  
  # Start application
  wsgiref.handlers.CGIHandler().run(application)
  
if __name__ == "__main__":
  main()
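
To see why the mapping lists both '/photo/.*' and '/photo', here is a small standalone sketch of first-match URL routing. It assumes, as webapp does, that a pattern has to match the whole request path. The resolve function and the handler names written as plain strings are stand-ins for illustration, not part of webapp.

```python
import re

# Same patterns as the WSGIApplication mapping, with handler names as plain strings.
ROUTES = [
    ('/', 'RootHandler'),
    ('/tags', 'TagsHandler'),
    ('/photos', 'PhotosHandler'),
    ('/photo/.*', 'PhotoHandler'),
    ('/photo', 'PhotoHandler'),
]

def resolve(path):
    """Return the first handler whose pattern matches the entire path, else None."""
    for pattern, handler in ROUTES:
        if re.fullmatch(pattern, path):
            return handler
    return None
```

Under that assumption, an upload sent to /photo does not match '/photo/.*' (there is no trailing slash), so the separate '/photo' entry is what routes it to PhotoHandler.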

Before moving forward, pause for a moment and look at the _isValidSignature function provided above. In production environments that support digitally signing requests, this snippet can be run to verify the authenticity of a request: it encodes the parameters in accordance with the OAuth specification, hashes the result, and compares that digest with the one recovered from the signature sent by the container. If they are the same, the request is known to be genuine; otherwise, it is spoofed and your application should stop processing it immediately. However, since we'll be running this example locally and not in a production environment, I've added a small section at the top of the routine that simply returns True if the container is 'appengine' (our fake container for this sample).

Be sure to remove this section if you deploy this application to a social network that does generate digital signatures for its requests.
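
The first half of _isValidSignature, rebuilding the string that the container signed, can be sketched without the oauth library. The version below is written for current Python 3 and follows the OAuth 1.0 rules for the signature base string as I understand them; the function names are invented for this example, the URL is assumed to be passed without its query string, and the RSA step is left out entirely.

```python
from urllib.parse import quote

def oauth_escape(value):
    """Percent-encode per OAuth 1.0: only letters, digits and -._~ stay unescaped."""
    return quote(str(value), safe='-._~')

def signature_base_string(method, url, params):
    """Build METHOD&URL&PARAMS from a dict of request parameters."""
    pairs = sorted(
        (oauth_escape(k), oauth_escape(v))
        for k, v in params.items()
        if k != 'oauth_signature'  # the signature itself is never signed
    )
    normalized = '&'.join('%s=%s' % pair for pair in pairs)
    return '&'.join([method.upper(), oauth_escape(url), oauth_escape(normalized)])

# Invented example request
base = signature_base_string(
    'get',
    'http://example.com/photos',
    {'opensocial_owner_id': '123', 'arg0': '1211', 'oauth_signature': 'ignored'},
)
```

Because the parameters are sorted and escaped in a fixed way, the server and the container arrive at the same string regardless of the order in which the parameters were sent.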

Now that we've got a simple app, we'll test it with the development web server. If you haven't already, download the SDK and uncompress it. From the google_appengine directory, run './dev_appserver.py <your_app_directory>'. Verify that you can access your app from a browser (the default URL will be http://localhost:8080/).

Data Model

Google App Engine uses an object model datastore instead of a relational database. This means you just need to define the data elements that your app will use as Python objects that inherit from the db.Model class.

The Photo Pier app will keep track of two types of objects in the datastore: users and photos.

class User(db.Model):
  container = db.StringProperty()    # the container this user came from
  containerId = db.StringProperty()  # the ID provided by the container for this user
 
class Photo(db.Model):
  name = db.StringProperty()         # a unique ID for the photo
  content = db.BlobProperty()        # the binary data of the image
  contentType = db.StringProperty()  # the type of image (e.g. .jpg, .gif, etc.)
  user = db.ReferenceProperty(User)  # a reference to the user that uploaded this image (like a foreign key)
  tags = db.StringListProperty()     # a list of tags for this photo

As you can see, Google App Engine supports many data types in the datastore. For a complete list, see the Types and Property Classes documentation.

Uploading an image

The following implementation of the PhotoHandler class defines a post method that will be invoked any time the app gets an HTTP POST request to the /photo end-point (as defined in the webapp.WSGIApplication constructor above). The post method first checks the container and personId parameters against the datastore to see if the user exists (and creates the user if not). Then the photo's binary data is read and stored as a blob in the datastore.

class PhotoHandler(webapp.RequestHandler):    
  def get(self):
    self.response.out.write("PhotoHandler received a GET request")
  
  def post(self):
    form = cgi.FieldStorage()
    
    fileItem = form['file']
    personId = form.getfirst('personId')
    container = form.getfirst('container')

    self.response.headers['Content-Type'] = 'text/html'

    photo = createPhoto(container, personId, fileItem)
    if photo:
      self.response.out.write('Photo added.<br/>')
      self.response.out.write(''.join(['<img src="http://localhost:', PORT, '/photo/', container, ':', personId, ':', photo.name, '" width="50"/><br/>']))

def getUser(container, personId):
  # The User model names this property containerId
  user = User.get_or_insert(''.join([container, personId]), container=container, containerId=personId)
  return user

def createPhoto(container, personId, fileItem):
  user = getUser(container, personId)

  name = ''.join([str(int(floor(time()))), fileItem.filename])
  key = ''.join([container, personId, '_', name])

  photo = Photo(key_name=key)
  photo.user = user
  photo.name = name
  photo.content = db.Blob(fileItem.file.read())
  photo.contentType = fileItem.type
  photo.put()

  return photo

Notice the use of the getUser function above, which is called from createPhoto. It looks in the datastore for a user with the given container and ID. If a match is found, it is returned to the caller. Otherwise, a new user instance is created in the datastore and returned. App Engine makes this very easy with the Model class's get_or_insert method.

Note that this technique for uploading photos is not secure. Any server could send an HTTP POST request to our server with the appropriate parameters and upload a photo as any user in the system. Although it's outside the scope of this article, we could provide a mechanism for our OpenSocial app to request a one-time-use token that it would include in the request to upload a photo.
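
One possible shape for such a token, purely as a sketch and not something the Photo Pier code implements: the server hands out a short-lived value tied to one user (for example in response to a signed makeRequest call), and accepts it exactly once on upload. Everything here is hypothetical, including the function names, the five-minute lifetime, the SECRET value, and the in-memory set of used nonces, which a real app would keep in the datastore. It is written for current Python 3.

```python
import hashlib
import hmac
import secrets
import time

SECRET = b'replace-with-a-real-server-secret'  # hypothetical; never sent to the client
_used_nonces = set()  # in-memory only; a real app would persist this

def issue_token(container, person_id, now=None):
    """Return 'expiry:nonce:signature', bound to one container and user."""
    expiry = int(now if now is not None else time.time()) + 300  # valid for 5 minutes
    nonce = secrets.token_hex(8)
    msg = ':'.join([container, person_id, str(expiry), nonce]).encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return ':'.join([str(expiry), nonce, sig])

def redeem_token(container, person_id, token, now=None):
    """Accept a token once: valid signature, not expired, nonce never seen before."""
    try:
        expiry, nonce, sig = token.split(':')
        expiry_int = int(expiry)
    except ValueError:
        return False  # wrong number of parts or a non-numeric expiry
    msg = ':'.join([container, person_id, expiry, nonce]).encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    current = now if now is not None else time.time()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    if current > expiry_int or nonce in _used_nonces:
        return False
    _used_nonces.add(nonce)
    return True
```

The upload form would carry the token as an extra field, and the post method would call redeem_token before storing anything.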

Fetching an individual's photo list

This implementation of the PhotosHandler class handles any HTTP GET requests with the get method. Again, it uses the parameters from the signed request to identify the user and fetches their photos via the getPhotos method. The photo information is written to the self.response.out object as a JSON string using the simplejson library's dumps method as demonstrated below.

class PhotosHandler(webapp.RequestHandler):    
  def get(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return
    
    personId = self.request.get('opensocial_owner_id')
    container = self.request.get('oauth_consumer_key')

    photos = getPhotos(container, personId)

    self.response.headers['Content-Type'] = 'text/plain'
    self.response.out.write(simplejson.dumps({'resultsSet': photos}))

The get method calls the getPhotos function, which in turn calls getPhotosForUser to query the datastore for all photos uploaded by the specified person. The relevant information from the returned Photo objects is placed in a list of Python dictionaries (a.k.a. associative arrays or maps) that can be stringified by the simplejson library.

def getPhotos(container, personId):
  retArray = []

  photos = getPhotosForUser(container, personId)

  if photos:
    for objt in photos:
      photo = {}
      photo['url'] = ''.join(['http://localhost:', PORT, '/photo/', container, ':', personId, ':', objt.name])
      photo['tags'] = objt.tags

      retArray.append(photo)

  return retArray

def getPhotosForUser(container, personId):
  user = getUser(container, personId)

  return db.GqlQuery("SELECT * FROM Photo WHERE user = :1", user)

Note how the getPhotosForUser function uses a db.GqlQuery object to fetch a collection of Photo objects from the datastore.

Fetching the photo lists of multiple individuals

Since the server doesn't store any relationship data, the PhotosHandler class checks the post data of the request for a list of IDs from the container. It then calls the getPhotos function for each ID and returns the aggregate result as a JSON string. This response differs slightly from the response returned for a single individual's photo list: instead of a single array being returned in the response, multiple arrays may be returned, one per person. See the sample response several sections up.

class PhotosHandler(webapp.RequestHandler):    
  def post(self):
    form = cgi.FieldStorage()

    peopleIds = urllib.unquote(form.getfirst('people')).split(',')
    
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return

    personId = self.request.get('opensocial_owner_id')
    container = self.request.get('oauth_consumer_key')

    photoSetCollection = []

    for id in peopleIds:
      photoSet = {}
      photoSet['name'] = id
      photoSet['photos'] = getPhotos(container, id)

      if len(photoSet['photos']) > 0:
        photoSetCollection.append(photoSet)

    self.response.headers['Content-Type'] = 'text/plain'
    self.response.out.write(simplejson.dumps({'resultsCollection': photoSetCollection}))

Note that the post data is URL-encoded in the request so the post method uses urllib.unquote before splitting the comma-separated list of person IDs.
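
The decoding step on its own looks roughly like this in current Python 3 (the handler above uses the Python 2 urllib.unquote). The parse_people name is invented for this sketch, and unlike the handler it also tolerates a missing or empty value.

```python
from urllib.parse import unquote

def parse_people(raw_value):
    """Decode the 'people' form value and split it into a list of ID strings."""
    decoded = unquote(raw_value or '')  # treat a missing value as an empty list
    return [person_id for person_id in decoded.split(',') if person_id]

# The comma arrives percent-encoded as %2C
ids = parse_people('01495306580392390900%2C14088281537290874435')
```

The IDs stay strings throughout, so leading zeros survive.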

Adding a tag to a photo

To add a tag to a photo, the PhotoHandler class queries the datastore for a Photo with the given name and owner. Both are passed into the request via the URL, in the form .../photo/<container>:<personId>:<photoName>. This format makes it fairly easy to find the appropriate Photo object in the datastore. Once found, the Photo object is updated with the new tag.

You may recall that PhotoHandler handles another type of POST request: photo uploading. We can distinguish between the two types of POST request by inspecting the parameters passed along with the request. An upload request, which is sent from a form, will have a 'file' member. If this member is present, we can proceed with the upload. Otherwise, we process the request as a tag post.

def post(self):
  form = cgi.FieldStorage()
  if not form.has_key('file'):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return
    
    personId = self.request.get('opensocial_owner_id')
    container = self.request.get('oauth_consumer_key')

    if form.getfirst('text'):
      textTag = urllib.unquote(form.getfirst('text'))

    match = re.search(r'^http://.*/photo/([\w\.]*?):([\w\.]*?):([\w\.]*)', urllib.unquote(self.request.uri))
    if match:
      photo = getPhoto(match.group(1), match.group(2), match.group(3))
      if photo:
        self.response.headers['Content-Type'] = 'text/plain'
        if 'textTag' in locals():
          addTextTagToPhoto(photo, textTag)
          self.response.out.write('Text tag added successfully')           
  else:
    # See file upload code above
    pass

def addTextTagToPhoto(photo, textTag):
  photo.tags.append(textTag)
  photo.put()

Notice that we've added a new conditional block at the very beginning of the method. If the form object has a 'file' member, we know that the request came from the upload form and we store the post data in the datastore as a Photo object. Otherwise, we parse the URI to get the photo name and owner credentials, retrieve the photo, and call the addTextTagToPhoto function to update the Photo object in the datastore.
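
The URI parsing can be tried in isolation. This sketch reuses the regular expression from the post method, in current Python 3; the parse_photo_uri name and the example URIs are invented.

```python
import re
from urllib.parse import unquote

# Same pattern as in the handler: container, person ID and photo name separated by colons.
PHOTO_URI = re.compile(r'^http://.*/photo/([\w\.]*?):([\w\.]*?):([\w\.]*)')

def parse_photo_uri(uri):
    """Return (container, personId, photoName), or None if the URI doesn't fit."""
    match = PHOTO_URI.search(unquote(uri))
    return match.groups() if match else None

# The colons arrive percent-encoded as %3A
parts = parse_photo_uri('http://localhost:8080/photo/orkut.com%3A123%3A1211pier.jpg')
```

One limitation worth knowing: each group only accepts word characters and dots, so a photo name containing a hyphen or a space is cut off at that character.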

Fetching an individual's tags

This request is fairly straightforward. When the request's end-point is /tags, the TagsHandler class kicks in and, with the help of the getTagsForUser function, fetches the user's photos, collecting all tags in a list. It then removes the duplicates from this list and returns the resulting list as a stringified JSON object using the simplejson library.

class TagsHandler(webapp.RequestHandler):    
  def get(self):
    if not _isValidSignature(self):
      self.response.out.write('SIGNATURE INVALID')
      return
    
    personId = self.request.get('opensocial_owner_id')
    container = self.request.get('oauth_consumer_key')

    tags = getTagsForUser(container, personId)

    self.response.headers['Content-Type'] = 'text/plain'
    self.response.out.write(simplejson.dumps({'tags': list(set(tags))}))

def getTagsForUser(container, personId):
  tags = []

  photos = getPhotosForUser(container, personId)

  if photos:
    for objt in photos:
      for tag in objt.tags:
        tags.append(tag)

  return tags

list(set(tags)) is a convenient and efficient way to remove duplicates from the original list.
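
One thing to keep in mind is that a set does not remember the order in which tags were added. If the drop-down should list tags in first-seen order, a variant along these lines would do it. This is a sketch that relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on; the unique_tags name is made up.

```python
def unique_tags(tags):
    """Remove duplicates while keeping the order in which tags first appeared."""
    return list(dict.fromkeys(tags))

tags = ["Aruba", "snorkeling", "fish", "snorkeling", "shipwreck", "Aruba"]
result = unique_tags(tags)
```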

Publishing the app

Up to this point, we've been using the development app server, but in order for an OpenSocial container like orkut or MySpace to access your Google App Engine application, the app needs to be hosted publicly. From the My Applications page, create a new application—you probably want to use something generic, like username-dev, since you can only create 3 apps with App Engine currently. Now update the app.yaml file to include this application name.

From the google_appengine directory, run './appcfg.py update <your_app_directory>' to publish your app. Make sure you can access the application at http://your_app_name.appspot.com/ from your browser.

Note: You will need to sign up for a Google App Engine account to upload your application.

Amazon S3

S3 is a web service provided by Amazon for file/data storage. Conveniently for our needs, it can store any file between one byte and five gigabytes in size and, because it is a "cloud" service, it is "infinitely" scalable. You can upload as many files as you'd like as fast as you'd like. Of course, this comes at a cost, but it's still significantly cheaper (not to mention far more convenient) than renting out space in a data center.

S3 is a REST-based service, meaning that you can use it in any development environment that is HTTP-aware. Cooler still, there are many open source client libraries available that make it a cinch to interact with S3 in whichever language you're most comfortable with. In the following section, we will use one such library to transform our sample above: instead of storing image binaries and text tags in the datastore, we will store the actual files and metadata in S3.

Python

In order to continue through this section, you will need to register as an S3 developer so that you can substitute your personal access and secret keys, which are needed by the library. Registration is easy enough. Once you have your keys, add the following to the top of cloud.py:

# Amazon AWS S3 import
import S3

# Amazon AWS parameters
AWS_ACCESS_KEY_ID = '<YOUR_ACCESS_KEY>'
AWS_SECRET_ACCESS_KEY = '<YOUR_SECRET_KEY>'

BUCKET_NAME = ''.join([AWS_ACCESS_KEY_ID.lower(), '.cloud'])

Very few of the Python classes defined above have to change. We will re-implement the helper functions to post data to and fetch data from S3 rather than the datastore.

Let's start with the createPhoto function, which, as you may recall, was used above to post the image binary to the App Engine datastore. We'll reimplement it here to upload to S3 instead using the S3 library that we imported above.

def createPhoto(container, personId, fileItem):
  user = getUser(container, personId)
  
  name = ''.join([str(int(floor(time()))), fileItem.filename])
  key  = ''.join([container, personId, '_', name])

  headers = {
    'x-amz-acl':'public-read',
    'Content-Type': fileItem.type,
    'x-amz-meta-tags': ''
  }  
      
  conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
  
  conn.put(BUCKET_NAME, key, fileItem.file.read(), headers)

  return {'name': name}

getUser doesn't change since we want to continue to manage user information in App Engine's datastore. Notice that a new dictionary object is defined with the headers that we want to set (the last being the header that we'll eventually use to store tags with the image file). After that, it's just a matter of opening a connection to the S3 service by calling the library's AWSAuthConnection constructor and using it to "put" a new file into the service, effectively uploading it.

Next, we want to be able to get the photo information out of S3. We'll modify getPhotosForUser to do this:

def getPhotosForUser(container, personId):
  retArray = []

  conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
  
  listResponse = conn.list_bucket(BUCKET_NAME, {'prefix': ''.join([container, personId, '_'])})
  
  if listResponse.entries:
    for entry in listResponse.entries:
      match = re.search(r'^.*?_(.*)$', entry.key)
      if match:
        getResponse = conn.get(BUCKET_NAME, entry.key)
        if getResponse.http_response.status_code==200:
          photo = {
            'name': match.group(1),
            'tags': []
          }
          if getResponse.object.metadata.has_key('tags'):
            tags = getResponse.object.metadata['tags']
            photo['tags'] = tags.split('|')
        
          retArray.append(photo)
  
  return retArray

After opening up a connection to the service, this code fetches all objects in the bucket that belong to the user (because we're prefixing the container and ID to the file name before we upload, we're able to easily query the service for a given user's photos by asking it to return only those that match a given prefix, just as we have here). Once all of the photos are available, a request is issued for each in order to get the content-type and metadata (tags) associated with each individually. This information is put into a list and returned.
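
The key naming scheme is what makes the prefix query work, and it can be exercised without touching S3. In the sketch below the helper names and sample keys are invented, and keys_for_user is only a local stand-in for the prefix filter that S3 applies to a bucket listing.

```python
import re

def make_key(container, person_id, photo_name):
    """Build the object key: container + person ID, an underscore, then the photo name."""
    return ''.join([container, person_id, '_', photo_name])

def photo_name_from_key(key):
    """Recover the photo name: everything after the first underscore."""
    match = re.search(r'^.*?_(.*)$', key)
    return match.group(1) if match else None

def keys_for_user(all_keys, container, person_id):
    """Local stand-in for S3's prefix filter on a bucket listing."""
    prefix = ''.join([container, person_id, '_'])
    return [k for k in all_keys if k.startswith(prefix)]

keys = [
    make_key('orkut.com', '123', '1211pier.jpg'),
    make_key('orkut.com', '123', '1212sunset_beach.jpg'),
    make_key('orkut.com', '456', '1213cake.jpg'),
]
mine = keys_for_user(keys, 'orkut.com', '123')
names = [photo_name_from_key(k) for k in mine]
```

The scheme relies on the container name and person ID never containing an underscore. If either did, the name would be split at the wrong place, so a stricter separator may be worth considering.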

addTextTagToPhoto becomes a little larger since it has to retrieve the photo from the service, set the appropriate header, and then "put" the object back. The code for this is printed below.

def addTextTagToPhoto(photo, textTag):
  headers = {
    'Content-Type': photo['contentType']
  }
  
  if not photo['tags'] == '':
    headers['x-amz-meta-tags'] = ''.join([photo['tags'], '|', textTag])
  else:
    headers['x-amz-meta-tags'] = textTag    
  
  conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
  
  conn.put(BUCKET_NAME, photo['key'], photo['content'], headers)
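
The tag list travels in a single metadata header as a '|'-separated string. The encoding and decoding can be sketched on their own as follows; the function names are invented, and rejecting tags that contain the separator is an extra safeguard not present in the tutorial code.

```python
def decode_tags(header_value):
    """Split the stored 'a|b|c' string into a list; an empty value means no tags."""
    return header_value.split('|') if header_value else []

def append_tag(header_value, new_tag):
    """Return the new header value after adding one tag."""
    if '|' in new_tag:
        raise ValueError("tag must not contain the '|' separator")
    return '|'.join(decode_tags(header_value) + [new_tag])

value = append_tag('', 'Aruba')
value = append_tag(value, 'snorkeling')
```

Handling the empty string explicitly matters because splitting an empty string in Python yields a list with one empty tag rather than an empty list.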

The only other function that needs to be changed substantially is getPhoto which fetches the data from S3 and returns the content to the browser (after specifying the appropriate content-type, of course).

def getPhoto(container, personId, photoName):
  key = ''.join([container, personId, '_', photoName])
  
  conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
  response = conn.get(BUCKET_NAME, key)

  photo = None
  if response.http_response.status_code==200:
    if response.http_response.headers.has_key('content-type'):
      photo = {
        'contentType': response.http_response.headers['content-type'],
        'content': response.http_response.content,
        'name': photoName,
        'key': key
      }
      
      if response.object.metadata.has_key('tags'):
        photo['tags'] = response.object.metadata['tags']
      else:
        photo['tags'] = ''
    
  return photo

Once you replace the functions above with these and make a few additional changes (namely, changing all object references such as photo.name to dictionary references such as photo['name']), you've completed the transition from a Google App Engine storage back end to S3.

PHP

In PHP, it's easy to find and use a third-party open source library such as s3.class.php.

To add photos to your S3 account from a POST submission:

  $file = $_FILES['file'];
  $container = $_POST['container'];
  $containerId = $_POST['containerId'];

  /* FILE UPLOAD */

  $tmpFile = tempnam('/usr/local/www/PhotoBoard/tmp', '');
  move_uploaded_file($file['tmp_name'], $tmpFile);
  chmod($tmpFile, 0777);

  // Strip the directory from the temp path, wherever tempnam() created the file.
  $tmpName = basename($tmpFile).$file['name'];
  $processedName = $container.':'.$containerId.':'.$tmpName;

  /* S3 */
  include 'S3/s3.class.php';

  $handle = fopen($tmpFile, 'rb');
  $fileContents = fread($handle, filesize($tmpFile));

  $srvc = new S3();
  $bucket = $srvc->setBucketName($srvc->keyId.'.PhotoBoardRepo');

  $metaInfo['photoid'] = "$photoId";

  $srvc->putObject($processedName, $fileContents, $srvc->getBucketName(), 'public-read', $file['type'], $metaInfo);

  fclose($handle);

Optimizations

A common misconception when coding in the cloud is that storage space, CPU cycles, and bandwidth are unlimited. While the cloud hosting provider can, in fact, provide all the resources your app needs, hosting in the cloud ain't free, so these resources are limited by your budget. Luckily, OpenSocial provides several mechanisms to cache images and data that will reduce the load on your server.

Using getProxyUrl

Consider the amount of traffic required to render the "Friends' Photos" tab. Assuming the average user has 10 friends with the app, and each friend has uploaded 20 photos (at 500KB each), rendering this page will request 100MB from your server. If the app gets popular and this tab gets 10,000 views a day, you're looking at 1TB of traffic per day, just for this one tab!
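
The estimate is easy to reproduce and to rerun with your own numbers. The short calculation below uses the figures from the paragraph above and assumes decimal units (1MB = 1,000KB), which is what makes the totals come out as round numbers. The variable names are illustrative.

```python
KB = 1000
MB = 1000 * KB
TB = 1000 * 1000 * MB

friends_with_app = 10
photos_per_friend = 20
photo_size = 500 * KB
views_per_day = 10000

# One render of the tab downloads every photo from every friend.
bytes_per_view = friends_with_app * photos_per_friend * photo_size
bytes_per_day = bytes_per_view * views_per_day

print(bytes_per_view // MB)  # 100 (MB per view)
print(bytes_per_day // TB)   # 1 (TB per day)
```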

The gadgets infrastructure is designed to aggressively cache data to reduce the load on your server, but you have to tell it which images you want to cache. This is as simple as using the gadgets.io.getProxyUrl method to get the URL of the cached image and using that URL in the HTML of your app.

function showImage() {
  var imgUrl = 'http://www.example.com/i_heart_apis_sm.png';
  var cachedUrl = gadgets.io.getProxyUrl(imgUrl);
  var html = ['<img src="', cachedUrl, '">'];
  document.getElementById('dom_handle').innerHTML = html.join('');
}

showImage();

This will greatly reduce the bandwidth you use to serve images because the majority of the requests will be going to the cached URLs. To get the most benefit from caching, be sure to set the cache control headers appropriately for your content. For more information on caching, see the OpenSocial Latency Combat Field Manual.
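
As a rough illustration of what setting the cache control headers can look like on the server side, the sketch below builds standard HTTP Cache-Control and Expires headers for an image response. The helper name cache_headers and the one-day lifetime are assumptions chosen for the example. How long a given container's proxy actually keeps an image is up to that container.

```python
import time
from email.utils import formatdate

def cache_headers(max_age_seconds, now=None):
    """Build response headers that allow a proxy to cache for max_age_seconds."""
    if now is None:
        now = time.time()
    return {
        'Cache-Control': 'public, max-age=%d' % max_age_seconds,
        # Expires is an absolute HTTP date, expressed in GMT.
        'Expires': formatdate(now + max_age_seconds, usegmt=True),
    }

# Hypothetical policy: let caches keep a photo for one day.
headers = cache_headers(24 * 60 * 60)
for name in sorted(headers):
    print('%s: %s' % (name, headers[name]))
```

In the photo-serving handler, these headers would be sent along with the Content-Type that getPhoto returns. Uploaded photos rarely change, so a long lifetime is usually a reasonable starting point.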

Caching data to render the profile page

Profile pages make up the lion's share of application renders on OpenSocial container sites. If Photo Pier begins to get popular, say 100,000 users each with about 10 profile views a day, the app will be sending a million requests per day (over 11 requests per second) just to get the URLs of photos to display on the profile.
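
The request rate quoted above follows from a quick calculation, shown here so you can plug in your own estimates. The variable names are illustrative, and the result is a daily average, so peak traffic will be higher.

```python
users = 100000
profile_views_per_user_per_day = 10
seconds_per_day = 24 * 60 * 60

requests_per_day = users * profile_views_per_user_per_day
requests_per_second = requests_per_day / float(seconds_per_day)

print(requests_per_day)               # 1000000
print(round(requests_per_second, 1))  # 11.6
```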

One technique for reducing traffic to your server is to use OpenSocial's Persistence API to store the data you need to render the profile view. Then your app doesn't need to contact your server at all to render the profile view.

In the case of Photo Pier, we're requesting a list of image URLs to include in the profile slideshow. Rather than storing this data in our database, we can store it in the container using the Persistence API. When a user selects the photos to show in their slideshow, we save the list as app data:

updateFavoritesData: function(value, photoUrl) {
  var req = opensocial.newDataRequest();
           
  if (value == true) {
    this.profilePhotoSet.push(photoUrl);
  } else {
    var index = this.profilePhotoSet.indexOf(photoUrl);
    if (index != -1) {
      this.profilePhotoSet.splice(index, 1);
    }
  }
  req.add(req.newUpdatePersonAppDataRequest(opensocial.DataRequest.PersonId.VIEWER, 
                                            'favoritePhotos', 
                                            gadgets.json.stringify(this.profilePhotoSet)));
  req.send();
}

Then when we render the profile view, we just request this piece of data from the container (not our server).

  fetchOpenSocialData: function(callback) {
    var req = opensocial.newDataRequest();
           
    req.add(req.newFetchPersonRequest(opensocial.DataRequest.PersonId.OWNER), 'owner');
    req.add(req.newFetchPersonAppDataRequest(opensocial.DataRequest.PersonId.OWNER, 'favoritePhotos'), 'profilePhotoUrls');
           
    req.send(bind(this.closeFetchOpenSocialData(callback), this));   
  },
  closeFetchOpenSocialData: function(callback) {
    return function(data) {           
      this.owner = this.getDataOr(data, 'owner', this.owner);
             
      var ownerData = data.get('profilePhotoUrls').getData()[this.owner.getId()];
      if (ownerData) {
        this.profilePhotoSet = gadgets.json.parse(gadgets.util.unescapeString(ownerData['favoritePhotos']));
      }
             
      if (callback && typeof(callback)=='function') {
        callback();
      }
    };
  }

In addition to reducing traffic to our server, this technique has the added benefit of being fast. Requesting data from the Persistence API is much faster than making the round trip to your server.

Resources

As you start coding your app in the cloud, you'll no doubt have some questions. Here are some resources to get you started.