Thursday, October 26, 2017

Don't Await when using AsyncFlatSpec in ScalaTest!


There is a subtle behavior to be aware of if you use scala.concurrent.Await in an AsyncFlatSpec. The Scaladoc of Future.apply shows that it requires an implicit ExecutionContext:

def apply[T](body: =>T)(implicit execctx: ExecutionContext)

The problem is that ScalaTest provides one for you, and it is single-threaded: it runs futures on the same thread as the test. Await blocks that thread, so the Future never gets a chance to run.

For example, the following never gets past Await:

val f = Future(1)
Await.result(f, Duration.Inf)

This can be solved by passing a different ExecutionContext explicitly:

val f = Future(1)(aDifferentExecutionContext)
Await.result(f, Duration.Inf)
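
Both behaviors can be reproduced without ScalaTest. The sketch below is only an approximation: QueueingContext is a hypothetical stand-in I made up to mimic a context that runs tasks on the calling thread, not ScalaTest's own class, and a finite timeout replaces Duration.Inf so the demo terminates instead of hanging.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical stand-in for a serial context (not ScalaTest's own class):
// tasks are queued and only run when the owning thread calls drain().
final class QueueingContext extends ExecutionContext {
  private val queue = new ConcurrentLinkedQueue[Runnable]()
  override def execute(runnable: Runnable): Unit = queue.add(runnable)
  override def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  def drain(): Unit = {
    var next = queue.poll()
    while (next != null) {
      next.run()
      next = queue.poll()
    }
  }
}

object AwaitDemo {
  // Await on a Future whose only executor is the (blocked) calling thread.
  def awaitOnSameThread(): Try[Int] = {
    val ec = new QueueingContext
    val f = Future(1)(ec)
    // A finite timeout stands in for Duration.Inf so the demo terminates.
    Try(Await.result(f, 200.millis))
  }

  // Await on a Future that runs on a separate thread pool.
  def awaitOnOtherContext(): Try[Int] = {
    val f = Future(1)(ExecutionContext.global)
    Try(Await.result(f, 5.seconds))
  }

  def main(args: Array[String]): Unit = {
    // The queued task never ran, so the first Await times out.
    println(awaitOnSameThread().isFailure)
    // The global pool runs the task on another thread, so this Await completes.
    println(awaitOnOtherContext())
  }
}
```

In the first case the task sits in the queue while the only thread that could run it is stuck in Await. With Duration.Inf that wait would never end.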

But the better approach is not to use Await at all in an AsyncFlatSpec. An AsyncFlatSpec test can return a Future[Assertion], and ScalaTest will resolve the Future for you.
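
A minimal sketch of what that looks like, assuming ScalaTest 3.0.x is on the test classpath (in 3.1 and later the class lives in org.scalatest.flatspec). The spec name and the computation are made up for illustration.

```scala
import org.scalatest.AsyncFlatSpec // ScalaTest 3.0.x package; 3.1+ uses org.scalatest.flatspec.AsyncFlatSpec
import scala.concurrent.Future

// Hypothetical spec: the class name and the sum are illustrative only.
class AdditionSpec extends AsyncFlatSpec {

  "A future computation" should "eventually produce 2" in {
    // Future.apply picks up the implicit executionContext supplied by AsyncFlatSpec.
    val f: Future[Int] = Future(1 + 1)

    // Map the result to an Assertion and return the Future; no Await needed.
    f.map(sum => assert(sum == 2))
  }
}
```

The test body ends with a Future[Assertion], so the test thread is never blocked and ScalaTest completes the test when the Future does.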

Sunday, October 1, 2017

Google Drive API Quickstart in Scala

I'm working on a Scala app that needs to upload files to Google Drive. While Google as a company hasn't exactly embraced Scala, it does use Java extensively, and it's generally easy to call Java from Scala. So, in this post I sketch out a simple Scala wrapper around the Java Google Drive client library. I did find a Scala Google Drive library on GitHub, but it was not in Maven Central, so I was not able to use it directly; I did, however, borrow some of its concepts. Also, the Gradle quickstart commands failed on my machine. Fortunately Gradle is not required, and we can just as easily use Maven or, in the case of Scala, sbt.

To get started, first create an API project at https://console.cloud.google.com/apis/dashboard

Then click "Enable APIs" and enable the Drive API. To be clear, the Google account you are using here is your developer account, which may not be the same Google account that owns the Google Drive to be accessed.

https://console.cloud.google.com/apis/api/drive.googleapis.com/overview

Then go to Credentials -> Create credentials and select "OAuth client ID".

I selected "Other" since I'm building a daemon app.

Download the file it generates: client_secret_<somenumber>.json

Create a new Scala/sbt project in IntelliJ. Edit build.sbt:

name := "name"
version := "1.0"
scalaVersion := "2.11.8"

libraryDependencies += "ch.qos.logback"    %  "logback-classic" % "1.1.3"
libraryDependencies += "com.google.api-client" % "google-api-client" % "1.22.0"
libraryDependencies += "com.google.oauth-client" % "google-oauth-client-jetty" % "1.22.0"
libraryDependencies += "com.google.apis" % "google-api-services-drive" % "v3-rev83-1.22.0"

Here's my Scala client, adapted from the Java Quickstart. At some point I'll get this on GitHub, once I clean it up.

import java.io.{File, InputStreamReader}
import java.util

import com.google.api.client.auth.oauth2.Credential
import com.google.api.client.extensions.java6.auth.oauth2.AuthorizationCodeInstalledApp
import com.google.api.client.extensions.jetty.auth.oauth2.LocalServerReceiver
import com.google.api.client.googleapis.auth.oauth2.{GoogleAuthorizationCodeFlow, GoogleClientSecrets}
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport
import com.google.api.client.http.FileContent
import com.google.api.client.json.jackson2.JacksonFactory
import com.google.api.client.util.store.FileDataStoreFactory
import com.google.api.services.drive.model.{File => DriveFile}
import com.google.api.services.drive.{Drive, DriveScopes}
import org.slf4j.LoggerFactory

import scala.collection.JavaConverters._

object GoogleDriveClient {

  def main(args: Array[String]): Unit = {
    // first time setup
    GoogleDriveClient("my drive app", "drive-oauth-credentials", authorizationFlow = true)
  }

  def apply(appName: String, credentialsPath: String, authorizationFlow: Boolean = false) = {
    val credentialsFile = new File(credentialsPath)

    if (!authorizationFlow) {
      // on server we obviously won't be able to do the oauth auth with browser -- app needs to be deployed with creds
      if (!credentialsFile.exists) {
        throw new RuntimeException(s"Credentials does not exist ${credentialsFile.getAbsolutePath}")
      }
    }

    new GoogleDriveClient(appName, credentialsFile)
  }
}

class GoogleDriveClient(appName: String, credentialsFile: File) {

  val log = LoggerFactory.getLogger(GoogleDriveClient.getClass)

  /** Global instance of the JSON factory. */
  private val JSON_FACTORY = JacksonFactory.getDefaultInstance
  /** Global instance of the HTTP transport. */
  val HTTP_TRANSPORT = GoogleNetHttpTransport.newTrustedTransport()
  val DATA_STORE_FACTORY = new FileDataStoreFactory(credentialsFile)

  private val SCOPES = util.Arrays.asList(DriveScopes.DRIVE)

  private[this] def authorize: Credential = {
    val in = GoogleDriveClient.getClass.getResourceAsStream("/client_secret.json")
    val clientSecrets = GoogleClientSecrets.load(JSON_FACTORY, new InputStreamReader(in))

    // Build flow and trigger user authorization request.
    // Must set access type to offline or you won't get a refresh token, and the token will only work for a few hours or so.
    val flow = new GoogleAuthorizationCodeFlow.Builder(
      HTTP_TRANSPORT,
      JSON_FACTORY,
      clientSecrets,
      SCOPES)
      .setDataStoreFactory(DATA_STORE_FACTORY)
      .setAccessType("offline")
      .build
    val credential: Credential = new AuthorizationCodeInstalledApp(flow, new LocalServerReceiver()).authorize("user")
    log.info("Credentials saved to " + credentialsFile.getAbsolutePath)
    credential
  }

  val drive = {
    val credential = authorize

    new Drive.Builder(
      HTTP_TRANSPORT, JSON_FACTORY, credential)
      .setApplicationName(appName)
      .build()
  }

  val root = drive.files.get("root").execute

  def pregenerateIds(numOfIds: Int): Seq[String] = {
    drive.files().generateIds().setSpace("drive").setCount(numOfIds).execute().getIds.asScala.toSeq
  }

  def createFolder(name: String, parent: Option[String]): DriveFile = {
    val fileMetadata = new DriveFile()
    fileMetadata.setName(name)

    parent.foreach(p => fileMetadata.setParents(Seq(p).asJava))

    fileMetadata.setMimeType("application/vnd.google-apps.folder")

    drive.files().create(fileMetadata)
      .setFields("id")
      .execute()
  }

  def uploadFile(localFile: File, parentDriveOption: Option[DriveFile], idOption: Option[String], checkIfExists: Boolean = false): DriveFile = {
    val fileType = "image/jpeg"
    def upload(): DriveFile = {
      val file = new DriveFile()
      file.setName(localFile.getName)

      parentDriveOption.foreach(p => file.setParents(List(p.getId).asJava))
      idOption.foreach(id => file.setId(id))

      val mediaContent = new FileContent(fileType, localFile)

      val result: DriveFile = drive.files.create(file, mediaContent).setFields("id").execute()

      log.debug("upload result is " + result.getId)

      result
    }

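    if (checkIfExists) {
      // a file created without a parent lands in the root folder, so look there when no parent is given
      val parent = parentDriveOption.getOrElse(root)

      getFileByParentAndName(parent, localFile.getName) match {
        case Some(existing) =>
          log.info("File already exists in google drive")
          existing
        case None =>
          upload()
      }
    } else {
      upload()
    }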
  }

  def getFilesByParent(parent: DriveFile): List[DriveFile] = {
    val request: Drive#Files#List = drive.files.list.setQ("'%s' in parents and trashed = false".format(parent.getId))
    request.execute.getFiles.asScala.toList
  }

  def getFileByParentAndName(parent: DriveFile, name: String): Option[DriveFile] = {
    val request = drive.files.list.setQ("'%s' in parents and name = '%s' and trashed = false".format(parent.getId, name))
    request.execute.getFiles.asScala.toList.headOption
  }

  // unwind the path, and find each part by parent, starting with the root folder
  // e.g. /raspi/camera1/files will return the [files], if it exists
  def findFileByPath(path: String): Option[DriveFile] = {
    def findFileByPath(parts: List[String], parent: DriveFile): Option[DriveFile] = {
      parts match {
        case Nil =>
          Some(parent)
        case head :: tail =>
          getFileByParentAndName(parent, head).flatMap { child =>
            findFileByPath(parts = tail, parent = child)
          }
      }
    }

    // drop empty segments so a leading or trailing "/" does not trigger a lookup for an empty name
    findFileByPath(path.split("/").filter(_.nonEmpty).toList, root)
  }
}

Create a src/main/resources folder and drop in your client_secret.json, downloaded from the API console. Make sure it is named client_secret.json, as that is what the app expects:

val in = GoogleDriveClient.getClass.getResourceAsStream("/client_secret.json")

Run the class and it will open a browser and send you to Google to select an account. This account does not need to be the same account that was used to generate the API key. This is the account whose Google Drive will be accessed. Once this completes, it will create a directory containing a single file, "StoredCredential". From now on, as long as it finds this file, you will not be prompted for authorization. Notice I asked for the scope

private val SCOPES = util.Arrays.asList(DriveScopes.DRIVE)

which is basically sudo. You may want to ask for a more restrictive scope, e.g. read-only. This should be an arg to the class. Secondly, I asked for offline access:

setAccessType("offline")

If I did not, this app would only work for a few hours, or however long the access token lasts before it expires. With offline access, the API can use the refresh token to ask for a new access token.

Now we're in business and can make API calls. First, it's important to note that Google Drive is not exactly like your filesystem. A folder may have multiple files of the same name, and a file can have multiple parents (folders). For example, if you want to create a file only if it does not already exist, you'll need to check first; otherwise you'll end up with duplicates. I provided getFileByParentAndName to check for an existing file. You can also provide a file ID, which is a good idea because you could end up with duplicates if you use retry logic. The upload implementation I provided takes a parameter to check if the file exists first.

Simple operations on a filesystem, such as listing the contents of a folder, aren't so simple with Drive. To do this you perform a query such as '%s' in parents and name = '%s' and trashed = false, providing the parent, which in this case is the Google Drive folder. Folders are just files, but they have a special MIME type: "application/vnd.google-apps.folder". The API reference https://developers.google.com/drive/v3/reference/files/list mentions the query parameter but does not describe the query language. You have to skip to https://developers.google.com/drive/v3/web/search-parameters to learn how to construct a query.
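
One thing the format strings in the client above don't handle is a file name containing a single quote, which would break the query. As I understand the search documentation, quotes and backslashes inside a value are escaped with a backslash. Here is a minimal sketch of building the query strings that way; DriveQuery and its method names are hypothetical helpers, not part of the Google client library.

```scala
// Hypothetical helper for building Drive search queries (not part of the Google client library).
object DriveQuery {

  private val FolderMimeType = "application/vnd.google-apps.folder"

  // Escape backslashes first, then single quotes, so the escapes are not escaped again.
  def escape(value: String): String =
    value.replace("\\", "\\\\").replace("'", "\\'")

  // All non-trashed children of a folder.
  def childrenOf(parentId: String): String =
    s"'${escape(parentId)}' in parents and trashed = false"

  // A non-trashed child of a folder with an exact name.
  def childNamed(parentId: String, name: String): String =
    s"'${escape(parentId)}' in parents and name = '${escape(name)}' and trashed = false"

  // Only the sub-folders of a folder.
  def foldersIn(parentId: String): String =
    s"'${escape(parentId)}' in parents and mimeType = '$FolderMimeType' and trashed = false"
}

object DriveQueryDemo {
  def main(args: Array[String]): Unit = {
    println(DriveQuery.childrenOf("root"))
    println(DriveQuery.childNamed("root", "it's a file.jpg"))
    println(DriveQuery.foldersIn("root"))
  }
}
```

The resulting string can be passed to setQ in place of the format calls, e.g. drive.files.list.setQ(DriveQuery.childNamed(parent.getId, name)).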

The v3 reference provides some examples in different languages, but only for some operations: https://developers.google.com/drive/v3/web/manage-uploads

It was somewhat frustrating at first to perform some basic operations because many of the examples used version 2 of the API and I had selected version 3, which hopefully will have some more runway. Some of the API parameters and usage had changed between the two.

Finally, I should note that I'm really interested in server-to-server auth. I'm only using OAuth because that is what the available documentation led me to. In this case I'm requesting OAuth access to my own account and only my account, which is not exactly the intention of OAuth, but it seems it should work as long as I don't need to reauthorize.