GeRes2 - a .NET-based Job-Execution Framework for Microsoft Azure Platform Service

This repository contains the source code for the GeRes2 framework. See the Documentation tab for more details on how to use this framework.

Overview

GeRes2 is a framework for asynchronous, reliable and scalable execution of short- and long-running jobs across multiple Azure Platform as a Service compute instances.

Description

We are all familiar with the problem…how can I execute multiple jobs at scale, asynchronously in the Cloud? Most likely we come up with an architecture like this:

  • A front end to take requests
  • A storage queue to persist requests and manage resiliency
  • A back end to process the requests asynchronously

 

[Image: geresrough]

 

But it is not always that straightforward. Further considerations may include:

  • Job API (front end) needs to be simple and intuitive (Web API and .NET Client SDK)
  • Job Queues need to scale up and down
  • Jobs may need to be prioritized
  • Jobs need to fail safely
  • Job Notifications (push and pull)
  • Job Status needs to be monitored
  • Job Processors (back end) need to scale up and down based on custom rules
  • Job Orchestration (workflow and hierarchy)
  • Preconditions (infrastructure setup) 
  • Custom Jobs (including discovery at runtime not just at design time)
  • Multiple Jobs can be processed in parallel on a single Job Processor machine *
  • Multi-tenancy with job isolation *
  • On-premises deployment *

GeRes2 considers all of these points and provides a framework that can be tailored to meet the requirements of your business needs and scenarios.

* At the time of publishing this feature was not included.

How it Works

Jobs:

  • The implementations of your jobs (i.e. the code) are stored as zip files in Azure Blob Storage

Client:

  • Request…the client submits requests to action a job (or batch of jobs) to be processed. All requests are made via a Job Web API hosted on the Job Hub (Azure Web Role)
  • [optional] Notifications…the client registers their interest to receive notifications on the progress of their job (SignalR API hosted on the Job Hub)
  • [optional] Cancel…the client submits a request that the execution of a job (or batch of jobs) be cancelled, using the Job API hosted on the Job Hub.
  • [optional] Monitoring…the client submits requests to the Job Monitoring API to GET the status of submitted jobs. All requests are made via a Job Web API hosted on the Job Hub.
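
To make the four client interactions concrete, here is a minimal in-memory sketch written in Python. It is not the GeRes2 Web API or the .NET Client SDK: the class InMemoryJobHub, its method names, and the 'Submitted' and 'Cancelled' states are all assumptions made for illustration.

```python
import uuid
from typing import Callable, Dict, List

StatusCallback = Callable[[str, str], None]  # receives (job_id, new_status)


class InMemoryJobHub:
    """Hypothetical stand-in for the Job Hub; every name here is an assumption."""

    def __init__(self) -> None:
        self._job_types: Dict[str, str] = {}
        self._status: Dict[str, str] = {}
        self._subscribers: Dict[str, List[StatusCallback]] = {}

    def submit(self, job_type: str) -> str:
        """Request: accept a job and return the id the client uses afterwards."""
        job_id = str(uuid.uuid4())
        self._job_types[job_id] = job_type
        self._set_status(job_id, "Submitted")  # assumed initial state
        return job_id

    def subscribe(self, job_id: str, callback: StatusCallback) -> None:
        """Notifications (push): register interest in status changes of one job."""
        self._subscribers.setdefault(job_id, []).append(callback)

    def cancel(self, job_id: str) -> None:
        """Cancel: ask for the job to be stopped."""
        self._set_status(job_id, "Cancelled")  # assumed state name

    def get_status(self, job_id: str) -> str:
        """Monitoring (pull): read the last recorded status of a job."""
        return self._status[job_id]

    def _set_status(self, job_id: str, status: str) -> None:
        # Record the status, then push it to every subscriber of this job.
        self._status[job_id] = status
        for callback in self._subscribers.get(job_id, []):
            callback(job_id, status)
```

A client would call submit first, keep the returned id, and then either subscribe for pushed updates or poll get_status. In the real service these calls go over HTTP and SignalR to the Job Hub rather than to a local object.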

Job Execution:

  • Successful job requests are placed in a queue (Azure Storage Queue) and recorded for status monitoring (Azure Table Storage)
  • Processed…the queue(s) of jobs are read in order by priority and processed by the Job Host (Azure Worker Role)
    1. The job implementation is downloaded from Blob Storage (see Setup)
    2. The status of the job is updated to ‘Started’ (Table Storage)
    3. A Notification that the job has started is published to the Status Update Topic (Azure Service Bus)
    4. The job is processed in isolation within the Job Host
    5. The status of the job is updated to ‘In Progress’ (Table Storage)
    6. [optional] Notification of the job’s progress is published to the Status Update Topic (Azure Service Bus)
    7. On completion (success or failure) a notification that the job has completed is published to the Status Update Topic (Azure Service Bus)
    8. The status of the job is updated to ‘Complete’ (Table Storage)
    9. The executing code is removed from the machine
    10. The job is removed from the queue
  • [optional] Cancellation requests are sent as commands to the Job Hosts as messages published to the Cancellation Topic (Azure Service Bus)
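
The processing steps above can be sketched as a small loop. This is an in-memory illustration in Python, not the GeRes2 Job Host: a heap stands in for the storage queues, a dictionary for Table Storage, and a list for the Status Update Topic. The names JobHostSketch and QueuedJob, and the rule that a lower priority number is read first, are assumptions. Downloading and removing the job code (steps 1 and 9) are left out.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(order=True)
class QueuedJob:
    priority: int                                   # assumption: lower number is read first
    seq: int                                        # tie-breaker: keeps submission order
    job_id: str = field(compare=False)
    run: Callable[[], None] = field(compare=False)  # stands in for the downloaded job code


class JobHostSketch:
    """Hypothetical model of one Job Host working through a prioritized queue."""

    def __init__(self) -> None:
        self._queue: List[QueuedJob] = []
        self._seq = 0
        self.status: Dict[str, str] = {}                # stands in for Table Storage
        self.notifications: List[Tuple[str, str]] = []  # stands in for the Status Update Topic

    def enqueue(self, job_id: str, priority: int, run: Callable[[], None]) -> None:
        heapq.heappush(self._queue, QueuedJob(priority, self._seq, job_id, run))
        self._seq += 1

    def process_next(self) -> bool:
        """Process the highest-priority job; return False when the queue is empty."""
        if not self._queue:
            return False
        job = self._queue[0]                            # peek: the job stays queued while it runs
        self.status[job.job_id] = "Started"             # step 2
        self.notifications.append((job.job_id, "started"))  # step 3
        # Step 5 is recorded before step 4 here because this sketch runs the job synchronously.
        self.status[job.job_id] = "In Progress"
        try:
            job.run()                                   # step 4
            outcome = "succeeded"
        except Exception:
            outcome = "failed"                          # a failing job must not take the host down
        self.notifications.append((job.job_id, "completed:" + outcome))  # step 7
        self.status[job.job_id] = "Complete"            # step 8
        self._queue.remove(job)                         # step 10: dequeue only after completion
        heapq.heapify(self._queue)
        return True
```

Removing the job from the queue only after it has finished mirrors the ordering of steps 8 to 10: if a host dies mid-job, the request is still in the queue and can be picked up again.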

Job Host Auto-scaling:

  • The AutoScaler (Azure Worker Role) manages the lifecycle of the Job Hosts through commands to and from Job Hosts based on the rules outlined in custom policies and the current topology of the Job Hosts (stored in Azure Table Storage). Commands are messages published to the Job Host Action Topic and Job Host Status Topic (Azure Service Bus).
  • Custom policies allow the owner of the service to determine the rules of how Job Hosts can be scaled (up and down). For example, as the number of jobs in the queue increases, the rule may be to increase the number of Job Hosts.
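
As an illustration of the kind of rule a custom policy could express, the following Python sketch derives a scaling command from the queue length. It is not one of the sample policies shipped with GeRes2: the class QueueLengthPolicy, the function scaling_command, the command names, and the default thresholds are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QueueLengthPolicy:
    """Hypothetical rule: one Job Host per `jobs_per_host` queued jobs, within bounds."""
    jobs_per_host: int = 10
    min_hosts: int = 1
    max_hosts: int = 20

    def desired_hosts(self, queued_jobs: int) -> int:
        needed = math.ceil(queued_jobs / self.jobs_per_host)
        # Clamp so the service never scales below the floor or above the ceiling.
        return max(self.min_hosts, min(self.max_hosts, needed))


def scaling_command(policy: QueueLengthPolicy, queued_jobs: int, current_hosts: int) -> Tuple[str, int]:
    """Compare the policy's target with the current topology and pick a command."""
    target = policy.desired_hosts(queued_jobs)
    if target > current_hosts:
        return ("scale_up", target - current_hosts)
    if target < current_hosts:
        return ("scale_down", current_hosts - target)
    return ("none", 0)
```

With the default values, 35 queued jobs and 2 running hosts yield ("scale_up", 2). In the real system the AutoScaler would publish the resulting command to the Job Host Action Topic instead of returning it.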

Putting this all together, the architecture of GeRes2 with all of its core components and their relationship looks as follows:

http://download-codeplex.sec.s-msft.com/Download?ProjectName=geres2&DownloadId=833459

Source Code

The source code consists of 5 parts:

  1. Cloud deployment package (includes a Job Submission and Monitoring Web API)
  2. .NET Client SDK
  3. Sample Clients
  4. Sample Jobs
  5. Sample Auto-scaling policies

Once you have cloned this repository, please refer to the Documentation section, which has steps on how to deploy locally (VS Azure Emulator) and to your own Cloud Service subscription.

Contribute

You can contribute by reviewing and sending feedback, suggesting and trying out new features as they are implemented, submitting bugs and helping us verify fixes as they are checked in.

Roadmap

The source code in this repo is under active development and will be part of our next release. Planned features and future direction will be available soon.

Last edited Jun 2, 2014 at 9:29 PM by mszcool, version 26