Parallec

Tweeted by the Creator of Akka & Featured in [ This Week in #Scala | OSChina - 2015 Top 100 ]

Parallec is a scalable yet agile polling and aggregation engine: a fast, parallel, asynchronous HTTP(S)/SSH/TCP/UDP/Ping client Java library based on Akka. Aggregate and handle API responses at scale any way you like, and send them anywhere, by writing 20 lines of code. A convenient response context lets you pass any object in or out when handling responses. You can conduct scalable API calls, then pass the aggregated data to Elasticsearch, Kafka, MongoDB, Graphite, Memcached, and so on. Task-level concurrency control is flexible and does not require creating a 1,000-thread pool. Parallec means Parallel Client (pronounced "Para-like"). Visit www.parallec.io

Watch Demo: 8,000 web server HTTP response aggregation to memory in 12 seconds / to ElasticSearch in 16 seconds.

Aggregated error messages - Debug friendly with full visibility: Having trouble debugging in a concurrent environment? Not any more! All exceptions, timeouts, stack traces, and request-sent and response-received times are captured and aggregated in the response map. The map is available in ParallelTask for polling right after you execute a task asynchronously. Multi-level (worker/manager) timeouts guarantee that tasks return even for 100,000s of requests.
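
To make the idea of an aggregated response map concrete, here is a minimal standard-library sketch. It is not Parallec's implementation or API: the class `AggregationSketch`, the `Outcome` type, and the `pollAll` method are hypothetical names, and `CompletableFuture` stands in for the Akka-based engine. It shows one result entry per target, with failures and timeouts recorded next to successes instead of being thrown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Illustrative sketch only; none of these names come from Parallec. */
final class AggregationSketch {

    /** Outcome for one target: a response body or an error, plus timing. */
    static final class Outcome {
        final String body;        // null when the call failed
        final String error;       // null when the call succeeded
        final long elapsedMillis; // time from submission to completion

        Outcome(String body, String error, long elapsedMillis) {
            this.body = body;
            this.error = error;
            this.elapsedMillis = elapsedMillis;
        }

        boolean failed() {
            return error != null;
        }
    }

    /** Calls every target asynchronously and returns one Outcome per target. */
    static Map<String, Outcome> pollAll(List<String> targets,
                                        Function<String, String> call,
                                        long timeoutMillis) {
        Map<String, Outcome> results = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (String target : targets) {
            long start = System.nanoTime();
            CompletableFuture<Void> done = CompletableFuture
                    .supplyAsync(() -> call.apply(target))
                    // Per-request timeout, so one slow target cannot stall the batch.
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                    // Record success or failure instead of letting the exception escape.
                    .handle((body, ex) -> {
                        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                        results.put(target, ex == null
                                ? new Outcome(body, null, elapsed)
                                : new Outcome(null, describe(ex), elapsed));
                        return null;
                    });
            pending.add(done);
        }
        // Every stage above completes normally, so join() does not throw here.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return results;
    }

    /** Unwraps the async wrapper exception and renders "Type: message". */
    private static String describe(Throwable ex) {
        Throwable cause = (ex instanceof CompletionException && ex.getCause() != null)
                ? ex.getCause() : ex;
        return cause.getClass().getSimpleName() + ": " + cause.getMessage();
    }
}
```

With this shape, a caller can iterate over the returned map once and see which targets failed, why, and how long each took, which is the kind of visibility described above.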

Production Use Cases: widely used in many distributed systems as the polling and aggregation engine

  1. Application Deployment / PaaS: Parallec has been integrated into eBay's main production application deployment system (PaaS). Parallec orchestrates 10+ API tasks, each targeting 10s to 1,000s of servers across 1,000+ application pools in production.
  2. Data Extraction / ETL: Parallec has been used by eBay Israel's web intelligence team to execute 10k-100k parallel API calls to a single third-party server, with dramatically improved performance and reduced resource usage.
  3. Network Troubleshooting via Probing: In eBay's network / cloud team, Parallec is instrumental in keeping false alert rates extremely low so that switch soft failures are detected accurately. Parallec serves as the core polling engine in the master component, checking agent health and marking down agents to eliminate noise effectively and promptly.
  4. Agent Management / Master: In eBay's site operation / tools team, Parallec serves as the core engine that manages and monitors an agent (similar to a Puppet agent, Salt minion, or Kubernetes kubelet) on 100,000+ production servers to ensure scalable operations.

Workflow Overview

Get Started

Download the latest JAR or grab it from Maven:

<dependency>
    <groupId>io.parallec</groupId>
    <artifactId>parallec-core</artifactId>
    <version>0.10.3</version>
</dependency>

Or with Gradle:

compile 'io.parallec:parallec-core:0.10.3'

Snapshots of the development version are available in Sonatype's snapshots repository.

6 Line Example

In the example below, simply changing prepareHttpGet() to prepareSsh(), prepareTcp(), prepareUdp(), or preparePing() lets you conduct parallel SSH/TCP/UDP/Ping. For details, please refer to the Javadoc and the example code.

import io.parallec.core.*;
import java.util.Map;

ParallelClient pc = new ParallelClient(); 
pc.prepareHttpGet("").setTargetHostsFromString("www.google.com www.ebay.com www.yahoo.com")
.execute(new ParallecResponseHandler() {
    public void onCompleted(ResponseOnSingleTask res,
        Map<String, Object> responseContext) {
        System.out.println( res.toString() );  }
});

20 Line Example

Now that you have learned the basics, see how easy it is to pass an Elasticsearch client through the response context and aggregate data anywhere you like. You can also put a hash map into the responseContext, save the processed results to the map during onCompleted, and use the map outside the handler for further work.

ParallelClient pc = new ParallelClient();
org.elasticsearch.node.Node node = nodeBuilder().node(); //elastic client initialize
HashMap<String, Object> responseContext = new HashMap<String, Object>();
responseContext.put("Client", node.client());
pc.prepareHttpGet("")
        .setConcurrency(1000).setResponseContext(responseContext)
        .setTargetHostsFromLineByLineText("http://www.parallec.io/userdata/sample_target_hosts_top100_old.txt", HostsSourceType.URL)
        .execute( new ParallecResponseHandler() {
            public void onCompleted(ResponseOnSingleTask res,
                    Map<String, Object> responseContext) {
                Map<String, Object> metricMap = new HashMap<String, Object>();
                metricMap.put("StatusCode", res.getStatusCode().replaceAll(" ", "_"));
                metricMap.put("LastUpdated",PcDateUtils.getNowDateTimeStrStandard());
                metricMap.put("NodeGroupType", "Web100");
                Client client = (Client) responseContext.get("Client");
                client.prepareIndex("local", "parallec", res.getHost()).setSource(metricMap).execute();
            }
        });
node.close();
pc.releaseExternalResources();
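
The hash-map variant mentioned above can be sketched without any Parallec or Elasticsearch dependency. Everything here is a stand-in: `ResponseContextSketch`, its `Handler` interface, and `dispatch` are hypothetical and only imitate the shape of a per-response callback that receives a shared context map. The point is the pattern: put a collector into the context, fill it inside the callback, and read it afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; these types are stand-ins, not Parallec classes. */
final class ResponseContextSketch {

    /** Hypothetical per-response callback that receives a shared context map. */
    interface Handler {
        void onCompleted(String host, int statusCode, Map<String, Object> responseContext);
    }

    /** Feeds canned (host -> status code) responses to the handler, one at a time. */
    static void dispatch(Map<String, Integer> cannedResponses,
                         Map<String, Object> responseContext,
                         Handler handler) {
        for (Map.Entry<String, Integer> e : cannedResponses.entrySet()) {
            handler.onCompleted(e.getKey(), e.getValue(), responseContext);
        }
    }

    /** Counts hosts per status code by writing into a map carried in the context. */
    static Map<Integer, Integer> countByStatus(Map<String, Integer> cannedResponses) {
        Map<Integer, Integer> counts = new TreeMap<>();
        Map<String, Object> responseContext = new HashMap<>();
        responseContext.put("counts", counts); // pass the collector in

        dispatch(cannedResponses, responseContext, (host, statusCode, ctx) -> {
            @SuppressWarnings("unchecked")
            Map<Integer, Integer> c = (Map<Integer, Integer>) ctx.get("counts");
            c.merge(statusCode, 1, Integer::sum); // save the processed result
        });
        return counts; // the same map, now filled, is usable outside the handler
    }
}
```

This sketch dispatches responses sequentially. If callbacks could run concurrently, a thread-safe collector such as ConcurrentHashMap would be the safer choice.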

Different Requests to the Same Target

See how easy it is to use a request template to send multiple different requests to the same target. Variable replacement is allowed in the POST body, URL, and headers.

pc.prepareHttpGet("/userdata/sample_weather_$ZIP.txt")
    .setReplaceVarMapToSingleTargetSingleVar("ZIP",
        Arrays.asList("95037","48824"), "www.parallec.io")
    .execute(new ParallecResponseHandler() {...}...
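
As a rough illustration of what the template above expands to, the following standard-library sketch substitutes each value for the `$ZIP` placeholder and produces one request URL per value. The class `TemplateSketch` and its `expand` method are hypothetical, the `http://` scheme is an assumption, and Parallec's actual replacement logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not how Parallec performs variable replacement. */
final class TemplateSketch {

    /**
     * Builds one URL per value by replacing "$" + varName in the path template.
     * The "http://" scheme is assumed here for illustration.
     */
    static List<String> expand(String host, String pathTemplate,
                               String varName, List<String> values) {
        List<String> urls = new ArrayList<>();
        for (String value : values) {
            // String.replace is a literal (non-regex) replacement, so "$" is safe.
            urls.add("http://" + host + pathTemplate.replace("$" + varName, value));
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> urls = expand("www.parallec.io",
                "/userdata/sample_weather_$ZIP.txt", "ZIP", List.of("95037", "48824"));
        urls.forEach(System.out::println);
    }
}
```

Two values for one variable yield two distinct requests against the same host, which is the behavior the request template is meant to provide.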

What's New

  • 09/2016 Add option to save response headers in HTTP #24.
  • 08/2016 Support Parallel async UDP (via Netty) #41.
  • 07/2016 Support replacing different ports in different requests.
  • 06/2016 Parallel SSH: add option to run commands with sudo and a password.

For more details, please check the Change Log.

Versions

  • The latest production-ready version is 0.10.x, which we use in production.
  • On async-http-client 2.x: the Parallec version built on the more recent async-http-client (currently AHC 2.0.15) is 0.20.0-SNAPSHOT. It has passed comprehensive unit tests but has not yet been used in production. It requires JDK 8 because of AHC 2.x, and can be used with parallec-plugins of the same version, 0.20.0-SNAPSHOT. For details, please check #37.

More Readings

  • More examples on setting the context, sending to Elasticsearch / Kafka, async running, auto progress polling, tracking progress, and TCP/SSH/Ping. A UDP example is also available, with more to come.
  • Set target hosts from a list, a string, line-by-line text, or a JSON path, from local files or remote URLs.
  • Full Documentation
  • Javadoc
  • Ping Demo: ping 8,000 servers within 11.1 seconds; performance test vs. FPing.

User Group

  • Ask a question, and keep up to date on the library development by joining the discussion group / forum: Parallec.io Google Group.
  • Feel free to submit a GitHub issue for any questions or suggestions too.
  • Check the FAQ.

Use Cases

  1. Scalable web server monitoring, management, configuration push, and ping checks.
  2. Asset / server status discovery and remote task execution, either agentless (parallel SSH) or agent-based (parallel HTTP/TCP).
  3. Scalable API aggregation and processing, with a flexible destination: your favorite message queue, storage, or alert engine.
  4. Orchestration and workflows on multiple web servers.
  5. Different parallel requests to a single server with controlled concurrency: a parallel client for CRUD operations on a REST-API-enabled database or web server. Variable replacement is allowed in the POST body, URL, and headers.
  6. Load testing with request template.
  7. Network monitoring with active probing via UDP/Ping etc.