Why would we ever build a distributed computing platform in Node?

Presented by Gord Tanner
www.bithound.io / @bithoundio

Who am I?

www.bithound.io


Gord Tanner (@gordtanner)
Co Founder / CTO

Founding Team

http://www.communitech.ca/main-communitech/view-from-the-loo-bithound-house-builds-on-downtown-momentum/#.VNTzk8Z-9TY

What do we do?

Code quality, maintainability, and stability for all of your public and private project repositories.


www.bitHound.io

Problem:

                      > time jshint .

                      ---------------------
                      (lots of errors here)
                      ---------------------

                      7368 errors

                      real  0m29.321s
                      user  0m28.627s
                      sys 0m0.508s

          
- Do this for 1K+ commits in a project
- There are many more things to look at

Waiting sucks

concurrency

In computer science, concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other. The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same processor, or executed on physically separated processors.

Why JavaScript?

https://chrome.google.com/webstore/detail/ripple-emulator-beta/geelfhphabnejjhdalkjhgipohgpdnoc?hl=en

is

concurrency

In computer science, concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other. The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same processor, or executed on physically separated processors.

Shared Memory

Concurrent components communicate by altering the contents of shared memory locations (exemplified by Java and C#). This style of concurrent programming usually requires the application of some form of locking (e.g., mutexes, semaphores, or monitors) to coordinate between threads. A program that properly implements any of these is said to be thread-safe . http://en.wikipedia.org/wiki/Concurrent_computing

Message Passing

Concurrent components communicate by exchanging messages (exemplified by Scala, Erlang and occam). The exchange of messages may be carried out asynchronously, or may use a synchronous "rendezvous" style in which the sender blocks until the message is received. Asynchronous message passing may be reliable or unreliable (sometimes referred to as "send and pray"). Message-passing concurrency tends to be far easier to reason about than shared-memory concurrency, and is typically considered a more robust form of concurrent programming.
http://en.wikipedia.org/wiki/Concurrent_computing
Node is a platform for easily building fast, scalable network applications. Node uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.
http://nodejs.org/

The cost of I/O

    L1-cache                  3 cycles
    L2-cache                 14 cycles
    RAM                     250 cycles
    Disk             41 000 000 cycles
    Network         240 000 000 cycles

          
http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/

Async IO

        function fibonacci(n) {
          if (n < 2)
            return 1;
          else
            return fibonacci(n-2) + fibonacci(n-1);
        }

        http.createServer(function (req, res) {
          res.writeHead(200, {'Content-Type': 'text/plain'});
          res.end(fibonacci(40));
        }).listen(1337, "127.0.0.1");
        
https://www.semitwist.com/mirror/node-js-is-cancer.html

Am i going to "create" another distributed framework?

"this is a very personal choice"

https://github.com/bithound/farm.bithound.io
ØMQ looks like an embeddable networking library but acts like a concurrency framework.
                  .------------.        .--------------.
                  |            |        |              |
                  | TCP socket +------->|              | ZAP!
                  |            | BOOM!  |  0MQ socket  |
                  '------------'        |              |  POW!!
                    ^    ^    ^         |              |
                    |    |    |         '--------------'
                    |    |    |
                    |    |    |
                    |    |    '--------- Spandex
                    |    |
                    |    '-------------- Cosmic rays

                   Illegal radioisotopes from
                   secret Soviet atomic city
          

Sockets that carry atomic messages across various transports like:

  • in-process
  • inter-process (shared memory)
  • TCP
  • multicast

You can connect sockets N-to-N with patterns like

  • fan-out, fan-in
  • publish-subscribe
  • task distribution
  • request-reply

Tasks

  • understand your project
  • assess your applications needs
  • only then define your tasks

Roles

                   .------------.             .------------.
                   |            |             |            |
                   |   MASTER   |             |   SLAVE    |
                   |            |             |            |
                   .------------.             .------------.

                   - starts jobs              - works on jobs
                   - breaks down work
                   - listens to status

          
     .------------. Hey, can you do this thing for me?  .------------.
     |            |------------------------------------>|            |
     |   MASTER   |  Sure! here you go buddy            |   SLAVE    |
     |            |<------------------------------------|            |
     '------------'                                     '------------'
          
                                                           .------------.
                                                          .------------.|
                                                         .------------.||
     .------------.  here are a bunch of things to do!  .------------.|||
     |            |------------------------------------>|            |||'
     |   MASTER   |  Sure! here you go buddy            |   SLAVE    ||'
     |            |<------------------------------------|            |'
     '------------'                                     '------------'
          

     .--------. do this!     .-------. thats a lot of work! .-------.
     |        |------------->|       |  can you help me?    |       |
     | MASTER |  here you go | SLAVE |--------------------->| SLAVE |
     |        |<-------------|       |<---------------------|       |
     '--------'              '-------'                      '-------'
          
          .what is the difference?.
          |                       |
          V                       V
     .--------. do this!     .-------. thats a lot of work! .-------.
     |        |------------->|       |  can you help me?    |       |
     | MASTER |  here you go | SLAVE |--------------------->| SLAVE |
     |        |<-------------|       |<---------------------|       |
     '--------'              '-------'                      '-------'
     
          

There is no difference

     .--------. do this!     .--------. thats a lot of work! .--------.
     |        |------------->|        |  can you help me?    |        |
     | WORKER |  here you go | WORKER |--------------------->| WORKER |
     |        |<-------------|        |<---------------------|        |
     '--------'              '--------'                      '--------'
     
          
            var farm = require('farm');

            // I need someone to do something for me
            var task = {};
            farm.jobs.send(task, function (err, result) { });

            // I have a bunch stuff people can work on for me
            var jobs = [j1, j2, j3, j4, .....];
            farm.jobs.distribute(tasks, function (err, result) { });

            // global cluster pub / sub 
            farm.events.publish('new_sha', {sha: '', owner: '', repo: ''});
            farm.events.subscribe('new_sha', function (event) { });

            farm.worker(function (task, callback) {
               //do stuff
               callback(err, result);
            });
          
            var farm = require('farm');

            // I need someone to do something for me
            var task = {};
            farm.jobs.send(task, function (err, result) { });

            // I have a bunch stuff people can work on for me
            var jobs = [j1, j2, j3, j4, .....];
            farm.jobs.distribute(tasks, function (err, result) { });

            // global cluster pub / sub 
            farm.events.publish('new_sha', {sha: '', owner: '', repo: ''});
            farm.events.subscribe('new_sha', function (event) { });

            farm.worker(function (task, callback) {
               //do stuff
               callback(err, result);
            });
          
            var farm = require('farm');

            // I need someone to do something for me
            var task = {};
            farm.jobs.send(task, function (err, result) { });

            // I have a bunch stuff people can work on for me
            var jobs = [j1, j2, j3, j4, .....];
            farm.jobs.distribute(tasks, function (err, result) { });

            // global cluster pub / sub 
            farm.events.publish('new_sha', {sha: '', owner: '', repo: ''});
            farm.events.subscribe('new_sha', function (event) { });

            farm.worker(function (task, callback) {
               //do stuff
               callback(err, result);
            });
          
            var farm = require('farm');

            // I need someone to do something for me
            var task = {};
            farm.jobs.send(task, function (err, result) { });

            // I have a bunch stuff people can work on for me
            var jobs = [j1, j2, j3, j4, .....];
            farm.jobs.distribute(tasks, function (err, result) { });

            // global cluster pub / sub 
            farm.events.publish('new_sha', {sha: '', owner: '', repo: ''});
            farm.events.subscribe('new_sha', function (event) { });

            farm.worker(function (task, callback) {
               //do stuff
               callback(err, result);
            });
          
            var farm = require('farm');

            // I need someone to do something for me
            var task = {};
            farm.jobs.send(task, function (err, result) { });

            // I have a bunch stuff people can work on for me
            var jobs = [j1, j2, j3, j4, .....];
            farm.jobs.distribute(tasks, function (err, result) { });

            // global cluster pub / sub 
            farm.events.publish('new_sha', {sha: '', owner: '', repo: ''});
            farm.events.subscribe('new_sha', function (event) { });

            farm.worker(function (task, callback) {
               //do stuff
               callback(err, result);
            });
          

Launch Day!

what works on one machine well, doesn't always work as well on 200 machines

Dev Env (on OSX)

                               #-------------#
                               |             |
                               |   WEB APP   |
                               |             |
                               '-------------'
                                      |
                                      v               
                                .------------.
                                |            |
                                |   BROKER   |
                                |            |
                                #------------#
                                      |
                                      v               
                                .------------.
                                |            |
                                |   WORKER   |
                                |            |
                                #------------#

          

Prod Env

                               #-------------#
                               |             |
                               |   WEB APP   |
                               |             |
                               '-------------'
                                      |
                                      v               
                                .------------.
                                |            |
                                |   BROKER   |
                                |            |
                                #------------#
                                      |
                                .------------.
                                |  100s OF   |
                                |   BOXES    |
                                '------------'
                                      |
                      .---------------+---------------.
                      |               |               |
                      v               v               v
                .------------.  .------------.  .------------.
                |            |  |            |  |            |
                |   WORKER   |  |   WORKER   |  |   WORKER   |
                |            |  |            |  |            |
                #------------#  #------------#  #------------#

          
https://www.vagrantup.com/

PROTIP: If you are developing a distributed application, you should work in a distributed environment

Pets or Cattle?

Rube Goldberg Machine

You are the one

"The function of the One is now to return to the Source allowing a temporary dissimination of the code you carry reinserting the Prime Program." - The Architect

So why node?

Thank you!

www.bithound.io



www.bithound.io / @bithoundio