SocialCalc
Author: u | 2025-04-25
SOCIALCALC. A Spreadsheet Activity for Computer Supported Collaborative Learning. AGENDA. INTRODUCTION. COLLABORATIVE LEARNING. OLPC & SOCIALCALC. LEARNING WITH SOCIALCALC. SocialCalc. Contribute to AshishLohia2025/SocialCalc development by creating an account on GitHub.
DanBricklin/socialcalc: SocialCalc for Socialtext - GitHub
EtherCalc chapter for the upcoming book "The Performance of Open Source Applications" (draft; comments welcome!)

From SocialCalc to EtherCalc

Previously, in The Architecture of Open Source Applications, I described SocialCalc, an in-browser spreadsheet system that replaced the server-centric WikiCalc architecture. SocialCalc performs all of its computations in the browser; it uses the server only for loading and saving spreadsheets.

For the Socialtext team, performance was the primary goal behind SocialCalc's design in 2006. The key observation was this: client-side computation in JavaScript, while an order of magnitude slower than server-side computation in Perl, was still much faster than the network latency incurred by AJAX round trips.

Toward the end of the AOSA chapter, we introduced simultaneous collaboration on spreadsheets, using a simple, chatroom-like architecture. However, when we began testing it for production deployment, its performance and scalability fell short of practical use, motivating a series of system-wide rewrites to reach acceptable performance.

This chapter discusses the revamped architecture we built for EtherCalc, a successor to SocialCalc optimized for simultaneous editing. We'll see how we arrived at that architecture, how we used profiling tools, and how we built new tools to overcome performance problems.

Design Constraints

The Socialtext platform offers both behind-the-firewall and in-the-cloud deployment options, which imposes unique constraints on EtherCalc's resource and performance requirements. At the time of this writing, Socialtext requires 2 CPU cores and 4 GB of RAM for vSphere-based intranet hosting; a typical dedicated EC2 instance provides about twice that capacity, with 4 cores and 7.5 GB of RAM.

Deploying on intranets means we can't simply throw hardware at the problem the way multi-tenant, hosted-only systems did (e.g., DocVerse, which later became part of Google Docs); we can assume only a modest amount of server capacity. Compared to intranet deployments, cloud-hosted instances offer better capacity and on-demand extension, but network connections from browsers are typically slower and less reliable.

The new server was written with ZappaJS. Initial micro-benchmarking showed that porting to Node.js cost us about half of our maximum throughput: on a typical Core i5 CPU in 2011, the original Feersum+Tatsumaki stack handled 5,000 requests per second, while Node.js+Express maxed out at 2,800. This performance degradation was acceptable for our first JavaScript port, as it would not significantly increase latency for users, and we expected it to improve over time.

Subsequently, we continued working to reduce client-side CPU use and minimize bandwidth use by tracking each session's ongoing state with server-side SocialCalc spreadsheets.

Server-side SocialCalc

The key enabling technology for our work is jsdom, a full implementation of the W3C Document Object Model, which allows Node.js to load client-side JavaScript libraries within a simulated browser environment. Using jsdom, it's trivial to create any number of server-side SocialCalc spreadsheets, each taking about 30 KB of RAM and running in its own sandbox:

  require! <[ vm jsdom ]>
  create-spreadsheet = ->
    # build a minimal DOM for the client-side SocialCalc library to render into
    document = jsdom.jsdom \<html><body/></html>
    sandbox = vm.createContext window: document.createWindow!
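For readers who prefer plain JavaScript to the LiveScript above, a minimal sketch of the same idea with current jsdom and vm APIs follows; the packed-SocialCalc.js bundle name and the SpreadsheetControl entry point are assumptions for illustration, not EtherCalc's exact code:

  // Sketch: create a sandboxed, server-side SocialCalc spreadsheet on demand.
  const fs = require('fs');
  const vm = require('vm');
  const { JSDOM } = require('jsdom');

  const socialCalcSource = fs.readFileSync('packed-SocialCalc.js', 'utf8');

  function createSpreadsheet() {
    // Simulated browser environment for the client-side SocialCalc library.
    const dom = new JSDOM('<html><body></body></html>');
    const sandbox = vm.createContext({ window: dom.window, document: dom.window.document });
    // Load SocialCalc inside the sandbox and expose a controller instance on it.
    vm.runInContext(socialCalcSource, sandbox);
    vm.runInContext('window.ss = new SocialCalc.SpreadsheetControl()', sandbox);
    return sandbox;
  }

Each call yields an independent context, so a single Node.js process can host many lightweight spreadsheets side by side.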
Each collaboration session corresponds to a sandboxed SocialCalc controller, which executes commands as they arrive. The server then transmits this up-to-date controller state to newly joined clients, removing the need for backlogs altogether.

Satisfied with the benchmarking results, we coded up a Redis-based persistence layer and launched EtherCalc.org for public beta testing. For the next six months, it scaled remarkably well, performing millions of spreadsheet operations without a single incident.

In April 2012, after delivering a talk on EtherCalc at the OSDC.tw conference, I was invited by Trend Micro to participate in their hackathon, adapting EtherCalc into a programmable visualization engine for their real-time web traffic monitoring system. For their use case, we created REST APIs for accessing individual cells with GET and PUT, as well as for POSTing commands directly to a spreadsheet. During the hackathon, the brand-new REST handler received hundreds of calls per second, updating graphs and formula cell contents on the dashboard.
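To make the cell-level REST access concrete, here is a rough Express-based sketch; the URL shapes, the in-memory room map, and the getCell/executeCommand helpers are illustrative assumptions rather than EtherCalc's actual API:

  // Hypothetical REST layer over server-side spreadsheet controllers.
  const express = require('express');
  const app = express();
  app.use(express.text()); // accept raw command strings in request bodies

  const rooms = new Map(); // room name -> server-side SocialCalc controller

  // Read a single cell, e.g. GET /demo/cells/A1
  app.get('/:room/cells/:cell', (req, res) => {
    const sheet = rooms.get(req.params.room);
    if (!sheet) return res.sendStatus(404);
    res.json(sheet.getCell(req.params.cell)); // getCell() is a stand-in accessor
  });

  // Overwrite a single cell, e.g. PUT /demo/cells/A1 with the new content as body
  app.put('/:room/cells/:cell', (req, res) => {
    const sheet = rooms.get(req.params.room);
    if (!sheet) return res.sendStatus(404);
    sheet.executeCommand(`set ${req.params.cell} text t ${req.body}`); // stand-in
    res.sendStatus(204);
  });

  // POST raw SocialCalc commands directly to a spreadsheet.
  app.post('/:room', (req, res) => {
    const sheet = rooms.get(req.params.room);
    if (!sheet) return res.sendStatus(404);
    sheet.executeCommand(req.body);
    res.sendStatus(202);
  });

  app.listen(8000);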
Is there a way to make use of all those spare CPUs in the multi-tenant server?

For other Node.js services running on multi-core hosts, we had used a pre-forking cluster server that creates one process per CPU. However, while EtherCalc does support multi-server scaling with Redis, the interplay of Socket.io clustering with RedisStore in a single server would have massively complicated the logic, making debugging much more difficult. Moreover, if all processes in the cluster were tied up in CPU-bound processing, subsequent connections would still be blocked.

Instead of pre-forking a fixed number of processes, we sought a way to create one background thread for each server-side spreadsheet, thereby distributing the work of command execution across all CPU cores.

For our purposes, the W3C Web Worker API is a perfect match. Originally intended for browsers, it defines a way to run scripts in the background independently, allowing long tasks to execute continuously while keeping the main thread responsive. So we created webworker-threads, a cross-platform implementation of the Web Worker API for Node.js.

Using webworker-threads, it's straightforward to create a new SocialCalc thread and communicate with it:

  { Worker } = require \webworker-threads
  w = new Worker \packed-SocialCalc.js
  w.onmessage = (event) -> ...
  w.postMessage command

(A more complete per-spreadsheet dispatch sketch in plain JavaScript appears below.) This solution offers the best of both worlds: it gives us the freedom to allocate more CPUs to EtherCalc whenever needed, and the overhead of background thread creation remains negligible on single-CPU environments.

Lessons Learned

Unlike the SocialCalc project, with its well-defined specification and team development process, EtherCalc was a solo experiment from mid-2011 to late 2012 and served as a proving ground for assessing Node.js's readiness for production use. This unconstrained freedom afforded an exciting opportunity to try all sorts of alternative languages, libraries, algorithms and architectures. Here, I'd like to share a few lessons I learned during this 18-month experiment.

Constraints are Liberating

In his book The Design of Design, Fred Brooks argues that by shrinking the search space, constraints can actually help a designer focus and make progress faster.
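Here is that dispatch sketch; the room map, the broadcastToRoom callback, and the packed-SocialCalc.js worker script are illustrative assumptions rather than EtherCalc's actual code:

  // Sketch: one background worker per server-side spreadsheet, so CPU-bound
  // recalculation never blocks the main event loop.
  const { Worker } = require('webworker-threads');

  const workers = new Map(); // room name -> Worker running a SocialCalc sandbox

  function workerFor(room) {
    if (!workers.has(room)) {
      const w = new Worker('packed-SocialCalc.js');
      w.onmessage = (event) => {
        // e.g. push the recalculated state back out to the room's clients
        broadcastToRoom(room, event.data);
      };
      workers.set(room, w);
    }
    return workers.get(room);
  }

  function executeCommand(room, command) {
    // The worker does the heavy lifting; this call returns immediately.
    workerFor(room).postMessage(command);
  }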
Each newly joined client had to replay the room's entire command backlog, incurring a significant startup delay before it could make any modifications.

To mitigate this issue, we implemented a snapshot mechanism. After every 100 commands sent to a room, the server polls the state of each active client and saves the latest snapshot it receives next to the backlog. A freshly joined client receives the snapshot along with the commands entered after the snapshot was taken, so it needs to replay at most 99 commands.

This workaround solved the CPU issue for new clients, but created a network performance problem of its own, as it taxes each client's upload bandwidth. Over a slow connection, this delays the reception of subsequent commands from that client. Moreover, the server has no way to validate the consistency of snapshots submitted by clients, so an erroneous -- or malicious -- snapshot can corrupt the state for all newcomers, putting them out of sync with existing peers.

An astute reader may note that both problems are caused by the server's inability to execute spreadsheet commands. If the server could update its own state as it receives each command, it would not need to maintain a command backlog at all.

The in-browser SocialCalc engine is written in JavaScript. We considered translating that logic into Perl, but that would carry the steep cost of maintaining two codebases going forward. We also experimented with embedded JS engines (V8, SpiderMonkey), but they imposed their own performance penalties when running inside Feersum's event loop. Finally, by August 2011, we resolved to rewrite the server in Node.js.

Porting to Node.js

The initial rewrite went smoothly because both Feersum and Node.js are based on the same libev event model, and Pocket.io's API closely matches Socket.io's. It took us only an afternoon to code up a functionally equivalent server in just 80 lines of code, thanks to the concise API offered by ZappaJS.
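For comparison, a functionally similar command-relay server written against today's Express and Socket.IO APIs (rather than the 2011-era Zappa stack) might look like this; the event names and the in-memory backlog are assumptions for illustration, not EtherCalc's actual protocol:

  // Chatroom-style spreadsheet command relay with a per-room backlog.
  const http = require('http');
  const express = require('express');
  const { Server } = require('socket.io');

  const app = express();
  const server = http.createServer(app);
  const io = new Server(server);

  const backlogs = new Map(); // room name -> array of spreadsheet commands

  io.on('connection', (socket) => {
    socket.on('join', (room) => {
      socket.join(room);
      if (!backlogs.has(room)) backlogs.set(room, []);
      // Replay the room's history so the newcomer catches up.
      socket.emit('backlog', backlogs.get(room));
    });

    socket.on('command', (room, command) => {
      backlogs.get(room).push(command);
      // Relay to everyone else in the room.
      socket.to(room).emit('command', command);
    });
  });

  server.listen(8000);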