Kevin Cuzner's Personal Blog

Electronics, Embedded Systems, and Software are my breakfast, lunch, and dinner.


KnockoutJS and Memory Usage

Recently at work I have been using KnockoutJS for structuring my Javascript. To be honest, I think it is probably the best thing since jQuery in terms of cutting down the quantity of code that one must write for an interface. The only problem, however, is that it is really easy to make a page use a ridiculous amount of memory. After a lot of thinking and trying different things, I have realized the proper way to do things with more complex pages.

The KnockoutJS documentation is really great, but it is geared towards the basics so that you can get started quickly, and it doesn't say much about more complex cases. That leads to comments like the answer here saying that Knockout isn't so good for complex user interfaces. When things get more complex, like interfacing it with existing applications built on different frameworks or handling very large quantities of data, the documentation doesn't say much and leaves you to figure it out on your own. I had one particular project which could display several thousand items in a graph/tree format while calculating multiple inheritance and parentage on several values stored in each item object. In Chrome I witnessed this page use 800 MB easily. In Firefox it was about the same. Internet Explorer got to 1.5 GB before I shut it off. Why was it using so much memory? Here is an example that wouldn't use a ton of memory, but illustrates the error I made:

Example

Javascript (note that this assumes usage of jQuery for things like AJAX):

function ItemModel(id, name) {
    var self = this;
    this.id = id;
    this.name = ko.observable(name);
    this.editing = ko.observable(false);
    this.save = function () {
        //logic that creates a new item if the id is null or just saves the item otherwise
        //through a call to $.ajax
    };
}

function ItemContainerModel(id, name, items) {
    var self = this;
    this.id = id;
    this.name = ko.observable(name);
    //editing has to be declared as an observable before anything can call editing(true) on it
    this.editing = ko.observable(false);
    this.items = ko.observableArray(items);
    this.save = function () {
        //logic that creates a new item container if the id is null or just saves the item container otherwise
        //through a call to $.ajax
    };
    this.add = function () {
        var aNewItem = new ItemModel(null, null);
        aNewItem.editing(true);
        self.items.push(aNewItem);
    };
    this.remove = function (item) {
        //$.ajax call to the server to remove the item
        self.items.remove(item);
    };
}

function ViewModel() {
    var self = this;
    this.containers = ko.observableArray();
    var blankContainer = new ItemContainerModel(null, null, []);
    this.selected = ko.observable(blankContainer);
    this.add = function () {
        var aNewContainer = new ItemContainerModel(null, null, []);
        aNewContainer.editing(true);
        self.containers.push(aNewContainer);
    };
    this.remove = function (container) {
        //$.ajax call to the server to remove the container
        self.containers.remove(container);
    };
    this.select = function (container) {
        self.selected(container);
    };
}

$(document).ready(function () {
    var vm = new ViewModel();
    ko.applyBindings(vm);
});

Now for a really simple view (sorry for lack of styling or the edit capability, but hopefully the point will be clear):

<a data-bind="click: add" href="#">Add container</a>
<ul data-bind="foreach: containers">
    <li><span data-bind="text: name"></span> <a data-bind="click: save" href="#">Save</a> <a data-bind="click: $parent.remove" href="#">Remove</a></li>
</ul>
<div data-bind="with: selected">
    <a data-bind="click: add" href="#">Add item</a>
    <div data-bind="foreach: items">
        <div data-bind="text: name"></div>
        <a data-bind="click: save" href="#">Save</a>
        <a data-bind="click: $parent.remove" href="#">Remove</a>
    </div>
</div>

The Problem

So, what is the problem here with this model? It works just fine... you can add, remove, save, and display items in a collection of containers. However, if this view were to contain, say, 1000 containers with 1000 items each, what would happen? Well, we would have a lot of memory usage. Now, you could say that would happen no matter what you did and you wouldn't be wrong. The question here is, how much memory is it going to use? The example above is far from the most efficient way of structuring a model and will consume much more memory than is necessary. Here is why:

Note how the save, add, and remove functions are implemented. They are declared attached to the this variable inside each object. Now, in languages like C++, C#, or Java, adding functions to an object (that is what attaching the function to the this variable does, if you aren't as familiar with objects in Javascript) generally does not increase per-object memory usage; it just makes the program larger, since all instances of a class share the same compiled code. However, Javascript is different.

Javascript uses what are called closures. A closure is a very powerful tool that allows for intuitive accessing and scoping of variables seen by functions. I won't go into great detail on the awesome things you can do with these since many others have explained it better than I ever could. Another thing that Javascript does is treat functions as "first-class citizens", which essentially means that a function is a value like any other: it can be stored in a variable, passed as an argument, or attached to an object. This allows you to assign a function to a variable (var variable = function () { alert("hi"); };) so that you can call variable() and it executes the function as if "variable" were the name of the function.

Now, tying all that together, here is what happens: closures "wrap up" everything in the scope of a function when it is declared so that it has access to all the variables that were visible at that point. By treating functions like values and assigning a function to a property of the this object, you extend the this object to hold whatever that property holds. Declaring the functions inline, like the add, remove, and save functions above, while in the scope of the object causes them to become specific to that particular instance of the object.

Allow me to explain a bit: every time that you call 'new ItemModel(...)', in addition to creating a new item model, it creates a new function: this.save. Every single ItemModel created has its very own instance of this.save. They don't share the same function. Likewise, when we create a new ItemContainerModel, 3 new functions are created specific to that instance of the ItemContainerModel. That means that if we were to create two containers with 3 items each inside, we would get 12 functions created (6 for the items, 6 for the containers).

In some cases this is very useful since it lets you create custom methods for each object. To use the example of the item save function, instead of having to access the 'id' variable as stored in the object, it could use one of the function parameters in 'function ItemModel(...)' inside the save function. This works because the closure wrapped up the variables passed into the ItemModel function, since they were in scope for the this.save function. By doing this, you could have the this.save function modify something different for each instance of the ItemModel. However, in our situation this is more of an issue than a benefit: we just redundantly created 8 functions that do the exact same thing as 4 other functions that already exist. Each of those functions consumes memory, and after thousands of these objects are made, that usage gets to be quite large.
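To see the duplication directly, here is a minimal sketch in plain Javascript. The constructors are simplified stand-ins for the models above (plain values and arrays instead of Knockout observables, which is my assumption to keep it runnable anywhere), and record is a hypothetical helper that just collects distinct function objects.

```javascript
// Simplified stand-ins for the models above (no Knockout needed).
function ItemModel(id) {
    this.id = id;
    // A new function object is created on every call to "new ItemModel".
    this.save = function () { return "saving item " + id; };
}

function ItemContainerModel(id, items) {
    var self = this;
    this.id = id;
    this.items = items;
    // Three more new function objects per container.
    this.save = function () { return "saving container " + id; };
    this.add = function () { self.items.push(new ItemModel(null)); };
    this.remove = function (item) {
        self.items.splice(self.items.indexOf(item), 1);
    };
}

function makeContainer(id) {
    return new ItemContainerModel(id, [
        new ItemModel(1), new ItemModel(2), new ItemModel(3)
    ]);
}

var containers = [makeContainer("a"), makeContainer("b")];

// Hypothetical helper: remember each distinct function object once.
var seen = [];
function record(fn) {
    if (seen.indexOf(fn) === -1) { seen.push(fn); }
}

containers.forEach(function (container) {
    record(container.save);
    record(container.add);
    record(container.remove);
    container.items.forEach(function (item) { record(item.save); });
});

var sameFunction = containers[0].save === containers[1].save; // false
var distinctCount = seen.length; // 12 for two containers of three items
```

Even though every save does the same job, no two instances share a function object, which is exactly the overhead that grows with the number of models.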

Solution

How can this be fixed? What we need to do is to reduce the number of anonymous functions that are created. We need to remove the save, add, and remove functions from the ItemModel and ItemContainerModel. As it turns out, the structure of Knockout is geared towards doing something which can save us a lot of memory usage.

When an event binding like 'click' fires, the binding passes an argument into the handler function: the model that the bound element represents. This tells the function who called it. We already see this in use in the example with the remove functions: the first argument is the model that the clicked element was bound to. We can use this to fix our problem.
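As a rough illustration of that calling convention, here is a sketch in plain Javascript. simulateClick is a hypothetical stand-in for what the click binding does when a link is clicked (it is not part of Knockout), and saveAny is a made-up handler name. The point is only that one shared function can serve every model because the model arrives as the first argument, with the event following as the second.

```javascript
// Hypothetical stand-in for the click binding: call the handler with the
// bound model as the first argument and the event as the second.
function simulateClick(handler, model) {
    var fakeEvent = { type: "click" };
    return handler(model, fakeEvent);
}

// One shared handler. It finds out which model was clicked from its argument
// instead of relying on a closure created per model.
function saveAny(model) {
    return "saving " + model.id;
}

var first = { id: 1 };
var second = { id: 2 };

var results = [
    simulateClick(saveAny, first),
    simulateClick(saveAny, second)
];
// results is ["saving 1", "saving 2"]
```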

First, we must remove all functions from the models that will be duplicated often. This means that the add, remove, and save functions in the ItemContainer and the save function in the Item models have to go. Next, we create back references so that each contained object outside the viewmodel and its direct children knows who its daddy is. Here is an example:

function ItemModel(id, name, container) {
    //note the addition of the container argument

    //...keep the same variables as before, but remove the this.save stuff

    this.container = container; //add this as our back reference
}

function ItemContainerModel(id, name) {
    //NOTE 1: this didn't need an argument for a back reference. This is because it is a direct child of the root model and
    //since the root model contains the functions dealing with adding and removing containers, it already knows the array to
    //manipulate

    //NOTE 2: the items argument has been removed. This is so that the container can be created before the items and the back
    //reference above can be completed. So, the process for creating a container with items is now: create container, create
    //items with a reference to the container, and then add the items to the container by doing container.items(arrayOfItems);

    //remove all the functions from this model as well
}

function ViewModel() {
    //all the stuff we already had here from the example above stays

    //we add the following:
    this.saveItem = function (item) {
        //instead of using self.id and self.name() when creating our ajax request, we use item.id and item.name()
    };
    this.saveContainer = function (container) {
        //instead of using self.id and self.name() when creating our ajax request, we use container.id and container.name()
    };
    this.addItem = function (container) {
        var aNewItem = new ItemModel(null, null, container);
        aNewItem.editing(true);
        container.items.push(aNewItem);
    };
    this.removeItem = function (item) {
        //create a $.ajax request to remove the item based on its id
        item.container.items.remove(item); //using our back reference, we can remove the item from its parent container
    };
}

The view will now look like so (note that the bindings to functions now reference $root: the main ViewModel):

<a data-bind="click: add" href="#">Add container</a>
<ul data-bind="foreach: containers">
    <li><span data-bind="text: name"></span> <a data-bind="click: $root.saveContainer" href="#">Save</a> <a data-bind="click: $root.remove" href="#">Remove</a></li>
</ul>
<div data-bind="with: selected">
    <a data-bind="click: $root.addItem" href="#">Add item</a>
    <div data-bind="foreach: items">
        <div data-bind="text: name"></div>
        <a data-bind="click: $root.saveItem" href="#">Save</a>
        <a data-bind="click: $root.removeItem" href="#">Remove</a>
    </div>
</div>

Now, that wasn't so hard, was it? What we just did was make it so that each model only uses memory for its variables and no closures have to be created per model. By moving the individual model functions down to the ViewModel we kept the same functionality as before, did not increase our code size, and significantly reduced memory usage when the model starts to get really big. If we were to create 2 containers with 3 items each, we would create no functions beyond the ones that already exist, once, inside the ViewModel. The only memory consumed by each model is the space needed for storing the actual values represented (id, name, etc).
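Here is a minimal sketch of the creation order with back references: create the container, create the items with a reference to it, then hand the items to the container. To keep it runnable without Knockout, plain values and a plain array stand in for the observables (with Knockout you would call container.items(arrayOfItems) and items.remove(item) instead). buildContainer and the shape of its data argument are hypothetical, not something from the project above.

```javascript
function ItemModel(id, name, container) {
    this.id = id;
    this.name = name;
    this.container = container; // back reference to the parent container
}

function ItemContainerModel(id, name) {
    this.id = id;
    this.name = name;
    this.items = []; // stands in for ko.observableArray()
}

// Exists exactly once, no matter how many items are created.
function removeItem(item) {
    var items = item.container.items;
    var index = items.indexOf(item);
    if (index !== -1) { items.splice(index, 1); }
}

// Hypothetical loader: container first, then items pointing back at it.
function buildContainer(data) {
    var container = new ItemContainerModel(data.id, data.name);
    container.items = data.items.map(function (raw) {
        return new ItemModel(raw.id, raw.name, container);
    });
    return container;
}

var box = buildContainer({
    id: 1,
    name: "Box",
    items: [{ id: 10, name: "a" }, { id: 11, name: "b" }, { id: 12, name: "c" }]
});

removeItem(box.items[1]);
var remainingIds = box.items.map(function (item) { return item.id; }); // [10, 12]

// Count the functions stored directly on one item: there are none.
var ownFunctions = Object.keys(box.items[0]).filter(function (key) {
    return typeof box.items[0][key] === "function";
}).length; // 0
```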

Summary

In summary, to reduce KnockoutJS memory usage consider the following:

  • Reduce the number of functions inside the scope of each model. Move functions to the lowest possible place in your model tree to avoid unnecessary duplication.
  • Avoid closures inside heavily duplicated models like the plague. I know I didn't cover this above, but be careful with computed observables and their functions. It may be better to declare the bulk of a function for a computed observable outside the function and then use it like so: 'this.aComputedObservable = ko.computed(function () { return aFunctionThatYouCreated(self); });' where self was earlier declared to be this in the scope of the model itself. That way the computed observable function still has access to the contents of the model while keeping the actual memory usage in the model itself small.
  • Be very very slim when creating your model classes. Only put data there that will be needed.
  • Consider pagination or something. If you don't need 1000 objects displayed at the same time, don't display 1000 objects at the same time. There is a server there to store the information for a reason.
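The computed observable suggestion can be sketched like this. describeItem is a hypothetical name for the shared function (the aFunctionThatYouCreated of the bullet above). The small ko stand-in at the top is my own assumption so the sketch also runs where Knockout isn't loaded; it is not how Knockout is implemented, and unlike the real ko.computed it does no caching or dependency tracking. With Knockout loaded, the stand-in is skipped.

```javascript
// Minimal stand-in used only when Knockout is not loaded (an assumption for
// this sketch; the real library does far more than this).
var ko = typeof ko !== "undefined" ? ko : {
    observable: function (value) {
        return function (next) {
            if (arguments.length > 0) { value = next; }
            return value;
        };
    },
    computed: function (read) {
        return function () { return read(); };
    }
};

// The bulk of the logic is declared once and shared by every ItemModel.
function describeItem(model) {
    return model.id + ": " + model.name();
}

function ItemModel(id, name) {
    var self = this;
    this.id = id;
    this.name = ko.observable(name);
    // Only this tiny wrapper is created per instance.
    this.label = ko.computed(function () { return describeItem(self); });
}

var item = new ItemModel(7, "widget");
var label = item.label(); // "7: widget"
```

A small wrapper function still exists per model, but whatever sits inside describeItem is stored only once.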

The first week or two with Arch Linux

After some frustrating times involving Ubuntu 12.04, hibernation, suspending, and random freezing, I decided I needed to try something different. Being a Sandy Bridge desktop, my computer naturally seems to have a slight problem with Linux support in general. Don't get me wrong, I really like my computer and my processor...however, the hardware drivers at times frustrate me. So, at my wit's end, I decided to do something crazy and take the plunge to a bleeding edge rolling release Linux: Arch Linux.

Arch Linux is interesting for me since it's the first time I have used an operating system without the "version" paradigm. Since it's a rolling release it is prone to more problems, but it also gives the advantage of always being up to date. Since my computer's hardware is relatively new (it has been superseded by Ivy Bridge, but even so its driver support still seems to be under construction), I felt that I had more to gain from a rolling release where new updates come out almost immediately (think kernel 3.0 to 3.2...Sandy Bridge processors suddenly got much better power management). So, without further ado, here are my plusses and minuses (note that this will end up comparing Ubuntu to Arch a lot since that's all I know at the moment):

Plusses:

  • It was actually very easy to install. Since I have had problems with net installs, I did a core install and then updated it. I practiced several times beforehand on virtual machines, including using existing partitions and such. Although the initial downloads took some time (the lack of curl in the core install was kind of upsetting since I couldn't use rankmirrors to get better speeds), after that it was pretty fast. Thanks to considerable documentation and a few practice runs, getting an X environment set up using Gnome 3 (and gdm...I like the graphical logins) didn't take long at all. It took a bit of coaxing (read: Google + Arch wiki) to get things such as networkmanager running, but with time I had it all figured out and I managed to get the whole system running more or less stably within a day.
  • It boots faster than Ubuntu and is more explicit about what exactly it's doing. I liked the Ubuntu moving logo thing, but I do actually enjoy seeing what the computer is doing when it boots. For purely aesthetic reasons, it gives the computer more of a "raw" feel which, for some strange masochistic reason, I enjoy. The slowest part is initializing the networks, and if that didn't have to happen the entire system could boot in under 60 seconds after the BIOS gets done showing off its screen.
  • The documentation is awesome. Clearly, people have spent lots and lots of time writing the documentation in the wiki for Arch. It certainly made setting up easier since many of the random corner cases were in the troubleshooting section of several articles and I ended up running into a couple of them. One thing that was easier to set up than in Ubuntu was suspending and hibernating (at least getting it to work reliably). With some help from the forum (see next point) and a few pages of documentation on pm-utils I got suspend, resume, and hibernate (!!!) running. I haven't even gotten hibernate to work in Windows.
  • The community is great. I have rarely been able to get a question answered on the Ubuntu forums since they are so congested. I asked a question on the Arch BBS and in less than a day I had a response and was able to do some trial and error and troubleshooting involving the suspend and hibernate functionality of my computer.

Minuses:

  • The rolling release model breaks things occasionally. Recently, the linux-firmware package was updated and this caused my wireless card to stop working since it could no longer find the drivers. I wasn't sure why, but I have just downgraded the package and blacklisted it for upgrades. Hopefully that doesn't kill me later (it probably will), but if it does by then I hope to have figured out what is wrong.
  • With great power comes great responsibility. The sheer flexibility is great since I don't have a bunch of extra packages I don't need, but at the same time, when I was practicing with the VMs, I was able to get myself stuck in a hole where the only solution was to re-format the drive. However, ever since a mishap with Ubuntu (the theming engine changed all my stuff to black on black or white on white for the text) I have separated out my home folder from the system so that I can easily re-format and re-install the system without losing all my stuff (all 132 GB of it).
  • This isn't a problem for most people, but Arch doesn't access the hard drive as often as other distros. Why is that a con for me? Well, I have a Western Digital Green hard drive with an automated parking feature which parks the heads after 10 seconds of inactivity. In Windows this doesn't matter, but since Linux touches the filesystem every 11-15 seconds or so, that results in a LOT of head parkings. Considering that the heads are only rated for 300K cycles and people have reported reaching that in less than a year, it is a real issue. I run a program (wdantiparkd) which writes to the hard drive every 7 seconds while watching whether anything else has written to it, so that the drive parks after 10 minutes of inactivity rather than 10 seconds. It helps, but it worked better on Ubuntu.

Overall, this experience with Arch has allowed me to become much more familiar with Linux and its guts, and slowly but surely I am getting better at fixing issues. If you are considering a switch from your present operating system and already have experience with Linux (especially the command line, since that's what you are stuck with starting out before you install xfce, Gnome, KDE, etc), I would recommend this distribution. Of course, if you get easily frustrated with problems and don't enjoy solving them, perhaps a little more stability would be something to look for instead.

Here is my desktop as it stands:

Screenshot-from-2012-07-09-085931.png