@g

Previous Entry Share Next Entry
#Understanding hidden classes in v8 - real picture.
Mister Wiseman
mr_aleph
Today I noticed a blog post titled Understanding hidden classes in v8. Here is a small analysis of what really happens inside examples included into it. [I was unable to post my comment there due to anti-spam protection].


Lets look under the hood. First we will check overhead of underlying storage reallocation and GCs.

GC stats can be easily obtained with --trace-gc flag passed to V8 shell.

To measure time spent on copying of underlying storage we need to hook into JSObject::SetFastElementsCapacityAndLength which gets called when V8 sees that underlying fast elements storage needs to grow. This function is pretty simple (yet expensive for large arrays): it allocates a new bigger storage and simply copies elements from old storage one by one.

--- src/objects.cc      (revision 5389)
+++ src/objects.cc      (working copy)
@@ -5574,6 +5574,7 @@
   // We should never end in here with a pixel or external array.
   ASSERT(!HasPixelElements() && !HasExternalArrayElements());
 
+  double start = OS::TimeCurrentMillis();
   Object* obj = Heap::AllocateFixedArrayWithHoles(capacity);
   if (obj->IsFailure()) return obj;
   FixedArray* elems = FixedArray::cast(obj);
@@ -5617,6 +5618,9 @@
     JSArray::cast(this)->set_length(Smi::FromInt(length));
   }
 
+  PrintF("SetFastElementsCapacityAndLength took %d\n",
+         static_cast<int>(OS::TimeCurrentMillis() - start));
+
   return this;
 }


Now lets slightly modify examples from original blog post to see when time measurement starts and stops:

test1.js
var PROPERTIES = 10000000;
var o = {};

var start = +new Date;

print("started");

for (var i = 0; i < PROPERTIES; i++) {
  o[i] = i;
}

print("ended");

print(+new Date - start);


test2.js
var PROPERTIES = 10000000;

function O(size) {
  for (var i = 0; i < size; i++) {
    this[i] = null;
  }
}

var o = new O(PROPERTIES);

var start = +new Date;

print ("started");

for (var i = 0; i < PROPERTIES; i++) {
  o[i] = i;
}

print ("ended");

print(+new Date - start);


Runs of modified examples on instrumented V8 shell (x64 arch) produce the following output:

mraleph@mentor:~/sources/v8$ ./shell test1.js --trace-gc
SetFastElementsCapacityAndLength took 0
started
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
Scavenge 1.5 -> 1.0 MB, 0 ms.
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 1
SetFastElementsCapacityAndLength took 1
SetFastElementsCapacityAndLength took 2
Mark-sweep 9.1 -> 3.7 MB, 2 ms.
SetFastElementsCapacityAndLength took 3
SetFastElementsCapacityAndLength took 6
Mark-sweep 14.8 -> 7.4 MB, 4 ms.
SetFastElementsCapacityAndLength took 9
Mark-sweep 17.3 -> 10.7 MB, 5 ms.
SetFastElementsCapacityAndLength took 14
Mark-sweep 25.6 -> 15.6 MB, 7 ms.
SetFastElementsCapacityAndLength took 22
Mark-sweep 38.0 -> 23.1 MB, 11 ms.
SetFastElementsCapacityAndLength took 32
Mark-sweep 56.7 -> 34.3 MB, 16 ms.
SetFastElementsCapacityAndLength took 49
Mark-sweep 84.7 -> 51.1 MB, 25 ms.
SetFastElementsCapacityAndLength took 74
Mark-sweep 126.6 -> 76.3 MB, 36 ms.
SetFastElementsCapacityAndLength took 111
ended
734
mraleph@mentor:~/sources/v8$ ./shell test2.js --trace-gc
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
Scavenge 1.5 -> 1.0 MB, 0 ms.
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 0
SetFastElementsCapacityAndLength took 1
SetFastElementsCapacityAndLength took 1
SetFastElementsCapacityAndLength took 2
Mark-sweep 9.1 -> 3.7 MB, 2 ms.
SetFastElementsCapacityAndLength took 4
SetFastElementsCapacityAndLength took 7
Mark-sweep 14.8 -> 7.4 MB, 4 ms.
SetFastElementsCapacityAndLength took 9
Mark-sweep 17.3 -> 10.7 MB, 6 ms.
SetFastElementsCapacityAndLength took 14
Mark-sweep 25.6 -> 15.6 MB, 8 ms.
SetFastElementsCapacityAndLength took 22
Mark-sweep 38.0 -> 23.1 MB, 12 ms.
SetFastElementsCapacityAndLength took 33
Mark-sweep 56.7 -> 34.3 MB, 18 ms.
SetFastElementsCapacityAndLength took 49
Mark-sweep 84.7 -> 51.1 MB, 26 ms.
SetFastElementsCapacityAndLength took 74
Mark-sweep 126.6 -> 76.3 MB, 39 ms.
SetFastElementsCapacityAndLength took 110
started
ended
189


Simple math shows that GC + underlying storage relocation takes almost 450ms.

When we substract 450ms from 734ms [test1.js timing result] we get 284ms. So we still have some mysterious ~100ms difference [284ms vs. 189ms test2.js takes]. So where this comes from?

[UPD 3 Sep 2010: Actually I am not sure. Initial analysis was incorrect (I overlooked fact that test2.js also has globally declared o. When you change that you also get ~70ms speedup. For now my only hypothesis is that 100 ms might be an overhead of switching from KeyedStoreIC to runtime system for each store that requires growth of underlying backing store. ]

Answer is simple and surprising: it comes from var o being declared in the global scope. Lets do a simple change to avoid global scope:

test3.js
var PROPERTIES = 10000000;

function foo (x) {
  for (var i = 0; i < PROPERTIES; i++) {
    x[i] = i;
  }
}

var o = {};

var start = +new Date;

print ("started");

foo(o);

print ("ended");

print(+new Date - start);


mraleph@mentor:~/sources/v8$ ./shell test3.js
started
ended
641


We can see that test3.js takes 100ms less than test1.js. Mystery solved.



† "fast elements" is part of V8 object representation. I do not want to go into details there but V8 usually tries to represent "array" part of the JavaScript object (that is properties with "numeric" names) as a real continuous array. You can grep V8 sources for FAST_ELEMENTS to see what's going on there.


Tags:

(Leave a comment)
И получается, что заводить поля через prototype поэтому гораздо быстрей, чем присвоить поля в голый объект?

А поля через prototype будут медленней (дополнительный indirection) работать или так же?

Этот пост касается полей-интовых переменных-индексов или полей-строк тоже?

Если в коде написано function f(a) { a.foobar } то соответствие названия поля foobar и смещение в этих hidden-классах делается на этапе трансляции, или каждый раз при входе в функцию? Или функции инстанцируются-кешируются для каждого типа класса при первом вызове? Если инстанцируются, можно ли убить память или получить мистические протухания кеша, вызывав функция с 2^100 комбинацией типов параметров?




поля в объекте быстрее чем поля лежащие выше по prototype-цепочке.

>> Этот пост касается полей-интовых переменных-индексов или полей-строк тоже?

ненене, смысл тут совсем в другом: оригинальный блогпост просто не имеет никакого отношения к hidden classes, и измеряет погоду в унитазе =)

>> Если в коде написано function f(a) { a.foobar } то соответствие названия поля foobar и смещение в этих hidden-классах делается на этапе трансляции, или каждый раз при входе в функцию?

во время исполнения. кэшируется один раз для этой функции (see inline cache in wikipedia)

>> Если инстанцируются, можно ли убить память или получить мистические протухания кеша, вызывав функция с 2^100 комбинацией типов параметров?

нет.




?

Log in

No account? Create an account