moberegger

Some small adjustments to reduce latency and memory allocations on calls to set! and array!. A summary of changes:

  • Similar to what was done with _extract in (link PR), private _array and _set methods have been added to save the memory allocation caused by the extra *args splat that happened when JbuilderTemplate#array! and JbuilderTemplate#set! called up to super. With the new setup, the splat happens only once.
  • Calls to ::Kernel.block_given? showed up as hotspots in our profiling. These have been replaced with a plain if block check, which is a little faster. Normally you wouldn't see a difference with block_given?, but since Jbuilder is a BasicObject, ::Kernel.block_given? had to be used, and the extra module resolution apparently has some overhead.
  • Calls to one? also showed up as hotspots in our profiling; I believe one? is an O(n) operation. The args.one? guards have been removed, as they did not appear to be necessary. There were guards like if args.one? && _partial_options?(options), and I presume the one? was intended to short-circuit the checks against the options hash, but it is faster to forgo the one? call. If the intent was to check that exactly one argument was provided, one? does not do that: it checks that exactly one truthy argument was provided.
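
The last point, about one? counting truthy elements rather than elements, can be seen with plain arrays. This is a minimal sketch; the arrays below are stand-ins for a method's *args and are not taken from Jbuilder itself.

```ruby
# Array#one? without a block returns true only when exactly one element is truthy.
# The arrays are hypothetical stand-ins for the *args received by a method.
single_truthy = [:id].one?          # one element, and it is truthy
single_nil    = [nil].one?          # one element, but it is not truthy
nil_padded    = [:id, nil].one?     # two elements, only one of them truthy
two_truthy    = [:id, :title].one?  # two truthy elements

# A strict "exactly one argument" check compares the size instead.
strict_single_nil = [nil].size == 1
strict_nil_padded = [:id, nil].size == 1

puts single_truthy, single_nil, nil_padded, two_truthy
puts strict_single_nil, strict_nil_padded
```

A size comparison is also constant time, whereas one? may have to walk the array.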

Some benchmarks against JbuilderTemplate comparing main (before) with this branch (after):

set!

The simplest benchmark to exercise the changes under set!.

json.set! :foo, :bar
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before   446.710k i/100ms
               after   540.848k i/100ms
Calculating -------------------------------------
              before      5.251M (± 2.3%) i/s  (190.45 ns/i) -     26.356M in   5.022246s
               after      6.444M (± 6.9%) i/s  (155.17 ns/i) -     32.451M in   5.071314s
Comparison:
               after:  6444432.8 i/s
              before:  5250641.1 i/s - 1.23x  slower


Calculating -------------------------------------
              before    80.000  memsize (     0.000  retained)
                         2.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
               after    40.000  memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:         40 allocated
              before:         80 allocated - 2.00x more
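
The halved object count is consistent with one fewer rest-argument array per call. The sketch below shows the general pattern under stated assumptions: Base, Delegating and Direct are hypothetical classes, not Jbuilder's real ones, and the allocation counts come from GC.stat rather than from the memory profiler used above.

```ruby
# Hypothetical classes for illustration; these are not Jbuilder's actual methods.
class Base
  def set!(key, value, *args)
    _set(key, value, args)
  end

  private

  # Receives the already-collected array, so nothing is splatted again.
  def _set(key, value, args)
    args.length
  end
end

class Delegating < Base
  # args is collected here, re-splatted for super, then collected again in Base#set!.
  def set!(key, value, *args)
    super(key, value, *args)
  end
end

class Direct < Base
  # args is collected once and passed to the private helper as a plain array.
  def set!(key, value, *args)
    _set(key, value, args)
  end
end

# Counts objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

delegating = Delegating.new
direct = Direct.new

delegating_allocs = allocations { 1_000.times { delegating.set!(:foo, :bar, :a, :b) } }
direct_allocs = allocations { 1_000.times { direct.set!(:foo, :bar, :a, :b) } }

puts "delegating: #{delegating_allocs} objects, direct: #{direct_allocs} objects"
```

Exact counts depend on the Ruby version and JIT settings, so the numbers printed here are not expected to match the benchmark output exactly.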

A simple benchmark for when set! is provided a collection. This also exercises the underlying call to _array.

# Where...
array = [1, 2, 3]
json.set! :foo, array do |item|
  json.set! :bar, item
end
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before    73.584k i/100ms
               after    89.058k i/100ms
Calculating -------------------------------------
              before    734.510k (± 6.7%) i/s    (1.36 μs/i) -      3.679M in   5.041629s
               after    937.514k (± 4.7%) i/s    (1.07 μs/i) -      4.720M in   5.048528s
Comparison:
               after:   937513.6 i/s
              before:   734509.8 i/s - 1.28x  slower


Calculating -------------------------------------
              before     1.000k memsize (   520.000  retained)
                        16.000  objects (     4.000  retained)
                         0.000  strings (     0.000  retained)
               after   760.000  memsize (   520.000  retained)
                        10.000  objects (     4.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:        760 allocated
              before:       1000 allocated - 1.32x more

A benchmark for when set! is provided a list of attributes. The intent here is to measure the args.one? change. I was hoping to see a larger improvement in IPS.

# Where...
post = Post.new(1, 'Post 1', 'This is the body')
json.set! :post, post, :id, :title, :body
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before    96.148k i/100ms
               after   107.217k i/100ms
Calculating -------------------------------------
              before    993.402k (± 2.1%) i/s    (1.01 μs/i) -      5.000M in   5.035113s
               after      1.041M (± 9.9%) i/s  (960.98 ns/i) -      5.146M in   5.031288s
Comparison:
               after:  1040609.6 i/s
              before:   993401.9 i/s - same-ish: difference falls within error

Calculating -------------------------------------
              before   440.000  memsize (   160.000  retained)
                         3.000  objects (     1.000  retained)
                         0.000  strings (     0.000  retained)
               after   240.000  memsize (   160.000  retained)
                         2.000  objects (     1.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:        240 allocated
              before:        440 allocated - 1.83x more

array!

A simple benchmark for array! to exercise the changes. Was hoping for a larger improvement in IPS. This does still save on memory, though.

# Where...
array = [1, 2, 3]
json.array! array
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before     1.643k i/100ms
               after     2.235k i/100ms
Calculating -------------------------------------
              before      7.167k (±20.9%) i/s  (139.53 μs/i) -     36.146k in   5.287060s
               after      7.384k (±24.2%) i/s  (135.44 μs/i) -     35.760k in   5.134774s
Comparison:
               after:     7383.5 i/s
              before:     7166.9 i/s - same-ish: difference falls within error


Calculating -------------------------------------
              before    80.000  memsize (     0.000  retained)
                         2.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
               after    40.000  memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:         40 allocated
              before:         80 allocated - 2.00x more

A benchmark for when array! is provided a list of attributes. The intent here is to measure the args.one? change. I was hoping to see a larger improvement in IPS.

# Where...
posts = [Post.new(1, 'Post #1')]
json.array! posts, :id, :body
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
              before     2.999k i/100ms
               after     3.437k i/100ms
Calculating -------------------------------------
              before      9.389k (±24.5%) i/s  (106.50 μs/i) -     47.984k in   5.404343s
               after     10.022k (±25.9%) i/s   (99.78 μs/i) -     48.118k in   5.134668s
Comparison:
               after:    10022.1 i/s
              before:     9389.4 i/s - same-ish: difference falls within error

Calculating -------------------------------------
              before   512.000  memsize (   200.000  retained)
                         6.000  objects (     2.000  retained)
                         0.000  strings (     0.000  retained)
               after   320.000  memsize (   200.000  retained)
                         5.000  objects (     2.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:        320 allocated
              before:        512 allocated - 1.60x more

via method_missing

This showcases that the optimizations also affect the DSL offered via method_missing. I am not sure why the improvement here is larger than for set!.

json.foo :bar
Warming up --------------------------------------
              before   327.202k i/100ms
               after   504.570k i/100ms
Calculating -------------------------------------
              before      3.800M (± 3.1%) i/s  (263.19 ns/i) -     18.978M in   4.999800s
               after      5.864M (± 1.5%) i/s  (170.52 ns/i) -     29.770M in   5.077434s
Comparison:
               after:  5864408.4 i/s
              before:  3799593.4 i/s - 1.54x  slower


Calculating -------------------------------------
              before    80.000  memsize (     0.000  retained)
                         2.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
               after    40.000  memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
Comparison:
               after:         40 allocated
              before:         80 allocated - 2.00x more


BLANK = Blank.new
BLANK = Blank.new.freeze
EMPTY_ARRAY = [].freeze

There were quite a few spots allocating a new empty array when dealing with an empty collection. Figured it was worthwhile to re-use the same Array instance. This does assume that no call sites attempt to mutate the output from Jbuilder#target!.
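
A minimal sketch of the trade-off, assuming a hypothetical render helper rather than Jbuilder's real code paths: every empty result is the same frozen instance, so a caller that tries to mutate it gets an error instead of silently changing shared state.

```ruby
module EmptyArrayDemo
  EMPTY_ARRAY = [].freeze

  # Hypothetical helper: returns the shared constant instead of a fresh [].
  def self.render(collection)
    return EMPTY_ARRAY if collection.empty?

    collection.map(&:to_s)
  end
end

first = EmptyArrayDemo.render([])
second = EmptyArrayDemo.render([])

# Both calls hand back the very same object, so nothing new is allocated.
same_instance = first.equal?(second)

# Mutating the shared instance raises because it is frozen.
mutation_error =
  begin
    first << :oops
    nil
  rescue FrozenError => e
    e.class
  end

puts same_instance
puts mutation_error.inspect
```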

Comment on lines +243 to +251
if _blank?(value)
# json.comments { ... }
# { "comments": ... }
_merge_block key, &block
else
# json.comments @post.comments { |comment| ... }
# { "comments": [ { ... }, { ... } ] }
_scope { _array value, &block }
end

This condition used to be the inverse, with an if !_blank?. I think this improves control flow, and it saves a tiny bit of processing.

if _blank?(value)
# json.comments { ... }
# { "comments": ... }
_merge_block key, &block

Used to be

_merge_block(key){ yield self }

def call(object, *attributes, &block)
if ::Kernel.block_given?
array! object, &block
if block

A quick micro benchmark to show the difference between these:

require 'benchmark/ips'

module Foo
  def self.if_block_given?
    block_given? ? true : false
  end

  def self.if_kernel_block_given?
    ::Kernel.block_given? ? true : false
  end

  def self.if_block?(&block)
    block ? true : false
  end
end

Benchmark.ips do |x|
  x.report('block_given?') { Foo.if_block_given? }
  x.report('Kernel.block_given?') { Foo.if_kernel_block_given? }
  x.report('block?') { Foo.if_block? }

  x.compare!
end
ruby 3.4.5 (2025-07-16 revision 20cda200d3) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
        block_given?     2.450M i/100ms
 Kernel.block_given?     2.270M i/100ms
              block?     2.516M i/100ms
Calculating -------------------------------------
        block_given?     44.851M (± 1.9%) i/s   (22.30 ns/i) -    225.368M in   5.026585s
 Kernel.block_given?     39.434M (± 1.5%) i/s   (25.36 ns/i) -    197.469M in   5.008676s
              block?     45.037M (± 2.8%) i/s   (22.20 ns/i) -    226.436M in   5.032194s

Comparison:
              block?: 45036992.5 i/s
        block_given?: 44851312.9 i/s - same-ish: difference falls within error
 Kernel.block_given?: 39434092.2 i/s - 1.14x  slower
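
For context on why the Kernel-qualified form was there in the first place, the sketch below uses a hypothetical Probe class (not part of Jbuilder) that inherits from BasicObject. A bare block_given? is not available on such an object, while the qualified call and the explicit &block check both work.

```ruby
# Hypothetical class for illustration; Jbuilder itself is not loaded here.
class Probe < BasicObject
  # Kernel is not mixed into BasicObject, so this call cannot be resolved.
  def bare
    block_given?
  end

  # Works, but has to look up ::Kernel and dispatch through it.
  def via_kernel
    ::Kernel.block_given?
  end

  # Captures the block as a parameter and checks it directly.
  def via_param(&block)
    block ? true : false
  end
end

probe = Probe.new

# NoMethodError is a subclass of NameError, so rescuing NameError covers it.
bare_error =
  begin
    probe.bare { }
    nil
  rescue NameError => e
    e.class
  end

kernel_with_block = probe.via_kernel { }
kernel_without_block = probe.via_kernel
param_with_block = probe.via_param { }
param_without_block = probe.via_param

puts bare_error.inspect
puts kernel_with_block, kernel_without_block, param_with_block, param_without_block
```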
