Faster XML parser #103
Comments
I made a pure Julia XML parser based on PLists.jl: https://github.com/henry2004y/Vlasiator.jl/tree/native_xml_parser. It turns out to be much slower than EzXML (a wrapper over libxml2).
Native parser:
julia> @benchmark meta = load(file)
BenchmarkTools.Trial: 7246 samples with 1 evaluation.
Range (min … max): 648.512 μs … 4.994 ms ┊ GC (min … max): 0.00% … 77.29%
Time (median): 661.275 μs ┊ GC (median): 0.00%
Time (mean ± σ): 687.207 μs ± 269.506 μs ┊ GC (mean ± σ): 2.35% ± 5.13%
▁█▆▂ ▁
▂████▆▄▄▇█▇▅▄▃▃▃▄▃▃▂▂▂▂▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂ ▃
649 μs Histogram: frequency by time 792 μs <
Memory estimate: 200.08 KiB, allocs estimate: 4555.
EzXML:
julia> @benchmark meta = load(file)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 153.291 μs … 166.695 ms ┊ GC (min … max): 0.00% … 4.32%
Time (median): 165.386 μs ┊ GC (median): 0.00%
Time (mean ± σ): 295.421 μs ± 4.373 ms ┊ GC (mean ± σ): 1.68% ± 0.11%
▄▇█▇▆▄▃▄▅▄▄▃▂▁ ▂
████████████████▇█▇▇▆▇▆▆▆▅▇▅▅▆▄▆▇▆▄▇▇▅▅▆▇▅▅▄▇▇▆▄▂▂▇███▇▄▅▂▂▅▇ █
153 μs Histogram: log(frequency) by time 378 μs <
Memory estimate: 28.49 KiB, allocs estimate: 344.
As for the VTK writing part, we still use EzXML for generating *.vthb and LightXML (used by WriteVTK) for generating *.vtk.
With some optimization and simplification:
Native parser 2:
julia> @benchmark meta = load(file)
BenchmarkTools.Trial: 8003 samples with 1 evaluation.
Range (min … max): 580.358 μs … 5.034 ms ┊ GC (min … max): 0.00% … 78.78%
Time (median): 591.478 μs ┊ GC (median): 0.00%
Time (mean ± σ): 622.061 μs ± 258.956 μs ┊ GC (mean ± σ): 2.16% ± 4.63%
▇█▅▆▆▃▃▄▂▃▄▂▁ ▁ ▂
█████████████████▇▇▇▄███▇▇█▄▇▇▅▄▄▃▆▅▅▅▅▆▅▃▄▄▄▃▅▄▄▄▄▄▄▃▄▄▄▃▂▄▄ █
580 μs Histogram: log(frequency) by time 876 μs <
Memory estimate: 157.36 KiB, allocs estimate: 3498.
Still far from the performance of libxml2. Most of the time and allocations are spent on
Now there is a new native Julia package, XML.jl. Check it out!
We have now switched to the native Julia XML parser. This adds some allocations when loading metadata, but it is generally faster in all test cases.
The performance of the XML parser is now a bottleneck for IO. The wrappers over libxml2 (LightXML, EzXML) do not provide optimal performance. We may look for a native Julia implementation such as PLists.jl, but that one is built for educational purposes and lacks many features. If we really want one, we may look at demos from Matlab to learn more about XML parsers.
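For anyone who wants to reproduce this kind of measurement without the package's own `load`, here is a minimal sketch of timing a libxml2-backed parse with EzXML and BenchmarkTools. The XML snippet, its element and attribute names, and the helper `read_meta` are made up for illustration and are not the actual file footer or the code in Vlasiator.jl.

```julia
using EzXML            # wrapper over libxml2
using BenchmarkTools

# Hypothetical XML footer; element and attribute names are illustrative only.
const SAMPLE = """
<VLSV>
  <PARAMETER name="time" arraysize="1" datatype="float">1024</PARAMETER>
  <VARIABLE name="rho" arraysize="100" datatype="float">2048</VARIABLE>
</VLSV>
"""

# Parse the string and collect (tag, name attribute, integer content)
# for each child element of the root node.
function read_meta(xml::AbstractString)
    doc = parsexml(xml)
    return [(nodename(n), n["name"], parse(Int, nodecontent(n)))
            for n in elements(root(doc))]
end

meta = read_meta(SAMPLE)

# Interpolate the global with `$` so its lookup is not part of the timing.
trial = @benchmark read_meta($SAMPLE)
display(trial)
```

Swapping the body of `read_meta` for another parser while keeping the same return value would give a like-for-like comparison of time and allocations.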