-
If you read the rest of the README, you will find quite a few places where a clear distinction is made between LALR and Earley:
(this quote is directly above the section you quoted)
While some wording could be improved, people only reading subsections of the README and getting confused is not something we are going to be able to prevent in general. If you have a concrete suggestion that doesn't involve removing the word "performance" (which is correct for most grammars people want to use it for), we can probably add it. To answer the questions:
Neither of them is the flagship; they are different parts, each useful in different situations.
-
I also don't understand why you find our documentation confusing. Earley is the default algorithm because it is the easiest for beginners, and it can parse ambiguous and complicated languages that most parsers cannot. Anyone who cares about performance and doesn't need Earley's features can choose LALR, as perhaps you should try to do.
If we are the best, it is because our LALR implementation is better than all the others, and because we are the only ones who provide Earley as an option; the only ones who can parse every CFG, although it may come at the cost of performance. And perhaps because of the many other features that only Lark has. And yet I don't recall the adjective "best" being used anywhere in our pages. Did we ever claim it out loud? Yet you say you "get deceived", that we're "tricking users", and you accuse us of a lack of transparency. So please explain yourself better, so that I may understand why you would say that.
Edit: one unrelated comment. I've looked at the LaTeX grammar they are using, and it's written in a very inefficient way, which is probably contributing to the slowness. They are also running it with the debug flag set to True, which also incurs a runtime cost.
-
I do know that LALR is a lot more difficult to use but fast, and Earley is a lot easier but slow.
-
Thanks, I'm closing this issue.
-
When I read the README's description of Lark's capabilities, it says something like
However, I would like to ask a few questions:
Can lalr parse all context-free grammars and handle ambiguity gracefully, or does that only apply to earley?
Is earley performant? lalr is evidently the fastest, but in the graph, it seems a shame that earley is the slowest.
If my concerns are true, I was deceived by reading the README, because it sounded like:
Especially regarding performance, earley is not just reasonably slow but by far the slowest, which is evidently a problem (one you could be more transparent about).
https://github.com/lark-parser/lark?tab=readme-ov-file#performance-comparison
I think that you should be more transparent about the evident problems in the summary,
because it affects users' decisions (evidently everyone is a very technical user),
and I know of, and have faced, such performance problems.
If ‘earley’ is one of your flagship features, I would like to remove ‘performance’ there because the evidence contradicts it and it confuses users.
I am watching some issues in the SymPy community where users have run into performance problems with earley:
sympy/sympy#26098
I also wanted to share my discovery because people are getting a lot of misunderstandings from lark's documentation,
and to ask whether the lark community should correct such info.
(I'm open to other suggestions, or to closing this question if you can provide evidence that lark has solved such problems, or that the statistics are outdated.)