4.7
Release date: 2017-03-31 05:41:20
Latest antlr/antlr4 release: 4.13.2 (2024-08-04 03:28:03)
ANTLR version 4.7 is a major release with many improvements and bug fixes.
Summary of new features, improvements, fixes
- The primary improvement is that ANTLR and all code generation targets can now handle full 21-bit Unicode, thanks to the superhuman effort of Ben Hamilton, @bhamiltoncx. After much thought concerning the "create stream" interface, I settled on the following: the C++, Python, Go, and Swift APIs didn't need any changes to support full Unicode code points, so I left those alone. The Java, C#, and JavaScript runtimes required changes. Rather than gutting and changing the interface of the previous ANTLRFileStream and related classes, I have deprecated those and introduced a CharStreams.fromXXX factory-style interface for creating streams, for example CharStreams.fromFileName("foo.txt"). See the new Unicode documentation and a complete list of all pull requests related to Unicode.
- UnbufferedCharStream for the Java and C# targets was upgraded to use UTF-8 rather than the locale default encoding; it also supports Unicode code points up to U+10FFFF. C++ is the only other target with the unbuffered stream, and it worked as-is.
- In addition to creating CharStreams capable of supporting 21-bit Unicode, we added notation for Unicode code points beyond 16 bits (4 hexadecimal digits). The usual notation \uABCD still works for the basic code points, but you can now use \u{1FFFF}..\u{10FFFF} to access the full range.
- You can also include all characters matching Unicode properties such as [\p{Emoji}]:
  - Add new \p{Extended_Pictographic} and related Unicode property escapes
  - Also support Unicode enumerated properties via \p{Foo=Bar} syntax
  - Update docs for new Unicode literal and property escapes
  - New class EscapeSequenceParsing with \p{UnicodeProperty} support
  - New \p{Letter} Unicode property escape
  - A few last escapes: \p{EmojiPresentation=EmojiDefault} and \p{EmojiPresentation=TextDefault}
  - UnicodeData: Also support Unicode blocks
- The C# target now supports .NET Core, thanks to David Neumann @lecode-official and Dong Xie @xied75!
- We did quite a bit of cleaning up in the lexer escape char and char set areas (big thanks to Ivan Kochurkin @KvanTTT):
  - Escapes sequence recognition failure in character sets
  - Throw "invalid escape sequence" for double quote in string literals: '\"'
  - Fixed incorrect "used multiple times" warning.
  - error-> warnings. Fixes #1537
  - Reverted unterminated ranges: [+-], [-+], [-].
  - More accurate error messages
  - Added invalid charset error (for ranges without start or end) (comp:tool, error-handling)
- The Go runtime is now significantly faster.
- The XPath tree matching library used an ANTLR grammar itself, which caused a cyclic dependency in the build: ANTLR version v was required to build version v. I implemented a handbuilt lexer to get rid of this dependency on a grammar; see "XPathLexer not updated in release process, Fixes #1620" and "Make handbuilt lexer to avoid cyclic dependence of tool and plugin".
- The Java, Swift, and C++ targets have added a hook for the construction of parse tree leaf nodes (see "factor out the creation of error nodes and terminal nodes in the parser"). Code that previously called new TerminalNodeImpl(...) and new ErrorNodeImpl(...) directly now calls a few factory methods in the Parser that you can override in your application.
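To illustrate the leaf-node hook described above, here is a minimal Java sketch of the factory-method pattern. It uses stand-in classes, not the ANTLR runtime types: LeafFactoryDemo, MiniParser, Leaf, PositionedLeaf, and createLeaf are all hypothetical names invented for this example, so check the runtime API documentation for the real method names and signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: every class here is a hypothetical stand-in, not an ANTLR runtime type. */
final class LeafFactoryDemo {
    /** Stand-in for a parse tree leaf node. */
    static class Leaf {
        final String text;
        Leaf(String text) { this.text = text; }
    }

    /** Application-specific leaf that also records its position among the leaves. */
    static class PositionedLeaf extends Leaf {
        final int index;
        PositionedLeaf(String text, int index) { super(text); this.index = index; }
    }

    /** Stand-in parser: it never constructs a Leaf directly, only through the hook. */
    static class MiniParser {
        final List<Leaf> leaves = new ArrayList<>();

        /** Factory hook that subclasses may override. */
        protected Leaf createLeaf(String tokenText) { return new Leaf(tokenText); }

        /** Consumes one token and records the leaf produced by the hook. */
        final void consume(String tokenText) { leaves.add(createLeaf(tokenText)); }
    }

    /** Application parser that swaps in its own leaf type by overriding the hook. */
    static class PositionTrackingParser extends MiniParser {
        @Override
        protected Leaf createLeaf(String tokenText) {
            return new PositionedLeaf(tokenText, leaves.size());
        }
    }

    public static void main(String[] args) {
        MiniParser parser = new PositionTrackingParser();
        parser.consume("x");
        parser.consume("=");
        for (Leaf leaf : parser.leaves) {
            PositionedLeaf p = (PositionedLeaf) leaf;
            System.out.println(p.index + ": " + p.text);
        }
    }
}
```

The point of the pattern is that the parsing code stays untouched: only the overridden hook decides which node class ends up in the tree.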
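The 21-bit Unicode work in the summary matters for Java because a String is indexed in 16-bit UTF-16 code units, so a code point above U+FFFF occupies two char values. The sketch below uses only the Java standard library to show the difference between counting code units and counting code points. The class name CodePointDemo and its helper methods are hypothetical, and this is not a description of how the ANTLR streams are implemented.

```java
/** Sketch only: shows why 16-bit char indexing cannot address every Unicode code point. */
final class CodePointDemo {
    /** Counts Unicode code points rather than UTF-16 code units. */
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Returns the code points of s, with each surrogate pair joined into one value. */
    static int[] codePoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which lies outside the 16-bit range.
        String s = "a" + new String(Character.toChars(0x1F600));

        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points:       " + codePointCount(s));
        for (int cp : codePoints(s)) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

For this input, length() reports 3 code units while the code point count is 2, which is why a stream that hands out one char at a time cannot represent the full range.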
Issues fixed
C++ target:
- SIGABRT when using TokenStreamRewrite::Delete method
- Cpp runtime: compile needed header
- compilation warning
- Segmentation fault on TokenStreamRewriter destructor call
- Runtime build failure on gcc 6.3.1
- Fixed regression introduced by wrong optimization. Fixes #1708. (target:cpp)
- Fixed crash with multi threaded parsers warming up at the same time. (target:cpp)
- Improve error message in exceptions (target:cpp)
- Using new Unicode syntax for C++ demo project. (target:cpp, unicode)
- Fix syntaxError prototype issue (target:cpp)
- Fixed a number of data type + signedness issues in the C++ target (target:cpp)
- Some compilers need the functional include. (target:cpp)
- Fix wrong include path of antlr4cpp runtime in cmake/ExternalAntlr4Cpp.cmake (target:cpp)
- Fix parse tree property (target:cpp)
- No static libs anymore for Windows C++ runtime. (comp:build, target:cpp)
- Lr loop fix (target:cpp, type:improvement)
- adapt code to compile with msys2 mingw compiler (comp:build, target:cpp, type:improvement)
- Fix for VS 2013 runtime builds (C++) (comp:doc, target:cpp)
- Implemented enhanced CommonToken::toString method
- Issue 1483 (target:cpp, type:improvement)
JavaScript target:
- JavaScript target: super class of TraceListener is not correctly set
- BailErrorStrategy in JavaScript
- JavaScript visitors not visiting children
- Javascript Target documentation contains no reference to existing NPM package?
- Doc antlr4 / doc / javascript-target.md outdated, states there is no NPM package for antlr4 runtime
- tweak wildcard
- Fixed null pointer exception for JS target
- Fix missing variable declarations
- fix an issue where loading antlr from an IE web worker would fail
- fix typo in javascript visitor
- Update npm related docs
- missing js export
Python2/3:
- python2: Bug in IntervalSet.py:removeOne causes exception when taking complement
- Python 2 - missing IllegalStateException import in PredictionContext.py
- fix 'CommonToken' object has no attribute 'stopIndex' in Python{2|3}
- Fix a Python 2 typo
- Python 2 - missing - ErrorNode and TerminalNode - imports in Parser.py
C#:
- Fixed portability problems in C# target. Also cleaned up some XML do…
- Fix C# Pair.cs, Right arrow escaping in XML Comments
- Fix #1298 for CSharp
- Remove unused C# runtime method Utils.ToCharArray
- Look in /usr/local/bin before /usr/bin for mono
Go:
- go target: copyright notices are interpreted as package documentation
- Added "action" to badWords set for Go runtime.
- Remove lower case formatting on Go types & super import
- Use single-line comments for copyright
- Format Go runtime files
Java:
Swift:
- Remove generated files from repository and add testRig for Swift target (target:swift)
- Tweak Repo to use SwiftPackageManager. (target:swift)
Tool or all-target-runtime related:
- Wildcards were not handled properly, they were treated as sets! fix some typos Sam noticed.
- crash upon bad grammar
- Antlr 4.6 does not like same named rule element labels for different alternative labels
- More accurate error messages
- Added invalid charset error (for ranges without start or end)
- New tool utility class Unicode
- The final fix (hopefully) for alternative labels check in left recursive rules
- Implement support for optional getters
- Disable label checks for left recursive rules
- New doc 'Contributing to ANTLR'
- Channel names & constants in lexer, improved modes record format.