Formatron v0.4.9
Formatron empowers everyone to control the output format of language models with minimal overhead.
Public Member Functions
__init__ (self, list[Extractor] extractors, kbnf.Engine engine, typing.Callable[[list[int]], str] decode_callback, str grammar_str)
    Initialize the formatter.
kbnf.AcceptTokenResult | accept_token (self, int token_id)
    Accept a token from the language model.
kbnf.AcceptTokenResult | accept_bytes (self, bytes _bytes)
    Accept a bytes object from the language model.
None | compute_allowed_tokens (self)
    Compute the allowed tokens based on the current state.
typing.Any | mask_logits (self, logits)
    Mask the logits based on the current state.
typing.Sequence[int] | get_allowed_tokens_since_last_computation (self)
    Get the allowed tokens since the last computation (in other words, the last call to compute_allowed_tokens).
bool | is_completed (self)
    Check if the generation is completed.
None | reset (self)
    Reset the formatter to the initial state.
__str__ (self)

Protected Member Functions
str | _obtain_accepted_output (self)
None | _on_completion (self, str generated_output)
    Perform actions when the generation is completed.

Protected Attributes
_extractors
_engine
_token_id_or_bytes
_decode_callback
_grammar_str
_captures

Properties
grammar_str (self)
    Get the KBNF grammar string.
dict[str, typing.Any]|None | captures (self)
    Get the captures from the generated string.

Definition at line 112 of file formatter.py.
formatron.formatter.Formatter.__init__(self, list[Extractor] extractors, kbnf.Engine engine, typing.Callable[[list[int]], str] decode_callback, str grammar_str)
Initialize the formatter.
extractors | The matchers to extract data from the generated string.
engine | The KBNF engine to enforce the format.
decode_callback | The callback to decode the token IDs to a string.
grammar_str | The KBNF grammar string.
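The decode_callback parameter is annotated as typing.Callable[[list[int]], str], so any callable that turns a list of token IDs into a string fits. The following is a minimal sketch of such a callable, assuming a made-up five-entry vocabulary; TOY_VOCAB and toy_decode are hypothetical names for illustration and are not part of Formatron. A real integration would presumably delegate to its tokenizer instead.

```python
from typing import Callable

# Hypothetical toy vocabulary (an assumption for this sketch only).
TOY_VOCAB: dict[int, str] = {0: "{", 1: '"a"', 2: ":", 3: "1", 4: "}"}


def toy_decode(token_ids: list[int]) -> str:
    """Decode token IDs by concatenating their vocabulary entries."""
    return "".join(TOY_VOCAB[token_id] for token_id in token_ids)


# The callable matches the annotation given for decode_callback.
decode_callback: Callable[[list[int]], str] = toy_decode
decoded = decode_callback([0, 1, 2, 3, 4])
```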
Definition at line 122 of file formatter.py.
formatron.formatter.Formatter.__str__(self)
Definition at line 262 of file formatter.py.
str formatron.formatter.Formatter._obtain_accepted_output(self) [protected]
Definition at line 155 of file formatter.py.
None formatron.formatter.Formatter._on_completion(self, str generated_output) [protected]
Perform actions when the generation is completed.
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 209 of file formatter.py.
kbnf.AcceptTokenResult formatron.formatter.Formatter.accept_bytes(self, bytes _bytes)
Accept a bytes object from the language model.
_bytes | The bytes object.
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 184 of file formatter.py.
kbnf.AcceptTokenResult formatron.formatter.Formatter.accept_token(self, int token_id)
Accept a token from the language model.
token_id | The token ID.
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 147 of file formatter.py.
dict[str, typing.Any] | None formatron.formatter.Formatter.captures(self)
Get the captures from the generated string.
Note that the captures are only available for one extractor if:
Captures are obtained by calling the Extractor.extract method on the generated string, in the order in which the extractors were appended to the formatter. Note that earlier extractors do not 'see' the semantics of later extractors. For example, consider the following formatter:

```python
from formatron.formatter import FormatterBuilder

f = FormatterBuilder()
# Two adjacent regex extractors, captured as 'a' and 'b'.
f.append_line(f"{f.regex('.*?', capture_name='a')}{f.regex('.*', capture_name='b')}")
f = f.build()
```

The b extractor will always correspond to None because the a extractor will always extract the whole string. This behavior is different from what a typical regular expression engine would do!
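The contrast with a regular expression engine can be sketched without Formatron at all. The toy model below is an illustration under stated assumptions, not Formatron's implementation: ToyExtractor, take_all, and run_in_sequence are hypothetical names, and the toy extractors do not model regex laziness. They only show what happens when each extractor runs on whatever the previous one left behind, while Python's re module matches both groups jointly.

```python
import re
from typing import Callable, Optional

# Hypothetical extractor shape (an assumption): take the remaining text and
# return (rest, extracted), or None when nothing can be extracted.
ToyExtractor = Callable[[str], Optional[tuple[str, str]]]


def take_all(text: str) -> Optional[tuple[str, str]]:
    """Toy extractor that claims every remaining character."""
    if not text:
        return None
    return "", text


def run_in_sequence(
    extractors: dict[str, ToyExtractor], text: str
) -> dict[str, Optional[str]]:
    """Run extractors one after another; each sees only the leftover text."""
    captures: dict[str, Optional[str]] = {}
    for name, extractor in extractors.items():
        result = extractor(text)
        if result is None:
            captures[name] = None
            continue
        text, captures[name] = result
    return captures


# Sequential hand-off: 'a' consumes everything, so 'b' has nothing left.
sequential = run_in_sequence({"a": take_all, "b": take_all}, "hello")

# A backtracking regex engine decides both groups together instead.
joint = re.fullmatch(r"(?P<a>.*?)(?P<b>.*)", "hello").groupdict()
```

In this sketch the sequential run leaves b as None, whereas the joint regex match gives the lazy group a an empty string and hands the whole input to b.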
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 252 of file formatter.py.
None formatron.formatter.Formatter.compute_allowed_tokens(self)
Compute the allowed tokens based on the current state.
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 192 of file formatter.py.
typing.Sequence[int] formatron.formatter.Formatter.get_allowed_tokens_since_last_computation(self)
Get the allowed tokens since the last computation (in other words, the last call to compute_allowed_tokens).
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 198 of file formatter.py.
formatron.formatter.Formatter.grammar_str(self)
Get the KBNF grammar string.
Definition at line 142 of file formatter.py.
bool formatron.formatter.Formatter.is_completed(self)
Check if the generation is completed.
This means the generation was ended by the engine. If the generation is ended by an integration-specific stop condition such as max_new_tokens, this method does not consider the generation completed.
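The difference between the two ways a generation can end can be sketched with a stand-in object. ToyFormatter and generate below are hypothetical and only borrow the method names accept_token and is_completed; the real accept_token returns a kbnf.AcceptTokenResult, which this toy omits. The sketch assumes the "engine" is satisfied once a fixed token sequence has been accepted.

```python
class ToyFormatter:
    """Hypothetical stand-in that completes after a fixed token sequence."""

    def __init__(self, expected: list[int]) -> None:
        self._expected = expected
        self._position = 0

    def accept_token(self, token_id: int) -> None:
        # Reject anything the toy "grammar" does not allow at this point.
        if self.is_completed() or token_id != self._expected[self._position]:
            raise ValueError(f"token {token_id} is not allowed here")
        self._position += 1

    def is_completed(self) -> bool:
        return self._position == len(self._expected)


def generate(formatter: ToyFormatter, tokens: list[int], max_new_tokens: int) -> str:
    """Feed tokens until the formatter completes or the token budget runs out."""
    for token_id in tokens[:max_new_tokens]:
        formatter.accept_token(token_id)
        if formatter.is_completed():
            return "completed"
    return "stopped_early"


finished = generate(ToyFormatter([7, 8, 9]), [7, 8, 9], max_new_tokens=10)

truncated_formatter = ToyFormatter([7, 8, 9])
truncated = generate(truncated_formatter, [7, 8, 9], max_new_tokens=2)
```

When the budget cuts the run short, the loop stops but truncated_formatter.is_completed() stays False, which mirrors the distinction described above.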
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 206 of file formatter.py.
typing.Any formatron.formatter.Formatter.mask_logits(self, logits)
Mask the logits based on the current state.
logits | The logits to mask.
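This page does not say how the masking is carried out. One common approach, assumed here purely for illustration, is to set the logit of every disallowed token to negative infinity so that it can never be selected. toy_mask_logits is a hypothetical helper operating on a plain list, not the Formatron method.

```python
import math


def toy_mask_logits(logits: list[float], allowed_token_ids: list[int]) -> list[float]:
    """Return a copy of logits with every disallowed position set to -inf."""
    allowed = set(allowed_token_ids)
    return [
        value if index in allowed else -math.inf
        for index, value in enumerate(logits)
    ]


# Token 3 has the highest raw logit, but only tokens 1 and 2 are allowed.
masked = toy_mask_logits([0.5, 2.0, -1.0, 3.0], allowed_token_ids=[1, 2])

# Greedy selection over the masked logits can only pick an allowed token.
best = max(range(len(masked)), key=masked.__getitem__)
```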
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 195 of file formatter.py.
None formatron.formatter.Formatter.reset(self)
Reset the formatter to the initial state.
Reimplemented from formatron.formatter.FormatterBase.
Definition at line 257 of file formatter.py.
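formatron.formatter.Formatter._captures [protected]
Definition at line 129 of file formatter.py.
formatron.formatter.Formatter._decode_callback [protected]
Definition at line 127 of file formatter.py.
formatron.formatter.Formatter._engine [protected]
Definition at line 125 of file formatter.py.
formatron.formatter.Formatter._extractors [protected]
Definition at line 124 of file formatter.py.
formatron.formatter.Formatter._grammar_str [protected]
Definition at line 128 of file formatter.py.
formatron.formatter.Formatter._token_id_or_bytes [protected]
Definition at line 126 of file formatter.py.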