Formatron v0.4.2
Formatron empowers everyone to control the output format of language models with minimal overhead.
formatron.formatter.Formatter Class Reference
Inheritance diagram for formatron.formatter.Formatter:
formatron.formatter.FormatterBase

Public Member Functions

 __init__ (self, list[Extractor] extractors, kbnf.Engine engine, typing.Callable[[list[int]], str] decode_callback, str grammar_str)
 Initialize the formatter.
 
 accept_token (self, int token_id)
 Accept a token from the language model.
 
 accept_bytes (self, bytes _bytes)
 Accept a bytes object from the language model.
 
None compute_allowed_tokens (self)
 Compute the allowed tokens based on the current state.
 
typing.Any mask_logits (self, logits)
 Mask the logits based on the current state.
 
typing.Sequence[int] get_allowed_tokens_since_last_computation (self)
 Get the allowed tokens since the last computation (in other words, the last call to compute_allowed_tokens).
 
bool is_completed (self)
 Check if the generation is completed.
 
None reset (self)
 Reset the formatter to the initial state.
 
 __str__ (self)
 

Protected Member Functions

None _on_completion (self, str generated_output)
 Perform actions when the generation is completed.
 

Protected Attributes

 _extractors
 
 _engine
 
 _token_ids
 
 _decode_callback
 
 _grammar_str
 
 _captures
 

Properties

 grammar_str (self)
 Get the KBNF grammar string.
 
dict[str, typing.Any]|None captures (self)
 Get the captures from the generated string.
 

Detailed Description

Definition at line 110 of file formatter.py.

Constructor & Destructor Documentation

◆ __init__()

formatron.formatter.Formatter.__init__ ( self,
list[Extractor] extractors,
kbnf.Engine engine,
typing.Callable[[list[int]], str] decode_callback,
str grammar_str )

Initialize the formatter.

Parameters
extractors       The matchers to extract data from the generated string.
engine           The KBNF engine to enforce the format.
decode_callback  The callback to decode the token IDs to a string.
grammar_str      The KBNF grammar string.

Definition at line 120 of file formatter.py.
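
The decode_callback parameter is typed as typing.Callable[[list[int]], str]. As a minimal sketch of a callable with that shape, assuming a made-up four-entry vocabulary (TOY_VOCAB and toy_decode are hypothetical names, not part of Formatron), one could write:

```python
import typing

# Hypothetical toy vocabulary; a real integration would normally delegate
# to the tokenizer's own decode function instead.
TOY_VOCAB: dict[int, str] = {0: "Hello", 1: ",", 2: " world", 3: "!"}


def toy_decode(token_ids: list[int]) -> str:
    """Decode a list of token IDs into a string."""
    return "".join(TOY_VOCAB[token_id] for token_id in token_ids)


# The callable matches the documented callback signature.
decode_callback: typing.Callable[[list[int]], str] = toy_decode
decoded = decode_callback([0, 1, 2, 3])
```

Any function or bound method with the same signature should serve in the same position; the toy vocabulary only keeps the sketch self-contained.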

Member Function Documentation

◆ __str__()

formatron.formatter.Formatter.__str__ ( self)

Definition at line 226 of file formatter.py.

◆ _on_completion()

None formatron.formatter.Formatter._on_completion ( self,
str generated_output )
protected

Perform actions when the generation is completed.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 173 of file formatter.py.

◆ accept_bytes()

formatron.formatter.Formatter.accept_bytes ( self,
bytes _bytes )

Accept a bytes object from the language model.

Parameters
_bytes  The bytes object.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 153 of file formatter.py.

◆ accept_token()

formatron.formatter.Formatter.accept_token ( self,
int token_id )

Accept a token from the language model.

Parameters
token_id  The token ID.
Returns
The result of accepting the token.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 145 of file formatter.py.

◆ captures()

dict[str, typing.Any] | None formatron.formatter.Formatter.captures ( self)

Get the captures from the generated string.

Note that a capture is available for a given extractor only if all of the following hold:

  • The extractor has a capture name.
  • Formatter.is_completed() returns True.
  • The extractor successfully extracts the data.
    • This means the extractor identifies the correct string span to extract and whatever post-processing the extractor does on the extracted string is successful.

Captures are obtained by calling the Extractor.extract method on the generated string, in the order in which the extractors were appended to the formatter. Note that earlier extractors do not 'see' the semantics of later extractors. For example, consider the following formatter:

    from formatron.formatter import FormatterBuilder

    f = FormatterBuilder()
    f.append_line(f"{f.regex('.*?', capture_name='a')}{f.regex('.*', capture_name='b')}")
    f = f.build()

The b extractor will always correspond to None because the a extractor will always extract the whole string. This behavior is different from what a typical regular expression engine would do!

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 216 of file formatter.py.
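
The in-order extraction described for captures can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: ToyExtractor, run_extractors, and the two take_* helpers are hypothetical names, and the (remaining, value) return shape is an assumption made for illustration rather than a statement about Formatron's actual Extractor interface.

```python
import typing


class ToyExtractor:
    """Hypothetical stand-in for an extractor; not Formatron's Extractor."""

    def __init__(self, capture_name: str, take: typing.Callable[[str], int]):
        self.capture_name = capture_name
        # `take` returns how many leading characters to consume, or -1 on failure.
        self._take = take

    def extract(self, text: str) -> typing.Optional[tuple[str, str]]:
        n = self._take(text)
        if n < 0:
            return None
        # Return the unconsumed remainder and the extracted value.
        return text[n:], text[:n]


def run_extractors(extractors: list[ToyExtractor], generated: str) -> dict[str, typing.Any]:
    """Apply extractors in order; each one only sees what earlier ones left over."""
    captures: dict[str, typing.Any] = {}
    remaining = generated
    for extractor in extractors:
        result = extractor.extract(remaining)
        if result is None:
            captures[extractor.capture_name] = None
            continue
        remaining, value = result
        captures[extractor.capture_name] = value
    return captures


def take_everything(text: str) -> int:
    # Consumes the whole remaining string, with no knowledge of later extractors.
    return len(text)


def take_nonempty_rest(text: str) -> int:
    # Fails when nothing is left to consume.
    return len(text) if text else -1


captures = run_extractors(
    [ToyExtractor("a", take_everything), ToyExtractor("b", take_nonempty_rest)],
    "abc123",
)
```

In this model the first extractor consumes the entire string, so the second one has nothing left and its capture is None, which mirrors the ordering caveat in the description.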

◆ compute_allowed_tokens()

None formatron.formatter.Formatter.compute_allowed_tokens ( self)

Compute the allowed tokens based on the current state.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 156 of file formatter.py.

◆ get_allowed_tokens_since_last_computation()

typing.Sequence[int] formatron.formatter.Formatter.get_allowed_tokens_since_last_computation ( self)

Get the allowed tokens since the last computation (in other words, the last call to compute_allowed_tokens).

Returns
The allowed tokens.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 162 of file formatter.py.

◆ grammar_str()

formatron.formatter.Formatter.grammar_str ( self)

Get the KBNF grammar string.

Definition at line 140 of file formatter.py.

◆ is_completed()

bool formatron.formatter.Formatter.is_completed ( self)

Check if the generation is completed.

This means the generation is ended by the engine. If the generation is ended by integration-specific stop conditions like max_new_tokens, the generation is not considered completed by this method.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 170 of file formatter.py.
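
To make the distinction between engine-driven completion and an integration-specific stop such as max_new_tokens concrete, here is a hedged sketch of a generation loop. ToyFormatter is a hypothetical stand-in that only borrows the method names documented on this page; it accepts one fixed token sequence and is not how the KBNF engine works.

```python
class ToyFormatter:
    """Hypothetical stand-in that reuses the documented method names."""

    def __init__(self, target: list[int]):
        self._target = target  # the only token sequence this toy accepts
        self._pos = 0
        self._allowed: list[int] = []

    def compute_allowed_tokens(self) -> None:
        self._allowed = [] if self.is_completed() else [self._target[self._pos]]

    def get_allowed_tokens_since_last_computation(self) -> list[int]:
        return self._allowed

    def accept_token(self, token_id: int) -> None:
        if token_id != self._target[self._pos]:
            raise ValueError(f"token {token_id} is not allowed here")
        self._pos += 1

    def is_completed(self) -> bool:
        return self._pos == len(self._target)

    def reset(self) -> None:
        self._pos = 0
        self._allowed = []


def generate(formatter: ToyFormatter, max_new_tokens: int) -> list[int]:
    """Pick the first allowed token until the budget runs out or the format completes."""
    output: list[int] = []
    while len(output) < max_new_tokens and not formatter.is_completed():
        formatter.compute_allowed_tokens()
        token = formatter.get_allowed_tokens_since_last_computation()[0]
        formatter.accept_token(token)
        output.append(token)
    return output


formatter = ToyFormatter([5, 6, 7])
truncated = generate(formatter, max_new_tokens=2)
completed_after_truncation = formatter.is_completed()

formatter.reset()
full = generate(formatter, max_new_tokens=10)
completed_after_full_run = formatter.is_completed()
```

In the first run the token budget ends generation before the format is satisfied, so is_completed() stays False; after reset() and a larger budget, the toy format finishes and is_completed() becomes True.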

◆ mask_logits()

typing.Any formatron.formatter.Formatter.mask_logits ( self,
logits )

Mask the logits based on the current state.

Parameters
logits  The logits to mask.
Returns
The masked logits.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 159 of file formatter.py.
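
Conceptually, masking logits means making every disallowed token impossible to sample. The following sketch shows that idea on a plain Python list; mask_logits_sketch is a hypothetical helper, and the real mask_logits accepts and returns typing.Any, so the concrete logits type and the masking mechanics depend on the engine and the integration.

```python
import math


def mask_logits_sketch(logits: list[float], allowed: list[int]) -> list[float]:
    """Keep logits at allowed token IDs and set all others to negative infinity."""
    allowed_set = set(allowed)
    return [
        value if index in allowed_set else -math.inf
        for index, value in enumerate(logits)
    ]


# Token IDs 1 and 3 are allowed; 0 and 2 are masked out.
masked = mask_logits_sketch([0.1, 2.0, -1.5, 0.7], allowed=[1, 3])
```

After a softmax, positions holding negative infinity receive zero probability, so sampling can only select tokens that remain allowed.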

◆ reset()

None formatron.formatter.Formatter.reset ( self)

Reset the formatter to the initial state.

Reimplemented from formatron.formatter.FormatterBase.

Definition at line 221 of file formatter.py.

Member Data Documentation

◆ _captures

formatron.formatter.Formatter._captures
protected

Definition at line 127 of file formatter.py.

◆ _decode_callback

formatron.formatter.Formatter._decode_callback
protected

Definition at line 125 of file formatter.py.

◆ _engine

formatron.formatter.Formatter._engine
protected

Definition at line 123 of file formatter.py.

◆ _extractors

formatron.formatter.Formatter._extractors
protected

Definition at line 122 of file formatter.py.

◆ _grammar_str

formatron.formatter.Formatter._grammar_str
protected

Definition at line 126 of file formatter.py.

◆ _token_ids

formatron.formatter.Formatter._token_ids
protected

Definition at line 124 of file formatter.py.


The documentation for this class was generated from the following file: